[J29] Physics-Aware Combinatorial Assembly Sequence Planning using Data-free Action Masking
Ruixuan Liu, Alan Chen, Weiye Zhao and Changliu Liu
IEEE Robotics and Automation Letters, 2025
Abstract:
Combinatorial assembly uses standardized unit primitives to build objects that satisfy user specifications. This paper studies assembly sequence planning (ASP) for physical combinatorial assembly. Given the shape of the desired object, the goal is to find a sequence of actions for placing unit primitives to build the target object. In particular, we aim to ensure that the planned assembly sequence is physically executable. However, ASP for combinatorial assembly is particularly challenging due to its combinatorial nature. To address the challenge, we employ deep reinforcement learning to learn a construction policy that places unit primitives sequentially to build the desired object. Specifically, we design an online physics-aware action mask that filters out invalid actions, which effectively guides policy learning and ensures violation-free deployment. Finally, we apply the proposed method to Lego assembly with more than 250 3D structures. The experimental results demonstrate that the proposed method plans physically valid assembly sequences for all structures, achieving a 100% success rate, whereas the best comparable baseline fails on more than 40 of them.
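The key mechanism here is the online action mask applied before the policy selects a placement. The following is a minimal, generic sketch of data-free action masking for a discrete placement policy, not the paper's implementation; `is_physically_valid` is a hypothetical placeholder for whatever physics check (e.g., a stability test) determines executability.

```python
import numpy as np

def is_physically_valid(state, action):
    """Hypothetical physics check: return True if placing the primitive
    specified by `action` onto the partial structure `state` is executable
    (e.g., supported and stable). Replace with a real stability analysis."""
    return True  # placeholder

def masked_policy_step(logits, state, actions, rng=None):
    """Mask invalid placements online, then sample from the remaining actions.

    logits  : (N,) raw policy scores over N candidate placements
    state   : current partial assembly
    actions : list of N candidate placement actions
    Assumes at least one candidate action is valid.
    """
    rng = rng or np.random.default_rng()
    valid = np.array([is_physically_valid(state, a) for a in actions])
    masked = np.where(valid, logits, -np.inf)      # invalid actions get zero probability
    probs = np.exp(masked - masked[valid].max())
    probs /= probs.sum()
    return rng.choice(len(actions), p=probs)
```

Because the mask is recomputed at every step from the current partial structure, invalid placements are never sampled, which is the sense in which masking both guides learning and keeps deployment violation-free.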
[C87] Automating Robot Failure Recovery Using Vision-Language Models With Optimized Prompts
Hongyi Chen, Yunchao Yao, Ruixuan Liu, Changliu Liu and Jeffrey Ichnowski
American Control Conference, 2025
Abstract:
Current robot autonomy struggles to operate beyond the assumed Operational Design Domain (ODD), the specific set of conditions and environments in which the system is designed to function, while the real world is rife with uncertainties that may lead to failures. Automating recovery remains a significant challenge. Traditional methods often rely on human intervention to manually address failures or require exhaustive enumeration of failure cases and the design of specific recovery policies for each scenario, both of which are labor-intensive. Foundational Vision-Language Models (VLMs), which demonstrate remarkable common-sense generalization and reasoning capabilities, have broader, potentially unbounded ODDs. However, limited spatial reasoning remains a common challenge for many VLMs when applied to robot control and motion-level error recovery. In this paper, we investigate how optimizing visual and text prompts can enhance the spatial reasoning of VLMs, enabling them to function effectively as black-box controllers for both motion-level position correction and task-level recovery from unknown failures. Specifically, the optimizations include identifying key visual elements in visual prompts, highlighting these elements in text prompts for querying, and decomposing the reasoning process for failure detection and control generation. In experiments, prompt optimizations significantly outperform pre-trained Vision-Language-Action Models in correcting motion-level position errors and improve accuracy by 65.78% compared to VLMs with unoptimized prompts. Additionally, for task-level failures, optimized prompts enhance the success rate by 5.8%, 5.8%, and 7.5% in VLMs' abilities to detect failures, analyze issues, and generate recovery plans, respectively, across a wide range of unknown errors in Lego assembly.
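The decomposed reasoning pipeline (detect the failure, analyze it, then generate a correction) can be illustrated with a rough sketch. `query_vlm` is a hypothetical stand-in for any VLM API, and the prompt text is illustrative only, not the optimized prompts from the paper.

```python
def query_vlm(image, prompt):
    """Hypothetical VLM call; replace with a real API client."""
    raise NotImplementedError

def recover_from_failure(image, task_description, key_elements):
    """Decomposed reasoning: detect -> analyze -> plan, one query per step.

    key_elements : the visual elements highlighted in both the visual and
                   text prompts (e.g., gripper, target brick).
    """
    context = f"Task: {task_description}. Focus on: {', '.join(key_elements)}."

    detected = query_vlm(image, context + " Has the task failed? Answer yes or no.")
    if detected.strip().lower().startswith("no"):
        return None  # no recovery needed

    analysis = query_vlm(image, context + " Describe what went wrong and where.")
    plan = query_vlm(
        image,
        context + f" Failure analysis: {analysis}. "
        "Give a recovery action as a relative position correction (dx, dy, dz in mm).")
    return plan
```

Splitting detection, analysis, and control generation into separate queries is the decomposition the abstract refers to; each step can then carry only the visual elements it needs.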
[U] APEX-MR: Multi-Robot Asynchronous Planning and Execution for Cooperative Assembly
Philip Huang, Ruixuan Liu, Shobhit Aggarwal, Changliu Liu and Jiaoyang Li
arXiv preprint arXiv:2503.15836, 2025
Abstract:
Compared to a single-robot workstation, a multi-robot system offers several advantages: 1) it expands the system's workspace, 2) it improves task efficiency, and, more importantly, 3) it enables robots to achieve significantly more complex and dexterous tasks, such as cooperative assembly. However, coordinating the tasks and motions of multiple robots is challenging due to issues such as system uncertainty, task efficiency, algorithm scalability, and safety. To address these challenges, this paper studies multi-robot coordination and proposes APEX-MR, an asynchronous planning and execution framework designed to safely and efficiently coordinate multiple robots for cooperative assembly, e.g., LEGO assembly. In particular, APEX-MR provides a systematic approach to post-process multi-robot task and motion plans to enable robust asynchronous execution under uncertainty. Experimental results demonstrate that APEX-MR can significantly speed up the execution of many long-horizon LEGO assembly tasks, by 48% compared to sequential planning and 36% compared to synchronous planning on average. To further demonstrate the performance, we deploy APEX-MR on a dual-arm system to perform physical LEGO assembly. To our knowledge, this is the first robotic system capable of performing customized LEGO assembly using commercial LEGO bricks. The experimental results demonstrate that the dual-arm system, with APEX-MR, can safely coordinate robot motions, collaborate efficiently, and construct complex LEGO structures.
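One way to picture asynchronous execution under precedence constraints is the sketch below. It is an assumption about the general mechanism rather than APEX-MR itself: each robot starts its next action as soon as the specific actions it depends on have finished, instead of waiting at a global synchronization barrier.

```python
def asynchronous_dispatch(plans, deps):
    """Simulate asynchronous execution of per-robot plans under precedence constraints.

    plans : dict robot -> list of (action_name, duration)
    deps  : dict (robot, index) -> set of (robot, index) that must finish first
    Returns the finish time of every action, keyed by (robot, index).
    """
    finish = {}
    next_idx = {r: 0 for r in plans}
    robot_free = {r: 0.0 for r in plans}

    # Repeatedly start any action whose predecessors have all finished.
    while any(next_idx[r] < len(plans[r]) for r in plans):
        progressed = False
        for r in plans:
            i = next_idx[r]
            if i >= len(plans[r]):
                continue
            pre = deps.get((r, i), set())
            if all(p in finish for p in pre):
                start = max([robot_free[r]] + [finish[p] for p in pre])
                _, duration = plans[r][i]
                finish[(r, i)] = start + duration
                robot_free[r] = start + duration
                next_idx[r] = i + 1
                progressed = True
        if not progressed:
            raise RuntimeError("Cyclic or unsatisfiable dependencies")
    return finish
```

In this toy model only the true dependencies gate progress, whereas synchronous execution would make every robot wait at each step; that difference is the intuition behind the reported speedup over synchronous planning.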
[U] Eye-in-Finger: Smart Fingers for Delicate Assembly and Disassembly of LEGO
Zhenran Tang, Ruixuan Liu and Changliu Liu
arXiv preprint arXiv:2503.06848, 2025
Abstract:
Manipulation and insertion of small, tight-toleranced objects in robotic assembly remain a critical challenge for vision-based robotic systems due to the required precision and the cluttered environment. Conventional global or wrist-mounted cameras often suffer from occlusions when assembling onto or disassembling from an existing structure. To address this challenge, this paper introduces "Eye-in-Finger", a novel tool design approach that enhances robotic manipulation by embedding low-cost, high-resolution perception directly at the tool tip. We validate our approach using LEGO assembly and disassembly tasks, which require the robot to manipulate in a cluttered environment and, due to the tight tolerances, achieve sub-millimeter accuracy and robust error correction. Experimental results demonstrate that the proposed system enables real-time, fine corrections of alignment errors, increasing the tolerable calibration error from 0.4 mm to up to 2.0 mm for the LEGO manipulation robot.
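A rough sketch of the kind of closed-loop fine correction a tool-tip camera enables is given below; `detect_offset_mm`, the gain, and the tolerance are hypothetical placeholders, not the paper's perception or calibration pipeline.

```python
def detect_offset_mm(image):
    """Hypothetical perception call: return the (dx, dy) offset in mm of the
    target stud/socket from the tool axis, estimated from the fingertip camera."""
    raise NotImplementedError

def correct_alignment(get_image, move_relative, tol_mm=0.2, gain=0.7, max_iters=20):
    """Closed-loop fine correction using a tool-tip camera.

    get_image     : callable returning the current fingertip image
    move_relative : callable(dx_mm, dy_mm) commanding a small lateral correction
    Returns True once the residual offset is within tolerance.
    """
    for _ in range(max_iters):
        dx, dy = detect_offset_mm(get_image())
        if abs(dx) < tol_mm and abs(dy) < tol_mm:
            return True                        # aligned within tolerance
        move_relative(-gain * dx, -gain * dy)  # proportional correction step
    return False
```

Because the camera sits at the tool tip, the measured offset stays observable even when the structure occludes a global or wrist-mounted view, which is what makes such a correction loop feasible.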
2024
[J23] GUARD: A Safe Reinforcement Learning Benchmark
Weiye Zhao, Rui Chen, Yifan Sun, Ruixuan Liu, Tianhao Wei and Changliu Liu
Transactions on Machine Learning Research, 2024
[J25] Decomposition-based Hierarchical Task Allocation and Planning for Multi-Robots under Hierarchical Temporal Logic Specifications
Xusheng Luo, Shaojun Xu, Ruixuan Liu and Changliu Liu
IEEE Robotics and Automation Letters, 2024
Abstract:
Past research on robotic planning with temporal logic specifications, notably Linear Temporal Logic (LTL), was largely based on a single formula for individual robots or groups of robots. But as task complexity increases, LTL formulas unavoidably grow lengthy, complicating interpretation and specification generation and straining the computational capacity of planners. A recent development is the hierarchical representation of LTL (Luo et al., 2024), which contains multiple temporal logic specifications and provides a more interpretable framework. However, the associated planning algorithm assumes the independence of robots within each specification, limiting its application to multi-robot coordination with complex temporal constraints. In this work, we formulate a decomposition-based hierarchical framework. At the high level, each specification is first decomposed into a set of atomic sub-tasks. We further infer the temporal relations among the sub-tasks of different specifications to construct a task network. Subsequently, a Mixed Integer Linear Program (MILP) is used to assign sub-tasks to the robots. At the low level, domain-specific controllers are employed to execute the sub-tasks. Our approach is experimentally applied to navigation and manipulation domains. Simulations demonstrate that it finds better solutions in less runtime.
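As a concrete illustration of the sub-task assignment step, the sketch below solves a bare assignment MILP with SciPy. It ignores the temporal relations in the task network and uses made-up costs, so it is an assumption-laden simplification of the formulation, not the paper's program.

```python
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

def assign_subtasks(cost):
    """Assign each sub-task to exactly one robot, minimizing total cost.

    cost : (T, R) array, cost[t, r] = cost of robot r executing sub-task t.
    Returns a length-T array of robot indices.
    """
    T, R = cost.shape
    c = cost.ravel()  # decision vector x is the flattened (T, R) assignment matrix

    # Each sub-task goes to exactly one robot: sum_r x[t, r] == 1 for every t
    A = np.zeros((T, T * R))
    for t in range(T):
        A[t, t * R:(t + 1) * R] = 1.0
    one_robot_per_task = LinearConstraint(A, lb=1, ub=1)

    res = milp(c, constraints=[one_robot_per_task],
               integrality=np.ones(T * R), bounds=Bounds(0, 1))
    x = np.round(res.x).reshape(T, R)
    return x.argmax(axis=1)

# Example: 3 sub-tasks, 2 robots (illustrative costs)
print(assign_subtasks(np.array([[1.0, 3.0], [2.0, 1.0], [5.0, 2.0]])))  # -> [0 1 1]
```

A full formulation would add variables and constraints encoding the inferred temporal relations between sub-tasks; the point here is only the structure of the binary assignment variables.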
[J27] StableLego: Stability Analysis of Block Stacking Assembly
Ruixuan Liu, Kangle Deng, Ziwei Wang and Changliu Liu
IEEE Robotics and Automation Letters, 2024
[C70] Real-time Safety Index Adaptation for Parameter-varying Systems via Determinant Gradient Ascend
Rui Chen, Weiye Zhao, Ruixuan Liu, Weiyang Zhang and Changliu Liu
American Control Conference, 2024
[C76] A Lightweight and Transferable Design for Robust LEGO Manipulation
Ruixuan Liu, Yifan Sun and Changliu Liu
International Symposium on Flexible Automation, 2024
[U] Robustifying Long-term Human-Robot Collaboration through a Hierarchical and Multimodal Framework
Peiqi Yu, Abulikemu Abuduweili, Ruixuan Liu and Changliu Liu
arXiv preprint arXiv:2411.15711, 2024
Abstract:
Long-term Human-Robot Collaboration (HRC) is crucial for enabling flexible manufacturing systems and for integrating companion robots into daily human environments over extended periods. This paper identifies several key challenges for such collaborations, including accurate recognition of human plans, robustness to disturbances, operational efficiency, adaptability to diverse user behaviors, and sustained human satisfaction. To address these challenges, we model the long-term HRC task with a hierarchical task graph and present a novel multimodal, hierarchical framework that enables robots to better assist humans in advancing on the task graph. In particular, the proposed multimodal framework integrates visual observations with speech commands to facilitate intuitive and flexible human-robot interaction. Additionally, our hierarchical designs for both human pose detection and plan prediction allow a better understanding of human behaviors and significantly enhance system accuracy, robustness, and flexibility. Moreover, an online adaptation mechanism enables real-time adjustment to diverse user behaviors. We deploy the proposed framework on a KINOVA GEN3 robot and conduct extensive user studies on real-world long-term HRC assembly scenarios. Experimental results show that our approach reduces task completion time by 15.9% and achieves an average task success rate of 91.8% and an overall user satisfaction score of 84% in long-term HRC tasks, showcasing its applicability to enhancing real-world long-term HRC.
[U] Nl2Hltl2Plan: Scaling Up Natural Language Understanding for Multi-Robots Through Hierarchical Temporal Logic Task Representation
Shaojun Xu, Xusheng Luo, Yutong Huang, Letian Leng, Ruixuan Liu and Changliu Liu
arXiv preprint arXiv:2408.08188, 2024
Abstract:
To enable non-experts to specify long-horizon, multi-robot collaborative tasks, language models are increasingly used to translate natural language commands into formal specifications. However, because translation can occur in multiple ways, such translations may lack accuracy or lead to inefficient multi-robot planning. Our key insight is that concise hierarchical specifications can simplify planning while remaining straightforward to derive from human instructions. We propose Nl2Hltl2Plan, a framework that translates natural language commands into hierarchical Linear Temporal Logic (LTL) and solves the corresponding planning problem. The translation involves two steps leveraging Large Language Models (LLMs). First, an LLM transforms instructions into a Hierarchical Task Tree, capturing logical and temporal relations. Next, a fine-tuned LLM converts sub-tasks into flat LTL formulas, which are aggregated into hierarchical specifications, with the lowest level corresponding to ordered robot actions. These specifications are then used with off-the-shelf planners. Our Nl2Hltl2Plan demonstrates the potential of LLMs in hierarchical reasoning for multi-robot task planning. Evaluations in simulation and real-world experiments with human participants show that Nl2Hltl2Plan outperforms existing methods, handling more complex instructions while achieving higher success rates and lower costs in task allocation and planning.
2023
[C62] Proactive human-robot co-assembly: Leveraging human intention prediction and robust safe control
Ruixuan Liu, Rui Chen, Abulikemu Abuduweili and Changliu Liu
IEEE Conference on Control Technology and Applications, 2023
Abstract:
Human-robot collaboration (HRC) is a key component of flexible manufacturing that meets the diverse needs of customers. However, it is difficult to build intelligent robots that proactively assist humans in a safe and efficient way, for several reasons. First, it is challenging to achieve efficient collaboration due to diverse human behaviors and data scarcity. Second, it is difficult to ensure interactive safety due to uncertainty in human behaviors. This paper presents an integrated framework for proactive HRC. A robust intention prediction module, which leverages prior task information and human-in-the-loop training, is learned to guide the robot toward efficient collaboration. The proposed framework also uses robust safe control to ensure interactive safety under uncertainty. The developed framework is applied to a co-assembly task on a Kinova Gen3 robot. The experiments demonstrate that our solution is robust to environmental changes as well as different human preferences and behaviors, improves task efficiency by approximately 15-20%, and guarantees interactive safety during proactive collaboration.
[C63] Zero-shot Transferable and Persistently Feasible Safe Control for High Dimensional Systems by Consistent Abstraction
Tianhao Wei, Shucheng Kang, Ruixuan Liu and Changliu Liu
IEEE Conference on Decision and Control, 2023
[W] Robotic LEGO Assembly and Disassembly from Human Demonstration
Ruixuan Liu, Yifan Sun and Changliu Liu
ACC Workshop on Recent Advancement of Human Autonomy Interaction and Integration, 2023
[U] Simulation-aided Learning from Demonstration for Robotic LEGO Construction
Ruixuan Liu, Alan Chen, Xusheng Luo and Changliu Liu
arXiv preprint arXiv:2309.11010, 2023
2022
[C42] Safe Interactive Industrial Robots using Jerk-based Safe Set Algorithm
Ruixuan Liu, Rui Chen and Changliu Liu
International Symposium on Flexible Automation, 2022
Abstract:
The need to increase the flexibility of production lines calls for robots that collaborate with human workers. However, existing interactive industrial robots only guarantee intrinsic safety (reducing collision impact), not interactive safety (collision avoidance), which greatly limits their flexibility. The issue arises from two limitations of existing control software for industrial robots: 1) lack of support for real-time trajectory modification; and 2) lack of intelligent safe control algorithms with guaranteed collision avoidance under robot dynamics constraints. The first issue was addressed by a previously developed jerk-bounded position controller (JPC). This paper addresses the second limitation, building on top of the JPC. Specifically, we introduce a jerk-based safe set algorithm (JSSA) to ensure collision avoidance while respecting the robot dynamics constraints. The JSSA greatly extends the scope of the original safe set algorithm, which had only been applied to second-order systems with unbounded accelerations. The JSSA is implemented on the FANUC LR Mate 200id/7L robot and validated on HRI tasks. Experiments show that the JSSA consistently keeps the robot at a safe distance from the human while executing the designated task.
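For intuition, the sketch below shows the safe set idea on a first-order kinematic point with a single distance constraint: when the safety index is violated, the nominal command is minimally modified so that the index decreases. The actual JSSA operates on jerk-bounded, third-order robot dynamics, so this is only an illustrative simplification with assumed parameters.

```python
import numpy as np

def safe_set_filter(p, p_obs, u_nominal, d_min=0.3, eta=0.1):
    """Project a nominal velocity command onto the safe half-space defined by a
    safety index phi = d_min - d, where d is the distance to the human/obstacle.

    When phi >= 0 (too close), enforce a minimum moving-away speed eta so that
    phi decreases; otherwise pass the nominal command through unchanged.
    """
    diff = p - p_obs
    d = np.linalg.norm(diff)
    phi = d_min - d
    if phi < 0:
        return u_nominal                      # already safe, no modification
    a = diff / d                              # unit vector pointing away from the obstacle
    violation = eta - a @ u_nominal           # constraint: a . u >= eta
    if violation <= 0:
        return u_nominal
    return u_nominal + violation * a          # minimal-norm correction (||a|| = 1)
```

The jerk-based extension replaces this one-shot velocity projection with a constraint on the achievable jerk, so the same "modify the command just enough to make the safety index decrease" logic holds under bounded acceleration and jerk.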
[C44] Jerk-bounded Position Controller with Real-Time Task Modification for Interactive Industrial Robots
Ruixuan Liu, Rui Chen, Yifan Sun, Yu Zhao and Changliu Liu
IEEE/ASME International Conference on Advanced Intelligent Mechatronics, 2022
Abstract:
Industrial robots are widely used in applications with structured and deterministic environments. However, contemporary applications increasingly require industrial robots to operate intelligently in dynamic environments. Designing a safe and efficient robotic system with industrial robots in a dynamic environment is challenging for several reasons. First, most industrial robots require inputs in specific formats, which takes additional effort to convert from task-level user commands. Second, existing robot drivers do not support overwriting ongoing tasks in real time, which hinders the robot from responding to the dynamic environment. Third, most industrial robots only expose motion-level control, making it challenging to enforce dynamic constraints during trajectory tracking. To resolve these challenges, this paper presents a jerk-bounded position control driver (JPC) for industrial robots. JPC provides a unified interface for tracking complex trajectories and is able to enforce dynamic constraints using motion-level control, without accessing servo-level control. Most importantly, JPC enables real-time trajectory modification: users can overwrite the ongoing task with a new one without violating dynamic constraints. The proposed JPC is implemented and tested on the FANUC LR Mate 200id/7L robot with both artificially generated data and an interactive robot handover task. Experiments show that the proposed JPC tracks complex trajectories accurately within dynamic limits and seamlessly switches to new trajectory references before the ongoing task ends.
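A toy, single-axis illustration of jerk-bounded tracking with real-time reference switching is sketched below; the PD structure, gains, and limits are assumptions for illustration, not the JPC's internals.

```python
import numpy as np

class JerkLimitedTracker:
    """One-axis reference tracker that clamps velocity, acceleration, and jerk.

    The reference can be replaced at any control cycle (real-time task
    modification); the internal state evolves continuously, so switching never
    violates the dynamic limits. Simplified PD-based sketch, not a time-optimal profile.
    """

    def __init__(self, v_max=1.0, a_max=2.0, j_max=10.0, kp=25.0, kd=10.0, dt=0.008):
        self.v_max, self.a_max, self.j_max = v_max, a_max, j_max
        self.kp, self.kd, self.dt = kp, kd, dt
        self.x = self.v = self.a = 0.0
        self.x_ref = 0.0

    def set_reference(self, x_ref):
        """Overwrite the ongoing target; safe to call at any time."""
        self.x_ref = x_ref

    def step(self):
        """Advance one control cycle under velocity/acceleration/jerk limits."""
        a_des = self.kp * (self.x_ref - self.x) - self.kd * self.v
        a_des = np.clip(a_des, -self.a_max, self.a_max)
        jerk = np.clip((a_des - self.a) / self.dt, -self.j_max, self.j_max)
        self.a += jerk * self.dt
        self.v = np.clip(self.v + self.a * self.dt, -self.v_max, self.v_max)
        self.x += self.v * self.dt
        return self.x
```

Because the limits are enforced at every cycle, calling `set_reference` mid-motion is always dynamically feasible, which mirrors the real-time modification property described above.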
[C50] Task-agnostic Adaptation for Safe Human-robot Handover
Ruixuan Liu, Rui Chen and Changliu Liu
IFAC Workshop on Cyber-Physical Human Systems, 2022. Best Student Paper Award.
2021
[J8] Human Motion Prediction Using Adaptable Recurrent Neural Networks and Inverse Kinematics
Ruixuan Liu and Changliu Liu
IEEE Control Systems Letters, 2021
Abstract:
Human motion prediction, especially arm motion prediction, is critical for facilitating safe and efficient human-robot collaboration (HRC). This letter proposes a novel human motion prediction framework that combines a recurrent neural network (RNN) and inverse kinematics (IK) to predict human arm motion. A modified Kalman filter (MKF) is applied to adapt the model online. The proposed framework is tested on collected human motion data with up to a 2 s prediction horizon. The experiments demonstrate that the proposed method improves the prediction accuracy by approximately 14% compared to the state of the art on seen situations. It stably adapts to unseen situations by keeping the maximum prediction error under 4 cm, which is 70% lower than other methods. Moreover, it is robust when the arm is partially occluded: the wrist prediction remains unchanged, while the elbow prediction exhibits 20% less variation.
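The online-adaptation idea can be sketched with a generic recursive least squares (Kalman-style) update of a predictor's output layer. This is not the paper's modified Kalman filter, and the RNN and IK stages are abstracted away, so treat it purely as an illustration of adapting prediction weights from streaming observations.

```python
import numpy as np

class AdaptiveOutputLayer:
    """Online adaptation of a predictor's linear output layer with a
    Kalman-filter-style (recursive least squares) update.

    The upstream feature extractor (e.g., an RNN over past arm poses) is
    abstracted as a feature vector; only the output weights W adapt online.
    """

    def __init__(self, n_features, n_outputs, p0=1.0, meas_noise=0.01):
        self.W = np.zeros((n_outputs, n_features))
        self.P = p0 * np.eye(n_features)   # weight-uncertainty covariance
        self.r = meas_noise                # assumed measurement noise

    def predict(self, features):
        return self.W @ features

    def update(self, features, target):
        """Correct W after observing the true next pose (`target`)."""
        h = features
        s = h @ self.P @ h + self.r        # innovation variance
        k = (self.P @ h) / s               # gain shared across output dimensions
        err = target - self.W @ h
        self.W += np.outer(err, k)
        self.P -= np.outer(k, h @ self.P)
```

In the paper's pipeline, the adapted predictor outputs wrist motion and IK recovers the rest of the arm; here only the generic recursive weight update is shown.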
[W] IADA: Iterative Adversarial Data Augmentation Using Formal Verification and Expert Guidance
Ruixuan Liu and Changliu Liu
ICML Workshop on Human in the Loop Learning, 2021
Abstract:
Neural networks (NNs) are widely used for classification tasks because of their remarkable performance. However, the robustness and accuracy of NNs heavily depend on the training data, and in many applications massive training data is not available. To address the challenge, this paper proposes an iterative adversarial data augmentation (IADA) framework to learn neural network models from an insufficient amount of training data. The method uses formal verification to identify the most "confusing" input samples and leverages human guidance to safely and iteratively augment the training data with these samples. The proposed framework is applied to an artificial 2D dataset, the MNIST dataset, and a human motion dataset. By applying IADA to fully-connected NN classifiers, we show that our training method improves the robustness and accuracy of the learned models. Compared to regular supervised training, the average perturbation bound on the MNIST dataset improved by 107.4%, and the classification accuracy improved by 1.77%, 3.76%, and 10.85% on the 2D dataset, the MNIST dataset, and the human motion dataset, respectively.
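The iterative loop reads roughly as follows; `find_confusing` (standing in for the formal verifier) and `ask_human` (the expert labeling step) are hypothetical placeholders for the components described above, so this is a skeleton of the procedure rather than the paper's implementation.

```python
def iada_train(model, train_x, train_y, find_confusing, ask_human, n_rounds=5):
    """Skeleton of an iterative adversarial data augmentation loop.

    model          : object with fit(X, y) and predict(X)
    find_confusing : callable(model, X) -> candidate inputs near the decision
                     boundary (e.g., produced by a formal verifier); hypothetical
    ask_human      : callable(samples) -> expert-provided labels; hypothetical
    """
    X, y = list(train_x), list(train_y)
    for _ in range(n_rounds):
        model.fit(X, y)
        candidates = find_confusing(model, X)
        if len(candidates) == 0:
            break                        # nothing confusing left to label
        labels = ask_human(candidates)   # human guidance keeps augmentation safe
        X.extend(candidates)
        y.extend(labels)
    return model
```

Each round grows the training set with verified-confusing, human-labeled samples, which is how the method trades a small amount of expert effort for improved robustness on scarce data.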