Research Overview: Safe, Agile, and Interactive Robotics
The ultimate goal of our research is to develop provably safe autonomous robotic systems that can adapt to and interact with the world in the way that human beings do, so that they can better serve, assist, and collaborate with people in their daily lives across work, home, and leisure. Our current research focuses on advanced robotics for manufacturing toward Manufacturing 5.0. Our fundamental research question is: how do we design the behaviors of these robots, and verify the safety of the design, so that they may operate safely, agilely, and interactively in dynamic, uncertain, and human-involved environments?
Broadly speaking, our research efforts are devoted to deepening the level of autonomy of robotic systems. Although robots already act autonomously in many situations (e.g., welding robots on production lines), their autonomy rarely extends beyond their operational design domain (e.g., a robot dog can hardly navigate on a floating boat if it was not trained for it). Re-purposing these robots requires significant human effort to collect data in new environments and to tune the robot behavior to meet the specifications.
In the manufacturing world, the people who do this work are called integrators, and their services are often far more expensive than the robots themselves. Our vision for future robotics is that, with increased levels of autonomy, future robots will be able to challenge the assumptions they were designed or trained under, mitigate the wrong ones, continuously improve themselves through interaction with their environments, and effectively communicate their needs to the humans they work with.
Our research philosophy is to leverage mathematical and formal analysis to address real-world problems. Our team has been tackling our fundamental research question by creating formal mathematical models and developing novel algorithms to ensure that our robots are agile and provably safe for safety-critical applications. To support our vision for future robotics, we plan to work on the following topics:
- developing new theory and algorithms for lifelong safety assurance of robotic systems in changing environments;
- developing new theory and algorithms to equip the robots with the ability to efficiently learn agile skills to perform various manipulation and locomotion tasks;
- developing new hardware-software integrated solutions to enhance interactivity between robots and humans.
The first pillar of our research is to design efficient algorithms to formally synthesize and verify robot controllers in changing environments (e.g., changing tasks or changing human subjects). We consider the synthesis and verification of both the control policy and the control certificate with respect to specifications encoded as state-wise constraints (e.g., for collision avoidance) or more expressive temporal logic constraints.
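To make the two classes of specifications concrete, here is a schematic example in standard notation (the symbols are illustrative and not tied to any specific paper of ours): a state-wise constraint must hold at every time step, whereas a temporal logic specification can additionally encode requirements such as eventually reaching a goal.

```latex
% State-wise safety constraint: remain in the safe set at every time step.
x_t \in \mathcal{X}_{\mathrm{safe}} \quad \forall t \ge 0
% A temporal logic specification: always keep a minimum distance to the human
% while eventually reaching the goal region.
\varphi \;=\; \Box\,\big(d(x_t, x_t^{\mathrm{human}}) \ge d_{\min}\big) \;\wedge\; \Diamond\,\big(x_t \in \mathcal{X}_{\mathrm{goal}}\big)
```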
Accomplishments
As part of our NSF CAREER project, we have developed a series of methods to synthesize safety controllers and safety certificates with respect to state-wise constraints. Building upon Changliu’s PhD work on safe set algorithms (Liu & Tomizuka, 2014), (Liu & Tomizuka, 2015), we studied a series of safety index synthesis (SIS) methods. A safety index is a scalar function that evaluates the safety potential of a state while accounting for various uncertainties and constraints; it can be viewed as a nonlinear variant of high-order control barrier functions, and the differences between the two are discussed in (Wei & Liu, 2019). A well-synthesized safety index can be used to constrain the robot action so that state-wise safety constraints are satisfied, resulting in either explicit safe controllers (Ma et al., 2022) or implicit safe controllers (e.g., projection-based safety monitors such as CBF-QP); a minimal sketch of the implicit route is shown below.
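As an illustration of the implicit (projection-based) route, the sketch below projects a reference command onto the set of safe controls for a control-affine system with a single safety-index constraint. The function and variable names are ours, and the single-constraint, closed-form setting is a simplification of the cited methods.

```python
import numpy as np

def safety_monitor(u_ref, phi_val, grad_phi, f_x, g_x, eta=0.1):
    """Minimal projection-based safety monitor (a sketch, not the exact published
    algorithm). For control-affine dynamics x_dot = f(x) + g(x) u, the safety
    index phi must satisfy phi_dot <= -eta whenever phi(x) >= 0, i.e.
        grad_phi . (f(x) + g(x) u) <= -eta.
    This is one affine constraint a^T u <= b, so the Euclidean projection of the
    reference command u_ref onto it has a closed form."""
    if phi_val < 0:
        return u_ref                      # state is safely inside the level set
    a = g_x.T @ grad_phi                  # constraint normal in control space
    b = -eta - grad_phi @ f_x             # constraint offset
    slack = a @ u_ref - b
    if slack <= 0:
        return u_ref                      # reference command is already safe
    return u_ref - (slack / (a @ a)) * a  # minimally modify u_ref to satisfy it

# Hypothetical 2-D single integrator (f = 0, g = I) heading toward an obstacle.
u_safe = safety_monitor(u_ref=np.array([1.0, 0.0]),
                        phi_val=0.2,
                        grad_phi=np.array([0.8, 0.6]),
                        f_x=np.zeros(2),
                        g_x=np.eye(2))
```

In practice such a projection is typically solved as a quadratic program when there are multiple constraints or control limits; the closed form above only covers the single-constraint case.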
We started with low-dimensional analytical parameterizations of the safety index and studied rule-based approaches (Zhao et al., 2021), (Zhao et al., 2023), evolutionary optimization-based approaches (Wei & Liu, 2022), (Wei et al., 2022), constrained reinforcement learning-based approaches (Ma et al., 2022), as well as sum-of-squares programming-based approaches (Zhao et al., 2023), (Chen et al., 2024) for SIS. These works enabled us to safely control robots interacting with humans in close proximity (Liu et al., 2022). These approaches come with strong guarantees (largely thanks to the low-dimensional parameterization) at the cost of limited expressiveness (or degrees of freedom), and hence may result in conservative robot actions.
We are currently investigating a broader class of safety indices that are encoded in deep neural networks. We have introduced the first neural safety index synthesis approach based on adversarial optimization, which scales beyond 10-dimensional problems (Liu et al., 2022) and results in less conservative safe control; a simplified sketch of the adversarial training loop is shown below. Nevertheless, like most neural-network-based approaches, it does not come with guarantees on its own, which necessitates the study of formal synthesis and formal verification methods.
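The following is a much-simplified sketch of the adversarial training idea (not the exact algorithm of (Liu et al., 2022)): a neural safety index is trained to satisfy the decrease and validity conditions on sampled states, while an inner adversary performs gradient ascent on the states to find counterexamples. The 2-D single-integrator dynamics, box-bounded controls, disc-shaped unsafe region, and loss terms are all illustrative assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative setting: 2-D single integrator x_dot = u with |u_i| <= U_MAX,
# and a disc of radius R around the origin as the unsafe region.
STATE_DIM, U_MAX, R, ETA = 2, 1.0, 0.5, 0.1

phi = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.Tanh(),
                    nn.Linear(64, 64), nn.Tanh(),
                    nn.Linear(64, 1))
opt = torch.optim.Adam(phi.parameters(), lr=1e-3)

def violation(x):
    """Per-state violation of the safety-index conditions.
    For x_dot = u, the best achievable decrease of phi is
    min_{|u| <= U_MAX} grad_phi . u = -U_MAX * ||grad_phi||_1."""
    x = x.requires_grad_(True)
    p = phi(x)
    grad = torch.autograd.grad(p.sum(), x, create_graph=True)[0]
    best_decrease = -U_MAX * grad.abs().sum(dim=1, keepdim=True)
    # (i) Feasibility: phi can be made to decrease fast enough whenever phi >= 0.
    feasibility = torch.relu(best_decrease + ETA) * (p >= 0).float()
    # (ii) Validity: phi must be positive inside the unsafe disc.
    unsafe = (x.norm(dim=1, keepdim=True) < R).float()
    validity = torch.relu(-p) * unsafe
    return feasibility + validity

for it in range(2000):
    x = (torch.rand(256, STATE_DIM) - 0.5) * 4.0      # states sampled in a box
    # Adversary: a few signed-gradient ascent steps to find counterexample states.
    x_adv = x.clone()
    for _ in range(5):
        g = torch.autograd.grad(violation(x_adv).sum(), x_adv)[0]
        x_adv = (x_adv + 0.05 * g.sign()).detach()
    # Learner: minimize the violation on both random and adversarial states.
    loss = violation(torch.cat([x, x_adv])).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```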
In addition, we have started to study safety index adaptation (SIA) under changing environmental conditions, e.g., when a robot dog suddenly carries a heavy object and can no longer turn at obstacles as quickly as it used to. We introduced the first real-time adaptation method, which performs gradient ascent on a sum-of-squares-synthesized safety index (Chen et al., 2024); a generic sketch of such an adaptation loop is given below.
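Purely for illustration (the actual method in (Chen et al., 2024) uses a determinant-based objective on a sum-of-squares-synthesized index), the snippet below shows the general shape of such a loop: when the dynamics parameters change, a scalar safety-index parameter is nudged by gradient ascent on a user-supplied feasibility margin. Both the parameterization and the `feasibility_margin` callback are hypothetical.

```python
def adapt_safety_index(theta, feasibility_margin, steps=20, lr=0.05, eps=1e-3):
    """Gradient-ascent adaptation of a scalar safety-index parameter theta.
    feasibility_margin(theta) is a hypothetical callback returning the worst-case
    slack of the decrease condition under the *current* (changed) dynamics;
    larger is better, and a positive value means the index is still feasible."""
    for _ in range(steps):
        grad = (feasibility_margin(theta + eps)
                - feasibility_margin(theta - eps)) / (2.0 * eps)  # finite difference
        theta += lr * grad
    return theta
```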
What’s Next?
Neural safety indices and neural safety controllers not only achieve better expressiveness than their analytical counterparts, but are also more agnostic to the system dynamics (e.g., the system does not need to satisfy continuity assumptions) and to the specifications (e.g., they can generalize from state-wise constraints to temporal logic constraints). With the development of meta-learning and continual learning methods, we hypothesize that they could also be easier to adapt in changing environments. Building upon our prior work, we are tackling the following open problems from the perspectives of both formal synthesis (i.e., synthesis with guarantees) and formal verification (i.e., generating mathematical proofs):
- How to obtain provably safe neural safety indices and neural safety controllers with respect to diverse forms of specifications?
- How to adapt neural safety indices and neural safety controllers and preserve the guarantees?
The resulting algorithms and tools will all be evaluated on and integrated into the GUARD benchmark that our team developed. GUARD is a state-of-the-art safe learning and control benchmark that contains a variety of robot embodiments, tasks, and safety specifications, together with implementations of safe learning and control algorithms (Zhao et al., 2024).
The second pillar of our research is to design efficient algorithms that enable robots to learn agile manipulation and locomotion skills so that they can better serve, assist, and collaborate with people. Our argument is that the best robot embodiment for humans to work with is a humanoid robot, since humans already have rich experience working with other humans.
With the huge advancements in humanoid hardware in recent years, it is the perfect time to study human-humanoid interaction and collaboration. Nevertheless, before humanoid robots can truly work alongside humans, they need to become agile in both manipulation and locomotion.
Accomplishments
We have investigated convex optimization-based trajectory generation for manipulation and navigation (Liu et al., 2018; Liu & Tomizuka, 2017), which enables real-time robot motion planning in complex environments. However, these methods require complete models of the environment and do not work well in contact-rich tasks (e.g., delicate assembly), as contact dynamics are hard to model. We then studied learning-based approaches to deal with contact-rich manipulation and locomotion tasks. Facing the huge gap between simulated and real-world dynamics, we explored two routes:
- Sample-efficient learning directly applied on hardware.
- Reinforcement learning with sim-to-real transfer.
For the first route, we achieved several successes in manipulation tasks, in particular, electronic assembly (Chen et al., 2022) and Lego assembly (Liu et al., 2024). We leveraged hardware design (e.g., special fingers) and model-based control techniques to reduce the number of parameters to learn, and then used evolutionary optimization to tune those parameters directly on hardware; a minimal sketch of this hardware-in-the-loop search is shown below. For the second route, we achieved several successes in locomotion tasks, enabling robot dogs to run fast and safely, approaching hardware limits (He et al., 2024), and humanoids to track human motions in real time (missing reference).
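To convey the structure of the first route, here is a minimal evolutionary-search loop of the kind that can be run directly on hardware. The published work uses CMA-ES; this simpler (mu, lambda)-style sketch, with a hypothetical `evaluate_on_hardware` rollout function, is only meant to illustrate the idea of optimizing a handful of controller parameters from real rollouts.

```python
import numpy as np

def evolve_on_hardware(evaluate_on_hardware, theta0, sigma=0.1,
                       pop=8, elites=3, iters=20, seed=0):
    """Evolutionary search over a handful of controller parameters (e.g., insertion
    force thresholds, compliance gains). evaluate_on_hardware(theta) is a stand-in
    for one rollout on the real robot returning a scalar score to maximize."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    for _ in range(iters):
        candidates = theta + sigma * rng.standard_normal((pop, theta.size))
        scores = np.array([evaluate_on_hardware(c) for c in candidates])
        best = candidates[np.argsort(scores)[-elites:]]  # keep the top candidates
        theta = best.mean(axis=0)                        # recombine the elites
        sigma *= 0.9                                     # shrink the search radius
    return theta
```

Because each candidate costs a real rollout, keeping the parameter vector small (via hardware design and model-based control) is what makes this route practical.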
The aforementioned work demonstrates significant progress in real-world robot learning by leveraging careful interface design (e.g., state representation selection), reward engineering, and hyperparameter tuning to adapt existing learning algorithms (e.g., CMA-ES in the first route and PPO in the second route). While these results show what thoughtful engineering of existing algorithms can achieve, they also reveal opportunities for advancing the field.
As real-world robot learning problems grow more complex, we see a compelling need for fundamental algorithmic innovations to enhance learning efficiency, adapt to changing environments, and reduce the reliance on extensive manual effort in deployment (e.g., hyperparameter tuning and reward engineering). These advancements will drive the next generation of robot learning systems.
What’s Next?
Our goal is to develop a holistic theoretical foundation and new reinforcement learning algorithms that mimic human learning, while fully accounting for the specifics of manipulation and locomotion tasks, e.g., that they are contact-rich and involve hybrid dynamics. While this is a huge topic, our investigation starts with the following two aspects:
- How to learn with lower variance and higher success rate by best utilizing the data?
Policy-optimization-based reinforcement learning often suffers from high variance during training, which hinders skill mastery. To address this, we propose optimizing a lower probability bound on performance, approximated by the expected return minus a scaled variance; the surrogate objective is sketched after this list. We recently developed Absolute Policy Optimization and its variant, Proximal Absolute Policy Optimization (Zhao et al., 2024), which improve performance (high average return with low variance) and learning efficiency across RL benchmarks, including humanoid robots. The next step is to incorporate safety constraints using state-wise constrained policy optimization (Zhao et al., 2023) to address real-world robot learning challenges.
- How to efficiently learn and evolve in changing environments?
Human learning is continuous, and robots should likewise be able to learn new skills progressively and compose them to handle increasingly complex tasks. To achieve lifelong learning, we need algorithms inspired by the human hippocampus, which plays a crucial role in learning and memory. We have begun applying this idea, introducing long-term and short-term memory for continual learning and adaptation, to tasks such as simultaneous localization and mapping (Yin et al., 2023) and human motion prediction (Abuduweili & Liu, 2021; Abuduweili & Liu, 2023), with promising results; a minimal sketch of such a dual-memory mechanism is given below. Our current focus is on integrating different learning methods (e.g., offline reinforcement learning, imitation learning, and online reinforcement learning) and on balancing memory retention and forgetting for lifelong learning.
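Returning to the first aspect above, the surrogate objective trades off the mean and the variance of the return $G(\tau)$ along trajectories $\tau$ sampled from the policy $\pi_\theta$. The notation below is ours and is a sketch of the idea rather than the exact objective in (Zhao et al., 2024):

```latex
\max_{\theta}\; J_{\mathrm{LB}}(\theta)
  \;\approx\; \mathbb{E}_{\tau \sim \pi_\theta}\!\left[G(\tau)\right]
  \;-\; k\,\operatorname{Var}_{\tau \sim \pi_\theta}\!\left[G(\tau)\right],
  \qquad k > 0,
```

where a larger $k$ places more weight on consistency, so maximizing $J_{\mathrm{LB}}$ favors policies whose returns are both high on average and low in variance.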
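As a minimal illustration of the second aspect (our own simplified stand-in, not the exact mechanism used in the cited papers), a dual-buffer memory combines a short-term buffer of recent samples for fast adaptation with a reservoir-sampled long-term buffer that mitigates catastrophic forgetting:

```python
import random

class DualMemory:
    """Sketch of a long-/short-term memory for online adaptation.
    The short-term buffer holds the most recent samples; the long-term buffer
    keeps a reservoir-sampled summary of everything seen so far."""
    def __init__(self, short_cap=64, long_cap=1024, seed=0):
        self.short, self.long = [], []
        self.short_cap, self.long_cap = short_cap, long_cap
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, sample):
        self.short.append(sample)
        if len(self.short) > self.short_cap:
            self.short.pop(0)                              # drop the oldest sample
        self.seen += 1
        if len(self.long) < self.long_cap:                 # reservoir sampling
            self.long.append(sample)
        elif self.rng.random() < self.long_cap / self.seen:
            self.long[self.rng.randrange(self.long_cap)] = sample

    def replay_batch(self, n=32, recent_ratio=0.5):
        k = int(n * recent_ratio)
        recent = self.rng.sample(self.short, min(k, len(self.short)))
        past = self.rng.sample(self.long, min(n - k, len(self.long)))
        return recent + past                               # mix for one update step
```

Each model update then draws a mixed batch, so the learner adapts quickly to the current situation while still rehearsing older experience.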
The third pillar of our research is to design novel hardware-software integrated solutions to enhance human-robot interaction, so as to lower the barrier for people to program robots and benefit from the technology. We focus on two types of enhancements: better communication (to foster mutual understanding between the human and the robot) and better co-adaptation (to make the robot system evolve together with the human).
Accomplishments
We have extensively studied methods to understand and predict humans by observing their physical motion (Liu & Liu, 2021; Liu et al., 2023), as well as how to use those predictions to shape a robot’s control strategy, taking into account that the robot’s own motion will affect the human’s future responses (Pandya & Liu, 2022; Pandya et al., 2024).
Recognizing the limited bandwidth of physical-motion-based information exchange, we have also investigated dynamic gesture-based (Chen et al., 2023), force-feedback-based (Shek et al., 2023), touch-based (Su et al., 2023), and language-based (Luo et al., 2023) human-to-robot communication.
Nevertheless, as robots gain higher levels of intelligence, their decisions may become increasingly difficult for humans to understand. Our argument is that, in addition to studying how robots can better understand humans, it is critical to enhance robot-to-human communication so as to foster co-adaptation and improve both the safety and agility of the interaction.
What’s Next?
Our goal is to make human-robot interaction (e.g., human-humanoid interaction) as natural as human-human interaction. While it is a broad topic, we are currently focusing on the following two directions:
- How to best exchange information between humans and robots? And what should be communicated?
- How to infer and influence hidden states in a human’s mind?
We aim to improve two-way communication through hardware integration and algorithm design. On the hardware side, we are developing multi-modal systems (e.g., speech, touch, gestures, VR/AR) to enable high-bandwidth, low-cognitive-load interactions. Our recent work OmniH2O (He et al., 2024) demonstrates a flexible multi-modal tele-operation system for humanoid robots. On the algorithm side, we focus on optimizing robot-to-human communication strategies (e.g., what, when, and how to communicate). Preliminary work on strategy explanation has significantly improved collaboration efficiency (Pandya et al., 2024). We are also exploring visual communication of safety certificates to foster trust, and ways for robots to explain both concrete concepts (e.g., next actions) and abstract ones (e.g., safety rationale).
With the overall objective of deepening the level of autonomy of robotic systems, our team is exploring the following use cases in the manufacturing domain, leveraging our fundamental research discussed above.
Intelligent Design and Prototyping
We aim to use generative AI to facilitate product design and prototyping. The key challenge is ensuring that the generated designs are aligned with human preferences and with physics. We are investigating novel algorithms to enhance interactivity and alignment between human designers and generative AI in a variety of assembly design tasks. Read more in Generative Assembly via Bimanual Manipulation.
Delicate Assembly
Delicate assembly, e.g., electronic assembly (Chen et al., 2022), Lego assembly (Liu et al., 2024), and cable assembly, is still heavily done manually. The key challenge for robots to perform these assembly tasks is the lack of agility, e.g., how to coordinate two arms to finish the assembly task (such as one arm for support and one arm for manipulation), how to achieve high precision in the assembly, and how to generalize the assembly skills to arbitrary components. We are investigating novel learning and planning algorithms for intelligent dual-arm delicate assembly. Read more in Generative Assembly via Bimanual Manipulation and 6DoF Robot Assembly Station of Consumer Electronic Production.
Intelligent Surface Finishing
Surface finishing jobs, such as grinding and weld washing, are in high demand in almost all manufacturing sectors. These jobs are challenging for robots because there can be large variations across workpieces, even within the same batch, and experienced human workers must make many nontrivial decisions in real time to finish the job. With the shrinking pool of qualified human workers, there is a pressing need to automate these surface finishing tasks. Our team started tackling this problem in collaboration with several industrial partners in 2019. By integrating our novel safe control and agile compliance control methods (Zhao et al., 2020), (He et al., 2023), (missing reference), we were able to demonstrate a fully autonomous robotic solution for weld washing in the real world. This work is currently being commercialized through Instinct Robotics. Read more in Automatic Onsite Polishing of Large Complex Surfaces by Real Time Planning and Control.
Autonomous Deployment and Maintenance of Robot Systems
We have been fascinated by the idea of containerized manufacturing, where the whole supply chain is compressed into a single shipping container: raw materials go in at one end and packaged products come out at the other. The shipping container can be delivered anywhere on earth, thus minimizing the risks associated with long supply chains. Inside the container, machines and robots do the work, with the robots mostly handling the loading and packaging tasks. To realize this vision of containerized manufacturing, we need to equip the robots with strong self-awareness and decision-making capabilities, as it is almost impossible to send human integrators to debug and repair the robots once the container is shipped. We are collaborating with our industrial partners to deploy a containerized line for mask production, where our team focuses on enabling self-calibration and real-time decision-making capabilities of the robots. Additionally, we are investigating methods that leverage foundation models to facilitate error detection and recovery for these systems. One of our recent works, Meta-Control (Wei et al., 2024), enables foundation models to replace human integrators in the deployment of robot control systems using Socrates’ “art of midwifery” and model-based grounding techniques. This offers a promising way to deepen the level of autonomy while maintaining high system performance.
Human-Robot Collaboration for Fixtureless Manufacturing
With the rise of high-mix, low-volume manufacturing, it is not economical to make expensive fixtures that can only be used for one product. Robots with dexterous grippers can be used to replace fixtures as they can hold all kinds of workpieces for humans to work on. Our team is working with our industrial partners to explore this idea on delicate assembly tasks. While there are many challenges, by actively predicting human intent and displaying robot intent back to the human, we were able to improve the efficiency of human-robot collaboration in these fixtureless production lines (Liu et al., 2023). Read more in Safe Uncaged Industrial Robots.
- [C2] Control in a safe set: Addressing safety in human-robot interactions
Changliu Liu and Masayoshi Tomizuka
Dynamic Systems and Control Conference, 2014
Best Student Paper Finalist
Abstract:
Human-robot interactions (HRI) happen in a wide range of situations. Safety is one of the biggest concerns in HRI. This paper proposes a safe set method for designing the robot controller and offers theoretical guarantees of safety. The interactions are modeled in a multi-agent system framework. To deal with humans in the loop, we design a parameter adaptation algorithm (PAA) to learn the closed loop behavior of humans online. Then a safe set (a subset of the state space) is constructed and the optimal control law is mapped to the set of control which can make the safe set invariant. This algorithm is applied with different safety constraints to both mobile robots and robot arms. The simulation results confirm the effectiveness of the algorithm.
- [C3] Safe exploration: Addressing various uncertainty levels in human robot interactions
Changliu Liu and Masayoshi Tomizuka
American Control Conference, 2015
Abstract:
To address the safety issues in human robot interactions (HRI), a safe set algorithm (SSA) was developed previously. However, during HRI, the uncertainty levels are changing in different phases of the interaction, which is not captured by SSA. A safe exploration algorithm (SEA) is proposed in this paper to address the uncertainty levels in the robot control. To estimate the uncertainty levels online, a learning method in the belief space is developed. A comparative study between SSA and SEA is conducted. The simulation results confirm that SEA can capture the uncertainty reduction behavior which is observed in human-human interactions.
- [J2] Real time trajectory optimization for nonlinear robotic systems: Relaxation and convexification
Changliu Liu and Masayoshi Tomizuka
Systems & Control Letters, 2017
Abstract:
Real time trajectory optimization is critical for robotic systems. Due to nonlinear system dynamics and obstacles in the environment, the trajectory optimization problems are highly nonlinear and non convex, hence hard to be computed online. Liu, Lin and Tomizuka proposed the convex feasible set algorithm (CFS) to handle the non convex optimization in real time by convexification. However, one limitation of CFS is that it will not converge to local optima when there are nonlinear equality constraints. In this paper, the slack convex feasible set algorithm (SCFS) is proposed to handle the nonlinear equality constraints, e.g. nonlinear system dynamics, by introducing slack variables to relax the constraints. The geometric interpretation of the method is discussed. The feasibility and convergence of the SCFS algorithm is proved. It is demonstrated that SCFS performs better than existing non convex optimization methods such as interior-point, active set and sequential quadratic programming, as it requires less computation time and converges faster.
- [J3] The convex feasible set algorithm for real time optimization in motion planning
Changliu Liu, Chung-Yen Lin and Masayoshi Tomizuka
SIAM Journal on Control and Optimization, 2018
Abstract:
With the development of robotics, there are growing needs for real time motion planning. However, due to obstacles in the environment, the planning problem is highly non-convex, which makes it difficult to achieve real time computation using existing non-convex optimization algorithms. This paper introduces the convex feasible set algorithm (CFS) which is a fast algorithm for non-convex optimization problems that have convex costs and non-convex constraints. The idea is to find a convex feasible set for the original problem and iteratively solve a sequence of subproblems using the convex constraints. The feasibility and the convergence of the proposed algorithm are proved in the paper. The application of this method on motion planning for mobile robots is discussed. The simulations demonstrate the effectiveness of the proposed algorithm.
- [C23] Safe Control Algorithms Using Energy Functions: A Unified Framework, Benchmark, and New Directions
Tianhao Wei and Changliu Liu
IEEE Conference on Decision and Control, 2019
Abstract:
Safe autonomy is important in many application domains, especially for applications involving interactions with humans. Existing safe control algorithms are similar to each other in the sense that: they all provide control input to maintain a low value of an energy function that measures safety. In different methods, the energy function is called a potential function, a safety index, or a barrier function. The connections and relative advantages among these methods remain unclear. This paper introduces a unified framework to derive safe control laws using energy functions. We demonstrate how to integrate existing controllers based on potential field method, safe set algorithm, barrier function method, and sliding mode algorithm into this unified framework. In addition to theoretical comparison, this paper also introduces a benchmark which implements and compares existing methods on a variety of problems with different system dynamics and interaction modes. Based on the comparison results, a new method, called the sublevel safe set algorithm, is derived under the unified framework by optimizing the hyperparameters. The proposed algorithm achieves the best performance in terms of safety and efficiency on all benchmark problems.
- [C30] Contact-Rich Trajectory Generation in Confined Environments Using Iterative Convex Optimization
Wei-Ye Zhao, Suqin He, Chengtao Wen and Changliu Liu
Dynamic Systems and Control Conference, 2020
Abstract:
Applying intelligent robot arms in dynamic uncertain environments (i.e., flexible production lines) remains challenging, which requires efficient algorithms for real time trajectory generation. The motion planning problem for robot trajectory generation is highly nonlinear and nonconvex, which usually comes with collision avoidance constraints, robot kinematics and dynamics constraints, and task constraints (e.g., following a Cartesian trajectory defined on a surface and maintain the contact). The nonlinear and nonconvex planning problem is computationally expensive to solve, which limits the application of robot arms in the real world. In this paper, for redundant robot arm planning problems with complex constraints, we present a motion planning method using iterative convex optimization that can efficiently handle the constraints and generate optimal trajectories in real time. The proposed planner guarantees the satisfaction of the contact-rich task constraints and avoids collision in confined environments. Extensive experiments on trajectory generation for weld grinding are performed to demonstrate the effectiveness of the proposed method and its applicability in advanced robotic manufacturing.
- [C38] Model-free Safe Control for Zero-Violation Reinforcement Learning
Weiye Zhao, Tairan He and Changliu Liu
Conference on Robot Learning, 2021
Abstract:
Maintaining safety under adaptation has long been considered to be an important capability for autonomous systems. As these systems estimate and change the ego-model of the system dynamics, questions regarding how to develop safety guarantees for such systems continue to be of interest. We propose a novel robust safe control methodology that uses set-based safety constraints to make a robotic system with dynamical uncertainties safely adapt and operate in its environment. The method consists of designing a scalar energy function (safety index) for an adaptive system with parametric uncertainty and an optimization-based approach for control synthesis. Simulation studies on a two-link manipulator are conducted and the results demonstrate the effectiveness of our proposed method in terms of generating provably safe control for adaptive systems with parametric uncertainty.
- [J7] Robust nonlinear adaptation algorithms for multitask prediction networks
Abulikemu Abuduweili and Changliu Liu
International Journal of Adaptive Control and Signal Processing, 2021
Abstract:
High fidelity behavior prediction of intelligent agents is critical in many applications, which is challenging due to the stochasticity, heterogeneity and time-varying nature of agent behaviors. Prediction models that work for one individual may not be applicable to another. Besides, the prediction model trained on the training set may not generalize to the testing set. These challenges motivate the adoption of online adaptation algorithms to update prediction models in real-time to improve the prediction performance. This paper considers online adaptable multi-task prediction for both intention and trajectory. The goal of online adaptation is to improve the performance of both intention and trajectory predictions with only the feedback of the observed trajectory. We first introduce a generic tau-step adaptation algorithm of the multi-task prediction model that updates the model parameters with the trajectory prediction error in recent tau steps. Inspired by Extended Kalman Filter (EKF), a base adaptation algorithm Modified EKF with forgetting factor (MEKF_tau) is introduced. In order to improve the performance of MEKF_tau, generalized exponential moving average filtering techniques are adopted. Then this paper introduces a dynamic multi-epoch update strategy to effectively utilize samples received in real time. With all these extensions, we propose a robust online adaptation algorithm: MEKF with Moving Average and dynamic Multi-Epoch strategy (MEKF_MA-ME). We empirically study the best set of parameters to adapt in the multi-task prediction model and demonstrate the effectiveness of the proposed adaptation algorithms to reduce the prediction error.
- [J8] Human Motion Prediction Using Adaptable Recurrent Neural Networks and Inverse Kinematics
Ruixuan Liu and Changliu Liu
IEEE Control Systems Letters, 2021
Abstract:
Human motion prediction, especially arm prediction, is critical to facilitate safe and efficient human-robot collaboration (HRC). This letter proposes a novel human motion prediction framework that combines a recurrent neural network (RNN) and inverse kinematics (IK) to predict human arm motion. A modified Kalman filter (MKF) is applied to adapt the model online. The proposed framework is tested on collected human motion data with up to 2 s prediction horizon. The experiments demonstrate that the proposed method improves the prediction accuracy by approximately 14% comparing to the state-of-art on seen situations. It stably adapts to unseen situations by keeping the maximum prediction error under 4 cm, which is 70% lower than other methods. Moreover, it is robust when the arm is partially occluded. The wrist prediction remains the same, while the elbow prediction has 20% less variation.
- [C40] Safe Control with Neural Network Dynamic Models
Tianhao Wei and Changliu Liu
Learning for Dynamics and Control Conference, 2022
Abstract:
Safety is critical in autonomous robotic systems. A safe control law should ensure forward invariance of a safe set (a subset in the state space). It has been extensively studied regarding how to derive a safe control law with a control-affine analytical dynamic model. However, how to formally derive a safe control law with Neural Network Dynamic Models (NNDM) remains unclear due to the lack of computationally tractable methods to deal with these black-box functions. In fact, even finding the control that minimizes an objective for NNDM without any safety constraint is still challenging. In this work, we propose MIND-SIS (Mixed Integer for Neural network Dynamic model with Safety Index Synthesis), the first method to synthesize safe control for NNDM. The method includes two parts: 1) SIS: an algorithm for the offline synthesis of the safety index (also called as a barrier function), which uses evolutionary methods and 2) MIND: an algorithm for online computation of the optimal and safe control signal, which solves a constrained optimization using a computationally efficient encoding of neural networks. It has been theoretically proved that MIND-SIS guarantees forward invariance and finite convergence to a subset of the user-defined safe set. And it has been numerically validated that MIND-SIS achieves safe and optimal control of NNDM. The optimality gap is less than 10−8, and the safety constraint violation is 0.
- [C41] Joint Synthesis of Safety Certificate and Safe Control Policy Using Constrained Reinforcement Learning
Haitong Ma, Changliu Liu, Shengbo Eben Li, Sifa Zheng and Jianyu Chen
Learning for Dynamics and Control Conference, 2022
Best Paper Finalist
Abstract:
Safety is the major consideration in controlling complex dynamical systems using reinforcement learning (RL), where the safety certificates can provide provable safety guarantees. A valid safety certificate is an energy function indicating that safe states are with low energy, and there exists a corresponding safe control policy that allows the energy function to always dissipate. The safety certificates and the safe control policies are closely related to each other and both challenging to synthesize. Therefore, existing learning-based studies treat either of them as prior knowledge to learn the other, limiting their applicability to general systems with unknown dynamics. This paper proposes a novel approach that simultaneously synthesizes the energy-function-based safety certificates and learns the safe control policies with constrained reinforcement learning (CRL). We do not rely on prior knowledge about either a prior control law or a perfect safety certificate. In particular, we formulate a loss function to optimize the safety certificate parameters by minimizing the occurrence of energy increases. By adding this optimization procedure as an outer loop to the Lagrangian-based CRL, we jointly update the policy and safety certificate parameters, and prove that they will converge to their respective local optima, the optimal safe policies and valid safety certificates. Finally, we evaluate our algorithms on multiple safety-critical benchmark environments. The results show that the proposed algorithm learns solidly safe policies with no constraint violation. The validity, or feasibility of synthesized safety certificates is also verified numerically.
- [C44] Jerk-bounded Position Controller with Real-Time Task Modification for Interactive Industrial Robots
Ruixuan Liu, Rui Chen, Yifan Sun, Yu Zhao and Changliu Liu
IEEE/ASME International Conference on Advanced Intelligent Mechatronics, 2022
Abstract:
Industrial robots are widely used in many applications with structured and deterministic environments. However, the contemporary need requires industrial robots to intelligently operate in dynamic environments. It is challenging to design a safe and efficient robotic system with industrial robots in a dynamic environment for several reasons. First, most industrial robots require the input to have specific formats, which takes additional efforts to convert from task-level user commands. Second, existing robot drivers do not support overwriting ongoing tasks in real-time, which hinders the robot from responding to the dynamic environment. Third, most industrial robots only expose motion-level control, making it challenging to enforce dynamic constraints during trajectory tracking. To resolve the above challenges, this paper presents a jerk-bounded position control driver (JPC) for industrial robots. JPC provides a unified interface for tracking complex trajectories and is able to enforce dynamic constraints using motion-level control, without accessing servo-level control. Most importantly, JPC enables real-time trajectory modification. Users can overwrite the ongoing task with a new one without violating dynamic constraints. The proposed JPC is implemented and tested on the FANUC LR Mate 200id/7L robot with both artificially generated data and an interactive robot handover task. Experiments show that the proposed JPC can track complex trajectories accurately within dynamic limits and seamlessly switch to new trajectory references before the ongoing task ends.
- [C45] A Composable Framework for Policy Design, Learning, and Transfer Toward Safe and Efficient Industrial Insertion
Rui Chen, Chenxi Wang, Tianhao Wei and Changliu Liu
IEEE/RSJ International Conference on Intelligent Robots and Systems, 2022
Abstract:
Delicate industrial insertion tasks (e.g., PC board assembly) remain challenging for industrial robots. The challenges include low error tolerance, delicacy of the components, and large task variations with respect to the components to be inserted. To deliver a feasible robotic solution for these insertion tasks, we also need to account for hardware limits of existing robotic systems and minimize the integration effort. This paper proposes a composable framework for efficient integration of a safe insertion policy on existing robotic platforms to accomplish these insertion tasks. The policy has an interpretable modularized design and can be learned efficiently on hardware and transferred to new tasks easily. In particular, the policy includes a safe insertion agent as a baseline policy for insertion, an optimal configurable Cartesian tracker as an interface to robot hardware, a probabilistic inference module to handle component variety and insertion errors, and a safe learning module to optimize the parameters in the aforementioned modules to achieve the best performance on designated hardware. The experiment results on a UR10 robot show that the proposed framework achieves safety (for the delicacy of components), accuracy (for low tolerance), robustness (against perception error and component defection), adaptability and transferability (for task variations), as well as task efficiency during execution plus data and time efficiency during learning.
- [C46] Safe and Efficient Exploration of Human Models During Human-Robot Interaction
Ravi Pandya and Changliu Liu
IEEE/RSJ International Conference on Intelligent Robots and Systems, 2022
Abstract:
Many collaborative human-robot tasks require the robot to stay safe and work efficiently around humans. Since the robot can only stay safe with respect to its own model of the human, we want the robot to learn a good model of the human in order to act both safely and efficiently. This paper studies methods that enable a robot to safely explore the space of a human-robot system to improve the robot’s model of the human, which will consequently allow the robot to access a larger state space and better work with the human. In particular, we introduce active exploration under the framework of energy-function based safe control, investigate the effect of different active exploration strategies, and finally analyze the effect of safe active exploration on both analytical and neural network human models.
- [C53] Safe Control Under Input Limits with Neural Control Barrier Functions
Simin Liu, Changliu Liu and John Dolan
Conference on Robot Learning, 2022
- [J15] Persistently feasible robust safe control by safety index synthesis and convex semi-infinite programming
Tianhao Wei, Shucheng Kang, Weiye Zhao and Changliu Liu
IEEE Control Systems Letters, 2022
- [J16] A hierarchical long short term safety framework for efficient robot manipulation under uncertainty
Suqin He, Weiye Zhao, Chuxiong Hu, Yu Zhu and Changliu Liu
Robotics and Computer-Integrated Manufacturing, 2023
- [C55] Learning from physical human feedback: An object-centric one-shot adaptation method
Alvin Shek, Bo Ying Su, Rui Chen and Changliu Liu
IEEE International Conference on Robotics and Automation, 2023
Outstanding Interaction Paper
- [C56] Safety index synthesis via sum-of-squares programming
Weiye Zhao, Tairan He, Tianhao Wei, Simin Liu and Changliu Liu
American Control Conference, 2023
- [C57] Probabilistic safeguard for reinforcement learning using safety index guided Gaussian process models
Weiye Zhao, Tairan He and Changliu Liu
Learning for Dynamics and Control Conference, 2023
- [C60] State-wise safe reinforcement learning: A survey
Weiye Zhao, Tairan He, Rui Chen, Tianhao Wei and Changliu Liu
International Joint Conferences on Artificial Intelligence, 2023
- [C62] Proactive human-robot co-assembly: Leveraging human intention prediction and robust safe control
Ruixuan Liu, Rui Chen, Abulikemu Abuduweili and Changliu Liu
IEEE Conference on Control Technology and Applications, 2023
- [C65] Online Model Adaptation with Feedforward Compensation
Abulikemu Abuduweili and Changliu Liu
Conference on Robot Learning, 2023
- [J17] Robust and context-aware real-time collaborative robot handling via dynamic gesture commands
Rui Chen, Alvin Shek and Changliu Liu
IEEE Robotics and Automation Letters, 2023
- [J19] Customizing Textile and Tactile Skins for Interactive Industrial Robots
Bo Ying Su, Zhongqi Wei, James McCann, Wenzhen Yuan and Changliu Liu
ASME Letters in Dynamic Systems and Control, 2023
- [J20] BioSLAM: A bioinspired lifelong memory system for general place recognition
Peng Yin, Abulikemu Abuduweili, Shiqi Zhao, Lingyun Xu, Changliu Liu and Sebastian Scherer
IEEE Transactions on Robotics, 2023
- [W] Obtaining hierarchy from human instructions: An LLMs-based approach
Xusheng Luo, Shaojun Xu and Changliu Liu
CoRL 2023 Workshop on Learning Effective Abstractions for Planning (LEAP), 2023
- [J23] GUARD: A safe reinforcement learning benchmark
Weiye Zhao, Rui Chen, Yifan Sun, Ruixuan Liu, Tianhao Wei and Changliu Liu
Transactions on Machine Learning Research, 2024
- [C67] Safety Index Synthesis with State-dependent Control Space
Rui Chen, Weiye Zhao and Changliu Liu
American Control Conference, 2024
- [C70] Real-time Safety Index Adaptation for Parameter-varying Systems via Determinant Gradient Ascend
Rui Chen, Weiye Zhao, Ruixuan Liu, Weiyang Zhang and Changliu Liu
American Control Conference, 2024
- [C72] Towards Proactive Safe Human-Robot Collaborations via Data-Efficient Conditional Behavior Prediction
Ravi Pandya, Zhuoyuan Wang, Yorie Nakahira and Changliu Liu
IEEE International Conference on Robotics and Automation, 2024
- [C73] Multi-Agent Strategy Explanations for Human-Robot Collaboration
Ravi Pandya, Michelle Zhao, Changliu Liu, Reid Simmons and Henny Admoni
IEEE International Conference on Robotics and Automation, 2024
- [C76] A Lightweight and Transferable Design for Robust LEGO Manipulation
Ruixuan Liu, Yifan Sun and Changliu Liu
International Symposium of Flexible Automation, 2024
- [C77] Absolute Policy Optimization: Enhancing Lower Probability Bound of Performance with High Confidence
Weiye Zhao, Feihan Li, Yifan Sun, Rui Chen, Tianhao Wei and Changliu Liu
International Conference on Machine Learning, 2024
- [C78] Agile But Safe: Learning Collision-Free High-Speed Legged Locomotion
Tairan He, Chong Zhang, Wenli Xiao, Guanqi He, Changliu Liu and Guanya Shi
Robotics: Science and Systems, 2024
Outstanding Student Paper Award Finalist
- [C82] OmniH2O: Universal and Dexterous Human-to-Humanoid Whole-Body Teleoperation and Learning
Tairan He, Zhengyi Luo, Xialin He, Wenli Xiao, Chong Zhang, Weinan Zhang, Kris Kitani, Changliu Liu and Guanya Shi
Conference on Robot Learning, 2024
- [C83] Meta-Control: Automatic Model-based Control Synthesis for Heterogeneous Robot Skills
Tianhao Wei, Liqian Ma, Rui Chen, Weiye Zhao and Changliu Liu
Conference on Robot Learning, 2024
Abstract:
The requirements for real-world manipulation tasks are diverse and often conflicting; some tasks require precise motion while others require force compliance; some tasks require avoidance of certain regions while others require convergence to certain states. Satisfying these varied requirements with a fixed state-action representation and control strategy is challenging, impeding the development of a universal robotic foundation model. In this work, we propose Meta-Control, the first LLM-enabled automatic control synthesis approach that creates customized state representations and control strategies tailored to specific tasks. Our core insight is that a meta-control system can be built to automate the thought process that human experts use to design control systems. Specifically, human experts heavily use a model-based, hierarchical (from abstract to concrete) thought model, then compose various dynamic models and controllers together to form a control system. Meta-Control mimics the thought model and harnesses LLM’s extensive control knowledge with Socrates’ "art of midwifery" to automate the thought process. Meta-Control stands out for its fully model-based nature, allowing rigorous analysis, generalizability, robustness, efficient parameter tuning, and reliable real-time execution.