Official implementation of Catch It.
We open-source the simulation training scripts and provide guidance for real-robot deployment. We name the environment Dexterous Catch with Mobile Manipulation (DCMM).
This codebase is released under the CC BY-NC 4.0 license, with inherited licenses from Legged Gym and RSL RL (ETH Zurich, Nikita Rudin, and NVIDIA CORPORATION & AFFILIATES).
- 2024-10-17: Release the simulation training scripts and references for the real-robot deployment. Have a try!
- Create a conda environment and install PyTorch:
conda create -n dcmm python=3.8
conda activate dcmm
pip install torch torchvision torchaudio
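Before moving on, you may want to confirm that PyTorch installed correctly and can see your GPU; this is only a sanity check, not part of the official setup:

import torch

print(torch.__version__)
print("CUDA available:", torch.cuda.is_available())  # training below assumes a single GPU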
- Clone this repo and install our gym_dcmm:
git clone https://github.com/hang0610/catch_it.git
cd catch_it && pip install -e .
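If the editable install succeeded, the package should be importable from any directory; a quick check (nothing repo-specific beyond the package name gym_dcmm used in the install step above):

import gym_dcmm
print(gym_dcmm.__file__)  # should point into the cloned catch_it repo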
- Install additional packages in requirements.txt:
pip install -r requirements.txt
Under gym_dcmm/envs/, run:
python3 DcmmVecEnv.py --viewer
Keyboard control:
- ↑ (up): increase the y linear velocity (base frame) by 1 m/s;
- ↓ (down): decrease the y linear velocity (base frame) by 1 m/s;
- ← (left): increase the x linear velocity (base frame) by 1 m/s;
- → (right): decrease the x linear velocity (base frame) by 1 m/s;
- 4 (turn left): decrease the counter-clockwise angular velocity by 0.2 rad/s;
- 6 (turn right): increase the counter-clockwise angular velocity by 0.2 rad/s;
- +: increase the position & roll of the arm end effector by (0.1, 0.1, 0.1, 0.1) m;
- -: decrease the position & roll of the arm end effector by (0.1, 0.1, 0.1, 0.1) m;
- 7: increase the joint position of the hand by (0.2, 0.2, 0.2, 0.2) rad;
- 9: decrease the joint position of the hand by (0.2, 0.2, 0.2, 0.2) rad;
Note: DO NOT change the speed of the mobile base too dramatically, or it might tip over.
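If you want to drive the environment from a script instead of the keyboard, the sketch below shows the general idea. It is only a minimal sketch: the class name DcmmVecEnv, the viewer keyword, and the step() return signature are assumptions based on the file name and the --viewer flag, so check gym_dcmm/envs/DcmmVecEnv.py for the actual interface.

# Hedged sketch: class name, constructor arguments, and step() signature are assumptions.
from gym_dcmm.envs.DcmmVecEnv import DcmmVecEnv

env = DcmmVecEnv(viewer=True)            # assumed keyword, mirroring the --viewer flag
obs = env.reset()
for _ in range(200):
    action = env.action_space.sample()   # random actions instead of keyboard input
    step_out = env.step(action)          # may return 4 (gym) or 5 (gymnasium) values
    obs = step_out[0]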
We use 64 CPUs and a single NVIDIA RTX 3070 Ti GPU for model training. For training efficiency, we recommend using at least 12 CPUs so that more than 16 parallel environments can be created during training.
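Since the num_envs setting below is tied to your CPU count, it can help to check what your machine offers before editing the config; this is plain Python, nothing repo-specific:

import os
import torch

print("CPU cores:", os.cpu_count())                 # >= 12 recommended for 16+ parallel envs
print("GPU available:", torch.cuda.is_available())  # a single RTX 3070 Ti-class GPU suffices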
- configs/config.yaml:
# Disables viewer or camera visualization
viewer: False
imshow_cam: False
# RL Arguments
test: False # False, True
task: Tracking # Catching_TwoStage, Catching_OneStage, Tracking
num_envs: 32 # This should be no more than 2x your CPUs (1x is recommended)
object_eval: False
# used to set checkpoint path
checkpoint_tracking: ''
checkpoint_catching: ''
# checkpoint_tracking: 'assets/models/track.pth'
# checkpoint_catching: 'assets/models/catch_two_stage.pth'
- num_envs (int): the number of parallel environments;
- task (str): the task type (Tracking or Catching);
- test (bool): True enables testing mode, False enables training mode;
- checkpoint_tracking/catching (str): the pre-trained model to load for training/testing;
- viewer (bool): whether to launch the MuJoCo viewer;
- imshow_cam (bool): whether to visualize the camera scene;
- object_eval (bool): whether to use the unseen objects;
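If you prefer to inspect these arguments programmatically instead of opening the file, a minimal sketch with PyYAML works (assuming PyYAML is available in your environment; the keys match the excerpt above):

import yaml

with open("configs/config.yaml") as f:
    cfg = yaml.safe_load(f)
print(cfg["task"], cfg["num_envs"], cfg["test"])  # e.g. Tracking 32 False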
- configs/train/DcmmPPO.yaml:
- minibatch_size: the batch size for network input during PPO training;
- horizon_length: the number of steps collected in a single trajectory during exploration;
Note: in the training mode, the following must hold: num_envs * horizon_length = n * minibatch_size, where n is a positive integer.
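A quick way to check this constraint before launching training; the numbers here are hypothetical, so substitute your own config values:

num_envs, horizon_length, minibatch_size = 32, 64, 512    # hypothetical values
total = num_envs * horizon_length                         # 2048 transitions per rollout
assert total % minibatch_size == 0, "num_envs * horizon_length must be a multiple of minibatch_size"
print("n =", total // minibatch_size)                     # n = 4 here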
We provide our tracking model and our catching model trained in a two-stage manner: assets/models/track.pth and assets/models/catch_two_stage.pth. You can test them on the tracking and catching tasks. You can also choose to evaluate on the training objects or the unseen objects by setting object_eval.
Under the root catch_it:
python3 train_DCMM.py test=True task=Tracking num_envs=1 checkpoint_tracking=$(path_to_tracking_model) object_eval=True viewer=$(open_mujoco_viewer_or_not) imshow_cam=$(imshow_camera_or_not)
python3 train_DCMM.py test=True task=Catching_TwoStage num_envs=1 checkpoint_catching=$(path_to_catching_model) object_eval=True viewer=$(open_mujoco_viewer_or_not) imshow_cam=$(imshow_camera_or_not)
Under the root catch_it, train the base and arm to track the randomly thrown objects:
python3 train_DCMM.py test=False task=Tracking num_envs=$(number_of_CPUs)
- First, load the tracking model from stage 1 by filling its path into checkpoint_tracking in configs/config.yaml. We provide our tracking model, assets/models/track.pth, which can be used to train the catching task (stage 2) directly.
- Second, train the whole body (the base, arm, and hand) to catch the randomly thrown objects:
python3 train_DCMM.py test=False task=Catching_TwoStage num_envs=$(number_of_CPUs) checkpoint_tracking=$(path_to_tracking_model)
In the one-stage training baseline, we do not pre-train a tracking model but directly train a catching model from scratch. Similar to the setup for training the tracking model, run:
python3 train_DCMM.py test=False task=Catching_OneStage num_envs=$(number_of_CPUs)
You can visualize the training curves and metrics via wandb. In configs/config.yaml:
# wandb config
output_name: Dcmm
wandb_mode: "disabled" # "online" | "offline" | "disabled"
wandb_entity: 'Your_username'
# wandb_project: 'RL_Dcmm_Track_Random'
wandb_project: 'RL_Dcmm_Catch_Random'
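These fields typically end up in a wandb.init call; below is a hedged sketch of how such a config is usually consumed, not necessarily the exact call made in train_DCMM.py:

import wandb

wandb.init(
    project="RL_Dcmm_Catch_Random",  # wandb_project
    entity="Your_username",          # wandb_entity
    mode="disabled",                 # wandb_mode: "online" | "offline" | "disabled"
    name="Dcmm",                     # output_name (assumed to map to the run name)
)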
- Mobile Base: Ranger Mini V2
- Arm: XArm6
- Dexterous Hand: LEAP Hand
- Perception: Realsense D455
- Onboard Computer: Thunderobot MIX MiniPC
Our code is built upon Ubuntu 20.04 and ROS Noetic. Lower or higher versions may also work (not guaranteed).
- Ranger Mini V2: ranger_ros
- XArm6: xarm-ros
- LEAP Hand: LEAP Hand ROS1 SDK
- Realsense D455: realsense-ros and realsense-sdk
- Camera Calibration: easy_handeye
Yuanhang Zhang: [email protected]
You can also create an issue if you encounter any other bugs.
- If a MuJoCo rendering error such as mujoco.FatalError: gladLoadGL error occurs, try adding the following line before main() in train_DCMM.py and gym_dcmm/envs/DcmmVecEnv.py:
os.environ['MUJOCO_GL'] = 'egl'
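Concretely, the top of either script would look like the snippet below (the os import may already be present); the key point is that the environment variable is set before MuJoCo creates a GL context:

import os
os.environ['MUJOCO_GL'] = 'egl'  # force EGL rendering; must run before main() is called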
Please consider citing our paper if you find this repo useful:
@article{zhang2024catchitlearningcatch,
title={Catch It! Learning to Catch in Flight with Mobile Dexterous Hands},
author={Zhang, Yuanhang and Liang, Tianhai and Chen, Zhenyang and Ze, Yanjie and Xu, Huazhe},
year={2024},
journal={arXiv preprint arXiv:2409.10319}
}