Perpetual Humanoid Control for Real-time Simulated Avatars

Official implementation of the ICCV 2023 paper "Perpetual Humanoid Control for Real-time Simulated Avatars". In this paper, we present a physics-based humanoid controller that achieves high-fidelity motion imitation and fail-state recovery in the presence of noisy input (e.g., pose estimates from video or generated from language) and unexpected falls. No external forces are used.

[paper] [website] [Video]

TODOs

Add support for Unitree H1 & G1.
Add support for smplx/h (fingers!!!).
Release PHC+ model (100% success rate on AMASS) used in PULSE.
Release language-based demo code.
Release VR controller tracking code.
Release video-based demo code.
Additional instruction on Isaac Gym SMPL robot.
Release training code.
Release evaluation code.

Introduction

We present a physics-based humanoid controller that achieves high-fidelity motion imitation and fault-tolerant behavior in the presence of noisy input (e.g., pose estimates from video or generated from language) and unexpected falls. Our controller scales up to learning ten thousand motion clips without using any external stabilizing forces and learns to recover naturally from fail-states. Given reference motion, our controller can perpetually control simulated avatars without requiring resets. At its core, we propose the progressive multiplicative control policy (PMCP), which dynamically allocates new network capacity to learn harder and harder motion sequences. PMCP allows efficient scaling for learning from large-scale motion databases and adding new tasks, such as fail-state recovery, without catastrophic forgetting. We demonstrate the effectiveness of our controller by using it to imitate noisy poses from video-based pose estimators and language-based motion generators in a live and real-time multi-person avatar use case.

❗️❗️❗️ Note that the currently released models use a different coordinate system from SMPL (with negative z as the gravity direction), and the humanoid is modified so that it faces the positive x direction (instead of the original SMPL facing). This is reflected in the "up_right_start" flag in the humanoid robot (smpl_local_robot.py) configuration. This makes the humanoid's heading easier to define and makes flipping left and right easier, but it requires an extra conversion back to SMPL (which is provided in the code). I am working towards removing this modification in the future.
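To illustrate why a z-up, +x-facing convention keeps heading and left/right flipping simple, here is a minimal sketch. It assumes a right-handed frame with z up, x forward, and y to the humanoid's left; the function names are hypothetical and are not part of this repository.

```python
import math


def heading_angle(forward):
    """Yaw in radians about the z (up) axis; 0 means facing +x."""
    x, y, _ = forward
    return math.atan2(y, x)


def mirror_left_right(point):
    """Reflect a point across the x-z plane, which swaps left and right."""
    x, y, z = point
    return (x, -y, z)


print(heading_angle((1.0, 0.0, 0.0)))      # facing +x
print(heading_angle((0.0, 1.0, 0.0)))      # turned a quarter turn to the left
print(mirror_left_right((0.2, 0.3, 0.9)))  # same height, opposite side
```

Mirroring a full pose would additionally require swapping the left and right joints and mirroring their rotations, which this sketch does not cover.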

❗️❗️❗️ Also note that while the MCP/mixture-of-experts model is great for achieving a high success rate, it is not strictly necessary for PHC to work. PHC can work with a single primitive model and still achieve a high success rate, though it would not have the fail-state recovery capability.
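As a rough intuition for how a composer can combine several primitives, the sketch below blends per-primitive actions with softmax weights. This is an illustrative assumption, not a description of the released model: the function names, the softmax weighting, and the toy actions are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def blend_actions(primitive_actions, composer_logits):
    """Weighted sum of per-primitive action vectors using softmax weights."""
    weights = softmax(composer_logits)
    dim = len(primitive_actions[0])
    return [
        sum(w * action[j] for w, action in zip(weights, primitive_actions))
        for j in range(dim)
    ]


# Three hypothetical primitives, each proposing a 2-D action.
actions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(blend_actions(actions, [0.0, 0.0, 0.0]))  # equal weights
print(blend_actions(actions[:1], [3.2]))        # a single primitive keeps its own action
```

With only one primitive the weight is always 1, so the blend reduces to that primitive's action, which matches the single-primitive case mentioned above.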

Docs

Docs on SMPL_Robot
Docker Instructions (from @kexul)
Webcam Demo
Language Demo
Retargeting to Your Own Humanoids
Offline Dataset (PHC_Act)

Current Results on Cleaned AMASS (11313 Sequences)

All evaluation is done using the mean SMPL body pose, with the height adjusted, following the same evaluation protocol as UHC. Note that different evaluation protocols will lead to different results, and Isaac Gym itself can produce (slightly) different results depending on batch size and machine setup.

Models	Succ	G-MPJPE	ACC
PHC	98.9%	37.5	3.3
PHC-KP	98.7%	40.7	3.5
PHC+ in Pulse	100%	26.6	2.7
PHC-Prim (single primitive)	99.9%	25.9	2.3
PHC-Fut (using future)	100%	25.3	2.5
PHC-X-Prim (single primitive)	99.9%	24.7	3.6
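For readers unfamiliar with the columns, the sketch below shows how a global mean per-joint position error (G-MPJPE) and a success flag could be computed from joint positions. It is an illustration rather than this repository's evaluation code: the function names are hypothetical, and the 0.5 failure threshold is an assumed default exposed as a parameter.

```python
import math


def frame_error(pred, ref):
    """Mean Euclidean distance between matching joints in one frame."""
    dists = [math.dist(p, r) for p, r in zip(pred, ref)]
    return sum(dists) / len(dists)


def g_mpjpe(pred_seq, ref_seq):
    """Average per-frame joint error over a sequence, in global coordinates."""
    errors = [frame_error(p, r) for p, r in zip(pred_seq, ref_seq)]
    return sum(errors) / len(errors)


def is_success(pred_seq, ref_seq, threshold=0.5):
    """A sequence succeeds if no frame's mean joint error exceeds the threshold."""
    return all(frame_error(p, r) <= threshold for p, r in zip(pred_seq, ref_seq))


# Two frames with two joints each; units follow the inputs (metres here).
ref = [[(0.0, 0.0, 1.0), (0.0, 0.0, 0.5)], [(0.1, 0.0, 1.0), (0.1, 0.0, 0.5)]]
pred = [[(0.0, 0.0, 1.0), (0.0, 0.0, 0.5)], [(0.1, 0.0, 1.1), (0.1, 0.0, 0.6)]]
print(g_mpjpe(pred, ref))
print(is_success(pred, ref))
```

The success rate over a dataset would then be the fraction of sequences for which `is_success` returns True.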

Dependencies

To create the environment, follow these instructions:

Create a new conda environment and install PyTorch:

conda create -n isaac python=3.8
conda activate isaac
conda install pytorch torchvision torchaudio pytorch-cuda=11.6 -c pytorch -c nvidia
pip install -r requirement.txt

Download and set up Isaac Gym.
Download the SMPL parameters from SMPL and SMPLX and unzip them into the data/smpl folder. For SMPL, please download the v1.1.0 version, which contains the neutral humanoid. Rename the files basicmodel_neutral_lbs_10_207_0_v1.1.0.pkl, basicmodel_m_lbs_10_207_0_v1.1.0.pkl, and basicmodel_f_lbs_10_207_0_v1.1.0.pkl to SMPL_NEUTRAL.pkl, SMPL_MALE.pkl, and SMPL_FEMALE.pkl, respectively. For SMPLX, please download the v1.1 version and rename the files to match the names below. The file structure should look like this:


|-- data
    |-- smpl
        |-- SMPL_FEMALE.pkl
        |-- SMPL_NEUTRAL.pkl
        |-- SMPL_MALE.pkl
        |-- SMPLX_FEMALE.pkl
        |-- SMPLX_NEUTRAL.pkl
        |-- SMPLX_MALE.pkl
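The renaming step can be scripted. The sketch below is a hypothetical helper, not part of this repository; it assumes the downloaded SMPL files sit directly in data/smpl and that all three carry a .pkl extension. It renames the SMPL files and reports which of the six expected files are still missing.

```python
from pathlib import Path

# Downloaded SMPL v1.1.0 file names mapped to the names expected under data/smpl.
SMPL_RENAMES = {
    "basicmodel_neutral_lbs_10_207_0_v1.1.0.pkl": "SMPL_NEUTRAL.pkl",
    "basicmodel_m_lbs_10_207_0_v1.1.0.pkl": "SMPL_MALE.pkl",
    "basicmodel_f_lbs_10_207_0_v1.1.0.pkl": "SMPL_FEMALE.pkl",
}
EXPECTED = [
    f"{model}_{gender}.pkl"
    for model in ("SMPL", "SMPLX")
    for gender in ("FEMALE", "NEUTRAL", "MALE")
]


def rename_smpl_files(smpl_dir):
    """Rename downloaded SMPL files in place; return the new names that were created."""
    smpl_dir = Path(smpl_dir)
    renamed = []
    for old, new in SMPL_RENAMES.items():
        src = smpl_dir / old
        if src.exists():
            src.rename(smpl_dir / new)
            renamed.append(new)
    return renamed


def missing_files(smpl_dir):
    """List the expected model files that are not present in smpl_dir."""
    smpl_dir = Path(smpl_dir)
    return [name for name in EXPECTED if not (smpl_dir / name).exists()]


if __name__ == "__main__":
    rename_smpl_files("data/smpl")
    print("Missing:", missing_files("data/smpl"))
```

An empty "Missing" list means the folder matches the structure shown above.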

Make sure you have the SMPL parameters properly set up by running the following scripts:

python scripts/vis/vis_motion_mj.py
python scripts/joint_monkey_smpl.py

The SMPL model is used to adjust the height of the humanoid robot to avoid penetration with the ground during data loading.

Use the following script to download trained models and sample data.

bash download_data.sh

This will download amass_isaac_standing_upright_slim.pkl, which is a standing-still pose for testing.

To evaluate with your own SMPL data, see the script scripts/data_process/convert_data_smpl.py. Pay special attention to make sure the coordinate system is the same as the one used in simulation (with negative z as the gravity direction).
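As a minimal sketch of what such an axis change involves, the functions below map positions between a y-up, +z-facing frame (assumed here to be SMPL's canonical frame) and a z-up, +x-facing frame. The function names are hypothetical, and the authoritative conversion is the one in scripts/data_process/convert_data_smpl.py, which may differ.

```python
def smpl_to_sim(point):
    """Map (x, y, z) from a y-up, +z-facing frame to a z-up, +x-facing frame."""
    x, y, z = point
    return (z, x, y)


def sim_to_smpl(point):
    """Inverse of smpl_to_sim."""
    x, y, z = point
    return (y, z, x)


print(smpl_to_sim((0.0, 1.0, 0.0)))  # the up axis becomes +z
print(smpl_to_sim((0.0, 0.0, 1.0)))  # the facing axis becomes +x
```

This only covers positions; joint and root rotations would need the matching change of basis as well.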

Evaluation

Viewer Shortcuts

Keyboard	Function
f	focus on humanoid
Right click + WASD	change view port
Shift + Right click + WASD	change view port fast
r	reset episode
j	apply large force to the humanoid
l	record screenshot, press again to stop recording
;	cancel screen shot
m	cancel termination based on imitation

More shortcuts can be found in phc/env/tasks/base_task.py.

Notes on rendering: I am using pyvirtualdisplay to record the video such that you can see all humanoids at the same time (default function will only capture the first environment). You can disable it using the flag no_virtual_display=True.

You can use the render_o3d=True no_virtual_display=True flags to render the SMPL mesh together with your Isaac Gym simulation in real time.

To try this out, press m (to cancel termination based on imitation), and then press j (to apply a large force to the humanoid).

Imitation

SMPL and SMPL-X

PHC-X-Prim (single primitive model)

python phc/run_hydra.py learning=im_pnn_big exp_name=phc_x_pnn  env=env_im_x_pnn robot=smplx_humanoid env.motion_file=sample_data/standing_x.pkl env.training_prim=0 epoch=-1 test=True  env.num_envs=1  headless=False

PHC+: keypoint model (can get up from the ground and walk back) | Best model for the video/language demos

python phc/run_hydra.py learning=im_mcp_big learning.params.network.ending_act=False exp_name=phc_comp_kp_2 env.obs_v=7 env=env_im_getup_mcp robot=smpl_humanoid robot.real_weight_porpotion_boxes=False env.motion_file=sample_data/amass_isaac_standing_upright_slim.pkl env.models=['output/HumanoidIm/phc_kp_2/Humanoid.pth'] env.num_prim=3 env.num_envs=1  headless=False epoch=-1 test=True

PHC+: rotation + keypoint model, can get up from the ground but not walk back -- model used in PULSE due to time constraints

python phc/run_hydra.py learning=im_mcp_big  exp_name=phc_comp_3 env=env_im_getup_mcp robot=smpl_humanoid env.zero_out_far=False robot.real_weight_porpotion_boxes=False env.num_prim=3 env.motion_file=sample_data/amass_isaac_standing_upright_slim.pkl env.models=['output/HumanoidIm/phc_3/Humanoid.pth'] env.num_envs=1  headless=False epoch=-1 test=True

Evaluate full model:

## Shape + rotation + keypoint model

python phc/run_hydra.py learning=im_mcp exp_name=phc_shape_mcp_iccv test=True env=env_im_getup_mcp robot=smpl_humanoid_shape robot.freeze_hand=True robot.box_body=False env.z_activation=relu env.motion_file=sample_data/amass_isaac_standing_upright_slim.pkl env.models=['output/HumanoidIm/phc_shape_pnn_iccv/Humanoid.pth'] env.num_envs=1  headless=False epoch=-1



## keypoint model
python phc/run_hydra.py learning=im_mcp exp_name=phc_kp_mcp_iccv  test=True env=env_im_getup_mcp robot=smpl_humanoid robot.freeze_hand=True robot.box_body=False env.z_activation=relu env.motion_file=sample_data/amass_isaac_standing_upright_slim.pkl env.models=['output/HumanoidIm/phc_kp_pnn_iccv/Humanoid.pth'] env.num_envs=1 env.obs_v=7 headless=False epoch=-1

Evaluate on AMASS:

## rotation + keypoint model (100% - PHC+)
python phc/run_hydra.py learning=im_mcp_big  exp_name=phc_comp_3 env=env_im_getup_mcp robot=smpl_humanoid env.zero_out_far=False robot.real_weight_porpotion_boxes=False env.num_prim=3 env.motion_file=sample_data/amass_isaac_standing_upright_slim.pkl env.models=['output/HumanoidIm/phc_3/Humanoid.pth'] env.num_envs=1  headless=False im_eval=True


## Shape + rotation + keypoint model
python phc/run_hydra.py learning=im_mcp exp_name=phc_shape_mcp_iccv epoch=-1 test=True env=env_im_getup_mcp robot=smpl_humanoid_shape robot.freeze_hand=True robot.box_body=False env.z_activation=relu env.motion_file=sample_data/amass_isaac_standing_upright_slim.pkl env.models=['output/HumanoidIm/phc_shape_pnn_iccv/Humanoid.pth'] env.num_envs=1  headless=False im_eval=True


## keypoint model
python phc/run_hydra.py learning=im_mcp exp_name=phc_kp_mcp_iccv epoch=-1 test=True env=env_im_getup_mcp robot=smpl_humanoid robot.freeze_hand=True robot.box_body=False env.z_activation=relu env.motion_file=sample_data/amass_isaac_standing_upright_slim.pkl env.models=['output/HumanoidIm/phc_kp_pnn_iccv/Humanoid.pth'] env.num_envs=1024 env.obs_v=7  im_eval=True

Evaluate single primitive model:

## Shape + rotation + keypoint model
python phc/run_hydra.py learning=im_pnn exp_name=phc_shape_pnn_iccv epoch=-1 test=True env=env_im_pnn robot=smpl_humanoid_shape robot.freeze_hand=True robot.box_body=False env.motion_file=sample_data/amass_isaac_standing_upright_slim.pkl  env.num_envs=1  headless=False


## keypoint model
python phc/run_hydra.py learning=im_pnn exp_name=phc_kp_pnn_iccv epoch=-1 test=True env=env_im_pnn env.motion_file=sample_data/amass_isaac_standing_upright_slim.pkl robot.freeze_hand=True robot.box_body=False env.num_envs=1 env.obs_v=7  headless=False
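The commands above differ only in their key=value overrides, so it can be convenient to assemble them programmatically. The sketch below is an illustration and not part of this repository: `build_command` is a hypothetical helper, and the override values are copied from the single-primitive keypoint command above.

```python
import shlex
import sys


def build_command(overrides, script="phc/run_hydra.py"):
    """Return an argv list of the form: python <script> key=value ..."""
    return [sys.executable, script] + [f"{key}={value}" for key, value in overrides.items()]


overrides = {
    "learning": "im_pnn",
    "exp_name": "phc_kp_pnn_iccv",
    "epoch": -1,
    "test": True,
    "env": "env_im_pnn",
    "env.motion_file": "sample_data/amass_isaac_standing_upright_slim.pkl",
    "robot.freeze_hand": True,
    "robot.box_body": False,
    "env.num_envs": 1,
    "env.obs_v": 7,
    "headless": False,
}
cmd = build_command(overrides)
print(shlex.join(cmd))  # shell-quoted form, handy for logging
# To launch it: import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list to `subprocess.run` avoids an extra layer of shell quoting for values that contain brackets or quotes.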

In-the-wild Avatar Control

See Webcam Demo

The TL;DR is to run:

python scripts/demo/video_to_pose_server.py

then

python phc/run_hydra.py learning=im_mcp exp_name=phc_kp_mcp_iccv env=env_im_getup_mcp env.task=HumanoidImMCPDemo robot=smpl_humanoid robot.freeze_hand=True robot.box_body=False env.z_activation=relu env.motion_file=sample_data/amass_isaac_standing_upright_slim.pkl env.models=['output/HumanoidIm/phc_kp_pnn_iccv/Humanoid.pth'] env.num_envs=1 env.obs_v=7 headless=False epoch=-1 test=True no_virtual_display=True

See Language-to-motion Demo

python phc/run_hydra.py learning=im_mcp exp_name=phc_kp_mcp_iccv env=env_im_getup_mcp env.task=HumanoidImMCPDemo robot=smpl_humanoid robot.freeze_hand=True robot.box_body=False env.z_activation=relu env.motion_file=sample_data/amass_isaac_standing_upright_slim.pkl env.models=['output/HumanoidIm/phc_kp_pnn_iccv/Humanoid.pth'] env.num_envs=1 env.obs_v=7 headless=False epoch=-1 test=True no_virtual_display=True

VR Controller Tracking

python phc/run_hydra.py learning=im_big exp_name=phc_prim_vr env=env_vr robot=smpl_humanoid robot.box_body=False env.motion_file=sample_data/amass_isaac_standing_upright_slim.pkl env.num_envs=1 headless=False epoch=-1 test=True no_virtual_display=True

Training

Data Processing AMASS

We train on a subset of the AMASS dataset.

To process AMASS, first download the AMASS dataset from AMASS. Then, run the following script on the unzipped data:

python scripts/data_process/convert_amass_data.py

Training PHC

H1 and G1

python phc/run_hydra.py project_name=Robot_IM robot=unitree_g1 env=env_im_g1_phc env.motion_file=sample_data/dance_sample_g1.pkl learning=im_pnn_big exp_name=unitree_g1_pnn sim=robot_sim control=robot_control learning.params.network.space.continuous.sigma_init.val=-1.7

python phc/run_hydra.py  project_name=Robot_IM   robot=unitree_h1      env=env_im_h1_phc env.motion_file=sample_data/sample_dance_h1.pkl learning=im_pnn_big   exp_name=unitree_h1_pnn_realsim_092924 sim=robot_sim control=robot_control learning.params.network.space.continuous.sigma_init.val=-1.7

After training, you can run eval by adding num_threads=1 headless=False test=True epoch=-1 to the above scripts.
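A small helper can apply those evaluation overrides to a training command. The sketch below is hypothetical and not part of this repository; it appends the four overrides listed above and drops any earlier values for the same keys so that each key appears only once.

```python
import shlex

# Overrides listed above for switching a training command to evaluation.
EVAL_OVERRIDES = {"num_threads": "1", "headless": "False", "test": "True", "epoch": "-1"}


def to_eval_command(train_command):
    """Append the evaluation overrides, replacing existing values for the same keys."""
    args = shlex.split(train_command)
    kept = [a for a in args if a.split("=", 1)[0] not in EVAL_OVERRIDES]
    return kept + [f"{key}={value}" for key, value in EVAL_OVERRIDES.items()]


train = (
    "python phc/run_hydra.py project_name=Robot_IM robot=unitree_g1 "
    "env=env_im_g1_phc env.motion_file=sample_data/dance_sample_g1.pkl "
    "learning=im_pnn_big exp_name=unitree_g1_pnn sim=robot_sim control=robot_control "
    "learning.params.network.space.continuous.sigma_init.val=-1.7"
)
print(" ".join(to_eval_command(train)))
```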

Train single primitive

python phc/run_hydra.py learning=im_big exp_name=phc_prim env=env_im robot=smpl_humanoid env.motion_file=sample_data/amass_isaac_standing_upright_slim.pkl

Train full PNN model

Training PHC is not fully automated yet, so it requires some (a lot of) manual steps and involves changing the config file a couple of times during training, depending on the training phase. The phc_shape_pnn_train_iccv.yaml config file provides a starting point for training primitives.

First, we train one primitive and keep an eye on its performance (using the --has_eval flag). In the config, "training_prim" is the primitive that is being trained. This needs to be updated accordingly.

python phc/run_hydra.py learning=im_pnn_big env=env_im_pnn robot=smpl_humanoid env.motion_file=[motion_file] exp_name=[exp_name]

After the performance plateaus, we dump the most recent sequences that the primitive has failed on and use them to train the next primitive. Here, idx is the primitive that should be trained.

python scripts/pmcp/forward_pmcp.py --exp [exp_name] --epoch [epoch] --idx {idx}

The above script will dump two files: one contains the next hard sequences to learn, and the other is the checkpoint to resume from, with the primitive copied.

To train the next primitive, run the following script:

python phc/run_hydra.py learning=im_pnn_big env=env_im_pnn robot=smpl_humanoid env.motion_file=[motion_file] epoch=[include epoch+1 from previous step] env.fitting=True env.training_prim=1

Repeat this process until no hard sequences are left. Then, train the fail-state recovery primitive on simple locomotion data:

python phc/run_hydra.py learning=im_pnn_big exp_name=phc_shape_pnn_iccv env=env_im_pnn robot=smpl_humanoid env.motion_file=[motion_file] epoch=[include current epoch + 1 from the forward_pmcp step] env.fitting=True env.training_prim=[+1] env.zero_out_far=True env.zero_out_far_train=True env.getup_udpate_epoch={epoch}

After all primitives are trained, train the composer:

python phc/run_hydra.py learning=im_mcp_big exp_name=[] env=env_im_getup_mcp robot=smpl_humanoid env.motion_file=[motion_file] env.models=['output/HumanoidIm/{exp_name}/Humanoid.pth']

When training the composer, you can repeat the process above (progressively mining hard sequences) to improve performance.
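The overall loop described above (train a primitive, collect the sequences it still fails on, then train the next primitive on those) can be summarised with a toy sketch. Everything in it is hypothetical: `train_primitive` stands in for a full training run, and each clip is reduced to a single made-up difficulty number. The real procedure uses the commands listed above.

```python
def train_primitive(clips, capacity):
    """Stand-in for training: the primitive 'masters' clips with difficulty <= capacity."""
    return lambda clip: clip["difficulty"] <= capacity


def progressive_training(clips, capacities):
    """Train primitives one by one, each on the clips the previous ones failed on."""
    primitives, remaining = [], list(clips)
    for capacity in capacities:
        if not remaining:
            break  # no hard sequences left
        primitive = train_primitive(remaining, capacity)
        primitives.append(primitive)
        # Keep only the clips this primitive still fails on (the "hard" sequences).
        remaining = [c for c in remaining if not primitive(c)]
    return primitives, remaining


clips = [{"name": f"clip_{i}", "difficulty": d} for i, d in enumerate([1, 2, 5, 8, 9])]
primitives, unsolved = progressive_training(clips, capacities=[2, 6, 10])
print(len(primitives), [c["name"] for c in unsolved])
```

The loop stops early once no hard sequences remain, mirroring the "repeat until no hard sequences are left" step.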

You can also just train one model for imitation (no PNN):

python phc/run_hydra.py learning=im exp_name=phc_prim_iccv  env=env_im robot=smpl_humanoid_shape robot.freeze_hand=True robot.box_body=False env.motion_file=sample_data/amass_isaac_standing_upright_slim.pkl

Trouble Shooting

Multiprocessing Issues

See this issue for some discussions.

For the data-loading part, try uncommenting the following line:

mp.set_sharing_strategy('file_system')

This should fix the issue, though using file_system has caused me problems before as well.

Success Rate

The success rate is reported as "eval_success_rate" in the wandb logging, not "success_rate", which is an episodic success rate used during training.

Citation

If you find this work useful for your research, please cite our paper:

@inproceedings{Luo2023PerpetualHC,
    author={Zhengyi Luo and Jinkun Cao and Alexander W. Winkler and Kris Kitani and Weipeng Xu},
    title={Perpetual Humanoid Control for Real-time Simulated Avatars},
    booktitle={International Conference on Computer Vision (ICCV)},
    year={2023}
}

Also consider citing these prior works that are used in this project:

@inproceedings{rempeluo2023tracepace,
    author={Rempe, Davis and Luo, Zhengyi and Peng, Xue Bin and Yuan, Ye and Kitani, Kris and Kreis, Karsten and Fidler, Sanja and Litany, Or},
    title={Trace and Pace: Controllable Pedestrian Animation via Guided Trajectory Diffusion},
    booktitle={Conference on Computer Vision and Pattern Recognition (CVPR)},
    year={2023}
}     

@inproceedings{Luo2022EmbodiedSH,
  title={Embodied Scene-aware Human Pose Estimation},
  author={Zhengyi Luo and Shun Iwase and Ye Yuan and Kris Kitani},
  booktitle={Advances in Neural Information Processing Systems},
  year={2022}
}

@inproceedings{Luo2021DynamicsRegulatedKP,
  title={Dynamics-Regulated Kinematic Policy for Egocentric Pose Estimation},
  author={Zhengyi Luo and Ryo Hachiuma and Ye Yuan and Kris Kitani},
  booktitle={Advances in Neural Information Processing Systems},
  year={2021}
}

References

This repository is built on top of the following amazing repositories:

Main code framework is from: IsaacGymEnvs
Part of the SMPL_robot code is from: UHC
SMPL models and layers are from: SMPL-X model

Please follow the licenses of the above repositories for usage.
