Monocular, One-stage, Regression of Multiple 3D People


Monocular, One-stage, Regression of Multiple 3D People,
Yu Sun, Qian Bao, Wu Liu, Yili Fu, Michael J. Black, Tao Mei,
arXiv paper (arXiv 2008.12272)

ROMP is a one-stage network for multi-person 3D mesh recovery from a single image.

  • Simple: Concise one-stage framework for simultaneous person detection and 3D body mesh recovery.

  • Fast: ROMP can run over 30 FPS on a 1070Ti GPU.

  • Strong: ROMP achieves superior performance on multiple challenging multi-person/occlusion benchmarks.

  • Easy to use: We provide a user-friendly testing API and webcam demos.

Contact: [email protected]. Feel free to contact me for related questions or discussions!

News

2021/9/10: Training code release. API optimization.
2021/7/15: Added support for an elegant context manager to run code in a notebook. See the Colab demo for details.
2021/4/19: Added support for textured SMPL meshes using vedo. See visualization.md for details.
2021/3/30: 1.0 version. Rebuilt the code. Released the ResNet-50 version and evaluation on 3DPW.
2020/11/26: Optimization for person-person occlusion. Small changes for video support.
2020/9/11: Real-time webcam demo using a local/remote server.
2020/9/4: Google Colab demo. Saving an npy file per image.

Try on Google Colab

Before installation, you can take a few minutes to give the prepared Google Colab demo a try.
It allows you to run the project in the cloud, free of charge.

Please refer to bug.md for known bugs. You are welcome to submit issues for related bugs.

Installation

Please refer to install.md for installation.

Processing images

To reproduce the demo results, please run

cd ROMP
sh scripts/image.sh
# if the shell script fails, please consider running the following command instead:
python -u -m romp.predict.image --configs_yml='configs/image.yml'

Results will be saved in ROMP/demo/images_results. You can also run the code on other images by putting them under ROMP/demo/images or by passing the path of an image folder:

python -u -m romp.predict.image --inputs=/path/to/image_folder --output_dir='demo/image_results'

Please refer to config_guide.md for saving the estimated meshes, Center maps, and parameter dicts.
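
Each processed image also yields a .npy file holding the estimated parameters, as mentioned in the News section. Below is a minimal sketch for inspecting one of these files, assuming the default output layout; the file name and dict keys are assumptions, so check config_guide.md and results.keys() for the actual contents:

import numpy as np

# Hypothetical output path -- adjust to your actual results folder.
result_path = 'demo/images_results/sample_image.npy'

# One .npy file is saved per image and holds a parameter dict,
# so allow_pickle=True is needed to deserialize it.
results = np.load(result_path, allow_pickle=True).item()

# Print each entry's name and shape to see what was estimated.
for key, value in results.items():
    print(key, getattr(value, 'shape', type(value)))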

Processing videos

To process videos, you can change the inputs in configs/video.yml to /path/to/your/video, then run

cd ROMP
sh scripts/video.sh

or simply run a command like

python -u -m romp.predict.video --inputs=demo/videos/sample_video.mp4 --output_dir='demo/sample_video_results'
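
If you have a whole folder of videos, you can loop over them with the same entry point (the acknowledged batch_videos.py from VLT Media serves a similar purpose). Here is a minimal sketch that shells out to the documented CLI; the folder names are hypothetical:

import subprocess
from pathlib import Path

video_dir = Path('demo/videos')        # hypothetical input folder
out_root = Path('demo/batch_results')  # hypothetical output root

for video in sorted(video_dir.glob('*.mp4')):
    # Invoke the same module as scripts/video.sh, one video at a time,
    # writing each video's results to its own subfolder.
    subprocess.run(
        ['python', '-u', '-m', 'romp.predict.video',
         '--inputs={}'.format(video),
         '--output_dir={}'.format(out_root / video.stem)],
        check=True,
    )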

Webcam

We also provide webcam demo code, which runs in real time on a 1070Ti GPU / remote server.
Currently, limited by the visualization pipeline, the webcam code only supports single-person mesh visualization.

To try it, just run:

cd ROMP
sh scripts/webcam.sh

Please refer to config_guide.md for configurations.
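
The inference and mesh visualization live inside the romp package and are driven by scripts/webcam.sh. Purely to illustrate the capture side that such a pipeline reads from, here is a minimal OpenCV loop; the cv2 usage is an illustration, not the project's actual webcam code:

import cv2

cap = cv2.VideoCapture(0)  # open the default webcam
try:
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # A pipeline like ROMP's would hand `frame` to the model here.
        cv2.imshow('webcam', frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):  # press q to quit
            break
finally:
    cap.release()
    cv2.destroyAllWindows()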

Blender

Export to Blender FBX

Please refer to export.md to export the results to .fbx files for use in Blender. Currently, this function only supports single-person video cases. Therefore, please test it with demo/videos/sample_video2.mp4, whose results will be saved to demo/videos/sample_video2_results.
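
Once an .fbx file has been exported, it can be imported into Blender through the GUI (File > Import > FBX) or from Blender's scripting console. Below is a minimal sketch using Blender's bundled bpy API; the file path is hypothetical:

import bpy  # available only inside Blender's bundled Python

# Hypothetical path to an FBX produced by the export step.
fbx_path = 'demo/videos/sample_video2_results/sample_video2.fbx'

# Standard Blender FBX importer; the animated body is added to the
# current scene as an armature plus mesh.
bpy.ops.import_scene.fbx(filepath=fbx_path)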

Blender Addons

Evaluation

Please refer to evaluation.md for evaluation on benchmarks.

TODO LIST

The code will be gradually open-sourced according to the following schedule:

  • demo code for internet images / videos / webcam
  • runtime optimization
  • benchmark evaluation
  • training
  • virtual character animation

Citation

Please consider citing

@InProceedings{ROMP,
  author = {Sun, Yu and Bao, Qian and Liu, Wu and Fu, Yili and Black, Michael J. and Mei, Tao},
  title = {Monocular, One-stage, Regression of Multiple 3D People},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  month = {October},
  year = {2021}
}

Acknowledgement

We thank Peng Cheng for his constructive comments on Center map training.

Thanks to Marco Musy for his help in the textured SMPL visualization.

Thanks to Gavin Gray for adding support for an elegant context manager to run code in a notebook via this pull.

Thanks to VLT Media for adding support for running on Windows & batch_videos.py.

Here are some great resources we benefit from:

  • SMPL models and layers are borrowed from the MPII SMPL-X model.
  • Webcam pipeline is borrowed from minimal-hand.
  • Some functions are borrowed from HMR-pytorch.
  • Some functions for data augmentation are borrowed from SPIN.
  • Synthetic occlusion is borrowed from synthetic-occlusion.
  • The evaluation code of 3DPW dataset is brought from 3dpw-eval.
  • For fair comparison, the GT annotations of 3DPW dataset are brought from VIBE.
  • 3D mesh visualization is supported by vedo, EasyMocap and Open3D.
