Monocular, One-stage, Regression of Multiple 3D People


Monocular, One-stage, Regression of Multiple 3D People,
Yu Sun, Qian Bao, Wu Liu, Yili Fu, Michael J. Black, Tao Mei,
arXiv paper (arXiv 2008.12272)

ROMP is a one-stage network for multi-person 3D mesh recovery from a single image.

  • Simple: Concise one-stage framework for simultaneous person detection and 3D body mesh recovery.

  • Fast: ROMP can run over 30 FPS on a 1070Ti GPU.

  • Strong: ROMP achieves superior performance on multiple challenging multi-person/occlusion benchmarks.

  • Easy to use: We provide a user-friendly testing API and webcam demos.

Contact: [email protected]. Feel free to contact me for related questions or discussions!

News

2021/9/10: Training code release. API optimization.
2021/7/15: Added support for an elegant context manager to run code in a notebook. See the Colab demo for details.
2021/4/19: Added support for textured SMPL meshes using vedo. See visualization.md for details.
2021/3/30: 1.0 version. Rebuilt the code. Released the ResNet-50 version and evaluation on 3DPW.
2020/11/26: Optimization for person-person occlusion. Small changes for video support.
2020/9/11: Real-time webcam demo using a local/remote server.
2020/9/4: Google Colab demo. Saving an npy file per image.

Try on Google Colab

Before installation, you can take a few minutes to give the prepared Google Colab demo a try.
It allows you to run the project in the cloud, free of charge.

Please refer to bug.md for known bugs. You are welcome to submit issues for related bugs.

Installation

Please refer to install.md for installation.

Processing images

To reproduce the demo results, please run

cd ROMP
sh scripts/image.sh
# if the shell script fails, please consider running the following command instead:
python -u -m romp.predict.image --configs_yml='configs/image.yml'

Results will be saved in ROMP/demo/images_results. You can also run the code on other images by putting them under ROMP/demo/images or by passing the path of an image folder:

python -u -m romp.predict.image --inputs=/path/to/image_folder --output_dir='demo/image_results'

Please refer to config_guide.md for saving the estimated meshes, Center maps, and parameter dicts.
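
Each processed image also yields a .npy file holding the estimated parameters, as mentioned in the News section. Below is a minimal sketch for inspecting one of these files, assuming the default output layout; the file name and dict keys are assumptions, so check config_guide.md and results.keys() for the actual contents:

import numpy as np

# Hypothetical output path -- adjust to your actual results folder.
result_path = 'demo/images_results/sample_image.npy'

# One .npy file is saved per image and holds a parameter dict,
# so allow_pickle=True is needed to deserialize it.
results = np.load(result_path, allow_pickle=True).item()

# Print each entry's name and shape to see what was estimated.
for key, value in results.items():
    print(key, getattr(value, 'shape', type(value)))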

Processing videos

To process videos, you can change the inputs in configs/video.yml to /path/to/your/video, then run

cd ROMP
sh scripts/video.sh

or simply run a command like

python -u -m romp.predict.video --inputs=demo/videos/sample_video.mp4 --output_dir='demo/sample_video_results'
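
If you have a whole folder of videos, you can loop over them with the same entry point (the acknowledged batch_videos.py from VLT Media serves a similar purpose). Here is a minimal sketch that shells out to the documented CLI; the folder names are hypothetical:

import subprocess
from pathlib import Path

video_dir = Path('demo/videos')        # hypothetical input folder
out_root = Path('demo/batch_results')  # hypothetical output root

for video in sorted(video_dir.glob('*.mp4')):
    # Invoke the same module as scripts/video.sh, one video at a time,
    # writing each video's results to its own subfolder.
    subprocess.run(
        ['python', '-u', '-m', 'romp.predict.video',
         '--inputs={}'.format(video),
         '--output_dir={}'.format(out_root / video.stem)],
        check=True,
    )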

Webcam

We also provide webcam demo code, which runs in real time on a 1070Ti GPU / remote server.
Currently, limited by the visualization pipeline, the webcam code only supports single-person mesh visualization.

To try it, just run:

cd ROMP
sh scripts/webcam.sh

Please refer to config_guide.md for configurations.
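
The inference and mesh visualization live inside the romp package and are driven by scripts/webcam.sh. Purely to illustrate the capture side that such a pipeline reads from, here is a minimal OpenCV loop; the cv2 usage is an illustration, not the project's actual webcam code:

import cv2

cap = cv2.VideoCapture(0)  # open the default webcam
try:
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # A pipeline like ROMP's would hand `frame` to the model here.
        cv2.imshow('webcam', frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):  # press q to quit
            break
finally:
    cap.release()
    cv2.destroyAllWindows()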

Blender

Export to Blender FBX

Please refer to export.md to export the results to .fbx files for use in Blender. Currently, this function only supports single-person video cases. Therefore, please test it with demo/videos/sample_video2.mp4, whose results will be saved to demo/videos/sample_video2_results.
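
Once an .fbx file has been exported, it can be imported into Blender through the GUI (File > Import > FBX) or from Blender's scripting console. Below is a minimal sketch using Blender's bundled bpy API; the file path is hypothetical:

import bpy  # available only inside Blender's bundled Python

# Hypothetical path to an FBX produced by the export step.
fbx_path = 'demo/videos/sample_video2_results/sample_video2.fbx'

# Standard Blender FBX importer; the animated body is added to the
# current scene as an armature plus mesh.
bpy.ops.import_scene.fbx(filepath=fbx_path)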

Blender Addons

Evaluation

Please refer to evaluation.md for evaluation on benchmarks.

TODO LIST

The code will be gradually open-sourced according to the following schedule:

  • demo code for internet images / videos / webcam
  • runtime optimization
  • benchmark evaluation
  • training
  • virtual character animation

Citation

Please consider citing

@InProceedings{ROMP,
  author = {Sun, Yu and Bao, Qian and Liu, Wu and Fu, Yili and Black, Michael J. and Mei, Tao},
  title = {Monocular, One-stage, Regression of Multiple 3D People},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  month = {October},
  year = {2021}
}

Acknowledgement

We thank Peng Cheng for his constructive comments on Center map training.

Thanks to Marco Musy for his help in the textured SMPL visualization.

Thanks to Gavin Gray for adding support for an elegant context manager to run code in a notebook via this pull.

Thanks to VLT Media for adding support for running on Windows & batch_videos.py.

Here are some great resources we benefit from:

  • SMPL models and layers are borrowed from the MPII SMPL-X model.
  • Webcam pipeline is borrowed from minimal-hand.
  • Some functions are borrowed from HMR-pytorch.
  • Some functions for data augmentation are borrowed from SPIN.
  • Synthetic occlusion is borrowed from synthetic-occlusion.
  • The evaluation code of 3DPW dataset is brought from 3dpw-eval.
  • For fair comparison, the GT annotations of 3DPW dataset are brought from VIBE.
  • 3D mesh visualization is supported by vedo, EasyMocap and Open3D.
