Skip to content
forked from ttxskk/AiOS

[CVPR 2024] Official Code for "AiOS: All-in-One-Stage Expressive Human Pose and Shape Estimation

License

Notifications You must be signed in to change notification settings

CuckooEcho/AiOS

 
 

Repository files navigation

AiOS: All-in-One-Stage Expressive Human Pose and Shape Estimation

1SenseTime Research, 2City University of Hong Kong,
3International Digital Economy Academy (IDEA),
4S-Lab, Nanyang Technological University, 5Shanghai AI Laboratory

method

AiOS performs human localization and SMPL-X estimation in a progressive manner. It is composed of (1) the body localization stage that predicts coarse human location; (2) the Body refinement stage that refines body features and produces face and hand locations; (3) the Whole-body Refinement stage that refines whole-body features and regress SMPL-X parameters.

Preparation

The file structure should be like:

AiOS/
├── config/
└── data
    ├── body_models
    |   ├── smplx
    |   |   ├──MANO_SMPLX_vertex_ids.pkl
    |   |   ├──SMPL-X__FLAME_vertex_ids.npy
    |   |   ├──SMPLX_NEUTRAL.pkl
    |   |   ├──SMPLX_to_J14.pkl
    |   |   ├──SMPLX_NEUTRAL.npz
    |   |   ├──SMPLX_MALE.npz
    |   |   └──SMPLX_FEMALE.npz
    |   └── smpl
    |       ├──SMPL_FEMALE.pkl
    |       ├──SMPL_MALE.pkl
    |       └──SMPL_NEUTRAL.pkl
    ├── preprocessed_npz
    │   └── cache
    |       ├──agora_train_3840_w_occ_cache_2010.npz
    |       ├──bedlam_train_cache_080824.npz
    |       ├──...
    |       └──coco_train_cache_080824.npz
    ├── checkpoint
    │   └── aios_checkpoint.pth
    ├── datasets
    │   ├── agora
    |   │    └──3840x2160
    │   │        ├──train
    │   │        └──test
    │   ├── bedlam
    │   │     ├──train_images
    │   │     └──test_images
    │   ├── ARCTIC
    │   │     ├──s01
    │   │     ├──s02
    │   │     ├──...   
    │   │     └──s10
    │   ├── EgoBody
    │   │     ├──egocentric_color
    │   │     └──kinect_color
    │   └── UBody
    |       └──images
    └── checkpoint
        ├── edpose_r50_coco.pth
        └── aios_checkpoint.pth

Installtion

# Create a conda virtual environment and activate it.
conda create -n aios python=3.8 -y
conda activate aios

# Install PyTorch and torchvision.
conda install pytorch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1 cudatoolkit=11.3 -c pytorch -c conda-forge

# Install Pytorch3D
git clone -b v0.6.1 https://github.com/facebookresearch/pytorch3d.git
cd pytorch3d
pip install -v -e .
cd ..

# Install MMCV, build from source
git clone -b v1.6.1 https://github.com/open-mmlab/mmcv.git
cd mmcv
export MMCV_WITH_OPS=1
export FORCE_MLU=1
pip install -v -e .
cd ..

# Install other dependencies
conda install -c conda-forge ffmpeg
pip install -r requirements.txt 

# Build deformable detr
cd models/aios/ops
python setup.py build install
cd ../../..

Inference

  • Place the mp4 video for inference under AiOS/demo/
  • Prepare the pretrained models to be used for inference under AiOS/data/checkpoint
  • Inference output will be saved in AiOS/demo/{INPUT_VIDEO}_out
# CHECKPOINT: checkpoint path
# INPUT_VIDEO: input video path
# OUTPUT_DIR: output path
# NUM_PERSON: num of person. This parameter sets the expected number of persons to be detected in the input (image or video). 
#   The default value is 1, meaning the algorithm will try to detect at least one person. If you know the maximum number of persons
#   that can appear simultaneously, you can set this variable to that number to optimize the detection process (a lower threshold is recommended as well).
# THRESHOLD: socre threshold. This parameter sets the score threshold for person detection. The default value is 0.5. 
#   If the confidence score of a detected person is lower than this threshold, the detection will be discarded. 
#   Adjusting this threshold can help in filtering out false positives or ensuring only high-confidence detections are considered.
# GPU_NUM: GPU num. 
sh scripts/inference.sh {CHECKPOINT} {INPUT_VIDEO} {OUTPUT_DIR} {NUM_PERSON} {THRESHOLD} {THRESHOLD}

# For inferencing short_video.mp4 with output directory of demo/short_video_out
sh scripts/inference.sh data/checkpoint/aios_checkpoint.pth short_video.mp4 demo 2 0.1 8

Test

NMVE NMJE MVE MPJPE
DATASETS FB B FB B FB B F LH/RH FB B F LH/RH
BEDLAM 87.6 57.7 85.8 57.7 83.2 54.8 26.2 28.1/30.8 81.5 54.8 26.2 25.9/28.0
AGORA-Test 102.9 63.4 100.7 62.5 98.8 60.9 27.7 42.5/43.4 96.7 60.0 29.2 40.1/41.0
AGORA-Val 105.1 60.9 102.2 61.4 100.9 60.9 30.6 43.9/45.6 98.1 58.9 32.7 41.5/43.4

a. Make test_result dir

mkdir test_result

b. AGORA Validatoin

Run the following command and it will generate a 'predictions/' result folder which can evaluate with the agora evaluation tool

sh scripts/test_agora_val.sh data/checkpoint/aios_checkpoint.pth agora_val

b. AGORA Test Leaderboard

Run the following command and it will generate a 'predictions.zip' which can be submitted to AGORA Leaderborad

sh scripts/test_agora.sh data/checkpoint/aios_checkpoint.pth agora_test

c. BEDLAM

Run the following command and it will generate a 'predictions.zip' which can be submitted to BEDLAM Leaderborad

sh scripts/test_bedlam.sh data/checkpoint/aios_checkpoint.pth bedlam_test

Acknowledge

Some of the codes are based on MMHuman3D, ED-Pose and SMPLer-X.

About

[CVPR 2024] Official Code for "AiOS: All-in-One-Stage Expressive Human Pose and Shape Estimation

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 97.0%
  • Cuda 2.6%
  • Other 0.4%