MDM: Human Motion Diffusion Model

The official PyTorch implementation of the paper "Human Motion Diffusion Model".

Please visit our webpage for more details.

Bibtex

If you find this code useful in your research, please cite:

@article{tevet2022human,
  title={Human Motion Diffusion Model},
  author={Tevet, Guy and Raab, Sigal and Gordon, Brian and Shafir, Yonatan and Bermano, Amit H and Cohen-Or, Daniel},
  journal={arXiv preprint arXiv:2209.14916},
  year={2022}
}

News

📢 9/Oct/22 - Added training and evaluation scripts. Note slight env changes adapting to the new code. If you already have an installed environment, run bash prepare/download_glove.sh; pip install clearml to adapt.

📢 6/Oct/22 - First release - sampling and rendering using pre-trained models.

Getting started

This code was tested on Ubuntu 18.04.5 LTS and requires:

Python 3.7
conda3 or miniconda3
CUDA capable GPU (one is enough)

Note: AWS Ubuntu image that works is amazon/Deep Learning AMI (Ubuntu 18.04) Version 64.4

1. Setup environment

Install ffmpeg (if not already installed):

sudo apt update
sudo apt install ffmpeg

For windows use this instead.

Setup conda env:

conda env create -f environment.yml
conda activate mdm
python -m spacy download en_core_web_sm
pip install git+https://github.com/openai/CLIP.git

Download dependencies:

bash prepare/download_smpl_files.sh
bash prepare/download_glove.sh

2. Get data

There are two paths to get the data:

(a) Go the easy way if you just want to generate text-to-motion (excluding editing which does require motion capture data)

(b) Get full data to train and evaluate the model.

a. The easy way (text only)

HumanML3D - Clone HumanML3D, then copy the data dir to our repository:

cd ..
git clone https://github.com/EricGuo5513/HumanML3D.git
unzip ./HumanML3D/HumanML3D/texts.zip -d ./HumanML3D/HumanML3D/
cp -r HumanML3D/HumanML3D motion-diffusion-model/dataset/HumanML3D
cd motion-diffusion-model

b. Full data (text + motion capture)

HumanML3D - Follow the instructions in HumanML3D, then copy the result dataset to our repository:

cp -r ../HumanML3D/HumanML3D ./dataset/HumanML3D

KIT - Download from HumanML3D (no processing needed this time) and the place result in ./dataset/KIT-ML

3. Download the pretrained models

Download the model(s) you wish to use, then unzip and place it in ./save/. For text-to-motion, you need only the first one.

HumanML3D

humanml-encoder-512 (best model)

humanml-decoder-512

humanml-decoder-with-emb-512

KIT

kit-encoder-512

gdown 1PE0PK8e5a5j-7-Xhs5YET5U5pGh0c821

Generate text-to-motion

Generate from test set prompts

python -m sample --model_path ./save/humanml_trans_enc_512/model000200000.pt --num_samples 10 --num_repetitions 3

Generate from your text file

python -m sample --model_path ./save/humanml_trans_enc_512/model000200000.pt --input_text ./assets/example_text_prompts.txt

Generate a single prompt

python -m sample --model_path ./save/humanml_trans_enc_512/model000200000.pt --text_prompt "the person walked forward and is picking up his toolbox."

You can also define:

--device id.
--seed to sample different prompts.
--motion_length in seconds (maximum is 9.8[sec]).

Running those will get you:

results.npy file with text prompts and xyz positions of the generated animation
sample##_rep##.mp4 - a stick figure animation for each generated motion.

It will look something like this:

You can stop here, or render the SMPL mesh using the following script.

Render SMPL mesh

To create SMPL mesh per frame run:

python -m visualize.render_mesh --input_path /path/to/mp4/stick/figure/file

This script outputs:

sample##_rep##_smpl_params.npy - SMPL parameters (thetas, root translations, vertices and faces)
sample##_rep##_obj - Mesh per frame in .obj format.

Notes:

The .obj can be integrated into Blender/Maya/3DS-MAX and rendered using them.
This script is running SMPLify and needs GPU as well (can be specified with the --device flag).
Important - Do not change the original .mp4 path before running the script.

Notes for 3d makers:

You have two ways to animate the sequence:
1. Use the SMPL add-on and the theta parameters saved to sample##_rep##_smpl_params.npy (we always use beta=0 and the gender-neutral model).
2. A more straightforward way is using the mesh data itself. All meshes have the same topology (SMPL), so you just need to keyframe vertex locations. Since the OBJs are not preserving vertices order, we also save this data to the sample##_rep##_smpl_params.npy file for your convenience.

Editing

ETA - Nov 22

Train your own MDM

HumanML3D

python -m train.train_mdm --save_dir save/my_humanml_trans_enc_512 --dataset humanml

KIT

python -m train.train_mdm --save_dir save/my_kit_trans_enc_512 --dataset kit

Use --device to define GPU id.
Use --arch to choose one of the architectures reported in the paper {trans_enc, trans_dec, gru} (trans_enc is default).
Add --train_platform_type {ClearmlPlatform, TensorboardPlatform} to track results with either ClearML or Tensorboard.
Add --eval_during_training to run a short (90 minutes) evaluation for each saved checkpoint. This will slow down training but will give you better monitoring.

Evaluate

Takes about 20 hours (on a single GPU)
The output of this script for the pre-trained models (as was reported in the paper) is provided in the checkpoints zip file.

HumanML3D

python -m eval.eval_humanml --model_path ./save/humanml_trans_enc_512/model000475000.pt

KIT

python -m eval.eval_humanml --model_path ./save/kit_trans_enc_512/model000400000.pt

Acknowledgments

This code is standing on the shoulders of giants. We want to thank the following contributors that our code is based on:

guided-diffusion, MotionCLIP, text-to-motion, actor, joints2smpl.

License

This code is distributed under an MIT LICENSE.

Note that our code depends on other libraries, including CLIP, SMPL, SMPL-X, PyTorch3D, and uses datasets that each have their own respective licenses that must also be followed.

Name		Name	Last commit message	Last commit date
Latest commit History 33 Commits
assets		assets
body_models		body_models
data_loaders		data_loaders
dataset		dataset
diffusion		diffusion
eval		eval
fbx		fbx
model		model
prepare		prepare
train		train
utils		utils
visualize		visualize
.gitignore		.gitignore
FbxCommon.py		FbxCommon.py
LICENSE		LICENSE
README.md		README.md
environment-windows.yml		environment-windows.yml
environment.yml		environment.yml
fbx.so		fbx.so
fbx_output.py		fbx_output.py
fbxsip.so		fbxsip.so
install.sh		install.sh
libfbxsdk.so		libfbxsdk.so
npy2json.py		npy2json.py
run.sh		run.sh
sample.py		sample.py
sampler.py		sampler.py
server.py		server.py
setup.sh		setup.sh
visualizer.py		visualizer.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

MDM: Human Motion Diffusion Model

Bibtex

News

Getting started

1. Setup environment

2. Get data

a. The easy way (text only)

b. Full data (text + motion capture)

3. Download the pretrained models

Generate text-to-motion

Generate from test set prompts

Generate from your text file

Generate a single prompt

Render SMPL mesh

Editing

Train your own MDM

Evaluate

Acknowledgments

License

About

Releases

Packages

Languages

License

webaverse/motion-diffusion-model

Folders and files

Latest commit

History

Repository files navigation

MDM: Human Motion Diffusion Model

Bibtex

News

Getting started

1. Setup environment

2. Get data

a. The easy way (text only)

b. Full data (text + motion capture)

3. Download the pretrained models

Generate text-to-motion

Generate from test set prompts

Generate from your text file

Generate a single prompt

Render SMPL mesh

Editing

Train your own MDM

Evaluate

Acknowledgments

License

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages