Mehwish Ghafoor, Arif Mahmood, Muhammad Bilal
The proposed Dual Transformer Fusion (DTF) architecture takes severely occluded 2D joint positions as input and estimates realistic 3D poses.
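As a minimal sketch of what "severely occluded" input might look like, the snippet below randomly masks joints in a 2D pose sequence before it would be fed to the model. The function name, the zero mask value, and the input layout `(frames, joints, 2)` are illustrative assumptions, not the repo's actual preprocessing.

```python
import numpy as np

def occlude_joints(pose_2d, num_occluded, mask_value=0.0, rng=None):
    """Mask randomly chosen joints in each frame of a 2D pose sequence.

    pose_2d:      array of shape (frames, joints, 2)
    num_occluded: how many joints to mask per frame
    mask_value:   value written into occluded coordinates (assumption)
    """
    rng = rng or np.random.default_rng()
    occluded = pose_2d.copy()
    frames, joints, _ = pose_2d.shape
    for f in range(frames):
        idx = rng.choice(joints, size=num_occluded, replace=False)
        occluded[f, idx, :] = mask_value
    return occluded

# Example: mask 16 of the 17 Human3.6M joints in every frame,
# matching the severe-occlusion demo setting below.
poses = np.random.rand(4, 17, 2)
severely_occluded = occlude_joints(poses, num_occluded=16)
```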
The code was developed and tested in the following environment:

- Install PyTorch 1.7.1 and Torchvision 0.8.2 following the official instructions.
- Install the remaining dependencies:

```bash
pip3 install -r requirements.txt
```
- Download the dataset from the Human 3.6M website.
- Set up the Human3.6M dataset as per the VideoPose3D instructions.
- Alternatively, download the processed data from here.
```
${DTF_Occ}/
|-- dataset
|   |-- data_3d_h36m.npz
|   |-- data_2d_h36m_gt.npz
|   |-- data_2d_h36m_cpn_ft_h36m_dbb.npz
```
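After downloading, a quick sanity check that the expected `.npz` files are in place can look like the sketch below. The helper name is hypothetical, and the array keys inside each file depend on the VideoPose3D preprocessing, so the code only lists whatever keys it finds rather than assuming any.

```python
from pathlib import Path
import numpy as np

# File names taken from the directory layout above.
EXPECTED_FILES = [
    "data_3d_h36m.npz",
    "data_2d_h36m_gt.npz",
    "data_2d_h36m_cpn_ft_h36m_dbb.npz",
]

def check_dataset(dataset_dir):
    """Return a dict mapping each expected file name to its list of
    array keys, or None when the file is missing."""
    report = {}
    for name in EXPECTED_FILES:
        path = Path(dataset_dir) / name
        if not path.exists():
            report[name] = None
            continue
        with np.load(path, allow_pickle=True) as data:
            report[name] = list(data.keys())
    return report
```

Running `check_dataset("dataset")` from the repo root prints nothing itself; inspect the returned dict to see which files (if any) are missing.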
You can download the pretrained model for Human 3.6M from here.
For MPI-INF-3DHP, we follow the settings of P-STMO.
Training with 351 frames on Human 3.6M:

```bash
python3 main_h36m.py --frames 351 --batch_size 32
```
Test:

```bash
python3 main_h36m.py --test --previous_dir 'checkpoint/351_severe' --frames 351
```
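The `--frames 351` flag suggests each output pose is predicted from a long temporal window of 2D inputs. As a hedged sketch only (not the repo's actual data pipeline), one common way to cut such windows is edge-replicated padding so every frame of a clip gets a full, centered 351-frame window:

```python
import numpy as np

def extract_windows(sequence, frames=351):
    """Cut a centered window of `frames` frames around every frame,
    replicating the first/last frame at the borders.

    sequence: (num_frames, joints, coords)
    returns:  (num_frames, frames, joints, coords)
    """
    pad = frames // 2
    padded = np.concatenate(
        [np.repeat(sequence[:1], pad, axis=0),   # replicate first frame
         sequence,
         np.repeat(sequence[-1:], pad, axis=0)], # replicate last frame
        axis=0,
    )
    return np.stack([padded[i:i + frames] for i in range(len(sequence))])

seq = np.random.rand(10, 17, 2)   # a short clip: 10 frames, 17 joints
w = extract_windows(seq, frames=351)
```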
Video Demo - Human 3.6M
3D pose estimation for the action "Eating" with 16 of 17 joints randomly occluded:
Using proposed DTF
eat_16miss.mp4
Using MHFormer
eat_16_mhformer.mp4
Using STCFormer
eat_16_stcformer.mp4
Using PSTMO