PYSKL is a toolbox for action recognition based on SKeLeton data, built with PYTorch. It will support a range of algorithms for skeleton-based action recognition. The project is built on the open-source project MMAction2.
This repo is the official implementation of PoseConv3D and STGCN++.
PYSKL is an open-source project under the Apache 2.0 license. Any contribution from the community to improve PYSKL is appreciated. For significant contributions (such as supporting a novel and important task), a corresponding section will be added to our updated tech report, and the contributor will be added to the author list.
Any user can open a PR to contribute to PYSKL. The PR will be reviewed before being merged into the master branch. If you plan to open a large PR, we recommend contacting me first (via my email [email protected]) to discuss the design, which can save a lot of time in the review stage.
- Support DG-STGCN, a state-of-the-art skeleton action recognition algorithm that doesn't rely on a pre-defined graph (2022-12-12).
- The tech report of PYSKL has been accepted by MM 2022 (2022-06-28).
- Support spatial augmentations and provide a benchmark on ST-GCN++ (2022-05-12).
- Support skeleton action recognition demo with GCN algorithms (2022-05-03).
- Release the skeleton annotations (HRNet 2D Pose), config files, and pre-trained ckpts for Kinetics-400. K400 is a large-scale dataset (even for skeleton data), so you should have `memcached` and `pymemcache` installed for efficient training & testing on K400 (2022-05-01).
- DG-STGCN (arXiv): https://arxiv.org/abs/2210.05895 [MODELZOO]
- ST-GCN (AAAI 2018): https://arxiv.org/abs/1801.07455 [MODELZOO]
- ST-GCN++ (PYSKL, Tech Report): https://arxiv.org/abs/2205.09443 [MODELZOO]
- PoseConv3D (CVPR 2022 Oral): https://arxiv.org/abs/2104.13586 [MODELZOO]
- AAGCN (TIP): https://arxiv.org/abs/1912.06971 [MODELZOO]
- MS-G3D (CVPR 2020 Oral): https://arxiv.org/abs/2003.14111 [MODELZOO]
- CTR-GCN (ICCV 2021): https://arxiv.org/abs/2107.12213 [MODELZOO]
- NTURGB+D (CVPR 2016): NTU RGB+D: A large scale dataset for 3D human activity analysis
- NTURGB+D 120 (TPAMI 2019): NTU RGB+D 120: A large-scale benchmark for 3D human activity understanding
- Kinetics 400 (CVPR 2017): Quo vadis, action recognition? A new model and the Kinetics dataset
- UCF101 (arXiv 2012): UCF101: A dataset of 101 human actions classes from videos in the wild
- HMDB51 (ICCV 2011): HMDB: A large video database for human motion recognition
- FineGYM (CVPR 2020): FineGym: A hierarchical video dataset for fine-grained action understanding
- Diving48 (ECCV 2018): Resound: Towards action recognition without representation bias
For data pre-processing, we estimate 2D skeletons with a two-stage pose estimator (Faster-RCNN + HRNet). For 3D skeletons, we follow the pre-processing procedure of CTR-GCN. We do not currently provide the pre-processing scripts. Instead, we provide the processed skeleton data as pickle files (download links here), which can be used directly for training and evaluation. You can use vis_skeleton to visualize the provided skeleton data.
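Since the internal layout of the pickle files is not described here, a quick way to get oriented is to load one and print a summary of its structure. The following is a minimal sketch that assumes nothing about the keys or fields inside the file; `summarize` and `inspect_pickle` are hypothetical helper names, not part of PYSKL. Only unpickle files obtained from a source you trust, since loading a pickle can execute arbitrary code.

```python
import pickle


def summarize(obj, max_depth=2, _depth=0):
    """Return a short, nested description of a loaded pickle object."""
    shape = getattr(obj, "shape", None)
    if shape is not None:
        # Array-like objects (e.g. numpy arrays): report shape and dtype
        dtype = getattr(obj, "dtype", "?")
        return f"{type(obj).__name__}(shape={tuple(shape)}, dtype={dtype})"
    if isinstance(obj, dict):
        if _depth >= max_depth:
            return f"dict({len(obj)} keys)"
        # Describe at most the first 10 entries to keep the output short
        inner = ", ".join(
            f"{key!r}: {summarize(value, max_depth, _depth + 1)}"
            for key, value in list(obj.items())[:10]
        )
        return "{" + inner + "}"
    if isinstance(obj, (list, tuple)):
        if not obj or _depth >= max_depth:
            return f"{type(obj).__name__}(len={len(obj)})"
        first = summarize(obj[0], max_depth, _depth + 1)
        return f"{type(obj).__name__}(len={len(obj)}, first={first})"
    return type(obj).__name__


def inspect_pickle(path):
    """Load a pickle file from `path` and describe its top-level structure."""
    with open(path, "rb") as f:
        data = pickle.load(f)
    return summarize(data)
```

Calling `print(inspect_pickle("path/to/downloaded_file.pkl"))` (the path is a placeholder) shows the container types, lengths, and array shapes without printing the data itself. If the file contains numpy arrays, numpy must be installed for the load to succeed.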
git clone https://github.com/kennymckormick/pyskl.git
cd pyskl
# First install PyTorch following the instructions on the official website: https://pytorch.org/get-started/locally/. Use a PyTorch version >= 1.5.0 and < 1.11.0.
# The following command installs mmcv-full 1.5.0 from source, which can be very slow (about 10 minutes). You can also follow the instructions at https://github.com/open-mmlab/mmcv to install mmcv-full from pre-built wheels, which is much faster.
pip install -r requirements.txt
pip install -e .
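The PyTorch version range stated in the install comments above (at least 1.5.0, below 1.11.0) can be checked programmatically before installing the remaining requirements. This is a small sketch, not part of PYSKL: `parse_version` and `is_supported_torch` are hypothetical helpers, and the parser deliberately ignores build suffixes such as `+cu113`.

```python
import re


def parse_version(version):
    """Turn a string such as '1.10.2+cu113' into the tuple (1, 10, 2)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component (e.g. '1.8') is treated as 0
    return tuple(int(part or 0) for part in match.groups())


def is_supported_torch(version, minimum=(1, 5, 0), upper=(1, 11, 0)):
    """Return True if minimum <= version < upper."""
    return minimum <= parse_version(version) < upper
```

With PyTorch installed, `import torch; print(is_supported_torch(torch.__version__))` reports whether the installed version falls inside the range.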
# Before running the demo, make sure you have installed mmcv-full, mmpose and mmdet. Install mmcv-full first, then mmpose and mmdet.
# Run the following scripts from the `$PYSKL` directory
# Running the demo with PoseC3D trained on NTURGB+D 120 (Joint Modality), which is the default option. The input file is demo/ntu_sample.avi, the output file is demo/demo.mp4
python demo/demo_skeleton.py demo/ntu_sample.avi demo/demo.mp4
# Running the demo with STGCN++ trained on NTURGB+D 120 (Joint Modality). The input file is demo/ntu_sample.avi, the output file is demo/demo.mp4
python demo/demo_skeleton.py demo/ntu_sample.avi demo/demo.mp4 --config configs/stgcn++/stgcn++_ntu120_xsub_hrnet/j.py --checkpoint http://download.openmmlab.com/mmaction/pyskl/ckpt/stgcnpp/stgcnpp_ntu120_xsub_hrnet/j.pth
Note that to run the demo on an arbitrary input video, you need a tracker that links the per-frame pose estimation results into multiple skeleton sequences. Currently we use a naive tracker based on inter-frame pose similarities. You can also try writing your own tracker.
You can use the following commands for training and testing. We support distributed training on a single server with multiple GPUs.
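To illustrate what a similarity-based tracker does, here is a minimal sketch of greedy frame-to-frame linking. It is not the tracker used in the PYSKL demo: the function names, the input layout (a list of frames, each a list of poses, each pose a list of `(x, y)` keypoints), the mean keypoint distance as the similarity measure, and the `max_dist` threshold are all assumptions made for this example.

```python
import math


def pose_distance(pose_a, pose_b):
    """Mean Euclidean distance between corresponding keypoints of two poses."""
    return sum(math.dist(p, q) for p, q in zip(pose_a, pose_b)) / len(pose_a)


def track_poses(frames, max_dist=50.0):
    """Greedily link per-frame poses into tracks.

    frames: list over time; each item is a list of poses; each pose is a
            list of (x, y) keypoints with a fixed joint order.
    Returns a list of tracks; each track is a list of (frame_index, pose).
    """
    tracks = []  # every track created so far
    live = []    # indices of tracks that received a pose in the previous frame
    for t, poses in enumerate(frames):
        # All candidate (distance, track, pose) pairs, closest first
        pairs = sorted(
            (pose_distance(tracks[k][-1][1], pose), k, j)
            for k in live
            for j, pose in enumerate(poses)
        )
        used_tracks, used_poses = set(), set()
        for dist, k, j in pairs:
            if dist > max_dist:
                break  # remaining pairs are even farther apart
            if k in used_tracks or j in used_poses:
                continue
            tracks[k].append((t, poses[j]))
            used_tracks.add(k)
            used_poses.add(j)
        next_live = sorted(used_tracks)
        # Poses that matched no live track start a new track
        for j, pose in enumerate(poses):
            if j not in used_poses:
                tracks.append([(t, pose)])
                next_live.append(len(tracks) - 1)
        live = next_live
    return tracks
```

In this sketch a track ends as soon as it misses one frame, which keeps the logic short but makes it fragile under missed detections. A more robust tracker would tolerate short gaps or use bounding-box overlap in addition to keypoint distance.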
# Training
bash tools/dist_train.sh {config_name} {num_gpus} {other_options}
# Testing
bash tools/dist_test.sh {config_name} {checkpoint} {num_gpus} --out {output_file} --eval top_k_accuracy mean_class_accuracy
For specific examples, please refer to the README of each supported algorithm.
If you use PYSKL in your research or wish to refer to the baseline results published in the Model Zoo, please use the following BibTeX entry and the BibTeX entry corresponding to the specific algorithm you used.
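The `--eval` option in the testing command above names two metrics, `top_k_accuracy` and `mean_class_accuracy`. The sketch below shows how these metrics are commonly defined; it is an illustration written for this README, not the evaluation code PYSKL runs, and the input layout (one list of class scores per sample, integer labels) is an assumption.

```python
def top_k_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the top k
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


def mean_class_accuracy(scores, labels):
    """Average of per-class top-1 accuracies over classes present in labels."""
    correct, total = {}, {}
    for row, label in zip(scores, labels):
        pred = max(range(len(row)), key=lambda c: row[c])
        total[label] = total.get(label, 0) + 1
        correct[label] = correct.get(label, 0) + (pred == label)
    return sum(correct[c] / total[c] for c in total) / len(total)


scores = [
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
    [0.5, 0.4, 0.1],
    [0.2, 0.3, 0.5],
]
labels = [0, 1, 1, 2]
print(top_k_accuracy(scores, labels, k=1))  # 0.75
print(top_k_accuracy(scores, labels, k=2))  # 1.0
print(mean_class_accuracy(scores, labels))
```

Mean class accuracy weights every class equally, so it differs from top-1 accuracy when classes are imbalanced: here class 1 has two samples and only one is classified correctly, which lowers its per-class accuracy to 0.5 while the other two classes score 1.0.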
@misc{duan2022PYSKL,
url = {https://arxiv.org/abs/2205.09443},
author = {Duan, Haodong and Wang, Jiaqi and Chen, Kai and Lin, Dahua},
title = {PYSKL: Towards Good Practices for Skeleton Action Recognition},
publisher = {arXiv},
year = {2022}
}
For any questions, feel free to contact: [email protected]