A graph convolutional network for skeleton-based action recognition.
This repository holds the codebase, dataset and models for the paper:
"Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition", Sijie Yan, Yuanjun Xiong and Dahua Lin, AAAI 2018.
[Arxiv Preprint]
Our codebase is based on Python 2.7. There are a few dependencies required to run the code. The major libraries we used are:
- PyTorch
- NumPy
We experimented on two skeleton-based action recognition datasets: NTU RGB+D and Kinetics-skeleton.
NTU RGB+D can be downloaded from their website. Only the 3D skeletons (5.8GB) modality is required for our experiments. After downloading, use `tools/ntu_gendata.py` to build the database for training or evaluation:

```
tools/ntu_gendata.py --data_path <path to nturgbd>
```

where `<path to nturgbd>` points to the 3D skeletons modality of the NTU RGB+D dataset you downloaded, for example `data/NTU-RGB-D/nturgbd+d_skeletons`.
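As a rough illustration of what the prepared database contains, the sketch below loads a preprocessed skeleton array and its labels with NumPy and pickle. The file names and the (N, C, T, V, M) layout are assumptions for illustration; the actual output format is defined by `tools/ntu_gendata.py`.

```python
import pickle
import numpy as np

# Hypothetical file names; the actual output layout is determined by
# tools/ntu_gendata.py.
data = np.load('data/NTU-RGB-D/train_data.npy', mmap_mode='r')
with open('data/NTU-RGB-D/train_label.pkl', 'rb') as f:
    sample_names, labels = pickle.load(f)

# Skeleton tensors are commonly stored as (N, C, T, V, M):
# N samples, C coordinate channels, T frames, V joints, M bodies.
print(data.shape)
print(len(labels))
```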
Kinetics is a video-based dataset for action recognition which provides only raw video clips without skeleton data. To obtain the joint locations, we first resized all videos to a resolution of 340x256 and converted the frame rate to 30 fps. Then, we extracted skeletons from each frame with OpenPose. We call the extracted skeleton data Kinetics-skeleton (7.5GB); it can be downloaded directly from here.
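For reference, the sketch below gathers per-frame OpenPose output into a joint-coordinate array. The `pose_keypoints_2d` field name matches recent OpenPose JSON output; the file name pattern, the 18-joint (COCO) skeleton, and the 300-frame clip length are illustrative assumptions.

```python
import json
import numpy as np

# Illustrative settings: 18 joints (OpenPose COCO model), up to 300 frames,
# and only the first detected person per frame. These are assumptions.
num_joints, max_frames = 18, 300
skeleton = np.zeros((3, max_frames, num_joints), dtype=np.float32)  # x, y, score

for t in range(max_frames):
    # One OpenPose JSON file per frame; the naming pattern is hypothetical.
    with open('frames/%05d_keypoints.json' % t) as f:
        frame = json.load(f)
    if not frame['people']:
        continue
    # 'pose_keypoints_2d' is the flat [x, y, score, ...] list written by OpenPose.
    pose = np.asarray(frame['people'][0]['pose_keypoints_2d'], dtype=np.float32)
    skeleton[:, t, :] = pose.reshape(-1, 3)[:num_joints].T
```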
It is highly recommended to store the data on an SSD rather than an HDD for efficiency.
We provide the trained model weights of Temporal Conv [1] and our ST-GCN. The model weights can be downloaded by running the script

```
bash tools/get_reference_models.sh
```
Once the datasets and models are ready, we can start the evaluation. To evaluate all provided models, run

```
bash tools/evaluate_models.sh
```
The expected Top-1 accuracies of the provided models are shown below:
| Model | Kinetics-skeleton (%) | NTU RGB+D Cross View (%) | NTU RGB+D Cross Subject (%) |
| --- | --- | --- | --- |
| Temporal Conv [1] | 20.3 | 83.1 | 74.3 |
| ST-GCN (Ours) | 30.7 | 88.3 | 80.5 |
[1] Kim, T. S., and Reiter, A. 2017. Interpretable 3d human action analysis with temporal convolutional networks. In BNMW CVPRW.
To train a new model, use the `main.py` script. For example:

```
main.py --config config/Kinetics/ST-GCN.yaml
```

We have provided the necessary solver configs under `./config`. The training results will be saved under `./work_dir` by default.
You can modify training parameters such as `batch-size` and `device` on the command line or in the config files. The order of priority is: command line > config file > default parameter. For more information, use `main.py -h`.
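To make the priority order concrete, here is a minimal sketch of how command-line arguments can override a YAML config, which in turn overrides built-in defaults. The argument names and config keys are illustrative, not necessarily the ones used by `main.py`.

```python
import argparse
import yaml

# Command-line arguments (highest priority when given).
parser = argparse.ArgumentParser()
parser.add_argument('--config', default=None)
parser.add_argument('--batch-size', type=int, default=None)
args = parser.parse_args()

# Built-in default parameters (lowest priority); the key is hypothetical.
settings = {'batch_size': 64}

# Config file overrides the defaults (middle priority).
if args.config is not None:
    with open(args.config) as f:
        settings.update(yaml.safe_load(f))

# Command-line arguments override the config file (highest priority).
if args.batch_size is not None:
    settings['batch_size'] = args.batch_size
```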
Finally, a custom model can be evaluated with this command:

```
main.py --phase test --config <path to training config> --weights <path to model weights>
```
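For reference, passing `--weights` essentially amounts to the standard PyTorch checkpoint-loading pattern sketched below; the placeholder network and the weights path are illustrative, not the actual ST-GCN classes or files.

```python
import torch
import torch.nn as nn

# Placeholder network standing in for the ST-GCN model defined in this repository.
class PlaceholderModel(nn.Module):
    def __init__(self, num_class=400):
        super(PlaceholderModel, self).__init__()
        self.fc = nn.Linear(256, num_class)

    def forward(self, x):
        return self.fc(x)

model = PlaceholderModel()
# '--weights' points at a serialized state dict such as this hypothetical file.
state_dict = torch.load('<path to model weights>', map_location='cpu')
model.load_state_dict(state_dict)
model.eval()  # switch to evaluation mode before running the test phase
```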