Merge branch 'master' into develop
yysijie committed Aug 3, 2019
2 parents c1680c4 + 905330c commit e75030b
Showing 2 changed files with 28 additions and 15 deletions.
32 changes: 21 additions & 11 deletions README.md
@@ -1,5 +1,5 @@
# Spatial Temporal Graph Convolutional Networks (ST-GCN)
A graph convolutional network for skeleton based action recognition.
A graph convolutional network for skeleton-based action recognition.

<div align="center">
<img src="resource/info/pipeline.png">
@@ -12,13 +12,13 @@ This repository holds the codebase, dataset and models for the paper:
[[Arxiv Preprint]](https://arxiv.org/abs/1801.07455)

## News & Updates
- Feb. 21, 2019 - We provide pretrained models and training scripts for **NTU-RGB+D** dataset.
- July. 10, 2019 - We provide processed data on NTU-RGB+D and kinetics-skeleton.
- Feb. 21, 2019 - We provide pretrained models and training scripts on **NTU-RGB+D** and **kinetics-skeleton** datasets, so that you can achieve the performance reported in the paper.
- June. 5, 2018 - A demo for feature visualization and skeleton-based action recognition is released.
- June. 1, 2018 - We update our code base and complete the PyTorch 0.4.0 migration. You can switch to the old version [v0.1.0](https://github.com/yysijie/st-gcn/tree/v0.1.0)
to acquire the original setting in the paper.
- June. 1, 2018 - We update our code base and complete the PyTorch 0.4.0 migration.

## Visualization of ST-GCN in Action
Our demo for skeleton based action recognition:
Our demo for skeleton-based action recognition:
<p align="center">
<img src="resource/info/demo_video.gif" width="1200">
</p>
@@ -93,26 +93,36 @@ A video like the one above will be generated and saved under ```data/demo_result/```.

## Data Preparation

We experimented on two skeleton-based action recognition datasets: **Kinetics-skeleton** and **NTU RGB+D**. The experiments on NTU RGB+D
are not currently supported in this new version. You can switch to the old version [v0.1.0](https://github.com/yysijie/st-gcn/tree/v0.1.0)
to acquire the full experimental setting.
We experimented on two skeleton-based action recognition datasets: **Kinetics-skeleton** and **NTU RGB+D**.
Before training and testing, the datasets should be converted to the proper file structure for fast data loading.
You can download the pre-processed data from
[GoogleDrive](https://drive.google.com/open?id=103NOL9YYZSW1hLoWmYnv5Fs8mK-Ij7qb)
and extract the files with:
```
cd st-gcn
unzip <path to st-gcn-processed-data.zip>
```
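
If you want to confirm that the archive unpacked where the training scripts expect, a quick check such as the one below can help. This is only a sketch: the directory names are assumptions, so inspect the extracted archive if they differ.
```
# Sanity check that the pre-processed data was extracted.
# NOTE: these directory names are assumptions, not taken from the archive.
from pathlib import Path

for d in ["data/Kinetics/kinetics-skeleton", "data/NTU-RGB-D"]:
    print(d, "->", "found" if Path(d).exists() else "missing")
```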
**Otherwise, to process the raw data yourself,
please refer to the guidance below.**

### Kinetics-skeleton
#### Kinetics-skeleton
[Kinetics](https://deepmind.com/research/open-source/open-source-datasets/kinetics/) is a video-based dataset for action recognition which only provides raw video clips without skeleton data. To obtain the joint locations, we first resized all videos to a resolution of 340x256 and converted the frame rate to 30 fps. Then, we extracted skeletons from each frame in Kinetics with [Openpose](https://github.com/CMU-Perceptual-Computing-Lab/openpose). The extracted skeleton data, called **Kinetics-skeleton** (7.5GB), can be directly downloaded from [GoogleDrive](https://drive.google.com/open?id=1SPQ6FmFsjGg3f59uCWfdUWI-5HJM_YhZ) or [BaiduYun](https://pan.baidu.com/s/1dwKG2TLvG-R1qeIiE4MjeA#list/path=%2FShare%2FAAAI18%2Fkinetics-skeleton&parentPath=%2FShare).

After uncompressing, rebuild the database with this command:
```
python tools/kinetics_gendata.py --data_path <path to kinetics-skeleton>
```
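
If you prefer to rebuild Kinetics-skeleton from the raw videos instead, the resizing and frame-rate conversion described above can be reproduced along these lines. The sketch below simply shells out to ffmpeg; it is an illustration of the described preprocessing under assumed placeholder paths, not the authors' original pipeline.
```
# Illustrative preprocessing sketch (not the original pipeline):
# resize each clip to 340x256 and convert it to 30 fps via ffmpeg.
import subprocess
from pathlib import Path

def preprocess_videos(src_dir, dst_dir):
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for clip in Path(src_dir).glob("*.mp4"):
        subprocess.run([
            "ffmpeg", "-i", str(clip),
            "-vf", "scale=340:256",  # target resolution
            "-r", "30",              # target frame rate
            str(dst / clip.name),
        ], check=True)
```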

### NTU RGB+D
#### NTU RGB+D
NTU RGB+D can be downloaded from [their website](http://rose1.ntu.edu.sg/datasets/actionrecognition.asp).
Only the **3D skeletons** (5.8GB) modality is required for our experiments. After downloading, use this command to build the database for training or evaluation:
```
python tools/ntu_gendata.py --data_path <path to nturgbd+d_skeletons>
```
where ```<path to nturgbd+d_skeletons>``` points to the 3D skeletons modality of the NTU RGB+D dataset you downloaded.
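
After either conversion, it can be worth loading the generated database once to confirm its layout. The snippet below is a sketch: the file names follow the gendata tools' usual output, but treat them as assumptions and adjust the paths to your setup.
```
# Hypothetical sanity check on the generated database; the file names and
# the (sample, channel, frame, joint, person) layout are assumptions here.
import pickle
import numpy as np

data = np.load("data/NTU-RGB-D/xview/train_data.npy")
with open("data/NTU-RGB-D/xview/train_label.pkl", "rb") as f:
    sample_names, labels = pickle.load(f)

print(data.shape)    # e.g. (N, 3, 300, 25, 2) for NTU RGB+D
print(len(labels))   # one label per sample
```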


## Testing Pretrained Models

<!-- ### Evaluation
@@ -144,7 +144,7 @@ The expected **Top-1 accuracy** of the provided models is shown here:
| Model| Kinetics-<br>skeleton (%)|NTU RGB+D <br> Cross View (%) |NTU RGB+D <br> Cross Subject (%) |
| :------| :------: | :------: | :------: |
|Baseline[1]| 20.3 | 83.1 | 74.3 |
|**ST-GCN** (Ours)| **31.6**| **88.8** | **81.2** |
|**ST-GCN** (Ours)| **31.6**| **88.8** | **81.6** |

[1] Kim, T. S., and Reiter, A. 2017. Interpretable 3d human action analysis with temporal convolutional networks. In BNMW CVPRW.

11 changes: 7 additions & 4 deletions tools/utils/visualization.py
@@ -106,10 +106,13 @@ def stgcn_visualize(pose,
text_3 = cv2.imread('./resource/demo_asset/attention+prediction.png', cv2.IMREAD_UNCHANGED)
text_4 = cv2.imread('./resource/demo_asset/attention+rgb.png', cv2.IMREAD_UNCHANGED)

blend(frame, text_1)
blend(skeleton, text_2)
blend(skeleton_result, text_3)
blend(rgb_result, text_4)
try:
    blend(frame, text_1)
    blend(skeleton, text_2)
    blend(skeleton_result, text_3)
    blend(rgb_result, text_4)
except Exception:
    # Skip the text overlays if an asset is missing or its size mismatches,
    # rather than aborting the whole visualization.
    pass

if label is not None:
label_name = 'voting result: ' + label
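
For context, `blend` composites a BGRA overlay (loaded with `cv2.IMREAD_UNCHANGED`) onto a frame in place. A minimal sketch of such a helper, assuming the overlay is no larger than the target image, might look like the following; it is an illustration, not the repository's actual implementation.
```
# Minimal alpha-blending sketch (illustrative, not the repository's code).
# `text` is a BGRA overlay; its alpha channel weights it against `image`.
import numpy as np

def blend(image, text):
    h, w = text.shape[:2]
    alpha = text[:, :, 3:4].astype(np.float32) / 255.0
    region = image[:h, :w].astype(np.float32)
    image[:h, :w] = (alpha * text[:, :, :3] + (1.0 - alpha) * region).astype(np.uint8)
```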
