Merge branch 'master' into develop
yysijie committed Aug 3, 2019
2 parents c1680c4 + 905330c commit e75030b
Showing 2 changed files with 28 additions and 15 deletions.
32 changes: 21 additions & 11 deletions README.md
@@ -1,5 +1,5 @@
# Spatial Temporal Graph Convolutional Networks (ST-GCN)
A graph convolutional network for skeleton based action recognition.
A graph convolutional network for skeleton-based action recognition.

<div align="center">
<img src="resource/info/pipeline.png">
@@ -12,13 +12,13 @@ This repository holds the codebase, dataset and models for the paper:
[[Arxiv Preprint]](https://arxiv.org/abs/1801.07455)

## News & Updates
- Feb. 21, 2019 - We provide pretrained models and training scripts for **NTU-RGB+D** dataset.
- July. 10, 2019 - We provide processed data on NTU-RGB+D and kinetics-skeleton.
- Feb. 21, 2019 - We provide pretrained models and training scripts on **NTU-RGB+D** and **kinetics-skeleton** datasets, so that you can achieve the performance reported in the paper.
- June. 5, 2018 - A demo for feature visualization and skeleton-based action recognition is released.
- June. 1, 2018 - We update our code base and complete the PyTorch 0.4.0 migration. You can switch to the old version [v0.1.0](https://github.com/yysijie/st-gcn/tree/v0.1.0)
to acquire the original setting in the paper.
- June. 1, 2018 - We update our code base and complete the PyTorch 0.4.0 migration.

## Visualization of ST-GCN in Action
Our demo for skeleton based action recognition:
Our demo for skeleton-based action recognition:
<p align="center">
<img src="resource/info/demo_video.gif" width="1200">
</p>
@@ -93,26 +93,36 @@ A video like the one above will be generated and saved under ```data/demo_result/```.

## Data Preparation

We experimented on two skeleton-based action recognition datasets: **Kinetics-skeleton** and **NTU RGB+D**. The experiments on NTU RGB+D
are not currently supported in this new version. You can switch to the old version [v0.1.0](https://github.com/yysijie/st-gcn/tree/v0.1.0)
to acquire the full experimental setting.
We experimented on two skeleton-based action recognition datasets: **Kinetics-skeleton** and **NTU RGB+D**.
Before training and testing, the datasets should be converted to the proper file structure for fast data loading.
You can download the pre-processed data from
[GoogleDrive](https://drive.google.com/open?id=103NOL9YYZSW1hLoWmYnv5Fs8mK-Ij7qb)
and extract the files with:
```
cd st-gcn
unzip <path to st-gcn-processed-data.zip>
```
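
If you want to confirm that the archive unpacked where the training scripts expect, a quick check such as the one below can help. This is only a sketch: the directory names are assumptions, so inspect the extracted archive if they differ.
```
# Sanity check that the pre-processed data was extracted.
# NOTE: these directory names are assumptions, not taken from the archive.
from pathlib import Path

for d in ["data/Kinetics/kinetics-skeleton", "data/NTU-RGB-D"]:
    print(d, "->", "found" if Path(d).exists() else "missing")
```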
**Otherwise, to process the raw data yourself,
please refer to the guidance below.**

### Kinetics-skeleton
#### Kinetics-skeleton
[Kinetics](https://deepmind.com/research/open-source/open-source-datasets/kinetics/) is a video-based dataset for action recognition which only provides raw video clips without skeleton data. To obtain the joint locations, we first resized all videos to a resolution of 340x256 and converted the frame rate to 30 fps. Then, we extracted skeletons from each frame in Kinetics with [Openpose](https://github.com/CMU-Perceptual-Computing-Lab/openpose). The extracted skeleton data, called **Kinetics-skeleton** (7.5GB), can be directly downloaded from [GoogleDrive](https://drive.google.com/open?id=1SPQ6FmFsjGg3f59uCWfdUWI-5HJM_YhZ) or [BaiduYun](https://pan.baidu.com/s/1dwKG2TLvG-R1qeIiE4MjeA#list/path=%2FShare%2FAAAI18%2Fkinetics-skeleton&parentPath=%2FShare).

After uncompressing, rebuild the database with this command:
```
python tools/kinetics_gendata.py --data_path <path to kinetics-skeleton>
```
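
If you prefer to rebuild Kinetics-skeleton from the raw videos instead, the resizing and frame-rate conversion described above can be reproduced along these lines. The sketch below simply shells out to ffmpeg; it is an illustration of the described preprocessing under assumed placeholder paths, not the authors' original pipeline.
```
# Illustrative preprocessing sketch (not the original pipeline):
# resize each clip to 340x256 and convert it to 30 fps via ffmpeg.
import subprocess
from pathlib import Path

def preprocess_videos(src_dir, dst_dir):
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for clip in Path(src_dir).glob("*.mp4"):
        subprocess.run([
            "ffmpeg", "-i", str(clip),
            "-vf", "scale=340:256",  # target resolution
            "-r", "30",              # target frame rate
            str(dst / clip.name),
        ], check=True)
```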

### NTU RGB+D
#### NTU RGB+D
NTU RGB+D can be downloaded from [their website](http://rose1.ntu.edu.sg/datasets/actionrecognition.asp).
Only the **3D skeletons** (5.8GB) modality is required for our experiments. After downloading, use this command to build the database for training or evaluation:
```
python tools/ntu_gendata.py --data_path <path to nturgbd+d_skeletons>
```
where ```<path to nturgbd+d_skeletons>``` points to the 3D skeletons modality of the NTU RGB+D dataset you downloaded.
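
After either conversion, it can be worth loading the generated database once to confirm its layout. The snippet below is a sketch: the file names follow the gendata tools' usual output, but treat them as assumptions and adjust the paths to your setup.
```
# Hypothetical sanity check on the generated database; the file names and
# the (sample, channel, frame, joint, person) layout are assumptions here.
import pickle
import numpy as np

data = np.load("data/NTU-RGB-D/xview/train_data.npy")
with open("data/NTU-RGB-D/xview/train_label.pkl", "rb") as f:
    sample_names, labels = pickle.load(f)

print(data.shape)    # e.g. (N, 3, 300, 25, 2) for NTU RGB+D
print(len(labels))   # one label per sample
```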


## Testing Pretrained Models

<!-- ### Evaluation
@@ -144,7 +144,7 @@ The expected **Top-1 accuracy** of the provided models is shown here:
| Model| Kinetics-<br>skeleton (%)|NTU RGB+D <br> Cross View (%) |NTU RGB+D <br> Cross Subject (%) |
| :------| :------: | :------: | :------: |
|Baseline[1]| 20.3 | 83.1 | 74.3 |
|**ST-GCN** (Ours)| **31.6**| **88.8** | **81.2** |
|**ST-GCN** (Ours)| **31.6**| **88.8** | **81.6** |

[1] Kim, T. S., and Reiter, A. 2017. Interpretable 3d human action analysis with temporal convolutional networks. In BNMW CVPRW.

11 changes: 7 additions & 4 deletions tools/utils/visualization.py
@@ -106,10 +106,13 @@ def stgcn_visualize(pose,
text_3 = cv2.imread('./resource/demo_asset/attention+prediction.png', cv2.IMREAD_UNCHANGED)
text_4 = cv2.imread('./resource/demo_asset/attention+rgb.png', cv2.IMREAD_UNCHANGED)

blend(frame, text_1)
blend(skeleton, text_2)
blend(skeleton_result, text_3)
blend(rgb_result, text_4)
try:
    blend(frame, text_1)
    blend(skeleton, text_2)
    blend(skeleton_result, text_3)
    blend(rgb_result, text_4)
except Exception:
    # Skip the text overlays if an asset is missing or its size mismatches,
    # rather than aborting the whole visualization.
    pass

if label is not None:
label_name = 'voting result: ' + label
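
For context, `blend` composites a BGRA overlay (loaded with `cv2.IMREAD_UNCHANGED`) onto a frame in place. A minimal sketch of such a helper, assuming the overlay is no larger than the target image, might look like the following; it is an illustration, not the repository's actual implementation.
```
# Minimal alpha-blending sketch (illustrative, not the repository's code).
# `text` is a BGRA overlay; its alpha channel weights it against `image`.
import numpy as np

def blend(image, text):
    h, w = text.shape[:2]
    alpha = text[:, :, 3:4].astype(np.float32) / 255.0
    region = image[:h, :w].astype(np.float32)
    image[:h, :w] = (alpha * text[:, :, :3] + (1.0 - alpha) * region).astype(np.uint8)
```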
