
EMP-Net

The official implementation for Efficient Few-Shot Action Recognition via Multi-Level Post-Reasoning (ECCV 2024).

Prerequisites

Requirements:

  • Python>=3.6
  • torch>=1.5
  • torchvision (version matching your torch install)
  • simplejson==3.11.1
  • decord>=0.6.0
  • pyyaml
  • einops
  • oss2
  • psutil
  • tqdm
  • pandas

Optional requirements:

  • fvcore (for flops calculation)

You can create the environment with the following command:

conda env create -f environment.yaml
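If you installed the optional fvcore dependency, FLOPs can be counted with its FlopCountAnalysis API. Below is a minimal sketch; the model and input shape are placeholders, not EMP-Net's actual configuration:

# Minimal FLOPs-counting sketch using the optional fvcore dependency.
# The model and input shape are placeholders, not EMP-Net's setup.
import torch
from fvcore.nn import FlopCountAnalysis

model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 64, kernel_size=3, padding=1),
    torch.nn.ReLU(),
)
dummy_input = torch.randn(1, 3, 224, 224)  # (batch, channels, height, width)
flops = FlopCountAnalysis(model, dummy_input)
print(f"Total FLOPs: {flops.total() / 1e9:.2f} G")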

Data

First, you need to download the datasets from their original sources and put them into the data folder.

Then, prepare the data according to the splits we provide.

Preprocessing Something-Something-V2 dataset

As in the pytorch-video-understanding codebase, the video decoder decord has difficulty decoding the original .webm files, so we provide a script for converting the .webm files in the original Something-Something-V2 dataset to .mp4 files. To do this, simply run:

python datasets/utils/preprocess_ssv2_annos.py --anno --anno_path path_to_your_annotation
python datasets/utils/preprocess_ssv2_annos.py --data --data_path path_to_your_ssv2_videos --data_out_path path_to_put_output_videos
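For illustration only, the conversion step is conceptually equivalent to re-encoding each .webm file with ffmpeg. Below is a minimal sketch assuming ffmpeg is on your PATH; the repo's datasets/utils/preprocess_ssv2_annos.py is the authoritative implementation:

# Hypothetical sketch of the .webm -> .mp4 re-encoding step; the repo's
# datasets/utils/preprocess_ssv2_annos.py is the authoritative implementation.
# Assumes ffmpeg is installed and on PATH.
import pathlib
import subprocess

def convert_webm_to_mp4(src_dir: str, dst_dir: str) -> None:
    dst = pathlib.Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for webm in pathlib.Path(src_dir).glob("*.webm"):
        mp4 = dst / (webm.stem + ".mp4")
        # Re-encode so that decord can decode the result reliably.
        subprocess.run(["ffmpeg", "-y", "-i", str(webm), str(mp4)],
                       check=True, capture_output=True)

convert_webm_to_mp4("path_to_your_ssv2_videos", "path_to_put_output_videos")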

Make sure the annotation files are organized as follows:

-- path_to_your_annotation
    -- something-something-v2-train.json
    -- something-something-v2-validation.json
    -- something-something-v2-labels.json
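As a quick sanity check that the three annotation files are in place and parseable, you can use the simplejson dependency from the requirements. A minimal sketch; path_to_your_annotation is a placeholder for your actual directory:

# Sanity-check sketch: verify the three SSv2 annotation files exist and parse.
# path_to_your_annotation is a placeholder; adjust to your actual directory.
import pathlib
import simplejson

anno_dir = pathlib.Path("path_to_your_annotation")
for name in ("something-something-v2-train.json",
             "something-something-v2-validation.json",
             "something-something-v2-labels.json"):
    path = anno_dir / name
    assert path.exists(), f"Missing annotation file: {path}"
    with open(path) as f:
        data = simplejson.load(f)
    print(f"{name}: {len(data)} entries")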

Running

The entry file for all runs is runs/run.py.

Before running, a few settings need to be configured.

For an example 1-shot run on SSv2-Small, all configurations can be found in configs; modify any of them as needed.

Then the codebase can be run with:

python runs/run.py --cfg ./configs/projects/EMP_Net/ssv2_small/EMP_Net_SSv2_Small_1shot_v1.yaml
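Since the configs are plain YAML, you can inspect or tweak them programmatically with the pyyaml dependency before launching a run. A minimal sketch; the path is the example config above, and the printed keys depend on the actual file contents:

# Inspect a run configuration before launching; uses the pyyaml dependency.
# The path is the example config above; top-level keys depend on the file.
import yaml

cfg_path = "./configs/projects/EMP_Net/ssv2_small/EMP_Net_SSv2_Small_1shot_v1.yaml"
with open(cfg_path) as f:
    cfg = yaml.safe_load(f)

# Print the top-level sections so you know what can be overridden.
for key, value in cfg.items():
    print(key, type(value).__name__)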

Citation

If you find this model useful for your research, please use the following BibTeX entry.

@inproceedings{wuefficient,
  author={Wu, Cong and Wu, Xiao-Jun and Li, Linze and Xu, Tianyang and Feng, Zhenhua and Kittler, Josef},
  title={Efficient Few-Shot Action Recognition via Multi-Level Post-Reasoning},
  booktitle={European Conference on Computer Vision},
  year={2024},
  organization={Springer}
}

Acknowledgement

Thanks to the framework provided by CLIP-FSAR, the source code of the published work CLIP-guided Prototype Modulating for Few-shot Action Recognition (IJCV 2023).

Contact

For any further questions, feel free to contact: [email protected].
