TensorFlow implementation of the paper "Parallel Attention Network with Sequence Matching for Video Grounding" (ACL 2021 Findings): ACL version, arXiv version.
- python3 with tensorflow (>=1.13.1, <=1.15.0), tqdm, nltk, numpy, cuda10 and cudnn
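As a quick sanity check before training, the version range above can be tested in a few lines of Python. This is only a sketch: the helper names `parse_version` and `tf_version_supported` are illustrative and not part of this repository, and the bounds are taken literally from the requirement line.

```python
def parse_version(version):
    """Turn a string such as '1.15.0' or '1.15.0rc1' into a tuple like (1, 15, 0)."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at suffixes such as 'rc1'
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)  # pad short versions such as '1.14'
    return tuple(parts)


def tf_version_supported(version, low="1.13.1", high="1.15.0"):
    """Return True if `version` lies in the inclusive range [low, high]."""
    return parse_version(low) <= parse_version(version) <= parse_version(high)
```

With TensorFlow installed, `tf_version_supported(tf.__version__)` after `import tensorflow as tf` should return `True` for a compatible environment.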
The visual features of Charades-STA, ActivityNet Captions and TACoS are available at Box Drive; download them and place them under the ./data/features/ directory.
Download the word embeddings from here and place them in the ./data/features/ directory. The directory hierarchy is shown below:
SeqPAN
    |____ ckpt/
    |____ data/
        |____ datasets/
        |____ features/
            |____ activitynet/
            |____ charades/
            |____ tacos/
            |____ glove.840B.300d.txt
    ...
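To confirm that the features and word embeddings ended up in the right place, a small check along the following lines may help. It is a sketch only: `missing_entries` is a hypothetical helper, not something shipped with the repository, and it checks just the entries under ./data/features/ named in the instructions above.

```python
import os

# Entries the setup instructions place under ./data/features/,
# written relative to the repository root.
EXPECTED_FEATURE_ENTRIES = [
    os.path.join("data", "features", "activitynet"),
    os.path.join("data", "features", "charades"),
    os.path.join("data", "features", "tacos"),
    os.path.join("data", "features", "glove.840B.300d.txt"),
]


def missing_entries(root="."):
    """Return the expected entries that do not exist under `root`."""
    return [
        path
        for path in EXPECTED_FEATURE_ENTRIES
        if not os.path.exists(os.path.join(root, path))
    ]
```

Calling `missing_entries(".")` from the repository root returns an empty list once everything has been downloaded and placed as described.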
Train
# the processed dataset is generated automatically, or loaded if it already exists
# set `--mode test` for evaluation
# train Charades-STA dataset
python main.py --task charades --max_pos_len 64 --char_dim 50 --mode train
# train ActivityNet Captions dataset
python main.py --task activitynet --max_pos_len 100 --char_dim 100 --mode train
# train TACoS dataset
python main.py --task tacos --max_pos_len 256 --char_dim 50 --mode train
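The three commands differ only in their per-task settings, so they can be generated from a small table when scripting several runs. The sketch below is an illustration, not part of the repository: `TASK_SETTINGS` and `build_command` are hypothetical names, and the values are copied from the commands in this README.

```python
# Per-task settings copied from the commands in this README.
TASK_SETTINGS = {
    "charades": {"max_pos_len": 64, "char_dim": 50},
    "activitynet": {"max_pos_len": 100, "char_dim": 100},
    "tacos": {"max_pos_len": 256, "char_dim": 50},
}


def build_command(task, mode="train", suffix=None):
    """Build the main.py argument list for one task."""
    settings = TASK_SETTINGS[task]
    cmd = [
        "python", "main.py",
        "--task", task,
        "--max_pos_len", str(settings["max_pos_len"]),
        "--char_dim", str(settings["char_dim"]),
    ]
    if suffix is not None:
        cmd += ["--suffix", suffix]  # only used when restoring a checkpoint
    cmd += ["--mode", mode]
    return cmd
```

The resulting list can be passed to `subprocess.run(build_command("tacos"), check=True)` from the repository root, or joined with spaces to print the command.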
Test
# the processed dataset is generated automatically, or loaded if it already exists
# set `--suffix xxx` to restore pre-trained parameters for evaluation
# where `xxx` denotes the name after the last `_` of the ckpt directory
# test Charades-STA dataset
python main.py --task charades --max_pos_len 64 --char_dim 50 --suffix xxx --mode test
# test ActivityNet Captions dataset
python main.py --task activitynet --max_pos_len 100 --char_dim 100 --suffix xxx --mode test
# test TACoS dataset
python main.py --task tacos --max_pos_len 256 --char_dim 50 --suffix xxx --mode test
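The `--suffix` rule described in the comments (the name after the last `_` of the ckpt directory) can be written as a short helper. This is a sketch under assumptions: `ckpt_suffix` is a hypothetical function, and the directory name in the usage note is made up for illustration rather than being a real checkpoint name from this project.

```python
import os


def ckpt_suffix(ckpt_dir):
    """Return the part of the checkpoint directory name after its last '_'."""
    # normpath drops any trailing slash so basename yields the directory name.
    name = os.path.basename(os.path.normpath(ckpt_dir))
    if "_" not in name:
        raise ValueError("no '_' in checkpoint directory name: %r" % name)
    return name.rsplit("_", 1)[-1]
```

For a hypothetical directory `./ckpt/model_charades_demo/`, `ckpt_suffix` returns `demo`, which is the value to pass as `--suffix demo`.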
You can also download the checkpoints for each task from here and save them to the ./ckpt/ directory. The corresponding processed datasets are available here; download and save them to the ./datasets/ directory.
More hyper-parameter settings are in main.py.
If you find this project helpful to your research, please cite our work.
@inproceedings{zhang2021parallel,
title = "Parallel Attention Network with Sequence Matching for Video Grounding",
author = "Zhang, Hao and Sun, Aixin and Jing, Wei and Zhen, Liangli and Zhou, Joey Tianyi and Goh, Siow Mong Rick",
booktitle = "Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021",
month = aug,
year = "2021",
address = "Online",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2021.findings-acl.69",
doi = "10.18653/v1/2021.findings-acl.69",
pages = "776--790",
}