Visual Feature Extraction

Follow these steps to extract visual features from the MUStARD dataset videos.

Download the videos from the Hugging Face Hub into data/videos, placing the files directly in that directory, without subdirectories.
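Since the videos must sit directly in data/videos, a quick check for nested directories can catch a download that unpacked into a subfolder. This is a minimal sketch, not part of the repository: check_flat is a hypothetical helper, and it assumes a find that supports -mindepth (GNU and BSD find both do).

```shell
# Hypothetical helper (not part of the repository): fail if the given
# directory is missing or contains any subdirectories.
check_flat() {
    if [ ! -d "$1" ]; then
        echo "not a directory: $1" >&2
        return 1
    fi
    # -mindepth 1 skips the directory itself; -type d keeps only directories.
    nested=$(find "$1" -mindepth 1 -type d)
    if [ -n "$nested" ]; then
        echo "unexpected subdirectories:" >&2
        echo "$nested" >&2
        return 1
    fi
}
```

For example, check_flat data/videos returns a non-zero status and lists the offending paths if any subdirectory is present.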
Move to this directory:
```
cd visual
```
Run save_frames.sh to extract the frames from the video files:
```
./save_frames.sh
```
Create the directories data/features/, data/features/utterances_final/, and data/features/context_final/.
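The directories named above can be created with a single command. This is a sketch that assumes the current directory is the one containing data/; the steps do not say whether that is the repository root or visual/, so adjust the paths if needed.

```shell
# Assumption: the current directory is the one that contains data/.
# -p creates missing parent directories (data/ and data/features/) and
# does not fail if a directory already exists.
mkdir -p data/features/utterances_final data/features/context_final
```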
Extract the features, which are saved into large H5 files. The example below extracts ResNet features:
```
./extract_features.py resnet
```
- To extract C3D features, first download the C3D weights pretrained on Sports-1M into data/features/c3d.pickle.
- To extract I3D features, first download the I3D weights pretrained on ImageNet and Kinetics-400 into data/features/i3d.pt.
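Because the C3D and I3D weights have to be downloaded by hand, a small pre-flight check can catch a missing or empty file before a long extraction run starts. The sketch below is not part of the repository: require_weights is a hypothetical helper, and the weight paths are the ones given above, assumed to be relative to the current directory.

```shell
# Hypothetical helper (not part of the repository): succeed only if the
# given weights file exists and is non-empty.
require_weights() {
    if [ -s "$1" ]; then
        echo "found: $1"
    else
        echo "missing or empty: $1 (download the pretrained weights first)" >&2
        return 1
    fi
}

# Example calls, using the paths from the notes above:
#   require_weights data/features/c3d.pickle
#   require_weights data/features/i3d.pt
```

The helper returns a non-zero status when the file is absent, so it can be chained with && in front of the extraction command.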