If you make use of this code, please cite it as follows [and star me (0.0)]:
@misc{Dang_2020_NAS_Attempt,
  author    = {Dang, Anh-Chuong},
  title     = {Attempt Neural Architecture Search on Visual Question Answering task},
  month     = {May},
  year      = {2020},
  publisher = {GitHub},
  journal   = {GitHub repository},
  commit    = {master}
}
This repository contains a PyTorch implementation of my attempt at Neural Architecture Search (NAS) on vision-language models (the VQA task).
In this work, I took the MCAN-VQA model, factorized its operations, and then applied search algorithms (i.e., SNAS) to optimize the network's architecture.
For more details, please refer to the code as well as the summary report summary.pdf.
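For readers unfamiliar with SNAS, the sketch below illustrates the core idea: each factorized operation slot becomes a mixed operation whose candidates are weighted by a Gumbel-Softmax sample over learnable architecture logits, so the architecture can be optimized by backpropagation. The candidate operations and hidden size here are placeholders, not the exact MCAN-VQA factorization used in this repository.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """SNAS-style mixed operation: the output is a Gumbel-Softmax-weighted sum of
    candidate operations, so architecture logits are trained together with weights."""

    def __init__(self, candidate_ops):
        super().__init__()
        self.ops = nn.ModuleList(candidate_ops)
        # One learnable architecture logit per candidate operation.
        self.log_alpha = nn.Parameter(torch.zeros(len(candidate_ops)))

    def forward(self, x, temperature=1.0):
        # Differentiable (soft) one-hot sample over the candidates.
        weights = F.gumbel_softmax(self.log_alpha, tau=temperature)
        return sum(w * op(x) for w, op in zip(weights, self.ops))

# Toy usage with placeholder candidate ops on a 512-D hidden state:
mixed = MixedOp([nn.Identity(),
                 nn.Linear(512, 512),
                 nn.Sequential(nn.Linear(512, 2048), nn.ReLU(), nn.Linear(2048, 512))])
out = mixed(torch.randn(8, 512), temperature=0.5)
```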
You should install some necessary packages:
- Install Python >= 3.5
- Install PyTorch >= 1.x with CUDA.
- Install SpaCy and initialize the GloVe vectors as follows:
$ pip install -r requirements.txt
$ wget https://github.com/explosion/spacy-models/releases/download/en_vectors_web_lg-2.1.0/en_vectors_web_lg-2.1.0.tar.gz -O en_vectors_web_lg-2.1.0.tar.gz
$ pip install en_vectors_web_lg-2.1.0.tar.gz
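As a quick sanity check that the vectors were installed correctly, you can load them in Python; the snippet below only assumes the en_vectors_web_lg package installed above (its word vectors are 300-dimensional).

```python
import spacy

# Load the GloVe vectors installed above (en_vectors_web_lg).
nlp = spacy.load('en_vectors_web_lg')

doc = nlp('What color is the umbrella?')
print(len(doc), doc[1].vector.shape)  # number of tokens, (300,) per-token embedding
```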
The image features are extracted using the bottom-up-attention strategy, with each image represented as a dynamic number (from 10 to 100) of 2048-D features. The features for each image are stored in a .npz file. You can prepare the visual features yourself or download the extracted features from OneDrive or BaiduYun. The download contains three files: train2014.tar.gz, val2014.tar.gz, and test2015.tar.gz, corresponding to the features of the train/val/test images of VQA-v2, respectively.
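A quick way to inspect one of the downloaded feature files from Python; the path and the key name 'x' are illustrative assumptions, so print npz.files to see what your files actually contain.

```python
import numpy as np

# Inspect one extracted feature file (path and key name are assumptions).
npz = np.load('val2014/COCO_val2014_000000000139.npz')
print(npz.files)   # list the arrays actually stored in the file
feat = npz['x']    # assumed key; expect roughly (num_boxes, 2048) with 10-100 boxes
print(feat.shape)
```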
For more details on the setup, please refer to the MCAN-VQA repository (https://github.com/MILVLG/mcan-vqa).
For the search stage, run run_search.py. Command for running the search:
python run_search.py --RUN=str --GPU=str --SEED=int --PRELOAD=bool
- After you have obtained the desired architecture, copy it to the namedtuple VQAGenotype in the genotypes.py file in the model folder (see the sketch below).
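As an illustration of that step (the field names and operation names below are hypothetical; the real definition lives in model/genotypes.py), pasting a searched architecture might look like the following, with the chosen name then presumably passed via --ARCH_NAME at evaluation time.

```python
from collections import namedtuple

# Hypothetical sketch of model/genotypes.py; the actual fields are defined there.
VQAGenotype = namedtuple('VQAGenotype', ['enc', 'dec'])

# Paste the architecture printed by run_search.py under a name of your choice.
my_searched_arch = VQAGenotype(
    enc=['self_att', 'ffn', 'self_att', 'ffn'],           # placeholder operation names
    dec=['self_att', 'guided_att', 'ffn', 'guided_att'],  # placeholder operation names
)
```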
For the evaluation stage, run run.py. Command for running the evaluation:
python run.py --RUN=str --ARCH_NAME=str --GPU=str --SEED=int --PRELOAD=bool
where:
- str: should be replaced with a string of your choice, e.g. for --RUN the option choices are {'train', 'val'}
- int: an integer of your choice
- bool: a boolean, i.e. True or False
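For clarity on how such flags behave, here is a minimal sketch of an argument parser for them (an assumption for illustration, not the repository's actual parser); note in particular that boolean flags passed as strings need explicit conversion.

```python
import argparse

def str2bool(v):
    # argparse does not turn the strings 'True'/'False' into booleans by itself
    return str(v).lower() in ('true', '1', 'yes')

parser = argparse.ArgumentParser()
parser.add_argument('--RUN', type=str, choices=['train', 'val'])
parser.add_argument('--ARCH_NAME', type=str, help='name of the genotype to evaluate')
parser.add_argument('--GPU', type=str, help="e.g. '0' or '0,1'")
parser.add_argument('--SEED', type=int, default=42)
parser.add_argument('--PRELOAD', type=str2bool, default=False)
args = parser.parse_args()
print(args)
```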
The project was in progress. However, around the end of April 2020, a great work with a quite similar approach and more favorable results was published, so I unfortunately decided to stop this project.
Published paper (mentioned above): Deep Multimodal Neural Architecture Search
Out of personal curiosity, any further suggestions or advice are welcome.
https://github.com/MILVLG/mcan-vqa
https://github.com/cvlab-tohoku/Dense-CoAttention-Network