A repository to collaborate on the course project for the ECE285 course "Machine Learning for Image Processing", taken at UCSD in the Fall Quarter of 2019.
This is a collaborative project by Anirudh, Aparna, Savitha, and Sidharth from team Saas.
The final report for our work is here.
To work on the project, please use Python 3. Install the required packages from requirements.txt:
pip install --user -r requirements.txt
NOTE - These are the packages installed by default on the UCSD DSMLP cluster. They are pinned here to ensure that runs remain compatible.
Contribution guidelines are given here.
To work with the dataset, we use the official repositories provided for the COCO dataset. The first repository, included as a submodule, is cocoapi. It provides a wrapper to load the captions that correspond to each image.
The second repository, also included as a submodule, is coco-caption. It provides a wrapper to evaluate the generated captions, including implementations of BLEU scores. This submodule is my fork of the original repository here, rewritten in Python 3.
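Once both submodules are set up (see the steps below), the coco-caption wrapper can be driven roughly as follows. This is a minimal sketch: the annotation path assumes the dataset layout described below, and the results file name is a placeholder for wherever your generated captions (in the standard COCO results JSON format) are written.

```python
from pycocotools.coco import COCO
from pycocoevalcap.eval import COCOEvalCap

# Ground-truth captions and generated results (both paths are assumptions; adjust to your setup)
coco = COCO('datasets/COCO/annotations/captions_val2014.json')
coco_res = coco.loadRes('outputs/captions_val2014_results.json')

# Evaluate only the images for which captions were actually generated
evaluator = COCOEvalCap(coco, coco_res)
evaluator.params['image_id'] = coco_res.getImgIds()
evaluator.evaluate()

# evaluator.eval maps metric names (Bleu_1..Bleu_4, METEOR, ROUGE_L, CIDEr) to scores
for metric, score in evaluator.eval.items():
    print(f'{metric}: {score:.3f}')
```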
These are the steps to set up the dataset (a quick sanity check follows the list):
1. For the images, use the copy of the dataset provided on the DSMLP cluster in /datasets/COCO-2015/
2. Create a sub-directory in the project root named datasets/
3. Create another sub-directory within it named COCO/
4. For the captions, download the annotations from the MS COCO website.
5. Download the training set annotations as a zip file from here.
6. Inflate it inside ./datasets/COCO/
7. Similarly, download the zip file for the 2014 testing image information from here and follow step 6.
8. Similarly, download the zip file for the 2015 testing image information from here and follow step 6.
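After the archives are inflated, a quick check such as the one below can confirm the layout. This is a hedged sketch: it assumes the zips unpack into an annotations/ sub-folder with the standard COCO file names; adjust the paths if your archives differ.

```python
import os

# Expected annotation files (standard COCO names are assumed; adjust if your zips differ)
expected = [
    'datasets/COCO/annotations/captions_train2014.json',
    'datasets/COCO/annotations/captions_val2014.json',
    'datasets/COCO/annotations/image_info_test2014.json',
    'datasets/COCO/annotations/image_info_test2015.json',
]

# Run from the project root
for path in expected:
    status = 'found' if os.path.exists(path) else 'MISSING'
    print(f'{status}: {path}')
```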
These are the steps to get the COCO dataset image loader up and running (a short loader check follows these steps):
- Clone this repo recursively. To do this, run
cd src/
git submodule update --init --recursive
- Build the cocoapi submodule by running the following
cd src/cocoapi/PythonAPI/
make
- Additionally, symlink the pycocotools package inside cocoapi's PythonAPI directory into src/. This can be done by running the following
cd src/
ln -s ./cocoapi/PythonAPI/pycocotools ./
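With the submodule built and the symlink in place, the caption loader can be exercised directly. A minimal sketch, assuming the training annotations were inflated into datasets/COCO/annotations/ as described above and that the script is run from src/ so the pycocotools symlink is importable:

```python
from pycocotools.coco import COCO

# Path is relative to src/ and assumes the dataset setup above
coco = COCO('../datasets/COCO/annotations/captions_train2014.json')

# Pick one image id and print every caption annotated for it
img_id = coco.getImgIds()[0]
ann_ids = coco.getAnnIds(imgIds=img_id)
for ann in coco.loadAnns(ann_ids):
    print(ann['caption'])
```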
The trained models for the project can be found in this Google Drive link.
Please download each of the folders and place it in the outputs/ folder.
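If you want to peek inside a downloaded checkpoint outside the notebooks, the sketch below may help. It is only an assumption that the checkpoints are ordinary PyTorch files saved by nntools_modified.py; the file name used here is hypothetical, and the saved keys depend on the folder you downloaded.

```python
import torch

# Hypothetical path; substitute the actual file from the downloaded folder under outputs/
checkpoint = torch.load('outputs/vgg16_adam/checkpoint.pth.tar', map_location='cpu')

# Show the top-level structure (e.g. model weights, optimizer state, training history)
print(list(checkpoint.keys()) if isinstance(checkpoint, dict) else type(checkpoint))
```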
File | Function |
---|---|
src/alex_adam.py | Trains for 1 epoch with AlexNet as the encoder and a standard LSTM as the decoder |
src/alexnet_adam.ipynb | Loads the trained and evaluated model with the AlexNet encoder, graphs the losses and BLEU scores, and prints the loss on the validation set |
src/demo.ipynb | Runs a demo of our code to produce a caption for one random image |
src/experiment_flip.ipynb | Runs the experiment that randomly flips an image horizontally and vertically and displays the resulting captions |
src/framework.ipynb | Our initial attempt at a framework |
src/framework_final.ipynb | A framework notebook that is duplicated and modified to run specific experiments |
src/framework_final.py | A framework that trains and evaluates a specific model |
src/nntools_modified.py | A Python file provided to us in ECE285, modified to fit our specific needs |
src/vgg16_adam.ipynb | Loads the trained and evaluated model of the VGG16 network with the Adam optimizer, graphs the losses and BLEU scores, and prints the loss on the validation set |
src/vgg16_sgd.ipynb | Loads the trained and evaluated model of the VGG16 network with the SGD optimizer (0.9 momentum), graphs the losses and BLEU scores, and prints the loss on the validation set |
src/vgg16_sgd_nesterov.ipynb | Loads the trained and evaluated model of the VGG16 network with the SGD optimizer (0.9 momentum, Nesterov acceleration), graphs the losses and BLEU scores, and prints the loss on the validation set |
src/vgg16_sgd_zero_mom.ipynb | Loads the trained and evaluated model of the VGG16 network with the SGD optimizer (0 momentum), graphs the losses and BLEU scores, and prints the loss on the validation set |
src/vgg16_adam.py | Trains the network with the VGG16 encoder and Adam optimizer for 5 epochs and evaluates on the validation set |
src/vgg16_sgd.py | Trains the network with the VGG16 encoder and SGD optimizer (0.9 momentum) for 1 epoch and evaluates on the validation set |
src/vgg16_sgd_nesterov.py | Trains the network with the VGG16 encoder and SGD optimizer (0.9 momentum, Nesterov acceleration) for 4 epochs and evaluates on the validation set |
src/vgg16_sgd_zero_mom.py | Trains the network with the VGG16 encoder and SGD optimizer (0 momentum) for 1 epoch and evaluates on the validation set |
src/vocab_creator.py | Parses all captions in the training dataset and generates the vocabulary used in all experiments |