A repository to collaborate on the Course Project for ECE285 course "Machine Learning for Image Processing" taken at UCSD in the Fall Quarter of 2019

Anirudh-Swaminathan/ece_285_fa19_project

 
 

Image Captioning - Team SaaS

A repository to collaborate on the Course Project for ECE285 course "Machine Learning for Image Processing" taken at UCSD in the Fall Quarter of 2019.

This project was developed collaboratively by Anirudh, Aparna, Savitha and Sidharth of Team SaaS.

The final report for our work is here.

Software Requirements

This project requires Python 3. Install the required packages from requirements.txt:

pip install --user -r requirements.txt

NOTE - These are the packages that were installed by default on the UCSD DSMLP cluster. They are listed here to ensure compatibility across runs.

Collaboration

Contribution guidelines are given here

Dataset Preparation Instructions

To work with the dataset, we use the official repositories provided for the COCO dataset. The first repository, included as a submodule, is cocoapi. It provides a wrapper for loading the captions that correspond to each image.
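To make the idea of "captions that correspond to images" concrete, here is a minimal sketch that groups captions by image using only the standard library. It is not the cocoapi wrapper itself. It assumes the usual COCO captions layout, in which a JSON file holds an `images` list (with `id` and `file_name`) and an `annotations` list (with `image_id` and `caption`). The function names and the sample data are hypothetical.

```python
import json
from collections import defaultdict


def group_captions_by_image(annotation_data):
    """Map each image file name to the list of its captions."""
    # Image records and caption records live in separate lists,
    # joined through the image id.
    id_to_file = {img["id"]: img["file_name"] for img in annotation_data["images"]}
    captions = defaultdict(list)
    for ann in annotation_data["annotations"]:
        captions[id_to_file[ann["image_id"]]].append(ann["caption"].strip())
    return dict(captions)


def load_captions(annotation_path):
    """Read a captions annotation file and group its captions by image."""
    with open(annotation_path, "r", encoding="utf-8") as f:
        return group_captions_by_image(json.load(f))


# Tiny hand-made sample in the same layout (not real COCO data).
sample = {
    "images": [
        {"id": 1, "file_name": "img_a.jpg"},
        {"id": 2, "file_name": "img_b.jpg"},
    ],
    "annotations": [
        {"id": 10, "image_id": 1, "caption": "A dog runs on grass. "},
        {"id": 11, "image_id": 2, "caption": "Two people ride bikes."},
        {"id": 12, "image_id": 1, "caption": "A brown dog outside."},
    ],
}

if __name__ == "__main__":
    for file_name, caps in group_captions_by_image(sample).items():
        print(file_name, caps)
```

In the project itself, the cocoapi submodule handles this lookup; the sketch only shows the shape of the data it works with.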

The second repository, also included as a submodule, is coco-caption. It provides a wrapper for evaluating our generated captions, including an implementation of BLEU scores. This repository is my fork of the original repository here, rewritten in Python 3.
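For readers unfamiliar with BLEU, the following is a simplified sentence-level sketch: clipped n-gram precision combined with a brevity penalty. It is not the coco-caption implementation, which scores a whole corpus and has its own tokenization; the function names here are hypothetical and whitespace tokenization is an assumption.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, references, n):
    """Clipped n-gram precision of a candidate against several references."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    # For each n-gram, the largest count seen in any single reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def sentence_bleu(candidate, references, max_n=4):
    """Geometric mean of 1..max_n precisions times the brevity penalty."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # no smoothing in this sketch
    log_avg = sum(math.log(p) for p in precisions) / max_n
    c = len(candidate)
    # Reference length closest to the candidate length (ties go to the shorter).
    r = min((len(ref) for ref in references),
            key=lambda length: (abs(length - c), length))
    brevity_penalty = 1.0 if c > r else math.exp(1.0 - r / c)
    return brevity_penalty * math.exp(log_avg)


if __name__ == "__main__":
    refs = ["the cat is on the mat".split()]
    print(sentence_bleu("the cat is on the mat".split(), refs))
```

Clipping is what stops a degenerate caption such as "the the the the" from scoring well: each n-gram is credited at most as many times as it appears in a single reference.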

Dataset Annotations (Captions) Download

These are the steps to set up the dataset:

  1. Use the dataset images already available on the DSMLP cluster in /datasets/COCO-2015/
  2. Create a sub-directory named datasets/ in the project root.
  3. Create another sub-directory named COCO/ inside it.
  4. For the captions, download the annotations from the MS COCO website.
  5. Download the training set annotations as a zip file from here.
  6. Extract the zip file inside ./datasets/COCO/
  7. Similarly, download the zip file for the 2014 testing image information from here and repeat step 6.
  8. Similarly, download the zip file for the 2015 testing image information from here and repeat step 6.
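Steps 2, 3 and 6 above can be sketched in Python as follows. This is a minimal sketch that assumes the zip files have already been downloaded by hand; the function name `extract_annotations` and the example file name are hypothetical, and no download URL is assumed.

```python
import zipfile
from pathlib import Path


def extract_annotations(zip_path, project_root="."):
    """Create datasets/COCO/ under the project root and extract a zip into it.

    Returns the sorted list of member names found in the archive.
    """
    dest = Path(project_root) / "datasets" / "COCO"
    # Steps 2 and 3: create datasets/ and datasets/COCO/ if they are missing.
    dest.mkdir(parents=True, exist_ok=True)
    # Step 6: extract the archive inside ./datasets/COCO/
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return sorted(archive.namelist())


if __name__ == "__main__":
    import sys

    # Usage: python extract_annotations.py <downloaded-zip> [<downloaded-zip> ...]
    for path in sys.argv[1:]:
        for name in extract_annotations(path):
            print(name)
```

Running it once per downloaded zip file covers steps 5 to 8, since every archive is extracted into the same directory.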

Image loader

These are the steps to get your COCO dataset image loader up and running. The paths below are relative to the project root:

  1. Clone this repo along with its submodules. To fetch the submodules, run
     cd src/
     git submodule update --init --recursive
  2. Build the cocoapi submodule by running
     cd src/cocoapi/PythonAPI/
     make
  3. Additionally, symlink pycocotools from cocoapi's PythonAPI directory into src/ by running
     cd src/
     ln -s ./cocoapi/PythonAPI/pycocotools ./
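Where `ln -s` is not convenient, the symlink step can be approximated from Python. This is a minimal sketch, assuming the submodule has already been fetched and built; the helper name `link_pycocotools` is hypothetical, and creating symlinks may require extra privileges on some platforms.

```python
import os
from pathlib import Path


def link_pycocotools(src_dir="src"):
    """Create src/pycocotools as a symlink to cocoapi/PythonAPI/pycocotools."""
    src = Path(src_dir)
    # The target is stored relative to src/, mirroring the ln -s command.
    target = Path("cocoapi") / "PythonAPI" / "pycocotools"
    link = src / "pycocotools"
    if not (src / target).is_dir():
        raise FileNotFoundError(f"{src / target} not found; fetch the submodule first")
    if link.is_symlink() or link.exists():
        return link  # already set up, nothing to do
    os.symlink(target, link, target_is_directory=True)
    return link


if __name__ == "__main__":
    print(link_pycocotools("src"))
```

Because the link target is relative, the checkout can be moved to another location without breaking the link.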

Trained Models Loading

The trained models for the project can be found in this Google Drive link.

Download each of the folders and place it in the outputs folder.

Code Organization

| File | Function |
| --- | --- |
| src/alex_adam.py | Trains for 1 epoch with AlexNet as the encoder and a standard LSTM as the decoder |
| src/alexnet_adam.ipynb | Loads the trained and evaluated AlexNet-encoder model, graphs the losses and BLEU scores, and prints the loss on the validation set |
| src/demo.ipynb | Runs a demo of our code that produces a caption for 1 random image |
| src/experiment_flip.ipynb | Runs the experiment that randomly flips an image horizontally and vertically and displays the captions |
| src/framework.ipynb | Our initial attempt at a framework |
| src/framework_final.ipynb | A framework that is duplicated and modified to run specific experiments |
| src/framework_final.py | A framework that trains and evaluates a specific model |
| src/nntools_modified.py | A Python file provided to us in ECE285, modified to suit our needs |
| src/vgg16_adam.ipynb | Loads the trained and evaluated VGG16 model with the Adam optimizer, graphs the losses and BLEU scores, and prints the loss on the validation set |
| src/vgg16_sgd.ipynb | Loads the trained and evaluated VGG16 model with the SGD optimizer (0.9 momentum), graphs the losses and BLEU scores, and prints the loss on the validation set |
| src/vgg16_sgd_nesterov.ipynb | Loads the trained and evaluated VGG16 model with the SGD optimizer (0.9 momentum, Nesterov acceleration), graphs the losses and BLEU scores, and prints the loss on the validation set |
| src/vgg16_sgd_zero_mom.ipynb | Loads the trained and evaluated VGG16 model with the SGD optimizer (0 momentum), graphs the losses and BLEU scores, and prints the loss on the validation set |
| src/vgg16_adam.py | Trains the network with a VGG16 encoder and the Adam optimizer for 5 epochs and evaluates it on the validation set |
| src/vgg16_sgd.py | Trains the network with a VGG16 encoder and the SGD optimizer (0.9 momentum) for 1 epoch and evaluates it on the validation set |
| src/vgg16_sgd_nesterov.py | Trains the network with a VGG16 encoder and the SGD optimizer (0.9 momentum, Nesterov acceleration) for 4 epochs and evaluates it on the validation set |
| src/vgg16_sgd_zero_mom.py | Trains the network with a VGG16 encoder and the SGD optimizer (0 momentum) for 1 epoch and evaluates it on the validation set |
| src/vocab_creator.py | Parses all the captions in the training dataset and generates the vocabulary used in all experiments |
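As an illustration of what generating a vocabulary from captions involves, here is a minimal sketch. It is not the code in src/vocab_creator.py: the tokenization rule, the special tokens, the `min_count` threshold and all function names are assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical special tokens; the project may use different ones.
SPECIAL_TOKENS = ["<pad>", "<start>", "<end>", "<unk>"]


def tokenize(caption):
    """Lowercase a caption and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", caption.lower())


def build_vocab(captions, min_count=1):
    """Map every sufficiently frequent word to an integer index."""
    counts = Counter(tok for cap in captions for tok in tokenize(cap))
    words = sorted(w for w, c in counts.items() if c >= min_count)
    return {tok: idx for idx, tok in enumerate(SPECIAL_TOKENS + words)}


def encode(caption, vocab):
    """Convert a caption to indices, wrapped in start and end tokens."""
    unk = vocab["<unk>"]
    ids = [vocab.get(tok, unk) for tok in tokenize(caption)]
    return [vocab["<start>"]] + ids + [vocab["<end>"]]


if __name__ == "__main__":
    vocab = build_vocab(["A dog runs.", "A dog sleeps."], min_count=2)
    print(vocab)
    print(encode("A cat", vocab))
```

Words below the threshold map to the unknown token at encoding time, which keeps the decoder's output layer small.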
