A repository to collaborate on the course project for the ECE285 course "Machine Learning for Image Processing", taken at UCSD in the Fall Quarter of 2019.
This is a collaborative project by Anirudh, Aparna, Savitha, and Sidharth from team Saas.
The final report for our work is here.
To work on the project, please use Python 3. Install the required packages from requirements.txt:
pip install --user -r requirements.txt
NOTE - These are the packages installed by default on the UCSD DSMLP cluster. They are pinned here to ensure that runs remain compatible.
Contribution guidelines are given here.
To work with the dataset, we use the official repositories provided for the COCO dataset. The first repository, included as a submodule, is cocoapi. It provides a wrapper to load the captions that correspond to each image.
The second repository, also included as a submodule, is coco-caption. It provides a wrapper to evaluate the generated captions, including implementations of BLEU scores. This submodule is my fork of the original repository here, rewritten in Python 3.
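Once both submodules are set up (see the steps below), the coco-caption wrapper can be driven roughly as follows. This is a minimal sketch: the annotation path assumes the dataset layout described below, and the results file name is a placeholder for wherever your generated captions (in the standard COCO results JSON format) are written.

```python
from pycocotools.coco import COCO
from pycocoevalcap.eval import COCOEvalCap

# Ground-truth captions and generated results (both paths are assumptions; adjust to your setup)
coco = COCO('datasets/COCO/annotations/captions_val2014.json')
coco_res = coco.loadRes('outputs/captions_val2014_results.json')

# Evaluate only the images for which captions were actually generated
evaluator = COCOEvalCap(coco, coco_res)
evaluator.params['image_id'] = coco_res.getImgIds()
evaluator.evaluate()

# evaluator.eval maps metric names (Bleu_1..Bleu_4, METEOR, ROUGE_L, CIDEr) to scores
for metric, score in evaluator.eval.items():
    print(f'{metric}: {score:.3f}')
```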
These are the steps to set up the dataset (a quick sanity check follows the list):
1. For the images, use the copy of the dataset provided on the DSMLP cluster in /datasets/COCO-2015/
2. Create a sub-directory in the project root named datasets/
3. Create another sub-directory within it named COCO/
4. For the captions, download the annotations from the MS COCO website.
5. Download the training set annotations as a zip file from here.
6. Inflate it inside ./datasets/COCO/
7. Similarly, download the zip file for the 2014 testing image information from here and follow step 6.
8. Similarly, download the zip file for the 2015 testing image information from here and follow step 6.
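After the archives are inflated, a quick check such as the one below can confirm the layout. This is a hedged sketch: it assumes the zips unpack into an annotations/ sub-folder with the standard COCO file names; adjust the paths if your archives differ.

```python
import os

# Expected annotation files (standard COCO names are assumed; adjust if your zips differ)
expected = [
    'datasets/COCO/annotations/captions_train2014.json',
    'datasets/COCO/annotations/captions_val2014.json',
    'datasets/COCO/annotations/image_info_test2014.json',
    'datasets/COCO/annotations/image_info_test2015.json',
]

# Run from the project root
for path in expected:
    status = 'found' if os.path.exists(path) else 'MISSING'
    print(f'{status}: {path}')
```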
These are the steps to get the COCO dataset image loader up and running (a short loader check follows these steps):
- Clone this repo recursively. To do this, run
cd src/
git submodule update --init --recursive
- Build the cocoapi submodule by running the following
cd src/cocoapi/PythonAPI/
make
- Additionally, symlink the pycocotools package inside cocoapi's PythonAPI directory into src/. This can be done by running the following
cd src/
ln -s ./cocoapi/PythonAPI/pycocotools ./
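With the submodule built and the symlink in place, the caption loader can be exercised directly. A minimal sketch, assuming the training annotations were inflated into datasets/COCO/annotations/ as described above and that the script is run from src/ so the pycocotools symlink is importable:

```python
from pycocotools.coco import COCO

# Path is relative to src/ and assumes the dataset setup above
coco = COCO('../datasets/COCO/annotations/captions_train2014.json')

# Pick one image id and print every caption annotated for it
img_id = coco.getImgIds()[0]
ann_ids = coco.getAnnIds(imgIds=img_id)
for ann in coco.loadAnns(ann_ids):
    print(ann['caption'])
```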
The trained models for the project can be found in this Google Drive link.
Please download each of the folders and place it in the outputs/ folder.
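If you want to peek inside a downloaded checkpoint outside the notebooks, the sketch below may help. It is only an assumption that the checkpoints are ordinary PyTorch files saved by nntools_modified.py; the file name used here is hypothetical, and the saved keys depend on the folder you downloaded.

```python
import torch

# Hypothetical path; substitute the actual file from the downloaded folder under outputs/
checkpoint = torch.load('outputs/vgg16_adam/checkpoint.pth.tar', map_location='cpu')

# Show the top-level structure (e.g. model weights, optimizer state, training history)
print(list(checkpoint.keys()) if isinstance(checkpoint, dict) else type(checkpoint))
```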
File | Function |
---|---|
src/alex_adam.py | Trains for 1 epoch with AlexNet as the encoder and a standard LSTM as the decoder |
src/alexnet_adam.ipynb | Loads the trained and evaluated model with the AlexNet encoder, graphs the losses and BLEU scores, and prints the loss on the validation set |
src/demo.ipynb | Runs a demo of our code to produce a caption for one random image |
src/experiment_flip.ipynb | Runs the experiment that randomly flips an image horizontally and vertically and displays the resulting captions |
src/framework.ipynb | Our initial attempt at a framework |
src/framework_final.ipynb | A framework notebook that is duplicated and modified to run specific experiments |
src/framework_final.py | A framework that trains and evaluates a specific model |
src/nntools_modified.py | A Python file provided to us in ECE285, modified to fit our specific needs |
src/vgg16_adam.ipynb | Loads the trained and evaluated model of the VGG16 network with the Adam optimizer, graphs the losses and BLEU scores, and prints the loss on the validation set |
src/vgg16_sgd.ipynb | Loads the trained and evaluated model of the VGG16 network with the SGD optimizer (0.9 momentum), graphs the losses and BLEU scores, and prints the loss on the validation set |
src/vgg16_sgd_nesterov.ipynb | Loads the trained and evaluated model of the VGG16 network with the SGD optimizer (0.9 momentum, Nesterov acceleration), graphs the losses and BLEU scores, and prints the loss on the validation set |
src/vgg16_sgd_zero_mom.ipynb | Loads the trained and evaluated model of the VGG16 network with the SGD optimizer (0 momentum), graphs the losses and BLEU scores, and prints the loss on the validation set |
src/vgg16_adam.py | Trains the network with the VGG16 encoder and Adam optimizer for 5 epochs and evaluates on the validation set |
src/vgg16_sgd.py | Trains the network with the VGG16 encoder and SGD optimizer (0.9 momentum) for 1 epoch and evaluates on the validation set |
src/vgg16_sgd_nesterov.py | Trains the network with the VGG16 encoder and SGD optimizer (0.9 momentum, Nesterov acceleration) for 4 epochs and evaluates on the validation set |
src/vgg16_sgd_zero_mom.py | Trains the network with the VGG16 encoder and SGD optimizer (0 momentum) for 1 epoch and evaluates on the validation set |
src/vocab_creator.py | Parses all captions in the training dataset and generates the vocabulary used in all experiments |