In this assignment you will implement recurrent networks and apply them to image captioning on Microsoft COCO. You will also explore methods for visualizing the features of a model pretrained on ImageNet, and use this model to implement Style Transfer. The goals of this assignment are as follows:
- Understand the architecture of recurrent neural networks (RNNs) and how they operate on sequences by sharing weights over time
- Understand and implement both Vanilla RNNs and Long Short-Term Memory (LSTM) RNNs
- Understand how to sample from an RNN language model at test-time
- Understand how to combine convolutional neural nets and recurrent nets to implement an image captioning system
- Understand how a trained convolutional network can be used to compute gradients with respect to the input image
- Implement different applications of image gradients, including saliency maps, fooling images, and class visualizations
- Understand and implement style transfer
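To make the first goal concrete, here is a minimal NumPy sketch of a vanilla RNN forward pass in which the same weights are reused at every timestep. The function names (`rnn_step`, `rnn_forward`), the shapes, and the tanh update are illustrative assumptions; they are not the starter-code API, and the notebooks define the interfaces you actually need to implement.

```python
import numpy as np

def rnn_step(x, h_prev, Wx, Wh, b):
    """One vanilla RNN step: h_t = tanh(x_t Wx + h_{t-1} Wh + b)."""
    return np.tanh(x @ Wx + h_prev @ Wh + b)

def rnn_forward(xs, h0, Wx, Wh, b):
    """Run the step over a (T, D) sequence, reusing Wx, Wh, b at every t."""
    h = h0
    hs = []
    for x in xs:  # one iteration per timestep
        h = rnn_step(x, h, Wx, Wh, b)
        hs.append(h)
    return np.stack(hs)  # shape (T, H)

# Hypothetical sizes: T timesteps, D input dims, H hidden dims.
rng = np.random.default_rng(0)
T, D, H = 5, 3, 4
xs = rng.standard_normal((T, D))
Wx = rng.standard_normal((D, H)) * 0.1
Wh = rng.standard_normal((H, H)) * 0.1
b = np.zeros(H)

hs = rnn_forward(xs, np.zeros(H), Wx, Wh, b)
print(hs.shape)  # (5, 4)
```

Because `Wx`, `Wh`, and `b` do not depend on `t`, the parameter count stays fixed no matter how long the sequence is.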
Please copy layers.py and optim.py from your homework 1 solution to the deeplearning directory. We will provide reference files once the deadline of homework 1 is over.
Make sure your machine is set up with the assignment dependencies.
The preferred approach for installing all the assignment dependencies is to use Anaconda, which is a Python distribution that includes many of the most popular Python packages for science, math, engineering and data analysis. Once you install Anaconda you can run the following command inside the homework directory to install the required packages for this homework:
conda env create -f environment.yml
Once you have all the packages installed, run the following command to activate the environment each time you work on the homework:
conda activate cs182_hw2
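If you are unsure whether the environment is active, a quick check from Python can help. The sketch below relies on the `CONDA_DEFAULT_ENV` environment variable, which conda sets on activation; the helper names are hypothetical and not part of the assignment code.

```python
import os

def active_conda_env(environ=os.environ):
    """Return the name of the active conda environment, or None if unset."""
    return environ.get("CONDA_DEFAULT_ENV")

def check_env(expected="cs182_hw2", environ=os.environ):
    """True if the active conda environment matches the expected name."""
    return active_conda_env(environ) == expected

print("cs182_hw2 active:", check_env())
```

If this prints `False`, run the activation command above in the same shell before starting the notebook server.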
This assignment is provided preconfigured as a VirtualBox image. Installation instructions:
- Follow the instructions here to install VirtualBox if it is not already installed.
- Download the VirtualBox image here
- Load the VirtualBox image using the instructions here
- Start the VM. The username and password are both cs182. Required packages are pre-installed and the cs182_hw2 environment is activated by default.
- Download the assignment code onto the VM yourself.
I get an error "AMD-V is disabled in the BIOS" or "Intel-VT is disabled in the BIOS" or similar
Solution: See this link
The virtual machine won't boot
Solutions:
- Try increasing the number of allocated CPUs: Under Settings→System→Processor
- Try increasing the amount of allocated memory:
Once you have the starter code, you will need to download the CIFAR-10 dataset. Run the following from the homework 2 directory:
cd deeplearning/datasets
./get_assignment2_data.sh
If you don't have wget installed, you can instead try:
./get_assignment2_data_curl.sh
After you download the data, start the IPython notebook server from the homework 2 directory with the following command:
jupyter notebook
If you are unfamiliar with IPython, you should read our IPython tutorial.
Once you are done working, run the collect_submission.sh script; this will produce a file called assignment2.zip. Upload this file to Gradescope.
Note that Gradescope will run an autograder on the files you submit. For some
test cases, there is a nonzero (but very low) probability that a correct
implementation will fail due to randomness. If you think your implementation is
correct, you can simply resubmit to rerun the autograder and check whether the
failure was just a particularly unlucky seed.
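The autograder's actual tests are not shown here, so the following is only a hypothetical illustration of why a seed matters. A check that draws random inputs gives a slightly different error for each seed, and fixing the seed makes a local run reproducible. The `rel_error` and `gradient_check` helpers are assumptions written for this sketch.

```python
import numpy as np

def rel_error(x, y):
    """Maximum relative error between two arrays."""
    return float(np.max(np.abs(x - y) / np.maximum(1e-8, np.abs(x) + np.abs(y))))

def gradient_check(seed, h=1e-5):
    """Compare the analytic tanh derivative to a centered difference on random inputs."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((4, 3))
    analytic = 1.0 - np.tanh(x) ** 2
    numeric = (np.tanh(x + h) - np.tanh(x - h)) / (2 * h)
    return rel_error(analytic, numeric)

# Same seed, same inputs, same error: the check is reproducible.
print(gradient_check(seed=0) == gradient_check(seed=0))  # True
# Different seeds draw different inputs, so the measured error varies.
print(gradient_check(seed=0), gradient_check(seed=1))
```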
The IPython notebook RNN_Captioning.ipynb will introduce you to the implementation of vanilla recurrent neural networks for image captioning. Follow the instructions in the notebook to complete this part.
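As a rough picture of test-time sampling from an RNN language model, the sketch below generates token ids one at a time and feeds each output back in as the next input. Everything here is an assumption made for illustration: the parameter names, the `start` and `end` token ids, and the zero initial hidden state (in captioning, the initial state would typically be derived from image features). The notebook specifies the real interface.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D score vector."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

def sample_caption(h0, W_embed, Wx, Wh, b, W_vocab, b_vocab,
                   start=1, end=2, max_len=10, rng=None):
    """Generate token ids step by step.

    With rng=None the most likely token is chosen (greedy decoding);
    otherwise the next token is drawn from the softmax distribution.
    """
    h, token, out = h0, start, []
    for _ in range(max_len):
        x = W_embed[token]                      # embed the previous token
        h = np.tanh(x @ Wx + h @ Wh + b)        # vanilla RNN update
        probs = softmax(h @ W_vocab + b_vocab)  # distribution over the vocabulary
        if rng is None:
            token = int(np.argmax(probs))
        else:
            token = int(rng.choice(len(probs), p=probs))
        if token == end:                        # stop at the end token
            break
        out.append(token)
    return out

# Hypothetical sizes: vocabulary V, embedding D, hidden H; random untrained weights.
rng = np.random.default_rng(0)
V, D, H = 6, 3, 4
params = dict(
    W_embed=rng.standard_normal((V, D)),
    Wx=rng.standard_normal((D, H)),
    Wh=rng.standard_normal((H, H)),
    b=np.zeros(H),
    W_vocab=rng.standard_normal((H, V)),
    b_vocab=np.zeros(V),
)
h0 = np.zeros(H)
greedy = sample_caption(h0, **params)
sampled = sample_caption(h0, rng=np.random.default_rng(1), **params)
print(greedy, sampled)
```

Unlike training, where the ground-truth previous word is available at every step, generation has to condition on the model's own earlier outputs.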
The IPython notebook LSTM_Captioning.ipynb will introduce you to the implementation of LSTM for image captioning. Follow the instructions in the notebook to complete this part.
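For orientation, a single LSTM step can be sketched in NumPy as follows. The gate ordering (input, forget, output, candidate) inside the stacked weight matrices and the function name `lstm_step` are assumptions for this example; the notebook's conventions take precedence.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, Wx, Wh, b):
    """One LSTM step.

    Wx: (D, 4H), Wh: (H, 4H), b: (4H,). The four blocks of size H are
    assumed to hold the input, forget, output, and candidate parts.
    """
    H = h_prev.shape[-1]
    a = x @ Wx + h_prev @ Wh + b       # all four pre-activations at once
    i = sigmoid(a[..., :H])            # input gate
    f = sigmoid(a[..., H:2 * H])       # forget gate
    o = sigmoid(a[..., 2 * H:3 * H])   # output gate
    g = np.tanh(a[..., 3 * H:])        # candidate cell update
    c = f * c_prev + i * g             # new cell state
    h = o * np.tanh(c)                 # new hidden state
    return h, c

# Hypothetical sizes: D input dims, H hidden dims.
rng = np.random.default_rng(0)
D, H = 3, 4
x = rng.standard_normal(D)
Wx = rng.standard_normal((D, 4 * H)) * 0.1
Wh = rng.standard_normal((H, 4 * H)) * 0.1
b = np.zeros(4 * H)
h, c = lstm_step(x, np.zeros(H), np.zeros(H), Wx, Wh, b)
print(h.shape, c.shape)  # (4,) (4,)
```

The additive cell update `c = f * c_prev + i * g` is what distinguishes the LSTM from the vanilla RNN, whose hidden state is fully recomputed through a tanh at every step.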
The IPython notebook NetworkVisualization.ipynb will introduce you to various techniques for visualizing neural network internals. Follow the instructions in the notebook to complete this part. We will use PyTorch for this part.
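The idea behind a saliency map is to take the gradient of a class score with respect to the input pixels and look at its magnitude. The notebook uses PyTorch, where autograd would compute this gradient for a pretrained network. To keep this sketch dependency-light, it instead hand-derives the gradient for a tiny two-layer stand-in model in NumPy. The model, its random weights, and all function names are assumptions for illustration only.

```python
import numpy as np

def class_score(image, W1, W2, label):
    """Score of one class for a tiny model: relu(x W1) W2[:, label]."""
    hidden = np.maximum(0.0, image.reshape(-1) @ W1)
    return float(hidden @ W2[:, label])

def input_gradient(image, W1, W2, label):
    """Gradient of the class score with respect to every input pixel."""
    pre = image.reshape(-1) @ W1
    mask = (pre > 0).astype(image.dtype)   # ReLU passes gradient only where active
    grad = W1 @ (mask * W2[:, label])      # chain rule back to the flattened input
    return grad.reshape(image.shape)

def saliency_map(image, W1, W2, label):
    """Absolute input gradient, maximized over the channel axis: (C, H, W) -> (H, W)."""
    return np.abs(input_gradient(image, W1, W2, label)).max(axis=0)

# Hypothetical sizes and random weights standing in for a trained model.
rng = np.random.default_rng(0)
C, Hh, Ww, K, num_classes = 3, 4, 4, 8, 5
image = rng.standard_normal((C, Hh, Ww))
W1 = rng.standard_normal((C * Hh * Ww, K))
W2 = rng.standard_normal((K, num_classes))

sal = saliency_map(image, W1, W2, label=2)
print(sal.shape)  # (4, 4)
```

Large values in `sal` mark pixels where a small change would most affect the chosen class score. Fooling images and class visualizations build on the same input gradient, but use it to update the image rather than to display it.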
The IPython notebook StyleTransfer.ipynb will introduce you to image style transfer. Follow the instructions in the notebook to complete this part. We will use PyTorch for this part.
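This page does not say which style transfer formulation the notebook follows. A common one, assumed here, measures style with Gram matrices of convolutional feature maps. The NumPy sketch below shows that computation on random stand-in features. The function names and the normalization by `C * H * W` are illustrative choices, not the notebook's API.

```python
import numpy as np

def gram_matrix(features, normalize=True):
    """Channel-by-channel correlations of a (C, H, W) feature map -> (C, C)."""
    C, H, W = features.shape
    F = features.reshape(C, H * W)   # flatten the spatial dimensions
    gram = F @ F.T
    if normalize:
        gram = gram / (C * H * W)    # assumed normalization
    return gram

def style_loss(feat_generated, feat_style):
    """Sum of squared differences between the two Gram matrices."""
    diff = gram_matrix(feat_generated) - gram_matrix(feat_style)
    return float(np.sum(diff ** 2))

# Random arrays standing in for feature maps from a pretrained network.
rng = np.random.default_rng(0)
feat_a = rng.standard_normal((3, 5, 5))
feat_b = rng.standard_normal((3, 5, 5))
print(gram_matrix(feat_a).shape)   # (3, 3)
print(style_loss(feat_a, feat_a))  # 0.0
```

Because the Gram matrix sums over spatial positions, it records which channels tend to activate together while discarding where they activate, which is why it captures texture rather than layout.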