Commit 51e44ff: readme edit
yunjey committed Nov 1, 2016 (1 parent: c130a2d)
Showing 1 changed file with 10 additions and 26 deletions: README.md
The referenced author's Theano code can be found [here](https://github.com/kelvinxu

## Getting Started

#### Prerequisites

For evaluation, clone [pycocoevalcap](http://mscoco.org/dataset/#download) as shown below.

```bash
$ git clone https://github.com/tylin/coco-caption.git
```
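The cloned package computes the standard COCO caption metrics (BLEU, METEOR, ROUGE-L, CIDEr). As a rough, self-contained illustration of the kind of score it produces, here is a simplified BLEU-1 (clipped unigram precision with a brevity penalty) in plain Python; the function name is illustrative and is not part of the pycocoevalcap API.

```python
import math
from collections import Counter

def bleu1(candidate, reference):
    """Clipped unigram precision times a brevity penalty (simplified BLEU-1)."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    ref_counts = Counter(ref)
    # Clip each candidate word's count by its count in the reference.
    clipped = sum(min(c, ref_counts[w]) for w, c in Counter(cand).items())
    precision = clipped / float(len(cand))
    # Penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - float(len(ref)) / len(cand))
    return bp * precision

print(round(bleu1("a zebra standing in the grass",
                  "a zebra standing in the grass near a tree"), 4))  # → 0.6065
```

The real evaluation additionally uses 2- to 4-gram precisions and multiple reference captions per image.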

#### Preparing the training data
This code is written in Python 2.7 and requires [TensorFlow](https://www.tensorflow.org/versions/r0.11/get_started/os_setup.html#anaconda-installation). In addition, you need to install a few more packages to process the [MSCOCO data set](http://mscoco.org/home/); the required Python packages are installed by the `pip install` step in the commands below.

A script is provided to download the MSCOCO image data and the pretrained [VGGNet19](http://www.vlfeat.org/matconvnet/pretrained/) weights. Downloading the data may take several hours depending on network speed. Run the commands below; the image data will be downloaded into the `image` directory and the VGGNet weights into the `data` directory.

```bash
$ git clone https://github.com/yunjey/show-attend-and-tell-tensorflow.git
$ cd show-attend-and-tell-tensorflow
$ pip install -r requirements.txt
$ chmod +x ./download.sh
$ ./download.sh
```


To feed the images to VGGNet, the MSCOCO images must be resized to a fixed size of 224x224. Run the command below; `train2014_resized` and `val2014_resized` will be created in the `image` folder.

```bash
$ python resize.py
```
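The repository ships this step as `resize.py`; as a rough sketch of what it does (assuming the Pillow package is available, and with illustrative directory names), the resizing looks like:

```python
import os
from PIL import Image

def resize_images(src_dir, dst_dir, size=(224, 224)):
    """Resize every .jpg in src_dir to the fixed VGGNet input size."""
    if not os.path.exists(dst_dir):
        os.makedirs(dst_dir)
    for name in os.listdir(src_dir):
        if name.lower().endswith('.jpg'):
            with Image.open(os.path.join(src_dir, name)) as im:
                # Force RGB (some MSCOCO images are grayscale), then resize.
                out = im.convert('RGB').resize(size, Image.BILINEAR)
                out.save(os.path.join(dst_dir, name))

# e.g. resize_images('./image/train2014', './image/train2014_resized')
```

This is a sketch, not the exact contents of `resize.py`; the interpolation filter and RGB conversion are assumptions.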

Before training the model, you have to preprocess the MSCOCO data set to generate `captions.pkl` and `features.hkl`. `captions.pkl` is a numpy array in which each row contains a list of word indices, and `features.hkl` is a numpy array containing activation maps extracted from the conv5_3 layer of VGGNet.
To generate `captions.pkl` and `features.hkl`, run:

```bash
$ python prepro.py --batch_size=50 --max_length=15 --word_count_threshold=3
```

For evaluating the model, please see `evaluate_model.ipynb`.
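The caption side of the preprocessing above (building a vocabulary with `word_count_threshold` and turning each caption into a fixed-length list of word indices, as stored in `captions.pkl`) can be sketched as follows; the special-token names and padding scheme here are assumptions for illustration, not necessarily what `prepro.py` uses.

```python
from collections import Counter

def build_vocab(captions, word_count_threshold=3):
    """Map words occurring at least `word_count_threshold` times to indices."""
    counts = Counter(w for c in captions for w in c.lower().split())
    vocab = {'<NULL>': 0, '<START>': 1, '<END>': 2}  # assumed special tokens
    for w, n in sorted(counts.items()):
        if n >= word_count_threshold:
            vocab[w] = len(vocab)
    return vocab

def encode(caption, vocab, max_length=15):
    """Turn one caption into a fixed-length row of word indices."""
    idx = [vocab['<START>']] + [vocab[w] for w in caption.lower().split()
                                if w in vocab] + [vocab['<END>']]
    # Pad to max_length words plus the two special tokens.
    return idx + [vocab['<NULL>']] * (max_length + 2 - len(idx))
```

Stacking the encoded rows gives the caption array; rare words below the threshold are simply dropped.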
##### (2) Generated caption: A zebra standing in the grass near a tree.
![alt text](jpg/test2.jpg "test image")

<br/>

## Training Details

![alt text](jpg/loss.jpg "loss")

![alt text](jpg/attention_w.jpg "w")

![alt text](jpg/attention_b.jpg "b")

![alt text](jpg/attention_w_att.jpg "w_att")
