Merge pull request #1 from vsubhashini/master

Master
vsubhashini · Nov 26, 2017 · c4a5e90
2 parents f143efe + 52763b3
commit c4a5e90
Showing 14 changed files with 81 additions and 3 deletions.
diff --git a/README.md b/README.md
@@ -1,9 +1,54 @@
 ## Captioning Images with Diverse Objects ##
 
+This repository contains pre-trained models and code accompanying the paper
+[Captioning Images with Diverse Objects](https://arxiv.org/abs/1606.07770).
+
+### Novel Object Captioner (NOC)
+
+![Novel Object
+Captioner](http://bair.berkeley.edu/blog/assets/novel_image_captioning/image_0.png)
+
+While object recognition models can recognize thousands of categories of objects
+such as jackals and anteaters, description models cannot compose sentences to
+describe these objects correctly in context. Our novel object captioner model
+overcomes this problem by building visual description systems
+which can describe new objects without pairs of images and sentences about these
+objects.
+
+* [Refer to this blogpost to learn how NOC
+works](http://bair.berkeley.edu/blog/2017/08/08/novel-object-captioning/)
+
+* [Video of the Oral Talk at CVPR 2017](https://youtu.be/OQNoy4pgDr4)
+* [Slides](https://drive.google.com/open?id=0Bxz2Bk18GoW9TzRrMEZ0VVdKbzA)
+* [Project Page with additional resources](http://vsubhashini.github.io/noc.html)
+
+
+### Getting Started.
+
 To get started you need to compile from this branch of caffe:
 ```
     git clone https://github.com/vsubhashini/noc.git
 ```
+
 To compile Caffe, please refer to the [Installation page](http://caffe.berkeleyvision.org/installation.html).
 
+
+### Caption images using our pre-trained models.
+
+Pre-trained models corresponding to the results reported in the paper can be
+downloaded here: [Drive
+link](https://drive.google.com/open?id=0B90_72zRQe88cVBNd2RQaEZEZGM), [Dropbox
+link](https://www.dropbox.com/sh/0ydd6mv1yy4dyi4/AABFzUzLNO0vssIvxrmAeG9fa?dl=0)
+
+**Change directory and download the pre-trained models.**
 ```
+    cd examples/noc
+    ./download_models.sh
+```
+
+**Run the captioner.**
+```
+    python noc_captioner.py -i images_list.txt
+```
+
+
diff --git a/examples/noc/download_models.sh b/examples/noc/download_models.sh
@@ -0,0 +1,20 @@
+#!/bin/sh
+
+echo "Downloading VGG model [~530MB] ..."
+
+wget http://www.robots.ox.ac.uk/~vgg/software/very_deep/caffe/VGG_ILSVRC_16_layers.caffemodel
+
+echo "Downloading NOC Imagenet captioning model [~1.3GB] ..."
+wget --no-check-certificate https://www.dropbox.com/s/.caffemodel.h5
+
+echo "Organizing..."
+
+DIR="./models"
+if [ ! -d "$DIR" ]; then
+    mkdir "$DIR"
+fi
+
+mv VGG_ILSVRC_16_layers.caffemodel "$DIR/"
+mv imgnetcoco_3loss_voc72klabel_inglove_prelm75k_sgd_lr4e5_iter_80000.caffemodel.h5 "$DIR/"
+echo "Done."
+
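Note on the script above: it moves two files into `./models`, and these are the same paths the new `noc_captioner.py` defaults point at. A minimal sketch of a pre-flight check that both files are in place before captioning could look like this. The helper `missing_models` and the constant `EXPECTED_MODELS` are hypothetical names introduced for illustration and are not part of this commit; only the two file names come from the script.

```python
import os

# File names taken from the two `mv` lines of download_models.sh.
EXPECTED_MODELS = (
    "VGG_ILSVRC_16_layers.caffemodel",
    "imgnetcoco_3loss_voc72klabel_inglove_prelm75k_sgd_lr4e5_iter_80000.caffemodel.h5",
)


def missing_models(model_dir="models"):
    """Hypothetical helper: return expected model files absent from model_dir."""
    return [
        name
        for name in EXPECTED_MODELS
        if not os.path.isfile(os.path.join(model_dir, name))
    ]
```

Calling `missing_models()` from `examples/noc` returns an empty list once both downloads have been moved into place.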
diff --git a/examples/noc/images/COCO_val2014_000000050811.jpg b/examples/noc/images/COCO_val2014_000000050811.jpg
diff --git a/examples/noc/images/COCO_val2014_000000156704.jpg b/examples/noc/images/COCO_val2014_000000156704.jpg
diff --git a/examples/noc/images/COCO_val2014_000000190680.jpg b/examples/noc/images/COCO_val2014_000000190680.jpg
diff --git a/examples/noc/images/COCO_val2014_000000238377.jpg b/examples/noc/images/COCO_val2014_000000238377.jpg
diff --git a/examples/noc/images/COCO_val2014_000000238472.jpg b/examples/noc/images/COCO_val2014_000000238472.jpg
diff --git a/examples/noc/images/COCO_val2014_000000317479.jpg b/examples/noc/images/COCO_val2014_000000317479.jpg
diff --git a/examples/noc/images/COCO_val2014_000000417023.jpg b/examples/noc/images/COCO_val2014_000000417023.jpg
diff --git a/examples/noc/images/COCO_val2014_000000420357.jpg b/examples/noc/images/COCO_val2014_000000420357.jpg
diff --git a/examples/noc/images/n01838598_woodpecker.jpg b/examples/noc/images/n01838598_woodpecker.jpg
diff --git a/examples/noc/images/n02657694_flounder.jpg b/examples/noc/images/n02657694_flounder.jpg
diff --git a/examples/noc/images_list.txt b/examples/noc/images_list.txt
@@ -0,0 +1,10 @@
+images/COCO_val2014_000000050811.jpg
+images/COCO_val2014_000000156704.jpg
+images/COCO_val2014_000000190680.jpg
+images/COCO_val2014_000000238377.jpg
+images/COCO_val2014_000000238472.jpg
+images/COCO_val2014_000000317479.jpg
+images/COCO_val2014_000000417023.jpg
+images/COCO_val2014_000000420357.jpg
+images/n01838598_woodpecker.jpg
+images/n02657694_flounder.jpg
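Note on `images_list.txt`: the entries are relative paths, one image per line, so they only resolve when the captioner is run from `examples/noc`. How `noc_captioner.py` reads this file is not shown in this diff. The sketch below is one possible way to load such a list and flag entries that do not exist. The function `read_image_list` is a hypothetical name, and resolving relative entries against the list file's own directory is an assumption made here, not the script's documented behavior.

```python
import os


def read_image_list(list_path):
    """Hypothetical helper: return (existing, missing) image paths from list_path.

    Assumption: relative entries are resolved against the directory that
    contains the list file, not the current working directory.
    """
    base_dir = os.path.dirname(os.path.abspath(list_path))
    existing, missing = [], []
    with open(list_path) as f:
        for line in f:
            entry = line.strip()
            if not entry:
                continue  # skip blank lines
            path = entry if os.path.isabs(entry) else os.path.join(base_dir, entry)
            if os.path.isfile(path):
                existing.append(path)
            else:
                missing.append(path)
    return existing, missing
```

A non-empty `missing` list is a quick hint that the script is being pointed at a list whose paths do not match the checkout layout.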
diff --git a/examples/noc/noc_captioner.py b/examples/noc/noc_captioner.py
@@ -481,11 +481,14 @@ def load_weights_from_h5(net_object, h5_weights_file):
 
 def main():
   parser = argparse.ArgumentParser()
-  parser.add_argument("-m", "--modelname", type=str, required=True,
+  parser.add_argument("-m", "--modelname", type=str,
+                      default="models/imgnetcoco_3loss_voc72klabel_inglove_prelm75k_sgd_lr4e5_iter_80000.caffemodel.h5",
                       help='Path to NOC model (Imagenet/CoCo).')
-  parser.add_argument("-v", "--vggmodel", type=str, required=True,
+  parser.add_argument("-v", "--vggmodel", type=str,
+                      default="models/VGG_ILSVRC_16_layers.caffemodel",
                       help='Path to vgg 16 model file.')
-  parser.add_argument("-i", "--imagelist", type=str, required=True,
+  parser.add_argument("-i", "--imagelist", type=str,
+                      default="images_list.txt",
                       help='File with a list of images (full path to images).')
   parser.add_argument("-o", "--htmlout", action='store_true', help='output images and captions as html')
   args = parser.parse_args()
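
Note on this hunk: replacing `required=True` with defaults means the script can be run with no flags from `examples/noc`, while explicit flags still take precedence. The sketch below reproduces only the argument definitions visible in the hunk to illustrate that behavior. The wrapper `build_parser` and the file name `my_images.txt` are hypothetical and not part of `noc_captioner.py`.

```python
import argparse


def build_parser():
    # Hypothetical wrapper; the add_argument calls mirror the hunk above.
    parser = argparse.ArgumentParser()
    parser.add_argument("-m", "--modelname", type=str,
                        default="models/imgnetcoco_3loss_voc72klabel_inglove_prelm75k_sgd_lr4e5_iter_80000.caffemodel.h5",
                        help="Path to NOC model (Imagenet/CoCo).")
    parser.add_argument("-v", "--vggmodel", type=str,
                        default="models/VGG_ILSVRC_16_layers.caffemodel",
                        help="Path to vgg 16 model file.")
    parser.add_argument("-i", "--imagelist", type=str,
                        default="images_list.txt",
                        help="File with a list of images (full path to images).")
    parser.add_argument("-o", "--htmlout", action="store_true",
                        help="output images and captions as html")
    return parser


# No arguments: every option falls back to its default.
defaults = build_parser().parse_args([])

# Explicit flags still override the defaults.
custom = build_parser().parse_args(["-i", "my_images.txt", "-o"])
```

Because the defaults are relative paths, they resolve against the current working directory, which is why the README instructs changing into `examples/noc` first.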