Merge pull request #1 from vsubhashini/master
Master
Showing 14 changed files with 81 additions and 3 deletions.
@@ -1,9 +1,54 @@
## Captioning Images with Diverse Objects ##

This repository contains pre-trained models and code accompanying the paper
[Captioning Images with Diverse Objects](https://arxiv.org/abs/1606.07770).

### Novel Object Captioner (NOC)

![Novel Object Captioner](http://bair.berkeley.edu/blog/assets/novel_image_captioning/image_0.png)

While object recognition models can recognize thousands of object categories,
such as jackals and anteaters, description models cannot compose sentences that
describe these objects correctly in context. Our Novel Object Captioner (NOC)
overcomes this problem: it can describe new objects without requiring paired
images and sentences about those objects.

* [Refer to this blogpost to learn how NOC works](http://bair.berkeley.edu/blog/2017/08/08/novel-object-captioning/)

* [Video of the Oral Talk at CVPR 2017](https://youtu.be/OQNoy4pgDr4)
* [Slides](https://drive.google.com/open?id=0Bxz2Bk18GoW9TzRrMEZ0VVdKbzA)
* [Project Page with additional resources](http://vsubhashini.github.io/noc.html)

### Getting Started.

To get started, clone this repository, which contains the branch of Caffe you need to compile:
```
git clone https://github.com/vsubhashini/noc.git
```

To compile Caffe, please refer to the [Installation page](http://caffe.berkeleyvision.org/installation.html).

### Caption images using our pre-trained models.

Pre-trained models corresponding to the results reported in the paper can be
downloaded here: [Drive link](https://drive.google.com/open?id=0B90_72zRQe88cVBNd2RQaEZEZGM),
[Dropbox link](https://www.dropbox.com/sh/0ydd6mv1yy4dyi4/AABFzUzLNO0vssIvxrmAeG9fa?dl=0)

**Change directory and download the pre-trained models.**
```
cd examples/noc
./download_models.sh
```
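
Before running the captioner, it can help to confirm that the download step left both model files in place. The following is a minimal sketch, not part of the repository: the helper `missing_models` is hypothetical, the two file names are copied from `download_models.sh` in this commit, and the check assumes Python 3 and that it is run from `examples/noc`, where the script creates `./models`.

```python
from pathlib import Path

# File names as they appear in download_models.sh in this commit.
EXPECTED_MODELS = (
    "VGG_ILSVRC_16_layers.caffemodel",
    "imgnetcoco_3loss_voc72klabel_inglove_prelm75k_sgd_lr4e5_iter_80000.caffemodel.h5",
)


def missing_models(models_dir="models", expected=EXPECTED_MODELS):
    """Return the expected model files that are absent or empty in models_dir.

    Hypothetical helper for illustration; it is not shipped with the repository.
    """
    root = Path(models_dir)
    missing = []
    for name in expected:
        path = root / name
        # A zero-byte file usually indicates an interrupted download.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    absent = missing_models()
    if absent:
        print("Missing model files:")
        for name in absent:
            print("  " + name)
    else:
        print("All expected model files are present.")
```

The check only looks at presence and size. It does not verify that a file is a valid Caffe model.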

**Run the captioner.**
```
python noc_captioner.py -i images_list.txt
```
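
To caption your own images, the file passed with `-i` needs to list them. The sketch below writes such a list with one path per line, a format inferred from the sample `images_list.txt` added in this commit rather than from documented behavior of `noc_captioner.py`. The helper `write_image_list` and the set of accepted extensions are assumptions made for illustration, and the code assumes Python 3.

```python
from pathlib import Path

# Extensions treated as images here; this set is an assumption, not a
# documented requirement of noc_captioner.py.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def write_image_list(image_dir, list_path="images_list.txt"):
    """Write one image path per line to list_path and return the paths written.

    Hypothetical helper for illustration; it is not shipped with the repository.
    """
    root = Path(image_dir)
    paths = sorted(
        p.as_posix()
        for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
    with open(list_path, "w") as handle:
        for path in paths:
            handle.write(path + "\n")
    return paths


if __name__ == "__main__":
    written = write_image_list("images")
    print("Wrote {} image paths to images_list.txt".format(len(written)))
```

Run from `examples/noc` with an `images` directory, this produces relative paths of the form `images/<name>.jpg`, which matches the layout of the sample list.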
@@ -0,0 +1,20 @@
#!/bin/sh

echo "Downloading VGG model [~530MB] ..."

wget http://www.robots.ox.ac.uk/~vgg/software/very_deep/caffe/VGG_ILSVRC_16_layers.caffemodel

echo "Downloading NOC Imagenet captioning model [~1.3GB] ..."
wget --no-check-certificate https://www.dropbox.com/s/.caffemodel.h5

echo "Organizing..."

DIR="./models"
if [ ! -d "$DIR" ]; then
  mkdir "$DIR"
fi

mv VGG_ILSVRC_16_layers.caffemodel "$DIR/"
mv imgnetcoco_3loss_voc72klabel_inglove_prelm75k_sgd_lr4e5_iter_80000.caffemodel.h5 "$DIR/"
echo "Done."

@@ -0,0 +1,10 @@
images/COCO_val2014_000000050811.jpg
images/COCO_val2014_000000156704.jpg
images/COCO_val2014_000000190680.jpg
images/COCO_val2014_000000238377.jpg
images/COCO_val2014_000000238472.jpg
images/COCO_val2014_000000317479.jpg
images/COCO_val2014_000000417023.jpg
images/COCO_val2014_000000420357.jpg
images/n01838598_woodpecker.jpg
images/n02657694_flounder.jpg