script to upload/update model info as gist
sergeyk committed Sep 4, 2014
1 parent c6827bf commit 51c4e6e
Showing 4 changed files with 47 additions and 16 deletions.
20 changes: 6 additions & 14 deletions docs/model_zoo.md
@@ -3,7 +3,6 @@
# Caffe Model Zoo

Many people have used Caffe to train models of different architectures and applied them to different problems, ranging from simple regression to AlexNet-alikes to Siamese networks for image similarity to speech applications.

To lower the friction of sharing these models, we introduce the model zoo framework:

- A standard format for packaging Caffe model info.
@@ -26,36 +25,29 @@ User-provided models are posted to a public-editable [wiki page](https://github.
A caffe model is distributed as a directory containing:

- Solver/model prototxt(s)
- Readme.md containing
- `readme.md` containing
- YAML frontmatter
- Caffe version used to train this model (tagged release or commit hash).
- [optional] file URL and SHA1 of the trained `.caffemodel`.
- [optional] github gist id.
- Information about what data the model was trained on, explanation of modeling choices, etc.
- Information about what data the model was trained on, modeling choices, etc.
- License information.
- [optional] Other helpful scripts.
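
The frontmatter fields listed above can be read with a few lines of code. The following is a minimal Python sketch, not the parser used by Caffe's own scripts: it assumes the frontmatter is a flat block of `key: value` lines between two `---` markers. The function name `parse_frontmatter` and the `name` value are hypothetical; the other field values are copied from `models/finetune_flickr_style/readme.md`.

```python
def parse_frontmatter(text):
    """Return the flat `key: value` pairs found between the leading '---' lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # the file does not start with a frontmatter block
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields  # closing marker reached
        # Split on the first colon only, so URL values keep their "://".
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}  # no closing '---': treat the file as having no frontmatter


# Hypothetical readme.md contents; only `name` is invented for illustration.
example = """---
name: Example Model
license: non-commercial
sha1: 443ad95a61fb0b5cd3cee55951bcc1f299186b5e
caffe_commit: 41751046f18499b84dbaf529f64c0e664e2a09fe
gist_id: 034c6ac3865563b69e60
---

This model is trained ...
"""

meta = parse_frontmatter(example)
print(meta["gist_id"])
```

Nested or multi-line YAML values are outside what this sketch handles; a full YAML parser would be the safer choice for anything beyond flat fields.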

## Hosting model info

GitHub Gist is a good format for model info distribution because it can contain multiple files, is versionable, and has in-browser syntax highlighting and markdown rendering.

- `scripts/download_model_from_gist.sh <gist_id>`: downloads the non-binary files from a Gist into `<dirname>`
- `scripts/upload_model_to_gist.sh <dirname>`: uploads non-binary files in the model directory as a GitHub Gist and prints the Gist ID. If `gist_id` is already part of the `<dirname>/readme.md` frontmatter, the script updates the existing Gist instead of creating a new one.

Run `scripts/upload_model_to_gist.sh models/bvlc_alexnet` to test uploading (don't forget to delete the uploaded Gist afterward).

Downloading models is not yet supported as a script (there is no good command-line tool for this right now), so for now go to the Gist URL and click "Download Gist".

### Hosting trained models

It is up to the user where to host the `.caffemodel` file.
We host our BVLC-provided models on our own server.
Dropbox also works fine (tip: make sure that `?dl=1` is appended to the end of the URL).

- `scripts/download_model_binary.py <dirname>`: downloads the `.caffemodel` from the URL specified in the `<dirname>/readme.md` frontmatter and confirms SHA1.
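
The SHA1 confirmation step can be illustrated as follows. This is a hedged sketch of the general technique, not the contents of `scripts/download_model_binary.py`; the helper names `sha1_of_file` and `matches_sha1` are assumptions made for this example, and the demo hashes a throwaway file rather than a real `.caffemodel`.

```python
import hashlib
import os
import tempfile


def sha1_of_file(path, chunk_size=1 << 20):
    """Hash the file in chunks so large model files are never fully held in memory."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha1(path, expected_sha1):
    """Compare the file's SHA1 with the (hex) value taken from the frontmatter."""
    return sha1_of_file(path) == expected_sha1.strip().lower()


# Demo on a temporary file; a real check would point at the downloaded model.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abc")
demo_path = tmp.name
print(matches_sha1(demo_path, hashlib.sha1(b"abc").hexdigest()))
os.remove(demo_path)
```

Reading in fixed-size chunks matters here because trained model files can be hundreds of megabytes.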


## Tasks

x get the imagenet example to work with the new prototxt location
x make wiki page for user-submitted models
- add flickr model to the user-submitted models wiki page
x make docs section listing bvlc-distributed models
- write the publish_model_as_gist script
- write the download_model_from_gist script
4 changes: 2 additions & 2 deletions examples/imagenet/readme.md
@@ -91,9 +91,9 @@ Resume Training?

We all experience times when the power goes out, or we feel like rewarding ourselves a little by playing Battlefield (does anyone still remember Quake?). Since we are snapshotting intermediate results during training, we will be able to resume from snapshots. This can be done as easily as:

./build/tools/caffe train --solver=examples/imagenet/imagenet_solver.prototxt --snapshot=examples/imagenet/caffe_imagenet_10000.solverstate
./build/tools/caffe train --solver=models/bvlc_reference_caffenet/solver.prototxt --snapshot=models/bvlc_reference_caffenet/caffenet_train_10000.solverstate

where in the script `imagenet_train_10000.solverstate` is the solver state snapshot that stores all necessary information to recover the exact solver state (including the parameters, momentum history, etc).
where in the script `caffenet_train_10000.solverstate` is the solver state snapshot that stores all necessary information to recover the exact solver state (including the parameters, momentum history, etc).

Parting Words
-------------
1 change: 1 addition & 0 deletions models/finetune_flickr_style/readme.md
@@ -5,6 +5,7 @@ caffemodel_url: http://dl.caffe.berkeleyvision.org/finetune_flickr_style.caffemo
license: non-commercial
sha1: 443ad95a61fb0b5cd3cee55951bcc1f299186b5e
caffe_commit: 41751046f18499b84dbaf529f64c0e664e2a09fe
gist_id: 034c6ac3865563b69e60
---

This model is trained exactly as described in `docs/finetune_flickr_style/readme.md`, using all 80000 images.
38 changes: 38 additions & 0 deletions scripts/upload_model_to_gist.sh
@@ -0,0 +1,38 @@
#!/bin/bash

# Check for valid directory
DIRNAME=$1
if [ ! -f "$DIRNAME/readme.md" ]; then
    echo "usage: upload_model_to_gist.sh <dirname>"
    echo "  <dirname>/readme.md must exist"; exit 1
fi
cd "$DIRNAME" || exit 1
FILES=$(find . -maxdepth 1 -type f ! -name "*.caffemodel*" | xargs echo)

# Check for gist tool.
gist -v >/dev/null 2>&1 || { echo >&2 "I require 'gist' but it's not installed. Do 'gem install gist'."; exit 1; }

NAME=$(sed -n 's/^name:[[:space:]]*//p' readme.md)
if [ -z "$NAME" ]; then
    echo "  <dirname>/readme.md must contain name field in the front-matter."; exit 1
fi

GIST=$(sed -n 's/^gist_id:[[:space:]]*//p' readme.md)
if [ -z "$GIST" ]; then
    echo "Uploading new Gist"
    gist -p -d "$NAME" $FILES
else
    echo "Updating existing Gist, id $GIST"
    gist -u "$GIST" -d "$NAME" $FILES
fi

RESULT=$?
if [ $RESULT -eq 0 ]; then
    echo "You've uploaded your model!"
    echo "Don't forget to add the gist_id field to your <dirname>/readme.md now!"
    echo "Run the command again after you do that, to make sure the Gist id propagates."
    echo ""
    echo "And do share your model over at https://github.com/BVLC/caffe/wiki/Model-Zoo"
else
    echo "Something went wrong!"
fi
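
The new-versus-update decision in the script above can be restated in Python for readers who want to test that logic without calling the `gist` tool. This is only a sketch: the helper names `read_field` and `gist_command` are hypothetical, the `-p`, `-u`, and `-d` flags are copied from the script rather than checked against the tool's documentation, and the function only builds the argument list without running anything.

```python
import re


def read_field(readme_text, field):
    """Mimic the script's sed call: value of the first line starting with '<field>:'."""
    pattern = r"^%s:[ \t]*(.*)$" % re.escape(field)
    match = re.search(pattern, readme_text, re.MULTILINE)
    return match.group(1).strip() if match else ""


def gist_command(readme_text, files):
    """Build the gist invocation: update when gist_id is present, else create."""
    name = read_field(readme_text, "name")
    if not name:
        raise ValueError("readme.md must contain a name field in the frontmatter")
    gist_id = read_field(readme_text, "gist_id")
    if gist_id:
        return ["gist", "-u", gist_id, "-d", name] + list(files)
    return ["gist", "-p", "-d", name] + list(files)


# Hypothetical readme contents and file list, for illustration only.
new_model = "---\nname: Example Model\n---\n"
existing = "---\nname: Example Model\ngist_id: 034c6ac3865563b69e60\n---\n"
print(gist_command(new_model, ["readme.md", "solver.prototxt"]))
print(gist_command(existing, ["readme.md", "solver.prototxt"]))
```

Like the `sed` expressions in the script, `read_field` matches a field anywhere in the file, not only inside the frontmatter block.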
