ResNet training in Torch

This implements training of residual networks from "Deep Residual Learning for Image Recognition" by Kaiming He et al.

We wrote a more detailed blog post discussing this code and ResNets in general.

Requirements

See the installation instructions (INSTALL.md) for a step-by-step guide.

Install Torch on a machine with a CUDA GPU
Install cuDNN v4 or v5 and the Torch cuDNN bindings
Download the ImageNet dataset and move validation images to labeled subfolders

If you already have Torch installed, update nn, cunn, and cudnn.
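
The last requirement above, moving validation images into labeled subfolders, can be sketched as follows. This is a minimal illustration, not the script used by this repository: the function name `sort_val_images` and the `labels` dictionary (image file name mapped to class label) are assumptions, and you would need to build that mapping from the ground-truth file shipped with your copy of the dataset.

```python
import shutil
from pathlib import Path


def sort_val_images(val_dir, labels):
    """Move each image in val_dir into a subfolder named after its label.

    labels: dict mapping image file name -> class label (hypothetical format).
    Returns the number of files moved.
    """
    val_dir = Path(val_dir)
    moved = 0
    for name, label in labels.items():
        src = val_dir / name
        if not src.is_file():
            continue  # already moved or missing; skip rather than fail
        class_dir = val_dir / label
        class_dir.mkdir(exist_ok=True)
        shutil.move(str(src), str(class_dir / name))
        moved += 1
    return moved
```

After running it, the val folder has one subfolder per class, the same layout as the train folder.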

Training

See the training recipes (TRAINING.md) for additional examples.

The training scripts come with several options, which can be listed with the --help flag.

th main.lua --help

To run the training, simply run main.lua. By default, the script runs ResNet-34 on ImageNet with 1 GPU and 2 data-loader threads.

th main.lua -data [imagenet-folder with train and val folders]

To train ResNet-50 on 4 GPUs:

th main.lua -depth 50 -batchSize 256 -nGPU 4 -nThreads 8 -shareGradInput true -data [imagenet-folder]

Trained models

Trained ResNet 18, 34, 50, 101, 152, and 200 models are available for download. We include instructions for using a custom dataset, classifying an image and getting the model's top-5 predictions, and extracting image features using a pre-trained model.

The trained models achieve lower error rates than the original ResNet models.

Single-crop (224x224) validation error rate

Network	Top-1 error (%)	Top-5 error (%)
ResNet-18	30.43	10.76
ResNet-34	26.73	8.74
ResNet-50	24.01	7.02
ResNet-101	22.44	6.21
ResNet-152	22.16	6.16
ResNet-200	21.66	5.79
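
For readers unfamiliar with the metric, top-k error is the percentage of images whose true label is not among the k highest-scoring classes. The sketch below shows that definition on a toy example; it is not the evaluation code from this repository, and the name `topk_error` and the score lists are made up for illustration.

```python
def topk_error(scores, targets, k):
    """Percentage of samples whose target is not among the k highest scores."""
    wrong = 0
    for row, target in zip(scores, targets):
        # Indices of the k largest scores for this sample.
        topk = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if target not in topk:
            wrong += 1
    return 100.0 * wrong / len(targets)


# Toy scores for 4 samples over 3 classes (illustrative values only).
scores = [
    [0.10, 0.70, 0.20],  # highest score: class 1
    [0.50, 0.30, 0.20],  # highest score: class 0
    [0.20, 0.30, 0.50],  # highest score: class 2
    [0.40, 0.35, 0.25],  # highest score: class 0
]
targets = [1, 1, 2, 2]

print(topk_error(scores, targets, k=1))  # 50.0
print(topk_error(scores, targets, k=2))  # 25.0
```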

Notes

This implementation differs from the ResNet paper in a few ways:

Scale augmentation: We use the scale and aspect ratio augmentation from Going Deeper with Convolutions, instead of the scale augmentation used in the ResNet paper. We find this gives a better validation error.

Color augmentation: We use the photometric distortions from Andrew Howard in addition to the AlexNet-style color augmentation used in the ResNet paper.

Weight decay: We apply weight decay to all weights and biases instead of just the weights of the convolution layers.

Strided convolution: When using the bottleneck architecture, we use stride 2 in the 3x3 convolution, instead of the first 1x1 convolution.
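
To make the scale and aspect ratio augmentation noted above more concrete, here is a rough sketch of how a crop box can be sampled. It assumes the ranges described in Going Deeper with Convolutions (a patch covering 8% to 100% of the image area with an aspect ratio between 3/4 and 4/3); the function name `sample_crop_box`, the retry count, and the centered-square fallback are illustrative choices, and the exact values used by this repository may differ.

```python
import math
import random


def sample_crop_box(width, height, rng, attempts=10):
    """Return (x, y, w, h) of a random crop inside a width x height image."""
    area = width * height
    for _ in range(attempts):
        # Assumed ranges: 8%-100% of the area, aspect ratio in [3/4, 4/3].
        target_area = rng.uniform(0.08, 1.0) * area
        aspect = rng.uniform(3.0 / 4.0, 4.0 / 3.0)
        w = int(round(math.sqrt(target_area * aspect)))
        h = int(round(math.sqrt(target_area / aspect)))
        if 0 < w <= width and 0 < h <= height:
            x = rng.randint(0, width - w)
            y = rng.randint(0, height - h)
            return x, y, w, h
    # Fallback when no sampled box fits: the largest centered square.
    side = min(width, height)
    return (width - side) // 2, (height - side) // 2, side, side


rng = random.Random(0)
x, y, w, h = sample_crop_box(500, 375, rng)
print(x, y, w, h)
```

The sampled region would then be resized to the 224x224 network input, so each epoch sees the same image at a different scale and shape.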
