Commit bcb17f6: clarification of training procedure

qbilius committed Apr 19, 2019
1 parent: d2cc2e0
Showing 1 changed file with 11 additions and 3 deletions.

README.md: 11 additions & 3 deletions
```diff
@@ -41,9 +41,17 @@ Add `--ngpus 1` if you want to run this on a GPU.
 
 ### Want to train on ImageNet?
 
-1. [Get ImageNet](https://github.com/facebook/fb.resnet.torch/blob/master/INSTALL.md#download-the-imagenet-dataset) if you don't have it yet. If you do, note that validation images need to be put in separate folders, just like train data. Follow the instructions in that link to do so easily.
+1. You need one or two GPUs. We used a Titan X; nowadays, popular choices are the GeForce GTX 1080 Ti or GeForce RTX 2080 Ti.
 
-2. `python run.py train - --model Z --workers 20 --ngpus 1`. On a single Titan X, it will train for about 20 hours. Models R and S will require 2 GPUs. **NOTE** the extra `-` between `train` and `--model`!
+2. [Get ImageNet](https://github.com/facebook/fb.resnet.torch/blob/master/INSTALL.md#download-the-imagenet-dataset) if you don't have it yet. If you do, note that validation images need to be put into separate folders, just like the train data. Follow the instructions in that link to do so easily.
+
+3. You are ready to train:
+
+- CORnet-Z: `python run.py train - --model Z --workers 20 --ngpus 1 --step_size 10 --epochs 25 --lr .01` (~20 hours on a single Titan X)
+- CORnet-R: `python run.py train - --model R --workers 20 --ngpus 2 --step_size 10 --epochs 25 --lr .1`
+- CORnet-S: `python run.py train - --model S --workers 20 --ngpus 2 --step_size 20 --epochs 43 --lr .1` (several days)
+
+**NOTE** the extra `-` between `train` and `--model`!
 
 ### If installation is needed
 
```
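The `--lr`, `--step_size`, and `--epochs` flags in the commands above suggest a standard step-decay learning-rate schedule. Below is a minimal sketch of how these flags typically combine, assuming `run.py` follows PyTorch's usual `StepLR` pattern with a decay factor of 0.1 (an assumption; the actual schedule lives in `run.py`):

```python
# Toy sketch, not run.py itself: how --lr, --step_size, and --epochs
# typically combine in a PyTorch step-decay schedule. gamma=0.1 is an
# assumption; check run.py for the actual values and schedule.
import torch

model = torch.nn.Conv2d(3, 64, kernel_size=7)  # stand-in for a CORnet model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # --lr .01
scheduler = torch.optim.lr_scheduler.StepLR(
    optimizer, step_size=10, gamma=0.1)  # --step_size 10

for epoch in range(25):   # --epochs 25
    # ... one full pass over ImageNet would go here ...
    optimizer.step()      # stand-in for the per-batch weight updates
    scheduler.step()      # decay the learning rate every `step_size` epochs
    print(epoch, scheduler.get_last_lr())
```

Under these assumptions, CORnet-Z's `--step_size 10 --epochs 25 --lr .01` would train at 0.01 for epochs 0-9, 0.001 for epochs 10-19, and 0.0001 for epochs 20-24.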
```diff
@@ -54,7 +62,7 @@ Add `--ngpus 1` if you want to run this on a GPU.
 
 ## Longer Motivation
 
-Deep artificial neural networks with spatially repeated processing (aka, deep convolutional ANNs) have been established as the best class of candidate models of visual processing in the primate ventral visual processing stream. Over the past five years, these ANNs have evolved from a simple feedforward eight-layer architecture in AlexNet to extremely deep and branching NASNet architectures, demonstrating increasingly better object categorization performance and increasingly better explanatory power of both neural and behavioral responses. However, from the neuroscientist's point of view, the relationship between such very deep architectures and the ventral visual pathway is incomplete in at least two ways. On one hand, current state-of-the-art ANNs appear to be too complex (e.g. now over 100 levels) compared with the relatively shallow cortical hierarchy (4-8 levels), which makes it difficult to map their elements to those in the ventral visual stream and to understand what they are doing. On the other hand, current state-of-the-art ANNs appear to be not complex enough in that they lack recurrent connections and the resulting neural response dynamics that are commonplace in the ventral visual stream. Here we describe our ongoing efforts to resolve both of these issues by developing a "CORnet" family of deep neural network architectures. Rather than just seeking high object recognition performance (as the state-of-the-art ANNs above do), we instead try to reduce the model family to its most important elements (CORnet-Z) and then gradually build new ANNs with recurrent and skip connections while monitoring both performance and the match between each new CORnet model and a large body of primate brain and behavioral data. We report here that our current best ANN model derived from this approach (CORnet-S) is among the top models on Brain-Score, a composite benchmark for comparing models to the brain, but is simpler than other deep ANNs in terms of the number of convolutions performed along the longest path of information processing in the model. All CORnet models are available at \url{github.com/dicarlolab/CORnet}, and we plan to update this manuscript and the available models in this family as they are produced.
+Deep artificial neural networks with spatially repeated processing (aka, deep convolutional ANNs) have been established as the best class of candidate models of visual processing in the primate ventral visual processing stream. Over the past five years, these ANNs have evolved from a simple feedforward eight-layer architecture in AlexNet to extremely deep and branching NASNet architectures, demonstrating increasingly better object categorization performance and increasingly better explanatory power of both neural and behavioral responses. However, from the neuroscientist's point of view, the relationship between such very deep architectures and the ventral visual pathway is incomplete in at least two ways. On one hand, current state-of-the-art ANNs appear to be too complex (e.g. now over 100 levels) compared with the relatively shallow cortical hierarchy (4-8 levels), which makes it difficult to map their elements to those in the ventral visual stream and to understand what they are doing. On the other hand, current state-of-the-art ANNs appear to be not complex enough in that they lack recurrent connections and the resulting neural response dynamics that are commonplace in the ventral visual stream. Here we describe our ongoing efforts to resolve both of these issues by developing a "CORnet" family of deep neural network architectures. Rather than just seeking high object recognition performance (as the state-of-the-art ANNs above do), we instead try to reduce the model family to its most important elements (CORnet-Z) and then gradually build new ANNs with recurrent and skip connections while monitoring both performance and the match between each new CORnet model and a large body of primate brain and behavioral data. We report here that our current best ANN model derived from this approach (CORnet-S) is among the top models on Brain-Score, a composite benchmark for comparing models to the brain, but is simpler than other deep ANNs in terms of the number of convolutions performed along the longest path of information processing in the model. All CORnet models are available at [github.com/dicarlolab/CORnet](https://github.com/dicarlolab/CORnet), and we plan to update this manuscript and the available models in this family as they are produced.
 
 Read more: [Kubilius\*, Schrimpf\*, et al. (biorxiv, 2018)](https://doi.org/10.1101/408385)
 
```
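The abstract's key architectural move, recurrent and skip connections, can be pictured with a toy block like the one below. This is purely an illustration of the general idea, not the actual CORnet-S block; the real model definitions live in the repository:

```python
# Toy illustration of a convolutional block with a skip connection,
# unrolled for a few recurrent steps. Purely illustrative; the real CORnet
# blocks are defined in the repository (github.com/dicarlolab/CORnet).
import torch
from torch import nn


class ToyRecurrentBlock(nn.Module):
    def __init__(self, channels, times=2):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.nonlin = nn.ReLU()
        self.times = times

    def forward(self, x):
        out = x
        for _ in range(self.times):                  # recurrence: reuse the same weights
            out = self.nonlin(self.conv(out) + x)    # skip connection from the block input
        return out


block = ToyRecurrentBlock(channels=64, times=2)
print(block(torch.randn(1, 64, 56, 56)).shape)  # torch.Size([1, 64, 56, 56])
```

Unrolling the same convolution over a few time steps adds computational depth without adding parameters, which is how a shallow, area-mapped hierarchy can still be deep along its longest processing path.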