[docs] identify batch norm layer blobs
shelhamer committed Sep 13, 2016
1 parent 04f9a77 commit 3b6fd1d
Showing 1 changed file with 12 additions and 11 deletions.
23 changes: 12 additions & 11 deletions include/caffe/layers/batch_norm_layer.hpp
@@ -13,18 +13,19 @@ namespace caffe {
  * @brief Normalizes the input to have 0-mean and/or unit (1) variance across
  * the batch.
  *
- * This layer computes Batch Normalization described in [1]. For
- * each channel in the data (i.e. axis 1), it subtracts the mean and divides
- * by the variance, where both statistics are computed across both spatial
- * dimensions and across the different examples in the batch.
+ * This layer computes Batch Normalization as described in [1]. For each channel
+ * in the data (i.e. axis 1), it subtracts the mean and divides by the variance,
+ * where both statistics are computed across both spatial dimensions and across
+ * the different examples in the batch.
  *
- * By default, during training time, the network is computing global mean/
- * variance statistics via a running average, which is then used at test
- * time to allow deterministic outputs for each input. You can manually
- * toggle whether the network is accumulating or using the statistics via the
- * use_global_stats option. IMPORTANT: for this feature to work, you MUST
- * set the learning rate to zero for all three parameter blobs, i.e.,
- * param {lr_mult: 0} three times in the layer definition.
+ * By default, during training time, the network is computing global
+ * mean/variance statistics via a running average, which is then used at test
+ * time to allow deterministic outputs for each input. You can manually toggle
+ * whether the network is accumulating or using the statistics via the
+ * use_global_stats option. IMPORTANT: for this feature to work, you MUST set
+ * the learning rate to zero for all three blobs, i.e., param {lr_mult: 0} three
+ * times in the layer definition. For reference, these three blobs are (0)
+ * mean, (1) variance, and (2) the moving average factor.
  *
  * Note that the original paper also included a per-channel learned bias and
  * scaling factor. To implement this in Caffe, define a `ScaleLayer` configured
