chainer.functions.batch_normalization

chainer.functions.batch_normalization(x, gamma, beta, eps=2e-5, running_mean=None, running_var=None, decay=0.9)

Batch normalization function.

It takes the input variable x and two parameter variables, gamma and beta. Both parameter variables must have the same shape, which is referred to as the channel shape. The channel shape corresponds to the dimensions of the input that are not averaged over. Because the first dimension of the input is the batch size, the second dimension of x corresponds to the first dimension of the channel shape, the third dimension of x corresponds to the second channel dimension (if it exists), and so on. The input must therefore have at least one more dimension than the channel shape. The total effective “batch size” is the product of all dimensions of x except the channel dimensions.
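The shape rule described above can be sketched in plain Python. The helper name split_axes is hypothetical and is not part of Chainer; it only illustrates which axes are treated as channel axes and which are averaged over, under the assumption that the channel axes of x must match the channel shape exactly.

```python
from math import prod


def split_axes(x_shape, channel_shape):
    """Return (reduced_axes, effective_batch_size) for the given shapes.

    Hypothetical helper for illustration only; it is not a Chainer API.
    """
    n_channel_dims = len(channel_shape)
    if len(x_shape) < 1 + n_channel_dims:
        raise ValueError("x needs at least 1 + len(channel_shape) dimensions")
    # Axes 1..n_channel_dims of x are the channel axes (assumed to match gamma/beta).
    if tuple(x_shape[1:1 + n_channel_dims]) != tuple(channel_shape):
        raise ValueError("channel axes of x do not match the channel shape")
    # Every other axis (the batch axis and any trailing axes) is averaged over.
    reduced_axes = (0,) + tuple(range(1 + n_channel_dims, len(x_shape)))
    effective_batch = prod(x_shape[a] for a in reduced_axes)
    return reduced_axes, effective_batch


print(split_axes((8, 3, 32, 32), (3,)))  # ((0, 2, 3), 8192)
print(split_axes((8, 3, 5), (3, 5)))     # ((0,), 8)
```

In the first call the effective batch size is 8 * 32 * 32 = 8192; in the second, both trailing axes are channel axes, so only the batch axis is reduced.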

As an example, if the input is four-dimensional and the parameter variables are one-dimensional, the first dimension of the input is taken to be the batch size, the second dimension the channel size, and the remaining two dimensions are spatial dimensions that are averaged over together with the batch dimension. That is, the total batch size is the product of all input dimensions except the second.
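The four-dimensional case can be illustrated with a small NumPy sketch. This is not Chainer's implementation: the function name batch_norm_4d is hypothetical, and the use of the biased (population) variance follows the standard formulation from the cited paper rather than anything stated on this page.

```python
import numpy as np


def batch_norm_4d(x, gamma, beta, eps=2e-5):
    """Normalize x of shape (N, C, H, W) per channel, then scale and shift.

    Illustrative sketch only; assumes the biased variance is used.
    """
    axes = (0, 2, 3)  # batch axis and the two spatial axes
    mean = x.mean(axis=axes, keepdims=True)
    var = x.var(axis=axes, keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)
    # Reshape the 1-D parameters so they broadcast along the channel axis.
    return gamma.reshape(1, -1, 1, 1) * x_hat + beta.reshape(1, -1, 1, 1)


rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(8, 3, 4, 4))
gamma = np.ones(3)
beta = np.zeros(3)

y = batch_norm_4d(x, gamma, beta)
print(y.mean(axis=(0, 2, 3)))  # per-channel means, close to 0
print(y.var(axis=(0, 2, 3)))   # per-channel variances, close to 1
```

With gamma set to ones and beta to zeros, each channel of the output has approximately zero mean and unit variance over the 8 * 4 * 4 = 128 values that make up its effective batch.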

Note: When this function is called directly, the updated running mean and variance statistics cannot be accessed afterwards, because they are members of the function object, which is not exposed to the caller. To access the updated running statistics, create a new instance of the function object, call it, and then read its running_mean and/or running_var attributes. See the corresponding Link class for an example of how to do this.

Warning

The train argument is no longer supported as of v2. Instead, use chainer.using_config('train', train). See chainer.using_config().

Parameters:
  • x (Variable) – Input variable.
  • gamma (Variable) – Scaling parameter of normalized data.
  • beta (Variable) – Shifting parameter of scaled normalized data.
  • eps (float) – Epsilon value for numerical stability.
  • running_mean (array) – Running average of the mean over several mini-batches, computed using the decay parameter. If None, the running average is not computed; in that case running_var must also be None.
  • running_var (array) – Running average of the variance over several mini-batches, computed using the decay parameter. If None, the running average is not computed; in that case running_mean must also be None.
  • decay (float) – Decay rate of moving average. It is used during training.
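The role of decay in the running statistics can be sketched as a plain exponential moving average. This is an assumption about the general form of the update, not a statement of Chainer's exact code: the helper update_running is hypothetical, and the library may apply further adjustments (for example, a bias correction to the variance) that are not shown here.

```python
def update_running(running, batch_stat, decay=0.9):
    """Blend the old running value with the new batch statistic.

    Hypothetical helper: keeps a fraction `decay` of the old value and
    takes the remaining (1 - decay) from the current mini-batch.
    """
    return [decay * r + (1.0 - decay) * b for r, b in zip(running, batch_stat)]


running_mean = [0.0, 0.0]
batch_mean = [1.0, 2.0]  # made-up per-channel batch means

# After one mini-batch the running mean has moved 10% of the way.
running_mean = update_running(running_mean, batch_mean)
print(running_mean)

# With the same batch statistics repeated, it converges towards them.
for _ in range(99):
    running_mean = update_running(running_mean, batch_mean)
print(running_mean)
```

A decay close to 1 makes the running statistics change slowly and smooths out mini-batch noise; a smaller decay tracks recent mini-batches more closely.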

See: Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

See also

links.BatchNormalization