chainer.functions.depthwise_convolution_2d(x, W, b=None, stride=1, pad=0)[source]

Two-dimensional depthwise convolution function.

This is an implementation of two-dimensional depthwise convolution. It takes two or three variables: the input image x, the filter weight W, and optionally, the bias vector b.

Notation: here is a notation for dimensionalities.

  • \(n\) is the batch size.
  • \(c_I\) is the number of the input.
  • \(c_M\) is the channel multiplier.
  • \(h\) and \(w\) are the height and width of the input image, respectively.
  • \(h_O\) and \(w_O\) are the height and width of the output image, respectively.
  • \(k_H\) and \(k_W\) are the height and width of the filters, respectively.
  • x (chainer.Variable or numpy.ndarray or cupy.ndarray) – Input variable of shape \((n, c_I, h, w)\).
  • W (Variable) – Weight variable of shape \((c_M, c_I, k_H, k_W)\).
  • b (Variable) – Bias variable of length \(c_M * c_I\) (optional).
  • stride (int or pair of ints) – Stride of filter applications. stride=s and stride=(s, s) are equivalent.
  • pad (int or pair of ints) – Spatial padding width for input arrays. pad=p and pad=(p, p) are equivalent.

Output variable. Its shape is \((n, c_I * c_M, h_O, w_O)\).

Return type:


Like Convolution2D, DepthwiseConvolution2D function computes correlations between filters and patches of size \((k_H, k_W)\) in x. But unlike Convolution2D, DepthwiseConvolution2D does not add up input channels of filters but concatenates them. For that reason, the shape of outputs of depthwise convolution are \((n, c_I * c_M, h_O, w_O)\), \(c_M\) is called channel_multiplier.

\((h_O, w_O)\) is determined by the equivalent equation of Convolution2D.

If the bias vector is given, then it is added to all spatial locations of the output of convolution.

See: L. Sifre. Rigid-motion scattering for image classification


>>> x = np.random.uniform(0, 1, (2, 3, 4, 7))
>>> W = np.random.uniform(0, 1, (2, 3, 3, 3))
>>> b = np.random.uniform(0, 1, (6,))
>>> y = F.depthwise_convolution_2d(x, W, b)
>>> y.shape
(2, 6, 2, 5)