chainer.functions.convolution_2d

chainer.functions.convolution_2d(x, W, b=None, stride=1, pad=0, cover_all=False)

Two-dimensional convolution function.

This is an implementation of two-dimensional convolution in ConvNets. It takes three variables: the input image x, the filter weight W, and the bias vector b.

Notation: the following symbols denote the dimensionalities.

  • \(n\) is the batch size.
  • \(c_I\) and \(c_O\) are the numbers of input and output channels, respectively.
  • \(h_I\) and \(w_I\) are the height and width of the input image, respectively.
  • \(h_K\) and \(w_K\) are the height and width of the filters, respectively.
  • \(h_P\) and \(w_P\) are the height and width of the spatial padding size, respectively.

The Convolution2D function then computes correlations between the filters and patches of size \((h_K, w_K)\) in x. Correlation here is equivalent to the inner product between the filter and the patch, each expanded into a vector. Patches are extracted at positions shifted by multiples of stride from the first position \((-h_P, -w_P)\) along each spatial axis. The right-most (or bottom-most) patches do not run past the padded spatial extent.
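The patch extraction described above can be illustrated with a minimal, dependency-free sketch. It is not Chainer's implementation: the helper name correlate_2d is hypothetical, and it handles only a single image with one input channel, one filter, and no bias, stored as nested lists.

```python
def correlate_2d(image, kernel, stride=(1, 1), pad=(0, 0)):
    """Naive single-channel 2-D correlation on nested lists (illustration only)."""
    s_y, s_x = stride
    h_p, w_p = pad
    h_i, w_i = len(image), len(image[0])
    h_k, w_k = len(kernel), len(kernel[0])

    # Zero-pad h_p rows and w_p columns on every side, so that padded[0][0]
    # corresponds to position (-h_P, -w_P) of the original image.
    padded = [[0.0] * (w_i + 2 * w_p) for _ in range(h_i + 2 * h_p)]
    for r in range(h_i):
        for c in range(w_i):
            padded[r + h_p][c + w_p] = float(image[r][c])

    out = []
    # Step by the stride; stop before a patch would leave the padded image.
    for top in range(0, len(padded) - h_k + 1, s_y):
        row = []
        for left in range(0, len(padded[0]) - w_k + 1, s_x):
            # Inner product between the flattened patch and the flattened kernel.
            row.append(sum(padded[top + u][left + v] * kernel[u][v]
                           for u in range(h_k) for v in range(w_k)))
        out.append(row)
    return out


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernel = [[1, 0], [0, 1]]
print(correlate_2d(image, kernel))
print(correlate_2d(image, kernel, stride=(2, 2), pad=(1, 1)))
```

The kernel is applied without flipping, which is why the operation is a correlation rather than a convolution in the strict mathematical sense.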

Let \((s_Y, s_X)\) be the stride of filter application. Then the output size \((h_O, w_O)\) is determined by the following equations, in which the division is rounded down to an integer:

\[\begin{split}h_O &= (h_I + 2h_P - h_K) / s_Y + 1,\\ w_O &= (w_I + 2w_P - w_K) / s_X + 1.\end{split}\]

If the cover_all option is True, the filter covers all spatial locations: if the last stride of the filter does not reach the end of the spatial locations, an additional stride is applied to the end part. In this case, the output size \((h_O, w_O)\) is determined by the following equations:

\[\begin{split}h_O &= (h_I + 2h_P - h_K + s_Y - 1) / s_Y + 1,\\ w_O &= (w_I + 2w_P - w_K + s_X - 1) / s_X + 1.\end{split}\]
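As a quick check of the two sets of equations above, the following sketch evaluates them for one spatial axis. The helper name conv_output_size is hypothetical and is not part of the API documented here; it assumes the division is integer (floor) division.

```python
def conv_output_size(size, k, s, p, cover_all=False):
    """Output length along one axis for input length `size`, filter length `k`,
    stride `s`, and padding `p` (illustrative helper, integer division)."""
    if cover_all:
        # The extra (s - 1) term adds one more stride when the last
        # regular stride would leave part of the input uncovered.
        return (size + 2 * p - k + s - 1) // s + 1
    return (size + 2 * p - k) // s + 1


# Height axis: h_I = 30, h_K = 10, s_Y = 5, h_P = 5
print(conv_output_size(30, 10, 5, 5))
print(conv_output_size(30, 10, 5, 5, cover_all=True))
# Width axis: w_I = 40, w_K = 10, s_X = 7, w_P = 5
print(conv_output_size(40, 10, 7, 5))
print(conv_output_size(40, 10, 7, 5, cover_all=True))
```

With these values the height is the same in both modes because the stride divides the padded extent evenly, while the width gains one extra position when cover_all is True.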

If the bias vector is given, it is added to all spatial locations of the convolution output.

The output of this function can be non-deterministic when it uses cuDNN. If chainer.configuration.config.cudnn_deterministic is True and the cuDNN version is v3 or later, cuDNN is forced to use a deterministic algorithm.

Warning

The deterministic argument has not been supported since v2. Instead, use chainer.using_config('cudnn_deterministic', value), where value is either True or False. See chainer.using_config().

Parameters:
  • x (Variable or numpy.ndarray or cupy.ndarray) – Input variable of shape \((n, c_I, h_I, w_I)\).
  • W (Variable or numpy.ndarray or cupy.ndarray) – Weight variable of shape \((c_O, c_I, h_K, w_K)\).
  • b (Variable or numpy.ndarray or cupy.ndarray) – Bias variable of length \(c_O\) (optional).
  • stride (int or pair of ints) – Stride of filter applications. stride=s and stride=(s, s) are equivalent.
  • pad (int or pair of ints) – Spatial padding width for input arrays. pad=p and pad=(p, p) are equivalent.
  • cover_all (bool) – If True, all spatial locations are convolved into some output pixels.
Returns:

Output variable of shape \((n, c_O, h_O, w_O)\).

Return type:

Variable

See also

Convolution2D

Example

>>> import numpy as np
>>> import chainer.functions as F
>>> n = 10
>>> c_i, c_o = 3, 1
>>> h_i, w_i = 30, 40
>>> h_k, w_k = 10, 10
>>> h_p, w_p = 5, 5
>>> x = np.random.uniform(0, 1, (n, c_i, h_i, w_i)).astype('f')
>>> x.shape
(10, 3, 30, 40)
>>> W = np.random.uniform(0, 1, (c_o, c_i, h_k, w_k)).astype('f')
>>> W.shape
(1, 3, 10, 10)
>>> b = np.random.uniform(0, 1, (c_o,)).astype('f')
>>> b.shape
(1,)
>>> s_y, s_x = 5, 7
>>> y = F.convolution_2d(x, W, b, stride=(s_y, s_x), pad=(h_p, w_p))
>>> y.shape
(10, 1, 7, 6)
>>> h_o = int((h_i + 2 * h_p - h_k) / s_y + 1)
>>> w_o = int((w_i + 2 * w_p - w_k) / s_x + 1)
>>> y.shape == (n, c_o, h_o, w_o)
True
>>> y = F.convolution_2d(x, W, b, stride=(s_y, s_x), pad=(h_p, w_p), cover_all=True)
>>> y.shape == (n, c_o, h_o, w_o + 1)
True