chainer.functions.deconvolution_nd

chainer.functions.deconvolution_nd(x, W, b=None, stride=1, pad=0, outsize=None)

N-dimensional deconvolution function.

This is an implementation of N-dimensional deconvolution, which generalizes the two-dimensional one. Most deep learning frameworks and papers call this operation transposed convolution. For historical reasons (e.g. the paper Deconvolutional Networks by Zeiler et al.) and for backward compatibility, Chainer calls it deconvolution.

It takes three variables: the input x, the filter weight W, and the bias vector b.
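
To make the "transposed convolution" reading concrete, here is a minimal NumPy sketch of the one-dimensional case. It uses the usual scatter-add definition, in which each input position adds a scaled copy of the filter to the output. The name naive_deconv_1d is hypothetical and not part of Chainer, and this is not how Chainer implements the function; it only illustrates the roles of x, W and b under that assumption.

```python
import numpy as np


def naive_deconv_1d(x, W, b=None, stride=1, pad=0):
    """Illustrative 1-D transposed convolution (hypothetical helper).

    x: (n, c_I, d), W: (c_I, c_O, k), b: (c_O,) or None.
    """
    n, c_i, d = x.shape
    c_i_w, c_o, k = W.shape
    assert c_i == c_i_w, "x and W must agree on the number of input channels"

    # Output length before the padding is cropped away.
    full = stride * (d - 1) + k
    y = np.zeros((n, c_o, full), dtype=x.dtype)

    for i in range(d):
        # Input position i contributes a filter-shaped patch of shape (n, c_O, k).
        patch = np.einsum('nc,cok->nok', x[:, :, i], W)
        y[:, :, stride * i:stride * i + k] += patch

    # Cropping `pad` elements from both ends gives length s*(d-1) + k - 2*p.
    y = y[:, :, pad:full - pad]
    if b is not None:
        y = y + b[None, :, None]
    return y


x = np.ones((2, 3, 5), dtype='f')
W = np.ones((3, 4, 4), dtype='f')
print(naive_deconv_1d(x, W, stride=2, pad=1).shape)  # (2, 4, 10)
```

The resulting length, 2 * (5 - 1) + 4 - 2 * 1 = 10, agrees with the output size formula given for the case where outsize is None.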

Notation: the dimensionalities are denoted as follows.

  • \(N\) is the number of spatial dimensions.
  • \(n\) is the batch size.
  • \(c_I\) and \(c_O\) are the number of the input and output channels, respectively.
  • \(d_1, d_2, ..., d_N\) are the sizes of the input’s spatial dimensions.
  • \(k_1, k_2, ..., k_N\) are the sizes of the filter along each axis.
  • \(p_1, p_2, ..., p_N\) are the spatial padding widths along each axis.
  • \(s_1, s_2, ..., s_N\) are the strides of filter application along each axis.

If the outsize option is None, the output size \((l_1, l_2, ..., l_N)\) is determined from the quantities listed above by the following equation:

\[l_n = s_n (d_n - 1) + k_n - 2 p_n \ \ (n = 1, ..., N)\]
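
As a quick check of this formula, the following plain-Python sketch evaluates it per axis. The helper name deconv_outsize is hypothetical and not part of Chainer; the input values are the ones used in Example 1.

```python
def deconv_outsize(dims, ksize, stride, pad):
    """Per-axis output size l = s * (d - 1) + k - 2 * p (hypothetical helper)."""
    return tuple(
        s * (d - 1) + k - 2 * p
        for d, k, s, p in zip(dims, ksize, stride, pad)
    )


# Values from Example 1: d = (5, 10, 15), k = (10, 10, 10),
# s = (2, 4, 6), p = (5, 5, 5).
print(deconv_outsize((5, 10, 15), (10, 10, 10), (2, 4, 6), (5, 5, 5)))
# (8, 36, 84)
```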

If the outsize option is given, the output size is taken from outsize. In this case, the outsize \((l_1, l_2, ..., l_N)\) must satisfy the following equation:

\[d_n = \lfloor (l_n + 2p_n - k_n) / s_n \rfloor + 1 \ \ (n = 1, ..., N)\]
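
Because of the floor, more than one outsize value can satisfy this condition when the stride is larger than 1. The sketch below solves the condition for \(l_n\) on a single axis, which gives the range from s(d - 1) + k - 2p up to that value plus s - 1. The helper names are hypothetical and not part of Chainer, and the code only restates the equation above; it does not describe how Chainer validates outsize internally.

```python
def outsize_is_valid(l, d, k, s, p):
    """Check d == floor((l + 2p - k) / s) + 1 for one axis."""
    return (l + 2 * p - k) // s + 1 == d


def valid_outsize_range(d, k, s, p):
    """Smallest and largest l that satisfy the condition for one axis."""
    lo = s * (d - 1) + k - 2 * p
    return lo, lo + s - 1


# First axis of Example 2: d = 5, k = 10, s = 2, p = 5.
lo, hi = valid_outsize_range(5, 10, 2, 5)
print(lo, hi)  # 8 9
print(all(outsize_is_valid(l, 5, 10, 2, 5) for l in range(lo, hi + 1)))  # True
```

The lower end of the range is the size used when outsize is None, so passing outsize is only needed to select one of the larger admissible sizes.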
Parameters:
  • x (Variable or numpy.ndarray or cupy.ndarray) – Input variable of shape \((n, c_I, d_1, d_2, ..., d_N)\).
  • W (Variable or numpy.ndarray or cupy.ndarray) – Weight variable of shape \((c_I, c_O, k_1, k_2, ..., k_N)\).
  • b (Variable or numpy.ndarray or cupy.ndarray) – One-dimensional bias variable with length \(c_O\) (optional).
  • stride (int or tuple of ints) – Stride of filter applications \((s_1, s_2, ..., s_N)\). stride=s is equivalent to (s, s, ..., s).
  • pad (int or tuple of ints) – Spatial padding width for input arrays \((p_1, p_2, ..., p_N)\). pad=p is equivalent to (p, p, ..., p).
  • outsize (tuple of ints) – Expected output size of the deconvolution operation. It should be a tuple of ints \((l_1, l_2, ..., l_N)\). The default value is None, in which case the output size is estimated from the input size, stride and pad.
Returns:

Output variable of shape \((n, c_O, l_1, l_2, ..., l_N)\).

Return type:

Variable

See also

links.DeconvolutionND, deconvolution_2d()

Example

Example 1: the case when outsize is not given.

>>> import numpy as np
>>> import chainer.functions as F
>>> n = 10
>>> c_i, c_o = 3, 1
>>> d1, d2, d3 = 5, 10, 15
>>> k1, k2, k3 = 10, 10, 10
>>> p1, p2, p3 = 5, 5, 5
>>> x = np.random.uniform(0, 1, (n, c_i, d1, d2, d3)).astype('f')
>>> x.shape
(10, 3, 5, 10, 15)
>>> W = np.random.uniform(0, 1, (c_i, c_o, k1, k2, k3)).astype('f')
>>> W.shape
(3, 1, 10, 10, 10)
>>> b = np.random.uniform(0, 1, (c_o,)).astype('f')
>>> b.shape
(1,)
>>> s1, s2, s3 = 2, 4, 6
>>> y = F.deconvolution_nd(x, W, b, stride=(s1, s2, s3), pad=(p1, p2, p3))
>>> y.shape
(10, 1, 8, 36, 84)
>>> l1 = s1 * (d1 - 1) + k1 - 2 * p1
>>> l2 = s2 * (d2 - 1) + k2 - 2 * p2
>>> l3 = s3 * (d3 - 1) + k3 - 2 * p3
>>> y.shape == (n, c_o, l1, l2, l3)
True

Example 2: the case when outsize is given.

>>> n = 10
>>> c_i, c_o = 3, 1
>>> d1, d2, d3 = 5, 10, 15
>>> k1, k2, k3 = 10, 10, 10
>>> p1, p2, p3 = 5, 5, 5
>>> x = np.random.uniform(0, 1, (n, c_i, d1, d2, d3)).astype('f')
>>> x.shape
(10, 3, 5, 10, 15)
>>> W = np.random.uniform(0, 1, (c_i, c_o, k1, k2, k3)).astype('f')
>>> W.shape
(3, 1, 10, 10, 10)
>>> b = np.random.uniform(0, 1, (c_o,)).astype('f')
>>> b.shape
(1,)
>>> s1, s2, s3 = 2, 4, 6
>>> l1, l2, l3 = 9, 38, 87
>>> d1 == (l1 + 2 * p1 - k1) // s1 + 1
True
>>> d2 == (l2 + 2 * p2 - k2) // s2 + 1
True
>>> d3 == (l3 + 2 * p3 - k3) // s3 + 1
True
>>> y = F.deconvolution_nd(x, W, b, stride=(s1, s2, s3), pad=(p1, p2, p3), outsize=(l1, l2, l3))
>>> y.shape
(10, 1, 9, 38, 87)
>>> y.shape == (n, c_o, l1, l2, l3)
True