chainerx.conv_transpose

chainerx.conv_transpose(x, w, b=None, stride=1, pad=0, outsize=None)

N-dimensional transposed convolution.

This is an implementation of N-dimensional transposed convolution, which was previously known as deconvolution in Chainer.

It takes three arrays: the input x, the filter weight w, and the bias vector b.
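
To make the arithmetic concrete, here is a minimal pure-Python sketch of a one-dimensional, single-sample, single-channel transposed convolution without bias. The name conv_transpose_1d is hypothetical and is not part of ChainerX. The sketch assumes that padding crops pad elements from each end of the full result, which is consistent with the output-size formula on this page.

```python
def conv_transpose_1d(x, w, stride=1, pad=0):
    """Hypothetical helper: 1-D transposed convolution on plain lists."""
    d, k = len(x), len(w)
    # The full (uncropped) result has length stride * (d - 1) + k.
    full = [0.0] * (stride * (d - 1) + k)
    for i, x_i in enumerate(x):
        for j, w_j in enumerate(w):
            # Each input element scatters a scaled copy of the filter.
            full[i * stride + j] += x_i * w_j
    # Padding removes `pad` elements from both ends of the full result.
    return full[pad:len(full) - pad]


y = conv_transpose_1d([1.0, 2.0, 3.0], [1.0, 1.0], stride=2, pad=1)
print(y)  # [1.0, 2.0, 2.0, 3.0]
```

The multi-channel case additionally sums these contributions over the input channels for every output channel.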

Notation: the following symbols denote dimensionalities.

  • \(N\) is the number of spatial dimensions.

  • \(n\) is the batch size.

  • \(c_I\) and \(c_O\) are the numbers of input and output channels, respectively.

  • \(d_1, d_2, ..., d_N\) are the sizes of the input’s spatial dimensions.

  • \(k_1, k_2, ..., k_N\) are the sizes of the filter along each spatial axis.

  • \(p_1, p_2, ..., p_N\) are the spatial padding widths along each axis.

  • \(s_1, s_2, ..., s_N\) are the strides of filter application along each axis.

If outsize is None, the output size \((l_1, l_2, ..., l_N)\) is computed from the quantities listed above as follows (in the equations, the subscript \(n\) indexes the spatial axes, not the batch size):

\[l_n = s_n (d_n - 1) + k_n - 2 p_n \ \ (n = 1, ..., N)\]
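
As a worked instance of this formula, the sketch below evaluates it per axis in plain Python. The helper name default_outsize is hypothetical and not part of ChainerX. The numbers are the ones used in the examples at the end of this page.

```python
def default_outsize(in_size, ksize, stride, pad):
    """Hypothetical helper: apply l = s * (d - 1) + k - 2 * p per axis."""
    return tuple(
        s * (d - 1) + k - 2 * p
        for d, k, s, p in zip(in_size, ksize, stride, pad)
    )


print(default_outsize((5, 10, 15), (10, 10, 10), (2, 4, 6), (5, 5, 5)))
# (8, 36, 84)
```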

If outsize is given, it is used as the output size. In this case, outsize \((l_1, l_2, ..., l_N)\) must satisfy the following equations:

\[d_n = \lfloor (l_n + 2p_n - k_n) / s_n \rfloor + 1 \ \ (n = 1, ..., N)\]
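
Because of the floor, each axis with stride \(s\) admits \(s\) consecutive valid output sizes, starting at the default size. The sketch below checks the condition in plain Python. The helper name is_valid_outsize is hypothetical and not part of ChainerX, and it does not claim to reproduce the library's internal validation.

```python
def is_valid_outsize(outsize, in_size, ksize, stride, pad):
    """Hypothetical helper: check d == floor((l + 2p - k) / s) + 1 per axis."""
    return all(
        d == (l + 2 * p - k) // s + 1
        for l, d, k, s, p in zip(outsize, in_size, ksize, stride, pad)
    )


in_size, ksize, stride, pad = (5, 10, 15), (10, 10, 10), (2, 4, 6), (5, 5, 5)
print(is_valid_outsize((8, 36, 84), in_size, ksize, stride, pad))   # True
print(is_valid_outsize((9, 38, 87), in_size, ksize, stride, pad))   # True
print(is_valid_outsize((10, 36, 84), in_size, ksize, stride, pad))  # False
```
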
Parameters
  • x (ndarray) – Input array of shape \((n, c_I, d_1, d_2, ..., d_N)\).

  • w (ndarray) – Weight array of shape \((c_I, c_O, k_1, k_2, ..., k_N)\).

  • b (None or ndarray) – One-dimensional bias array with length \(c_O\) (optional).

  • stride (int or tuple of ints) – Stride of filter applications \((s_1, s_2, ..., s_N)\). stride=s is equivalent to (s, s, ..., s).

  • pad (int or tuple of ints) – Spatial padding width for input arrays \((p_1, p_2, ..., p_N)\). pad=p is equivalent to (p, p, ..., p).

  • outsize (None or tuple of ints) – Expected output size of the transposed convolution. It should be a tuple of ints \((l_1, l_2, ..., l_N)\). The default is None, in which case the output size is computed from the input size, filter size, stride, and pad.
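
The int-or-tuple convention for stride and pad can be illustrated with a small sketch. The helper name as_tuple is hypothetical; it is shown only to clarify the equivalence stated above and is not ChainerX's own code.

```python
def as_tuple(value, ndim):
    """Hypothetical helper: expand an int to a length-ndim tuple."""
    if isinstance(value, int):
        return (value,) * ndim
    value = tuple(value)
    if len(value) != ndim:
        raise ValueError(f"expected {ndim} values, got {len(value)}")
    return value


print(as_tuple(2, 3))          # (2, 2, 2)
print(as_tuple((2, 4, 6), 3))  # (2, 4, 6)
```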

Returns

Output array of shape \((n, c_O, l_1, l_2, ..., l_N)\).

Return type

ndarray

Note

During backpropagation, this function propagates the gradient of the output array to input arrays x, w, and b.

Example

Example 1: the case when outsize is not given.

>>> import numpy as np
>>> import chainerx
>>> n = 10
>>> c_i, c_o = 3, 1
>>> d1, d2, d3 = 5, 10, 15
>>> k1, k2, k3 = 10, 10, 10
>>> p1, p2, p3 = 5, 5, 5
>>> x = chainerx.random.uniform(0, 1, (n, c_i, d1, d2, d3)).astype(np.float32)
>>> x.shape
(10, 3, 5, 10, 15)
>>> w = chainerx.random.uniform(0, 1, (c_i, c_o, k1, k2, k3)).astype(np.float32)
>>> w.shape
(3, 1, 10, 10, 10)
>>> b = chainerx.random.uniform(0, 1, (c_o,)).astype(np.float32)
>>> b.shape
(1,)
>>> s1, s2, s3 = 2, 4, 6
>>> y = chainerx.conv_transpose(x, w, b, stride=(s1, s2, s3), pad=(p1, p2, p3))
>>> y.shape
(10, 1, 8, 36, 84)
>>> l1 = s1 * (d1 - 1) + k1 - 2 * p1
>>> l2 = s2 * (d2 - 1) + k2 - 2 * p2
>>> l3 = s3 * (d3 - 1) + k3 - 2 * p3
>>> y.shape == (n, c_o, l1, l2, l3)
True

Example 2: the case when outsize is given.

>>> n = 10
>>> c_i, c_o = 3, 1
>>> d1, d2, d3 = 5, 10, 15
>>> k1, k2, k3 = 10, 10, 10
>>> p1, p2, p3 = 5, 5, 5
>>> x = chainerx.array(np.random.uniform(0, 1, (n, c_i, d1, d2, d3)).astype(np.float32))
>>> x.shape
(10, 3, 5, 10, 15)
>>> w = chainerx.array(np.random.uniform(0, 1, (c_i, c_o, k1, k2, k3)).astype(np.float32))
>>> w.shape
(3, 1, 10, 10, 10)
>>> b = chainerx.array(np.random.uniform(0, 1, (c_o,)).astype(np.float32))
>>> b.shape
(1,)
>>> s1, s2, s3 = 2, 4, 6
>>> l1, l2, l3 = 9, 38, 87
>>> d1 == (l1 + 2 * p1 - k1) // s1 + 1
True
>>> d2 == (l2 + 2 * p2 - k2) // s2 + 1
True
>>> d3 == (l3 + 2 * p3 - k3) // s3 + 1
True
>>> y = chainerx.conv_transpose(x, w, b, stride=(s1, s2, s3), pad=(p1, p2, p3), outsize=(l1, l2, l3))
>>> y.shape
(10, 1, 9, 38, 87)
>>> y.shape == (n, c_o, l1, l2, l3)
True