chainer.functions.spatial_transformer_sampler(x, grid, **kwargs)

2D Spatial Transformer sampler.

This is a differentiable image sampler. With a set of sampling points grid and an input feature map x, this produces a sampled output feature map.

This function currently only supports bilinear interpolation as a sampling kernel.

When coordinates in grid fall outside the range \([-1, 1]\), values are sampled from a zero-padded input image.
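As a rough illustration of bilinear sampling with zero padding, here is a minimal pure-Python sketch for a single-channel image and a single sampling point. It is not the library's implementation. The helper name `sample_bilinear_zero_pad` is hypothetical, and the mapping from normalized to pixel coordinates (\(-1\) to the first pixel centre, \(+1\) to the last) is an assumption; the text above only states that coordinates lie in \([-1, 1]\).

```python
import math


def sample_bilinear_zero_pad(image, gx, gy):
    """Sample a 2D image (list of rows) at normalized coordinates (gx, gy).

    gx is the horizontal coordinate and gy the vertical one.
    """
    h, w = len(image), len(image[0])

    # Assumed mapping: -1 -> index 0, +1 -> index (size - 1).
    u = (gx + 1.0) * (w - 1) / 2.0  # horizontal position in pixel units
    v = (gy + 1.0) * (h - 1) / 2.0  # vertical position in pixel units
    u0, v0 = math.floor(u), math.floor(v)

    def pixel(row, col):
        # Anything outside the image reads as zero (zero padding).
        if 0 <= row < h and 0 <= col < w:
            return image[row][col]
        return 0.0

    # Blend the 2x2 neighbourhood around (v, u) by distance.
    du, dv = u - u0, v - v0
    return ((1 - dv) * (1 - du) * pixel(v0, u0)
            + (1 - dv) * du * pixel(v0, u0 + 1)
            + dv * (1 - du) * pixel(v0 + 1, u0)
            + dv * du * pixel(v0 + 1, u0 + 1))


image = [[1.0, 2.0],
         [3.0, 4.0]]
corner = sample_bilinear_zero_pad(image, -1.0, -1.0)  # upper-left pixel
centre = sample_bilinear_zero_pad(image, 0.0, 0.0)    # blend of all four pixels
outside = sample_bilinear_zero_pad(image, -3.0, 0.0)  # entirely in the padding
```

Under these assumptions, a point far outside \([-1, 1]\) samples only padding and yields zero, while a point slightly outside blends edge pixels with zeros.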

Notation: the following symbols denote dimensions.

  • \(n\) is the batch size.
  • \(c_I\) is the number of input channels.
  • \(h\) and \(w\) are the height and width of the input image, respectively.
  • \(h_O\) and \(w_O\) are the height and width of the output image, respectively.

For details, see the following paper: Spatial Transformer Networks.


cuDNN supports SpatialTransformerSampler from version 5.0.0.

Parameters:

  • x (Variable or N-dimensional array) – Input variable of shape \((n, c_I, h, w)\).
  • grid (Variable) –

    Coordinate variable of shape \((n, 2, h_O, w_O)\). Each coordinate defines the spatial location in the input where a sampling kernel is applied to get the value at a particular pixel in the output. grid[idx, :, i, j] is the coordinate used to sample the value of the output pixel at location \((i, j)\) for batch element idx.

    In the second dimension, the first coordinate corresponds to the location along the horizontal axis, and the second coordinate corresponds to the location along the vertical axis.

    The coordinate \((-1, -1)\) corresponds to the upper-left corner of the input image.
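The grid layout described above can be sketched with NumPy. This is only an illustration of the shape and axis ordering, and the sizes `n`, `h_out`, and `w_out` are made-up values. The grid below covers the input evenly from \((-1, -1)\) to \((1, 1)\); whether it reproduces the input exactly when the output size equals the input size depends on how the implementation maps \(\pm 1\) to pixel positions, which the text above does not specify.

```python
import numpy as np

# Hypothetical sizes chosen only for illustration.
n, h_out, w_out = 2, 3, 4

xs = np.linspace(-1.0, 1.0, w_out, dtype=np.float32)  # horizontal coordinates
ys = np.linspace(-1.0, 1.0, h_out, dtype=np.float32)  # vertical coordinates

# With indexing="ij", both outputs have shape (h_out, w_out):
# grid_y varies along rows, grid_x varies along columns.
grid_y, grid_x = np.meshgrid(ys, xs, indexing="ij")

# Second dimension: horizontal coordinate first, vertical coordinate second.
grid = np.stack([grid_x, grid_y], axis=0)                # (2, h_out, w_out)
grid = np.broadcast_to(grid, (n,) + grid.shape).copy()   # (n, 2, h_out, w_out)

upper_left = grid[0, :, 0, 0].tolist()            # coordinate for output pixel (0, 0)
upper_right = grid[0, :, 0, w_out - 1].tolist()   # coordinate for output pixel (0, w_out - 1)
```

An array built this way could then be passed as grid together with an input x of shape \((n, c_I, h, w)\); since the parameter list gives the type of grid as Variable, wrapping the array in a Variable first may be needed.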


Returns:

Output feature map of shape \((n, c_I, h_O, w_O)\).

Return type: