CUDA utilities¶
Device, context and memory management on CuPy.
Chainer uses CuPy (with a very thin wrapper) to exploit the speed of GPU computation. The following modules and classes are imported into the cuda module for convenience (refer to this table when reading Chainer's source code).
imported name | original name
---|---
chainer.cuda.cupy | cupy
chainer.cuda.ndarray | cupy.ndarray
chainer.cuda.cupy.cuda | cupy.cuda
chainer.cuda.Device | cupy.cuda.Device
chainer.cuda.Event | cupy.cuda.Event
chainer.cuda.Stream | cupy.cuda.Stream
Chainer replaces the default allocator of CuPy with its own memory pool implementation. This makes it possible to reuse device memory across multiple forward/backward computations, and across temporary arrays created by consecutive elementwise operations.
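The benefit of pooling can be illustrated with a minimal, hypothetical sketch. This is not Chainer's or CuPy's actual pool (the real one is considerably more sophisticated); it only shows why reusing released blocks avoids repeated device allocations:

```python
class SimpleMemoryPool:
    """Toy size-keyed memory pool (illustration only, not the real CuPy pool)."""

    def __init__(self):
        self._free = {}            # size -> list of released blocks
        self.n_device_allocs = 0   # counts how often we would hit cudaMalloc

    def malloc(self, size):
        blocks = self._free.get(size)
        if blocks:
            return blocks.pop()    # reuse a previously released block
        self.n_device_allocs += 1  # the real pool would call cudaMalloc here
        return bytearray(size)     # stand-in for a device memory block

    def free(self, block):
        # Returning a block to the pool instead of releasing it to the device
        self._free.setdefault(len(block), []).append(block)

pool = SimpleMemoryPool()
a = pool.malloc(1024)
pool.free(a)
b = pool.malloc(1024)   # same size requested again: the block is reused
```

After the second `malloc`, `n_device_allocs` is still 1, which is the effect the memory pool exploits across repeated forward/backward passes.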
Devices¶
-
chainer.cuda.get_device(*args)[source]¶
Gets the device from a device object, an ID integer or an array object.
Note
This API is deprecated. Please use cupy.cuda.get_device_from_id() or cupy.cuda.get_device_from_array() instead.
This is a convenient utility for selecting the correct device when the type of arg is unknown (i.e., it can be used on arrays that may live on the CPU or on a GPU). The returned device object supports Python's context management protocol, so it can be used in a with statement.
Parameters: args – Values that specify a GPU device. The first device object, integer, or cupy.ndarray among the arguments is used to select a device. If it is a device object, it is returned as-is. If it is an integer, the corresponding device is returned. If it is a CuPy array, the device on which the array resides is returned. If none of the arguments is an integer or a CuPy array, a dummy device object representing the CPU is returned.
Returns: Device object specified by the given args.
See also
See cupy.cuda.Device for selecting a device without using arrays.
-
chainer.cuda.get_device_from_id(device_id)[source]¶
Gets the device from an ID integer.
Parameters: device_id (int or None) – The ID of the device that this function returns.
-
chainer.cuda.get_device_from_array(*arrays)[source]¶
Gets the device from a CuPy array or a list of CuPy arrays.
The device on which the given CuPy array resides is returned.
Parameters: arrays (cupy.ndarray or list of cupy.ndarray) – CuPy array(s) whose device this function returns. If a list of cupy.ndarray objects is given, the device of the first array in the list is returned.
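The selection rule shared by these functions can be sketched in pure Python with stand-in types. Here `GPUArray` and the `-1` CPU marker are hypothetical illustrations; the real functions operate on cupy.ndarray and return cupy.cuda.Device (or a dummy CPU device) objects:

```python
class GPUArray:
    """Stand-in for cupy.ndarray; records the device it resides on."""
    def __init__(self, device_id):
        self.device_id = device_id

CPU_DEVICE_ID = -1   # hypothetical marker for the dummy CPU device

def pick_device_id(*args):
    # Mirror of the documented rule: the first integer or GPU array
    # among the arguments decides the device; otherwise fall back to CPU.
    for arg in args:
        if isinstance(arg, int):
            return arg                 # device selected directly by ID
        if isinstance(arg, GPUArray):
            return arg.device_id       # device the array resides on
    return CPU_DEVICE_ID               # nothing GPU-related: CPU dummy
```

Only the first matching argument counts, which is why mixing CPU and GPU arrays in one call selects the device of the first GPU array.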
CuPy array allocation and copy¶
Note
As of v1.3.0, the following array construction wrappers are marked as deprecated. Use the corresponding functions of the cupy module instead. The main difference is that the default dtype changes from float32 to float64.
Deprecated functions | Recommended functions
---|---
chainer.cuda.empty | cupy.empty()
chainer.cuda.empty_like | cupy.empty_like()
chainer.cuda.zeros | cupy.zeros()
chainer.cuda.zeros_like | cupy.zeros_like()
chainer.cuda.ones | cupy.ones()
chainer.cuda.ones_like | cupy.ones_like()
chainer.cuda.full | cupy.full()
chainer.cuda.full_like | cupy.full_like()
-
chainer.cuda.copy(array, out=None, out_device=None, stream=None)[source]¶
Copies a cupy.ndarray object using the default stream.
This function can copy a device array to a destination array on another device.
Parameters:
- array (cupy.ndarray) – Array to be copied.
- out (cupy.ndarray) – Destination array. If it is not None, the out_device argument is ignored.
- out_device – Destination device specifier. The actual device object is obtained by passing this value to get_device().
- stream (cupy.cuda.Stream) – CUDA stream.
Returns: Copied array. If out is not specified, the array is allocated on the device specified by the out_device argument.
Return type: cupy.ndarray
-
chainer.cuda.to_cpu(array, stream=None)[source]¶
Copies the given GPU array to the host CPU.
Parameters:
- array – Array to be sent to the CPU.
- stream (cupy.cuda.Stream) – CUDA stream.
Returns: Array on the CPU. If the given array is already on the CPU, this function just returns array without performing any copy.
Return type: numpy.ndarray
-
chainer.cuda.to_gpu(array, device=None, stream=None)[source]¶
Copies the given CPU array to the specified device.
Parameters:
- array – Array to be sent to the GPU.
- device – Device specifier.
- stream (cupy.cuda.Stream) – CUDA stream. If not None, the copy runs asynchronously.
Returns: Array on the GPU. If array is already on a GPU, this function just returns array without performing any copy. Note that this function does not copy a cupy.ndarray into the specified device.
Return type: cupy.ndarray
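The no-copy short-circuit described for both transfer functions is essentially a type check on the input. A hypothetical sketch, with `HostArray` and `DeviceArray` standing in for numpy.ndarray and cupy.ndarray (the real functions use cupy.asnumpy/cupy.asarray for the actual transfers):

```python
class HostArray:
    """Stand-in for numpy.ndarray (memory on the CPU)."""

class DeviceArray:
    """Stand-in for cupy.ndarray (memory on a GPU)."""

def to_cpu(array):
    if isinstance(array, HostArray):
        return array          # already on CPU: returned as-is, no copy
    return HostArray()        # real code would call cupy.asnumpy(array)

def to_gpu(array):
    if isinstance(array, DeviceArray):
        return array          # already on GPU: returned as-is, no copy
    return DeviceArray()      # real code would call cupy.asarray(array)
```

Because the same object comes back when no transfer is needed, calling these functions unconditionally in generic code is cheap.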
Kernel definition utilities¶
-
chainer.cuda.memoize(for_each_device=False)[source]¶
Makes a function memoize its result for each combination of arguments and device.
This is similar to cupy.memoize(). The difference is that this function can be used in the global scope even if CUDA is not available, in which case it does nothing.
Note
This decorator acts as a dummy if CUDA is not available. It cannot be used for general-purpose memoization even if for_each_device is set to False.
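The degrade-to-a-dummy behavior can be sketched in pure Python. Everything here is an assumed illustration of the documented semantics, not Chainer's actual code; `cuda_available` and `_current_device_id` are hypothetical names:

```python
import functools

cuda_available = False   # assumed flag; the real module detects CuPy at import

def memoize(for_each_device=False):
    """Sketch of the documented behavior: cache per (device, args),
    but degrade to a no-op decorator when CUDA is unavailable."""
    def decorator(f):
        if not cuda_available:
            return f                 # dummy mode: no caching at all
        cache = {}
        @functools.wraps(f)
        def wrapper(*args):
            # Key on the current device only when requested
            # (_current_device_id is a hypothetical helper).
            device_id = _current_device_id() if for_each_device else None
            key = (device_id, args)
            if key not in cache:
                cache[key] = f(*args)
            return cache[key]
        return wrapper
    return decorator

@memoize()
def square(x):
    return x * x
```

With `cuda_available` False, `square` is returned unchanged, which is why the decorator is safe to apply at module scope on CUDA-less machines but useless for general-purpose memoization there.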
-
chainer.cuda.clear_memo()[source]¶
Clears the memoized results for all functions decorated by memoize.
This function works like cupy.clear_memo() as a counterpart for chainer.cuda.memoize(). It can be used even if CUDA is not available, in which case it does nothing.
-
chainer.cuda.elementwise()[source]¶
Creates an elementwise kernel function.
This function uses memoize() to cache the kernel object, i.e. the resulting kernel object is cached for each argument combination and CUDA device.
The arguments are the same as those for cupy.ElementwiseKernel, except that the name argument is mandatory.
-
chainer.cuda.reduce()[source]¶
Creates a global reduction kernel function.
This function uses memoize() to cache the resulting kernel object, i.e. the resulting kernel object is cached for each argument combination and CUDA device.
The arguments are the same as those for cupy.ReductionKernel, except that the name argument is mandatory.
CPU/GPU generic code support¶
-
chainer.cuda.get_array_module(*args)[source]¶
Gets the appropriate array module, numpy or cupy, for the given arguments.
This is almost equivalent to cupy.get_array_module(). The differences are that this function can be used even if CUDA is not available, and that for Variable arguments it returns the array module of their data arrays.
Parameters: args – Values used to determine whether NumPy or CuPy should be used.
Returns: cupy or numpy, based on the types of the arguments.
Return type: module