CUDA utilities

Device, context and memory management on CuPy.

Chainer uses CuPy (with a very thin wrapper) to exploit the speed of GPU computation. The following modules and classes are imported into the cuda module for convenience (refer to this table when reading Chainer’s source code).

imported name             original name
chainer.cuda.cupy         cupy
chainer.cuda.ndarray      cupy.ndarray
chainer.cuda.cupy.cuda    cupy.cuda
chainer.cuda.Device       cupy.cuda.Device
chainer.cuda.Event        cupy.cuda.Event
chainer.cuda.Stream       cupy.cuda.Stream

Chainer replaces CuPy’s default allocator with its memory pool implementation. This allows device memory to be reused across multiple forward/backward computations, and for temporary arrays in consecutive elementwise operations.
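The idea of reusing memory through a pool can be illustrated with a minimal pure-Python sketch. This is not CuPy's or Chainer's actual implementation: ToyMemoryPool, malloc and free are hypothetical names chosen for illustration, and a bytearray stands in for a device allocation. The sketch only shows how a freed block can serve a later request of the same size without going back to the underlying allocator.

```python
from collections import defaultdict


class ToyMemoryPool:
    """Illustrative pool that caches freed blocks by size (hypothetical)."""

    def __init__(self):
        self._free_blocks = defaultdict(list)  # size -> list of idle blocks
        self.raw_allocations = 0  # calls made to the underlying allocator

    def malloc(self, size):
        # Reuse an idle block of the same size when one is available.
        if self._free_blocks[size]:
            return self._free_blocks[size].pop()
        # Otherwise fall back to the (simulated) underlying allocator.
        self.raw_allocations += 1
        return bytearray(size)

    def free(self, block):
        # Keep the block in the pool instead of releasing it.
        self._free_blocks[len(block)].append(block)


pool = ToyMemoryPool()
first = pool.malloc(1024)   # first request: a real allocation
pool.free(first)            # the block goes back to the pool
second = pool.malloc(1024)  # same size: the pooled block is handed out again
```

In this toy model, a loop that repeatedly allocates and frees arrays of the same shapes, as forward/backward passes tend to do, touches the underlying allocator only on the first iteration.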

Devices

chainer.cuda.get_device(*args)

Gets the device from a device object, an ID integer or an array object.

Note

This API is deprecated. Please use get_device_from_id() or get_device_from_array() instead.

This is a convenience utility for selecting the correct device when the types of the arguments are unknown (i.e., it can be used on arrays that may be on either CPU or GPU). The returned device object supports Python’s context management protocol, so it can be used in a with statement.

Parameters:args – Values to specify a GPU device. The first device object, integer or cupy.ndarray object is used to select a device. If it is a device object, it is returned. If it is an integer, the corresponding device is returned. If it is a CuPy array, the device on which this array resides is returned. If none of the arguments is a device object, an integer or a CuPy array, a dummy device object representing CPU is returned.
Returns:Device object specified by given args.
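The selection rule described above can be sketched in pure Python. This is an illustration, not Chainer's source: FakeDevice, FakeGpuArray, DUMMY_CPU_DEVICE and select_device are hypothetical stand-ins for cupy.cuda.Device, cupy.ndarray, the dummy CPU device and get_device(), so that the logic runs without a GPU.

```python
class FakeDevice:
    """Stand-in for cupy.cuda.Device (hypothetical, for illustration)."""

    def __init__(self, device_id):
        self.id = device_id

    def __enter__(self):
        # Supports the with statement, like the real device object.
        return self

    def __exit__(self, *exc_info):
        return False


class FakeGpuArray:
    """Stand-in for cupy.ndarray that remembers the device it resides on."""

    def __init__(self, device):
        self.device = device


# Plays the role of the dummy device object that represents the CPU.
DUMMY_CPU_DEVICE = FakeDevice(-1)


def select_device(*args):
    """Return the device chosen by the first usable argument."""
    for arg in args:
        if isinstance(arg, FakeDevice):
            return arg               # device objects are returned as they are
        if isinstance(arg, int):
            return FakeDevice(arg)   # integers are treated as device IDs
        if isinstance(arg, FakeGpuArray):
            return arg.device        # arrays report the device they reside on
    return DUMMY_CPU_DEVICE          # nothing usable: fall back to the CPU


gpu1 = FakeDevice(1)
# The list is skipped; the array comes before the integer, so it decides.
chosen = select_device([1.0, 2.0], FakeGpuArray(gpu1), 0)

with select_device(3) as dev:
    selected_id = dev.id
```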

See also

See cupy.cuda.Device for device selection that does not rely on arrays.

chainer.cuda.get_device_from_id(device_id)

Gets the device from an ID integer.

Parameters:device_id (int or None) – The ID of the device which this function returns.

chainer.cuda.get_device_from_array(*arrays)

Gets the device from a single CuPy array or a list of CuPy arrays.

The device on which the given CuPy array resides is returned.

Parameters:arrays (cupy.ndarray or list of cupy.ndarray) – A CuPy array whose device this function returns. If a list of cupy.ndarray objects is given, the device of the first CuPy array in the list is returned.

CuPy array allocation and copy

chainer.cuda.copy(array, out=None, out_device=None, stream=None)

Copies a cupy.ndarray object using the default stream.

This function can copy a device array to a destination array on another device.

Parameters:
  • array (cupy.ndarray) – Array to be copied.
  • out (cupy.ndarray) – Destination array. If it is not None, then the out_device argument is ignored.
  • out_device – Destination device specifier. The actual device object is obtained by passing this value to get_device().
  • stream (cupy.cuda.Stream) – CUDA stream.
Returns:

Copied array.

If out is not specified, then the array is allocated on the device specified by the out_device argument.

Return type:

cupy.ndarray

chainer.cuda.to_cpu(array, stream=None)

Copies the given GPU array to the host CPU.

Parameters:
  • array – Array to be sent to CPU.
  • stream (cupy.cuda.Stream) – CUDA stream.
Returns:

Array on CPU.

If the given array is already on CPU, then this function just returns array without performing any copy.

Return type:

numpy.ndarray

chainer.cuda.to_gpu(array, device=None, stream=None)

Copies the given CPU array to the specified device.

Parameters:
  • array – Array to be sent to GPU.
  • device – Device specifier.
  • stream (cupy.cuda.Stream) – CUDA stream. If not None, the copy runs asynchronously.
Returns:

Array on GPU.

If array is already on GPU, then this function just returns array without performing any copy. Note that this function does not copy a cupy.ndarray to the specified device.

Return type:

cupy.ndarray

Kernel definition utilities

chainer.cuda.memoize(for_each_device=False)

Makes a function memoize its result for each argument and device.

This is similar to cupy.memoize(). The difference is that this function can be used in the global scope even if CUDA is not available. In such a case, this function does nothing.

Note

This decorator acts as a dummy if CUDA is not available. It cannot be used for general purpose memoization even if for_each_device is set to False.
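Per-device memoization can be sketched in pure Python as follows. This is an illustration of the caching idea only, not the Chainer or CuPy implementation: toy_memoize, build_kernel and the _state dictionary (a stand-in for the currently active CUDA device) are hypothetical. Unlike the decorator documented here, this sketch always caches and does not turn into a dummy when CUDA is unavailable.

```python
import functools

# Hypothetical stand-in for "the currently active CUDA device".
_state = {"device_id": 0}


def toy_memoize(for_each_device=False):
    """Cache results per argument tuple, and optionally per device too."""

    def decorator(func):
        cache = {}

        @functools.wraps(func)
        def wrapper(*args):
            # Include the device in the key only when requested.
            device_key = _state["device_id"] if for_each_device else None
            key = (device_key, args)
            if key not in cache:
                cache[key] = func(*args)
            return cache[key]

        return wrapper

    return decorator


calls = []  # records every real (non-cached) invocation


@toy_memoize(for_each_device=True)
def build_kernel(name):
    calls.append((_state["device_id"], name))
    return "kernel:%s:gpu%d" % (name, _state["device_id"])


first = build_kernel("axpy")   # computed for device 0
second = build_kernel("axpy")  # served from the cache
_state["device_id"] = 1        # pretend the active device changed
third = build_kernel("axpy")   # computed again, this time for device 1
```

Keying the cache on the device matters for objects such as compiled kernels, which are tied to the device they were built for.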

chainer.cuda.clear_memo()

Clears the memoized results for all functions decorated by memoize.

This function works like cupy.clear_memo() as a counterpart for chainer.cuda.memoize(). It can be used even if CUDA is not available. In such a case, this function does nothing.

chainer.cuda.elementwise(*args, **kwargs)

Creates an elementwise kernel function.

This function uses memoize() to cache the kernel object, i.e. the resulting kernel object is cached for each argument combination and CUDA device.

The arguments are the same as those for cupy.ElementwiseKernel, except that the name argument is mandatory.

chainer.cuda.reduce(*args, **kwargs)

Creates a global reduction kernel function.

This function uses memoize() to cache the resulting kernel object, i.e. the resulting kernel object is cached for each argument combination and CUDA device.

The arguments are the same as those for cupy.ReductionKernel, except that the name argument is mandatory.

CPU/GPU generic code support

chainer.cuda.get_array_module(*args)

Gets the appropriate module, either numpy or cupy.

This is almost equivalent to cupy.get_array_module(). The differences are that this function can be used even if CUDA is not available, and that for Variable arguments it returns the array module of their data arrays.

Parameters:args – Values to determine whether NumPy or CuPy should be used.
Returns:cupy or numpy is returned based on the types of the arguments.
Return type:module
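The dispatch can be sketched in pure Python under stated assumptions. Everything here is a hypothetical stand-in so that the logic runs without NumPy, CuPy or a GPU: fake_numpy and fake_cupy replace the two modules, CpuArray and GpuArray replace numpy.ndarray and cupy.ndarray, and ToyVariable replaces Variable. The rule that any GPU array among the arguments selects cupy is an assumption based on the behavior of cupy.get_array_module().

```python
import types

# Stand-ins for the numpy and cupy modules (hypothetical).
fake_numpy = types.SimpleNamespace(name="numpy")
fake_cupy = types.SimpleNamespace(name="cupy")


class CpuArray:
    """Stand-in for numpy.ndarray."""


class GpuArray:
    """Stand-in for cupy.ndarray."""


class ToyVariable:
    """Stand-in for Variable: wraps a data array."""

    def __init__(self, data):
        self.data = data


def toy_get_array_module(*args):
    """Pick the array module from the types of the arguments."""
    for arg in args:
        # Variable-like arguments are judged by their data arrays.
        if isinstance(arg, ToyVariable):
            arg = arg.data
        # Assumption: one GPU array is enough to select the GPU module.
        if isinstance(arg, GpuArray):
            return fake_cupy
    return fake_numpy


xp_gpu = toy_get_array_module(CpuArray(), ToyVariable(GpuArray()))
xp_cpu = toy_get_array_module(CpuArray(), 3.0)
```

Code written against the returned module (conventionally bound to a name such as xp) can then run unchanged on CPU and GPU arrays.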

cuDNN support

chainer.cuda.set_max_workspace_size(size)

Sets the workspace size for cuDNN.

See the “cuDNN Library User Guide” for details.

Parameters:size – The workspace size for cuDNN.

chainer.cuda.get_max_workspace_size()

Gets the workspace size for cuDNN.

See the “cuDNN Library User Guide” for details.

Returns:The workspace size for cuDNN.
Return type:int