CUDA utilities
Device, context and memory management on CuPy.
Chainer uses CuPy (through a very thin wrapper) to exploit the speed of GPU
computation. The following modules and classes are imported into the cuda
module for convenience (refer to this table when reading Chainer’s source
code).
| imported name | original name |
|---|---|
| chainer.cuda.cupy | cupy |
| chainer.cuda.ndarray | cupy.ndarray |
| chainer.cuda.cupy.cuda | cupy.cuda |
| chainer.cuda.Device | cupy.cuda.Device |
| chainer.cuda.Event | cupy.cuda.Event |
| chainer.cuda.Stream | cupy.cuda.Stream |
Chainer replaces CuPy’s default allocator with a memory pool implementation. This allows device memory to be reused across multiple forward/backward computations, and for the temporary arrays created by consecutive elementwise operations.
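The reuse idea behind a memory pool can be illustrated with a toy sketch. This is not CuPy's or Chainer's implementation: the class `ToyMemoryPool` and its attributes are hypothetical, and a `bytearray` stands in for a device allocation.

```python
from collections import defaultdict


class ToyMemoryPool:
    """Toy pool that recycles freed blocks by size (illustration only)."""

    def __init__(self):
        self._free = defaultdict(list)  # block size -> list of free blocks
        self.fresh_allocations = 0      # how often the "real" allocator ran

    def malloc(self, size):
        free_blocks = self._free[size]
        if free_blocks:
            return free_blocks.pop()    # reuse a previously freed block
        self.fresh_allocations += 1
        return bytearray(size)          # stand-in for a device allocation

    def free(self, block):
        # Keep the block for later reuse instead of releasing it.
        self._free[len(block)].append(block)


pool = ToyMemoryPool()
for _ in range(3):                      # e.g. three forward/backward passes
    buf = pool.malloc(1024)
    pool.free(buf)
```

In this sketch only the first iteration reaches the underlying allocator; later iterations are served from the pool, which is the effect the paragraph above describes.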
Devices
-
chainer.cuda.get_device(*args)
Gets the device from a device object, an ID integer or an array object.
Note
This API is deprecated. Please use chainer.cuda.get_device_from_id() or chainer.cuda.get_device_from_array() instead.
This is a convenient utility to select the correct device when the type of
arg is unknown (i.e., one can use this function on arrays that may be on CPU or GPU). The returned device object supports Python's context management protocol, so it can be used in a with statement.
Parameters: args – Values to specify a GPU device. The first device object, integer or cupy.ndarray object is used to select a device. If it is a device object, it is returned. If it is an integer, the corresponding device is returned. If it is a CuPy array, the device on which this array resides is returned. If none of the arguments is a device object, an integer or a CuPy array, a dummy device object representing the CPU is returned.
Returns: Device object specified by given args.
See also
See cupy.cuda.Device for device selection that is not based on arrays.
-
chainer.cuda.get_device_from_id(device_id)
Gets the device from an ID integer.
Parameters: device_id (int or None) – The ID of the device which this function returns.
-
chainer.cuda.get_device_from_array(*arrays)
Gets the device from a list of CuPy arrays or a single CuPy array.
The device on which the given CuPy array resides is returned.
Parameters: arrays (cupy.ndarray or list of cupy.ndarray) – A CuPy array whose device this function returns. If a list of cupy.ndarray objects is given, the device of the first array in the list is returned.
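A short usage sketch may help here. It assumes NumPy is installed; the helper name `describe_device` and the CPU-only fallback used when Chainer cannot be imported are illustrative assumptions, not part of the API.

```python
import numpy as np

try:
    from chainer import cuda
    gpu_available = cuda.available
except ImportError:  # Chainer is not installed: behave as CPU-only
    cuda = None
    gpu_available = False


def describe_device(x):
    """Return 'gpu:<id>' for CuPy arrays and 'cpu' otherwise (hypothetical helper)."""
    if gpu_available and isinstance(x, cuda.ndarray):
        dev = cuda.get_device_from_array(x)  # device the array resides on
        return 'gpu:%d' % dev.id
    return 'cpu'


print(describe_device(np.zeros(3)))
```

When a GPU is present, the device returned by `get_device_from_id()` or `get_device_from_array()` can also be used in a `with` statement to make it the current device for the enclosed block.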
CuPy array allocation and copy
Note
As of v1.3.0, the following array construction wrappers are marked as
deprecated. Use the corresponding functions of the cupy module
instead. The main difference between them is that the default dtype changes
from float32 to float64.
| Deprecated functions | Recommended functions |
|---|---|
| chainer.cuda.empty | cupy.empty() |
| chainer.cuda.empty_like | cupy.empty_like() |
| chainer.cuda.zeros | cupy.zeros() |
| chainer.cuda.zeros_like | cupy.zeros_like() |
| chainer.cuda.ones | cupy.ones() |
| chainer.cuda.ones_like | cupy.ones_like() |
| chainer.cuda.full | cupy.full() |
| chainer.cuda.full_like | cupy.full_like() |
-
chainer.cuda.copy(array, out=None, out_device=None, stream=None)
Copies a cupy.ndarray object using the default stream.
This function can copy the device array to the destination array on another device.
Parameters:
- array (cupy.ndarray) – Array to be copied.
- out (cupy.ndarray) – Destination array. If it is not None, then the out_device argument is ignored.
- out_device – Destination device specifier. The actual device object is obtained by passing this value to get_device().
- stream (cupy.cuda.Stream) – CUDA stream.
Returns: Copied array. If out is not specified, then the array is allocated on the device specified by the out_device argument.
-
chainer.cuda.to_cpu(array, stream=None)
Copies the given GPU array to host CPU.
Parameters:
- array – Array to be sent to CPU.
- stream (cupy.cuda.Stream) – CUDA stream.
Returns: Array on CPU. If the given array is already on CPU, then this function just returns array without performing any copy.
-
chainer.cuda.to_gpu(array, device=None, stream=None)
Copies the given CPU array to the specified device.
Parameters:
- array – Array to be sent to GPU.
- device – Device specifier.
- stream (cupy.cuda.Stream) – CUDA stream. If not None, the copy runs asynchronously.
Returns: Array on GPU. If array is already on GPU, then this function just returns array without performing any copy. Note that this function does not copy a cupy.ndarray to the specified device.
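The two transfer functions are typically used as a pair. The sketch below assumes NumPy is installed and that device 0 is the GPU to use; the helper `round_trip` and the CPU-only fallback are hypothetical and only serve the illustration.

```python
import numpy as np

try:
    from chainer import cuda
    gpu_available = cuda.available
except ImportError:  # Chainer is not installed: behave as CPU-only
    cuda = None
    gpu_available = False


def round_trip(x_cpu):
    """Send a NumPy array to GPU 0 and back when possible (hypothetical helper)."""
    if not gpu_available:
        return x_cpu                      # nothing to transfer on a CPU-only setup
    x_gpu = cuda.to_gpu(x_cpu, device=0)  # host -> device copy
    return cuda.to_cpu(x_gpu)             # device -> host copy


result = round_trip(np.arange(4, dtype=np.float32))
```

Because both functions return their input unchanged when it already lives on the target side, code like this can be written without checking the array type first.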
Kernel definition utilities
-
chainer.cuda.memoize(for_each_device=False)
Makes a function memoizing the result for each argument and device.
This is a variant of cupy.memoize(). The difference is that this function can be used in the global scope even if CUDA is not available. In that case, this function does nothing.
Note
This decorator acts as a dummy if CUDA is not available. It cannot be used for general-purpose memoization even if for_each_device is set to False.
-
chainer.cuda.clear_memo()
Clears the memoized results for all functions decorated by memoize.
This function works like cupy.clear_memo() as a counterpart for chainer.cuda.memoize(). It can be used even if CUDA is not available. In such a case, this function does nothing.
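The idea of caching per argument and per device, together with a global clear, can be sketched in plain Python. This is only an illustration of the concept, not Chainer's or CuPy's code: `toy_memoize`, `toy_clear_memo` and the `_state` dictionary that stands in for the current CUDA device ID are all hypothetical.

```python
import functools

_memos = []               # every cache created by the decorator
_state = {'device': 0}    # hypothetical stand-in for the current device ID


def toy_memoize(for_each_device=False):
    """Cache results per argument tuple, optionally per device (illustration only)."""
    def decorator(f):
        cache = {}
        _memos.append(cache)

        @functools.wraps(f)
        def wrapper(*args):
            device = _state['device'] if for_each_device else None
            key = (device, args)
            if key not in cache:
                cache[key] = f(*args)
            return cache[key]
        return wrapper
    return decorator


def toy_clear_memo():
    """Drop the cached results of all decorated functions."""
    for cache in _memos:
        cache.clear()


calls = []


@toy_memoize(for_each_device=True)
def build_kernel(name):
    calls.append(name)            # record each real (uncached) call
    return 'kernel:' + name


build_kernel('add')
build_kernel('add')               # served from the cache
_state['device'] = 1
build_kernel('add')               # different device, so computed again
toy_clear_memo()
build_kernel('add')               # cache was cleared, so computed again
```

Keying the cache on the device matters for objects such as compiled kernels, which are tied to the device they were built for.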
-
chainer.cuda.elementwise()
Creates an elementwise kernel function.
This function uses memoize() to cache the kernel object, i.e. the resulting kernel object is cached for each argument combination and CUDA device.
The arguments are the same as those for cupy.ElementwiseKernel, except that the name argument is mandatory.
-
chainer.cuda.reduce()
Creates a global reduction kernel function.
This function uses memoize() to cache the resulting kernel object, i.e. the resulting kernel object is cached for each argument combination and CUDA device.
The arguments are the same as those for cupy.ReductionKernel, except that the name argument is mandatory.
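As a hedged example of the elementwise helper, the sketch below defines a squared-difference kernel following the argument order of cupy.ElementwiseKernel (input parameters, output parameters, operation, name). It assumes NumPy is installed; the helper `squared_diff` and its NumPy fallback are hypothetical, and the GPU branch only runs when CUDA is available.

```python
import numpy as np

try:
    from chainer import cuda
    gpu_available = cuda.available
except ImportError:  # Chainer is not installed: behave as CPU-only
    cuda = None
    gpu_available = False


def squared_diff(x, y):
    """Elementwise (x - y)^2 on GPU via a cached kernel, NumPy otherwise."""
    if gpu_available and isinstance(x, cuda.ndarray):
        kernel = cuda.elementwise(
            'T x, T y',                 # input parameters
            'T z',                      # output parameters
            'z = (x - y) * (x - y)',    # per-element operation
            'squared_diff')             # kernel name (mandatory here)
        return kernel(x, y)
    return (x - y) ** 2                 # plain NumPy path


out = squared_diff(np.array([3.0, 5.0]), np.array([1.0, 5.0]))
```

Since the kernel object is cached for each argument combination and device, calling `cuda.elementwise` inside the function on every invocation does not rebuild the kernel each time.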
CPU/GPU generic code support
-
chainer.cuda.get_array_module(*args)
Gets the appropriate module, numpy or cupy.
This is almost equivalent to cupy.get_array_module(). The differences are that this function can be used even if CUDA is not available, and that for Variable arguments it returns the array module of their data arrays.
Parameters: args – Values to determine whether NumPy or CuPy should be used.
Returns: cupy or numpy is returned based on the types of the arguments.
Return type: module
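A common pattern is to fetch the module once and write the rest of the function against it. The sketch below assumes NumPy is installed; the function `softplus` is only an illustration, and the NumPy-only fallback for `get_array_module` is a hypothetical stand-in used when Chainer cannot be imported.

```python
import numpy as np

try:
    from chainer import cuda
    get_array_module = cuda.get_array_module
except ImportError:  # Chainer is not installed: always use NumPy
    def get_array_module(*args):
        return np


def softplus(x):
    """Compute log(1 + exp(x)) for a NumPy or CuPy array."""
    xp = get_array_module(x)   # numpy for CPU arrays, cupy for GPU arrays
    # Numerically stable form: max(x, 0) + log1p(exp(-|x|))
    return xp.maximum(x, 0) + xp.log1p(xp.exp(-abs(x)))


y = softplus(np.zeros(2))
```

The same `softplus` body runs unchanged on a cupy.ndarray, because every call goes through `xp` rather than a hard-coded module.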