CUDA utilities¶
Device, context and memory management on CuPy.
Chainer uses CuPy (with a very thin wrapper) to exploit the speed of GPU computation. The following modules and classes are imported into the cuda module for convenience (refer to this table when reading Chainer's source code).
imported name | original name
---|---
chainer.cuda.cupy | cupy
chainer.cuda.ndarray | cupy.ndarray
chainer.cuda.cupy.cuda | cupy.cuda
chainer.cuda.Device | cupy.cuda.Device
chainer.cuda.Event | cupy.cuda.Event
chainer.cuda.Stream | cupy.cuda.Stream
Chainer replaces the default allocator of CuPy with its own memory pool implementation. This makes it possible to reuse device memory across multiple forward/backward computations, and across temporary arrays created by consecutive elementwise operations.
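The benefit of pooling can be illustrated with a minimal, hypothetical sketch. This is not Chainer's or CuPy's actual pool (the real one is considerably more sophisticated); it only shows why reusing released blocks avoids repeated device allocations:

```python
class SimpleMemoryPool:
    """Toy size-keyed memory pool (illustration only, not the real CuPy pool)."""

    def __init__(self):
        self._free = {}            # size -> list of released blocks
        self.n_device_allocs = 0   # counts how often we would hit cudaMalloc

    def malloc(self, size):
        blocks = self._free.get(size)
        if blocks:
            return blocks.pop()    # reuse a previously released block
        self.n_device_allocs += 1  # the real pool would call cudaMalloc here
        return bytearray(size)     # stand-in for a device memory block

    def free(self, block):
        # Returning a block to the pool instead of releasing it to the device
        self._free.setdefault(len(block), []).append(block)

pool = SimpleMemoryPool()
a = pool.malloc(1024)
pool.free(a)
b = pool.malloc(1024)   # same size requested again: the block is reused
```

After the second `malloc`, `n_device_allocs` is still 1, which is the effect the memory pool exploits across repeated forward/backward passes.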
Devices¶
-
chainer.cuda.get_device(*args)[source]¶
Gets the device from a device object, an ID integer or an array object.
Note
This API is deprecated. Please use cupy.cuda.get_device_from_id() or cupy.cuda.get_device_from_array() instead.
This is a convenient utility for selecting the correct device when the type of arg is unknown (i.e., it can be used on arrays that may live on the CPU or on a GPU). The returned device object supports Python's context management protocol, so it can be used in a with statement.
Parameters: args – Values that specify a GPU device. The first device object, integer, or cupy.ndarray among the arguments is used to select a device. If it is a device object, it is returned as-is. If it is an integer, the corresponding device is returned. If it is a CuPy array, the device on which the array resides is returned. If none of the arguments is an integer or a CuPy array, a dummy device object representing the CPU is returned.
Returns: Device object specified by the given args.
See also
See cupy.cuda.Device for selecting a device without using arrays.
-
chainer.cuda.get_device_from_id(device_id)[source]¶
Gets the device from an ID integer.
Parameters: device_id (int or None) – The ID of the device that this function returns.
-
chainer.cuda.get_device_from_array(*arrays)[source]¶
Gets the device from a CuPy array or a list of CuPy arrays.
The device on which the given CuPy array resides is returned.
Parameters: arrays (cupy.ndarray or list of cupy.ndarray) – CuPy array(s) whose device this function returns. If a list of cupy.ndarray objects is given, the device of the first array in the list is returned.
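The selection rule shared by these functions can be sketched in pure Python with stand-in types. Here `GPUArray` and the `-1` CPU marker are hypothetical illustrations; the real functions operate on cupy.ndarray and return cupy.cuda.Device (or a dummy CPU device) objects:

```python
class GPUArray:
    """Stand-in for cupy.ndarray; records the device it resides on."""
    def __init__(self, device_id):
        self.device_id = device_id

CPU_DEVICE_ID = -1   # hypothetical marker for the dummy CPU device

def pick_device_id(*args):
    # Mirror of the documented rule: the first integer or GPU array
    # among the arguments decides the device; otherwise fall back to CPU.
    for arg in args:
        if isinstance(arg, int):
            return arg                 # device selected directly by ID
        if isinstance(arg, GPUArray):
            return arg.device_id       # device the array resides on
    return CPU_DEVICE_ID               # nothing GPU-related: CPU dummy
```

Only the first matching argument counts, which is why mixing CPU and GPU arrays in one call selects the device of the first GPU array.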
CuPy array allocation and copy¶
Note
As of v1.3.0, the following array construction wrappers are marked as deprecated. Use the corresponding functions of the cupy module instead. The main difference is that the default dtype changes from float32 to float64.
Deprecated functions | Recommended functions
---|---
chainer.cuda.empty | cupy.empty()
chainer.cuda.empty_like | cupy.empty_like()
chainer.cuda.zeros | cupy.zeros()
chainer.cuda.zeros_like | cupy.zeros_like()
chainer.cuda.ones | cupy.ones()
chainer.cuda.ones_like | cupy.ones_like()
chainer.cuda.full | cupy.full()
chainer.cuda.full_like | cupy.full_like()
-
chainer.cuda.copy(array, out=None, out_device=None, stream=None)[source]¶
Copies a cupy.ndarray object using the default stream.
This function can copy a device array to a destination array on another device.
Parameters:
- array (cupy.ndarray) – Array to be copied.
- out (cupy.ndarray) – Destination array. If it is not None, the out_device argument is ignored.
- out_device – Destination device specifier. The actual device object is obtained by passing this value to get_device().
- stream (cupy.cuda.Stream) – CUDA stream.
Returns: Copied array. If out is not specified, the array is allocated on the device specified by the out_device argument.
Return type: cupy.ndarray
-
chainer.cuda.to_cpu(array, stream=None)[source]¶
Copies the given GPU array to the host CPU.
Parameters:
- array – Array to be sent to the CPU.
- stream (cupy.cuda.Stream) – CUDA stream.
Returns: Array on the CPU. If the given array is already on the CPU, this function just returns array without performing any copy.
Return type: numpy.ndarray
-
chainer.cuda.to_gpu(array, device=None, stream=None)[source]¶
Copies the given CPU array to the specified device.
Parameters:
- array – Array to be sent to the GPU.
- device – Device specifier.
- stream (cupy.cuda.Stream) – CUDA stream. If not None, the copy runs asynchronously.
Returns: Array on the GPU. If array is already on a GPU, this function just returns array without performing any copy. Note that this function does not copy a cupy.ndarray into the specified device.
Return type: cupy.ndarray
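The no-copy short-circuit described for both transfer functions is essentially a type check on the input. A hypothetical sketch, with `HostArray` and `DeviceArray` standing in for numpy.ndarray and cupy.ndarray (the real functions use cupy.asnumpy/cupy.asarray for the actual transfers):

```python
class HostArray:
    """Stand-in for numpy.ndarray (memory on the CPU)."""

class DeviceArray:
    """Stand-in for cupy.ndarray (memory on a GPU)."""

def to_cpu(array):
    if isinstance(array, HostArray):
        return array          # already on CPU: returned as-is, no copy
    return HostArray()        # real code would call cupy.asnumpy(array)

def to_gpu(array):
    if isinstance(array, DeviceArray):
        return array          # already on GPU: returned as-is, no copy
    return DeviceArray()      # real code would call cupy.asarray(array)
```

Because the same object comes back when no transfer is needed, calling these functions unconditionally in generic code is cheap.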
Kernel definition utilities¶
-
chainer.cuda.memoize(for_each_device=False)[source]¶
Makes a function memoize its result for each combination of arguments and device.
This is similar to cupy.memoize(). The difference is that this function can be used in the global scope even if CUDA is not available, in which case it does nothing.
Note
This decorator acts as a dummy if CUDA is not available. It cannot be used for general-purpose memoization even if for_each_device is set to False.
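The degrade-to-a-dummy behavior can be sketched in pure Python. Everything here is an assumed illustration of the documented semantics, not Chainer's actual code; `cuda_available` and `_current_device_id` are hypothetical names:

```python
import functools

cuda_available = False   # assumed flag; the real module detects CuPy at import

def memoize(for_each_device=False):
    """Sketch of the documented behavior: cache per (device, args),
    but degrade to a no-op decorator when CUDA is unavailable."""
    def decorator(f):
        if not cuda_available:
            return f                 # dummy mode: no caching at all
        cache = {}
        @functools.wraps(f)
        def wrapper(*args):
            # Key on the current device only when requested
            # (_current_device_id is a hypothetical helper).
            device_id = _current_device_id() if for_each_device else None
            key = (device_id, args)
            if key not in cache:
                cache[key] = f(*args)
            return cache[key]
        return wrapper
    return decorator

@memoize()
def square(x):
    return x * x
```

With `cuda_available` False, `square` is returned unchanged, which is why the decorator is safe to apply at module scope on CUDA-less machines but useless for general-purpose memoization there.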
-
chainer.cuda.clear_memo()[source]¶
Clears the memoized results for all functions decorated by memoize.
This function works like cupy.clear_memo() as a counterpart for chainer.cuda.memoize(). It can be used even if CUDA is not available, in which case it does nothing.
-
chainer.cuda.elementwise()[source]¶
Creates an elementwise kernel function.
This function uses memoize() to cache the kernel object, i.e. the resulting kernel object is cached for each argument combination and CUDA device.
The arguments are the same as those for cupy.ElementwiseKernel, except that the name argument is mandatory.
-
chainer.cuda.reduce()[source]¶
Creates a global reduction kernel function.
This function uses memoize() to cache the resulting kernel object, i.e. the resulting kernel object is cached for each argument combination and CUDA device.
The arguments are the same as those for cupy.ReductionKernel, except that the name argument is mandatory.
CPU/GPU generic code support¶
-
chainer.cuda.get_array_module(*args)[source]¶
Gets the appropriate array module, numpy or cupy, for the given arguments.
This is almost equivalent to cupy.get_array_module(). The differences are that this function can be used even if CUDA is not available, and that for Variable arguments it returns the array module of their data arrays.
Parameters: args – Values used to determine whether NumPy or CuPy should be used.
Returns: cupy or numpy, based on the types of the arguments.
Return type: module