chainer.Variable

class chainer.Variable(data=None, *, name=None, grad=None, requires_grad=True)[source]

Array with a structure to keep track of computation.

Every variable holds a data array of type either numpy.ndarray or cupy.ndarray.

A variable object holds a data array and a VariableNode object of a computational graph. If the variable is constructed by the user, the node is a root and does not hold any parent. If the variable is constructed by a Function object, the node holds a reference to its parent, called the creator. This reference is used in backpropagation to backtrack the graph.

Users can disable (resp. enable) this chaining behavior by calling no_backprop_mode() (resp. force_backprop_mode()). In the former context, a variable never creates a computational graph, whereas in the latter context, it is forced to create one.

Warning

The volatile argument is no longer supported as of v2. Use chainer.no_backprop_mode() instead.

Parameters:
  • data (numpy.ndarray or cupy.ndarray) – Initial data array.
  • name (str) – Name of the variable.
  • grad (numpy.ndarray or cupy.ndarray) – Initial gradient array.
  • requires_grad (bool) – Boolean indicating whether grad will be set in backward calculation.
Variables:
  • data – Data array of type either numpy.ndarray or cupy.ndarray. If it is None, the variable is left in an uninitialized state.
  • grad – Gradient array.
  • creator – The function that created this variable. It is None if the variable was not created by any function.

Methods

__getitem__(x, slices)[source]

Extract elements from array with specified shape, axes and offsets.

Parameters:
  • x (Variable) – A variable to be sliced.
  • slices (int, slice, Ellipsis, None, integer array-like, boolean array-like or tuple of them) – Index specification: an integer, a slice, an ellipsis, numpy.newaxis, an integer array-like, a boolean array-like, or a tuple of them.
Returns: A Variable object that contains the sliced array of x.
Return type: Variable

Note

When slices includes an integer array, only the types supported by CUDA’s atomicAdd can be used. The supported types are numpy.float32, numpy.int32, numpy.uint32, numpy.uint64 and numpy.ulonglong.

Note

It does not support slices that contain multiple boolean arrays.

Note

See the NumPy documentation for details on indexing.

__len__()[source]

Returns the first dimension of the data array.

Returns: Length of the first dimension of the data array.
Return type: int

__copy__()[source]

addgrad(var)[source]

Accumulates the gradient array from given source variable.

This method adds the gradient of a given variable to the gradient of this variable. The accumulation is done even across the host and different devices. If this variable has uninitialized data/grad arrays, this method initializes them with the shape of the given variable and then accumulates the gradient.

Parameters: var (Variable) – Source variable.
backward(retain_grad=False)[source]

Runs error backpropagation (a.k.a. backprop) from this variable.

On backprop, Function.backward() is called on each Function object appearing in the backward graph starting from this variable. The backward graph is represented by backward references from variable nodes to their creators, and from functions to their input variable nodes. The backprop stops at all root nodes. Some functions set None as the gradient of some inputs, in which case further backprop does not take place at those inputs.

This method uses grad as the initial error array. Users can manually set a gradient array before calling this method. If data contains only one element (i.e., it is a scalar) and grad is None, then this method automatically supplies 1.0 as the initial error. This is useful when starting backprop from a scalar loss value.

Parameters: retain_grad (bool) – If True, the gradient arrays of all intermediate variables are kept. Otherwise, the grad of each intermediate variable is set to None at an appropriate time, which may reduce the maximum memory consumption.

In most cases of training a model, the purpose of backprop is to compute gradients of parameters, not of all variables, and therefore it is recommended to leave this flag set to False.

cleargrad()[source]

Clears the gradient array.

copydata(var)[source]

Copies the data array from given source variable.

This method copies the data array from the given variable to this variable. The copy is done even if the arrays reside on different devices, including across the host and a GPU device. If this variable has an uninitialized data array, this method initializes it with the data array of the given variable. Similarly, if the given variable has an uninitialized data array, this method initializes it with the data array of this variable (self). If both are uninitialized, this method does nothing.

Parameters: var (Variable) – Source variable.
debug_print()[source]

Displays a summary of the stored data and the location of the variable.

reshape(*shape)[source]

Returns a variable of a different shape and the same content.

See also

chainer.functions.reshape() for full documentation.

retain_data()[source]

Lets the corresponding variable node keep the underlying array.

set_creator(gen_func)[source]

Notifies the variable that the given function is its creator.

Parameters: gen_func (Function) – Function object that creates this variable as one of its outputs.

summary()[source]

to_cpu()[source]

Copies the data and gradient arrays to CPU.

to_gpu(device=None)[source]

Copies the data and gradient arrays to specified GPU.

Parameters: device – Target device specifier. If omitted, the current device is used.

transpose(*axes)[source]

Permutes the dimensions of an input variable without copying.

See also

chainer.functions.transpose() for full documentation.

unchain()[source]

Deletes the reference to the creator of this variable.

This method deletes the reference to the creator from the corresponding variable node. Unlike unchain_backward(), it does not backtrack the graph.

This method is equivalent to self.creator = None.

unchain_backward()[source]

Deletes references between variable nodes and functions backward.

After this method completes, intermediate variable nodes and functions that are no longer referenced from anywhere are deallocated by reference-count GC. This variable also deletes the reference to its creator function from its node, i.e. the node becomes a root in the computational graph. As a result, backprop after unchaining stops at this variable. This behavior is useful for implementing truncated BPTT.

zerograd()[source]

Initializes the gradient array with zeros.

Deprecated since version v1.15: Use cleargrad() instead.

__eq__(other)[source]
__ne__(other)[source]
__lt__(other)[source]
__le__(other)[source]
__gt__(other)[source]
__ge__(other)[source]
__nonzero__()[source]
__bool__()[source]

Attributes

creator

Function object that created this variable.

This property has a setter to which None can be set. Setting None to this property is equivalent to calling unchain(); it detaches the variable from the function that created it.

The setter also accepts the original Function object that created this variable. For example, you can set this property to None and later set the original value again.

Note

Setting an unrelated Function object does not raise an error immediately, but the resulting behavior is undefined. Do not set a Function object that did not create this variable.

data
dtype
grad
label

Short text that represents the variable.

name
ndim
node
rank
requires_grad

Indicates whether grad will be set in backward calculation.

shape
size