Variable

class chainer.Variable(data, volatile=OFF, name=None, grad=None)

Array with a structure to keep track of computation.

Every variable holds a data array of type either numpy.ndarray or cupy.ndarray.

A Variable object may be constructed in two ways: by the user or by some function. When a variable is created by some function as one of its outputs, the variable holds a reference to that function. This reference is used in error backpropagation (a.k.a. backprop) and in backward unchaining. A variable that does not hold a reference to its creator is called a root variable. A variable is a root if it was created by the user, or if its creator reference has been deleted by unchain_backward().

Users can disable this chaining behavior by setting the volatile flag for the initial variables. When a function gets volatile variables as its inputs, the output variables do not hold references to the function. This acts like unchaining on every function application.

Parameters:
  • data (array) – Initial data array.
  • volatile (Flag) – Volatility flag. Strings (‘on’, ‘off’, or ‘auto’) or boolean values are also accepted.
  • name (str) – Name of the variable.
  • grad (array) – Initial gradient array.
Variables:
  • data – Data array of type either numpy.ndarray or cupy.ndarray.
  • grad – Gradient array.
  • creator – The function that created this variable. It is None if the variable was not created by any function.
  • volatile – Ternary Flag object. If 'ON', the variable does not keep track of any function applications. See Flag for details of ternary flags.
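
As a rough illustration of the chaining behavior described above, the following pure-Python sketch mimics how an output can hold a reference to its creator, what makes a variable a root, and how a volatile input suppresses chaining. ToyVariable and ToyDouble are hypothetical names invented for this sketch. It does not import Chainer and is not how chainer.Variable is implemented.

```python
class ToyVariable:
    """Hypothetical stand-in for a variable; not chainer.Variable."""

    def __init__(self, data, volatile=False):
        self.data = data
        self.volatile = volatile
        self.creator = None  # None means this is a root variable

    @property
    def is_root(self):
        return self.creator is None


class ToyDouble:
    """Hypothetical function that doubles its input."""

    def __call__(self, x):
        y = ToyVariable(x.data * 2, volatile=x.volatile)
        if not x.volatile:
            # Remember the input and register this function as the creator.
            self.inputs = (x,)
            y.creator = self
        return y


x = ToyVariable(3.0)                               # created by the user: root
y = ToyDouble()(x)                                 # created by a function: chained
v = ToyDouble()(ToyVariable(3.0, volatile=True))   # volatile input: not chained

print(x.is_root, y.is_root, v.is_root)  # prints: True False True
```

In this toy model, a volatile input yields an output with no creator reference, which is the sense in which volatility acts like unchaining on every function application.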
__abs__()

Element-wise absolute value.

Returns: Output variable.
Return type: Variable
__add__(rhs)

Element-wise addition.

Returns: Output variable.
Return type: Variable
__div__(rhs)

Element-wise division.

Returns: Output variable.
Return type: Variable
__getitem__(x, slices)

Extracts elements from an array with the specified shape, axes, and offsets.

Parameters:
  • x (Variable) – A variable to be sliced.
  • slices (int, slice, Ellipsis, None, integer array-like, boolean array-like or tuple of them) – An integer, a slice, an ellipsis, numpy.newaxis, an integer array-like, a boolean array-like, or a tuple of them.
Returns: A Variable object which contains the sliced array of x.
Return type: Variable

Note

When an integer array is included in slices, only the types supported by CUDA’s atomicAdd are supported. These are numpy.float32, numpy.int32, numpy.uint32, numpy.uint64 and numpy.ulonglong.

Note

It does not support slices that contain multiple boolean arrays.

Note

See the NumPy documentation for details of indexing.

__len__()

Returns the number of elements of the data array.

Returns: Number of elements of the data array.
Return type: int
__matmul__(rhs)

Matrix multiplication.

Returns: Output variable.
Return type: Variable
__mul__(rhs)

Element-wise multiplication.

Returns: Output variable.
Return type: Variable
__neg__()

Element-wise negation.

Returns: Output variable.
Return type: Variable
__pow__(rhs)

Element-wise power function.

Returns: Output variable.
Return type: Variable
__radd__(rhs)

Element-wise addition.

Returns: Output variable.
Return type: Variable
__rdiv__(rhs)

Element-wise division.

Returns: Output variable.
Return type: Variable
__rmatmul__(rhs)

Matrix multiplication.

Returns: Output variable.
Return type: Variable
__rmul__(rhs)

Element-wise multiplication.

Returns: Output variable.
Return type: Variable
__rpow__(rhs)

Element-wise power function.

Returns: Output variable.
Return type: Variable
__rsub__(rhs)

Element-wise subtraction.

Returns: Output variable.
Return type: Variable
__rtruediv__(rhs)

Element-wise division.

Returns: Output variable.
Return type: Variable
__sub__(rhs)

Element-wise subtraction.

Returns: Output variable.
Return type: Variable
__truediv__(rhs)

Element-wise division.

Returns: Output variable.
Return type: Variable
addgrad(var)

Accumulates the gradient array from the given source variable.

This method just runs self.grad += var.grad, except that the accumulation also works across the host and different devices.

Parameters: var (Variable) – Source variable.
backward(retain_grad=False)

Runs error backpropagation (a.k.a. backprop) from this variable.

On backprop, Function.backward() is called on each Function object appearing in the backward graph starting from this variable. The backward graph is represented by backward references from variables to their creators, and from functions to their inputs. The backprop stops at all root variables. Some functions set None as the gradient of some inputs; backprop does not proceed further through such input variables.

This method uses grad as the initial error array. Users can manually set a gradient array before calling this method. If data contains only one element (i.e., it is scalar) and grad is None, then this method automatically supplies 1.0 as the initial error. This is useful when starting backprop from some scalar loss value.

Parameters: retain_grad (bool) – If True, the gradient arrays of all intermediate variables are kept. Otherwise, the grad of each intermediate variable is set to None at an appropriate time, which may reduce the maximum memory consumption.

In most training scenarios, the purpose of backprop is to compute gradients of parameters, not of variables, so it is recommended to set this flag to False.
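
The traversal described for backward() can be sketched in pure Python. The sketch below follows creator references from a scalar output back to the root variables, starts from 1.0 when no gradient has been set, and discards intermediate gradients unless retain_grad is True. ToyVariable, ToyMul, and toy_backward are hypothetical names for this illustration only. It works on plain floats and should not be read as Chainer's actual implementation.

```python
class ToyVariable:
    """Hypothetical stand-in for a variable holding a Python float."""

    def __init__(self, data):
        self.data = data
        self.grad = None
        self.creator = None  # None marks a root variable


class ToyMul:
    """Hypothetical function computing a * b."""

    def __call__(self, a, b):
        self.inputs = (a, b)
        out = ToyVariable(a.data * b.data)
        out.creator = self
        return out

    def backward(self, gy):
        a, b = self.inputs
        return gy * b.data, gy * a.data


def toy_backward(var, retain_grad=False):
    if var.grad is None:
        var.grad = 1.0  # scalar start: use 1.0 as the initial error

    # Post-order walk: every variable is listed after all of its inputs.
    order, seen = [], set()

    def visit(v):
        if id(v) in seen:
            return
        seen.add(id(v))
        if v.creator is not None:
            for x in v.creator.inputs:
                visit(x)
        order.append(v)

    visit(var)
    # Reversed, each variable is processed after everything that consumes it.
    for v in reversed(order):
        if v.creator is None or v.grad is None:
            continue  # root variable (or no gradient): backprop stops here
        for x, gx in zip(v.creator.inputs, v.creator.backward(v.grad)):
            if gx is not None:
                x.grad = gx if x.grad is None else x.grad + gx
        if not retain_grad and v is not var:
            v.grad = None  # drop intermediate gradients to save memory


x, w = ToyVariable(2.0), ToyVariable(3.0)
h = ToyMul()(x, w)   # intermediate variable, h = 6.0
y = ToyMul()(h, h)   # scalar "loss", y = h * h = 36.0
toy_backward(y)
print(x.grad, w.grad, h.grad)  # prints: 36.0 24.0 None
```

Calling toy_backward(y, retain_grad=True) on a freshly built graph would leave h.grad at 12.0 instead of None, at the cost of keeping that intermediate value alive.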

cleargrad()

Clears the gradient array.

copydata(var)

Copies the data array from the given source variable.

This method just copies the data attribute from the given variable to this variable, except that the copy also works across the host and different devices.

Parameters: var (Variable) – Source variable.
debug_print()

Displays a summary of the stored data and location of the variable.

label

Short text that represents the variable.

reshape(*shape)

Returns a variable of a different shape and the same content.

See also

chainer.functions.reshape() for full documentation.

set_creator(gen_func)

Notifies the variable that the given function is its creator.

Parameters: gen_func (Function) – Function object that creates this variable as one of its outputs.
to_cpu()

Copies the data and gradient arrays to CPU.

to_gpu(device=None)

Copies the data and gradient arrays to the specified GPU.

Parameters: device – Target device specifier. If omitted, the current device is used.
transpose(*axes)

Permutes the dimensions of an input variable without copying.

See also

chainer.functions.transpose() for full documentation.

unchain_backward()

Deletes references between variables and functions backward.

After this method completes, intermediate variables and functions that are not referenced from anywhere are deallocated by reference-counting GC. This variable also deletes the reference to its creator function, i.e., it becomes a root in the computation graph. As a result, backprop after unchaining stops at this variable. This behavior is useful for implementing truncated BPTT.
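
To make the truncated BPTT use case concrete, here is a small pure-Python sketch in which a recurrent state is unchained every three steps, so the backward graph never grows past three function applications. ToyVariable, ToyStep, and chain_length are hypothetical names for this sketch, and the unchaining logic is a simplified model rather than Chainer's own code.

```python
class ToyVariable:
    """Hypothetical stand-in for a variable; not chainer.Variable."""

    def __init__(self, data):
        self.data = data
        self.creator = None  # None marks a root variable

    def unchain_backward(self):
        # Walk backward, cutting variable -> creator and function -> input links.
        stack = [self]
        while stack:
            v = stack.pop()
            func, v.creator = v.creator, None
            if func is not None:
                stack.extend(func.inputs)
                func.inputs = ()


class ToyStep:
    """Hypothetical recurrent step: adds 1 to the incoming state."""

    def __call__(self, h):
        self.inputs = (h,)
        out = ToyVariable(h.data + 1)
        out.creator = self
        return out


def chain_length(v):
    """Counts function applications reachable backward from v."""
    n = 0
    while v.creator is not None:
        v = v.creator.inputs[0]
        n += 1
    return n


h = ToyVariable(0)
lengths = []
for t in range(1, 7):
    h = ToyStep()(h)
    if t % 3 == 0:
        # A real training loop would run backward and update parameters here.
        lengths.append(chain_length(h))
        h.unchain_backward()  # h becomes a root; older history is cut off

print(lengths, h.creator)  # prints: [3, 3] None
```

The state value itself is preserved across the cut (h.data is 6 at the end), only the backward references are dropped.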

zerograd()

Initializes the gradient array with zeros.

Deprecated since version v1.15: Use cleargrad() instead.