# chainer.FunctionNode

class chainer.FunctionNode

Function node of the computational graph.

FunctionNode is a class representing a node in a computational graph. The node corresponds to an application of a differentiable function to input variables.

When a differentiable function is applied to `Variable` objects, it creates an instance of a FunctionNode implementation and calls its `apply()` method. The `apply()` method does the following three things.

1. Adding an edge from the function node to the variable node corresponding to each input. The node of each input is extracted by `Variable.node`.

2. Computing the output arrays of the function.

3. Creating a `Variable` object for each output array and adding an edge from the node of the variable to the function node.

The output variables are then returned.
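The three steps above can be mimicked in a few lines of plain Python. This is only a toy model, not Chainer's implementation: `ToyVariableNode`, `ToyVariable`, and `ToyDouble` are hypothetical names, arrays are replaced by plain floats, and the back-references are stored as ordinary attributes (`inputs` and an assumed `creator` field).

```python
class ToyVariableNode:
    """Hypothetical stand-in for a variable node; not Chainer's VariableNode."""

    def __init__(self):
        self.creator = None  # function node that produced this variable, if any


class ToyVariable:
    """Hypothetical stand-in for Variable: a value plus its graph node."""

    def __init__(self, data):
        self.data = data
        self.node = ToyVariableNode()


class ToyDouble:
    """Toy function node computing 2 * x for a single float input."""

    def __init__(self):
        self.inputs = None

    def forward(self, inputs):
        x, = inputs
        return 2.0 * x,  # one-element tuple

    def apply(self, inputs):
        # 1. Edge from the function node to each input's variable node.
        self.inputs = tuple(x.node for x in inputs)
        # 2. Compute the output values.
        out_data = self.forward(tuple(x.data for x in inputs))
        # 3. Wrap each output and add an edge from its node to this function node.
        outputs = tuple(ToyVariable(d) for d in out_data)
        for y in outputs:
            y.node.creator = self
        return outputs


x = ToyVariable(1.5)
f = ToyDouble()
y = f.apply((x,))[0]
```

After this runs, `y.node.creator` is `f` and `f.inputs[0]` is `x.node`, which is the same chain of backward references described in the steps.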

Example

Let `x` be an instance of `Variable` and `f` be an instance of `FunctionNode` taking only one argument. Then the following code

```
>>> import numpy, chainer
>>> x = chainer.Variable(numpy.zeros(10))
>>> f = chainer.functions.math.identity.Identity()
>>> y = f.apply((x,))[0]
```

computes a new variable `y` and creates backward references. The backward references are actually set as per the following diagram:

```
x.node <--- f <--- y.node
```

If an application of another function `g` occurs as

```
>>> g = chainer.functions.math.identity.Identity()
>>> z = g.apply((x,))[0]
```

then the graph grows with a branch:

```
         |--- f <--- y.node
x.node <-+
         |--- g <--- z.node
```

Note that the branching is handled correctly during backward computation, i.e. the gradients from `f` and `g` are accumulated into the gradient of `x`.
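To make the accumulation concrete, here is a hand-computed toy example in plain Python with no Chainer involved. Instead of two identity functions, it assumes two hypothetical functions, f(x) = 2x and g(x) = 3x, applied to the same input, so that each branch's contribution is visible. An upstream gradient of 1 is assumed to arrive at both outputs.

```python
def f_backward(gy):
    # d(2x)/dx = 2
    return 2.0 * gy


def g_backward(gz):
    # d(3x)/dx = 3
    return 3.0 * gz


# x.node feeds both f and g, so its gradient is the sum of both contributions.
grad_x = 0.0
for backward_fn, upstream in ((f_backward, 1.0), (g_backward, 1.0)):
    grad_x += backward_fn(upstream)

print(grad_x)  # 5.0
```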

Every function-node implementation should provide `forward()` and `backward()`. Instead of overriding `forward()`, one can also implement `forward_cpu()` and `forward_gpu()` when the implementations for CPU and GPU arrays are totally different.

Note that the input and output variables are inaccessible from `backward()` by default. If `backward()` needs access to these variables, the `forward()` method (or its CPU/GPU variants) has to call `retain_inputs()` and `retain_outputs()` appropriately. The retained input/output variables can then be accessed from `backward()` by calling `get_retained_inputs()` and `get_retained_outputs()`.
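The retain-then-retrieve pattern can be sketched without Chainer. The `ToyNode` base class below is a hypothetical imitation that only borrows the method names `retain_inputs()` and `get_retained_inputs()`; it works on plain floats and its simplified `backward()` takes only the output gradients, so it should be read as an illustration of the control flow, not as the library's behavior.

```python
class ToyNode:
    """Hypothetical base class mimicking the retain/get pattern; not Chainer code."""

    def apply(self, inputs):
        self._inputs = inputs
        self._retained = ()       # nothing is kept unless forward() asks for it
        outputs = self.forward(inputs)
        del self._inputs          # un-retained inputs are no longer reachable
        return outputs

    def retain_inputs(self, indexes):
        self._retained = tuple(self._inputs[i] for i in indexes)

    def get_retained_inputs(self):
        return self._retained


class ToyMul(ToyNode):
    """Toy node for z = x * y."""

    def forward(self, inputs):
        x, y = inputs
        self.retain_inputs((0, 1))  # backward() needs both factors
        return x * y,

    def backward(self, grad_outputs):
        x, y = self.get_retained_inputs()
        gz, = grad_outputs
        return gz * y, gz * x       # dz/dx = y, dz/dy = x


node = ToyMul()
z, = node.apply((3.0, 4.0))
gx, gy = node.backward((1.0,))
```

If `forward()` skipped the `retain_inputs()` call, `get_retained_inputs()` would return an empty tuple here and the unpacking in `backward()` would fail, which mirrors why the call is required.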

Note

There are two types of differentiable functions in Chainer (since v3). The first type is implemented as a subclass of `Function` and is called an old-style differentiable function. The second type is implemented as a subclass of `FunctionNode` and is called a new-style differentiable function. The new style has several advantages.

• The new-style differentiable function supports differentiable backpropagation. The gradients backpropagated through new-style differentiable functions themselves support further backpropagation, so automatic higher-order differentiation is available.

• The backpropagation of the new-style differentiable function can be more computationally efficient because the interface allows an implementation to omit the computation of unneeded input gradients.

Note that the new-style differentiable function is the standard way of defining a function node of the computational graph in Chainer; old-style differentiable functions are implemented as wrappers of the new-style differentiable functions.

Variables
• inputs – A tuple of the input `VariableNode` objects.

• outputs – A tuple of weak references to the output `VariableNode` objects.

• rank (int) – An ordinal following the topological order of the computational graph.

• stack – Stack trace captured during the forward computation. The stack trace is available only in debug mode.

New in version 3.0.0.

Methods

__call__(*args, **kwargs)

Call self as a function.

add_hook(hook, name=None)

Registers a function hook.

Parameters
• hook (FunctionHook) – Function hook to be registered.

• name (str) – Name of the function hook. The name must be unique among function hooks registered to this function. If `None`, the default name of the function hook is used.

apply(inputs)

Computes output variables and grows the computational graph.

The basic behavior is described in the documentation of `FunctionNode`.

Note

If the `data` attributes of the input variables exist on a GPU device, that device is made current before calling `forward()`, so implementers do not need to take care of device selection in most cases.

Parameters

inputs – Tuple of input variables. Each element can be either a `Variable` or an N-dimensional array. An ndarray element is automatically wrapped in a `Variable`.

Returns

A tuple of output `Variable` objects.

backward(target_input_indexes, grad_outputs)

This method is used to compute one step of the backpropagation corresponding to the forward computation of this function node. Given the gradients w.r.t. output variables, this method computes the gradients w.r.t. specified input variables. Note that this method does not need to compute any input gradients not specified by `target_input_indexes`.

Unlike `Function.backward()`, gradients are given as `Variable` objects and this method itself has to return input gradients as `Variable` objects. It enables the function node to return the input gradients with the full computational history, in which case it supports differentiable backpropagation or higher-order differentiation.

The default implementation returns `None` for every input gradient, which means the function is not differentiable.

Parameters
• target_input_indexes (tuple of int) – Sorted indices of the input variables w.r.t. which the gradients are required. It is guaranteed that this tuple contains at least one element.

• grad_outputs (tuple of `Variable`s) – Gradients w.r.t. the output variables. If the gradient w.r.t. an output variable is not given, the corresponding element is `None`.

Returns

Tuple of variables that represent the gradients w.r.t. specified input variables. The length of the tuple can be either `len(target_input_indexes)` or the number of inputs. In the latter case, the elements not specified by `target_input_indexes` are discarded.

`backward_accumulate()` provides an alternative interface that allows you to implement the backward computation fused with the gradient accumulation.
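As a rough illustration of how a backward step can skip gradients that were not requested, the following toy helper works on plain floats for z = x * y. The function name `toy_mul_backward` and its extra `inputs` argument are assumptions made for this sketch; a real `backward()` would receive `Variable` objects and obtain inputs through `get_retained_inputs()`.

```python
def toy_mul_backward(inputs, target_input_indexes, grad_outputs):
    """Toy backward for z = x * y over plain floats (hypothetical helper)."""
    x, y = inputs
    gz, = grad_outputs
    grads = []
    for i in target_input_indexes:
        if i == 0:
            grads.append(gz * y)  # dz/dx = y
        else:
            grads.append(gz * x)  # dz/dy = x
    # One gradient per requested index, in the same sorted order.
    return tuple(grads)


only_x = toy_mul_backward((3.0, 4.0), (0,), (2.0,))
both = toy_mul_backward((3.0, 4.0), (0, 1), (2.0,))
```

Here the returned tuple has length `len(target_input_indexes)`, which is one of the two lengths the method is allowed to return.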

backward_accumulate(target_input_indexes, grad_outputs, grad_inputs)

Computes gradients w.r.t. specified inputs and accumulates them.

This method provides a way to fuse the backward computation and the gradient accumulation when multiple functions are applied to the same variable.

Users must override either this method or `backward()`. Implementing `backward()` is often simpler and is recommended unless you need efficient gradient accumulation.

Parameters
• target_input_indexes (tuple of int) – Sorted indices of the input variables w.r.t. which the gradients are required. It is guaranteed that this tuple contains at least one element.

• grad_outputs (tuple of Variable) – Gradients w.r.t. the output variables. If the gradient w.r.t. an output variable is not given, the corresponding element is `None`.

• grad_inputs (tuple of Variable) – Gradients w.r.t. the input variables specified by `target_input_indexes`. These values are computed by other computation paths. If there is no gradient value existing for the variable, the corresponding element is `None`. See also the note below.

Returns

Tuple of variables that represent the gradients w.r.t. specified input variables. Unlike `backward()`, the length of the tuple must be the same as that of `target_input_indexes`.

Note

Gradient variables in `grad_outputs` are distinct, even if a variable is passed to multiple input arguments of the function. This is an implementation-detail convention to avoid the complication of correctly accumulating gradients in such a case.

Usually, only the first position of `grad_inputs` corresponding to these input arguments may contain the gradient variable corresponding to that input variable, and other entries are set to `None`. This is not the case with the `lazy_grad_sum` feature. This behavior might be changed in a future version.
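The idea of fusing the backward step with accumulation can be sketched on plain floats. The helper `toy_backward_accumulate` below is hypothetical and again uses z = x * y; it only shows how an existing gradient from another path (or `None` when there is none) is combined with the locally computed one, and it ignores the duplicate-input convention described in the note.

```python
def toy_backward_accumulate(inputs, target_input_indexes, grad_outputs, grad_inputs):
    """Toy fused backward-and-accumulate for z = x * y (hypothetical helper)."""
    x, y = inputs
    gz, = grad_outputs
    local = {0: gz * y, 1: gz * x}  # locally computed gradients per input index
    result = []
    for i, existing in zip(target_input_indexes, grad_inputs):
        g = local[i]
        # None means no other computation path has contributed a gradient yet.
        result.append(g if existing is None else existing + g)
    return tuple(result)


acc = toy_backward_accumulate((3.0, 4.0), (0, 1), (1.0,), (10.0, None))
```

The result always has exactly one entry per element of `target_input_indexes`, matching the stricter length requirement of this method.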

check_layout_forward(inputs)

check_type_forward(in_types)

Checks types of input data before forward propagation.

This method is called before `forward()` and validates the types of input variables using the type checking utilities.

Parameters

in_types (TypeInfoTuple) – The type information of input variables for `forward()`.

delete_hook(name)

Unregisters the function hook.

Parameters

name (str) – The name of the function hook to be unregistered.

forward(inputs)

Computes the output arrays from the input arrays.

It delegates the procedure to `forward_cpu()` or `forward_gpu()` by default. Which of them this method selects is determined by the type of input arrays. Implementations of `FunctionNode` must implement either CPU/GPU methods or this method.

Parameters

inputs – Tuple of input array(s).

Returns

Tuple of output array(s).

Warning

Implementations of `FunctionNode` must ensure that the return value is a tuple, even if it contains only one array.
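A common way to get this wrong in Python is to forget the trailing comma on a single return value. The two functions below are hypothetical stand-ins for a `forward()` body operating on plain floats; only the second returns a tuple.

```python
def forward_wrong(inputs):
    x, = inputs
    return x * 2   # returns a bare value, not a tuple


def forward_right(inputs):
    x, = inputs
    return x * 2,  # trailing comma makes a one-element tuple


print(isinstance(forward_wrong((1.0,)), tuple))  # False
print(isinstance(forward_right((1.0,)), tuple))  # True
```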

forward_chainerx(inputs)

Computes the output arrays from the input ChainerX arrays.

This method may check the input arrays and other attributes to see if the computation can be done using the ChainerX implementation. If it is not supported, `chainer.Fallback` should be returned instead of output arrays. In that case, the computation is performed using the conventional Python implementation.

Parameters

inputs – Tuple of input array(s).

Returns

Tuple of output array(s) or `chainer.Fallback`.
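The fallback mechanism follows a sentinel pattern that can be sketched in plain Python. Everything below is hypothetical: `FALLBACK` merely stands in for `chainer.Fallback`, the "fast" path pretends to support only float inputs, and `toy_apply` is an assumed dispatcher, not the library's actual control flow.

```python
# Hypothetical sentinel standing in for chainer.Fallback in this toy sketch.
FALLBACK = object()


def toy_forward_fast(inputs):
    """Pretend accelerated path that only supports float inputs."""
    if not all(isinstance(x, float) for x in inputs):
        return FALLBACK  # signal "not supported here"
    return tuple(2.0 * x for x in inputs)


def toy_forward_generic(inputs):
    """Pretend conventional path that handles any numeric input."""
    return tuple(2 * x for x in inputs)


def toy_apply(inputs):
    outputs = toy_forward_fast(inputs)
    if outputs is FALLBACK:
        # The fast path declined, so use the conventional implementation.
        outputs = toy_forward_generic(inputs)
    return outputs


fast_result = toy_apply((1.5,))
slow_result = toy_apply((3,))
```

Comparing with `is` rather than `==` keeps the sentinel check unambiguous even when outputs are array-like objects that overload equality.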

forward_cpu(inputs)

Computes the output arrays from the input NumPy arrays.

Parameters

inputs – Tuple of input `numpy.ndarray` objects.

Returns

Tuple of output arrays. Each element can be a NumPy or CuPy array.

Warning

Implementations of `FunctionNode` must ensure that the return value is a tuple, even if it contains only one array.

forward_gpu(inputs)

Computes the output arrays from the input CuPy arrays.

Parameters

inputs – Tuple of input `cupy.ndarray` objects.

Returns

Tuple of output arrays. Each element can be a NumPy or CuPy array.

Warning

Implementations of `FunctionNode` must ensure that the return value is a tuple, even if it contains only one array.

get_retained_inputs()

Returns a tuple of retained input variables.

This method is used to retrieve the input variables retained in `forward()`.

Returns

A tuple of retained input variables, if available. Otherwise, returns `None`.

get_retained_outputs()

Returns a tuple of retained output variables.

This method is used to retrieve the output variables retained in `forward()`.

Returns

A tuple of retained output variables, if available. Otherwise, returns `None`.

Note

This method handles the case where an output node has been garbage-collected before the call: it creates a fresh variable node that acts as an output node of the function node.
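An output node can disappear because the function node holds its outputs only through weak references. The standard-library snippet below shows that behavior in isolation; `ToyOutputNode` is a hypothetical class used only so that there is something to reference weakly, and the immediate collection after `del` relies on the explicit `gc.collect()` call or on CPython's reference counting.

```python
import gc
import weakref


class ToyOutputNode:
    """Hypothetical output node; only here so it can be weakly referenced."""


node = ToyOutputNode()
outputs = (weakref.ref(node),)  # the holder keeps only a weak reference

alive_before = outputs[0]() is node
del node                         # drop the last strong reference
gc.collect()
alive_after = outputs[0]() is not None
```

Once the weak reference is dead, calling it returns `None`, which is the situation in which a replacement node has to be created.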

retain_inputs(indexes)

Lets specified input variable nodes keep data arrays.

By calling this method from `forward()`, the function node can specify which inputs are required for backprop. The input variables with retained arrays can then be obtained by calling `get_retained_inputs()` from inside `backward()`.

Unlike `Function`, the function node DOES NOT keep input arrays by default. If you want to keep some or all input arrays, do not forget to call this method.

Note that this method must not be called from outside `forward()`.

Parameters

indexes (iterable of int) – Indexes of input variables that the function will require for backprop.

retain_outputs(indexes)

Lets specified output variable nodes keep data arrays.

By calling this method from `forward()`, the function node can specify which outputs are required for backprop. If this method is not called, no output variables will be marked to keep their data array at the point of returning from `apply()`. The output variables with retained arrays can then be obtained by calling `get_retained_outputs()` from inside `backward()`.

Note

It is recommended to use this method if the function requires some or all output arrays in backprop. The function can also use output arrays just by keeping references to them directly, although it might affect the performance of later function applications on the output variables.

Note that this method must not be called from outside `forward()`.

Parameters

indexes (iterable of int) – Indexes of output variables that the function will require for backprop.

unchain()

Purges in/out nodes and this function node itself from the graph.

__eq__(value, /)

Return self==value.

__ne__(value, /)

Return self!=value.

__lt__(value, /)

Return self<value.

__le__(value, /)

Return self<=value.

__gt__(value, /)

Return self>value.

__ge__(value, /)

Return self>=value.

Attributes

chainerx_device = None

input_layouts

inputs = None

is_elementwise = False

label

Short text that represents the function.

The default implementation returns its type name. Each function should override it to give more information.

local_function_hooks

Ordered dictionary of registered function hooks.

Unlike `chainer.thread_local.function_hooks`, whose elements are registered to all functions, the function hooks in this property are specific to this function.

output_data

A tuple of the retained output arrays.

This property is mainly used by `Function`. Users normally do not need this property; use `get_retained_outputs()` instead.

output_layouts

outputs = None

rank = 0

stack = None