Link and Chain

class chainer.Link(**params)
Building block of model definitions.
Link is a building block of neural network models that supports various features such as handling parameters, defining network fragments, and serialization.
Link is the primitive structure for model definitions. It manages parameter variables and persistent values that should be included in serialization. Parameters are variables registered via the add_param() method or given to the initializer method. Persistent values are arrays, scalars, or any other serializable values registered via the add_persistent() method.

Note
Although arbitrary serializable objects can be registered as persistent values, it is strongly recommended to register only values that should be treated as results of learning. A typical example is a value computed during training and required for testing, e.g. running statistics for batch normalization.
Parameters and persistent values are referred to by their names and can be accessed as attributes of the link. The Link class itself manages the lists of parameter names and persistent value names in order to distinguish them from other attributes.
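The bookkeeping described above can be pictured with a small pure-Python sketch. It is not Chainer's implementation: `ToyLink`, `register_param`, `register_persistent`, and `registered` are hypothetical names, and plain lists stand in for arrays. It only illustrates how keeping lists of names lets an object tell registered values apart from ordinary attributes.

```python
class ToyLink(object):
    """Hypothetical stand-in for a link; not Chainer's implementation."""

    def __init__(self):
        # Name lists that separate registered values from other attributes.
        self._params = []
        self._persistent = []

    def register_param(self, name, value):
        # Record the name, then expose the value as an attribute.
        self._params.append(name)
        setattr(self, name, value)

    def register_persistent(self, name, value):
        self._persistent.append(name)
        setattr(self, name, value)

    def registered(self):
        """Return {name: value} for every registered name."""
        names = self._params + self._persistent
        return {name: getattr(self, name) for name in names}


link = ToyLink()
link.register_param('W', [[0.0, 0.0]])
link.register_persistent('running_mean', [0.5])
link.note = 'plain attribute, never registered'

snapshot = link.registered()
```

In this sketch `snapshot` contains only `W` and `running_mean`. The `note` attribute is left out, which mirrors the rule that unregistered attributes are treated as part of the user program.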
Links can be composed into more complex models. This composition feature is supported by child classes such as Chain and ChainList. One can create a chain by combining one or more links. See the documentation of these classes for details.

As noted above, Link supports the serialization protocol of the Serializer class. Note that only parameters and persistent values are saved and loaded. Other attributes are considered part of the user program (i.e. part of the network definition). To construct a link from a saved file, the other attributes must be reconstructed identically by user code.

Example
This is a simple example of a custom link definition. Chainer itself also provides many links defined under the links module, which may serve as examples, too.

Suppose we want to define a simple primitive link that implements a fully-connected layer based on the linear() function. Note that this function takes input units, a weight variable, and a bias variable as arguments. The fully-connected layer can then be defined as follows:

    import chainer
    import chainer.functions as F
    import numpy as np

    class LinearLayer(chainer.Link):

        def __init__(self, n_in, n_out):
            # Parameters are initialized as a numpy array of given shape.
            super(LinearLayer, self).__init__(
                W=(n_out, n_in),
                b=(n_out,),
            )
            self.W.data[...] = np.random.randn(n_out, n_in)
            self.b.data.fill(0)

        def __call__(self, x):
            return F.linear(x, self.W, self.b)

This example shows that a user can define arbitrary parameters and use them in any method. Links typically implement the __call__ operator.

Parameters:
- params – Names, shapes, and optional dtypes of initial parameters. The keywords are used as the parameter names, and each corresponding value is either a shape or a tuple (shape, dtype). If only the shape is supplied, the default dtype is used.

Variables:
- name (str) – Name of this link, given by the parent chain (if one exists).
add_param(name, shape, dtype=numpy.float32, initializer=None)
Registers a parameter to the link.
The registered parameter is saved and loaded on serialization and deserialization, and is involved in the optimization. The data and gradient of the variable are initialized with NaN arrays. If initializer is not None, the data is initialized by initializer.
If the supplied name argument corresponds to an uninitialized parameter (that is, one that was added with the add_uninitialized_param() method), name is removed from the set of uninitialized parameters.
The parameter is set as an attribute of the link with the given name.
Parameters:
- name (str) – Name of the parameter. This name is also used as the attribute name. Any uninitialized parameter with the same name is removed.
- shape (int or tuple of ints) – Shape of the parameter array.
- dtype – Data type of the parameter array.
- initializer (chainer.initializer.Initializer) – If it is not None, the data is initialized with the given initializer. Note that in this case the dtype argument is ignored.
add_persistent(name, value)
Registers a persistent value to the link.
The registered value is saved and loaded on serialization and deserialization. The value is set as an attribute of the link.
Parameters:
- name (str) – Name of the persistent value. This name is also used as the attribute name.
- value – Value to be registered.
add_uninitialized_param(name)
Registers an uninitialized parameter to the link.
An uninitialized parameter is a parameter that has a name but does not yet have a shape. If the shape of a parameter depends on the shape of the inputs to the __call__ operator, it can be useful to defer initialization (that is, setting the shape) until the first forward call of the link.
An uninitialized parameter is intended to be registered by calling this method in the initializer method. Then, during the first forward call, the shape of the parameter is determined from the size of the inputs, and the parameter must be initialized by calling the add_param() method.
Parameters:
- name (str) – Name of the uninitialized parameter.
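The deferred-initialization pattern can be sketched without Chainer as follows. This is only an illustration under stated assumptions: `LazyLinear`, `_init_weight`, and the `uninitialized` set are hypothetical names, nested lists stand in for arrays, and `_init_weight` plays the role that the add_param() call has during the first forward pass.

```python
class LazyLinear(object):
    """Hypothetical layer whose weight shape depends on the first input."""

    def __init__(self, n_out):
        self.n_out = n_out
        # The name is known here, but the shape is not.
        self.uninitialized = {'W'}
        self.W = None

    def _init_weight(self, n_in):
        # Stand-in for the add_param() call made in the first forward pass.
        self.W = [[0.0] * n_in for _ in range(self.n_out)]
        self.uninitialized.discard('W')

    @property
    def has_uninitialized_params(self):
        return bool(self.uninitialized)

    def __call__(self, x):
        # The input size fixes the shape the first time the layer is called.
        if 'W' in self.uninitialized:
            self._init_weight(len(x))
        return [sum(w * v for w, v in zip(row, x)) for row in self.W]


layer = LazyLinear(3)
before = layer.has_uninitialized_params
y = layer([1.0, 2.0])
after = layer.has_uninitialized_params
```

Here `before` is true and `after` is false: the weight only receives its 3 x 2 shape once an input of length 2 has been seen.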
addgrads(link)
Accumulates gradient values from the given link.
This method adds each gradient array of the given link to the corresponding gradient array of this link. The accumulation works even across the host and different devices.
Parameters:
- link (Link) – Source link object.
children()
Returns a generator of all child links.
Returns: A generator object that generates all child links.
cleargrads()
Clears all gradient arrays.
This method should be called before the backward computation at every iteration of the optimization.
copy()
Copies the link hierarchy to a new one.
The whole hierarchy rooted at this link is copied. The copy is basically shallow, except that the parameter variables are also shallowly copied. This means that the parameter variables of the copy are different objects from those of the original link, while they share the data and gradient arrays.
The name of the link is reset on the copy, since the copied instance does not belong to the original parent chain (even if one exists).
Returns: Copied link object.
Return type: Link
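The copy semantics described above (distinct parameter variables that share the same arrays) can be illustrated with the standard `copy` module. This is a minimal sketch, not Chainer's code: `ToyParam` and `ToyLink` are hypothetical classes, and a plain list stands in for the data array.

```python
import copy


class ToyParam(object):
    """Hypothetical parameter variable holding data and gradient."""

    def __init__(self, data):
        self.data = data
        self.grad = None


class ToyLink(object):
    """Hypothetical link; only models the copy behaviour."""

    def __init__(self, **params):
        self.name = None
        self._params = list(params)
        for key, value in params.items():
            setattr(self, key, ToyParam(value))

    def copy(self):
        ret = copy.copy(self)              # shallow copy of the link itself
        ret._params = list(self._params)
        for name in self._params:
            # New variable object that still refers to the same arrays.
            setattr(ret, name, copy.copy(getattr(self, name)))
        ret.name = None                    # the copy has no parent chain
        return ret


original = ToyLink(W=[1.0, 2.0])
original.name = 'layer1'
clone = original.copy()
clone.W.data[0] = 9.0   # visible through the original, since data is shared
```

After the last line `original.W.data[0]` is also `9.0`, while `clone.W` and `original.W` remain two different objects and `clone.name` is `None`.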
copyparams(link)
Copies all parameters from the given link.
This method copies the data arrays of all parameters in the hierarchy. The copy works even across the host and devices. Note that this method does not copy the gradient arrays.
Parameters:
- link (Link) – Source link object.
has_uninitialized_params
Checks whether the link has uninitialized parameters.
Returns: True if the link has any uninitialized parameters. Otherwise returns False.
Return type: bool
links(skipself=False)
Returns a generator of all links under the hierarchy.
Parameters:
- skipself (bool) – If True, then the generator skips this link and starts with the first child link.
Returns: A generator object that generates all links.
namedlinks(skipself=False)
Returns a generator of all (path, link) pairs under the hierarchy.
Parameters:
- skipself (bool) – If True, then the generator skips this link and starts with the first child link.
Returns: A generator object that generates all (path, link) pairs.
namedparams()
Returns a generator of all (path, param) pairs under the hierarchy.
Returns: A generator object that generates all (path, parameter) pairs. The paths are relative to this link.
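To make the idea of relative paths concrete, here is a small tree walk in plain Python. It is a sketch only: `ToyNode` is a hypothetical class, strings stand in for parameter variables, and the exact path format (a leading slash, names joined by `/`) is an assumption made for illustration.

```python
class ToyNode(object):
    """Hypothetical node in a link hierarchy; not Chainer's implementation."""

    def __init__(self, params=None, **children):
        self.params = dict(params or {})
        self.children = dict(children)

    def namedparams(self, prefix=''):
        # Parameters owned by this node come first ...
        for name in sorted(self.params):
            yield prefix + '/' + name, self.params[name]
        # ... followed by the parameters of each child, one level deeper.
        for child_name in sorted(self.children):
            child = self.children[child_name]
            for pair in child.namedparams(prefix + '/' + child_name):
                yield pair


model = ToyNode(
    layer1=ToyNode(params={'W': 'w1', 'b': 'b1'}),
    layer2=ToyNode(params={'W': 'w2'}),
)
paths_from_root = [path for path, _ in model.namedparams()]
paths_from_layer1 = [path for path, _ in model.children['layer1'].namedparams()]
```

The same parameter appears as `/layer1/W` when the walk starts at `model` and as `/W` when it starts at `layer1`, which is what "relative to this link" means in this sketch.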
params()
Returns a generator of all parameters under the link hierarchy.
Returns: A generator object that generates all parameters.
serialize(serializer)
Serializes the link object.
Parameters:
- serializer (AbstractSerializer) – Serializer object.
to_cpu()
Copies parameter variables and persistent values to CPU.
This method does not handle non-registered attributes. If any such attributes must be copied to CPU, the link implementation must override this method to do so.
Returns: self
to_gpu(device=None)
Copies parameter variables and persistent values to GPU.
This method does not handle non-registered attributes. If any such attributes must be copied to GPU, the link implementation must override this method to do so.
Parameters:
- device – Target device specifier. If omitted, the current device is used.
Returns: self
xp
Array module for this link.
Depending on whether this link is on the CPU or the GPU, this property returns numpy or cupy.
zerograds()
Initializes all gradient arrays with zeros.
This method can be used for the same purpose as cleargrads(), but it is less efficient. It is kept for backward compatibility.
Deprecated since version v1.15: Use cleargrads() instead.

class chainer.Chain(**links)
Composable link with object-like interface.
Composability is one of the most important features of neural nets. Neural net models consist of many reusable fragments, and each model itself might be embedded into a larger learnable system. Chain enables us to write a neural net based on composition, without having to handle routine work such as collecting parameters, serialization, and copying the structure with shared parameters.
This class provides a way to compose one or more links into one structure. A chain can contain one or more child links. A child link is a link registered to the chain under its own name, and it is stored as an attribute of the chain with that name. Users can write a whole model or a fragment of a neural net as a child class of Chain.
Each chain is itself a link. Therefore, one can combine chains into higher-level chains. In this way, links and chains form a link hierarchy. The link hierarchy is a tree structure in which each node is identified by the path from the root. The path is represented by a string like a UNIX file path, consisting of the names of the nodes on the path joined by slashes (/).

Example
This is a simple example of a custom chain definition. Chainer itself also provides some chains defined under the links module, which may serve as examples, too.

Suppose we want to define a multi-layer perceptron consisting of two hidden layers with rectifiers as activation functions. We can use the Linear link as a building block:

    import chainer
    import chainer.functions as F
    import chainer.links as L

    class MultiLayerPerceptron(chainer.Chain):

        def __init__(self, n_in, n_hidden, n_out):
            # Create and register three layers for this MLP
            super(MultiLayerPerceptron, self).__init__(
                layer1=L.Linear(n_in, n_hidden),
                layer2=L.Linear(n_hidden, n_hidden),
                layer3=L.Linear(n_hidden, n_out),
            )

        def __call__(self, x):
            # Forward propagation
            h1 = F.relu(self.layer1(x))
            h2 = F.relu(self.layer2(h1))
            return self.layer3(h2)

Child links are registered via the initializer method. They can also be registered by the add_link() method. The forward propagation is often implemented as the __call__ operator, as in the above example, though this is not mandatory.

Parameters:
- links – Child links. The keywords are used as their names. The names are also set on the links.
add_link(name, link)
Registers a child link to this chain.
The registered link is saved and loaded on serialization and deserialization, and is involved in the optimization. The registered link is called a child. The child link is set as an attribute of the chain with the given name.
This method also sets the name attribute of the registered link. If the given link already has the name attribute set, an error is raised.
Parameters:

class chainer.ChainList(*links)
Composable link with list-like interface.
This is another example of a compositional link. Unlike Chain, this class can be used like a list of child links. Each child link is indexed by a non-negative integer, and the class maintains the current number of registered child links. The add_link() method inserts a new link at the end of the list. This is useful for writing a chain with an arbitrary number of child links, e.g. an arbitrarily deep multi-layer perceptron.
Note that this class does not implement all methods of list.
Parameters:
- links – Initial child links.
__getitem__(index)
Returns the child at the given index.
Parameters:
- index (int) – Index of the child in the list.
Returns: The index-th child link.
Return type: Link
add_link(link)
Registers a child link to this chain.
The registered link is saved and loaded on serialization and deserialization, and is involved in the optimization. The registered link is called a child. The child link is accessible via the children() method, which returns a generator running through the children in registered order.
This method also sets the name attribute of the registered link. If the given link already has the name attribute set, an error is raised.
Parameters:
- link (Link) – The link object to be registered.
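The list-like behaviour can be pictured with the following stand-alone sketch. It is not Chainer's implementation: `ToyLink` and `ToyChainList` are hypothetical classes, using the child's index as its name is an assumption, and `ValueError` is chosen here only for illustration since the text above does not say which error is raised.

```python
class ToyLink(object):
    """Hypothetical link that only carries a name."""

    def __init__(self):
        self.name = None


class ToyChainList(ToyLink):
    """Hypothetical list-like container of child links."""

    def __init__(self, *links):
        super(ToyChainList, self).__init__()
        self._children = []
        for link in links:
            self.add_link(link)

    def __getitem__(self, index):
        return self._children[index]

    def __len__(self):
        return len(self._children)

    def add_link(self, link):
        # A link that already has a name belongs to some other parent.
        if link.name is not None:
            raise ValueError('link already has a name: %r' % link.name)
        link.name = str(len(self._children))  # assumption: index as name
        self._children.append(link)

    def children(self):
        # Yield the children in registration order.
        for child in self._children:
            yield child


stack = ToyChainList(ToyLink(), ToyLink())
extra = ToyLink()
stack.add_link(extra)          # appended at the end, index 2
names = [child.name for child in stack.children()]
```

Because children are appended in order, a model of arbitrary depth can be built by calling `add_link` in a loop and then iterating over `children()` in the forward pass.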