chainer.Sequential¶
-
class chainer.Sequential(*layers)[source]¶
Sequential model which has a single-stream forward pass.
Warning
This feature is experimental. The interface can change in the future.
This class makes it easy to construct a network with a sequential structure. While Chain and ChainList can only take Link objects as inputs to their constructors, Sequential can take an arbitrary number of any callable objects for the forward pass computation. A Sequential calls the given callable objects sequentially inside its __call__() method, in the same order as the given arguments. Therefore, you do not need to write the forward pass computation explicitly.

Example
The example below shows how to use this class to construct a simple sequential network:

```python
import chainer
import chainer.functions as F
import chainer.links as L
from chainer import Sequential

# Model definition without writing a __call__ function
model = Sequential(
    L.Linear(n_in, n_hidden),
    F.relu,
    L.Linear(n_hidden, n_hidden),
    F.relu,
    L.Linear(n_hidden, n_out)
)

# Compute the forward pass
y = model(x)
```
where x denotes a mini-batch of n_in-dimensional input vectors.

Furthermore, Sequential supports the built-in list APIs, so you can concatenate Sequential objects to create a longer Sequential model easily, in the same way as with Python lists:

```python
model_A = Sequential(L.Linear(10, 10), F.relu)
model_B = Sequential(L.Linear(10, 10), F.sigmoid)
model_C = model_A + model_B
```
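The list-style concatenation above can be illustrated with a minimal pure-Python container. Note that `SimpleSequential` below is a hypothetical stand-in used only to show the semantics, not the real `chainer.Sequential` implementation:

```python
class SimpleSequential:
    """Hypothetical stand-in for chainer.Sequential (illustration only)."""

    def __init__(self, *layers):
        self.layers = list(layers)

    def __call__(self, x):
        # Feed the output of each layer into the next one.
        for layer in self.layers:
            x = layer(x)
        return x

    def __add__(self, other):
        # Concatenation builds a new container holding both layer lists,
        # just as adding two Python lists does.
        return SimpleSequential(*(self.layers + other.layers))

    def __len__(self):
        return len(self.layers)

model_A = SimpleSequential(lambda x: x + 1, lambda x: x * 2)
model_B = SimpleSequential(lambda x: x - 3)
model_C = model_A + model_B  # three layers: +1, *2, -3
```

Calling `model_C(1)` applies the three layers in order: ((1 + 1) * 2) - 3.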
To repeat a Sequential object multiple times, you can use the repeat() method:

```python
model_D = model_A.repeat(3)
```

You can also add your own functions or any callable objects to a Sequential object:

```python
from chainer.links.model.vision.vgg import VGG16Layers

model = Sequential()
model.append(L.Linear(n_out, n_hidden))
model.append(F.relu)
model.append(lambda x: F.reshape(x, (1, 3, 224, 224)))
model.append(VGG16Layers())
model.append(lambda x: x['prob'])

y = model(x)
```
The above example shows how to add some layers to the model using the append() method, then add a large network (VGG16Layers), and finally add a lambda function to extract the prob output.

You can briefly check the structure of your model using print as follows:

```python
>>> print(model_C)
0       Linear  W(10, 10)       b(10,)
1       relu
2       Linear  W(10, 10)       b(10,)
3       sigmoid
```
Note
A Sequential link which has at least one lambda function as its member cannot be pickled. Please use partial from the functools package instead:

```python
from functools import partial

# This is not picklable
model = Sequential(
    L.Convolution2D(None, 64, 3, 1, 1),
    lambda x: F.max_pooling_2d(x, 2)
)

# This is picklable
model = Sequential(
    L.Convolution2D(None, 64, 3, 1, 1),
    partial(F.max_pooling_2d, ksize=2)
)
```
Parameters: layers – The layers which are called in the given order. Each component should be a callable object, including Link objects and functions defined under chainer.functions, e.g., relu, etc.

Methods
-
__call__(*x)[source]¶
Forward pass computation.

This method performs the forward pass computation by giving the input variable x to the layers registered in the constructor, in the same order as the order in which the arguments were given to the constructor. Note that the input variable is given directly to the first layer, and all intermediate outputs generated during the forward pass are also directly fed to the next layer. Therefore, the number of outputs of a layer should be the same as the number of inputs of the next layer.

Parameters: x – Input variables.
Returns: The output of the final layer in the given layers.
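The arity-matching rule above can be sketched in plain Python. This is only an illustration of the rule, not Chainer's actual implementation:

```python
def forward(layers, *inputs):
    """Chain layers: each layer's outputs become the next layer's inputs."""
    x = inputs
    for layer in layers:
        y = layer(*x)
        # Normalize to a tuple so layers with multiple outputs chain too.
        x = y if isinstance(y, tuple) else (y,)
    return x[0] if len(x) == 1 else x

def duplicate(v):
    return v, v       # one input, two outputs

def add(a, b):
    return a + b      # two inputs, one output: arities match

result = forward([duplicate, add], 5)  # 5 -> (5, 5) -> 10
```

If `duplicate` were followed by a one-argument layer instead, the call would fail, which is exactly the mismatch the rule forbids.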
-
__getitem__(i)[source]¶
Returns the child at the given index.

Parameters: i (int) – Index of the child in the list.
Returns: The i-th child link.
Return type: Link
-
add_link(link)[source]¶
Registers a child link and adds it to the tail of the list.

Parameters: link (Link) – The link object to be registered.
-
add_param(name, shape=None, dtype=<class 'numpy.float32'>, initializer=None)[source]¶
Registers a parameter to the link.

Deprecated since version v2.0.0: Assign a Parameter object directly to an attribute within init_scope() instead. For example, the following code

```python
link.add_param('W', shape=(5, 3))
```

can be replaced by the following assignment:

```python
with link.init_scope():
    link.W = chainer.Parameter(None, (5, 3))
```

The latter is easier for IDEs to keep track of the attribute’s type.

Parameters:
- name (str) – Name of the parameter. This name is also used as the attribute name.
- shape (int or tuple of ints) – Shape of the parameter array. If it is omitted, the parameter variable is left uninitialized.
- dtype – Data type of the parameter array.
- initializer – If it is not None, the data is initialized with the given initializer. If it is an array, the data is directly initialized by it. If it is callable, it is used as a weight initializer. Note that in these cases, the dtype argument is ignored.
-
add_persistent(name, value)[source]¶
Registers a persistent value to the link.

The registered value is saved and loaded on serialization and deserialization. The value is set to an attribute of the link.

Parameters:
- name (str) – Name of the persistent value. This name is also used for the attribute name.
- value – Value to be registered.
-
addgrads(link)[source]¶
Accumulates gradient values from the given link.

This method adds each gradient array of the given link to the corresponding gradient array of this link. The accumulation is even done across the host and different devices.

Parameters: link (Link) – Source link object.
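As a rough sketch of the accumulation semantics, with plain Python lists standing in for gradient arrays and a hypothetical free function standing in for the method:

```python
def addgrads(dst_grads, src_grads):
    """Add each source gradient into the matching destination gradient."""
    for name, grad in src_grads.items():
        for i, g in enumerate(grad):
            # In-place elementwise accumulation, like dst += src per array.
            dst_grads[name][i] += g

dst = {"W": [0.5, 0.5]}
src = {"W": [1.0, 2.0]}
addgrads(dst, src)  # dst["W"] becomes [1.5, 2.5]
```

This is typically how gradients from data-parallel replicas are merged before an optimizer update.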
-
append(layer)[source]¶
Registers a child link and adds it to the tail of the list.

This is equivalent to add_link(). This method has been added to emulate the list interface.

Parameters: layer (Link) – The link object to be registered.
-
children()[source]¶
Returns a generator of all child links.

Returns: A generator object that generates all child links.
-
cleargrads()[source]¶
Clears all gradient arrays.
This method should be called before the backward computation at every iteration of the optimization.
-
copy(mode='share')[source]¶
Copies the link hierarchy to a new one.

The whole hierarchy rooted at this link is copied. There are three modes to perform the copy; please see the documentation for the argument mode below.

The name of the link is reset on the copy, since the copied instance does not belong to the original parent chain (even if one exists).

Parameters: mode (str) – It should be either 'init', 'copy', or 'share'. 'init' means parameter variables under the returned link object are re-initialized by calling their initialize() method, so that all the parameters may have different initial values from the original link. 'copy' means that the link object is deeply copied, so that its parameters are not re-initialized but are also deeply copied. Thus, all parameters have the same initial values but can be changed independently. 'share' means that the link is shallowly copied, so that its parameters' arrays are shared with the original ones. Thus, their values are changed synchronously. The default mode is 'share'.

Returns: Copied link object.
Return type: Link
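The difference between the 'copy' and 'share' modes can be sketched with the standard copy module, using a plain dict of lists as a stand-in for a link's parameter arrays (an illustration of the semantics, not the real implementation):

```python
import copy

# A plain dict of lists stands in for a link's parameter arrays.
params = {"W": [1.0, 2.0]}

# mode='share': a new container whose arrays are shared with the original.
shared = {k: v for k, v in params.items()}
# mode='copy': the arrays themselves are duplicated.
copied = copy.deepcopy(params)

params["W"][0] = 99.0  # mutate the original array in place
```

After the mutation, the shared copy observes the change while the deep copy keeps its original values.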
-
copyparams(link)[source]¶
Copies all parameters from the given link.

This method copies the data arrays of all parameters in the hierarchy. The copy is even done across the host and devices. Note that this method does not copy the gradient arrays.

Parameters: link (Link) – Source link object.
-
count_by_layer_type(type_name)[source]¶
Counts the number of layers by layer type.

This method counts the number of layers which have the name given by the argument type_name. For example, if you want to know the number of Linear layers included in this model, type_name should be 'Linear'. If you want to know the number of Function classes or user-defined functions which have a specific name, type_name should be the function name, e.g., 'relu' or 'reshape', etc.

Parameters: type_name (str) – The class or function name of the layers you want to enumerate.
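A plausible sketch of this matching rule in plain Python: link objects are matched by their class name, while plain functions are matched by their __name__. The helper and toy classes below are hypothetical, not Chainer's implementation:

```python
def count_by_layer_type(layers, type_name):
    """Count layers whose class name or function name equals type_name."""
    count = 0
    for layer in layers:
        if hasattr(layer, "__name__"):
            name = layer.__name__        # a plain function, e.g. 'relu'
        else:
            name = type(layer).__name__  # a link object, e.g. 'Linear'
        count += int(name == type_name)
    return count

def relu(x):
    return max(x, 0)

class Linear:
    def __call__(self, x):
        return x

layers = [Linear(), relu, Linear(), relu, Linear()]
```

With this list, counting 'Linear' yields 3 and counting 'relu' yields 2.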
-
count_params()[source]¶
Counts the total number of parameters.

This method counts the total number of scalar values included in all the Parameters held by this link and its descendants. If the link contains uninitialized parameters, this method raises a warning.

Returns: The total size of parameters (int)
-
disable_update()[source]¶
Disables update rules of all parameters under the link hierarchy.

This method sets the enabled flag of the update rule of each parameter variable to False.
-
enable_update()[source]¶
Enables update rules of all parameters under the link hierarchy.

This method sets the enabled flag of the update rule of each parameter variable to True.
-
flatten()[source]¶
Flattens nested Sequential links.

This method flattens all the nested Sequential links inside this Sequential link.

Returns: A flattened Sequential object.

Example

```python
>>> import chainer
>>> import chainer.functions as F
>>> import chainer.links as L
>>> a = chainer.Sequential(L.Linear(None, 10), F.relu)
>>> b = chainer.Sequential(L.Linear(None, 10), F.relu)
>>> a.append(b)
>>> print(a)  # Without flatten
0       Linear  W(None) b(10,)
1       relu
2       Sequential      which has 2 layers
>>> print(a.flatten())  # With flatten
0       Linear  W(None) b(10,)
1       relu
2       Linear  W(None) b(10,)
3       relu
```
-
init_scope()[source]¶
Creates an initialization scope.

This method returns a context manager object that enables registration of parameters (and links for Chain) by assignment. A Parameter object can be automatically registered by assigning it to an attribute under this context manager.

Example

In most cases, the parameter registration is done in the initializer method. Using the init_scope method, we can simply assign a Parameter object to register it to the link.

```python
class MyLink(chainer.Link):
    def __init__(self):
        super().__init__()
        with self.init_scope():
            self.W = chainer.Parameter(0, (10, 5))
            self.b = chainer.Parameter(0, (5,))
```
-
links(skipself=False)[source]¶
Returns a generator of all links under the hierarchy.

Parameters: skipself (bool) – If True, then the generator skips this link and starts with the first child link.
Returns: A generator object that generates all links.
-
namedlinks(skipself=False)[source]¶
Returns a generator of all (path, link) pairs under the hierarchy.

Parameters: skipself (bool) – If True, then the generator skips this link and starts with the first child link.
Returns: A generator object that generates all (path, link) pairs.
-
namedparams(include_uninit=True)[source]¶
Returns a generator of all (path, param) pairs under the hierarchy.

Parameters: include_uninit (bool) – If True, it also generates uninitialized parameters.
Returns: A generator object that generates all (path, parameter) pairs. The paths are relative from this link.
-
params(include_uninit=True)[source]¶
Returns a generator of all parameters under the link hierarchy.

Parameters: include_uninit (bool) – If True, it also generates uninitialized parameters.
Returns: A generator object that generates all parameters.
-
register_persistent(name)[source]¶
Registers an attribute of a given name as a persistent value.

This is a convenient method to register an existing attribute as a persistent value. If name has already been registered as a parameter, this method removes it from the list of parameter names and re-registers it as a persistent value.

Parameters: name (str) – Name of the attribute to be registered.
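The re-registration behaviour can be sketched with two hypothetical name registries (Chainer's internal bookkeeping differs; this only illustrates the move from one registry to the other):

```python
# Hypothetical registries: names of parameters vs. persistent values.
param_names = {"W", "b", "mean"}
persistent_names = set()

def register_persistent(name):
    """Re-register `name` as persistent, removing it from the parameters."""
    param_names.discard(name)  # no-op if it was not a parameter
    persistent_names.add(name)

register_persistent("mean")
```

After the call, "mean" is serialized as a persistent value rather than optimized as a parameter.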
-
remove_by_layer_type(type_name)[source]¶
Removes layers by layer type.

This method removes layers from the Sequential object by the layer's class name or function name. If you want to remove a Link, the argument type_name should be its class name, e.g., 'Linear' or 'Convolution2D', etc. If you want to remove a Function class or any other callable object, type_name should be the function name, e.g., 'relu' or 'reshape', etc.

Parameters: type_name (str) – The name of the layer you want to remove.
-
repeat(n_repeat, mode='init')[source]¶
Repeats this link multiple times to make a Sequential.

This method returns a Sequential object which has the same Link repeated multiple times. The mode argument specifies how to copy this link for each repetition.

Example

You can repeat the same link multiple times to create a longer Sequential block like this:

```python
class ConvBNReLU(chainer.Chain):

    def __init__(self):
        super(ConvBNReLU, self).__init__()
        with self.init_scope():
            self.conv = L.Convolution2D(
                None, 64, 3, 1, 1, nobias=True)
            self.bn = L.BatchNormalization(64)

    def __call__(self, x):
        return F.relu(self.bn(self.conv(x)))

net = ConvBNReLU().repeat(16, mode='init')
```

The net object contains 16 blocks, each of which is a ConvBNReLU. The mode was 'init', so each block is re-initialized with different parameters. If you give 'copy' to this argument, each block has the same values for its parameters, but its object ID is different from the others. If it is 'share', each block is the same as the others in terms of not only parameters but also object IDs, because they are shallow-copied, so that when a parameter of one block is changed, the parameters of all the other blocks also change.

Parameters:
- n_repeat (int) – Number of times to repeat.
- mode (str) – It should be either 'init', 'copy', or 'share'. 'init' means the parameters of each repeated element in the returned Sequential will be re-initialized, so that all elements have different initial parameters. 'copy' means that the parameters will not be re-initialized, but the object itself will be deep-copied, so that all elements have the same initial parameters but can be changed independently. 'share' means all the elements which make up the resulting Sequential object are the same object, because they are shallow-copied, so that all parameters of the elements are shared with each other.
-
serialize(serializer)[source]¶
Serializes the link object.

Parameters: serializer (AbstractSerializer) – Serializer object.
-
to_cpu()[source]¶
Copies parameter variables and persistent values to the CPU.
This method does not handle non-registered attributes. If some of such attributes must be copied to CPU, the link implementation must override this method to do so.
Returns: self
-
to_gpu(device=None)[source]¶
Copies parameter variables and persistent values to the GPU.
This method does not handle non-registered attributes. If some of such attributes must be copied to GPU, the link implementation must override this method to do so.
Parameters: device – Target device specifier. If omitted, the current device is used. Returns: self
-
zerograds()[source]¶
Initializes all gradient arrays by zero.

This method can be used for the same purpose as cleargrads(), but is less efficient. It is left only for backward compatibility.

Deprecated since version v1.15: Use cleargrads() instead.
Attributes
-
update_enabled¶
True if at least one parameter has an update rule enabled.

-
within_init_scope¶
True if the current code is inside of an initialization scope. See init_scope() for the details of the initialization scope.