Chainer
v2.0.2
  • Installation Guide
  • Chainer Tutorial
  • Chainer Reference Manual
    • Core functionalities
    • Utilities
    • Assertion and Testing
    • Standard Function implementations
    • Standard Link implementations
      • Learnable connections
        • chainer.links.Bias
        • chainer.links.Bilinear
        • chainer.links.Convolution2D
        • chainer.links.ConvolutionND
        • chainer.links.Deconvolution2D
        • chainer.links.DeconvolutionND
        • chainer.links.DepthwiseConvolution2D
        • chainer.links.DilatedConvolution2D
        • chainer.links.EmbedID
        • chainer.links.GRU
        • chainer.links.Highway
        • chainer.links.Inception
        • chainer.links.InceptionBN
        • chainer.links.Linear
        • chainer.links.LSTM
        • chainer.links.MLPConvolution2D
        • chainer.links.NStepBiGRU
        • chainer.links.NStepBiLSTM
        • chainer.links.NStepBiRNNReLU
        • chainer.links.NStepBiRNNTanh
        • chainer.links.NStepGRU
        • chainer.links.NStepLSTM
        • chainer.links.NStepRNNReLU
        • chainer.links.NStepRNNTanh
        • chainer.links.Scale
        • chainer.links.StatefulGRU
        • chainer.links.StatefulPeepholeLSTM
        • chainer.links.StatelessLSTM
      • Activation/loss/normalization functions with parameters
      • Machine learning models
      • Pre-trained models
    • Optimizers
    • Serializers
    • Function hooks
    • Weight Initializers
    • Dataset examples
    • Iterator examples
    • Trainer extensions
    • Trainer triggers
    • Caffe Reference Model Support
    • Visualization of Computational Graph
    • Environment variables
  • Upgrade Guide from v1 to v2
  • Contribution Guide
  • API Compatibility Policy
  • Tips and FAQs
  • Comparison with Other Frameworks
  • License
Chainer
  • Docs »
  • Chainer Reference Manual »
  • Standard Link implementations »
  • chainer.links.LSTM
  • Edit on GitHub

chainer.links.LSTM¶

class chainer.links.LSTM(in_size, out_size=None, **kwargs)[source]¶

Fully-connected LSTM layer.

This is a fully-connected LSTM layer as a chain. Unlike the lstm() function, which is defined as a stateless activation function, this chain holds upward and lateral connections as child links.

It also maintains states, including the cell state and the output at the previous time step. Therefore, it can be used as a stateful LSTM.

This link supports variable length inputs. The mini-batch size of the current input must be equal to or smaller than that of the previous one. The mini-batch size of c and h is determined as that of the first input x. When mini-batch size of i-th input is smaller than that of the previous input, this link only updates c[0:len(x)] and h[0:len(x)] and doesn’t change the rest of c and h. So, please sort input sequences in descending order of lengths before applying the function.

Parameters:
  • in_size (int) – Dimension of input vectors. If it is None or omitted, parameter initialization will be deferred until the first forward data pass at which time the size will be determined.
  • out_size (int) – Dimensionality of output vectors.
  • lateral_init – A callable that takes numpy.ndarray or cupy.ndarray and edits its value. It is used for initialization of the lateral connections. May be None to use default initialization.
  • upward_init – A callable that takes numpy.ndarray or cupy.ndarray and edits its value. It is used for initialization of the upward connections. May be None to use default initialization.
  • bias_init – A callable that takes numpy.ndarray or cupy.ndarray and edits its value It is used for initialization of the biases of cell input, input gate and output gate.and gates of the upward connection. May be a scalar, in that case, the bias is initialized by this value. If it is None, the cell-input bias is initialized to zero.
  • forget_bias_init – A callable that takes numpy.ndarray or cupy.ndarray and edits its value It is used for initialization of the biases of the forget gate of the upward connection. May be a scalar, in that case, the bias is initialized by this value. If it is None, the forget bias is initialized to one.
Variables:
  • upward (Linear) – Linear layer of upward connections.
  • lateral (Linear) – Linear layer of lateral connections.
  • c (Variable) – Cell states of LSTM units.
  • h (Variable) – Output at the previous time step.

Methods

__call__(x)[source]¶

Updates the internal state and returns the LSTM outputs.

Parameters:x (Variable) – A new batch from the input sequence.
Returns:Outputs of updated LSTM units.
Return type:Variable
__getitem__(name)[source]¶

Equivalent to getattr.

add_link(name, link)[source]¶

Registers a child link to this chain.

Deprecated since version v2.0.0: Assign the child link directly to an attribute within an initialization scope, instead. For example, the following code

chain.add_link('l1', L.Linear(3, 5))

can be replaced by the following line.

with self.init_scope():
    chain.l1 = L.Linear(3, 5)

The latter one is easier for IDEs to keep track of the attribute’s type.

Parameters:
  • name (str) – Name of the child link. This name is also used as the attribute name.
  • link (Link) – The link object to be registered.
add_param(name, shape=None, dtype=<type 'numpy.float32'>, initializer=None)[source]¶

Registers a parameter to the link.

Deprecated since version v2.0.0: Assign a Parameter object directly to an attribute within an initialization scope instead. For example, the following code

link.add_param('W', shape=(5, 3))

can be replaced by the following assignment.

with self.init_scope():
    link.W = chainer.Parameter(None, (5, 3))

The latter one is easier for IDEs to keep track of the attribute’s type.

Parameters:
  • name (str) – Name of the parameter. This name is also used as the attribute name.
  • shape (int or tuple of ints) – Shape of the parameter array. If it is omitted, the parameter variable is left uninitialized.
  • dtype – Data type of the parameter array.
  • initializer – If it is not None, the data is initialized with the given initializer. If it is an array, the data is directly initialized by it. If it is callable, it is used as a weight initializer. Note that in these cases, dtype argument is ignored.
add_persistent(name, value)[source]¶

Registers a persistent value to the link.

The registered value is saved and loaded on serialization and deserialization. The value is set to an attribute of the link.

Parameters:
  • name (str) – Name of the persistent value. This name is also used for the attribute name.
  • value – Value to be registered.
addgrads(link)[source]¶
children()[source]¶
cleargrads()[source]¶

Clears all gradient arrays.

This method should be called before the backward computation at every iteration of the optimization.

copy()[source]¶
copyparams(link)[source]¶
disable_update()[source]¶

Disables update rules of all parameters under the link hierarchy.

This method sets the :attr:~chainer.UpdateRule.enabled` flag of the update rule of each parameter variable to False.

enable_update()[source]¶

Enables update rules of all parameters under the link hierarchy.

This method sets the enabled flag of the update rule of each parameter variable to True.

init_scope(*args, **kwds)[source]¶

Creates an initialization scope.

This method returns a context manager object that enables registration of parameters (and links for Chain) by an assignment. A Parameter object can be automatically registered by assigning it to an attribute under this context manager.

Example

In most cases, the parameter registration is done in the initializer method. Using the init_scope method, we can simply assign a Parameter object to register it to the link.

class MyLink(chainer.Link):
    def __init__(self):
        super().__init__()
        with self.init_scope():
            self.W = chainer.Parameter(0, (10, 5))
            self.b = chainer.Parameter(0, (5,))
links(skipself=False)[source]¶
namedlinks(skipself=False)[source]¶
namedparams(include_uninit=True)[source]¶
params(include_uninit=True)[source]¶
register_persistent(name)[source]¶

Registers an attribute of a given name as a persistent value.

This is a convenient method to register an existing attribute as a persistent value. If name has been already registered as a parameter, this method removes it from the list of parameter names and re-registers it as a persistent value.

Parameters:name (str) – Name of the attribute to be registered.
reset_state()[source]¶

Resets the internal state.

It sets None to the c and h attributes.

serialize(serializer)[source]¶
set_state(c, h)[source]¶

Sets the internal state.

It sets the c and h attributes.

Parameters:
  • c (Variable) – A new cell states of LSTM units.
  • h (Variable) – A new output at the previous time step.
to_cpu()[source]¶
to_gpu(device=None)[source]¶
zerograds()[source]¶

Initializes all gradient arrays by zero.

This method can be used for the same purpose of cleargrads, but less efficient. This method is left for backward compatibility.

Deprecated since version v1.15: Use cleargrads() instead.

Attributes

update_enabled¶

True if at least one parameter has an update rule enabled.

within_init_scope¶

True if the current code is inside of an initialization scope.

See init_scope() for the details of the initialization scope.

xp¶

Array module for this link.

Depending on which of CPU/GPU this link is on, this property returns numpy or cupy.

Next Previous

© Copyright 2015, Preferred Networks, inc. and Preferred Infrastructure, inc.. Revision 0f6e5bd1.

Built with Sphinx using a theme provided by Read the Docs.