chainer.links.NaryTreeLSTM¶

class chainer.links.NaryTreeLSTM(in_size, out_size, n_ary=2)[source]¶

N-ary TreeLSTM unit.

Warning

This feature is experimental. The interface can change in the future.

This is a N-ary TreeLSTM unit as a chain. This link is a fixed-length arguments function, which compounds the states of all children nodes into the new states of a current (parent) node. states denotes the cell state, \(c\), and the output, \(h\), which are produced by this link. This link doesn’t keep cell and hidden states internally.

For example, this link is called such as func(c1, c2, h1, h2, x) if the number of children nodes was set 2 (n_ary = 2), while func(c1, c2, c3, h1, h2, h3, x) if that was 3 (n_ary = 3). This function is dependent from an order of children nodes unlike Child-Sum TreeLSTM. Thus, the returns of func(c1, c2, h1, h2, x) are different from those of func(c2, c1, h2, h1, x).

Parameters:	in_size (int) – Dimension of input vectors. out_size (int) – Dimensionality of cell and output vectors. n_ary (int) – The number of children nodes in a tree structure.
Variables:	W_x (chainer.links.Linear) – Linear layer of connections from input vectors. W_h (chainer.links.Linear) – Linear layer of connections between (\(a\), \(i\), \(o\), all \(f\)) and the output of each child. \(a\), \(i\), \(o\) and \(f\) denotes input compound, input gate, output gate and forget gate, respectively. \(a\), input compound, equals to \(u\) in the paper by Tai et al.

See the papers for details: Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks, and A Fast Unified Model for Parsing and Sentence Understanding.

Tai et al.’s N-Ary TreeLSTM is little extended in Bowman et al., and this link is based on the variant by Bowman et al. Specifically, eq. 10 in Tai et al. has only one \(W\) matrix to be applied to \(x\), consistently for all children. On the other hand, Bowman et al.’s model has multiple matrices, each of which affects the forget gate for each child’s cell individually.

Methods

__call__(*cshsx)[source]¶

Returns new cell state and output of N-ary TreeLSTM.

Parameters:	cshsx (list of `Variable`) – Arguments which include all cell vectors and all output vectors of fixed-length children, and an input vector. The number of arguments must be same as `n_ary * 2 + 1`.
Returns:	Returns \((c_{new}, h_{new})\), where \(c_{new}\) represents new cell state vector, and \(h_{new}\) is new output vector.
Return type:	tuple of ~chainer.Variable

__getitem__(name)[source]¶: Equivalent to getattr.

add_link(name, link)[source]¶

Registers a child link to this chain.

Deprecated since version v2.0.0: Assign the child link directly to an attribute within init_scope() instead. For example, the following code

chain.add_link('l1', L.Linear(3, 5))

can be replaced by the following line.

with chain.init_scope():
    chain.l1 = L.Linear(3, 5)

The latter is easier for IDEs to keep track of the attribute’s type.

Parameters:	name (str) – Name of the child link. This name is also used as the attribute name. link (Link) – The link object to be registered.

add_param(name, shape=None, dtype=<class 'numpy.float32'>, initializer=None)[source]¶

Registers a parameter to the link.

Deprecated since version v2.0.0: Assign a Parameter object directly to an attribute within init_scope() instead. For example, the following code

link.add_param('W', shape=(5, 3))

can be replaced by the following assignment.

with link.init_scope():
    link.W = chainer.Parameter(None, (5, 3))

The latter is easier for IDEs to keep track of the attribute’s type.

Parameters:

name (str) – Name of the parameter. This name is also used as the attribute name.
shape (int or tuple of ints) – Shape of the parameter array. If it is omitted, the parameter variable is left uninitialized.
dtype – Data type of the parameter array.
initializer – If it is not None, the data is initialized with the given initializer. If it is an array, the data is directly initialized by it. If it is callable, it is used as a weight initializer. Note that in these cases, dtype argument is ignored.

add_persistent(name, value)[source]¶

Registers a persistent value to the link.

The registered value is saved and loaded on serialization and deserialization. The value is set to an attribute of the link.

Parameters:	name (str) – Name of the persistent value. This name is also used for the attribute name. value – Value to be registered.

addgrads(link)[source]¶

children()[source]¶

cleargrads()[source]¶

Clears all gradient arrays.

This method should be called before the backward computation at every iteration of the optimization.

copy()[source]¶

copyparams(link)[source]¶

disable_update()[source]¶

Disables update rules of all parameters under the link hierarchy.

This method sets the enabled flag of the update rule of each parameter variable to False.

enable_update()[source]¶

Enables update rules of all parameters under the link hierarchy.

This method sets the enabled flag of the update rule of each parameter variable to True.

init_scope()[source]¶

Creates an initialization scope.

This method returns a context manager object that enables registration of parameters (and links for Chain) by an assignment. A Parameter object can be automatically registered by assigning it to an attribute under this context manager.

Example

In most cases, the parameter registration is done in the initializer method. Using the init_scope method, we can simply assign a Parameter object to register it to the link.

class MyLink(chainer.Link):
    def __init__(self):
        super().__init__()
        with self.init_scope():
            self.W = chainer.Parameter(0, (10, 5))
            self.b = chainer.Parameter(0, (5,))

links(skipself=False)[source]¶

namedlinks(skipself=False)[source]¶

namedparams(include_uninit=True)[source]¶

params(include_uninit=True)[source]¶

register_persistent(name)[source]¶

Registers an attribute of a given name as a persistent value.

This is a convenient method to register an existing attribute as a persistent value. If name has been already registered as a parameter, this method removes it from the list of parameter names and re-registers it as a persistent value.

Parameters:	name (str) – Name of the attribute to be registered.

serialize(serializer)[source]¶

to_cpu()[source]¶

to_gpu(device=None)[source]¶

zerograds()[source]¶

Initializes all gradient arrays by zero.

This method can be used for the same purpose of cleargrads, but less efficient. This method is left for backward compatibility.

Deprecated since version v1.15: Use cleargrads() instead.