class chainer.links.ResNet101Layers(pretrained_model='auto', downsample_fb=False)
A pre-trained CNN model with 101 layers provided by MSRA.
When you specify the path of the pre-trained chainer model serialized as
a .npz file in the constructor, this chain model automatically
initializes all the parameters with it.
This model would be useful when you want to extract a semantic feature
vector per image, or fine-tune the model on a different dataset.
Note that unlike VGG16Layers, it does not automatically download a
pre-trained caffemodel. This caffemodel can be downloaded at
If you want to manually convert the pre-trained caffemodel to a chainer
model that can be specified in the constructor,
please use convert_caffemodel_to_npz classmethod instead.
ResNet101 has 44,549,224 trainable parameters, which is 43% fewer than the
ResNet152 model, while its top-5 classification accuracy on the ImageNet
dataset drops by 1.1% relative to ResNet152. In many cases, ResNet50 may offer
the best balance between accuracy and model size.
pretrained_model (str) – the path of the pre-trained
chainer model serialized as a .npz file.
If this argument is specified as auto,
it automatically loads and converts the caffemodel from
where $CHAINER_DATASET_ROOT is set as
$HOME/.chainer/dataset unless you specify another value
by modifying the environment variable. Note that in this case the
converted chainer model is stored in the same directory and is
automatically used from the next time onward.
If this argument is specified as None, the parameters
are not initialized by the pre-trained model; instead, they use the
default initializer from the original paper.
downsample_fb (bool) – If this argument is specified as False,
downsampling is performed by placing stride 2
on the 1x1 convolutional layers (the original MSRA ResNet).
If this argument is specified as True, downsampling is performed
by placing stride 2 on the 3x3 convolutional layers.
available_layers (list of str) – The list of available layer names
used by forward and extract methods.
name (str) – Name of the parameter. This name is also used as the
attribute name.
shape (int or tuple of ints) – Shape of the parameter array. If it
is omitted, the parameter variable is left uninitialized.
dtype – Data type of the parameter array.
initializer – If it is not None, the data is initialized with
the given initializer. If it is an array, the data is directly
initialized by it. If it is callable, it is used as a weight
initializer. Note that in these cases, the dtype argument is ignored.
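The dispatch rule above (an array initializes the data directly, a callable is used as a weight initializer, None leaves the parameter uninitialized) can be sketched in plain Python. This is a hypothetical illustration, not Chainer's implementation; init_param and constant_half are made-up names, and a list stands in for the parameter array:

```python
def constant_half(data):
    # A toy weight initializer: fills the buffer in place.
    data[:] = [0.5] * len(data)

def init_param(size, initializer=None):
    """Hypothetical sketch (not Chainer's API) of the dispatch rule:
    array -> data is directly initialized by it;
    callable -> used as a weight initializer;
    None -> the parameter is left uninitialized."""
    if initializer is None:
        return None                      # left uninitialized
    if callable(initializer):
        data = [0.0] * size
        initializer(data)                # callable fills the fresh buffer
        return data
    return list(initializer)             # array: data copied in directly

p_array = init_param(3, [1.0, 2.0, 3.0])   # array case
p_call = init_param(3, constant_half)      # callable case
p_none = init_param(3)                     # uninitialized case
```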
The whole hierarchy rooted by this link is copied. There are three
modes to perform the copy; please see the documentation of the mode
argument. The name of the link is reset on the copy, since the copied
instance does not belong to the original parent chain (even if one exists).
mode (str) – It should be either init, copy, or share.
init means parameter variables under the returned link
object are re-initialized by calling their
initialize() method, so that all the
parameters may have different initial values from the original link.
copy means that the link object is deeply copied, so that
its parameters are not re-initialized but are also deeply
copied. Thus, all parameters have same initial values but can
be changed independently.
share means that the link is shallowly copied, so that its
parameters’ arrays are shared with the original one. Thus,
their values are changed synchronously. The default mode is share.
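The three modes can be illustrated with plain Python objects. This is only an analogy built on copy.copy and copy.deepcopy, not Chainer's actual implementation; TinyLink is a toy class and a list stands in for a parameter array:

```python
import copy

class TinyLink:
    """Toy stand-in for a link whose parameter W is a plain list."""
    def __init__(self, w):
        self.W = w

original = TinyLink([1.0, 2.0])

# 'share': shallow copy -- the parameter array is the same object,
# so values change synchronously between the two links.
shared = copy.copy(original)
shared.W[0] = 9.0          # visible through original.W as well

# 'copy': deep copy -- same initial values, independent arrays.
deep = copy.deepcopy(original)
deep.W[0] = -1.0           # original.W is unaffected

# 'init': deep copy followed by re-initialization,
# so parameters may take different initial values.
fresh = copy.deepcopy(original)
fresh.W = [0.0, 0.0]       # stand-in for calling initialize()
```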
This method copies data arrays of all parameters in the hierarchy. The
copy is even done across the host and devices. Note that this method
does not copy the gradient arrays.
From v5.0.0: this method also copies the persistent values (e.g. the
moving statistics of BatchNormalization). If
the persistent value is an ndarray, the elements are copied. Otherwise,
it is copied using copy.deepcopy(). The old behavior (not copying
persistent values) can be reproduced with copy_persistent=False.
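The persistent-value rule described above (elementwise copy for ndarrays, copy.deepcopy() otherwise) can be sketched as follows. This is an illustrative sketch, not Chainer's code: copy_persistents is a made-up helper, dicts stand in for links, and a list stands in for an ndarray:

```python
import copy

def copy_persistents(dst, src, copy_persistent=True):
    """Sketch of the v5.0.0 rule: for each persistent value, copy
    elements in place when it is an array (a list stands in for an
    ndarray here), and use copy.deepcopy() otherwise."""
    if not copy_persistent:
        return  # old (pre-v5) behavior: persistent values untouched
    for name, value in src.items():
        if isinstance(value, list):
            dst[name][:] = value              # elementwise copy
        else:
            dst[name] = copy.deepcopy(value)  # generic deep copy

# e.g. the moving statistics of a BatchNormalization-like layer
src = {'avg_mean': [0.1, 0.2], 'N': 10}
dst = {'avg_mean': [0.0, 0.0], 'N': 0}
copy_persistents(dst, src)
```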
The difference from directly executing forward is that
this method accepts images as input and automatically
transforms them into a proper variable. That is,
it can also be interpreted as a shortcut method that implicitly calls
the prepare and forward functions.
Unlike the predict method, this method does not override the
chainer.config.train and chainer.config.enable_backprop
configuration. If you want to extract features without updating
model parameters, you need to manually set configuration when
calling this method as follows:
# model is an instance of ResNetLayers (50, 101, or 152 layers)
with chainer.using_config('train', False):
    with chainer.using_config('enable_backprop', False):
        feature = model.extract([image])
test and volatile arguments are not supported
anymore since v2. Instead, users should configure
training and back-propagation modes with the train and
enable_backprop configurations.
Note that the default behavior of this method differs
between v1 and later versions. Specifically,
the default value of test in v1 was True (test mode),
but that of chainer.config.train is also True
(train mode). Therefore, users need to explicitly switch
train to False to run the code in test mode and
enable_backprop to False to turn off
computational graph construction.
This method returns a context manager object that enables registration
of parameters (and links for Chain) by an assignment.
A Parameter object can be automatically registered
by assigning it to an attribute under this context manager.
In most cases, the parameter registration is done in the
initializer method. Using the init_scope method, we can
simply assign a Parameter object to register
it to the link.
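The registration-by-assignment pattern can be sketched with a minimal context manager. MiniLink and MiniParameter are toy classes for illustration only, not Chainer's Link and Parameter; the real implementation differs:

```python
import contextlib

class MiniParameter:
    """Toy stand-in for chainer.Parameter."""
    def __init__(self, data=None):
        self.data = data

class MiniLink:
    """Toy link that registers parameters assigned inside init_scope."""
    def __init__(self):
        self._params = set()
        self._within_init_scope = False

    @contextlib.contextmanager
    def init_scope(self):
        # While this context is active, assigning a MiniParameter
        # to an attribute also registers its name.
        prev, self._within_init_scope = self._within_init_scope, True
        try:
            yield
        finally:
            self._within_init_scope = prev

    def __setattr__(self, name, value):
        if (getattr(self, '_within_init_scope', False)
                and isinstance(value, MiniParameter)):
            self._params.add(name)      # registered by the assignment
        super().__setattr__(name, value)

link = MiniLink()
with link.init_scope():
    link.W = MiniParameter([1.0, 2.0])  # registered automatically
link.b = MiniParameter([0.0])           # outside the scope: not registered
```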
Registers an attribute of a given name as a persistent value.
This is a convenient method to register an existing attribute as a
persistent value. If name has been already registered as a
parameter, this method removes it from the list of parameter names
and re-registers it as a persistent value.
name (str) – Name of the attribute to be registered.
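The re-registration rule above (a name already registered as a parameter is removed from the parameter names and re-registered as a persistent value) can be mirrored on a toy class. This is a sketch of the documented behavior, not Chainer's implementation:

```python
class MiniLink:
    """Toy link tracking parameter and persistent names as sets."""
    def __init__(self):
        self._param_names = {'W'}   # 'W' starts out registered as a parameter
        self._persistent = set()

    def add_persistent(self, name):
        # If the name was registered as a parameter, remove it from
        # the parameter names and re-register it as a persistent value.
        self._param_names.discard(name)
        self._persistent.add(name)

link = MiniLink()
link.add_persistent('W')
```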
The net object contains 16 blocks, each of which is
ConvBNReLU. Since the mode was init, each block
is re-initialized with different parameters. If you give
copy to this argument, each block has the same parameter
values, but its object ID is different from the others. If it is
share, each block is identical to the others in terms of not only
the parameters but also the object IDs, because the blocks are
shallow-copied; when the parameter of one block is changed, the
parameters of all the other blocks change as well.
mode (str) – It should be either init, copy, or share.
init means parameters of each repeated element in the
returned Sequential will be re-initialized,
so that all elements have different initial parameters.
copy means that the parameters will not be re-initialized
but object itself will be deep-copied, so that all elements
have same initial parameters but can be changed independently.
share means that all the elements constituting the resulting
Sequential object are the same object because
they are shallow-copied, so that all parameters of the elements
are shared with each other.
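A minimal repeat helper illustrates the three modes described above. This is toy code, not Chainer's Sequential.repeat; Block is a made-up class, a list stands in for a parameter array, and random re-initialization stands in for calling a fresh initializer:

```python
import copy
import random

class Block:
    """Toy stand-in for a repeated element such as ConvBNReLU."""
    def __init__(self, w):
        self.w = w

def repeat(block, n, mode='init'):
    """Toy illustration of the init/copy/share modes (not Chainer code)."""
    if mode == 'share':
        # Shallow: every element is the very same object, so all
        # parameters are shared.
        return [block] * n
    # Deep copies: independent objects with the same initial values.
    blocks = [copy.deepcopy(block) for _ in range(n)]
    if mode == 'init':
        for b in blocks:
            # Stand-in for re-initialization: fresh random values,
            # so elements get different initial parameters.
            b.w = [random.random() for _ in b.w]
    return blocks

base = Block([1.0, 2.0])
shared = repeat(base, 3, 'share')   # same object IDs
copied = repeat(base, 3, 'copy')    # same values, independent objects
```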