chainer.GradientMethod¶
- class chainer.GradientMethod[source]¶
Base class of all single gradient-based optimizers.
This is an extension of the Optimizer class. Typical gradient methods that just require the gradient at the current parameter vector on an update can be implemented as its child classes.
This class uses UpdateRule to manage the update rule of each parameter. A child class of GradientMethod should override create_update_rule() to create the default update rule of each parameter.
This class also provides hyperparam, which is the hyperparameter used as the default configuration of each update rule. All built-in gradient method implementations also provide proxy properties that act as aliases to the attributes of hyperparam. It is recommended to provide such an alias for each attribute; this can be done by adding a single line per attribute using HyperparameterProxy.
.- Variables
~GradientMethod.hyperparam (Hyperparameter) – The hyperparameter of the gradient method. It is used as the default configuration of each update rule (i.e., the hyperparameter of each update rule refers to this hyperparameter as its parent).
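The parent–child relationship between the optimizer's hyperparam and each update rule's hyperparameter, and the one-line proxy aliases, can be illustrated with a plain-Python sketch. Everything below (`Hyperparam`, `hyperparam_proxy`, `ToyGradientMethod`) is a simplified stand-in for illustration, not Chainer's actual implementation:

```python
class Hyperparam:
    """Simplified stand-in for a hyperparameter with a parent fallback."""
    def __init__(self, parent=None, **kwargs):
        self._parent = parent
        self._own = dict(kwargs)

    def __getattr__(self, name):
        # Fall back to the parent (the optimizer-level default) when this
        # hyperparameter does not override the attribute itself.
        if name in self.__dict__.get('_own', {}):
            return self._own[name]
        if self._parent is not None:
            return getattr(self._parent, name)
        raise AttributeError(name)


def hyperparam_proxy(name):
    """One-line-per-attribute alias, mimicking the HyperparameterProxy idea."""
    return property(
        lambda self: getattr(self.hyperparam, name),
        lambda self, value: self.hyperparam._own.update({name: value}),
    )


class ToyGradientMethod:
    lr = hyperparam_proxy('lr')  # one added line per aliased attribute

    def __init__(self):
        self.hyperparam = Hyperparam(lr=0.01)


opt = ToyGradientMethod()
child = Hyperparam(parent=opt.hyperparam)  # a per-parameter rule's hyperparam
opt.lr = 0.1    # write through the proxy alias...
print(child.lr)  # ...and the child, lacking its own value, sees 0.1
```

The design point this mimics: child hyperparameters store only their overrides, so changing the optimizer-level default is immediately visible to every update rule that has not overridden it.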
Methods
- add_hook(hook, name=None, timing='auto')[source]¶
Registers a hook function.
The hook function is typically called right after the gradient computation, though the exact timing depends on the optimization method and on the timing attribute.
- Parameters
hook (callable) – Hook function. If hook.call_for_each_param is true, this hook function is called for each parameter by passing the update rule and the parameter. Otherwise, this hook function is called only once each iteration by passing the optimizer.
name (str) – Name of the registration. If omitted, hook.name is used by default.
timing (str) – Specifies when the hook is called. If 'auto', the timing property of the hook decides the timing. If 'pre', the hook is called before any updates. If 'post', the hook is called after any updates.
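The two dispatch modes described for hook – per-parameter versus once-per-iteration – can be sketched in plain Python. The `call_hooks` helper, `ToyOptimizer`, and `Param` classes below are illustrative stand-ins, not Chainer's internal hook machinery:

```python
class Param:
    def __init__(self, name):
        self.name = name
        self.update_rule = object()  # placeholder for the parameter's rule


class ToyOptimizer:
    def __init__(self, params):
        self.target_params = params


def call_hooks(optimizer, hooks, timing):
    """Dispatch hooks as the docstring describes: per-parameter hooks get
    (update_rule, param); ordinary hooks get the optimizer, once."""
    for hook in hooks:
        if getattr(hook, 'timing', 'pre') != timing:
            continue
        if getattr(hook, 'call_for_each_param', False):
            for param in optimizer.target_params:
                hook(param.update_rule, param)
        else:
            hook(optimizer)


calls = []

def per_param_hook(rule, param):
    calls.append(('param', param.name))
per_param_hook.call_for_each_param = True
per_param_hook.timing = 'pre'

def global_hook(opt):
    calls.append(('global',))
global_hook.call_for_each_param = False
global_hook.timing = 'pre'

opt = ToyOptimizer([Param('W'), Param('b')])
call_hooks(opt, [per_param_hook, global_hook], 'pre')
print(calls)  # [('param', 'W'), ('param', 'b'), ('global',)]
```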
- create_update_rule()[source]¶
Creates a new update rule object.
This method creates an update rule object. It is called by setup() to set up the update rule of each parameter. Each implementation of the gradient method should override this method to provide the default update rule implementation.
- Returns
Update rule object.
- Return type
UpdateRule
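A minimal sketch of how a child class overrides create_update_rule(), using plain-Python stand-ins (`UpdateRuleBase`, `SGDRule`, `ToySGD`, and `Param` are hypothetical, not Chainer classes; the hyperparameter is a plain dict here):

```python
class UpdateRuleBase:
    """Illustrative stand-in for an update rule base class."""
    def __init__(self, parent_hyperparam=None):
        self.hyperparam = parent_hyperparam


class SGDRule(UpdateRuleBase):
    def update_param(self, param):
        # Vanilla SGD on plain lists: data <- data - lr * grad
        lr = self.hyperparam['lr']
        param.data = [d - lr * g for d, g in zip(param.data, param.grad)]


class ToySGD:
    """Child 'gradient method' providing its default rule."""
    def __init__(self, lr=0.1):
        self.hyperparam = {'lr': lr}

    def create_update_rule(self):
        # The default rule refers to the optimizer's hyperparam as its parent.
        return SGDRule(parent_hyperparam=self.hyperparam)


class Param:
    def __init__(self, data, grad):
        self.data, self.grad = data, grad


opt = ToySGD(lr=0.5)
rule = opt.create_update_rule()
p = Param([1.0, 2.0], [2.0, 2.0])
rule.update_param(p)
print(p.data)  # [0.0, 1.0]
```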
- new_epoch(auto=False)[source]¶
Starts a new epoch.
This method increments the epoch count. Note that if the optimizer depends on the epoch count, then the user should call this method appropriately at the beginning of each epoch.
- Parameters
auto (bool) – Should be True if this method is called by an updater. In this case, use_auto_new_epoch should be set to True by the updater.
- reallocate_cleared_grads()[source]¶
Reallocate gradients cleared by cleargrad().
This method allocates arrays for all gradients that are None. This method is called before and after every optimizer hook. If an inheriting optimizer does not require this allocation, it can override this method with a blank function.
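The allocation step can be sketched in plain Python. This stand-in uses lists of zeros where the real method would allocate zero-filled ndarrays matching each parameter's array:

```python
class Param:
    def __init__(self, data):
        self.data = data
        self.grad = None  # cleared, as after cleargrad()


def reallocate_cleared_grads(params):
    """Allocate a zero gradient of the same shape as the data wherever the
    gradient is None (illustrative list-based stand-in)."""
    for p in params:
        if p.grad is None:
            p.grad = [0.0] * len(p.data)


params = [Param([1.0, 2.0, 3.0]), Param([4.0])]
reallocate_cleared_grads(params)
print([p.grad for p in params])  # [[0.0, 0.0, 0.0], [0.0]]
```

This is why the method is safe to call before and after every hook: it only touches parameters whose gradient is still None and leaves existing gradients untouched.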
- remove_hook(name)[source]¶
Removes a hook function.
- Parameters
name (str) – Registered name of the hook function to remove.
- serialize(serializer)[source]¶
Serializes or deserializes the optimizer.
It only saves or loads the following things:
- The optimizer states (the state of each parameter's update rule)
- The global states t and epoch
It does not save or load the parameters of the target link. They should be saved or loaded separately.
- Parameters
serializer (AbstractSerializer) – Serializer or deserializer object.
- setup(link)[source]¶
Sets a target link and initializes the optimizer states.
The given link is set to the target attribute. It also prepares the optimizer state dictionaries corresponding to all parameters in the link hierarchy. Any existing states are discarded.
- Parameters
link (Link) – Target link object.
- Returns
The optimizer instance.
Note
As of v4.0.0, this function returns the optimizer instance itself so that you can instantiate and set up the optimizer in one line, e.g., optimizer = SomeOptimizer().setup(link).
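The return-self pattern behind the one-line idiom can be sketched with toy classes (`ToyOptimizer` and `Link` below are hypothetical stand-ins, not Chainer's API):

```python
class Link:
    def __init__(self, params):
        self.params = params  # parameter names, standing in for a hierarchy


class ToyOptimizer:
    def __init__(self):
        self.target = None
        self._states = {}

    def setup(self, link):
        # Bind the new target and rebuild per-parameter state dictionaries,
        # discarding any existing states.
        self.target = link
        self._states = {name: {} for name in link.params}
        return self  # returning self enables instantiation + setup in one line


link = Link(['W', 'b'])
opt = ToyOptimizer().setup(link)  # the one-line idiom from the note above
print(opt.target is link, sorted(opt._states))  # True ['W', 'b']
```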
- update(lossfun=None, *args, **kwds)[source]¶
Updates parameters based on a loss function or computed gradients.
This method runs in two ways.
- If lossfun is given, it is used as a loss function to compute gradients.
- Otherwise, this method assumes that the gradients have already been computed.
In both cases, the computed gradients are used to update parameters. The actual update routines are defined by the update rule of each parameter.
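The two modes can be sketched with a toy optimizer. Note one simplification: in Chainer, lossfun returns a loss variable and gradients come from backpropagation; the stand-in below shortcuts this by having lossfun return the gradient directly. `ToyOptimizer` and `Param` are hypothetical illustrations:

```python
class Param:
    def __init__(self, data, grad=None):
        self.data, self.grad = data, grad


class ToyOptimizer:
    def __init__(self, lr=0.1):
        self.lr = lr
        self.t = 0

    def update(self, param, lossfun=None, *args, **kwds):
        if lossfun is not None:
            # Mode 1: use the loss function to (re)compute the gradient.
            # (Simplified: the stand-in lossfun returns the gradient itself.)
            param.grad = lossfun(param, *args, **kwds)
        # Mode 2 (lossfun omitted): assume param.grad was computed already.
        self.t += 1
        # The actual update routine (here: plain SGD) would normally be
        # delegated to the parameter's update rule.
        param.data = [d - self.lr * g for d, g in zip(param.data, param.grad)]


opt = ToyOptimizer(lr=0.5)

# Mode 1: gradient of 0.5 * sum(x^2) is x itself
p1 = Param([1.0, -1.0])
opt.update(p1, lossfun=lambda param: list(param.data))
print(p1.data)  # [0.5, -0.5]

# Mode 2: gradient precomputed, lossfun omitted
p2 = Param([1.0], grad=[2.0])
opt.update(p2)
print(p2.data)  # [0.0]
```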
- use_cleargrads(use=True)[source]¶
Enables or disables use of cleargrads() in update().
- Parameters
use (bool) – If True, this function enables use of cleargrads. If False, it disables use of cleargrads (zerograds is used instead).
Deprecated since version v2.0: Note that update() calls cleargrads() by default. cleargrads() is more efficient than zerograds(), so one does not have to call use_cleargrads(). This method remains for backward compatibility.
- __eq__(value, /)¶
Return self==value.
- __ne__(value, /)¶
Return self!=value.
- __lt__(value, /)¶
Return self<value.
- __le__(value, /)¶
Return self<=value.
- __gt__(value, /)¶
Return self>value.
- __ge__(value, /)¶
Return self>=value.
Attributes
- epoch = 0¶
- t = 0¶
- target = None¶
- use_auto_new_epoch = False¶