chainer.optimizers.SMORMS3
class chainer.optimizers.SMORMS3(lr=0.001, eps=1e-16)

Simon Funk's SMORMS3.
See http://sifter.org/~simon/journal/20150420.html.
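The per-parameter update rule described in the linked post can be sketched in pure Python as follows. This is an illustrative sketch, not Chainer's implementation; the `(mem, g, g2)` state tuple is a hypothetical layout.

```python
import math

def smorms3_step(param, grad, state, lr=0.001, eps=1e-16):
    # One SMORMS3 update for a single scalar parameter, following the
    # description in the blog post linked above. The (mem, g, g2) state
    # layout is illustrative, not Chainer's internal format.
    mem, g, g2 = state
    r = 1.0 / (mem + 1.0)                  # adaptive mixing rate
    g = (1.0 - r) * g + r * grad           # running mean of the gradient
    g2 = (1.0 - r) * g2 + r * grad * grad  # running mean of the squared gradient
    x = g * g / (g2 + eps)                 # signal-vs-noise estimate in [0, 1]
    param -= grad * min(lr, x) / (math.sqrt(g2) + eps)
    mem = 1.0 + mem * (1.0 - x)            # long memory when noisy, short when clean
    return param, (mem, g, g2)

# Usage: minimize f(p) = p**2 from p = 5, whose gradient is 2*p.
p, state = 5.0, (1.0, 0.0, 0.0)
for _ in range(1000):
    p, state = smorms3_step(p, 2.0 * p, state, lr=0.1)
```

The learning rate caps the step size, while the `x` ratio shrinks steps automatically when recent gradients disagree in sign.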
Parameters:
- lr (float) – Learning rate.
- eps (float) – Small value for numerical stability.

Methods
add_hook(hook, name=None)

Registers a hook function.

Hook functions are typically called right after the gradient computation, though the exact timing depends on the optimization method.

Parameters:
- hook (function) – Hook function. If hook.call_for_each_param is true, this hook function is called for each parameter by passing the update rule and the parameter. Otherwise, it is called only once each iteration by passing the optimizer.
- name (str) – Name of the registration. If omitted, hook.name is used by default.
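The dispatch described above (per-parameter hooks versus whole-optimizer hooks) can be sketched with a minimal stand-in. `MiniOptimizer` and `decay_hook` are illustrative names, not Chainer API.

```python
class MiniOptimizer:
    # Minimal stand-in (not the real Chainer class) showing how hook
    # registration and dispatch work as described above.
    def __init__(self, params):
        self.params = params     # parameter-like objects
        self._hooks = {}         # name -> hook, in registration order

    def add_hook(self, hook, name=None):
        if name is None:
            # Fall back to the hook's own name, mirroring the documented default.
            name = getattr(hook, 'name', hook.__name__)
        if name in self._hooks:
            raise KeyError('hook %r already registered' % name)
        self._hooks[name] = hook

    def call_hooks(self):
        for hook in self._hooks.values():
            if getattr(hook, 'call_for_each_param', False):
                # Called once per parameter with (update rule, parameter);
                # the update rule is elided in this sketch.
                for param in self.params:
                    hook(None, param)
            else:
                hook(self)       # called once per iteration with the optimizer

# A per-parameter hook that decays every gradient.
def decay_hook(rule, param):
    param['grad'] *= 0.9
decay_hook.call_for_each_param = True

opt = MiniOptimizer([{'grad': 1.0}, {'grad': 2.0}])
opt.add_hook(decay_hook, name='decay')
opt.call_hooks()
```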
new_epoch()

Starts a new epoch.

This method increments the epoch count. Note that if the optimizer depends on the epoch count, the user should call this method at the beginning of each epoch.
reallocate_cleared_grads()

Reallocates gradients cleared by cleargrad().

This method allocates arrays for all gradients that are None. It is called before and after every optimizer hook. If an inheriting optimizer does not require this allocation, it can override this method with a blank function.
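The reallocation step can be pictured as follows. This is a sketch of the behaviour, not the actual implementation; Chainer allocates NumPy/CuPy arrays, while plain lists stand in here.

```python
def reallocate_cleared_grads(params):
    # Any parameter whose gradient was cleared to None gets a fresh
    # zero gradient of matching size; existing gradients are untouched.
    for param in params:
        if param['grad'] is None:
            param['grad'] = [0.0] * len(param['data'])

params = [{'data': [1.0, 1.0, 1.0], 'grad': None},    # cleared: reallocated
          {'data': [1.0, 1.0], 'grad': [0.5, 0.5]}]   # untouched
reallocate_cleared_grads(params)
```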
remove_hook(name)

Removes a hook function.

Parameters: name (str) – Registered name of the hook function to remove.
serialize(serializer)

Serializes or deserializes the optimizer.

It only saves or loads the following things:
- Optimizer states
- Global states (t and epoch)

It does not save or load the parameters of the target link. They should be saved or loaded separately.

Parameters: serializer (AbstractSerializer) – Serializer or deserializer object.
update(lossfun=None, *args, **kwds)

Updates parameters based on a loss function or computed gradients.

This method runs in two ways:
- If lossfun is given, it is used as a loss function to compute gradients.
- Otherwise, this method assumes that the gradients have already been computed.

In both cases, the computed gradients are used to update the parameters. The actual update routines are defined by the update rule of each parameter.
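The two calling modes can be sketched as follows. This is illustrative only, not Chainer's implementation; plain SGD stands in for the per-parameter update rule, and `loss_grad` is a hypothetical loss function.

```python
def update(params, lossfun=None, *args):
    # Sketch of the two calling modes described above.
    if lossfun is not None:
        # Mode 1: clear old gradients, then let the loss function fill them in.
        for p in params:
            p['grad'] = None
        lossfun(params, *args)
    # In both modes, the stored gradients drive the parameter update
    # (plain SGD with a fixed step stands in for the real update rule).
    for p in params:
        p['data'] -= 0.1 * p['grad']

params = [{'data': 1.0, 'grad': None}]

def loss_grad(ps):
    # Hypothetical loss 0.5 * data**2, whose gradient is data itself.
    for p in ps:
        p['grad'] = p['data']

update(params, loss_grad)  # mode 1: gradients computed from the loss
update(params)             # mode 2: reuses the already-stored gradient
```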
use_cleargrads(use=True)

Enables or disables use of cleargrads() in update().

Parameters: use (bool) – If True, this function enables use of cleargrads(). If False, it disables use of cleargrads() (zerograds() is used instead).

Deprecated since version v2.0: Note that update() calls cleargrads() by default. cleargrads() is more efficient than zerograds(), so one does not have to call use_cleargrads(). This method remains for backward compatibility.
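The efficiency difference between the two strategies can be sketched as follows. These are illustrative stand-ins, not the Link API; plain lists stand in for gradient arrays.

```python
def zerograds(params):
    # Older strategy: overwrite every gradient with zeros in place,
    # paying a full write over each gradient array every iteration.
    for p in params:
        for i in range(len(p['grad'])):
            p['grad'][i] = 0.0

def cleargrads(params):
    # Newer strategy: drop the gradients entirely; fresh arrays are
    # allocated lazily before the next accumulation, skipping the zero-fill.
    for p in params:
        p['grad'] = None

zeroed = [{'grad': [1.0, 2.0]}]
cleared = [{'grad': [1.0, 2.0]}]
zerograds(zeroed)    # gradient stays allocated, now all zeros
cleargrads(cleared)  # gradient is gone until recomputed
```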
Attributes

eps
Alias to self.hyperparam.eps

lr
Alias to self.hyperparam.lr