chainer.functions.softmax_cross_entropy

chainer.functions.softmax_cross_entropy(x, t, normalize=True, cache_score=True, class_weight=None, ignore_label=-1, reduce='mean', enable_double_backprop=False)[source]

Computes cross entropy loss for pre-softmax activations.
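
For orientation, with the defaults reduce='mean' and normalize=True the returned value corresponds to the standard definition sketched below (added here for clarity; not a formula from this page), where N is the number of instances whose label is not ignore_label:

    L = -\frac{1}{N} \sum_{n} \log \frac{\exp(x_{n, t_n})}{\sum_{c} \exp(x_{n, c})}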

Parameters:
  • x (Variable or numpy.ndarray or cupy.ndarray) – Variable holding a multidimensional array whose elements indicate unnormalized log probabilities: the first axis of the variable represents the number of samples, and the second axis represents the number of classes. This function computes the usual softmax cross entropy if the number of dimensions is equal to 2, and the cross entropy of the replicated softmax if the number of dimensions is greater than 2 (see the shape sketch at the end of the Example section).
  • t (Variable or numpy.ndarray or cupy.ndarray) – Variable holding a signed integer vector of ground truth labels. If t[i] == ignore_label, the corresponding x[i] is ignored.
  • normalize (bool) – If True, this function normalizes the cross entropy loss across all instances. If False, it normalizes the loss only by the batch size.
  • cache_score (bool) – If True, the function stores the result of the forward computation and reuses it in the backward computation, which reduces computational cost at the expense of extra memory. If the enable_double_backprop option is True, this option is forcibly turned off and the function does not cache the intermediate value.
  • class_weight (Variable or numpy.ndarray or cupy.ndarray) – An array of constant class weights that are multiplied with the loss values along the second dimension. The shape of this array should be (x.shape[1],). If this is not None, each class weight class_weight[i] is multiplied with y[:, i], where y is the log-softmax output of x (and has the same shape as x), before the actual loss value is computed.
  • ignore_label (int) – Label value you want to ignore. Its default value is -1. See the description of the argument t.
  • reduce (str) – A string that determines whether to reduce the loss values. If it is 'mean', this function computes the sum of the individual cross entropies and normalizes it according to the normalize option. If it is 'no', it computes the cross entropy for each instance and does not normalize it (the normalize option is ignored). In this case, the loss value of an ignored instance, whose target value is ignore_label, is set to 0 (see the sketch after this parameter list).
  • enable_double_backprop (bool) – If True, this function uses an implementation that supports higher-order differentiation. If False, it uses a single-backprop implementation. The single-backprop version is used by default because we expect it to be faster, so if you need second or higher derivatives, you need to turn this option on explicitly.
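
A minimal sketch of the ignore_label and reduce='no' behavior (not part of the official example; it assumes the conventional imports import numpy as np and import chainer.functions as F):

>>> x = np.array([[-1, 0, 1, 2], [2, 0, 1, -1]]).astype(np.float32)
>>> t = np.array([3, -1]).astype(np.int32)  # the second instance uses ignore_label
>>> loss = F.softmax_cross_entropy(x, t, reduce='no')
>>> loss.shape  # one loss value per instance, same shape as t
(2,)
>>> float(loss.array[1])  # the ignored instance contributes 0
0.0
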
Returns:

A variable holding the cross entropy loss. If reduce is 'mean', it is a scalar array. If reduce is 'no', the shape is the same as that of t.

Return type:

Variable

Note

This function is differentiable only by x.
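
A hedged sketch of what this means in practice (assuming the usual imports import numpy as np, import chainer, and import chainer.functions as F): gradients from the loss propagate to x, while the integer label array t receives no gradient.

>>> xv = chainer.Variable(np.array([[-1, 0, 1, 2]], dtype=np.float32))
>>> loss = F.softmax_cross_entropy(xv, np.array([3], dtype=np.int32))
>>> loss.backward()
>>> xv.grad.shape
(1, 4)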

Example

>>> x = np.array([[-1, 0, 1, 2], [2, 0, 1, -1]]).astype(np.float32)
>>> x
array([[-1.,  0.,  1.,  2.],
       [ 2.,  0.,  1., -1.]], dtype=float32)
>>> t = np.array([3, 0]).astype(np.int32)
>>> t
array([3, 0], dtype=int32)
>>> y = F.softmax_cross_entropy(x, t)
>>> y
variable(0.44018972)
>>> neg_log_softmax = -F.log_softmax(x)
>>> expected_loss = np.mean([neg_log_softmax[row, column].data for row, column in enumerate(t)])
>>> y.array == expected_loss
True
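
The case with more than two dimensions, i.e. the replicated softmax mentioned in the description of x, can be illustrated by shapes alone. This is a hedged sketch rather than part of the official example: x has shape (batch, classes, d1) and t has shape (batch, d1).

>>> x = np.random.uniform(-1, 1, (2, 3, 4)).astype(np.float32)
>>> t = np.random.randint(0, 3, size=(2, 4)).astype(np.int32)
>>> F.softmax_cross_entropy(x, t).shape  # reduce='mean' gives a scalar
()
>>> F.softmax_cross_entropy(x, t, reduce='no').shape  # one loss per position, same shape as t
(2, 4)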