chainer.functions.hinge

chainer.functions.hinge(x, t, norm='L1', reduce='mean')

Computes the hinge loss for a one-of-many classification task.

\[L = \frac{1}{N} \sum_{n=1}^N \sum_{k=1}^K \left[ \max(0, 1 - \delta\{t_n = k\} x_{nk}) \right]^p\]

where \(N\) denotes the batch size and \(K\) is the number of classes of interest,

\[\begin{split}\delta \{ {\rm condition} \} = \left \{ \begin{array}{cc} 1 & {\rm if~condition\ is\ true} \\ -1 & {\rm otherwise,} \end{array} \right.\end{split}\]

and

\[\begin{split}p = \left \{ \begin{array}{cc} 1 & {\rm if~norm} = {\rm L1} \\ 2 & {\rm if~norm} = {\rm L2.} \end{array} \right.\end{split}\]

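Let the hinge loss function \(l(x, \delta)\) be \(\left[\max(0, 1 - \delta x) \right]^p\). When \(x\) and \(\delta\) have the same sign (meaning \(x\) predicts the proper score for classification) and \(|x| \geq 1\), the hinge loss is \(l(x, \delta) = 0\). When they have the same sign but \(|x| < 1\), the loss is \((1 - |x|)^p\), a penalty for falling inside the margin. When they have opposite signs, \(l(x, \delta) = (1 + |x|)^p\), which grows linearly with \(|x|\) for the L1 norm and quadratically for the L2 norm.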
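The three cases can be checked numerically with a minimal sketch of the per-element term. The helper name hinge_term is hypothetical and not part of Chainer; it only evaluates \(l(x, \delta)\) for scalar inputs.

```python
def hinge_term(x, delta, p=1):
    """Per-element hinge loss l(x, delta) = max(0, 1 - delta * x) ** p.

    hinge_term is a hypothetical helper for illustration, not a Chainer API.
    delta is +1 for the target class and -1 otherwise;
    p is 1 for the L1 norm and 2 for the L2 norm.
    """
    return max(0.0, 1.0 - delta * x) ** p


# Same sign and |x| >= 1: no loss.
print(hinge_term(3.0, +1))       # 0.0
# Same sign but inside the margin (|x| < 1): small positive loss.
print(hinge_term(0.5, +1))       # 0.5
# Opposite sign: 1 + |x| for L1, (1 + |x|) ** 2 for L2.
print(hinge_term(2.0, -1))       # 3.0
print(hinge_term(2.0, -1, p=2))  # 9.0
```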

The output is a variable whose value depends on the reduce option. If it is 'no', the output holds the elementwise loss values. If it is 'mean', the output holds the sum of the loss values divided by the batch size \(N\), as in the formula for \(L\) above.

Parameters
  • x (Variable or N-dimensional array) – Input variable. The shape of x should be (\(N\), \(K\)).

  • t (Variable or N-dimensional array) – The \(N\)-dimensional label vector with values \(t_n \in \{0, 1, 2, \dots, K-1\}\). The shape of t should be (\(N\),).

  • norm (str) – Specifies the norm type. Either 'L1' or 'L2' is acceptable.

  • reduce (str) – Reduction option. Its value must be either 'mean' or 'no'. Otherwise, ValueError is raised.

Returns

A variable object holding the hinge loss. If reduce is 'no', the output variable holds an array with the same shape as x, namely (\(N\), \(K\)). If it is 'mean', the output variable holds the scalar value \(L\).

Return type

Variable

Example

In this case, the batch size N is 2 and the number of classes K is 3.

>>> import numpy as np
>>> import chainer.functions as F
>>> x = np.array([[-2.0, 3.0, 0.5],
...               [5.0, 2.0, -0.5]]).astype(np.float32)
>>> x
array([[-2. ,  3. ,  0.5],
       [ 5. ,  2. , -0.5]], dtype=float32)
>>> t = np.array([1, 0]).astype(np.int32)
>>> t
array([1, 0], dtype=int32)
>>> F.hinge(x, t)
variable(2.5)
>>> F.hinge(x, t, reduce='no')
variable([[0. , 0. , 1.5],
          [0. , 3. , 0.5]])
>>> F.hinge(x, t, norm='L2')
variable(5.75)
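The values in this example can be reproduced by hand from the formula for \(L\). The following is a plain-Python sketch, not Chainer's implementation: hinge_reference is a hypothetical helper that works on nested lists and uses explicit loops to mirror the double sum over \(n\) and \(k\).

```python
def hinge_reference(x, t, norm="L1", reduce="mean"):
    """Plain-Python sketch of the hinge loss formula (hypothetical helper).

    x: list of N rows with K scores each; t: list of N integer labels.
    """
    if reduce not in ("mean", "no"):
        raise ValueError("reduce must be 'mean' or 'no'")
    p = {"L1": 1, "L2": 2}[norm]
    losses = []
    for row, label in zip(x, t):
        # delta is +1 for the target class and -1 for every other class.
        losses.append([
            max(0.0, 1.0 - (1.0 if k == label else -1.0) * score) ** p
            for k, score in enumerate(row)
        ])
    if reduce == "no":
        return losses
    # 'mean' divides the total by the batch size N, not by N * K.
    return sum(sum(r) for r in losses) / len(x)


x = [[-2.0, 3.0, 0.5], [5.0, 2.0, -0.5]]
t = [1, 0]
print(hinge_reference(x, t))               # 2.5
print(hinge_reference(x, t, reduce="no"))  # [[0.0, 0.0, 1.5], [0.0, 3.0, 0.5]]
print(hinge_reference(x, t, norm="L2"))    # 5.75
```

With the L1 norm the six elementwise losses sum to 5.0, and dividing by \(N = 2\) gives 2.5. With the L2 norm the nonzero terms become 2.25, 9.0 and 0.25, so the total 11.5 divided by 2 gives 5.75.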