chainer.functions.contrastive

chainer.functions.contrastive(x0, x1, y, margin=1, reduce='mean')

Computes contrastive loss.

It takes a pair of samples and a label as inputs. The label is \(1\) when those samples are similar, or \(0\) when they are dissimilar.

Let \(N\) and \(K\) denote mini-batch size and the dimension of input variables, respectively. The shape of both input variables x0 and x1 should be (N, K). The loss value of the \(n\)-th sample pair \(L_n\) is

\[L_n = \frac{1}{2} \left( y_n d_n^2 + (1 - y_n) \max ({\rm margin} - d_n, 0)^2 \right)\]

where \(d_n = \| {\bf x_0}_n - {\bf x_1}_n \|_2\) is the Euclidean distance between the pair, and \({\bf x_0}_n\) and \({\bf x_1}_n\) are the \(n\)-th rows of x0 and x1, each a \(K\)-dimensional vector.
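
As a quick numerical illustration of the two terms, take the hypothetical values \(d_n = 0.6\) and \({\rm margin} = 1\) (chosen only for this example):

\[y_n = 1:\quad L_n = \frac{1}{2} \cdot 0.6^2 = 0.18, \qquad y_n = 0:\quad L_n = \frac{1}{2} \max (1 - 0.6, 0)^2 = 0.08\]

A similar pair is penalized by its squared distance, while a dissimilar pair contributes only when it lies inside the margin; for \(d_n \ge {\rm margin}\) its loss is \(0\).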

The output is a variable whose value depends on the reduce option. If it is 'no', the output holds the loss value of each sample pair. If it is 'mean', the output is the mean of the loss values over the mini-batch.
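
The formula and the two reduction modes can be sketched in plain NumPy. This is only an illustrative reference, not Chainer's implementation: the function name contrastive_reference is hypothetical, and it returns NumPy values rather than a Variable.

```python
import numpy as np


def contrastive_reference(x0, x1, y, margin=1.0, reduce='mean'):
    """NumPy sketch of the contrastive loss formula (hypothetical helper)."""
    if reduce not in ('mean', 'no'):
        raise ValueError("reduce must be 'mean' or 'no'")
    x0 = np.asarray(x0, dtype=np.float64)
    x1 = np.asarray(x1, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    # d_n: Euclidean distance between the n-th rows of x0 and x1, shape (N,)
    d = np.sqrt(np.sum((x0 - x1) ** 2, axis=1))
    # Similar pairs (y=1) pay d^2; dissimilar pairs (y=0) pay only inside the margin
    loss = 0.5 * (y * d ** 2 + (1.0 - y) * np.maximum(margin - d, 0.0) ** 2)
    # 'mean' averages over the mini-batch; 'no' keeps one value per pair
    return loss.mean() if reduce == 'mean' else loss


x0 = np.array([[-2.0, 3.0, 0.5], [5.0, 2.0, -0.5]])
x1 = np.array([[-1.0, 3.0, 1.0], [3.5, 0.5, -2.0]])
y = np.array([1, 0])
per_pair = contrastive_reference(x0, x1, y, reduce='no')   # shape (2,)
mean_loss = contrastive_reference(x0, x1, y)               # scalar
```

With these inputs the second pair is dissimilar and farther apart than the default margin, so only the first pair contributes to the loss.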

Parameters
  • x0 (Variable or N-dimensional array) – The first input variable. The shape should be (N, K), where N denotes the mini-batch size, and K denotes the dimension of x0.

  • x1 (Variable or N-dimensional array) – The second input variable. The shape should be the same as x0.

  • y (Variable or N-dimensional array) – Labels. All values should be 0 or 1. The shape should be (N,), where N denotes the mini-batch size.

  • margin (float) – A parameter for contrastive loss. It should be a positive value.

  • reduce (str) – Reduction option. Its value must be either 'mean' or 'no'. Otherwise, ValueError is raised.

Returns

A variable holding the loss value(s) calculated by the above equation. If reduce is 'no', the output variable holds an array of shape (N,), with one loss value per sample pair. If it is 'mean', the output variable holds a scalar value.

Return type

Variable

Note

This cost can be used to train siamese networks. See Learning a Similarity Metric Discriminatively, with Application to Face Verification for details.

Example

>>> import numpy as np
>>> import chainer.functions as F
>>> x0 = np.array([[-2.0, 3.0, 0.5], [5.0, 2.0, -0.5]]).astype(np.float32)
>>> x1 = np.array([[-1.0, 3.0, 1.0], [3.5, 0.5, -2.0]]).astype(np.float32)
>>> y = np.array([1, 0]).astype(np.int32)
>>> F.contrastive(x0, x1, y)
variable(0.3125)
>>> F.contrastive(x0, x1, y, margin=3.0)  # harder penalty
variable(0.3528857)
>>> z = F.contrastive(x0, x1, y, reduce='no')
>>> z.shape
(2,)
>>> z.array
array([0.625, 0.   ], dtype=float32)