chainer.functions.connectionist_temporal_classification

chainer.functions.connectionist_temporal_classification(x, t, blank_symbol, input_length=None, label_length=None, reduce='mean')[source]

Connectionist Temporal Classification loss function.

Connectionist Temporal Classification (CTC) [Graves2006] is a loss function for sequence labeling problems where the alignment between the inputs and the targets is unknown. See also [Graves2012].

The output is a variable whose value depends on the value of the option reduce. If it is 'no', it holds the sample-wise loss values; if it is 'mean', it holds the mean of those values.

Parameters
  • x (list or tuple of Variable) – A list of unnormalized probabilities for labels. Each element x[i] is a Variable of shape (B, V), where B is the batch size and V is the number of labels. The softmax of x[i] represents the probabilities of the labels at time i.

  • t (Variable or N-dimensional array) – A matrix of expected label sequences. Its shape is (B, M), where B is the batch size and M is the maximum length of the label sequences. All elements in t must be less than V, the number of labels.

  • blank_symbol (int) – Index of the blank symbol. This value must be non-negative.

  • input_length (Variable or N-dimensional array) – Length of the valid input sequence for each sample in the mini-batch (optional). Its shape must be (B,). If input_length is omitted or None, all of x is assumed to be valid input.

  • label_length (Variable or N-dimensional array) – Length of the valid label sequence for each sample in the mini-batch (optional). Its shape must be (B,). If label_length is omitted or None, all of t is assumed to be valid input.

  • reduce (str) – Reduction option. Its value must be either 'mean' or 'no'. Otherwise, a ValueError is raised.

Returns

A variable holding the CTC loss. If reduce is 'no', the output variable holds an array of shape (B,), where B is the number of samples. If it is 'mean', it holds a scalar.

Return type

Variable

Note

Pass x without applying an activation function (e.g. softmax), because this function applies softmax to x internally before computing the CTC loss, for numerical stability. You also need to apply softmax to the forwarded values yourself before decoding them.
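
For example, a minimal greedy (best-path) decoder for a single sample can be sketched in plain NumPy; the helper names are illustrative, and the blank index is assumed to be 0:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def greedy_decode(logits, blank=0):
    """Best-path decoding: softmax, per-frame argmax, collapse repeated
    symbols, then drop blanks.  `logits` has shape (T, V)."""
    best = softmax(logits).argmax(axis=-1)
    out, prev = [], None
    for s in best:
        if s != prev and s != blank:
            out.append(int(s))
        prev = s
    return out
```

More elaborate decoders (e.g. beam search) follow the same collapse-and-drop-blanks rule over the softmax outputs.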

Note

This function is differentiable only with respect to x.

Note

This function supports (batch, sequence, 1-dimensional input)-data; that is, x is a sequence of (B, V) arrays, one per time step.

Graves2006

Alex Graves, Santiago Fernández, Faustino Gomez, Jürgen Schmidhuber. Connectionist Temporal Classification: Labelling Unsegmented Sequence Data with Recurrent Neural Networks. ICML 2006.

Graves2012

Alex Graves. Supervised Sequence Labelling with Recurrent Neural Networks. Springer, 2012.