chainer.dataset.tabular.DelegateDataset

class chainer.dataset.tabular.DelegateDataset(dataset)

A helper class to implement a TabularDataset.

This class wraps a TabularDataset instance and delegates the TabularDataset methods to it. It is useful for creating a custom dataset class by inheriting from it.

>>> import numpy as np
>>>
>>> from chainer.dataset import tabular
>>>
>>> class MyDataset(tabular.DelegateDataset):
...
...     def __init__(self):
...         super().__init__(tabular.from_data((
...             ('a', np.arange(10)),
...             ('b', self.get_b),
...             ('c', [3, 1, 4, 5, 9, 2, 6, 8, 7, 0]),
...             (('d', 'e'), self.get_de))))
...
...     def get_b(self, i):
...         return 'b[{}]'.format(i)
...
...     def get_de(self, i):
...         return {'d': 'd[{}]'.format(i), 'e': 'e[{}]'.format(i)}
...
>>> dataset = MyDataset()
>>> len(dataset)
10
>>> dataset.keys
('a', 'b', 'c', 'd', 'e')
>>> dataset[0]
(0, 'b[0]', 3, 'd[0]', 'e[0]')

Parameters

dataset (chainer.dataset.TabularDataset) – An underlying dataset.

Methods

__getitem__(index)

Returns an example or a sequence of examples.

It implements the standard Python indexing and one-dimensional integer array indexing. It uses the get_example() method by default, but it may be overridden by the implementation to, for example, improve the slicing performance.

Parameters

index (int, slice, list or numpy.ndarray) – An index of an example or indexes of examples.

Returns

If index is int, returns an example created by get_example. If index is either slice or one-dimensional list or numpy.ndarray, returns a list of examples created by get_example.

Example

>>> import numpy
>>> from chainer import dataset
>>> class SimpleDataset(dataset.DatasetMixin):
...     def __init__(self, values):
...         self.values = values
...     def __len__(self):
...         return len(self.values)
...     def get_example(self, i):
...         return self.values[i]
...
>>> ds = SimpleDataset([0, 1, 2, 3, 4, 5])
>>> ds[1]   # Access by int
1
>>> ds[1:3]  # Access by slice
[1, 2]
>>> ds[[4, 0]]  # Access by one-dimensional integer list
[4, 0]
>>> index = numpy.arange(3)
>>> ds[index]  # Access by one-dimensional integer numpy.ndarray
[0, 1, 2]

__len__()

Returns the number of data points.

__iter__()

asdict()

Return a view with dict mode.

Returns

A view whose mode is dict.

astuple()

Return a view with tuple mode.

Returns

A view whose mode is tuple.

concat(*datasets)

Stack datasets along rows.

Parameters

datasets (iterable of TabularDataset) – Datasets to be concatenated. All datasets must have the same keys.

Returns

A concatenated dataset.
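The row-wise stacking described above can be illustrated without Chainer. The following is a minimal pure-Python sketch, not the library's implementation: a "table" is modelled as a dict mapping each key to a list of values, and `concat_tables` is a hypothetical helper name. Requiring the keys to appear in the same order is an assumption of this sketch.

```python
# Pure-Python sketch; a "table" here is a dict mapping key -> list of values.
# concat_tables is a hypothetical helper, not part of Chainer.
def concat_tables(*tables):
    """Stack tables along rows; every table must have the same keys."""
    keys = tuple(tables[0])
    for table in tables[1:]:
        # Assumption of this sketch: keys must match in the same order.
        if tuple(table) != keys:
            raise ValueError('All tables must have the same keys')
    # Each output column is the concatenation of the input columns.
    return {key: [v for table in tables for v in table[key]] for key in keys}


first = {'a': [0, 1], 'b': ['x', 'y']}
second = {'a': [2], 'b': ['z']}
stacked = concat_tables(first, second)
print(stacked)  # {'a': [0, 1, 2], 'b': ['x', 'y', 'z']}
```

The number of rows grows while the set of keys stays fixed.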

convert(data)

Convert fetched data.

This method takes data fetched by fetch() and preprocesses it before it is passed to a model. The default behaviour is to convert each column into an ndarray. This behaviour can be overridden with with_converter(). If the dataset is constructed by concat() or join(), the converter of the first dataset is used.

Parameters

data (tuple or dict) – Data from fetch().

Returns

A tuple or dict. Each value is an ndarray.
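As a rough illustration of the documented default (each column becomes an ndarray), the sketch below reimplements the idea with NumPy alone. `default_convert` is a hypothetical name used only here, and the use of `numpy.asarray` is an assumption; the library's actual converter may differ in detail.

```python
import numpy as np


def default_convert(data):
    """Turn each fetched column into an ndarray (illustrative sketch only)."""
    if isinstance(data, tuple):
        # Tuple mode: one array per column, order preserved.
        return tuple(np.asarray(column) for column in data)
    if isinstance(data, dict):
        # Dict mode: one array per key.
        return {key: np.asarray(column) for key, column in data.items()}
    raise TypeError('data must be a tuple or a dict')


fetched = ([0, 1, 2], [3.0, 1.0, 4.0])
converted = default_convert(fetched)
print(type(converted[0]).__name__, converted[1].dtype)
```

A custom converter passed to with_converter() would replace this step, for example to cast dtypes or move arrays to a device.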

fetch()

Fetch data.

This method fetches all data of the dataset/view. Note that this method returns column-major data (i.e. ([a[0], ..., a[3]], ..., [c[0], ..., c[3]]), {'a': [a[0], ..., a[3]], ..., 'c': [c[0], ..., c[3]]}, or [a[0], ..., a[3]]).

Returns

If mode is tuple, this method returns a tuple of lists/arrays. If mode is dict, this method returns a dict of lists/arrays.
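The difference between row-major examples (what indexing returns) and the column-major layout described here can be shown in plain Python. This is only an illustration of the layout, with made-up values; it does not call Chainer.

```python
# Plain-Python illustration of the two layouts; no Chainer import is needed.
# Row-major: one tuple per example.
rows = [(0, 'b[0]', 3), (1, 'b[1]', 1), (2, 'b[2]', 4), (3, 'b[3]', 5)]
keys = ('a', 'b', 'c')

# Column-major, tuple mode: one list per column.
columns_tuple = tuple(list(column) for column in zip(*rows))

# Column-major, dict mode: one list per key.
columns_dict = dict(zip(keys, columns_tuple))

print(columns_tuple[0])   # [0, 1, 2, 3]
print(columns_dict['b'])  # ['b[0]', 'b[1]', 'b[2]', 'b[3]']
```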

get_example(i)

Returns the i-th example.

Implementations should override it. It should raise IndexError if the index is invalid.

Parameters

i (int) – The index of the example.

Returns

The i-th example.

get_examples(indices, key_indices)

Return a part of data.

Parameters
  • indices (list of ints or slice) – Indices of requested rows. If this argument is None, it indicates all rows.

  • key_indices (tuple of ints) – Indices of requested columns. If this argument is None, it indicates all columns.

Returns

tuple of lists/arrays
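The contract above (None means "all", rows may be a list or a slice, columns are given by position) might be sketched over in-memory columns as follows. The function and the `columns` storage are hypothetical stand-ins for illustration, not the library's code.

```python
def get_examples(columns, indices, key_indices):
    """Select rows and columns from a tuple of lists (illustrative sketch).

    columns: tuple of equal-length lists, one per key (hypothetical storage).
    """
    if key_indices is None:
        # None means all columns.
        key_indices = range(len(columns))
    selected = []
    for k in key_indices:
        column = columns[k]
        if indices is None:
            # None means all rows.
            selected.append(list(column))
        elif isinstance(indices, slice):
            selected.append(column[indices])
        else:
            selected.append([column[i] for i in indices])
    return tuple(selected)


columns = ([0, 1, 2, 3], ['w', 'x', 'y', 'z'], [3, 1, 4, 1])
print(get_examples(columns, [2, 0], (1,)))       # (['y', 'w'],)
print(get_examples(columns, slice(1, 3), None))  # ([1, 2], ['x', 'y'], [1, 4])
```

Note that the result is column-major, matching the "tuple of lists/arrays" return type.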

join(*datasets)

Stack datasets along columns.

Parameters

datasets (iterable of TabularDataset) – Datasets to be joined. All datasets must have the same length.

Returns

A joined dataset.
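Column-wise stacking can be sketched in the same pure-Python style, with a "table" modelled as a dict mapping each key to a list. `join_tables` is a hypothetical helper, and rejecting duplicate keys is an assumption of this sketch rather than something stated above.

```python
# join_tables is a hypothetical helper, not part of Chainer.
def join_tables(*tables):
    """Stack tables along columns; every column must have the same length."""
    joined = {}
    length = None
    for table in tables:
        for key, column in table.items():
            if length is None:
                length = len(column)
            if len(column) != length:
                raise ValueError('All tables must have the same length')
            # Assumption of this sketch: a key may appear only once.
            if key in joined:
                raise ValueError('Duplicate key: {}'.format(key))
            joined[key] = list(column)
    return joined


left = {'a': [0, 1, 2]}
right = {'b': ['x', 'y', 'z'], 'c': [3, 1, 4]}
joined = join_tables(left, right)
print(tuple(joined))  # ('a', 'b', 'c')
```

Here the set of keys grows while the number of rows stays fixed.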

transform(keys, transform)

Apply a transform to each example.

Parameters
  • keys (tuple of strs) – The keys of transformed examples.

  • transform (callable) – A callable that takes an example and returns a transformed example. The mode of the transformed dataset is determined by the type of the transformed examples.

Returns

A transformed dataset.
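To make "mode is determined by the transformed examples" concrete, here is a minimal pure-Python sketch. `transform_examples` is a hypothetical helper; passing the example's values as positional arguments and inferring the mode from the first result are assumptions of this sketch, not confirmed library behaviour.

```python
# transform_examples is a hypothetical helper, not part of Chainer.
def transform_examples(examples, keys, transform):
    """Apply transform to each example and infer the resulting mode."""
    # Assumption: the example's values are passed as positional arguments.
    transformed = [transform(*example) for example in examples]
    first = transformed[0]  # the sketch assumes at least one example
    if isinstance(first, tuple):
        mode = tuple
        if len(first) != len(keys):
            raise ValueError('transform must return one value per key')
    elif isinstance(first, dict):
        mode = dict
        if set(first) != set(keys):
            raise ValueError('transform must return exactly the given keys')
    else:
        mode = None  # a single bare value per example
    return mode, transformed


examples = [(0, 3), (1, 1), (2, 4)]
tuple_mode, pairs = transform_examples(
    examples, ('total', 'diff'), lambda a, c: (a + c, a - c))
dict_mode, dicts = transform_examples(
    examples, ('total',), lambda a, c: {'total': a + c})
print(tuple_mode is tuple, pairs)  # True [(3, -3), (2, 0), (6, -2)]
print(dict_mode is dict, dicts[0])  # True {'total': 3}
```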

transform_batch(keys, transform_batch)

Apply a transform to examples.

Parameters
  • keys (tuple of strs) – The keys of transformed examples.

  • transform_batch (callable) – A callable that takes examples and returns transformed examples. The mode of the transformed dataset is determined by the type of the transformed examples.

Returns

A transformed dataset.
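In contrast to a per-example transform, a batch transform sees whole columns at once. The sketch below is a pure-Python illustration under stated assumptions: `transform_columns` is a hypothetical helper, the callable receives the columns as positional arguments, and the check that the number of rows is unchanged is an assumption of this sketch.

```python
# transform_columns is a hypothetical helper, not part of Chainer.
def transform_columns(columns, keys, transform_batch):
    """Apply transform_batch to whole columns and infer the resulting mode."""
    length = len(columns[0])
    # Assumption: the columns are passed as positional arguments.
    out = transform_batch(*columns)
    if isinstance(out, tuple):
        mode, new_columns = tuple, out
    elif isinstance(out, dict):
        mode, new_columns = dict, tuple(out[key] for key in keys)
    else:
        mode, new_columns = None, (out,)
    if len(new_columns) != len(keys):
        raise ValueError('transform_batch must return one column per key')
    # Assumption: a batch transform must keep the number of rows.
    if any(len(column) != length for column in new_columns):
        raise ValueError('transform_batch must not change the number of rows')
    return mode, out


columns = ([0, 1, 2], [3, 1, 4])
mode, totals = transform_columns(
    columns, ('total',),
    lambda a, c: {'total': [x + y for x, y in zip(a, c)]})
print(mode is dict, totals)  # True {'total': [3, 2, 6]}
```

Working on columns lets the callable use vectorized operations instead of a Python-level loop per example.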

with_converter(converter)

Override the behaviour of convert().

The returned dataset uses the given converter in place of the default conversion.

Parameters

converter (callable) – A new converter.

Returns

A dataset with the new converter.

__eq__(value, /)

Return self==value.

__ne__(value, /)

Return self!=value.

__lt__(value, /)

Return self<value.

__le__(value, /)

Return self<=value.

__gt__(value, /)

Return self>value.

__ge__(value, /)

Return self>=value.

Attributes

keys
mode
slice

Get a slice of dataset.

Parameters
  • indices (list/array of ints/bools or slice) – Requested rows.

  • keys (tuple of ints/strs or int or str) – Requested columns.

Returns

A view of specified range.
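The accepted argument forms (integer or boolean rows, a slice, and keys given by position or by name) can be normalised to plain index lists. The helpers below are a hypothetical pure-Python sketch of that normalisation, written for plain lists; they are not the library's implementation.

```python
# resolve_rows and resolve_keys are hypothetical helpers, not part of Chainer.
def resolve_rows(indices, length):
    """Normalise a slice, integer list, or boolean mask to row indices."""
    if isinstance(indices, slice):
        return list(range(*indices.indices(length)))
    indices = list(indices)
    # Check for bools first, since bool is a subclass of int in Python.
    if indices and all(isinstance(i, bool) for i in indices):
        if len(indices) != length:
            raise ValueError('A boolean mask must cover every row')
        return [i for i, keep in enumerate(indices) if keep]
    return indices


def resolve_keys(requested, keys):
    """Normalise a key, an int, or a tuple of either to column indices."""
    if isinstance(requested, (int, str)):
        requested = (requested,)
    return tuple(keys.index(k) if isinstance(k, str) else k for k in requested)


keys = ('a', 'b', 'c')
print(resolve_rows([True, False, True, False], 4))  # [0, 2]
print(resolve_rows(slice(None, None, 2), 5))        # [0, 2, 4]
print(resolve_keys(('c', 'a'), keys))               # (2, 0)
```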