chainer.dataset.tabular.DelegateDataset

class chainer.dataset.tabular.DelegateDataset(dataset)

A helper class to implement a TabularDataset.

This class wraps an instance of TabularDataset and delegates the TabularDataset methods to it. It is useful as a base class for custom dataset classes.

>>> import numpy as np
>>>
>>> from chainer.dataset import tabular
>>>
>>> class MyDataset(tabular.DelegateDataset):
...
...     def __init__(self):
...         super().__init__(tabular.from_data((
...             ('a', np.arange(10)),
...             ('b', self.get_b),
...             ('c', [3, 1, 4, 5, 9, 2, 6, 8, 7, 0]),
...             (('d', 'e'), self.get_de))))
...
...     def get_b(self, i):
...         return 'b[{}]'.format(i)
...
...     def get_de(self, i):
...         return {'d': 'd[{}]'.format(i), 'e': 'e[{}]'.format(i)}
...
>>> dataset = MyDataset()
>>> len(dataset)
10
>>> dataset.keys
('a', 'b', 'c', 'd', 'e')
>>> dataset[0]
(0, 'b[0]', 3, 'd[0]', 'e[0]')

Parameters: dataset (chainer.dataset.TabularDataset) – An underlying dataset.

Methods

__getitem__(index)

Returns an example or a sequence of examples.

It implements the standard Python indexing and one-dimensional integer array indexing. It uses the get_example() method by default, but it may be overridden by the implementation to, for example, improve the slicing performance.

Parameters: index (int, slice, list or numpy.ndarray) – An index of an example or indexes of examples.
Returns: If index is int, returns an example created by get_example. If index is either slice or one-dimensional list or numpy.ndarray, returns a list of examples created by get_example.

Example

>>> import numpy
>>> from chainer import dataset
>>> class SimpleDataset(dataset.DatasetMixin):
...     def __init__(self, values):
...         self.values = values
...     def __len__(self):
...         return len(self.values)
...     def get_example(self, i):
...         return self.values[i]
...
>>> ds = SimpleDataset([0, 1, 2, 3, 4, 5])
>>> ds[1]   # Access by int
1
>>> ds[1:3]  # Access by slice
[1, 2]
>>> ds[[4, 0]]  # Access by one-dimensional integer list
[4, 0]
>>> index = numpy.arange(3)
>>> ds[index]  # Access by one-dimensional integer numpy.ndarray
[0, 1, 2]

__len__(): Returns the number of data points.

__iter__()

asdict()

Return a view with dict mode.

Returns: A view whose mode is dict.

astuple()

Return a view with tuple mode.

Returns: A view whose mode is tuple.

concat(*datasets)

Stack datasets along rows.

Parameters: datasets (iterable of TabularDataset) – Datasets to be concatenated. All datasets must have the same keys.
Returns: A concatenated dataset.
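
As an illustration of what stacking along rows means, here is a minimal sketch on plain dicts of lists. It is not Chainer's implementation: concat_rows is a hypothetical helper, and comparing the keys in order is an assumption of this sketch.

```python
def concat_rows(*tables):
    """Stack column-major tables (dicts of lists) along rows.

    Hypothetical helper for illustration; not part of Chainer.
    """
    keys = tuple(tables[0])
    for table in tables[1:]:
        # Assumption of this sketch: keys must match, including their order.
        if tuple(table) != keys:
            raise ValueError('all tables must have the same keys')
    # Each output column is the concatenation of the matching input columns.
    return {key: [v for table in tables for v in table[key]] for key in keys}


first = {'a': [0, 1], 'b': ['x', 'y']}
second = {'a': [2], 'b': ['z']}
stacked = concat_rows(first, second)
print(stacked)  # {'a': [0, 1, 2], 'b': ['x', 'y', 'z']}
```

The result has the same keys as the inputs, and its length is the sum of the input lengths.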

convert(data)

Convert fetched data.

This method takes data fetched by fetch() and pre-processes it before it is passed to a model. The default behaviour is to convert each column into an ndarray. This behaviour can be overridden by with_converter(). If the dataset is constructed by concat() or join(), the converter of the first dataset is used.

Parameters: data (tuple or dict) – Data from fetch().
Returns: A tuple or dict. Each value is an ndarray.

fetch()

Fetch data.

This method fetches all data of the dataset/view. Note that this method returns column-major data (i.e. ([a[0], ..., a[3]], ..., [c[0], ..., c[3]]), {'a': [a[0], ..., a[3]], ..., 'c': [c[0], ..., c[3]]}, or [a[0], ..., a[3]]).

Returns: If mode is tuple, this method returns a tuple of lists/arrays. If mode is dict, this method returns a dict of lists/arrays.
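
The difference between the row-major examples returned by indexing and the column-major layout described above can be sketched in plain Python. The data and the names rows, keys, columns_tuple and columns_dict are made up for illustration; the snippet does not call Chainer.

```python
# Row-major: one tuple per example, like the result of indexing a dataset.
rows = [(0, 'x'), (1, 'y'), (2, 'z')]
keys = ('a', 'b')

# Column-major: one sequence per key, the layout described for fetch().
columns_tuple = tuple(list(column) for column in zip(*rows))  # tuple mode
columns_dict = dict(zip(keys, columns_tuple))                 # dict mode

print(columns_tuple)  # ([0, 1, 2], ['x', 'y', 'z'])
print(columns_dict)   # {'a': [0, 1, 2], 'b': ['x', 'y', 'z']}
```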

get_example(i)

Returns the i-th example.

Implementations should override it. It should raise IndexError if the index is invalid.

Parameters: i (int) – The index of the example.
Returns: The i-th example.

get_examples(indices, key_indices)

Return a part of data.

Parameters

indices (list of ints or slice) – Indices of requested rows. If this argument is None, it indicates all rows.
key_indices (tuple of ints) – Indices of requested columns. If this argument is None, it indicates all columns.

Returns

tuple of lists/arrays
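
The contract above (None selects everything, and the result is column-major) might be sketched as follows. get_examples_sketch is a hypothetical free function operating on a tuple of lists; it is meant only to illustrate the argument semantics, not how any Chainer dataset stores its data.

```python
def get_examples_sketch(columns, indices, key_indices):
    """Select rows and columns from column-major data (a tuple of lists).

    Hypothetical illustration of the get_examples() contract.
    """
    if key_indices is None:
        # None means all columns.
        key_indices = range(len(columns))
    selected = []
    for k in key_indices:
        column = columns[k]
        if indices is None:
            # None means all rows.
            selected.append(list(column))
        elif isinstance(indices, slice):
            selected.append(list(column[indices]))
        else:
            selected.append([column[i] for i in indices])
    return tuple(selected)


columns = ([0, 1, 2, 3], ['a', 'b', 'c', 'd'])
print(get_examples_sketch(columns, [3, 0], (1,)))      # (['d', 'a'],)
print(get_examples_sketch(columns, slice(1, 3), None))  # ([1, 2], ['b', 'c'])
```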

join(*datasets)

Stack datasets along columns.

Parameters: datasets (iterable of TabularDataset) – Datasets to be joined. All datasets must have the same length.
Returns: A joined dataset.
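
Stacking along columns can be pictured with a small sketch on dicts of lists. join_columns is a hypothetical helper, not Chainer code, and rejecting duplicate keys is an assumption of this sketch rather than something stated above.

```python
def join_columns(*tables):
    """Stack column-major tables (dicts of lists) along columns.

    Hypothetical helper for illustration; not part of Chainer.
    """
    length = None
    joined = {}
    for table in tables:
        for key, column in table.items():
            if length is None:
                length = len(column)
            elif len(column) != length:
                raise ValueError('all tables must have the same length')
            # Assumption of this sketch: a key may appear only once.
            if key in joined:
                raise ValueError('duplicate key: {}'.format(key))
            joined[key] = list(column)
    return joined


left = {'a': [0, 1, 2]}
right = {'b': ['x', 'y', 'z'], 'c': [True, False, True]}
joined = join_columns(left, right)
print(list(joined))  # ['a', 'b', 'c']
```

The result keeps the number of rows and gains the keys of every input.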

transform(keys, transform)

Apply a transform to each example.

Parameters

keys (tuple of strs) – The keys of transformed examples.
transform (callable) – A callable that takes an example and returns a transformed example. The mode of the transformed dataset is determined by the transformed examples.

Returns

A transformed dataset.

transform_batch(keys, transform_batch)

Apply a transform to examples.

Parameters

keys (tuple of strs) – The keys of transformed examples.
transform_batch (callable) – A callable that takes examples and returns transformed examples. The mode of the transformed dataset is determined by the transformed examples.

Returns

A transformed dataset.
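
The difference between a per-example callable (as for transform()) and a batch callable (as for transform_batch()) can be sketched without Chainer. This sketch assumes that the callable receives the fields of an example, or the columns of a batch, as separate arguments; the text above does not state the calling convention. The names add_fields and add_columns are hypothetical.

```python
columns = {'a': [0, 1, 2], 'b': [10, 20, 30]}


def add_fields(a, b):
    # Per-example callable: a and b are the values of one example.
    # Returning a dict corresponds to a dict-mode result.
    return {'sum': a + b}


def add_columns(a, b):
    # Batch callable: a and b are whole columns.
    return {'sum': [x + y for x, y in zip(a, b)]}


# Apply the per-example callable once per row.
per_example = [add_fields(a, b) for a, b in zip(columns['a'], columns['b'])]
# Apply the batch callable once for all rows.
per_batch = add_columns(columns['a'], columns['b'])

print(per_example)  # [{'sum': 10}, {'sum': 21}, {'sum': 32}]
print(per_batch)    # {'sum': [10, 21, 32]}
```

A batch callable is called once for many rows, which tends to matter when the per-call overhead is significant.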

with_converter(converter)

Override the behaviour of convert().

This method returns a dataset whose convert() uses the given converter instead of the default conversion.

Parameters: converter (callable) – A new converter.
Returns: A dataset with the new converter.

__eq__(value, /): Return self==value.

__ne__(value, /): Return self!=value.

__lt__(value, /): Return self<value.

__le__(value, /): Return self<=value.

__gt__(value, /): Return self>value.

__ge__(value, /): Return self>=value.

Attributes

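keys

mode

slice

Get a slice of the dataset. The attribute is subscripted with the requested rows and, optionally, the requested columns, as in dataset.slice[indices, keys].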

Parameters

indices (list/array of ints/bools or slice) – Requested rows.
keys (tuple of ints/strs or int or str) – Requested columns.

Returns

A view of specified range.
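
The subscript-on-an-attribute pattern used by slice can be sketched with a small stand-in class. ColumnTable and _SliceHelper are hypothetical names, not Chainer classes; the sketch handles only lists of ints and slice objects for rows, and it copies data instead of creating a lazy view.

```python
class ColumnTable:
    """Hypothetical stand-in for a tabular dataset; not a Chainer class."""

    def __init__(self, keys, columns):
        self.keys = tuple(keys)
        self.columns = tuple(list(column) for column in columns)

    @property
    def slice(self):
        # Return a helper object so that table.slice[rows, keys] works.
        return _SliceHelper(self)


class _SliceHelper:
    def __init__(self, table):
        self._table = table

    def __getitem__(self, args):
        # Accept table.slice[rows] as well as table.slice[rows, keys].
        if isinstance(args, tuple):
            rows, keys = args
        else:
            rows, keys = args, None
        table = self._table
        if keys is None:
            key_indices = list(range(len(table.keys)))
        else:
            if isinstance(keys, (int, str)):
                keys = (keys,)
            # Keys may be given by position or by name.
            key_indices = [k if isinstance(k, int) else table.keys.index(k)
                           for k in keys]
        new_columns = []
        for k in key_indices:
            column = table.columns[k]
            if isinstance(rows, slice):
                new_columns.append(column[rows])
            else:
                new_columns.append([column[i] for i in rows])
        return ColumnTable([table.keys[k] for k in key_indices], new_columns)


table = ColumnTable(('a', 'b', 'c'),
                    ([0, 1, 2, 3], ['w', 'x', 'y', 'z'], [3, 1, 4, 1]))
view = table.slice[[0, 2], ('a', 'c')]
print(view.keys)     # ('a', 'c')
print(view.columns)  # ([0, 2], [3, 4])
```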