chainer.datasets.split_dataset_random

chainer.datasets.split_dataset_random(dataset, first_size, seed=None)

Splits a dataset into two subsets randomly.

This function creates two instances of SubDataset. The two instances share no examples, and together they cover every example of the original dataset. Which examples go to which subset is determined by a random permutation of the indices.

Parameters
  • dataset – Dataset to split.

  • first_size (int) – Size of the first subset.

  • seed (int) – Seed for the generator used to permute the indices. If an integer convertible to a 32-bit unsigned integer is specified, each example in the given dataset is guaranteed to always belong to the same subset. If None, a different random permutation is drawn on each call.

Returns

Two SubDataset objects. The first subset contains first_size examples randomly chosen from the dataset without replacement, and the second subset contains the rest of the dataset.

Return type

tuple
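
The behavior described above can be illustrated with a minimal sketch. It does not import Chainer and is not the library's implementation: the helper split_indices_random is a hypothetical name, and the standard-library random module stands in for the generator, so the concrete permutation will differ from the one Chainer produces for the same seed. It only shows the documented properties: the two subsets are disjoint, they cover the dataset, and a fixed seed reproduces the same split.

```python
import random


def split_indices_random(n, first_size, seed=None):
    """Hypothetical helper: return two disjoint index lists covering range(n)."""
    if not 0 <= first_size <= n:
        raise ValueError("first_size must be between 0 and n")
    order = list(range(n))
    # A fixed seed yields the same permutation on every call;
    # seed=None seeds the generator from fresh entropy each time.
    random.Random(seed).shuffle(order)
    return order[:first_size], order[first_size:]


dataset = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"]
first_idx, second_idx = split_indices_random(len(dataset), 7, seed=0)

# Materialize the two subsets by indexing the original dataset.
first = [dataset[i] for i in first_idx]
second = [dataset[i] for i in second_idx]

print(len(first), len(second))
```

With the real function, the equivalent call would presumably be split_dataset_random(dataset, 7, seed=0), which returns the two subsets directly as SubDataset objects rather than index lists.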