- class chainer.iterators.MultiprocessIterator(dataset, batch_size, repeat=True, shuffle=None, n_processes=None, n_prefetch=1, shared_mem=None, order_sampler=None, dataset_timeout=30.0, maxtasksperchild=None)¶
Dataset iterator that loads examples in parallel.
This is an implementation of Iterator that loads examples with worker processes. It uses the standard multiprocessing module to parallelize the loading. The dataset is sent to the worker processes in the standard way using pickle.
Note that this iterator effectively prefetches the examples for the next batch asynchronously after the current batch is returned.
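The prefetch behavior described above can be illustrated with a small sketch (an assumption-laden simplification: it uses threads rather than worker processes for brevity, and a dummy `load_example` function stands in for real dataset access). The key pattern is that the loads for the next batch are submitted before the current batch is yielded, so loading overlaps with whatever the consumer does with the batch:

```python
from concurrent.futures import ThreadPoolExecutor


def load_example(index):
    # Stand-in for a real dataset access (e.g. reading and decoding a file).
    return index * index


def prefetching_batches(indices, batch_size):
    # Sketch of the prefetch pattern: loads for batch k+1 are submitted
    # before batch k is yielded, so loading overlaps with consumption.
    batches = [indices[i:i + batch_size]
               for i in range(0, len(indices), batch_size)]
    if not batches:
        return
    with ThreadPoolExecutor(max_workers=2) as pool:
        pending = [pool.submit(load_example, i) for i in batches[0]]
        for nxt in batches[1:] + [[]]:
            current = [f.result() for f in pending]
            pending = [pool.submit(load_example, i) for i in nxt]
            yield current
```

The real iterator does the same kind of overlap, but with `n_processes` worker processes and `n_prefetch` batches in flight.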
This iterator saves -1 instead of None in snapshots, since some serializers do not support None.
When you are using OpenCV somewhere in your code and the MultiprocessIterator is used in the training code, the training loop may get stuck at some point. In such a situation, there are several workarounds to prevent the process from getting stuck:
1. Set the environment variable OMP_NUM_THREADS=1.
2. Add cv2.setNumThreads(0) right after import cv2 in your training script.
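Both workarounds can be applied together near the top of the training script. A runnable sketch (guarded with an ImportError fallback as an assumption, so it also runs where OpenCV is not installed):

```python
import os

# Workaround 1: limit OpenMP to a single thread. This must take effect
# before OpenCV spins up its thread pool, so set it as early as possible.
os.environ["OMP_NUM_THREADS"] = "1"

try:
    import cv2

    # Workaround 2: disable OpenCV's own internal threading.
    cv2.setNumThreads(0)
except ImportError:
    # OpenCV is not installed in this environment; the sketch is a no-op.
    pass
```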
dataset (Dataset) – Dataset to iterate.
batch_size (int) – Number of examples within each batch.
repeat (bool) – If True, it infinitely loops over the dataset. Otherwise, it stops iteration at the end of the first epoch.
shuffle (bool) – If True, the order of examples is shuffled at the beginning of each epoch. Otherwise, examples are extracted in the order of indexes. If None and no order_sampler is given, the behavior is the same as the case with shuffle=True.
n_processes (int) – Number of worker processes. The number of CPUs is used by default.
n_prefetch (int) – Number of prefetch batches.
shared_mem (int) – The size of shared memory allocated per example. If None, the size is adjusted automatically.
dataset_timeout (float) – A MultiprocessIterator.TimeoutWarning will be issued if this time in seconds elapses during each dataset realization. Set None to disable the warning. You can turn this warning into an error by using warnings.simplefilter('error', chainer.iterators.MultiprocessIterator.TimeoutWarning).
order_sampler (callable) – A callable that generates the order of the indices to sample in the next epoch when an epoch finishes. This function should take two arguments: the current order and the current position of the iterator, and should return the next order. The size of the order should remain constant. This option cannot be used when shuffle is not None.
maxtasksperchild (int) – Number of tasks a prefetch worker process can complete before it exits and is replaced with a fresh worker process, to enable unused resources to be freed. If None, worker processes live as long as the pool.
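Putting the main arguments together, here is a minimal usage sketch. It is hedged in two ways: the random order sampler is a hypothetical example of the order_sampler contract (two arguments in, a constant-size order out), and the helper falls back to a plain, non-parallel batching generator when Chainer is not installed so the sketch stays runnable:

```python
import random


def random_order_sampler(current_order, current_position):
    # Hypothetical order sampler: returns a shuffled copy of the current
    # order; the size of the order must stay constant across epochs.
    order = list(current_order)
    random.shuffle(order)
    return order


def make_iterator(dataset, batch_size=4, n_processes=2):
    # Construct a MultiprocessIterator when Chainer is available; otherwise
    # fall back to a plain (non-parallel) batching generator.
    try:
        from chainer.iterators import MultiprocessIterator
        return MultiprocessIterator(
            dataset, batch_size, repeat=False,
            n_processes=n_processes, order_sampler=random_order_sampler)
    except ImportError:
        order = random_order_sampler(range(len(dataset)), 0)
        return ([dataset[i] for i in order[j:j + batch_size]]
                for j in range(0, len(order), batch_size))
```

On the real iterator, remember to call finalize() (or use it as a context manager) so the worker processes are released.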
- __next__()¶
Returns the next batch.
This is a part of the iterator protocol of Python. It may raise the StopIteration exception when it stops the iteration.
- finalize()¶
Finalizes the iterator and possibly releases the resources.
This method does nothing by default. Implementation may override it to better handle the internal resources.
This method can be called multiple times.
- serialize(serializer)¶
Serializes the internal state of the iterator.
This is a method to support the serializer protocol of Chainer.
It should only serialize the internal state that changes over the iteration. It should not serialize what is set manually by users such as the batch size.
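As a sketch of this contract (assuming Chainer-style serializers, which are callables of the form serializer(key, value) returning the stored or loaded value; the toy classes below are hypothetical, not part of the library), an iterator serializes only its iteration state and leaves user-set configuration alone:

```python
class CountingIterator:
    """Toy iterator used only to illustrate the serialize contract."""

    def __init__(self, dataset, batch_size):
        self.dataset = dataset
        self.batch_size = batch_size   # set by the user: not serialized
        self.current_position = 0      # changes over iteration: serialized

    def serialize(self, serializer):
        # Only state that changes during iteration goes through the
        # serializer; batch_size is deliberately left out.
        self.current_position = serializer(
            'current_position', self.current_position)


class DictSerializer:
    """Minimal stand-in serializer that records values into a dict."""

    def __init__(self, target):
        self.target = target

    def __call__(self, key, value):
        self.target[key] = value
        return value
```

Loading works symmetrically: a deserializer returns the stored value from the snapshot, and the iterator assigns it back to its state attribute.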
- __eq__(value, /)¶
- __ne__(value, /)¶
- __lt__(value, /)¶
- __le__(value, /)¶
- __gt__(value, /)¶
- __ge__(value, /)¶