Datamanager

The datamanager wraps a dataloader and combines it with augmentations and multiprocessing.

BaseDataManager

class BaseDataManager(data, batch_size, n_process_augmentation, transforms, sampler_cls=<class 'delira.data_loading.sampler.sequential_sampler.SequentialSampler'>, sampler_kwargs=None, data_loader_cls=None, dataset_cls=None, load_fn=<function default_load_fn_2d>, from_disc=True, **kwargs)[source]

Bases: object

Class to handle data. Creates the Dataset, DataLoader, and BatchGenerator.

property batch_size

Property to access the batch size

Returns

the batch size

Return type

int

property data_loader_cls

Property to access the current data loader class

Returns

Subclass of SlimDataLoaderBase

Return type

type

property dataset

Property to access the current dataset

Returns

the current dataset

Return type

AbstractDataset

get_batchgen(seed=1)[source]

Create DataLoader and Batchgenerator

Parameters

seed (int) – seed for Random Number Generator

Returns

Batchgenerator

Return type

Augmenter

Raises

AssertionError – BaseDataManager.n_batches is smaller than or equal to zero

get_subset(indices)[source]

Returns a Subset of the current datamanager based on given indices

Parameters

indices (iterable) – valid indices to extract subset from current dataset

Returns

manager containing the subset

Return type

BaseDataManager
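The idea of a subset manager can be illustrated with a minimal sketch. `TinyManager` is a hypothetical stand-in, not the delira class, and it assumes that the subset keeps the parent's configuration (here reduced to `batch_size`) while holding only the selected samples.

```python
class TinyManager:
    """Hypothetical stand-in for a data manager; not the delira class."""

    def __init__(self, data, batch_size):
        self.data = list(data)
        self.batch_size = batch_size

    def get_subset(self, indices):
        # Build a new manager from the selected samples and reuse the
        # remaining configuration (here only batch_size) unchanged.
        subset = [self.data[i] for i in indices]
        return TinyManager(subset, self.batch_size)


manager = TinyManager(["a", "b", "c", "d"], batch_size=2)
subset_manager = manager.get_subset([0, 2])
print(subset_manager.data)
```

The original manager is left untouched, so several subsets (for example, cross-validation folds) can be derived from the same parent.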

property n_batches

Returns the number of batches, based on the batch size and the number of samples

Returns

Number of Batches

Return type

int

Raises

AssertionError – BaseDataManager.n_samples is smaller than or equal to zero
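The relation between batch size, sample count, and batch count can be sketched as follows. This is an illustration only: `count_batches` and its `drop_last` flag are hypothetical names, and the reference above does not state whether a trailing partial batch is kept or dropped, so both variants are shown.

```python
def count_batches(n_samples: int, batch_size: int, drop_last: bool = False) -> int:
    """Hypothetical helper, not part of delira."""
    # Mirror the documented precondition: at least one sample is required.
    assert n_samples > 0, "n_samples must be greater than zero"
    assert batch_size > 0, "batch_size must be greater than zero"

    # Number of full batches, plus the size of any leftover partial batch.
    full, remainder = divmod(n_samples, batch_size)
    if drop_last or remainder == 0:
        return full
    # Keep the trailing partial batch as one extra batch.
    return full + 1


print(count_batches(10, 4))                  # partial batch kept
print(count_batches(10, 4, drop_last=True))  # partial batch dropped
```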

property n_process_augmentation

Property to access the number of augmentation processes

Returns

number of augmentation processes

Return type

int

property n_samples

Number of Samples

Returns

Number of Samples

Return type

int

property sampler

Property to access the current sampler

Returns

the current sampler

Return type

AbstractSampler

train_test_split(*args, **kwargs)[source]

Calls AbstractDataset.train_test_split and returns a manager for each subset with the same configuration as the current manager

Parameters
  • *args – positional arguments for sklearn.model_selection.train_test_split

  • **kwargs – keyword arguments for sklearn.model_selection.train_test_split

property transforms

Property to access the current data transforms

Returns

The transformation, can either be None or an instance of AbstractTransform

Return type

None, AbstractTransform

update_state_from_dict(new_state: dict)[source]

Updates the internal state, and therefore the behavior, from a dict. If a key is not specified, the old attribute value will be used

Parameters

new_state (dict) –

The dict to update the state from. Valid keys are:

  • batch_size

  • n_process_augmentation

  • data_loader_cls

  • sampler

  • sampling_kwargs

  • transforms

If a key is not specified, the old value of the corresponding attribute will be used

Raises

KeyError – Invalid keys are specified
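A minimal sketch of this update pattern, assuming the keys are validated before any attribute is changed. `ManagerState` is a hypothetical stand-in rather than the delira implementation; whether the real method applies valid keys before raising is not stated above.

```python
class ManagerState:
    """Hypothetical stand-in for a data manager; not the delira class."""

    # Keys listed as valid in the reference above.
    VALID_KEYS = (
        "batch_size",
        "n_process_augmentation",
        "data_loader_cls",
        "sampler",
        "sampling_kwargs",
        "transforms",
    )

    def __init__(self, batch_size, n_process_augmentation, transforms=None):
        self.batch_size = batch_size
        self.n_process_augmentation = n_process_augmentation
        self.transforms = transforms

    def update_state_from_dict(self, new_state: dict) -> None:
        # Reject unknown keys up front (an assumption of this sketch).
        invalid = set(new_state) - set(self.VALID_KEYS)
        if invalid:
            raise KeyError(f"Invalid keys: {sorted(invalid)}")
        # Keys that are absent from new_state keep their old values.
        for key, value in new_state.items():
            setattr(self, key, value)


state = ManagerState(batch_size=4, n_process_augmentation=1)
state.update_state_from_dict({"batch_size": 8})
print(state.batch_size, state.n_process_augmentation)
```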

Augmenter

class Augmenter(data_loader: delira.data_loading.data_loader.BaseDataLoader, transforms, n_process_augmentation, sampler, sampler_queues: list, num_cached_per_queue=2, seeds=None, **kwargs)[source]

Bases: object

Class wrapping MultiThreadedAugmenter and SingleThreadedAugmenter to provide a uniform API and to disable multiprocessing/multithreading inside the dataloading pipeline

static _Augmenter__identity_fn(*args, **kwargs)

Helper function accepting arbitrary args and kwargs and returning without doing anything

Parameters
  • *args – positional arguments

  • **kwargs – keyword arguments

_finish()[source]

Property to provide uniform API of _finish

Returns

either the augmenter’s _finish method (if available) or __identity_fn (if not available)

Return type

Callable

_fn_checker(function_name)[source]

Checks if the internal augmenter has a given attribute and returns it. Otherwise it returns __identity_fn

Parameters

function_name (str) – the function name to check for

Returns

either the function corresponding to the given function name or __identity_fn

Return type

Callable
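The fallback described here can be sketched with `getattr` and a no-op default. All names below (`identity_fn`, `UniformWrapper`, `WithRestart`, `WithoutRestart`) are hypothetical and only illustrate the pattern; they are not the delira classes.

```python
def identity_fn(*args, **kwargs):
    """Accept arbitrary arguments, do nothing, and return None."""
    return None


class UniformWrapper:
    """Hypothetical wrapper exposing the same methods for any backend."""

    def __init__(self, wrapped):
        self._wrapped = wrapped

    def _fn_checker(self, function_name: str):
        # Return the wrapped object's attribute if it exists,
        # otherwise fall back to the no-op function.
        return getattr(self._wrapped, function_name, identity_fn)

    def restart(self):
        return self._fn_checker("restart")()


class WithRestart:
    def __init__(self):
        self.restarts = 0

    def restart(self):
        self.restarts += 1


class WithoutRestart:
    pass


backend = WithRestart()
UniformWrapper(backend).restart()         # delegates to the backend
UniformWrapper(WithoutRestart()).restart()  # silently does nothing
print(backend.restarts)
```

Callers can then invoke `restart()` without checking which kind of backend is wrapped.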

_next_queue()[source]

property _start

Property to provide uniform API of _start

Returns

either the augmenter’s _start method (if available) or __identity_fn (if not available)

Return type

Callable

next()[source]

Samples and loads the next batch

Returns

the next batch

Return type

dict

property num_batches

Property returning the number of batches

Returns

number of batches

Return type

int

property num_processes

Property returning the number of processes to use for loading and augmentation

Returns

number of processes to use for loading and augmentation

Return type

int

restart()[source]

Property to provide uniform API of restart

Returns

either the augmenter’s restart method (if available) or __identity_fn (if not available)

Return type

Callable