Datasets

The Dataset is the most basic class and implements the loading of your dataset elements. You can either load your data lazily, i.e. load each sample only at the moment it is needed, or preload all samples and cache them.

Datasets can be indexed by integers and return single samples.
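
A minimal sketch of the two loading strategies, written without delira so that it runs on its own. The class names LazySamples and CachedSamples and the load_fn helper are hypothetical stand-ins, not part of the library; they only illustrate when loading happens and that indexing with an integer yields a single sample.

```python
# Illustrative stand-ins only; these classes are NOT delira's implementation.

class LazySamples:
    """Stores sample identifiers and loads a sample only when it is indexed."""

    def __init__(self, sample_ids, load_fn):
        self._ids = list(sample_ids)
        self._load_fn = load_fn

    def __getitem__(self, index):
        # Loading is deferred until the sample is actually requested.
        return self._load_fn(self._ids[index])

    def __len__(self):
        return len(self._ids)


class CachedSamples:
    """Loads every sample once at construction and keeps it in memory."""

    def __init__(self, sample_ids, load_fn):
        # All samples are loaded up front, so they must fit into RAM.
        self._samples = [load_fn(sample_id) for sample_id in sample_ids]

    def __getitem__(self, index):
        return self._samples[index]

    def __len__(self):
        return len(self._samples)


load_calls = []


def load_fn(sample_id):
    """Hypothetical loader: records the call and returns a dict sample."""
    load_calls.append(sample_id)
    return {"data": sample_id * 10, "label": sample_id % 2}


lazy = LazySamples(range(4), load_fn)
calls_after_lazy_init = len(load_calls)    # nothing has been loaded yet
lazy_sample = lazy[2]                      # triggers exactly one load

cached = CachedSamples(range(4), load_fn)  # loads all four samples now
calls_after_cache_init = len(load_calls)
cached_sample = cached[2]                  # served from memory, no new load
```

The lazy variant keeps memory use low at the cost of loading a sample every time it is accessed; the cached variant does the reverse.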

To implement a custom dataset, derive from AbstractDataset.

AbstractDataset

class AbstractDataset(data_path, load_fn, img_extensions, gt_extensions)

Bases: object

Base Class for Dataset

_make_dataset(path)

Create dataset

Parameters: path (str) – path to data samples
Returns: data: list of sample paths if lazy; list of samples if not
Return type: list

train_test_split(*args, **kwargs)

Split dataset into train and test data

Parameters:
  • *args – positional arguments of sklearn.model_selection.train_test_split
  • **kwargs – keyword arguments of sklearn.model_selection.train_test_split
Returns:

  • BlankDataset – train dataset
  • BlankDataset – test dataset

See also

sklearn.model_selection.train_test_split
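
As a sketch of how such a split could look, the snippet below divides a list of sample indices with sklearn.model_selection.train_test_split. Splitting indices rather than the samples themselves is an assumption made for illustration, and the eight-sample dataset is hypothetical; the source only points to the scikit-learn function for the accepted arguments.

```python
from sklearn.model_selection import train_test_split

# Hypothetical dataset of eight samples, represented by their integer indices.
indices = list(range(8))

# test_size and random_state are regular scikit-learn arguments; with the
# documented method they would be supplied through *args / **kwargs.
train_idx, test_idx = train_test_split(
    indices, test_size=0.25, random_state=0)

print(len(train_idx), len(test_idx))
```

Fixing random_state makes the split reproducible, which is useful when train and test sets must stay the same across runs.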

BaseLazyDataset

class BaseLazyDataset(data_path, load_fn, img_extensions, gt_extensions, **load_kwargs)

Bases: delira.data_loading.dataset.AbstractDataset

Dataset to load data in a lazy way

_is_valid_image_file(fname)

Helper function to check whether a file is an image file and has at least one label file

Parameters: fname (str) – filename of image path
Returns: whether the file is a valid data sample
Return type: bool
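
One plausible reading of this check, sketched as a standalone function: a file counts as a valid sample when its name ends with one of the image extensions and a file with the same stem and one of the label extensions exists next to it. The function name is_valid_image_file and the stem-based pairing rule are assumptions; the description above does not spell out how label files are matched.

```python
import os
import tempfile


def is_valid_image_file(fname, img_extensions, gt_extensions):
    """Return True if fname is an image and has at least one label file.

    Assumption: a label file shares the image's path stem and differs only
    in its extension (e.g. sample.png -> sample.txt).
    """
    is_image = any(fname.endswith(ext) for ext in img_extensions)
    stem = os.path.splitext(fname)[0]
    has_label = any(os.path.isfile(stem + ext) for ext in gt_extensions)
    return is_image and has_label


# Small self-check in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    labelled = os.path.join(tmp, "a.png")
    label_file = os.path.join(tmp, "a.txt")
    unlabelled = os.path.join(tmp, "b.png")
    for path in (labelled, label_file, unlabelled):
        open(path, "w").close()

    labelled_ok = is_valid_image_file(labelled, [".png"], [".txt"])
    unlabelled_ok = is_valid_image_file(unlabelled, [".png"], [".txt"])
    label_itself_ok = is_valid_image_file(label_file, [".png"], [".txt"])
```

Here only a.png passes: b.png has no label file, and a.txt is not an image.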

_make_dataset(path)

Helper Function to make a dataset containing paths to all images in a certain directory

Parameters: path (str) – path to data samples
Returns: list of sample paths
Return type: list
Raises: AssertionError – if path is not a valid directory
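
A minimal sketch of such a path-collecting helper, assuming a recursive directory walk and a caller-supplied validity predicate. The name collect_sample_paths and the is_valid parameter are hypothetical; only the assertion on the directory and the returned list of paths are taken from the description above.

```python
import os
import tempfile


def collect_sample_paths(path, is_valid):
    """Return the paths of all files under path that is_valid accepts."""
    # Mirrors the documented behaviour: fail if path is not a directory.
    assert os.path.isdir(path), "%r is not a valid directory" % path

    samples = []
    for root, _, fnames in sorted(os.walk(path)):
        for fname in sorted(fnames):
            full_path = os.path.join(root, fname)
            if is_valid(full_path):
                # Lazy variant: store the path only, load the file later.
                samples.append(full_path)
    return samples


with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "sub"))
    for rel in ("x.png", "notes.md", os.path.join("sub", "y.png")):
        open(os.path.join(tmp, rel), "w").close()

    found = collect_sample_paths(tmp, lambda p: p.endswith(".png"))
    relative = [os.path.relpath(p, tmp) for p in found]

    try:
        collect_sample_paths(os.path.join(tmp, "missing"), lambda p: True)
        raised = False
    except AssertionError:
        raised = True
```

A caching variant would call the load function on each accepted path inside the loop and store the loaded sample instead of the path.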

train_test_split(*args, **kwargs)

Split dataset into train and test data

Parameters:
  • *args – positional arguments of sklearn.model_selection.train_test_split
  • **kwargs – keyword arguments of sklearn.model_selection.train_test_split
Returns:

  • BlankDataset – train dataset
  • BlankDataset – test dataset

See also

sklearn.model_selection.train_test_split

BaseCacheDataset

class BaseCacheDataset(data_path, load_fn, img_extensions, gt_extensions, **load_kwargs)

Bases: delira.data_loading.dataset.AbstractDataset

Dataset to preload and cache data

Notes

The data must fit completely into RAM!

_is_valid_image_file(fname)

Helper function to check whether a file is an image file and has at least one label file

Parameters: fname (str) – filename of image path
Returns: whether the file is a valid data sample
Return type: bool

_make_dataset(path)

Helper Function to make a dataset containing all samples in a certain directory

Parameters: path (str) – path to data samples
Returns: list of samples
Return type: list
Raises: AssertionError – if path is not a valid directory

train_test_split(*args, **kwargs)

Split dataset into train and test data

Parameters:
  • *args – positional arguments of sklearn.model_selection.train_test_split
  • **kwargs – keyword arguments of sklearn.model_selection.train_test_split
Returns:

  • BlankDataset – train dataset
  • BlankDataset – test dataset

See also

sklearn.model_selection.train_test_split