Chainer

AbstractChainerNetwork

class AbstractChainerNetwork(**kwargs)[source]

Bases: chainer.Chain, delira.models.backends.chainer.abstract_network.ChainerMixin

Abstract Class for Chainer Networks

_init_kwargs = {}
static closure(model, data_dict: dict, optimizers: dict, losses={}, metrics={}, fold=0, **kwargs)[source]

Default closure method to perform a single training step; can be overridden for more advanced models

Parameters
  • model (AbstractChainerNetwork) – trainable model

  • data_dict (dict) – dictionary containing the data

  • optimizers (dict) – dictionary of optimizers to optimize model’s parameters; ignored here, just passed for compatibility reasons

  • losses (dict) – dict holding the losses to calculate errors; ignored here, just passed for compatibility reasons

  • metrics (dict) – dict holding the metrics to calculate

  • fold (int) – Current Fold in Crossvalidation (default: 0)

  • **kwargs – additional keyword arguments

Returns

  • dict – Metric values (with same keys as input dict metrics)

  • dict – Loss values (with same keys as input dict losses; will always be empty here)

  • dict – dictionary containing all predictions
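The closure contract above can be sketched in plain Python. This is a minimal sketch only: the `model`, `data_dict` layout, and `mae` metric below are hypothetical stand-ins, and the real implementation operates on chainer variables rather than raw arrays.

```python
import numpy as np

def closure_sketch(model, data_dict, optimizers=None, losses=None,
                   metrics=None, fold=0, **kwargs):
    """Sketch of the default closure contract: run a forward pass,
    compute metrics, and return (metric values, loss values,
    predictions). Losses are ignored here, matching the docs."""
    metrics = metrics or {}
    preds = model(data_dict["data"])          # forward pass -> dict
    metric_vals = {k: fn(preds["pred"], data_dict["label"])
                   for k, fn in metrics.items()}
    loss_vals = {}                            # always empty here
    return metric_vals, loss_vals, preds

# hypothetical model: identity prediction
model = lambda x: {"pred": x}
data = {"data": np.array([1., 2., 3.]), "label": np.array([1., 2., 4.])}
metrics = {"mae": lambda p, t: float(np.mean(np.abs(p - t)))}
m, l, p = closure_sketch(model, data, metrics=metrics)
```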

abstract forward(*args, **kwargs) → dict[source]

Feeds Arguments through the network

Parameters
  • *args – positional arguments of arbitrary number and type

  • **kwargs – keyword arguments of arbitrary number and type

Returns

dictionary containing all computation results

Return type

dict
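A concrete subclass implements `forward` to return its computation results as a dictionary. The class below is a plain-Python stand-in illustrating only that interface shape; a real network would subclass `AbstractChainerNetwork` and build its computation from chainer links.

```python
import numpy as np

class TinyNetwork:
    """Stand-in mimicking the AbstractChainerNetwork interface:
    ``forward`` accepts arbitrary args/kwargs and returns a dict
    holding all computation results."""

    def __init__(self, weight=2.0):
        self.weight = weight

    def forward(self, x, **kwargs) -> dict:
        # every result goes into one dictionary
        return {"pred": self.weight * x}

net = TinyNetwork(weight=3.0)
out = net.forward(np.array([1.0, 2.0]))
```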

property init_kwargs

Returns all arguments registered as init kwargs

Returns

init kwargs

Return type

dict

static prepare_batch(batch: dict, input_device, output_device)[source]

Helper function to prepare network inputs and labels (converts them to the correct type and shape and pushes them to the correct devices)

Parameters
  • batch (dict) – dictionary containing all the data

  • input_device (chainer.backend.Device or string) – device for network inputs

  • output_device (chainer.backend.Device or string) – device for network outputs

Returns

dictionary containing data in correct type and shape and on correct device

Return type

dict
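The batch-preparation step can be sketched as a dtype conversion over the batch dictionary. This is an assumption-laden sketch: the actual device transfer (e.g. via `chainer.backend.Device`) is omitted, and `float32` is just a typical choice.

```python
import numpy as np

def prepare_batch_sketch(batch: dict, input_device=None, output_device=None):
    """Sketch of prepare_batch: cast all entries to float32 arrays.
    Pushing data to the given devices is omitted here."""
    return {k: np.asarray(v, dtype=np.float32) for k, v in batch.items()}

prepared = prepare_batch_sketch({"data": [[1, 2]], "label": [1]})
```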

DataParallelChainerNetwork

class DataParallelChainerNetwork(module: delira.models.backends.chainer.abstract_network.AbstractChainerNetwork, devices: list, output_device=None, batch_dim=0)[source]

Bases: delira.models.backends.chainer.abstract_network.AbstractChainerNetwork

A wrapper around an AbstractChainerNetwork instance that implements parallel training by splitting the batches across devices

static _gather(predictions, dim, target_device)[source]

Re-Builds batches on the target device

Parameters
  • predictions (list) – list containing the predictions from all replicated models

  • dim (int) – dimension to use for concatenating single predictions

  • target_device (str or chainer.backend.Device) – the device, the re-built batch should lie on

Returns

the rebuilt batch (lying on target_device)

Return type

Any
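The gather step amounts to concatenating per-replica prediction dicts back into one batch. A minimal numpy sketch, leaving out the transfer to `target_device`:

```python
import numpy as np

def gather_sketch(predictions: list, dim: int = 0):
    """Sketch of _gather: re-build one batch from per-replica
    prediction dicts by concatenating each entry along ``dim``."""
    keys = predictions[0].keys()
    return {k: np.concatenate([p[k] for p in predictions], axis=dim)
            for k in keys}

# predictions from two hypothetical replicas (batch dim 0)
parts = [{"pred": np.array([[1.], [2.]])}, {"pred": np.array([[3.]])}]
merged = gather_sketch(parts, dim=0)
```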

_init_kwargs = {}
static _scatter(inputs, kwargs, target_devices: list, dim=0)[source]

Scatters all inputs (args and kwargs) to target devices and splits along given dimension

Parameters
  • inputs (list or tuple) – positional arguments

  • kwargs (dict) – keyword arguments

  • target_devices (list) – list of target devices (each either a string or chainer.backend.Device)

  • dim (int) – the dimension, which should be used for splitting the batch

Returns

  • tuple – scattered positional arguments

  • tuple – scattered keyword arguments
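The scatter step can be sketched with `numpy.array_split`: every array argument is split into one chunk per device along the batch dimension. Device transfer is again omitted, and the device strings below are purely illustrative.

```python
import numpy as np

def scatter_sketch(inputs, kwargs, target_devices, dim=0):
    """Sketch of _scatter: split each array argument into
    len(target_devices) chunks along ``dim`` and regroup the
    chunks per device."""
    n = len(target_devices)
    split = lambda a: np.array_split(np.asarray(a), n, axis=dim)
    scattered_args = tuple(zip(*[split(a) for a in inputs])) if inputs else ()
    scattered_kwargs = tuple(
        dict(zip(kwargs, chunk))
        for chunk in zip(*[split(v) for v in kwargs.values()])
    ) if kwargs else ()
    return scattered_args, scattered_kwargs

args, kws = scatter_sketch((np.arange(4),), {}, ["@numpy:0", "@numpy:1"])
```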

cleargrads()[source]
property closure

Default closure method to perform a single training step; can be overridden for more advanced models

Parameters
  • model (AbstractChainerNetwork) – trainable model

  • data_dict (dict) – dictionary containing the data

  • optimizers (dict) – dictionary of optimizers to optimize model’s parameters; ignored here, just passed for compatibility reasons

  • losses (dict) – dict holding the losses to calculate errors; ignored here, just passed for compatibility reasons

  • metrics (dict) – dict holding the metrics to calculate

  • fold (int) – Current Fold in Crossvalidation (default: 0)

  • **kwargs – additional keyword arguments

Returns

  • dict – Metric values (with same keys as input dict metrics)

  • dict – Loss values (with same keys as input dict losses; will always be empty here)

  • dict – dictionary containing all predictions

forward(*args, **kwargs)[source]

Scatters the inputs (both positional and keyword arguments) across all devices, feeds them through model replicas and re-builds batches on output device

Parameters
  • *args – positional arguments of arbitrary number and type

  • **kwargs – keyword arguments of arbitrary number and type

Returns

combined output from all scattered models

Return type

Any

property init_kwargs

Returns all arguments registered as init kwargs

Returns

init kwargs

Return type

dict

params(include_uninit=True)[source]

Only the parameters of the module on the first device are actually updated; the parameters of all other replicas are copied from it by the optimizer after each update

Parameters

include_uninit (bool) – whether to include uninitialized parameters

Returns

a generator holding the root module's parameters

Return type

generator

property prepare_batch

Helper function to prepare network inputs and labels (converts them to the correct type and shape and pushes them to the correct devices)

Parameters
  • batch (dict) – dictionary containing all the data

  • input_device (chainer.backend.Device or string) – device for network inputs

  • output_device (chainer.backend.Device or string) – device for network outputs

Returns

dictionary containing data in correct type and shape and on correct device

Return type

dict

zerograds()[source]

DataParallelChainerOptimizer

class DataParallelChainerOptimizer(optimizer)[source]

Bases: chainer.Optimizer

An optimizer wrapper to enable DataParallel training. It forwards all calls to the internal optimizer, but registers the additional hooks needed for DataParallel (namely ParallelOptimizerUpdateModelParameters as a post-update hook and ParallelOptimizerCumulateGradientsHook as a pre-update hook)

property _loss_scale
property _loss_scale_max
property _loss_scaling_is_dynamic
property _pre_update_hooks
property add_hook
property call_hooks
property check_nan_in_grads
property epoch
classmethod from_optimizer_class(optim_cls, *args, **kwargs)[source]
Parameters
  • optim_cls (subclass of chainer.Optimizer) – the optimizer to use internally

  • *args – arbitrary positional arguments (will be used for initialization of internally used optimizer)

  • **kwargs – arbitrary keyword arguments (will be used for initialization of internally used optimizer)
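The `from_optimizer_class` pattern can be sketched without chainer. The `OptimizerWrapper` and `FakeSGD` names below are hypothetical; the sketch only shows why the wrapper constructs the internal optimizer itself, so it controls instantiation and can register its hooks at that point.

```python
class OptimizerWrapper:
    """Sketch of the from_optimizer_class pattern: build the
    internal optimizer from its class plus constructor args and
    forward all other attribute access to it."""

    def __init__(self, optimizer):
        self._optimizer = optimizer
        # the real wrapper would register its pre-/post-update hooks here

    @classmethod
    def from_optimizer_class(cls, optim_cls, *args, **kwargs):
        # instantiate the internal optimizer, then wrap it
        return cls(optim_cls(*args, **kwargs))

    def __getattr__(self, name):
        # anything not defined on the wrapper goes to the optimizer
        return getattr(self._optimizer, name)

class FakeSGD:
    def __init__(self, lr=0.01):
        self.lr = lr

opt = OptimizerWrapper.from_optimizer_class(FakeSGD, lr=0.1)
```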

property is_safe_to_update
property loss_scaling
property new_epoch
property remove_hook
property serialize
property set_loss_scale
setup(link)[source]

Calls the setup method of the internal optimizer and registers the hooks necessary for data-parallel behavior

Parameters

link (DataParallel) – the target whose parameters should be updated

property target
property update
property update_loss_scale
property use_auto_new_epoch

ParallelOptimizerUpdateModelParameters

ParallelOptimizerCumulateGradientsHook

class ParallelOptimizerCumulateGradientsHook[source]

Bases: object

A hook which sums up the gradients of all replicas in a DataParallel scenario

call_for_each_param = False
name = 'DataParallelCumulateGradients'
timing = 'pre'
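The effect of this pre-update hook can be sketched as summing per-replica gradient dicts so the root replica sees the full-batch gradient before the optimizer update. A numpy sketch with hypothetical parameter names:

```python
import numpy as np

def cumulate_gradients_sketch(replica_grads: list):
    """Sketch of the pre-update hook: sum the gradients of all
    model replicas entry-wise, yielding the gradients the root
    replica is updated with."""
    keys = replica_grads[0].keys()
    return {k: sum(np.asarray(g[k]) for g in replica_grads) for k in keys}

# gradients from two hypothetical replicas for parameter "w"
grads = [{"w": np.array([1., 2.])}, {"w": np.array([3., 4.])}]
total = cumulate_gradients_sketch(grads)
```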