Chainer

AbstractChainerNetwork

class AbstractChainerNetwork(**kwargs)[source]

Bases: chainer.Chain, delira.models.backends.chainer.abstract_network.ChainerMixin

Abstract Class for Chainer Networks

_init_kwargs = {}
static closure(model, data_dict: dict, optimizers: dict, losses={}, metrics={}, fold=0, **kwargs)[source]

Default closure method to perform a single training step; can be overridden for more advanced models

Parameters
  • model (AbstractChainerNetwork) – trainable model

  • data_dict (dict) – dictionary containing the data

  • optimizers (dict) – dictionary of optimizers to optimize model’s parameters; ignored here, just passed for compatibility reasons

  • losses (dict) – dict holding the losses to calculate errors; ignored here, just passed for compatibility reasons

  • metrics (dict) – dict holding the metrics to calculate

  • fold (int) – Current Fold in Crossvalidation (default: 0)

  • **kwargs – additional keyword arguments

Returns

  • dict – Metric values (with same keys as input dict metrics)

  • dict – Loss values (with same keys as input dict losses; will always be empty here)

  • dict – dictionary containing all predictions
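The closure contract above can be sketched in plain Python. This is a minimal sketch only: the `model`, `data_dict` layout, and `mae` metric below are hypothetical stand-ins, and the real implementation operates on chainer variables rather than raw arrays.

```python
import numpy as np

def closure_sketch(model, data_dict, optimizers=None, losses=None,
                   metrics=None, fold=0, **kwargs):
    """Sketch of the default closure contract: run a forward pass,
    compute metrics, and return (metric values, loss values,
    predictions). Losses are ignored here, matching the docs."""
    metrics = metrics or {}
    preds = model(data_dict["data"])          # forward pass -> dict
    metric_vals = {k: fn(preds["pred"], data_dict["label"])
                   for k, fn in metrics.items()}
    loss_vals = {}                            # always empty here
    return metric_vals, loss_vals, preds

# hypothetical model: identity prediction
model = lambda x: {"pred": x}
data = {"data": np.array([1., 2., 3.]), "label": np.array([1., 2., 4.])}
metrics = {"mae": lambda p, t: float(np.mean(np.abs(p - t)))}
m, l, p = closure_sketch(model, data, metrics=metrics)
```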

abstract forward(*args, **kwargs) → dict[source]

Feeds Arguments through the network

Parameters
  • *args – positional arguments of arbitrary number and type

  • **kwargs – keyword arguments of arbitrary number and type

Returns

dictionary containing all computation results

Return type

dict
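A concrete subclass implements `forward` to return its computation results as a dictionary. The class below is a plain-Python stand-in illustrating only that interface shape; a real network would subclass `AbstractChainerNetwork` and build its computation from chainer links.

```python
import numpy as np

class TinyNetwork:
    """Stand-in mimicking the AbstractChainerNetwork interface:
    ``forward`` accepts arbitrary args/kwargs and returns a dict
    holding all computation results."""

    def __init__(self, weight=2.0):
        self.weight = weight

    def forward(self, x, **kwargs) -> dict:
        # every result goes into one dictionary
        return {"pred": self.weight * x}

net = TinyNetwork(weight=3.0)
out = net.forward(np.array([1.0, 2.0]))
```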

property init_kwargs

Returns all arguments registered as init kwargs

Returns

init kwargs

Return type

dict

static prepare_batch(batch: dict, input_device, output_device)[source]

Helper function to prepare network inputs and labels (converts them to the correct type and shape and pushes them to the correct devices)

Parameters
  • batch (dict) – dictionary containing all the data

  • input_device (chainer.backend.Device or string) – device for network inputs

  • output_device (chainer.backend.Device or string) – device for network outputs

Returns

dictionary containing data in correct type and shape and on correct device

Return type

dict
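The batch-preparation step can be sketched as a dtype conversion over the batch dictionary. This is an assumption-laden sketch: the actual device transfer (e.g. via `chainer.backend.Device`) is omitted, and `float32` is just a typical choice.

```python
import numpy as np

def prepare_batch_sketch(batch: dict, input_device=None, output_device=None):
    """Sketch of prepare_batch: cast all entries to float32 arrays.
    Pushing data to the given devices is omitted here."""
    return {k: np.asarray(v, dtype=np.float32) for k, v in batch.items()}

prepared = prepare_batch_sketch({"data": [[1, 2]], "label": [1]})
```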

DataParallelChainerNetwork

class DataParallelChainerNetwork(module: delira.models.backends.chainer.abstract_network.AbstractChainerNetwork, devices: list, output_device=None, batch_dim=0)[source]

Bases: delira.models.backends.chainer.abstract_network.AbstractChainerNetwork

A wrapper around an AbstractChainerNetwork instance that implements parallel training by splitting the batches across devices

static _gather(predictions, dim, target_device)[source]

Re-Builds batches on the target device

Parameters
  • predictions (list) – list containing the predictions from all replicated models

  • dim (int) – dimension to use for concatenating single predictions

  • target_device (str or chainer.backend.Device) – the device, the re-built batch should lie on

Returns

the rebuilt batch (lying on target_device)

Return type

Any
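The gather step amounts to concatenating per-replica prediction dicts back into one batch. A minimal numpy sketch, leaving out the transfer to `target_device`:

```python
import numpy as np

def gather_sketch(predictions: list, dim: int = 0):
    """Sketch of _gather: re-build one batch from per-replica
    prediction dicts by concatenating each entry along ``dim``."""
    keys = predictions[0].keys()
    return {k: np.concatenate([p[k] for p in predictions], axis=dim)
            for k in keys}

# predictions from two hypothetical replicas (batch dim 0)
parts = [{"pred": np.array([[1.], [2.]])}, {"pred": np.array([[3.]])}]
merged = gather_sketch(parts, dim=0)
```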

_init_kwargs = {}
static _scatter(inputs, kwargs, target_devices: list, dim=0)[source]

Scatters all inputs (args and kwargs) to target devices and splits along given dimension

Parameters
  • inputs (list or tuple) – positional arguments

  • kwargs (dict) – keyword arguments

  • target_devices (list) – list of target devices (each either a string or chainer.backend.Device)

  • dim (int) – the dimension, which should be used for splitting the batch

Returns

  • tuple – scattered positional arguments

  • tuple – scattered keyword arguments
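The scatter step can be sketched with `numpy.array_split`: every array argument is split into one chunk per device along the batch dimension. Device transfer is again omitted, and the device strings below are purely illustrative.

```python
import numpy as np

def scatter_sketch(inputs, kwargs, target_devices, dim=0):
    """Sketch of _scatter: split each array argument into
    len(target_devices) chunks along ``dim`` and regroup the
    chunks per device."""
    n = len(target_devices)
    split = lambda a: np.array_split(np.asarray(a), n, axis=dim)
    scattered_args = tuple(zip(*[split(a) for a in inputs])) if inputs else ()
    scattered_kwargs = tuple(
        dict(zip(kwargs, chunk))
        for chunk in zip(*[split(v) for v in kwargs.values()])
    ) if kwargs else ()
    return scattered_args, scattered_kwargs

args, kws = scatter_sketch((np.arange(4),), {}, ["@numpy:0", "@numpy:1"])
```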

cleargrads()[source]
property closure

Default closure method to perform a single training step; can be overridden for more advanced models

Parameters
  • model (AbstractChainerNetwork) – trainable model

  • data_dict (dict) – dictionary containing the data

  • optimizers (dict) – dictionary of optimizers to optimize model’s parameters; ignored here, just passed for compatibility reasons

  • losses (dict) – dict holding the losses to calculate errors; ignored here, just passed for compatibility reasons

  • metrics (dict) – dict holding the metrics to calculate

  • fold (int) – Current Fold in Crossvalidation (default: 0)

  • **kwargs – additional keyword arguments

Returns

  • dict – Metric values (with same keys as input dict metrics)

  • dict – Loss values (with same keys as input dict losses; will always be empty here)

  • dict – dictionary containing all predictions

forward(*args, **kwargs)[source]

Scatters the inputs (both positional and keyword arguments) across all devices, feeds them through model replicas and re-builds batches on output device

Parameters
  • *args – positional arguments of arbitrary number and type

  • **kwargs – keyword arguments of arbitrary number and type

Returns

combined output from all scattered models

Return type

Any

property init_kwargs

Returns all arguments registered as init kwargs

Returns

init kwargs

Return type

dict

params(include_uninit=True)[source]

Only the parameters of the module on the first device are actually updated; the parameters of all other replicas are copied from it by the optimizer after each update

Parameters

include_uninit (bool) – whether to include uninitialized parameters

Returns

a generator holding the root module's parameters

Return type

generator

property prepare_batch

Helper function to prepare network inputs and labels (converts them to the correct type and shape and pushes them to the correct devices)

Parameters
  • batch (dict) – dictionary containing all the data

  • input_device (chainer.backend.Device or string) – device for network inputs

  • output_device (chainer.backend.Device or string) – device for network outputs

Returns

dictionary containing data in correct type and shape and on correct device

Return type

dict

zerograds()[source]

DataParallelChainerOptimizer

class DataParallelChainerOptimizer(optimizer)[source]

Bases: chainer.Optimizer

An optimizer wrapper to enable DataParallel training. It forwards all calls to the internal optimizer, but registers the additional hooks needed for DataParallel (namely ParallelOptimizerUpdateModelParameters as a post-update hook and ParallelOptimizerCumulateGradientsHook as a pre-update hook)

property _loss_scale
property _loss_scale_max
property _loss_scaling_is_dynamic
property _pre_update_hooks
property add_hook
property call_hooks
property check_nan_in_grads
property epoch
classmethod from_optimizer_class(optim_cls, *args, **kwargs)[source]
Parameters
  • optim_cls (subclass of chainer.Optimizer) – the optimizer to use internally

  • *args – arbitrary positional arguments (will be used for initialization of internally used optimizer)

  • **kwargs – arbitrary keyword arguments (will be used for initialization of internally used optimizer)
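The `from_optimizer_class` pattern can be sketched without chainer. The `OptimizerWrapper` and `FakeSGD` names below are hypothetical; the sketch only shows why the wrapper constructs the internal optimizer itself, so it controls instantiation and can register its hooks at that point.

```python
class OptimizerWrapper:
    """Sketch of the from_optimizer_class pattern: build the
    internal optimizer from its class plus constructor args and
    forward all other attribute access to it."""

    def __init__(self, optimizer):
        self._optimizer = optimizer
        # the real wrapper would register its pre-/post-update hooks here

    @classmethod
    def from_optimizer_class(cls, optim_cls, *args, **kwargs):
        # instantiate the internal optimizer, then wrap it
        return cls(optim_cls(*args, **kwargs))

    def __getattr__(self, name):
        # anything not defined on the wrapper goes to the optimizer
        return getattr(self._optimizer, name)

class FakeSGD:
    def __init__(self, lr=0.01):
        self.lr = lr

opt = OptimizerWrapper.from_optimizer_class(FakeSGD, lr=0.1)
```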

property is_safe_to_update
property loss_scaling
property new_epoch
property remove_hook
property serialize
property set_loss_scale
setup(link)[source]

Calls the setup method of the internal optimizer and registers the hooks necessary for data-parallel behavior

Parameters

link (DataParallel) – the target whose parameters should be updated

property target
property update
property update_loss_scale
property use_auto_new_epoch

ParallelOptimizerUpdateModelParameters

ParallelOptimizerCumulateGradientsHook

class ParallelOptimizerCumulateGradientsHook[source]

Bases: object

A hook which sums up the gradients of all replicas in a DataParallel scenario

call_for_each_param = False
name = 'DataParallelCumulateGradients'
timing = 'pre'
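The effect of this pre-update hook can be sketched as summing per-replica gradient dicts so the root replica sees the full-batch gradient before the optimizer update. A numpy sketch with hypothetical parameter names:

```python
import numpy as np

def cumulate_gradients_sketch(replica_grads: list):
    """Sketch of the pre-update hook: sum the gradients of all
    model replicas entry-wise, yielding the gradients the root
    replica is updated with."""
    keys = replica_grads[0].keys()
    return {k: sum(np.asarray(g[k]) for g in replica_grads) for k in keys}

# gradients from two hypothetical replicas for parameter "w"
grads = [{"w": np.array([1., 2.])}, {"w": np.array([3., 4.])}]
total = cumulate_gradients_sketch(grads)
```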