DistributedDPOptimizer

class opacus.optimizers.ddpoptimizer.DistributedDPOptimizer(optimizer, *, noise_multiplier, max_grad_norm, expected_batch_size, loss_reduction='mean', generator=None, secure_mode=False)[source]

DPOptimizer variant compatible with distributed data parallel training

Parameters
  • optimizer (Optimizer) – wrapped optimizer.

  • noise_multiplier (float) – noise multiplier (ratio of the standard deviation of the added noise to the clipping norm)

  • max_grad_norm (float) – maximum per-sample gradient norm; per-sample gradients are clipped to this norm before aggregation

  • expected_batch_size (Optional[int]) – batch_size used for averaging gradients. When using Poisson sampling, the averaging denominator can’t be inferred from the actual batch size, so it must be supplied explicitly. Required if loss_reduction="mean", ignored if loss_reduction="sum"

  • loss_reduction (str) – Indicates whether the loss reduction (for aggregating the gradients) is a sum or a mean operation. Can take values “sum” or “mean”

  • generator – torch.Generator() object used as a source of randomness for the noise

  • secure_mode (bool) – if True uses noise generation approach robust to floating point arithmetic attacks. See _generate_noise() for details
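
For orientation, here is a minimal construction sketch. It assumes the process group has already been initialized (the optimizer is meant to run with one process per GPU), that the model is wrapped with the usual Opacus classes for per-sample gradient computation, and that all hyperparameter values are placeholders; in most workflows PrivacyEngine.make_private builds this optimizer for you when the model is wrapped for distributed training:

    import torch
    import torch.distributed as dist
    from opacus import GradSampleModule
    from opacus.distributed import DifferentiallyPrivateDistributedDataParallel as DPDDP
    from opacus.optimizers.ddpoptimizer import DistributedDPOptimizer

    # Assumes one process per GPU and an already-initialized process group,
    # e.g. dist.init_process_group(backend="nccl").
    model = torch.nn.Linear(16, 2).cuda()
    model = DPDDP(model)              # Opacus' DDP wrapper for DP training
    model = GradSampleModule(model)   # computes per-sample gradients (p.grad_sample)

    base_optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

    optimizer = DistributedDPOptimizer(
        optimizer=base_optimizer,
        noise_multiplier=1.0,         # placeholder value
        max_grad_norm=1.0,            # placeholder value
        expected_batch_size=64,       # per-worker batch size; needed for loss_reduction="mean"
        loss_reduction="mean",
    )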

add_noise()[source]

Adds noise to clipped gradients. Stores clipped and noised result in p.grad
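
The standard deviation of the added noise is noise_multiplier * max_grad_norm. The sketch below is conceptual only (not the library code) and assumes, as in the parent DPOptimizer, that each parameter carries a summed_grad buffer holding its clipped, accumulated gradient; in the distributed setting the noise only needs to be generated on one worker before gradients are synchronized across the process group:

    import torch

    def add_noise_sketch(params, noise_multiplier, max_grad_norm, generator=None):
        # Conceptual sketch: add Gaussian noise with standard deviation
        # noise_multiplier * max_grad_norm to each clipped, summed gradient
        # and store the result back in p.grad.
        for p in params:
            noise = torch.normal(
                mean=0.0,
                std=noise_multiplier * max_grad_norm,
                size=p.summed_grad.shape,     # summed_grad: assumed clipped-gradient buffer
                device=p.summed_grad.device,
                generator=generator,
            )
            p.grad = (p.summed_grad + noise).view_as(p)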

step(closure=None)[source]

Performs a single optimization step (parameter update).

Parameters

closure (callable) – A closure that reevaluates the model and returns the loss. Optional for most optimizers.

Note

Unless otherwise specified, this function should not modify the .grad field of the parameters.

Return type

Optional[Tensor]
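
Continuing the construction sketch above, a per-batch training step looks like standard PyTorch; the loss function and the data loader below are placeholders:

    criterion = torch.nn.CrossEntropyLoss()   # placeholder loss

    for x, y in data_loader:                  # placeholder per-worker DataLoader
        x, y = x.cuda(), y.cuda()
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()                       # fills per-sample gradients (p.grad_sample)
        optimizer.step()                      # clip, add noise, sync across workers, update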