DistributedDPOptimizerFastGradientClipping

class opacus.optimizers.ddpoptimizer_fast_gradient_clipping.DistributedDPOptimizerFastGradientClipping(optimizer, *, noise_multiplier, max_grad_norm, expected_batch_size, loss_reduction='mean', generator=None, secure_mode=False, **kwargs)[source]

opacus.optimizers.optimizer.DPOptimizer variant compatible with distributed data parallel (DDP) training.

Parameters:
  • optimizer (Optimizer) – wrapped optimizer.

  • noise_multiplier (float) – noise multiplier (ratio of the standard deviation of the added noise to the clipping norm max_grad_norm)

  • max_grad_norm (float) – maximum gradient norm used for calculating the standard deviation of the added noise

  • expected_batch_size (Optional[int]) – batch size used for averaging gradients. When using Poisson sampling, the averaging denominator can’t be inferred from the actual batch size. Required if loss_reduction="mean", ignored if loss_reduction="sum"

  • loss_reduction (str) – Indicates whether the loss reduction (for aggregating the gradients) is a sum or a mean operation. Can take values “sum” or “mean”

  • generator – torch.Generator() object used as a source of randomness for the noise

  • secure_mode (bool) – if True, uses a noise generation approach that is robust to floating-point arithmetic attacks. See _generate_noise() for details
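
A minimal construction sketch based on the signature above. The model, learning rate, and DP parameter values are illustrative placeholders, and the sketch assumes a torch.distributed process group is already initialized (as in a DDP run); the model wrapping needed for per-sample clipping hooks is not shown here.

```python
import torch
from opacus.optimizers.ddpoptimizer_fast_gradient_clipping import (
    DistributedDPOptimizerFastGradientClipping,
)

# Assumes torch.distributed.init_process_group(...) has already been called,
# since this optimizer is meant for distributed (DDP) training.
model = torch.nn.Linear(16, 2)                                 # placeholder model
base_optimizer = torch.optim.SGD(model.parameters(), lr=0.05)  # wrapped optimizer

dp_optimizer = DistributedDPOptimizerFastGradientClipping(
    base_optimizer,
    noise_multiplier=1.0,    # noise std relative to max_grad_norm
    max_grad_norm=1.0,       # per-sample clipping bound
    expected_batch_size=64,  # required because loss_reduction="mean"
    loss_reduction="mean",
)
```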

add_noise()[source]

Adds noise to the clipped gradients. Stores the clipped and noised result in p.grad.
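
A conceptual sketch (not the library’s implementation) of what this amounts to per parameter: Gaussian noise whose standard deviation is derived from noise_multiplier and max_grad_norm is added to the already-clipped gradient, and the result is kept in p.grad.

```python
import torch

def add_noise_sketch(params, noise_multiplier, max_grad_norm, generator=None):
    # Noise std follows the documented relation between noise_multiplier
    # and max_grad_norm (the clipping bound).
    std = noise_multiplier * max_grad_norm
    for p in params:
        if p.grad is None:
            continue
        noise = std * torch.randn(
            p.grad.shape, device=p.grad.device, generator=generator
        )
        p.grad = p.grad + noise  # clipped + noised result stored back in p.grad
```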

step(closure=None)[source]

Performs a single optimization step to update the model parameters.

Parameters:

closure (Callable) – A closure that reevaluates the model and returns the loss. Optional for most optimizers.

Return type:

Optional[Tensor]
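
An illustrative call pattern for step() in a training loop. The variable names are placeholders, and the DP-specific model/loss wrapping that makes per-sample clipping happen during the backward pass is assumed to be in place but omitted here; a closure may also be passed, following the standard torch.optim convention shown in the signature above.

```python
for inputs, targets in data_loader:
    dp_optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()      # per-sample clipping handled by the wrapped model/loss
    dp_optimizer.step()  # noises the clipped gradients and applies the update
```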