DistributedPerLayerOptimizer
- class opacus.optimizers.ddp_perlayeroptimizer.DistributedPerLayerOptimizer(optimizer, *, noise_multiplier, max_grad_norm, expected_batch_size, loss_reduction='mean', generator=None, secure_mode=False, **kwargs)
DPOptimizer that implements the per-layer clipping strategy and is compatible with distributed data parallel (a usage sketch follows the parameter list).

- Parameters:
  - optimizer (Optimizer) – wrapped optimizer.
  - noise_multiplier (float) – noise multiplier
  - max_grad_norm (List[float]) – max grad norm used for gradient clipping
  - expected_batch_size (Optional[int]) – batch_size used for averaging gradients. When using Poisson sampling the averaging denominator can't be inferred from the actual batch size. Required if loss_reduction="mean", ignored if loss_reduction="sum"
  - loss_reduction (str) – Indicates if the loss reduction (for aggregating the gradients) is a sum or a mean operation. Can take values "sum" or "mean"
  - generator – torch.Generator() object used as a source of randomness for the noise
  - secure_mode (bool) – if True, uses a noise generation approach robust to floating point arithmetic attacks. See _generate_noise() for details
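In typical use this optimizer is not constructed directly; PrivacyEngine.make_private() selects a distributed per-layer optimizer when per-layer clipping is requested on a distributed model (which concrete class is chosen depends on the configured grad sample mode). The following is a minimal sketch of that flow, not a verbatim recipe from these docs: the model, dataset, and hyperparameter values are placeholders.

```python
import torch
import torch.distributed as dist
from opacus import PrivacyEngine
from opacus.distributed import DifferentiallyPrivateDistributedDataParallel as DPDDP

# Assumes a launched distributed job (e.g. via torchrun) that provides
# MASTER_ADDR/MASTER_PORT/RANK/WORLD_SIZE in the environment.
dist.init_process_group(backend="gloo")  # or "nccl" on GPUs

model = DPDDP(  # DP-aware DDP wrapper shipped with Opacus
    torch.nn.Sequential(
        torch.nn.Linear(16, 32),
        torch.nn.ReLU(),
        torch.nn.Linear(32, 2),
    )
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
dataset = torch.utils.data.TensorDataset(
    torch.randn(64, 16), torch.randint(0, 2, (64,))
)
train_loader = torch.utils.data.DataLoader(dataset, batch_size=8)

# Per-layer clipping expects one clipping threshold per trainable
# parameter tensor, hence a list for max_grad_norm.
n_params = sum(1 for p in model.parameters() if p.requires_grad)

privacy_engine = PrivacyEngine()
model, optimizer, train_loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=train_loader,
    noise_multiplier=1.0,
    max_grad_norm=[1.0] * n_params,
    clipping="per_layer",
)
```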
- property accumulated_iterations: int
Returns the number of batches currently accumulated and not yet processed.
In other words, accumulated_iterations tracks the number of forward/backward passes done in between two optimizer steps. The value is typically 1, but there are possible exceptions. Used by privacy accountants to calculate the real sampling rate. See the sketch below for an illustration.
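To make the accumulation semantics concrete, the hedged sketch below (reusing the hypothetical model, optimizer, and train_loader from the previous example) runs two backward passes per optimizer step, so accumulated_iterations reads 2 just before each step().

```python
import torch.nn.functional as F

for i, (x, y) in enumerate(train_loader):
    loss = F.cross_entropy(model(x), y)
    loss.backward()           # per-sample grads are clipped and accumulated
    if (i + 1) % 2 == 0:      # step only every second forward/backward pass
        # Two passes have been accumulated since the last step, so
        # optimizer.accumulated_iterations == 2 at this point.
        optimizer.step()
        optimizer.zero_grad()
```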
- class opacus.optimizers.ddp_perlayeroptimizer.SimpleDistributedPerLayerOptimizer(optimizer, *, noise_multiplier, max_grad_norm, expected_batch_size, loss_reduction='mean', generator=None, secure_mode=False, **kwargs)
- Parameters:
  - optimizer (Optimizer) – wrapped optimizer.
  - noise_multiplier (float) – noise multiplier
  - max_grad_norm (float) – max grad norm used for gradient clipping
  - expected_batch_size (Optional[int]) – batch_size used for averaging gradients. When using Poisson sampling the averaging denominator can't be inferred from the actual batch size. Required if loss_reduction="mean", ignored if loss_reduction="sum"
  - loss_reduction (str) – Indicates if the loss reduction (for aggregating the gradients) is a sum or a mean operation. Can take values "sum" or "mean"
  - generator – torch.Generator() object used as a source of randomness for the noise
  - secure_mode (bool) – if True, uses a noise generation approach robust to floating point arithmetic attacks. See _generate_noise() for details
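Compared to DistributedPerLayerOptimizer above, the notable signature difference is that max_grad_norm is a single float rather than a list. The following is a hedged sketch of direct construction under stated assumptions: torch.distributed is already initialised as in the first example, and the module is wrapped so per-sample gradients (p.grad_sample) are available, e.g. via GradSampleModule. All concrete values are placeholders.

```python
import torch
from opacus.grad_sample import GradSampleModule
from opacus.optimizers.ddp_perlayeroptimizer import SimpleDistributedPerLayerOptimizer

# GradSampleModule makes p.grad_sample available on each parameter,
# which the DP optimizer needs for per-sample clipping.
model = GradSampleModule(torch.nn.Linear(16, 2))
base_optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

optimizer = SimpleDistributedPerLayerOptimizer(
    base_optimizer,
    noise_multiplier=1.0,
    max_grad_norm=1.0,       # single float threshold, unlike the list above
    expected_batch_size=8,   # required since loss_reduction defaults to "mean"
    loss_reduction="mean",
)
```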