DistributedPerLayerOptimizer¶

class opacus.optimizers.ddp_perlayeroptimizer.DistributedPerLayerOptimizer(optimizer, *, noise_multiplier, max_grad_norm, expected_batch_size, loss_reduction='mean', generator=None, secure_mode=False, **kwargs)[source]¶

DPOptimizer that implements per layer clipping strategy and is compatible with distributed data parallel

Parameters:

optimizer (Optimizer) – wrapped optimizer.
noise_multiplier (float) – noise multiplier
max_grad_norm (List[float]) – max grad norm used for gradient clipping
expected_batch_size (Optional[int]) – batch_size used for averaging gradients. When using Poisson sampling averaging denominator can’t be inferred from the actual batch size. Required is loss_reduction="mean", ignored if loss_reduction="sum"
loss_reduction (str) – Indicates if the loss reduction (for aggregating the gradients) is a sum or a mean operation. Can take values “sum” or “mean”
generator – torch.Generator() object used as a source of randomness for the noise
secure_mode (bool) – if True uses noise generation approach robust to floating point arithmetic attacks. See _generate_noise() for details

property accumulated_iterations: int¶

Returns number of batches currently accumulated and not yet processed.

In other words accumulated_iterations tracks the number of forward/backward passed done in between two optimizer steps. The value would typically be 1, but there are possible exceptions.

Used by privacy accountants to calculate real sampling rate.

add_noise()[source]¶: Adds noise to clipped gradients. Stores clipped and noised result in p.grad

clip_and_accumulate()[source]¶: Performs gradient clipping. Stores clipped and aggregated gradients into p.summed_grad``

pre_step(closure=None)[source]¶

Perform actions specific to DPOptimizer before calling underlying optimizer.step()

Parameters:: closure (Optional[Callable[[], float]]) – A closure that reevaluates the model and returns the loss. Optional for most optimizers.
Return type:: Optional[float]

class opacus.optimizers.ddp_perlayeroptimizer.SimpleDistributedPerLayerOptimizer(optimizer, *, noise_multiplier, max_grad_norm, expected_batch_size, loss_reduction='mean', generator=None, secure_mode=False, **kwargs)[source]¶

Parameters:

optimizer (Optimizer) – wrapped optimizer.
noise_multiplier (float) – noise multiplier
max_grad_norm (float) – max grad norm used for gradient clipping
expected_batch_size (Optional[int]) – batch_size used for averaging gradients. When using Poisson sampling averaging denominator can’t be inferred from the actual batch size. Required is loss_reduction="mean", ignored if loss_reduction="sum"
loss_reduction (str) – Indicates if the loss reduction (for aggregating the gradients) is a sum or a mean operation. Can take values “sum” or “mean”
generator – torch.Generator() object used as a source of randomness for the noise
secure_mode (bool) – if True uses noise generation approach robust to floating point arithmetic attacks. See _generate_noise() for details

DistributedPerLayerOptimizer¶

Opacus

Navigation

Related Topics