Multipliers

Multiplier objects represent the dual variables \( \vlambda \) and \( \vmu \) in the constrained problem. They are required by certain formulations, such as Lagrangian and AugmentedLagrangian.

In a generic formulation \( \Lag \), the dual variables correspond to the inner maximization variables:

\[ \min_{\vx \in \reals^d} \,\, \max_{\vlambda \ge \vzero, \vmu} \,\, \Lag(\vx,\vlambda, \vmu). \]
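To make this min-max structure concrete, the sketch below runs simultaneous gradient descent-ascent on a toy problem in plain Python. The problem, the step size, and the function name are assumptions chosen for illustration; this is not Cooper's implementation.

```python
# Toy problem (assumed for illustration):
#   minimize f(x) = x**2  subject to  g(x) = 1 - x <= 0.
# Lagrangian: L(x, lam) = x**2 + lam * (1 - x), with lam >= 0.


def gradient_descent_ascent(num_steps=500, step_size=0.1):
    x, lam = 0.0, 0.0
    for _ in range(num_steps):
        grad_x = 2.0 * x - lam  # dL/dx
        grad_lam = 1.0 - x      # dL/dlam, which equals the constraint value g(x)
        # Simultaneous update: descent on x, ascent on lam.
        x, lam = x - step_size * grad_x, lam + step_size * grad_lam
        # Project the multiplier back onto lam >= 0.
        lam = max(0.0, lam)
    return x, lam


x_final, lam_final = gradient_descent_ascent()
print(x_final, lam_final)
```

For this problem the iterates approach x = 1 and lam = 2.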

In Cooper, multipliers are implemented as torch.nn.Modules, ensuring compatibility with PyTorch’s autograd capabilities. They are evaluated via a forward() call.

The cooper.multipliers module provides three types of multipliers:

DenseMultiplier: Represents each multiplier individually, with each entry in the multiplier vector corresponding to a separate constraint.
IndexedMultiplier: Similar to DenseMultiplier, but supports efficient indexing. This is useful when constraints are sampled, since only the multipliers of the sampled constraints need to be accessed and updated (see Constraint Sampling).
ImplicitMultiplier: Instead of storing multipliers explicitly, ImplicitMultipliers compute their values through a forward call on an arbitrary torch.nn.Module. This is particularly useful when the number of constraints is too large to maintain individual multipliers.

Linking constraints and multipliers

Constraint objects require an associated Multiplier when the problem formulation demands it. You can check this using the expects_multiplier attribute of a Formulation sub-class. To ensure compliance, pass a Multiplier object to the Constraint constructor.

multiplier = ...
constraint = cooper.Constraint(
    multiplier=multiplier,
    constraint_type=cooper.ConstraintType.INEQUALITY,
    formulation_type=cooper.formulations.Lagrangian,
)

Note

The helper methods CMP.multipliers and CMP.named_multipliers allow iteration over the multipliers associated with constraints registered in a CMP. For more details, see Registering constraints in a CMP.

Explicit (Non-Parametric) Multipliers

Consider the following Lagrangian formulation of a constrained optimization problem:

\[ \Lag(\vx, \vlambda, \vmu) = f(\vx) + \vlambda^\top \vg(\vx) + \vmu^\top \vh(\vx), \]

where \(\vlambda = [\lambda_i]_{i=1}^m\) and \(\vmu = [\mu_i]_{i=1}^n\) are the Lagrange multipliers associated with the inequality and equality constraints, respectively.
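
As a worked example (a toy problem assumed here purely for illustration), consider \(f(\vx) = x_1^2 + x_2^2\) with a single equality constraint \(h(\vx) = x_1 + x_2 - 1 = 0\) and no inequality constraints. The Lagrangian reduces to

\[ \Lag(\vx, \mu) = x_1^2 + x_2^2 + \mu (x_1 + x_2 - 1). \]

Setting the partial derivatives with respect to \(x_1\), \(x_2\), and \(\mu\) to zero gives

\[ 2 x_1 + \mu = 0, \qquad 2 x_2 + \mu = 0, \qquad x_1 + x_2 = 1, \]

so \(x_1 = x_2 = 1/2\) and \(\mu = -1\). The multiplier of an equality constraint may be negative, which is why only \(\vlambda\) is restricted to be non-negative in the min-max problem.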

ExplicitMultiplier objects represent the vectors \(\vlambda\) and \(\vmu\) directly, by storing one decision variable per constraint. Cooper provides two types of explicit multipliers: DenseMultipliers and IndexedMultipliers.

Initialization

To create an ExplicitMultiplier, you must specify either (i) the number of associated constraints or (ii) an initial value for each multiplier entry.

The example below illustrates how to construct multiplier objects. Note that the syntax is consistent between DenseMultiplier and IndexedMultiplier.

# When specifying the number of constraints, all multipliers are initialized to zero
multiplier = cooper.multipliers.DenseMultiplier(
    num_constraints=3,
    device=torch.device("cpu"),
    dtype=torch.float32,
)

# When `init` is provided, `num_constraints` is inferred from its shape
multiplier = cooper.multipliers.IndexedMultiplier(
    init=torch.ones(7),
    device=torch.device("cuda"),
    dtype=torch.float16,
)

Evaluating an ExplicitMultiplier

Cooper stores multiplier values in the weight attribute, which you can access to inspect the current multipliers.

However, to leverage PyTorch’s autograd functionality, we recommend evaluating multipliers via their forward() method. For example:

# `DenseMultiplier`s do not require arguments during evaluation
multiplier_value = multiplier()

# `IndexedMultiplier`s require indices for evaluation
indices = torch.tensor([0, 2, 4, 6])
multiplier_value = multiplier(indices)

class cooper.multipliers.ExplicitMultiplier(num_constraints=None, init=None, device=None, dtype=torch.float32)[source]

An ExplicitMultiplier holds a torch.nn.parameter.Parameter (weight) which explicitly contains the value of the Lagrange multipliers associated with a Constraint in a ConstrainedMinimizationProblem.

Parameters:

num_constraints (Optional[int]) – Number of constraints associated with the multiplier.
init (Optional[Tensor]) – Tensor used to initialize the multiplier values. If both init and num_constraints are provided, init must have shape (num_constraints,).
device (Optional[device]) – Device for the multiplier. If None, the device is inferred from the init tensor or the default device.
dtype (dtype) – Data type for the multiplier. Default is torch.float32.

static initialize_weight(num_constraints, init, device=None, dtype=torch.float32)[source]

Initialize the weight of the multiplier. If init is provided, it is used as the initial value; if num_constraints is also provided, init must have shape (num_constraints,). If only num_constraints is provided, the weight is initialized to torch.zeros() of shape (num_constraints,).

Raises:

ValueError – If both num_constraints and init are None.
ValueError – If both num_constraints and init are provided but their shapes are inconsistent.
ValueError – If the provided init is not a 1D tensor.

Return type:

Tensor
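
The initialization rules above can be summarized with a small sketch. It uses plain Python lists in place of tensors, and the function name is hypothetical; it mirrors the documented behavior and error cases but is not Cooper's source code.

```python
def initialize_weight_sketch(num_constraints=None, init=None):
    """Return the initial multiplier values as a plain list of floats."""
    if num_constraints is None and init is None:
        raise ValueError("Provide at least one of `num_constraints` or `init`.")
    if init is None:
        # Only the number of constraints is known: start every multiplier at zero.
        return [0.0] * num_constraints
    if any(isinstance(entry, (list, tuple)) for entry in init):
        raise ValueError("`init` must be one-dimensional.")
    if num_constraints is not None and len(init) != num_constraints:
        raise ValueError("`init` and `num_constraints` are inconsistent.")
    # `init` determines the values; the number of constraints is its length.
    return [float(entry) for entry in init]


print(initialize_weight_sketch(num_constraints=3))  # [0.0, 0.0, 0.0]
print(initialize_weight_sketch(init=[1, 1, 1, 1]))  # [1.0, 1.0, 1.0, 1.0]
```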

sanity_check()[source]

Ensures multipliers for inequality constraints are non-negative.

Raises: ValueError – If the multiplier is associated with an inequality constraint and any of its entries is negative.
Return type: None

post_step_()[source]

Projects (in-place) multipliers associated with inequality constraints so that they remain non-negative. This function is called after each dual optimizer step.

Return type: None

abstract forward(*args, **kwargs)

Return the current value of the multiplier.

Return type: Tensor

Dense Multipliers

DenseMultiplier objects are ideal for problems with a small to medium number of constraints, where all constraints are measured or observed at each iteration.

We refer to this type of multiplier as dense because all multipliers are utilized at every optimization step (e.g., during the computation of the Lagrangian), as opposed to only a subset being used.

A DenseMultiplier is essentially a wrapper around a torch.Tensor to provide an interface consistent with other types of multipliers. It implements the forward() method, which returns all multipliers as a single tensor. This method takes no arguments.

For large-scale Constraint objects (e.g., one constraint per training example) or problems where constraints are sampled and not all constraints are observed simultaneously, consider using an IndexedMultiplier or ImplicitMultiplier instead.

class cooper.multipliers.DenseMultiplier(num_constraints=None, init=None, device=None, dtype=torch.float32)[source]

Sub-class of ExplicitMultiplier for constraints that are all evaluated at every optimization step.

forward()[source]

Returns the current value of the multiplier.

Return type: Tensor

Indexed Multipliers

Like DenseMultipliers, IndexedMultipliers represent the multiplier tensors directly, but allow efficiently accessing and updating specific entries by index.

IndexedMultiplier objects are designed for situations where only a subset of the constraints is observed at each iteration. This is particularly useful when the number of constraints is large, such as in tasks where a constraint is imposed for each data point. In such cases, measuring all constraints simultaneously can be computationally prohibitive.

Given indices idx, the forward() method of an IndexedMultiplier returns the multipliers corresponding to the entries of idx. This enables time-efficient retrieval of the multipliers for only the sampled constraints, while also supporting memory-efficient sparse gradients (on GPU).

To use an IndexedMultiplier, pass the indices of the observed constraints as the constraint_features argument of ConstraintState after computing the constraints. Cooper then knows which multipliers to fetch and update during optimization. For example, if you measured the constraints at indices 0, 11, and 17, you would construct the ConstraintState as follows:

observed_violation_tensor = torch.tensor([3.0, 1.0, 4.0])
observed_constraint_indices = torch.tensor([0, 11, 17])

constraint_state = cooper.ConstraintState(
    violation=observed_violation_tensor,
    constraint_features=observed_constraint_indices
)
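
Conceptually, fetching and updating only the observed multipliers looks like the sketch below. A plain Python list stands in for the multiplier tensor, plain gradient ascent is assumed as the dual update, and the function name is hypothetical; this is an illustration, not Cooper's implementation.

```python
def update_sampled_multipliers(multipliers, indices, violations, step_size):
    """Gradient-ascent step on the sampled entries only, followed by projection."""
    for index, violation in zip(indices, violations):
        # For the Lagrangian, the gradient with respect to an inequality
        # multiplier is the violation of its constraint.
        multipliers[index] = max(0.0, multipliers[index] + step_size * violation)


multipliers = [0.0] * 20  # one multiplier per constraint
update_sampled_multipliers(
    multipliers,
    indices=[0, 11, 17],
    violations=[3.0, 1.0, 4.0],
    step_size=0.1,
)
# Only the entries at the sampled indices have changed.
print([(i, value) for i, value in enumerate(multipliers) if value != 0.0])
```

Entries whose constraints were not sampled are left untouched, which is what keeps the update cost proportional to the number of observed constraints.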

Warning

The forward() call of an IndexedMultiplier expects a tensor of indices (of dtype torch.long) corresponding to the constraints whose multipliers you want to fetch. If you want to fetch all multipliers, you must provide the indices of all constraints.

class cooper.multipliers.IndexedMultiplier(num_constraints=None, init=None, device=None, dtype=torch.float32)[source]

Sub-class of ExplicitMultiplier for indexed constraints, of which only a subset is evaluated at each optimization step.

forward(indices)[source]

Return the current value of the multiplier at the provided indices.

Parameters: indices (Tensor) – Indices of the multipliers to return. The shape of indices must be (num_indices,).
Raises: ValueError – If indices dtype is not torch.long.
Return type: Tensor

Implicit (Parametric) Multipliers

Rather than maintaining a separate learnable parameter for each multiplier, it can be more practical to model the multipliers as functions of the constraints. This approach is particularly beneficial when the number of constraints is large, as explicitly maintaining a Lagrange multiplier for each constraint can be computationally expensive or infeasible.

We can represent the multipliers functionally as follows: \(\vlambda_{\phi}: \mathbb{R}^d \to \mathbb{R}^m\) and \(\vmu_{\psi}: \mathbb{R}^d \to \mathbb{R}^n\). In this representation, \(d\) refers to the dimensionality of the constraint feature space, while \(\phi\) and \(\psi\) are the parameters of the multiplier functions. These functions calculate the multiplier values based on the input constraint features.

To support this functional approach, Cooper provides the ImplicitMultiplier class. To use it, you must implement its forward() method, which takes a tensor of constraint features as input and returns the corresponding multiplier values. For example, to define a linear model for the multipliers, you can write:

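import cooper
import torch
import torch.nn as nn

class LinearMultiplier(cooper.multipliers.ImplicitMultiplier):
    def __init__(self, input_dim, output_dim):
        super().__init__()
        self.linear = nn.Linear(input_dim, output_dim)

    def forward(self, constraint_features):
        # Map each row of constraint features to its multiplier value(s)
        return self.linear(constraint_features)

    def post_step_(self):
        # post_step_ is abstract in ImplicitMultiplier; no post-processing is done here
        pass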

This approach can be extended to more complex models, such as neural networks.

Much like the IndexedMultiplier, you need to provide the constraint features associated with the observed constraints to the constraint_features attribute of the ConstraintState. Cooper will then perform forward and backward passes through the multiplier model and update its parameters accordingly.

Note

Due to their functional nature, implicit multipliers can (approximately) represent infinitely many constraints. This capability is inspired by the “Lagrange multiplier model” proposed by [NCZ+20].

Warning

Because of the high flexibility of implicit multipliers, the post_step_ method is not implemented in the base class. For applications involving inequality constraints and implicit multipliers, you must implement the logic to ensure the non-negativity of the multipliers associated with inequality constraints.
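
One possible way to meet this requirement (an assumption made here for illustration, not something Cooper prescribes) is to build non-negativity into the multiplier model itself by passing its output through a non-negative function such as softplus. Note that this is a reparameterization rather than a projection step. The sketch below uses plain Python, and all names are hypothetical.

```python
import math


def softplus(z):
    """Numerically stable log(1 + exp(z)); the result is always non-negative."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def linear_softplus_multiplier(constraint_features, weights, bias):
    """Hypothetical multiplier model: softplus of a linear function of the features."""
    pre_activation = sum(w * f for w, f in zip(weights, constraint_features)) + bias
    return softplus(pre_activation)


weights, bias = [0.5, -2.0], 0.1
for features in ([1.0, 0.0], [0.0, 3.0], [-4.0, 5.0]):
    # Every printed multiplier value is non-negative, whatever the parameters are.
    print(features, linear_softplus_multiplier(features, weights, bias))
```

With this design the multiplier cannot become negative for any parameter values, so no projection is needed after the dual step.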

class cooper.multipliers.ImplicitMultiplier(*args, **kwargs)[source]

An implicit multiplier is a torch.nn.Module that computes the value of a Lagrange multiplier associated with a Constraint based on the “features” for each constraint. The multiplier is implicitly represented by its parameters.

abstract forward()[source]

Return the current value of the multiplier.

Return type: Tensor

abstract post_step_()[source]

This method is called after each step of the dual optimizer and allows for additional post-processing of the implicit multiplier module or its parameters.

Return type: None

Base Class

class cooper.multipliers.Multiplier(*args, **kwargs)[source]

abstract forward(*args, **kwargs)[source]

Return the current value of the multiplier.

Return type: Tensor

abstract post_step_()[source]

Post-step function for multipliers. This function is called after each step of the dual optimizer, and allows for additional post-processing of the multiplier module or its parameters.

Return type: None

sanity_check()[source]

Perform sanity checks on the multiplier. This method is called after setting the constraint type and ensures consistency between the multiplier and the constraint type. For example, multipliers for inequality constraints must be non-negative.

Return type: None

Checkpointing

To save the current multipliers of a CMP, use the state_dict() method to create a state checkpoint. Later, you can restore this state using load_state_dict(). This process captures the multiplier and penalty coefficient values (see CMP Checkpointing for details).
