21.4.5.1. Augmented Lagrange Multiplier Method
The ALM method is also called the Method of Multipliers (MOM) or the primal-dual method. Consider first the Lagrangian functional for equality constraints only.
\(L\left( \mathbf{x},\mathbf{\lambda } \right)=f\left( \mathbf{x} \right)+{{\mathbf{\lambda }}^{T}}\mathbf{h}\left( \mathbf{x} \right)\)
Now, for a Lagrange multiplier vector \({{\mathbf{\lambda }}^{*}}\), suppose that there is an optimum \({{\mathbf{x}}^{*}}\) for the following unconstrained optimization problem.
\(\underset{\mathbf{x}}{\mathop{\min }}\,L\left( \mathbf{x},{{\mathbf{\lambda }}^{*}} \right)\)
If \({{\mathbf{x}}^{*}}\) satisfies all the equality constraints \(\left( \mathbf{h}\left( {{\mathbf{x}}^{*}} \right)=\mathbf{0} \right)\) of the original design problem, then \({{\mathbf{x}}^{*}}\) is an optimum for the original optimization problem and \({{\mathbf{\lambda }}^{*}}\) is the optimal Lagrange multiplier vector. Consequently, the original optimization problem can be transformed into the following problem, which has the same optimum \({{\mathbf{x}}^{*}}\) and \({{\mathbf{\lambda }}^{*}}\).
\(\underset{\mathbf{x}}{\mathop{\min }}\,L\left( \mathbf{x},\mathbf{\lambda } \right)\)
subject to \({{h}_{i}}\left( \mathbf{x} \right)=0,\text{ }i=1,2,...,l.\)
To avoid unboundedness of the Lagrangian, a penalty term is introduced. The resulting functional is called the augmented Lagrangian.
\(A\left( \mathbf{x},\mathbf{\lambda },\mathbf{r} \right)=L\left( \mathbf{x},\mathbf{\lambda } \right)+\frac{1}{2}\sum\limits_{i=1}^{l}{{{r}_{i}}{{h}_{i}}{{\left( \mathbf{x} \right)}^{2}}}\)
where \({{r}_{i}}\) is the penalty parameter for the \(i\)th equality constraint. In the ALM method, an unconstrained optimization solver sequentially minimizes the augmented Lagrangian for given values of \({{r}_{i}}\) and \({{\lambda }_{i}}\). These two parameters are then updated so as to satisfy the optimality conditions.
The update rule for Lagrange multipliers can be determined from the following relation.
\(\underset{\mathbf{x}\to {{\mathbf{x}}^{*}}}{\mathop{\lim }}\,\nabla A\left( \mathbf{x},\mathbf{\lambda },\mathbf{r} \right)=\nabla L\left( {{\mathbf{x}}^{*}},{{\mathbf{\lambda }}^{*}} \right)\)
This implies
\(\underset{\mathbf{x}\to {{\mathbf{x}}^{*}}}{\mathop{\lim }}\,\left( {{\lambda }_{i}}+{{r}_{i}}{{h}_{i}}\left( \mathbf{x} \right) \right)=\lambda _{i}^{*},\text{ }i=1,2,...,l.\)
Hence, the update rule for Lagrange multipliers is
\(\lambda _{i}^{k+1}=\lambda _{i}^{k}+r_{i}^{k}{{h}_{i}}\left( {{\mathbf{x}}^{k}} \right),\text{ }i=1,2,...,l\)
where the superscript \(k\) denotes the iteration index of the ALM algorithm.
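The iteration above can be illustrated with a minimal sketch. The code below is a toy illustration of the ALM loop for equality constraints, not AutoDesign's implementation: the inner unconstrained minimization uses plain gradient descent instead of BFGS, and the problem (minimize \(x_1^2+x_2^2\) subject to \(x_1+x_2-1=0\), solution \(\mathbf{x}^*=(0.5,0.5)\), \(\lambda^*=-1\)) is chosen for simplicity.

```python
# Minimal ALM sketch for an equality-constrained toy problem (illustrative only;
# the inner solver is plain gradient descent, not the BFGS/CG used by AutoDesign).
# Problem: min f(x) = x1^2 + x2^2  subject to  h(x) = x1 + x2 - 1 = 0.

def h(x):                       # equality constraint h(x) = 0
    return x[0] + x[1] - 1.0

def grad_A(x, lam, r):          # gradient of A = f + lam*h + (r/2)*h^2
    c = lam + r * h(x)
    return [2.0 * x[0] + c, 2.0 * x[1] + c]

def alm_equality(r=10.0, outer=15, inner=500, step=0.05):
    x, lam = [0.0, 0.0], 0.0
    for _ in range(outer):
        for _ in range(inner):  # unconstrained minimization of A for fixed lam, r
            g = grad_A(x, lam, r)
            x = [xi - step * gi for xi, gi in zip(x, g)]
        lam = lam + r * h(x)    # multiplier update: lam^{k+1} = lam^k + r*h(x^k)
    return x, lam

x_opt, lam_opt = alm_equality()
```

Note that the multiplier update drives \(h(\mathbf{x}^k)\) to zero: here \(\lambda\) converges to \(-1\) and \(\mathbf{x}\) to \((0.5, 0.5)\).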
Inequality constraints are transformed into equality constraints by adding slack variables \(\left( {{\theta }_{j}}\ge 0 \right)\), so that \({{g}_{j}}\left( \mathbf{x} \right)+{{\theta }_{j}}=0\). Thus, the augmented Lagrangian becomes
\(A\left( \mathbf{x},\mathbf{\theta },\mathbf{\mu },\mathbf{r} \right)=f\left( \mathbf{x} \right)+\sum\limits_{j=1}^{m}{\left[ {{\mu }_{j}}\left( {{g}_{j}}\left( \mathbf{x} \right)+{{\theta }_{j}} \right)+\frac{1}{2}{{r}_{j}}{{\left( {{g}_{j}}\left( \mathbf{x} \right)+{{\theta }_{j}} \right)}^{2}} \right]}\).
Then the new primal variables are \(\mathbf{\bar{x}}=\left\{ \mathbf{x},\mathbf{\theta } \right\}\). The augmented Lagrangian must satisfy the optimality condition with respect to each slack variable \({{\theta }_{j}}\).
\(\frac{\partial A}{\partial {{\theta }_{j}}}={{\mu }_{j}}+{{r}_{j}}\left( {{g}_{j}}\left( \mathbf{x} \right)+{{\theta }_{j}} \right)=0,\text{ }j=1,2,...,m\)
Hence, accounting for the bound \({{\theta }_{j}}\ge 0\), the optimal slack variable is
\(\theta _{j}^{*}=\max \left\{ 0,-\frac{{{\mu }_{j}}}{{{r}_{j}}}-{{g}_{j}}\left( \mathbf{x} \right) \right\}\).
Now, these optimal values are substituted back into the transformed constraints.
\({{g}_{j}}\left( \mathbf{x} \right)+\theta _{j}^{*}={{g}_{j}}\left( \mathbf{x} \right)+\max \left\{ 0,-\frac{{{\mu }_{j}}}{{{r}_{j}}}-{{g}_{j}}\left( \mathbf{x} \right) \right\}=\max \left\{ {{g}_{j}}\left( \mathbf{x} \right),-\frac{{{\mu }_{j}}}{{{r}_{j}}} \right\}\)
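This max identity is easy to verify numerically; the quick sketch below checks it over a grid of illustrative values for \(g\), \(\mu\), and \(r\) (the values themselves are arbitrary).

```python
# Numerical check of the identity g + max(0, -mu/r - g) = max(g, -mu/r)
def identity_holds(g, mu, r):
    lhs = g + max(0.0, -mu / r - g)   # g_j(x) + theta_j*
    rhs = max(g, -mu / r)             # the claimed closed form
    return abs(lhs - rhs) < 1e-12

checks = [identity_holds(g, mu, r)
          for g in (-2.0, -0.5, 0.0, 0.5, 2.0)
          for mu in (0.0, 1.0, 3.0)
          for r in (1.0, 10.0)]       # all True
```

The two branches agree: when \(g<-\mu /r\) both sides equal \(-\mu /r\); otherwise both sides equal \(g\).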
Hence, the augmented Lagrangian for inequality constraints is transformed into the following simpler functional, which no longer contains the slack variables.
\(A\left( \mathbf{x},\mathbf{\mu },\mathbf{r} \right)=f\left( \mathbf{x} \right)+\sum\limits_{j=1}^{m}{\left[ {{\mu }_{j}}{{\Psi }_{j}}\left( \mathbf{x},{{\mu }_{j}},{{r}_{j}} \right)+\frac{1}{2}{{r}_{j}}{{\Psi }_{j}}{{\left( \mathbf{x},{{\mu }_{j}},{{r}_{j}} \right)}^{2}} \right]}\)
where \({{\Psi }_{j}}\left( \mathbf{x},{{\mu }_{j}},{{r}_{j}} \right)=\max \left\{ {{g}_{j}}\left( \mathbf{x} \right),-\frac{{{\mu }_{j}}}{{{r}_{j}}} \right\}\). The Lagrange multiplier update rule is defined as
\(\mu _{j}^{k+1}=\mu _{j}^{k}+r_{j}^{k}{{\Psi }_{j}}\left( {{\mathbf{x}}^{k}},\mu _{j}^{k},r_{j}^{k} \right),\text{ }j=1,2,...,m\).
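Putting the \(\Psi\) functional and the multiplier update together gives the inequality-constrained ALM loop. The sketch below is again a toy illustration, not AutoDesign's solver: it uses gradient descent for the inner minimization on the problem minimize \(x^2\) subject to \(g(x)=1-x\le 0\), whose solution is \(x^*=1\) with \(\mu^*=2\).

```python
# Minimal ALM sketch for one inequality constraint (illustrative only).
# Problem: min f(x) = x^2  subject to  g(x) = 1 - x <= 0  (solution x* = 1, mu* = 2).

def g(x):                        # inequality constraint g(x) <= 0
    return 1.0 - x

def psi(x, mu, r):               # Psi(x, mu, r) = max{ g(x), -mu/r }
    return max(g(x), -mu / r)

def grad_A(x, mu, r):            # derivative of A = f + mu*Psi + (r/2)*Psi^2 w.r.t. x
    if g(x) > -mu / r:           # constraint branch active: Psi = g(x), dPsi/dx = -1
        return 2.0 * x - (mu + r * g(x))
    return 2.0 * x               # Psi is constant in x: only df/dx remains

def alm_inequality(r=10.0, outer=15, inner=400, step=0.05):
    x, mu = 0.0, 0.0
    for _ in range(outer):
        for _ in range(inner):   # unconstrained minimization of A for fixed mu, r
            x -= step * grad_A(x, mu, r)
        mu = mu + r * psi(x, mu, r)   # mu^{k+1} = mu^k + r*Psi(x^k, mu^k, r^k)
    return x, mu

x_opt, mu_opt = alm_inequality()
```

Because \(\Psi\) saturates at \(-\mu /r\), inactive constraints stop contributing to the gradient, and the multipliers of active constraints converge just as in the equality case.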
To sequentially solve the augmented Lagrangian subproblems, AutoDesign uses the quasi-Newton (BFGS) method for \(n\le 50\) and the Hestenes-Stiefel conjugate gradient method for \(n>50\).
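For reference, the Hestenes-Stiefel variant of the conjugate gradient method can be sketched as follows. This is a generic illustration, not AutoDesign's internal implementation; it uses an exact line search on a diagonal quadratic, where the closed-form step length is available.

```python
# Generic CG with the Hestenes-Stiefel beta formula (illustrative sketch).
# Minimizes f(x) = 0.5*sum(a_i*x_i^2) - sum(b_i*x_i), whose solution is x_i = b_i/a_i.
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [1.0, 1.0, 1.0, 1.0, 1.0]

def grad(x):                     # gradient of the diagonal quadratic
    return [ai * xi - bi for ai, xi, bi in zip(a, x, b)]

def exact_step(x, d):            # exact line search: alpha = -g.d / (d^T A d)
    gg = grad(x)
    return -sum(gi * di for gi, di in zip(gg, d)) / sum(ai * di * di for ai, di in zip(a, d))

def hestenes_stiefel_cg(x, iters=20, tol=1e-12):
    g = grad(x)
    d = [-gi for gi in g]        # initial direction: steepest descent
    for _ in range(iters):
        alpha = exact_step(x, d)
        x = [xi + alpha * di for xi, di in zip(x, d)]
        g_new = grad(x)
        if sum(gi * gi for gi in g_new) < tol:
            break
        y = [gn - go for gn, go in zip(g_new, g)]                 # y_k = g_{k+1} - g_k
        beta = (sum(gn * yi for gn, yi in zip(g_new, y))
                / sum(di * yi for di, yi in zip(d, y)))           # Hestenes-Stiefel beta
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        g = g_new
    return x

x_opt = hestenes_stiefel_cg([0.0] * 5)
```

On a quadratic with exact line search this recovers linear CG, which converges in at most \(n\) iterations; the HS formula is preferred for large \(n\) because it needs only gradient differences, no Hessian storage.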