ESSENTIALLY QUADRATIC PENALTY FUNCTIONS AND THE MARATOS EFFECT

by

Nikiforos A. Papadakis and Panos Y. Papalambros

Technical Report UM-MEAM-89-06
June 1989

DESIGN LABORATORY
THE UNIVERSITY OF MICHIGAN
ANN ARBOR

ESSENTIALLY QUADRATIC PENALTY FUNCTIONS AND THE MARATOS EFFECT

by

N. A. Papadakis, Research Fellow
and
P. Y. Papalambros, Associate Professor

Department of Mechanical Engineering and Applied Mechanics
The University of Michigan
Ann Arbor, Michigan 48109

June 1989

ABSTRACT

An augmented Lagrangian function with essentially quadratic penalty terms is proposed as a merit function for performing the line search in sequential quadratic programming methods. Convergence of the stepsize to unity is established, thus eliminating occurrence of the Maratos Effect. Global convergence issues are also discussed. An inexact line search routine based on the Goldstein termination criteria is presented, together with numerical experiments.

Key Words: merit functions, sequential quadratic programming, Maratos Effect, inexact line search, superlinear convergence, global convergence

1. Introduction

Sequential quadratic programming (SQP) methods have been used extensively for solving nonlinear constrained optimization problems. An SQP method requires three basic steps. At the beginning of each iteration, a quadratic programming approximation to the original nonlinear problem is constructed at the current estimate of the solution x_k. The quadratic subproblem is then solved to obtain a search direction d_k emanating from x_k. At the second step, a line search along d_k is performed to reduce the value of an appropriate merit function and thus obtain a better estimate x_{k+1}. Finally, an approximation of the Hessian of the Lagrangian function at x_{k+1} is computed using first-order information and a variable metric formula, in preparation for the next iteration.

These methods have several desirable features. In particular, SQP methods with appropriate merit functions are globally convergent (see, e.g., Han [7]). Locally, close to the problem solution, if unit step sizes are accepted by the line search routine, Q-superlinear convergence is achieved (Han [8], Powell [13]). Only first-order information is needed, and the methods perform equally well for both equality and inequality constrained models.

An unfavorable characteristic of SQP methods, when they employ nonsmooth merit functions, is the so-called Maratos Effect: as the solution of the problem is approached, the step length calculated by the line search routine may not converge to unity and the superlinear convergence may be destroyed (see Maratos [10], Chamberlain [2]). Several remedies have been developed to deal with the Maratos Effect. Chamberlain et al. [3] developed a watchdog technique, while Mayne & Polak [11], Gabay [6] and Fukushima [5] developed line search strategies along a curved direction that tends to follow the constraint surface. Differentiable functions have also been used as merit functions; see, e.g., Di Pillo & Grippo [4].
Schittkowski [19], Powell & Yuan [16], and Rustem [17] devised differentiable merit functions and developed line search procedures that accept unit step sizes when approaching the solution.

Continuous penalty functions deserving more careful consideration are those with second derivatives increasing as the feasible set is approached. In a previous investigation (Papadakis & Papalambros [12]), we examined the use of the merit function

    m(x, c) = f(x) + c Σ_{j∈J} n_j(x) g_j(x),    (1.1)

where f(x) is the function to be minimized, g_j(x) is the constraint function of the jth violated constraint, c is an appropriate penalty coefficient, and

    n_j(x) = 1                  if |g_j(x)| > 1,
    n_j(x) = 1 − ln |g_j(x)|    otherwise.

A line search algorithm based on this merit function, a local convergence analysis, and numerical results are discussed in the cited reference. Those promising results motivated further study of this type of merit function. In the present article we consider a line search algorithm based on essentially quadratic penalty functions, first proposed by Kort & Bertsekas [9]. In the next section, we discuss the basic reasons favoring the use of essentially quadratic functions in the context of SQP methods. We then establish local convergence of the stepsize to unity, and we analyze the global convergence problem. We discuss an algorithmic implementation and illustrate the behavior of the method using two simple examples.

2. Motivation

Consider the nonlinear constrained optimization problem

    minimize f(x)  subject to  g(x) ≤ 0,    (2.1)

where x ∈ Rⁿ and f, g = [g₁, ..., g_m]ᵀ are twice continuously differentiable functions. The quadratic programming approximation at x_k (the current estimate of the solution x* at the beginning of the kth iteration) has the form

    minimize  (1/2) d_kᵀ B_k d_k + ∇f_kᵀ d_k    (2.2a)
    subject to  g_k + ∇g_kᵀ d_k ≤ 0,    (2.2b)

where d_k is the direction of search emanating from x_k, and ∇f_k, ∇g_k denote the gradients of f, g evaluated at x_k. Matrix B_k is a positive definite approximation to the Hessian of the Lagrangian function

    L(x, λ) = f(x) + λᵀ g(x)    (2.3)

evaluated at x_k and at λ* = [λ₁*, ..., λ_m*]ᵀ, the Lagrange multiplier vector at the solution of Eq. (2.1). Solving the problem in Eq. (2.2) gives the direction of search d_k and multipliers λ_k = [λ₁, ..., λ_m]ᵀ corresponding to the linearized constraints. The next iterate x_{k+1} is given by

    x_{k+1} = x_k + α_k d_k,    (2.4)

where α_k is selected so that an appropriate merit function is decreased. Finally, the Hessian approximation B_{k+1} is computed with a variable metric formula (Powell [14]), using function and gradient values at x_k, x_{k+1}, and the values of the multiplier vector λ_k as an approximation to λ*.
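When the linearized constraints in Eq. (2.2b) are treated as equalities, as is appropriate near the solution, the quadratic subproblem reduces to a linear KKT system. The following sketch (illustrative only; the function name and the dense solve are ours, not the authors' implementation) computes d_k and λ_k:

```python
import numpy as np

def qp_direction(B, grad_f, G, g):
    """Solve  min 0.5 d'Bd + grad_f'd  s.t.  g + G'd = 0.

    B: (n, n) positive definite Hessian approximation B_k
    grad_f: (n,) gradient of f at x_k
    G: (n, m) matrix whose columns are the constraint gradients
    g: (m,) constraint values at x_k
    Returns the search direction d_k and multipliers lambda_k.
    """
    n, m = G.shape
    # KKT system:  [B  G; G' 0] [d; lam] = [-grad_f; -g]
    K = np.block([[B, G], [G.T, np.zeros((m, m))]])
    rhs = np.concatenate([-grad_f, -g])
    sol = np.linalg.solve(K, rhs)
    return sol[:n], sol[n:]
```

For instance, for Maratos' problem of Section 5 evaluated at its solution (1, 0) with B = I, the routine returns d = 0 and the multiplier −1/2.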

Consider now an augmented Lagrangian type of merit function with the quadratic term replaced by an "essentially quadratic" penalty function (Bertsekas [1]). Let us define

    m(x, λ, c, p) = f(x) + λᵀ g(x) + c P(p, g(x)),    (2.5)

where

    P(p, g(x)) = Σ_{j∈J} [ |g_j(x)|^{p_j} / p_j + g_j(x)² / 2 ],    (2.6)

with

    J = { j | g_j(x) > 0 }    (2.7)

and p = [p₁, ..., p_m]ᵀ, p_j ∈ (1, 2], c > 0. Function P(p, g(x)) is continuously differentiable and strictly convex with respect to g(x), and

    P(p, 0) = 0,  ∇P(p, 0) = 0.    (2.8)

From Eq. (2.6) we may easily verify that ∇²P is not continuous. In fact, for any p_j ∈ (1, 2), ∇²P tends to infinity as g_j(x), j ∈ J, tends to zero. The use of augmented Lagrangian functions of the form in Eq. (2.5) with p_j ∈ (1, 2] is discussed by Bertsekas in [1]. It is noted there that multiplier methods based on essentially quadratic penalty functions, even with a finite penalty coefficient, are superlinearly convergent. However, a significant disadvantage of these methods is that the essentially quadratic terms may produce ill-conditioned unconstrained problems, and then the superlinear convergence may not be achieved. The purpose of using the penalty function described by Eq. (2.6) as a merit function in SQP methods is to provide a strong and differentiable representation of the remaining infeasibility as the solution of the problem is approached. It has been observed that the

Maratos Effect usually occurs in the vicinity of x*, when the gradients ∇f(x) and ∇g(x) are almost linearly dependent and almost orthogonal to the direction of search. In that respect, the phenomenon resembles the instability of line search algorithms for unconstrained optimization that emerges when the direction of search and the gradient of the objective are almost orthogonal. A feature of SQP methods which significantly enhances the appearance of the Maratos Effect is that, as x* is approached, d_k lies almost entirely in the null space of ∇g(x_k)ᵀ. During the late stages of the process, the direction d_k is given approximately by a quasi-Newton step minimizing the quadratic approximation of the Lagrangian function. Thus, the use of function P(p, g(x)) in Eq. (2.5) amplifies the role of the existing infeasibility in the line search, while the "orthogonality" of d_k and ∇g(x_k) tends to eliminate the effect of the ill-conditioning mentioned before.

3. Convergence Analysis

To ensure the existence of a point x_{k+1} satisfying the descent condition

    m_k(0) > m_k(α),    (3.1)

where m_k(α) denotes the merit function defined by Eqs. (2.5) to (2.7) evaluated at (x_k + α d_k, λ_k, c_k, p_k), α ∈ (0, 1], we impose the sign restriction

    m′_k(0) < 0.    (3.2)

From Eqs. (2.5) to (2.7) we have

    m′_k(0) = ∇f_kᵀ d_k + λ_kᵀ ∇g_kᵀ d_k + c_k ∇P_kᵀ d_k,    (3.3)

where

    ∇P_k = Σ_{j∈J} β_j ∇g_j(x_k),    (3.4)

    β_j = g_j(x_k)^{p_j − 1} + g_j(x_k) > 0.    (3.5)

Also, from the first-order necessary conditions for optimality of the problem in Eq. (2.2) we have

    ∇f_k + ∇g_k λ_k = −B_k d_k.    (3.6)

From Eqs. (3.4) and (2.2b) we obtain

    ∇P_kᵀ d_k ≤ −Σ_{j∈J} β_j g_j(x_k) < 0.    (3.7)

Finally, from Eqs. (3.3) and (3.7), and given B_k > 0, we have

    m′_k(0) ≤ −d_kᵀ B_k d_k − c_k Σ_{j∈J} β_j g_j(x_k) < 0,    (3.8)

which is satisfied for any c_k > 0. Next we examine the behavior of the function m_k(α) as x_k tends to a stationary point of problem (2.1). As will be shown in this section, a particularly good choice of the components of vector p_k is

    p_j = 1 − [1 + g_j + ∇g_jᵀ d_k] g_j / [g_j ln g_j + ∇g_jᵀ d_k]   if g_j(x_k) ∈ (0, 1),
    p_j = 2   otherwise.    (3.9)

Let us also define

    γ_j = (p_j − 1) g_j^{p_j − 2} + 1 > 0.    (3.10)
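The quantities above can be written down directly. The sketch below assumes the reconstructed form of the penalty in Eq. (2.6), P = Σ |g_j|^{p_j}/p_j + g_j²/2, which is the form consistent with the derivatives β_j and γ_j in Eqs. (3.5) and (3.10); function names are illustrative:

```python
import math

def P(gs, ps):
    """Essentially quadratic penalty, Eq. (2.6): only violated
    constraints (g_j > 0) contribute."""
    return sum(abs(g)**p / p + g * g / 2 for g, p in zip(gs, ps) if g > 0)

def beta(g, p):
    """First derivative of one penalty term, Eq. (3.5)."""
    return g**(p - 1) + g

def gamma(g, p):
    """Second derivative of one penalty term, Eq. (3.10)."""
    return (p - 1) * g**(p - 2) + 1

def exponent(g, gd):
    """Exponent selection of Eq. (3.9); gd stands for grad g_j' d_k."""
    if 0 < g < 1:
        return 1 - (1 + g + gd) * g / (g * math.log(g) + gd)
    return 2.0
```

As a quick check, for a small violation g_j with ∇g_jᵀd_k ≈ −g_j, the selected exponent lies in (1, 2) and γ_j ∇g_jᵀd_k + β_j > 0, the inequality h(p_j) > 0 used in the proof of Lemma 3.2 below.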

An important consequence of the selection described by Eq. (3.9) is that the merit function m_k(α) can very easily be convexified. In fact we have:

Lemma 3.1. If we keep the penalty coefficient fixed at any positive value c, there exists a certain iteration K such that the function m_k(α), k > K, is convex.

Proof: We may assume without loss of generality that as d_k → 0 there exists a sufficiently large k such that the activity of the constraint set remains invariant (the set J defined by Eq. (2.7) becomes independent of k), and it makes no difference if the violated constraints are treated as equalities; see, e.g., Powell [13]. Furthermore, m_k(α), α ∈ (0, 1], can be approximated by its second-order Taylor expansion

    m_k(α) = m_k(0) + α m′_k(0) + (1/2) α² m″_k(0),    (3.11)

with

    m″_k(0) = d_kᵀ [∇²L_k + c Σ_{j∈J} (γ_j ∇g_j ∇g_jᵀ + β_j ∇²g_j)] d_k,    (3.12)

    ∇²L_k = ∇²f_k + Σ_{j∈J} λ_{jk} ∇²g_j.    (3.13)

For k large and x_k close enough to x*, we know that there exists a positive number c̄ such that for c′ > c̄ the matrix ∇²L_k + c′ Σ_{j∈J} ∇g_j ∇g_jᵀ becomes positive definite [1]. Therefore, if c satisfies

    c > c̄ / min{γ_j, j ∈ J},    (3.14)

then

    ∇²L_k + c Σ_{j∈J} γ_j ∇g_j ∇g_jᵀ    (3.15)

is positive definite. Additionally, as d_k → 0, Eq. (3.9) combined with Eqs. (3.5) and (3.10) results in

    β_j → e⁻¹,    (3.16)

    γ_j → −1 / (e g_j ln g_j).    (3.17)

Therefore, if c satisfies Eq. (3.14), then for k large we have

    m″_k(0) = d_kᵀ [∇²L_k + c Σ_{j∈J} γ_j ∇g_j ∇g_jᵀ] d_k + O(‖d_k‖²) > 0.    (3.18)

Given that γ_j (∇g_jᵀ d_k)² = O(‖d_k‖ / |ln g_j|), m_k(α) is convex. □

Lemma 3.1 is based on the fact that the representation of the existing infeasibility by P(p, g(x)) is convex with respect to g(x) and strong enough to ensure convexity of m_k(α), even if c_k tends to zero, provided it satisfies Eq. (3.14). Therefore, if c is fixed at a certain positive value, m_k(α) becomes convex and we do not need to guess the threshold value c̄ to achieve convexity.

Next we examine the convergence of the stepsize α_k to unity as x_k → x*. We assume that the sequences {λ_k}, {B_k}, {d_k} are bounded, that ∇g(x_k) has full rank, and that the matrices B_k are positive definite and uniformly bounded.
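The limits in Eqs. (3.16) and (3.17) can be checked numerically under the exponent rule of Eq. (3.9). The sketch below uses the near-solution approximation ∇g_jᵀd_k ≈ −g_j (an approximation from the analysis, not a general identity):

```python
import math

def rates(g):
    """beta_j (Eq. 3.5) and gamma_j (Eq. 3.10) under the exponent
    selection of Eq. (3.9), with the near-solution setting gd = -g."""
    gd = -g
    p = 1 - (1 + g + gd) * g / (g * math.log(g) + gd)  # Eq. (3.9)
    b = g**(p - 1) + g                                  # Eq. (3.5)
    c = (p - 1) * g**(p - 2) + 1                        # Eq. (3.10)
    return b, c
```

For g_j = 10⁻⁸ this gives β_j ≈ 0.39, approaching e⁻¹ ≈ 0.368, while γ_j agrees with −1/(e g_j ln g_j) to within about 1%.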

Lemma 3.2. Let the above assumptions be satisfied and let the components of vector p_k be selected according to Eq. (3.9). Then, for k large enough,

    m_k(1) < m_k(0).    (3.19)

Proof: From Eqs. (3.3) to (3.6) we have

    m′_k(0) = −d_kᵀ B_k d_k + c Σ_{j∈J} β_j ∇g_jᵀ d_k.    (3.20)

Since the matrices B_k are bounded, Eq. (3.20) may be written as

    m′_k(0) = c Σ_{j∈J} β_j ∇g_jᵀ d_k + O(‖d_k‖²).    (3.21)

To show that inequality (3.19) holds, it is sufficient to show that

    m′_k(0) + (1/2) m″_k(0) < (1/2) m′_k(0).    (3.22)

Furthermore, because of Eqs. (3.12), (3.20), and Lemma 3.1, it is enough to have

    d_kᵀ [∇²L_k − B_k + c Σ_{j∈J} β_j ∇²g_j] d_k + c Σ_{j∈J} [γ_j (∇g_jᵀ d_k)² + β_j ∇g_jᵀ d_k] < 0.    (3.23)

Taking into account Eqs. (3.15) and (3.21), Eq. (3.23) becomes

    Σ_{j∈J} [γ_j (∇g_jᵀ d_k)² + β_j ∇g_jᵀ d_k] + O(‖d_k‖²) < 0.    (3.24)

We examine next the inequality

    h(p_j) = γ_j ∇g_jᵀ d_k + β_j > 0,  j ∈ J,    (3.25)

where γ_j, β_j are given by Eqs. (3.10) and (3.5), respectively. We have

    h″(p_j) = g_j^{p_j − 2} ln g_j [g_j ln g_j + ∇g_jᵀ d_k (2 + (p_j − 1) ln g_j)].    (3.26)

Therefore, if p_j satisfies the inequality

    1 < p_j < η_j,  η_j = 1 − g_j / ∇g_jᵀ d_k − 2 / ln g_j,    (3.27)

function h(p_j) is convex. Moreover,

    h′(1) < 0,  h(2) = 2(g_j + ∇g_jᵀ d_k) < 0,    (3.28)

and for d_k small enough,

    h(1) = 1 + g_j + ∇g_jᵀ d_k > 0.    (3.29)

From the above we conclude that Eq. (3.25), treated as an equality, has only one solution in the interval [1, η_j]. The first Newton iteration for solving Eq. (3.25) results in

    p_j = 1 − [1 + g_j + ∇g_jᵀ d_k] g_j / [g_j ln g_j + ∇g_jᵀ d_k].    (3.30)

Because of the geometry of function h(p_j) discussed before, p_j given by Eq. (3.30) satisfies Eq. (3.25) as a strict inequality, and for k large enough Eq. (3.23), and thus Eq. (3.19), becomes valid. □

Lemma 3.2 guarantees the acceptance of unity stepsize as d_k → 0. Moreover, from Eq. (3.22) we have

    m_k(1) − m_k(0) ≤ −c Σ_{j∈J} γ_j (∇g_jᵀ d_k)² + O(‖d_k‖²),    (3.31)

and because of Eq. (3.17),

    m_k(1) − m_k(0) ≤ −c O(‖d_k‖ / |ln g_j|) + O(‖d_k‖²).    (3.32)

Therefore, although the terms O(‖d_k‖²) depend on c (see also Eq. (3.23)), small values of c are enough to ensure validity of Eq. (3.32). In fact, from Eqs. (3.23), (3.31) and (3.32) we may conclude that small values of the penalty parameter c are enough to compensate for the effect of the Hessian approximations (B_k converges to ∇²L(x*, λ*) on the tangent plane only) and to eliminate the effect of the constraint curvature. Finally, from Eq. (3.30) we may also conclude that as d_k → 0, p_j → 1, and the penalty function P(p_k, g_k) tends uniformly to the absolute value penalty function while differentiability is maintained.

4. Global Convergence

The exponent vector p_k and the form of the merit function defined in the last section depend on the location x_k. Although the analysis we performed so far ensures the existence of x_{k+1} such that Eq. (3.1) is valid and unity stepsize becomes acceptable at the end of the process, questions may arise concerning the global convergence of the algorithm. The penalty coefficient may be kept constant, but the global descent condition

    m_k(x_k) > m_k(x_{k+1}) > m_{k+1}(x_{k+1}) > m_{k+1}(x_{k+2}) > ···    (4.1)

may not be valid. Note that in Eq. (4.1), function m_k(·) is evaluated with λ = λ_k, p = p_k, k = 0, 1, ....

One means to achieve a consistent measure of the progress of the algorithm is to establish a correspondence between the sequence in Eq. (4.1) and the associated sequence of values of the absolute value merit function defined by

    m_a(x) = f(x) + c_a P_a(g(x)),    (4.2)

where the set J is defined by Eq. (2.7), and

    P_a(g(x)) = Σ_{j∈J} |g_j(x)|,    (4.3)

    c_a > max{|λ_{jk}|, j ∈ J}.    (4.4)

It is known that if c_a satisfies Eq. (4.4) and m_a(·) is used as the objective of the line search algorithm, so that

    m_a(x_k) > m_a(x_{k+1}) > ···,    (4.5)

then the SQP method is globally convergent (Han [7]). Next we define the sets

    L = {j | g_j(x_k) > 0, g_j(x_{k+1}) > 0},
    M = {j | g_j(x_k) > 0, g_j(x_{k+1}) ≤ 0},    (4.6)
    N = {j | g_j(x_k) ≤ 0, g_j(x_{k+1}) > 0}.

Let us first assume that L ∪ M ≠ ∅. Inequality (3.1) implies (4.5) if

    c_a [P_a(g_k) − P_a(g_{k+1})] > c [P(p_k, g_k) − P(p_k, g_{k+1})] + λ_kᵀ [g_k − g_{k+1}].    (4.7)
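The bookkeeping behind Eqs. (4.2), (4.3) and (4.6) amounts to only a few lines; a minimal sketch (function names are ours):

```python
def abs_merit(f_val, g_vals, c_a):
    """Absolute value merit function, Eqs. (4.2)-(4.3): only violated
    constraints (g_j > 0) contribute to the penalty term."""
    return f_val + c_a * sum(gj for gj in g_vals if gj > 0)

def classify(g_k, g_k1):
    """Index sets L, M, N of Eq. (4.6) for two consecutive iterates,
    given the constraint values at x_k and x_{k+1}."""
    L = [j for j, (a, b) in enumerate(zip(g_k, g_k1)) if a > 0 and b > 0]
    M = [j for j, (a, b) in enumerate(zip(g_k, g_k1)) if a > 0 and b <= 0]
    N = [j for j, (a, b) in enumerate(zip(g_k, g_k1)) if a <= 0 and b > 0]
    return L, M, N
```

For example, a constraint that stays violated falls in L, one that becomes feasible falls in M, and one that becomes newly violated falls in N.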

If x_{k+1} satisfies

    P_a(g_k) > P_a(g_{k+1}),    (4.8)

then c_a can be selected so that both Eqs. (4.7) and (4.4) are valid at any iteration. Let α_k be a stepsize satisfying the local descent condition (3.1) and small enough that Eq. (4.7) may be approximated by

    Σ_{j∈L} [α_k ∇g_j(x_k)ᵀ d_k + (α_k²/2) d_kᵀ ∇²g_j(x_k) d_k] − Σ_{j∈M} g_j(x_k)
      + Σ_{j∈N} [(α_k − a_j) ∇g_j(x_k + a_j d_k)ᵀ d_k + ((α_k − a_j)²/2) d_kᵀ ∇²g_j(x_k + a_j d_k) d_k] < 0,    (4.9)

where a_j, j ∈ N, is the steplength with

    g_j(x_k + a_j d_k) = 0.    (4.10)

From the definition Eq. (4.6) we also have

    ∇g_j(x_k)ᵀ d_k ≤ −g_j(x_k) < 0,  j ∈ L,    (4.11)

and

    ∇g_j(x_k + a_j d_k)ᵀ d_k > 0,  j ∈ N.    (4.12)

Furthermore, α_k satisfies Eq. (3.1), and since m_k(α) behaves like a quadratic form, for any α ∈ (0, α_k] we have

    m_k(α) ≤ m_k(α_k).    (4.13)

The cardinality of the set N is an increasing function of α_k. Because of Eq. (4.13) we may select a stepsize ᾱ_k ∈ (0, α_k] such that the local descent condition (3.1) remains valid and the contribution of the set N in Eq. (4.9) becomes small. Moreover, from Eq. (4.13) the terms contributed by the sets L, M in (4.9) are negative and bounded away from zero. Because of the definition of the set M and Eq. (4.11), we may always select a steplength such that Eq. (4.9), and consequently Eq. (4.8), is valid.

The case L ∪ M = ∅, N ≠ ∅ may be treated in a similar way. We only note that in this case we may select α_k such that

    f_{k+1} − f_k < 0,    (4.14)

and Eq. (4.5) becomes valid. Moreover, as x_k approaches x*, the sets M, N become empty, and from Eq. (4.9) we may immediately conclude that Eq. (4.5) does not prevent unity stepsize from being accepted by the line search routine, so superlinear convergence may be achieved. Finally, we note that to check whether {x_k, k = 0, 1, ...} is globally convergent, we only need additional constraint and objective function evaluations. The stepsize α_k may be reduced until inequality (4.8) or (4.14) becomes valid, while local descent with respect to m_k(α) is always satisfied. Since the line search algorithm can be considered as a closed map on a compact set, global convergence can be achieved.
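The stepsize reduction just described can be sketched as follows. The halving rule is illustrative only; the text merely requires that α_k be reduced until Eq. (4.8) or Eq. (4.14) holds:

```python
def safeguard_step(x, d, f, g, alpha=1.0, max_halvings=30):
    """Reduce the stepsize until the infeasibility decreases, Eq. (4.8),
    or the objective decreases, Eq. (4.14), as described in Section 4.
    f and g are callables; g returns the list of constraint values."""
    Pa = lambda gv: sum(gj for gj in gv if gj > 0)  # Eq. (4.3)
    f0, Pa0 = f(x), Pa(g(x))
    for _ in range(max_halvings):
        x1 = [xi + alpha * di for xi, di in zip(x, d)]
        if Pa(g(x1)) < Pa0 or f(x1) < f0:
            break
        alpha *= 0.5
    return alpha
```

Only function evaluations are used, in agreement with the remark closing this section.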

5. The Algorithmic Implementation and Numerical Results

An inexact line search algorithm was developed using Goldstein's termination criterion. At the ith iteration of the line search algorithm, a stepsize α_{k,i} is accepted if

    B_k(α_{k,i}) ≥ m_k(α_{k,i}) ≥ Γ_k(α_{k,i}),    (5.1)

where

    B_k(α_{k,i}) = m_k(0) + ξ_k m′_k(0) α_{k,i},    (5.2)
    Γ_k(α_{k,i}) = m_k(0) + (1 − ξ_k) m′_k(0) α_{k,i},    (5.3)

with ξ_k ∈ (0, 1/2) and α_{k,1} = 1. If m_k(α_{k,i}) < Γ_k(α_{k,i}), then α_{k,i+1} is calculated as the intersection of the second-order approximation of m_k(α) (based on m_k(0), m′_k(0), m_k(α_{k,i})) with B_k(α). Similarly, if m_k(α_{k,i}) > B_k(α_{k,i}), then α_{k,i+1} is selected as the intersection of the same approximation with Γ_k(α). The algorithm also employs a bracketing procedure. An interval [γ_i, ρ_i], γ_1 = 0, ρ_1 ≥ 1, which contains a subset of acceptable values of α_k, is used. If the value of α_{k,i} is such that m_k(α_{k,i}) < Γ_k(α_{k,i}), the lower bound γ_i is increased by the amount δ_i = δ̄(ρ_i − γ_i), δ̄ small. Similarly, if m_k(α_{k,i}) > B_k(α_{k,i}), then ρ_i is decreased by δ_i.

Let us discuss the performance of the above algorithm on two well-known examples devised to test line search routines. We have chosen ξ_k = 10⁻⁴, γ_1 = 0, ρ_1 = 1, δ̄ = 0.1. In both cases the Hessian of the Lagrangian has been approximated using the Broyden-Fletcher-Goldfarb-Shanno update, and the value of the penalty coefficient was set equal to one. Finally, we have used the termination criteria ‖d_k‖ ≤ 10⁻⁵, ‖g(x)‖ ≤ 10⁻⁵. The first problem (see also Maratos [10]) is

    minimize  x₁² + x₂²
    subject to  (x₁ + 1)² + x₂² − 4 = 0,

with solution (1, 0)ᵀ. Four different initial points and two different initial Hessian approximations, shown in Table 1, have been examined. In all cases, fast convergence of the stepsize to unity has been achieved. The second problem, devised by Powell [14], is

    minimize  10(x₁² + x₂² − 1) − x₁
    subject to  x₁² + x₂² − 1 = 0,

with solution (1, 0)ᵀ. Again, the numerical results summarized in Table 2 confirm rapid convergence of the stepsize to unity and superlinear convergence of the SQP algorithm. Direct comparison of the results described in Tables 1 and 2 indicates that the convergence of the stepsize to unity is nearly unaffected by the guess of B₀ and by the way the sequence {B_k, k = 0, 1, ...} converges to ∇²L(x*, λ*) on the tangent plane. It appears that this is because the merit function described in Section 2 emphasizes the part of the Newton iteration concerning feasibility, thus reducing the effect of B_k on the line search algorithm. Besides the results presented in Tables 1 and 2, additional tests have been carried out with penalty coefficients varying from 0.5 to 2.0. In most of the cases, the number of iterations needed to achieve unity stepsizes was identical to those given in Tables 1 and 2.

6. Conclusion

The purpose of this work was to examine the use of augmented Lagrangian merit functions with the quadratic term replaced by an essentially quadratic penalty function. The use of essentially quadratic penalty terms emphasized the effect of the existing infeasibility

in the line search routine and increased the reliability of the method. The analysis of Section 3 established convergence of the stepsize to unity and gave a rational way of selecting the exponents of the essentially quadratic functions. As the solution of the problem is approached, the essentially quadratic terms converge to the absolute value penalty function, providing a strong differentiable representation of the remaining infeasibility. The use of this augmented Lagrangian merit function, enhancing the role of the constraint linearization, arrests the growth of the penalty parameter, restricts the effect of the Hessian approximation, and limits the effect of the constraint curvature on the line search routine. In Section 4 we described a procedure that ensures that the sequence {x_k, k = 0, 1, ...}, generated by decreasing the merit function along the direction of search, is globally convergent. This procedure is based on a parallelism between the values of the augmented Lagrangian function and the absolute value penalty merit function, and it requires only function evaluations. Numerical results on the two classical test problems were very encouraging. Further testing on larger problems is necessary to confirm the general utility of this merit function.

Acknowledgement

This research has been partially supported by the National Science Foundation, Grant No. DMC85-14721, at The University of Michigan. This support is gratefully acknowledged.

TABLE 1. Performance of the algorithm on Maratos' problem.

x₀:   (0.985, 0.2)   (1.002, 0.1)   (0.99999, 0.2)   (0, √3)
B₀:   I     2I       I     2I       I     2I         I     2I
MI    4     5        3     4        4     5          8     7
I*    1     1        1     1        1     1          4     3*
FE    5     6        4     5        5     6          12    9
GE    4     5        3     4        4     5          8     7

* For iterations 1, and 3 to 7, α_k = 1, while α₂ < 1.

TABLE 2. Performance of the algorithm on Powell's problem.

x₀:   (0.8, 0.6)   (0.1, 0)   (50, 50)
B₀:   I     20I    I    20I   I     20I
MI    6     8      7    7     13    14
I*    1     3*     1    1     1     11**
FE    7     10     8    8     14    15
GE    5     7      6    6     12    13

* For iterations 1, and 3 to 8, α_k = 1, while α₂ < 1.
** For iterations 1 to 9, and 11 to 14, α_k = 1, while α₁₀ < 1.

Abbreviations:
MI: Major Iterations to Achieve Convergence
I*: Iteration at which Stepsize Converged to Unity
FE: Objective and Constraint Function Evaluations
GE: Objective and Constraint Gradient Evaluations
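The qualitative behavior reported in Table 1 can be reproduced in outline with a bare-bones SQP iteration on Maratos' problem. The sketch below is not the authors' code: it fixes B_k = I (no BFGS update) and takes unit steps throughout, which the analysis of Section 3 predicts to be acceptable near the solution:

```python
import numpy as np

def sqp_maratos(x0, iters=10, tol=1e-10):
    """Unit-step SQP on:  min x1^2 + x2^2  s.t.  (x1+1)^2 + x2^2 - 4 = 0.
    B_k = I throughout; for this problem I happens to equal the exact
    Hessian of the Lagrangian at the solution, so unit steps converge fast."""
    x = np.asarray(x0, dtype=float)
    for k in range(iters):
        grad_f = 2 * x
        g = (x[0] + 1) ** 2 + x[1] ** 2 - 4
        grad_g = np.array([2 * (x[0] + 1), 2 * x[1]])
        # KKT system of the QP subproblem (2.2) with B = I
        K = np.block([[np.eye(2), grad_g[:, None]],
                      [grad_g[None, :], np.zeros((1, 1))]])
        sol = np.linalg.solve(K, np.concatenate([-grad_f, [-g]]))
        d = sol[:2]
        x = x + d                      # unit stepsize, alpha_k = 1
        if np.linalg.norm(d) < tol:
            break
    return x, k + 1
```

Starting from (0.985, 0.2), the first column of Table 1, this sketch reaches the solution (1, 0) in a handful of iterations with every step equal to one.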

References

[1] Bertsekas, D. P., Constrained Optimization and Lagrange Multiplier Methods, Academic Press, New York, 1982.
[2] Chamberlain, R. M., "Some Examples of Cycling in Variable Metric Methods for Constrained Optimization," Mathematical Programming Study, Vol. 16, pp. 18-44, 1982.
[3] Chamberlain, R. M., Powell, M. J. D., Lemarechal, C., and Pedersen, H. C., "The Watchdog Technique for Forcing Convergence in Algorithms for Constrained Optimization," Mathematical Programming Study, Vol. 16, pp. 1-17, 1982.
[4] Di Pillo, G., and Grippo, L., "A New Class of Augmented Lagrangians in Nonlinear Programming," SIAM Journal on Control and Optimization, Vol. 17, pp. 618-628, 1979.
[5] Fukushima, M., "A Successive Quadratic Programming Algorithm with Global and Superlinear Convergence Properties," Mathematical Programming, Vol. 35, pp. 253-264, 1986.
[6] Gabay, D., "Reduced Quasi-Newton Methods with Feasibility Improvement for Nonlinear Constrained Optimization," Mathematical Programming Study, Vol. 16, pp. 18-44, 1982.
[7] Han, S. P., "A Globally Convergent Method for Nonlinear Programming," Journal of Optimization Theory and Applications, Vol. 22, pp. 297-309, 1977.
[8] Han, S. P., "Superlinearly Convergent Variable-Metric Algorithms for General Nonlinear Programming Problems," Mathematical Programming, Vol. 11, pp. 243-283, 1976.
[9] Kort, B. W., and Bertsekas, D. P., "A New Penalty Function Method for Constrained Minimization," Proceedings of the IEEE Conference on Decision and Control, New Orleans, 1972.
[10] Maratos, N., "Exact Penalty Function Algorithms for Finite Dimensional and Control Optimization Problems," Ph.D. Thesis, University of London, 1978.
[11] Mayne, D. Q., and Polak, E., "A Superlinearly Convergent Algorithm for Constrained Optimization Problems," Mathematical Programming Study, Vol. 16, pp. 45-61, 1982.

[12] Papadakis, N. A., and Papalambros, P. Y., "A New Merit Function for Variable Metric Constrained Optimization," Technical Report UM-MEAM-89-02, Department of Mechanical Engineering and Applied Mechanics, The University of Michigan, Ann Arbor, 1989.
[13] Powell, M. J. D., "The Convergence of Variable Metric Methods for Nonlinearly Constrained Optimization Calculations," in Nonlinear Programming 3 (O. L. Mangasarian, R. R. Meyer, and S. M. Robinson, eds.), Academic Press, 1976, pp. 27-65.
[14] Powell, M. J. D., "Variable Metric Methods for Constrained Optimization," in Mathematical Programming: The State of the Art (A. Bachem, M. Grötschel, and B. Korte, eds.), Springer-Verlag, 1982, pp. 288-311.
[15] Powell, M. J. D., "Variable Metric Methods for Constrained Optimization," in Nonlinear Optimization: Theory and Algorithms (L. C. W. Dixon, E. Spedicato, and G. P. Szegö, eds.), Birkhäuser, Boston, 1980, pp. 279-294.
[16] Powell, M. J. D., and Yuan, Y., "A Recursive Quadratic Programming Algorithm that Uses Differentiable Exact Penalty Functions," Mathematical Programming, Vol. 35, pp. 265-278, 1986.
[17] Rustem, B., "Convergent Stepsizes for Constrained Optimization Algorithms," Journal of Optimization Theory and Applications, Vol. 49, No. 1, pp. 135-160, 1986.
[18] Schittkowski, K., "A Numerical Comparison of Optimization Software Using Randomly Generated Test Problems," in Numerical Optimization of Dynamic Systems (L. C. W. Dixon and G. P. Szegö, eds.), North-Holland Publishing Co., Amsterdam, 1980.
[19] Schittkowski, K., "On the Convergence of a Sequential Quadratic Programming Method with an Augmented Lagrangian Line Search Function," Technical Report 82-4, Systems Optimization Laboratory, Department of Operations Research, Stanford University, 1982.