DEFINING PROPERTIES FOR DECOMPOSITION IN NONLINEAR PROGRAMMING

Technical Report No. UM-MEAM 97-04

Terrance C. Wagner
Panos Y. Papalambros

1 INTRODUCTION

Five defining properties of an NLP problem determine which decomposition method is appropriate. The report reviews and compares prevalent methods in the context of these properties to develop a set of relationships between a given problem structure and a coordination strategy. Two of the most important properties are linking variables and linking functions, which are most easily defined by the functional dependence table (FDT) discussed in Section 2. The other three determine convergence properties of the methods. The defining properties, the coordination strategy, and the merits and demerits of each method are itemized: prevalent hierarchical methods are presented in Section 3; prevalent non-hierarchical methods are presented in Section 4. Features are compared and summarized in Section 5. Inherent in each of the methods is an authority model, which is also discussed. The report emphasizes defining properties of the NLP and features of the coordination strategy; hence, the reader is referred to the original sources for more explicit details of the algorithms in each coordination strategy.

2 DEFINING PROPERTIES

The most important defining properties are linking variables and linking functions. Linking variables are loosely defined as variables that, when held fixed, effect independent optimization problems; similarly, linking functions (usually constraints) are loosely defined as functions that, when deleted (relaxed), effect independent optimization problems. Rigorous mathematical definition of these properties is given in Chapter 3 of Wagner (1993). A useful artifice for characterizing defining properties is a table of Booleans called a functional dependence table (FDT). Using the integers '1' and '0' for true and false, respectively, rows are labeled with function names, columns are labeled with variable names, and the element in the ith row and jth column is non-zero if the ith function depends on the jth variable. The functional dependence table for the example NLP given as Equation (1) (from Kirsch [1981]) is shown in Table 1. Each term in the objective is represented separately in the FDT.

   min  f(x) = 400 x1 + 20 x2 + 130 x3

   subject to:
      g1(x) = 190 x1 - 43.6 + 14.9 x4 - 1.44 x4^2 ≤ 0
      g2(x) =  38 x2 - 183.3 + 36 x4 - 2.67 x4^2 ≤ 0
      g3(x) = 650 x3 - 244 + 45.9 x4 - 3.29 x4^2 ≤ 0        (1)
      g4(x) = 3.5 - x4 ≤ 0
      g5(x) = x4 - 6.5 ≤ 0

The FDTs of many NLP problems are presented in this report, and shading facilitates an easy visual comparison of the structure. Figure 1(a) shows Table 1 using shading for the Boolean value 'true'.

If variables and functions are partitioned into vectors with disjoint index sets, compact forms like Figure 1(b) can be constructed. In the compact form, a shaded block implies that a k-vector of scalar-valued functions depends on a j-vector of variables. Every function in the k-vector need not depend on every variable in the j-vector, but every function in the k-vector depends on at least one variable in the j-vector, and every variable in the j-vector appears in at least one function in the k-vector. For example, in Figure 1(b), the functions f1 and f2 do not depend on x3, but the function partition, f1 = (f1, f2, f3), depends on the variable partition, (x1) = (x1, x2, x3). Formally, the compact form can be constructed if and only if the columns in the k x j sub-table of the original dependence table can be arranged such that at least a diagonal of non-zero entries exists.

Table 1. Functional Dependence Table for (1)

          x1   x2   x3   x4
    f1     1    0    0    0
    f2     0    1    0    0
    f3     0    0    1    0
    g1     1    0    0    1
    g2     0    1    0    1
    g3     0    0    1    1
    g4     0    0    0    1
    g5     0    0    0    1

[Figure 1. (a) FDT using shading for Booleans. (b) FDT by vector partitions.]
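As a concrete illustration of how an FDT can be manipulated (a sketch of ours, not part of the original report), the Boolean table of Table 1 can be stored as a matrix and probed for linking variables:

```python
import numpy as np

# The FDT of Table 1 as a Boolean matrix: rows are functions, columns are
# variables, and entry [i, j] is True if function i depends on variable j.
functions = ["f1", "f2", "f3", "g1", "g2", "g3", "g4", "g5"]
variables = ["x1", "x2", "x3", "x4"]
fdt = np.array([
    [1, 0, 0, 0],   # f1
    [0, 1, 0, 0],   # f2
    [0, 0, 1, 0],   # f3
    [1, 0, 0, 1],   # g1
    [0, 1, 0, 1],   # g2
    [0, 0, 1, 1],   # g3
    [0, 0, 0, 1],   # g4
    [0, 0, 0, 1],   # g5
], dtype=bool)

# Treat x4 as the candidate linking variable: delete its column and check
# whether the remaining functions fall into groups that share no variables.
reduced = np.delete(fdt, variables.index("x4"), axis=1)
active = reduced[reduced.any(axis=1)]          # drop functions of x4 alone
shared = active.astype(int) @ active.astype(int).T
print(shared)
# Off-diagonal entries are nonzero only for pairs within the same block
# ({f1, g1}, {f2, g2}, {f3, g3}): fixing x4 yields three independent problems.
```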

Additional defining properties are additive separability, linearity, and convexity. For example, the objective in Equation (1) is additively separable. Linear properties allow use of more classical techniques, as in Lasdon [1971]; convex properties often guarantee convergence; monotonic properties facilitate analytic solution in the subproblems. Linearity is easily identified; techniques for identifying and exploiting monotonicity are well documented (Papalambros and Wilde [1988]). The defining properties are summarized in Table 2.

Table 2. Defining Properties of NLP Problems Utilized in Decomposition
1. Linking variables
2. Linking constraints
3. Additive separability
4. Linearity/Convexity
5. Monotonicity

Most decomposition methods exploit at least one of the defining properties to construct a coordination strategy. Certain variables are identified as coordinating variables, the remainder as local variables. A master problem is formulated in terms of coordinating variables; subproblems solve for optimal local variables treating the coordinating variables as parameters. This generic strategy applies to hierarchic and non-hierarchic methods and is summarized in Algorithm 0.

Algorithm 0: Generic Coordination Strategy for NLP Decomposition Methods.
1. Initialization.
2. Solve master problem to obtain optimal coordinating variables.
3. Solve subproblem(s) to obtain optimal local variables.
4. Optimality or Convergence Test.
5. If Step 4 fails, make modifications based on Step 4 information and return to Step 2; otherwise stop. Decomposed solution solves original problem.

The selection of coordinating variables depends on the properties of the optimization problem. Generally, feasible decomposition methods choose linking variables as coordinating variables; dual decomposition methods choose dual variables associated with the linking constraints as coordinating variables.
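As a scaffold for the methods that follow, here is a minimal Python rendering of Algorithm 0 (ours, not from the report); solve_master, solve_subproblems, and converged are hypothetical placeholders that each specific method instantiates differently.

```python
def coordinate(solve_master, solve_subproblems, converged, max_iter=100):
    """Generic coordination loop (Algorithm 0), method-agnostic sketch.

    solve_master(state)      -> coordinating variables (e.g. y or multipliers)
    solve_subproblems(coord) -> optimal local variables for fixed coordination
    converged(state)         -> True when the optimality/convergence test passes
    """
    state = {"coord": None, "local": None}           # Step 1: initialization
    for k in range(max_iter):
        state["coord"] = solve_master(state)         # Step 2: master problem
        state["local"] = solve_subproblems(state["coord"])  # Step 3: subproblems
        if converged(state):                         # Step 4: test
            return state                             # decomposed solution found
        # Step 5: modifications occur inside solve_master via the updated state
    raise RuntimeError("coordination did not converge")
```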

Separability in the objective function or constraints often effects a separable Lagrangian which can be decomposed. Specific instances of linearity, convexity, and monotonicity enhance convergence properties of the coordination strategies.

3 HIERARCHICAL DECOMPOSITION METHODS

Dual Decomposition Methods

The Lagrange formulation transforms a constrained problem into an unconstrained minimization problem. The Karush-Kuhn-Tucker conditions cast the NLP problem as a set of nonlinear zero-valued equalities. The regularity assumption guarantees existence of the Lagrange multipliers, λ and μ, for equalities and inequalities respectively. For differentiable functions, the stationarity conditions imply that an interior optimum, (x*, λ*, μ*), is a saddle point of the Lagrange function, L. Formulation of the dual problem leads to the min-max iterative strategy that serves as the generic algorithm for dual decomposition methods, given as Algorithm 1. Figure 2 illustrates the strategy. Necessary conditions can be exploited to derive update formulas for the dual variables.

[Figure 2. Two-level structure of dual methods: a master problem in the dual space, max over (λ, μ) of L(λ, μ, x*), passes (λ*, μ*) down to a subproblem in the primal space, min over x of L(λ*, μ*, x), which returns x*.]

Algorithm 1: Coordination Strategy for Dual Decomposition Methods.
1. Initialize k = 0, (x^k, λ^k, μ^k).
2. Holding (λ^k, μ^k) constant, obtain x^(k+1) by solving min over x of L(λ^k, μ^k, x).
3. Holding x^(k+1) constant, obtain (λ^(k+1), μ^(k+1)) by solving max over (λ, μ) of L(λ, μ, x^(k+1)).
4. If converged, stop; otherwise increment k, set (x^k, λ^k, μ^k) = (x^(k+1), λ^(k+1), μ^(k+1)), and go to 2.

By formulating the dual, the original NLP problem is partitioned into a two-level problem where the optimal dual variables are sought in the master problem and the optimal primal variables are sought in the subproblem. If the objective function and the constraints are sums, the vector x can often be partitioned into p vectors, xi, i = 1,..., p, where the indices of xi and xj form disjoint sets for i ≠ j. The Lagrangian can then be partitioned into p functions,

   L(x, λ, μ) = L1(x1, λ, μ) + L2(x2, λ, μ) + ... + Lp(xp, λ, μ),   (2)

resulting in p independent subproblems.
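To make the alternating min-max strategy concrete, the sketch below (our illustration, with a made-up problem) applies Algorithm 1 to min x1^2 + x2^2 subject to x1 + x2 ≥ 1: for fixed μ the Lagrangian splits into two independent one-dimensional minimizations with closed-form solutions.

```python
# Dual decomposition (Algorithm 1) on a tiny separable example:
#   min x1^2 + x2^2   s.t.   x1 + x2 >= 1.
# L(x, mu) = x1^2 + x2^2 + mu * (1 - x1 - x2) separates over x1 and x2.

alpha, mu = 0.5, 0.0                 # step size and initial multiplier
for k in range(50):
    # Subproblems (primal space): minimize each Lagrangian term independently;
    # d/dxi (xi^2 - mu*xi) = 0  =>  xi = mu / 2.
    x1 = x2 = mu / 2.0
    # Master problem (dual space): gradient ascent on the dual function;
    # the dual gradient is the constraint value g(x) = 1 - x1 - x2.
    g = 1.0 - x1 - x2
    mu = max(0.0, mu + alpha * g)    # projection keeps mu >= 0
print(x1, x2, mu)                    # approaches (0.5, 0.5) with mu = 1
```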

Three prevalent dual methods, by Dantzig and Wolfe [1960], Takahashi [1964], and Lasdon [1968], have the common defining property of linking functions. To facilitate a concise description of each method, an FDT schematic of the original problem is presented, followed by a table with the format of Table 3. The original problem is stated in the upper left entry; properties and reformulations are listed in the upper right entry; the master problem is shown in the lower left; the subproblem(s) in the lower right. Variable value passing is shown in the middle column.

The classical dual decomposition method proposed by Dantzig and Wolfe [1960] is applicable to any linear program, but is very effective in solving LP problems with constraint sets having the FDT shown in Figure 3. The primal form of such problems, given in Table 3, is a set of p linear programs linked by the constraints A0 x = b0. The master problem is formulated in terms of the extreme points of the feasible domain. Assuming the p polytopes, Si = {xi : Ai xi = bi; xi ≥ 0}, are bounded, any point xi can be represented as a weighted sum of its ti extreme points, (x_ei1,..., x_eiti). The original problem can be rewritten as the master and subproblems in Table 3.

Recall(1) that for linear programs the basic feasible solution that minimizes the objective is the optimal solution and that a basic feasible solution is an extreme point. Starting with an initial basic feasible solution, the simplex method iteratively solves the linear program by updating the basic feasible solution based on the non-basic variable that most reduces the objective function. The measure of that reduction is called the relative cost coefficient. The Dantzig-Wolfe method implements the simplex method in a two-step fashion. Given a basic feasible solution, xb, the master problem determines the optimal weighting coefficients, α, which in turn allow computation of the simplex multipliers, λ, associated with the basic feasible solution. The multipliers associated with the linking constraints, λ0, are a sub-vector of λ, and their values are passed to the subproblems as parameters. The minimum cost coefficient associated with non-basic variables in each of the subproblems is an explicit function of the ith subproblem solution, x*i, and the multipliers, λ. The subproblem with the minimum cost coefficient that most reduces the objective contains the non-basic variable to update the basic feasible solution. The size of the subproblems effects reduced storage and computation in determining the relative cost coefficients. It constitutes the numerical advantage of the method. The algorithm is summarized in Algorithm 2.

(1) See Bazaraa et al. [1990] or Luenberger [1984] for a thorough review of linear programming solution methods.

Table 3. Dantzig-Wolfe Decomposition

Original Problem:
   min over x:  c^T x
   s. to:  A0 x = b0
           Ai xi = bi,  i = 1,..., p
           x ≥ 0

Properties/Transformations:
   Linear in x. Linking functions: A0 x = b0.
   Extreme points transform:  xi = Σ_{j=1..ti} αij x_eij,  i = 1,..., p
   Cost of the jth extreme point (scalar):  pij = c^T x_eij
   Activity vector:  qij = A0 x_eij
   Define:  α = (α11,..., α1t1,..., αp1,..., αptp)^T
            p = (p11,..., p1t1,..., pp1,..., pptp)^T
            q = (q11,..., q1t1,..., qp1,..., qptp)
   Basic solution: xb;  basis matrix: B

Master Problem (λ0 passed down; x*i passed up):
   min over α:  p^T α
   s. to:  q α = b0
           Σ_{j=1..ti} αij = 1,  i = 1,..., p
           α ≥ 0
   Multipliers for the new basis:  λ^T = cb^T B^(-1);  λ0 = first m rows of λ

Subproblem:
   min over xi:  (ci^T - λ0^T A0i) xi
   s. to:  Ai xi = bi
           xi ≥ 0

[Figure 3. FDT schematic of LP constraints in Dantzig-Wolfe decomposition: linking rows A0 across x1,..., xp above diagonal blocks A1,..., Ap.]

Algorithm 2: Coordination Strategy for Dantzig-Wolfe Decomposition.
1. (Initialization) Select an initial basic feasible solution for the master problem.
2. (Master Problem) Solve the master problem to obtain the multipliers, λ.
3. (Subproblem) With λ fixed, obtain the optimal solution x*i for each subproblem. Compute the minimum cost coefficient associated with each subproblem.
4. (Optimality Test) If all cost coefficients are non-negative, stop. The current basic feasible solution is optimal. Otherwise, use the solution of the subproblem with the minimum cost coefficient to update the basic feasible solution.
5. Return to 2 with the new basic feasible solution.

The economic interpretation of the decomposition method considers the LP problem as a model of a multidivisional firm minimizing cost with constraints on shared resources. If management regulates prices that the divisions must pay for common resources, the division that can best reduce cost at those prices is incorporated into the master plan. Initialization of the basic feasible solution is interpreted as management formulating a master plan; computation of the simplex multipliers is interpreted as management setting prices. In Step 3, each division reports its potential cost improvement based on the prices, the activity with the greatest improvement is determined in Step 4, and the master plan is updated in Step 5.
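The pricing step (Step 3) is easy to make concrete. The sketch below, with hypothetical data and names of our own choosing, solves one block subproblem for fixed linking-constraint multipliers λ0 and reports the block's reduced cost; sigma_i denotes the dual of the master's convexity row (our notation), and scipy's linprog stands in for the simplex solver.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical block data: min ci^T xi  s.t.  Ai xi = bi, xi >= 0,
# with A0i holding this block's columns of the linking rows A0.
ci  = np.array([1.0, 2.0, 3.0])
Ai  = np.array([[1.0, 1.0, 1.0]]);  bi = np.array([4.0])
A0i = np.array([[2.0, 0.0, 1.0]])          # one linking constraint
lam0    = np.array([0.5])                  # multipliers from the master
sigma_i = 1.0                              # master's convexity-row multiplier

# Pricing subproblem of Table 3: minimize (ci - A0i^T lam0)^T xi over the block.
res = linprog(ci - A0i.T @ lam0, A_eq=Ai, b_eq=bi, bounds=[(0, None)] * 3)
reduced_cost = res.fun - sigma_i           # compare against the convexity dual
print(res.x, reduced_cost)
# reduced_cost < 0: this extreme point enters the master as a new column with
# cost pij = ci @ res.x and activity qij = A0i @ res.x; otherwise it prices out.
```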

Takahashi [1964] proposed a dual method for convex programs which have the FDT schematic of Figure 4 and linear constraints of the form in Table 4. The method makes no assumptions about separability of the objective function. The constraints, hc, are coupling constraints in the sense that the problem would be easier to solve without them. For example, the remaining constraints, hs, may possess the structure shown in Figure 4. Formulating the dual of the coupling constraints, hc, yields the master problem and subproblem given in Table 4. Since λc is unconstrained, any direction for λc in the master problem is feasible. If the objective function is convex, it can be shown that the dual function, h(λc), is differentiable (Geoffrion [1971]). Because the dual is linear in λc, the gradient of the dual,

   ∇λc h(λc) = hc(x),   (3)

is a feasible direction for improving the objective of the master problem. Takahashi suggests Algorithm 3, which uses a short step, α, in the direction of the gradient. A line search could also be used in Step 4.

[Figure 4. FDT schematic for Takahashi's dual decomposition method.]

Table 4. Takahashi's Decomposition

Original Problem:
   min over x:  f(x)
   s. to:  hc(x) = 0
           hs(x) = 0
           x ∈ Sb (simple bounds)

Properties/Transformations:
   Convex f(x). Linear constraints.
   Linking functions: hc(x).

Master Problem (λc passed down; x passed up):
   max over λc:  h(λc) = f(x) + λc^T hc(x)

Subproblem:
   min over x:  f(x) + λc^T hc(x)
   s. to:  hs(x) = 0
           x ∈ Sb (simple bounds)

Algorithm 3: Coordination Strategy for Takahashi's Dual Decomposition.
1. Initialize k = 0 and the multipliers, λc = λ^k.
2. (Subproblem) Solve the subproblem to obtain x^k.
3. (Optimality Test) If hc(x^k) = 0 (or ||hc(x^k)|| < ε), then x^k is optimal. Stop. Otherwise go to 4.
4. (Master Problem) Let λ^(k+1) = λ^k + α hc(x^k), where α > 0. Return to 2.

The NLP problem with the FDT in Figure 5 has the separability properties summarized in Table 5. The objective function and the constraints are additively separable; Lasdon proposed Algorithm 4 below for such problems.

The formulation given in Table 5 shows only inequalities, but the method can be applied to problems with equalities. The Lagrange function is additively separable with respect to the variables xi and linked by the multipliers, μ. The coordination strategy in Algorithm 4 solves for μ in the master problem and for xi in each of the subproblems.

[Figure 5. FDT schematic for Lasdon's dual decomposition method.]

Algorithm 4: Coordination Strategy for Lasdon's Dual Decomposition.
1. Initialize k = 0 and μ^k ≥ 0.
2. (Subproblem) Solve the subproblems with μ = μ^k to obtain a solution x^k.
3. (Master Problem) Solve the master problem using the gradient of the dual as a search direction and a line search in α.
4. (Convergence Test) If ||μ^k - μ^(k-1)|| < ε, stop; otherwise μ^(k+1) = μ^k + α s^k, increment k, and return to 2.

Table 5. Lasdon's Decomposition

Original Problem:
   min over x1,..., xp:  Σ_{i=1..p} fi(xi)
   s. to:  gj(x) = Σ_{i=1..p} gji(xi) ≤ 0,   j = 1,..., ng

Properties/Transformations:
   f(x) separable with respect to xi.
   gj(x) separable with respect to xi, j = 1,..., ng.

Master Problem (μ passed down; xi passed up):
   max over μ, α:  h(μ) = Σ_{i=1..p} fi(xi) + (μ + α s)^T g(x)
   s. to:  α ≥ 0
   Search direction, s:
      sj = gj(x)           if μj > 0
      sj = max(0, gj(x))   if μj = 0,   j = 1,..., ng

Subproblem:
   min over xi:  fi(xi) + Σ_{j=1..ng} μj gji(xi),   i = 1,..., p

Lasdon suggests a gradient-based algorithm with line search for solving the master problem in the dual space. In the case of equality constraints, variable metric or conjugate gradient methods are suggested. Since the Lagrangian is nonlinear in x, the subproblem solutions implicitly depend on μ. Note that x*(μ) may be non-smooth with respect to μ. In such regions, the derivative of the dual,

   ∇μ h(μ) = g(x(μ)),   (4)

will not be continuous; only one-sided derivatives will exist. Lasdon suggests a piece-wise linear approximation to the dual to surmount this difficulty. Convergence is guaranteed as long as h(μ) remains differentiable.
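A small sketch (ours, with made-up values) of the projected dual search direction defined in Table 5: components whose multipliers sit on the non-negativity bound (μj = 0) only move if the corresponding constraint is violated.

```python
import numpy as np

def lasdon_direction(mu, g_of_x):
    """Dual search direction for Lasdon's master problem.

    sj = gj(x)          if mu_j > 0   (ordinary dual gradient component)
    sj = max(0, gj(x))  if mu_j = 0   (keeps mu_j from going negative)
    """
    return np.where(mu > 0, g_of_x, np.maximum(0.0, g_of_x))

mu = np.array([0.8, 0.0, 0.0])
g  = np.array([-0.2, 0.5, -0.3])      # constraint values at the subproblem solution
s  = lasdon_direction(mu, g)          # -> [-0.2, 0.5, 0.0]
alpha = 0.1                           # in practice chosen by a line search
mu_next = mu + alpha * s
print(s, mu_next)
```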

Dual decomposition methods exploit separability in the Lagrangian effected when dual variables are held fixed. The design interpretation of the master problem is that of a coordinator setting priorities (the multipliers) to which the subproblems respond. Structure can be imposed at the expense of dimensionality to allow use of a dual method, as discussed next.

Goal Coordination

Constraint sets linked as shown in Figure 6 can be decoupled by introducing additional primal variables and equalities. Decomposition is obtained at the expense of added dimensionality. For example, if the constraint vectors g1 and g2 are coupled through a variable x, they can be decoupled by adding a variable, y, an equality h(x, y) = x - y = 0, and a respective multiplier λ. A dual formulation controls λ in the master problem. For ny variables, separability is imposed by introducing an ny-vector of variables, y, and ny equality constraints. Under such circumstances the multipliers associated with the equalities appear in the master problem. Dual methods of this type are called goal coordination methods because the master problem is coordinating the goal of meeting the equality constraints.

[Figure 6. FDT schematic of the effect of introducing y on the constraint set for goal coordination methods.]

Wismer and Chattergy [1978] proposed this approach even when the coupling is high and use the KKT conditions to derive a gradient algorithm for the master problem. They note, however, that the variable bookkeeping is arduous. For a completely coupled constraint set, the approach implies the introduction of (p-1)n new variables and equalities, which for p ≥ 2 effects a master problem larger than the original. The merit of imposed structure at the expense of added dimensionality determines the cost-to-benefit ratio of the method. Diaz and Belding [1991] formulate a master problem using goal programming instead of the dual to accommodate problems with such structure.

Feasible Decomposition Methods

The fundamental characteristic of a feasible method is that the original design variables are partitioned into two sets: global variables y and local variables x. The values of the vector y vary in the master problem, and the optimal values y* are treated as parameters in the subproblem; the values of the vector x vary in the subproblem, and the optimal values x* are returned to the master problem. The simplest two-level structure is illustrated in Figure 7. The methods are sometimes called model coordination methods because model variables are coordinated in both the master problem and the subproblems. The methods are attractive in design problems because even if convergence of the coordination is not obtained, the intermediate solutions are feasible and usually represent an improvement in the objective function. The design interpretation of the master problem is that of a coordinator who has authority over variables common to design groups which act independently.

The subproblem solution, x*, typically depends on y, and this must be accounted for in the master problem for the method to have convergence properties. Rarely is an explicit solution obtained, so an approximation of the dependence x*(y) is usually required. Numerical continuation methods can trace out x*(y) when the dimension of y is small (three or less), and a linear approximation can be constructed using sensitivity derivatives when the dimension of y is larger. Allgower and Georg [1990] present a comprehensive review of continuation methods, and Beltracchi [1988] gives a comprehensive review of the computational methods for sensitivity derivatives. These coordination strategies are sometimes called projection methods because the subproblem solution is 'projected' onto the y space (Geoffrion [1971]).

[Figure 7. Two-level structure of feasible methods: the master problem, min over y of f(y, x*) subject to h(y) = 0 and g(y) ≤ 0, passes y* to the subproblem, min over x of f(y*, x) subject to h(y*, x) = 0 and g(y*, x) ≤ 0, which returns x*.]

[Figure 8. FDT schematic for Johnson's two-stage decomposition.]

Johnson [1984a] proposed a straightforward decomposition method, given as Algorithm 5, for problems with the structure shown in Figure 8.

Algorithm 5: Coordination Strategy for Johnson's Two-Stage Decomposition.
1. (Subproblem) Solve the subproblem of Figure 7 to obtain the single-valued vector, x* = x*(y).
2. (Master Problem) Solve the master problem of Figure 7, either analytically or using a conventional numerical scheme, using x* = x*(y) to obtain y*.

Assuming a relationship x* = x*(y) can be determined in the subproblem, the master problem seeks y* to minimize the objective function, f(y, x*(y)). Johnson's exposition assumes the partition is chosen such that optimality conditions in the subproblem can yield an explicit single-valued relationship, x* = x*(y). If an analytical solution is determined also in the master problem, a global optimum is obtained and the method should converge in one overall iteration. Finding a partition that guarantees an explicit single-valued function a priori is not trivial. In some cases, even if x* = x*(y) exists, nonlinearities may make the master problem of lower dimension but more difficult to solve than the original problem of higher dimension. While not addressed by Johnson, the requirement for an explicit relationship may be circumvented if the optimality conditions of the subproblem are used to derive an approximation of x*(y). For example, sensitivity derivatives in the subproblem could provide a linear approximation that could be used in the master problem.
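The two-stage idea, including the numerical circumvention just described, can be prototyped by nesting solvers. A minimal sketch of ours (a made-up objective and scipy's general-purpose minimizers, not Johnson's own procedure, which assumes an analytical subproblem solution):

```python
import numpy as np
from scipy.optimize import minimize, minimize_scalar

# Hypothetical objective with one global variable y and one local variable x.
def f(y, x):
    return (y - 1.0) ** 2 + (x + y) ** 2 + 2.0 * x ** 2

# Stage 1 (subproblem): for fixed y, find x*(y) numerically.
def x_star(y):
    return minimize_scalar(lambda x: f(y, x)).x

# Stage 2 (master): minimize the projected objective f(y, x*(y)) over y.
res = minimize(lambda y: f(y[0], x_star(y[0])), x0=np.array([0.0]))
y_opt = res.x[0]
print(y_opt, x_star(y_opt))   # each master evaluation re-solves the subproblem
```

Here the inner problem has the closed form x*(y) = -y/3, so the master reduces to minimizing (y - 1)^2 + (2/3) y^2, with y* = 0.6; the nested numerical version recovers the same point without needing that formula.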

Separable Methods

NLP problems with the primal form given in Equation (5) can utilize several feasible decomposition methods.

   min over y, x1,..., xp:  f0(y) + Σ_{i=1..p} fi(y, xi)
   subject to:  h(y) = 0
                g(y) ≤ 0
                hi(y, xi) = 0,   i = 1,..., p
                gi(y, xi) ≤ 0,   i = 1,..., p        (5)
                y ∈ Sb^ny
                xi ∈ Sb^ni

Sb^n represents simple bound or side constraints in R^n. Figure 9 illustrates the FDT of Equation (5). The objective function is a sum, and the vectors gi and hi are independent of xj for i, j = 1,..., p and i ≠ j. The Lagrangian is then additively separable with respect to (xi, λi, μi) for i = 1,..., p:

   L(y, x, λ, μ) = Ly(y, λy, μy) + L1(y, x1, λ1, μ1) + ... + Lp(y, xp, λp, μp).   (6)

If y is temporarily treated as a parameter, the p terms of the Lagrangian can be minimized independently.

[Figure 9. FDT schematic for the separable NLP of Equation (5).]

Kirsch [1981] proposed a feasible decomposition method for such problems; the master problem and subproblems are given in Table 6, and the coordination strategy is given as Algorithm 6.

Algorithm 6: Coordination Strategy for Kirsch's Decomposition.
1. Initialize k = 0, y = y^k.
2. Holding y^k constant, solve the independent subproblems of Table 6 with respect to xi to obtain x*i(y^k).
3. Modify the value of y^k to y^(k+1) to reduce the objective in the feasible domain by a sufficient amount.
4. If convergence criteria are met, for example ||y^(k+1) - y^k|| < ε, stop; otherwise increment k and return to 2.

Table 6. Kirsch's Decomposition

Original Problem:
   min over y, x:  f0(y) + Σ_{i=1..p} fi(y, xi)
   s. to:  h(y) = 0;  hi(y, xi) = 0,  i = 1,..., p
           g(y) ≤ 0;  gi(y, xi) ≤ 0,  i = 1,..., p

Properties/Transformations:
   Lagrangian separable when y is fixed.

Master Problem (y passed down; x*i(y) passed up):
   min over y:  f0(y) + Σ_{i=1..p} fi(y, x*i(y))
   s. to:  h(y) = 0
           g(y) ≤ 0

Subproblem:
   min over xi:  fi(y, xi)
   s. to:  hi(y, xi) = 0
           gi(y, xi) ≤ 0,   i = 1,..., p

Step 3 is the essence of a robust algorithm. Kirsch offers no formal algorithm for Step 3 but presents several examples which use analytical solutions x*(y) to determine directions in y. Algorithm 6 serves as a generic algorithm to describe feasible decomposition methods which utilize separability in the objective and constraints. If explicit subproblem solutions, x*i(y), exist and are used in Step 3, Johnson's multistage decomposition method [1984b] can be interpreted as an implementation of Kirsch's method. Also, the algorithms of Rosen [1963], Benders [1962], and Azarm and Li [1988] can be viewed as special forms of this generic algorithm. These are described next.

The specific form of the original problem for Rosen's method is given in Table 7. The feature of this structure is that for fixed y, the problem decomposes into p independent linear programs in xi, given as the subproblems in Table 7. By solving the linear programs independently and constructing a piece-wise linear approximation of xi(y) in the subproblem, a master problem is constructed strictly as a function of y. Solution of the linear subprograms with the simplex method yields optimal basis matrices and simplex multipliers that are used to set up a master problem in y. The algorithm converges for the case when the cost vector c and the constraint matrix A are independent of y, and the constraints b(y) are convex. Algorithm 7 summarizes Rosen's method.

Table 7. Rosen's Decomposition

Original Problem:
   min over y, x:  Σ_{i=1..p} ci(y)^T xi + c0(y)
   s. to:  bi(y) - Ai(y) xi ≤ 0,   i = 1,..., p
           y ∈ S (feasible domain)

Properties/Transformations:
   Linear in xi, i = 1,..., p.
   Define basis Bi and non-basis Di:  Ai = [Bi Di],  b = b(y),  Qi = Di Bi^(-1).
   Linear approximation of constraints in y.
   LP solution:  x*i = Bi^(-1) bBi(y);  λ*i^T = ci^T Bi^(-1)

Master Problem (y passed down; x*i, Bi, λ*i passed up):
   min over y:  φ(y) = Σ_{i=1..p} λ*i^T bBi(y) + c0(y)
   s. to:  [Qi ∇y bBi(y^k) - ∇y bDi(y^k)] (y - y^k) ≤ bDi(y^k) - Qi bBi(y^k)

Subproblem:
   min over xi:  ci^T xi
   s. to:  bi(y) - Ai(y) xi ≤ 0

Algorithm 7: Coordination Strategy for Rosen's Decomposition.
1. Initialize k = 0; choose a feasible vector y^k = y^0 ∈ S.
2. (Subproblem) Solve the linear subproblems with y = y^k, obtaining optimal solutions x*i, basis matrices Bi, and simplex multipliers λ*i.
3. (Master Problem) Solve the master problem to obtain y*.
4. (Optimality Test) If φ(y*) has converged and all simplex multipliers are non-negative, the solution is optimal. Stop. Otherwise the ith linear subproblem containing the least simplex multiplier contains the non-basic variable to be used in updating the basis matrix. Update and return to 3.
5. (Feasibility Test) If φ(y*) < φ(y^k), check feasibility.
   (a) If no constraints are violated, set k = k+1 and return to 2 with y^k = y*.
   (b) If some constraints are violated, perform a line search into the feasible region and update y* based on the line search.
   (c) If the line search is unsuccessful, the violated constraint identifies the non-basic variable for the basis update. Using this new basis, return to Step 3.

Table 8. Benders' Decomposition

Original Problem:
   min over y, x:  c^T x + f(y)
   s. to:  b - Ax - F(y) ≤ 0
           x ≥ 0
           y ∈ S

Properties/Transformations:
   Linear in x. Formulate the dual in x and solve using a relaxed constraint set.

Master Problem (y passed down; v(y), μ passed up):
   min over y, v0:  v0 + f(y)
   s. to:  v0 ≥ (b - F(y))^T μp^i,   i ∈ Ip  (for some extreme points)
           0 ≥ (b - F(y))^T μr^i,   i ∈ Ir  (for some extreme rays)

Subproblem:
   v(y) = max over μ:  (b - F(y))^T μ
   s. to:  A^T μ ≤ c
           μ ≥ 0

Benders [1962] proposed Algorithm 8 for problems of the original form in Table 8. The matrix A is m × n, x and c are n-vectors, b is an m-vector, y is an ny-vector, f is a scalar-valued function of y, F is an m-vector of scalar functions dependent only on y, and S is an arbitrary subset of R^ny. No assumptions are made about linearity with respect to y, nor continuity on S. The set S may also be discrete valued. The algorithm exploits the y-dependence of the constraint set to formulate a subproblem relating y and x. The linearity with respect to x in the subproblem is exploited to formulate a dual subproblem in terms of extreme points. A relaxation strategy is used to reduce the dimensionality of the dual formulation.

Algorithm 8: Coordination Strategy for Benders' Decomposition.
1. Initialize the index sets Ip and Ir of the extreme points and extreme rays to the empty set.
2. (Master Problem) Solve the master problem with respect to y and v0.
   (a) If it is infeasible, stop; the original problem is infeasible.
   (b) If it is finite, go to 3.
   (c) If it is unbounded, go to 3.
3. (Subproblem) Solve the dual form of the subproblem with respect to μ (y is fixed) and obtain v*(y).
   (a) If the dual is infeasible, stop; the original problem is unbounded.
   (b) If the dual is unbounded, go to 6.
   (c) Otherwise go to 4.
4. (Subproblem Optimality Test)
   (a) If v*(y) = v0, stop. The current vector (y, x) solves the original problem.
   (b) Otherwise v0 < v*(y) = (b - F(y))^T μ* for the maximizing multipliers μ*. Go to 5.
5. (Dual bounded but optimality test failed) Update the index set Ip, add the constraint v0 ≥ (b - F(y))^T μ* to the master problem, and return to 2.
6. (Dual unbounded) The simplex method will locate some extreme ray and an extreme point such that the dual objective approaches +∞ along a half-line of the two. Add the ray index to the master problem. In addition, if the constraint containing the extreme point is violated, add the point index to the master problem. Return to 2.
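One pass of Steps 3-5 is easy to prototype: for a fixed y, solve the dual subproblem as an LP and, if it beats the master's current v0, emit the new cut. The sketch below is our illustration with hypothetical data (including the assumed form of F); scipy's linprog minimizes, so the dual maximization is solved by negating its objective.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical data for min c^T x + f(y)  s.t.  b - A x - F(y) <= 0, x >= 0.
A = np.array([[1.0, 2.0],
              [3.0, 1.0]])
b = np.array([4.0, 5.0])
c = np.array([2.0, 3.0])
F = lambda y: np.array([y, 0.5 * y])        # assumed (made-up) form of F(y)

y_k, v0_k = 1.0, 0.0                        # current master iterate

# Step 3: dual subproblem  v(y) = max (b - F(y))^T mu  s.t. A^T mu <= c, mu >= 0.
rhs = b - F(y_k)
res = linprog(-rhs, A_ub=A.T, b_ub=c, bounds=[(0, None)] * len(b))
v_star, mu_star = -res.fun, res.x

# Steps 4-5: optimality test; emit a Benders cut if the test fails.
if v_star > v0_k + 1e-9:
    # New master constraint: v0 >= (b - F(y))^T mu_star, a function of y.
    print("add cut with multipliers", mu_star)
else:
    print("current (y, v0) is optimal for the relaxation")
```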

Table 9. Azarm and Li's Decomposition

Original Problem:
   min over y, x:  f0(y) + Σ_{i=1..p} fi(y, xi)
   s. to:  g(y) ≤ 0
           gi(y, xi) ≤ 0,   i = 1,..., p

Properties/Transformations:
   Monotonic in xi.

Master Problem (y passed down; x*i(y) passed up):
   min over y:  f0(y) + Σ_{i=1..p} fi(y, x*i(y))
   s. to:  g(y) ≤ 0

Subproblem:
   min over xi:  fi(y, xi)
   s. to:  gi(y, xi) ≤ 0,   i = 1,..., p

If the NLP in Equation (5) has no equality constraints and is monotonic in xi, in both the objective function and in every constraint of the ith subproblem, Azarm and Li's Monotonicity Based Decomposition Method (MBDM) can be used. Table 9 gives the master problem and subproblems. Global monotonicity properties surface in the subproblem that can be exploited when y is fixed. Problems with only linear terms in x and cross terms y x can readily exploit this method. The method also assumes the number of active constraints is exactly equal to the dimension of xi. This assumption may preclude use of the method on many problems. The partition allows elimination of known inactive constraints (in the subproblems), reducing the dimensionality of the master problem. In this context it may be viewed as a special model reduction technique. The strategy is summarized in Algorithm 9.

Algorithm 9: Coordination Strategy for Azarm and Li's Decomposition.
1. Initialize k = 0, y^k = y^0.
2. (Subproblem) Use monotonicity to solve the subproblems and obtain x*i(y).

3. (Master Problem) Solve the master problem with respect to y using a conventional method to obtain y*.
4. If f converges, stop; otherwise increment k, set y^k = y*, and return to 2.

The key difficulty with the method is that the monotonicity analysis usually yields more than one candidate active constraint in Step 2. If, for the ith subproblem, there are Ki candidate active sets, and qij(y) is the solution vector resulting from the jth candidate active set for the ith subproblem, the subproblem solution x*i(y) can be expressed as

   x*i(y) = max {qij(y); j = 1,..., Ki}.   (6)

Evaluation of the max function in the subproblem requires an estimate of y. Moves in y in the master problem may invalidate x*i(y); evaluating the max function in the master problem may introduce discontinuities, a real difficulty for a conventional SQP algorithm. Three strategies could preclude this. First, the master problem could be formulated as a mixed discrete problem where the candidate solutions,

   qij(y);  j = 1,..., Ki,   (7)

are treated as discrete variables. Second, the max function could be evaluated in the subproblem and appropriate move limits on y also be computed and returned to the master problem. Third, the candidate active constraints could be represented by a K-S function (Kreisselmeier-Steinhauser [1979]) which asymptotically envelopes the dominant constraint (a small sketch of this envelope follows below).

The MBDM is very similar to Johnson's Multi-Stage Decomposition. Johnson suggested solving the subproblems analytically and the master problem either analytically or using a conventional numerical method. If monotonicity analysis is the method for the analytical solution in the subproblem and an SQP method solves the master problem, the MBDM can be viewed as an implementation of Johnson's multistage method on problems with special structure.
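The K-S envelope just mentioned is straightforward to compute. A minimal sketch of ours, assuming the standard form KS_ρ(g) = (1/ρ) ln Σ_j exp(ρ g_j), which the report does not restate: a smooth upper bound on max_j g_j that tightens as the draw-down factor ρ grows.

```python
import numpy as np

def ks(g, rho=50.0):
    """Kreisselmeier-Steinhauser envelope of the constraint values g.

    Smooth, differentiable stand-in for max(g); always >= max(g) and within
    log(len(g))/rho of it. Shifting by max(g) avoids overflow in exp().
    """
    g = np.asarray(g, dtype=float)
    m = g.max()
    return m + np.log(np.exp(rho * (g - m)).sum()) / rho

g = np.array([-0.40, -0.05, -0.30])      # three candidate constraint values
for rho in (5.0, 50.0, 500.0):
    print(rho, ks(g, rho))               # approaches max(g) = -0.05 from above
```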

Wismer and Chattergy [1978] proposed a method where the objective is separable with respect to xi but the constraints are coupled. The original problem, the master problem, and the subproblems are given in Table 10.

Table 10. Wismer and Chattergy's Decomposition

Original Problem:
   min over x1,..., xp:  Σ_{i=1..p} fi(xi)
   s. to:  g(x) ≤ 0,  xi ∈ R^ni for i = 1,..., p

Properties/Transformations:
   Objective separable; highly coupled constraints.
   Lagrangian minimization. Introduce y with
      h(x, y) = x - y = 0
      gi(x) = gi(xi, y1,..., yj,..., yp),  j ≠ i.

Master Problem (y passed down; x*, λ*, μ* passed up):
   min over y:  Σ_{i=1..p} Li(y, x*, λ*, μ*)

Subproblem:
   min over xi:  Li(xi, y, λi, μi)

Figure 10 shows the FDT for Wismer and Chattergy's method. The strategy begins by introducing a vector of new variables, y, and a vector of equality constraints, h(x, y) = x - y = 0, both of dimension n. The vector h is partitioned into p vectors, hi, and y is partitioned into p vectors, yi, each associated with a vector xi. The vectors hi and yi are each of dimension ni. Since xj = yj for j = 1,..., p, the following transformation holds for the ith vector of inequality constraints:

   gi(x) = gi(xi, y1,..., yj,..., yp; j ≠ i) = gi(xi, yj; j = 1,..., p; j ≠ i).   (8)

To formulate the Lagrangian, introduce p vectors of Lagrange multipliers λi associated with hi and p vectors of Lagrange multipliers μi associated with gi. Defining Lagrange functions for the ith partition,

   Li(xi, y, λi, μi) = fi(xi) + λi^T hi(xi, yi) + μi^T gi(xi, yj; j = 1,..., p; j ≠ i),   (9)

the Lagrangian for the original problem is the sum

   L(x, y, λ, μ) = Σ_{i=1..p} Li(xi, y, λi, μi),   (10)

which effects p independent Lagrangians if y is held fixed.

A conventional SQP algorithm can usually solve this subproblem and provide (x*, λ*, μ*). The master problem is to minimize L with respect to y. A gradient step can be employed to reduce L on each iteration in the master problem,

   yi^(k+1) = yi^k - α ∂L/∂yi,   (11)

with α > 0.

[Figure 10. FDT schematic for Wismer and Chattergy's decomposition method.]

Algorithm 10: Coordination Strategy for Wismer and Chattergy's Decomposition.
1. Initialize k = 0, feasible x = x^k, and y^k = x^k.
2. Holding y^k constant, solve the subproblems using any method to obtain (x*i, λ*i, μ*i).
3. Update y in the master problem using Equation (11).
4. (Convergence Test) If ||f(y^(k+1)) - f(y^k)|| < ε, stop; otherwise increment k and go to 2.

Advantages of this method are that it can handle highly coupled constraint sets and that the master problem converges if the local solutions remain feasible. When local solutions cannot be found, the most recent feasible solution, while perhaps not optimal, has reduced the Lagrangian. A line search in Step 3, instead of a fixed α, may also be used. One drawback is the dimensionality the new variables y add to the problem. Another drawback is that certain partitions of g may result in unbounded subproblems. Subproblems need to be checked for boundedness as part of the subproblem solution.

[Figure 11. FDT schematic for Sobieski's decomposition method.]

Sobieski [1982] proposed a method where the constraints can be partitioned with respect to xi but the objective is not separable. The FDT structure is shown in Figure 11 and the problem form is given in Table 11. The objective function is only a function of the global variables y. Note that partitioning is not driven by any structure in the objective function. Local variables, xi, appear only in the constraint set. The constraint set is partitioned by the incidence of the local variables. Such structure occurs when design objectives can be expressed as functions of many detailed variables, but all variables need not appear explicitly in the objective function. Problems where the objective is expressed in terms of behavior variables which in turn are functions of geometric variables also tend to have this structure.

Table 11. Sobieski's Decomposition

Original Problem:
   min over y, x1,..., xp:  f(y)
   s. to:  h(y) = 0;  g(y) ≤ 0;  gb(y) ≤ 0
           hi(y, xi) = 0;  gi(y, xi) ≤ 0;  gbi(xi) ≤ 0,   for i = 1,..., p

Properties/Transformations:
   Non-separable objective; separable constraints.
   Bracket operator:  <gij(y, xi)> = 0 for gij(y, xi) ≤ 0;
                      <gij(y, xi)> = gij(y, xi) for gij(y, xi) > 0.

Master Problem (y passed down; P*i(y), x*i(y) passed up):
   min over y:  f(y)
   s. to:  h(y) = 0;  gb(y) ≤ 0
           gbi(x*i(y)) ≤ 0
           G(y) = <g(y)> + Σ_{i=1..p} P*i(y) ≤ εg

Subproblem:
   min over xi:  Pi(y, xi) = Σ_{j=1..ngi} <gij(y, xi)>^2
   s. to:  hi(y, xi) = 0
           gbi(xi) ≤ 0

Since the objective function is independent of x, it appears only in the master problem. In the spirit of penalty methods, an objective function is formed for each subproblem, where a measure of constraint violation is minimized. Assuming gi(y, xi) is a vector of scalar-valued functions defined on R^ngi, a penalty associated with the ith subproblem, Pi(y, xi), is minimized as shown in the subproblem of Table 11. If the sensitivity derivatives,

   ∇y Pi(y, x*i)  and  ∇y x*i,   (12)

are also computed when solving the subproblem, a linear approximation of the subproblem solution can be constructed for use in the master problem. The algorithm is summarized in Algorithm 11. Note that the bounds on xi are included in the master problem. No proof of convergence is given, but Sobieski suggests the addition of move limits on y to keep the linear approximations valid.

Algorithm 11: Coordination Strategy for Sobieski's Decomposition.
1. Initialize k = 0 and all variables, y = y^k and xi = xi^k for i = 1,..., p.
2. (Subproblem) For a given y^k, solve the p independent subproblems with respect to xi to obtain xi^k.
   (a) Compute sensitivity derivatives of the subproblem solution with respect to y: ∇y xi^k and ∇y Pi(y, xi^k). (Finite differencing is suggested.)
   (b) Using the sensitivity derivatives computed in (a), construct a linear representation of Pi(y, xi^k) and xi^k about the point y^k, yielding x*i(y) and Pi(y, x*i(y)).
3. (Master Problem) Solve the master problem with respect to y and obtain the solution y*.
4. (Termination Test) If ||f(y*) - f(y^k)|| < ε and G(y) ≤ εg, stop; otherwise set y^(k+1) = y*, xi^(k+1) = xi^k, and k = k+1. Return to 2.

The method can be interpreted as Lagrangian decomposition in the context of penalty methods. If all constraints except for simple bounds and equalities are represented with exterior penalty functions in a modified objective, the Lagrangian of the penalty formulation is additively separable with respect to xi. The master problem minimizes this Lagrangian with respect to y, and the subproblems independently minimize each of the terms in the Lagrangian with respect to xi.

Sobieski et al. [1985] used the method to solve a three-member portal framework problem. The decomposed solution compared favorably to the solution obtained without decomposition. In the portal frame problem, the K-S cumulative constraint (Kreisselmeier-Steinhauser [1979]) was used instead of the penalty function given in Table 11. Parkinson et al. [1987] evaluated Sobieski's decomposition method on several test problems from Stoecker [1981], but with a different approximation scheme. Using Design of Experiments techniques to determine 'test' points in the y space, they approximated the subproblem solution dependence on y with regression equations. The objectives of problems solved with decomposition were within a few percent of the non-decomposed solutions.
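A minimal sketch (our illustration, with a made-up constraint set) of the subproblem penalty in Table 11 and the finite-difference sensitivity suggested in Step 2(a):

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical constraint vector g_i(y, x_i) for one subproblem: the two
# constraints conflict for y > 1, so the minimum violation is positive there.
def g_i(y, x):
    return np.array([x[0] + y - 1.0,
                     -x[0] + y - 1.0])

def P_i(y, x):
    # Cumulative penalty of Table 11: sum of squared bracket operators,
    # <g> = max(g, 0), so only violated constraints contribute.
    return np.sum(np.maximum(g_i(y, x), 0.0) ** 2)

def solve_subproblem(y):
    # Minimize the violation measure for fixed y (Step 2 of Algorithm 11).
    return minimize(lambda x: P_i(y, x), x0=np.zeros(1)).x

y_k, h = 1.5, 1e-4
x_star = solve_subproblem(y_k)

# Step 2(a): forward-difference sensitivities of P_i and x* with respect to y,
# used to build the linear representations passed to the master problem.
x_plus = solve_subproblem(y_k + h)
dP_dy = (P_i(y_k + h, x_plus) - P_i(y_k, x_star)) / h
dx_dy = (x_plus - x_star) / h
print(x_star, dP_dy, dx_dy)
```

Here the best the subproblem can do is x = 0 with P_i = 2 (y - 1)^2, so the reported slope dP_dy ≈ 2 at y = 1.5, which is exactly the information the master's linearized cumulative constraint needs.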

Table 12. Summary of Defining Properties for Prevalent Hierarchical Decomposition Methods

Method | Objective | Constraints | Manipulation | Master Problem Solution | Subproblem Solution | Convergence Conditions
Dantzig-Wolfe (dual) | Linear program | Some highly coupling constraints | Formulate dual wrt coupling constraints | Outer linearization / relaxation | Simplex | Yes
Takahashi (dual) | Convex | Equalities; some highly coupling constraints | Formulate dual wrt coupling constraints | Feasible directions | Conventional | Yes
Lasdon (dual) | Additively separable | Each constraint additively separable | Formulate dual | Gradient method or tangential approximation | Conventional | Yes
Wismer-Chattergy (dual) | Additively separable | Highly coupled | For p partitions introduce k-vectors y, h; k = np - n | Gradient for dual variables using KKT conditions | Conventional | Yes, if subproblems stay feasible
Diaz-Belding (dual-like) | Goal programs | Linking variables y | Goal programming | Conventional NLP (GRG) | Conventional NLP (GRG); sensitivity wrt y | Not guaranteed
Johnson (feasible) | No special properties | No special structure | Projection | Analytical or numerical | Analytical; suggests M.O.D. | Yes, if subproblem is single-valued
Kirsch (feasible) | Additively separable | Linking variables y | Projection | No suggestions | - | Not guaranteed
Rosen (feasible) | Linear in x | Linear in x | Projection | Piecewise linear wrt y | Simplex method | Yes, if A, c are independent of y
Benders (feasible) | Linear in x | Linear in x | Projection | Outer linearization / relaxation | Dual or primal method | Yes, explicitly for assumed form
Azarm-Li (feasible) | Additively separable; monotonic wrt x | Linking variables y | Projection (single step) | SQP | Monotonicity analysis | Yes, if subproblems are single-valued functions
Wismer-Chattergy (feasible) | Additively separable | Highly coupled | Introduce n-vector y | Gradient method (using KKT) | Conventional | Yes, if subproblems stay feasible
Sobieski (feasible) | Not dependent on x | Linking variables y | Penalty formulation of linking constraints | Conventional NLP method | Conventional; sensitivity analysis wrt y | Not guaranteed, but move limits can help

The hierarchical methods reviewed in this section are summarized in Table 12.

Dual methods have the common defining property of linking functions; feasible methods have the common defining property of linking variables. Additional properties of additive separability, linearity, and monotonicity determine the suitability of a given method for a given problem. The underlying assumption in the use of these methods is that the structure of the original problem, represented here by the structure in the FDT, is known.

4 NON-HIERARCHICAL METHODS

The methods reviewed thus far deal with independent subproblems generated when either linking functions or linking variables are accommodated in the master problem. However, problems may have both linking functions and linking variables, as the FDTs of Figure 12 illustrate. Using the methods described earlier, a three-level method could be used to decompose the NLP. Use of a dual method would account for the linking functions, and the subproblem would be further decomposed using a feasible decomposition method; such an approach is termed dual-feasible. Here, linking in the subproblems is accounted for by using a feasible method to further decompose the problem. Such a strategy is essentially a recursive decomposition, and could be invoked to any desired level. If the problem is highly coupled as in Figure 12(b), such a method may not succeed. To facilitate independent subproblems in such circumstances, approximations are required to account for the interdependence of the subproblems. Ritter [1967], Wismer and Chattergy [1978], Sobieski [1988], Pan and Diaz [1990], Wu [1991], and Unger et al. [1992] have proposed various methods to handle such coupling. Such methods have acquired the name non-hierarchical decomposition because the information coupling is non-hierarchical. Note, however, that each method has a coordination strategy that is strictly hierarchical. The approximations are effectively 'cuts' that allow non-hierarchical information coupling to be handled hierarchically.

The discussion that follows considers linear programs first. Ritter [1967] proposed a method for NLPs with the structure in Figure 12(a) which are linear in x, with linking variables y, and with linking constraints. The subproblem is the linear program in x with y fixed. Solution of the subproblem yields basis vectors xib and non-basis vectors xid. The master problem is formulated as an NLP problem in y and the non-basis vector xid. The master problem is interpreted as a relaxation strategy because explicit elimination of xib from the master problem effects relaxation of the constraints on xib. Under explicit conditions, convergence is guaranteed, and computational experience with the method is given in Grigoriadis and Ritter [1968].

[Figure 12. FDT for an NLP with linking functions and linking variables: (a) low coupling; (b) high coupling.]

Wismer and Chattergy [1978] suggested a relaxation strategy for decomposing highly coupled constraint sets in problems with an objective that is additively separable in xi. Introduction of an n-vector of variables y, an n-vector of constraints,

   h(y, x) = y - x = 0,   (13)

and the KKT conditions yield an explicit relationship for the multipliers,

   λi = Σ_{j=1..p} (∂gj/∂yi)^T μj.   (14)

In each iteration of the coordination strategy, the master problem sets y = x* and computes λ from Equation (14). Clearly, the method requires the sensitivity derivatives ∂g/∂y in each subproblem. These derivatives are the mechanism which accounts for subproblem coupling.

Pan and Diaz [1990] presented an algorithm for decomposing a highly coupled NLP problem as follows. The variables are partitioned into p arbitrary partitions xi and p subproblems are defined. The ith subproblem solves the entire NLP with respect to xi, holding all other variables fixed. Results using the first-order sensitivity theorem (Fiacco [1983]) are employed to develop a scheme for sequencing the solution of the subproblems. The master problem examines the constraint activity in each subproblem and identifies a constraint active in all subproblems. The subproblem indices are ranked according to the value of the Lagrange multiplier associated with such a constraint. The sequence for the next iteration of subproblems is based on this ranking. The authors report two examples where the method was effective in solving a highly coupled NLP, but caution that experience is still limited. The method uses the Lagrange multiplier as a measure of a variable partition's effect on the entire problem and gives that partition priority in the sequence of the subproblems.

Sobieski [1988, 1990] proposed a method for highly coupled problems which makes no assumptions about separability of the objective. Multi-disciplinary problems where the input-output relationships for function evaluations are defined a priori motivated the method. The method can take advantage of discipline-dependent optimization strategies (for example, optimality criteria in structural optimization). Treating input as the independent variables and output as the dependent variables for a given function, the implicit function theorem is used to derive the Global Sensitivity Equations (GSE)

(Sobieski [1990]), which provide the derivative of the output with respect to the input at a constant function value. If a given output is required as input in another function, the derivatives facilitate decoupling of the two functions. The GSE approach decouples the constraint sets. The constraint vector is partitioned into p subvectors based on physical subsystems, and each sub-vector is represented with a single cumulative constraint. The variables are partitioned based on effectiveness coefficients (Hajela and Sobieski [1981]) which quantify the impact of the variable on reducing the objective and each cumulative constraint. Also, a p×p matrix of 'responsibility' coefficients r weights the effect of the ith variable partition on reducing the jth cumulative constraint violation. For the feasible regions, a p×p matrix of 'tradeoff' coefficients t weights the effect of the ith variable partition on increasing the jth cumulative constraint. The system NLP is approximated with the original objective, the p cumulative constraints, and the bound constraints. The ith subproblem minimizes the system objective with respect to xi subject to a p-vector of cumulative constraints modified with a p-vector of weights defined in terms of r and t. These are fixed in the subproblems. The 'coordination' problem minimizes the system objective with respect to the coefficients r and t. The coefficients r and t coordinate all subproblem coupling in the master problem. In this context, the method bears resemblance to a dual method.

Wu [1991] proposed a method for highly coupled problems which requires no optimization in the subproblems. The method begins with partitioning the original variables into coordinating variables y and local variables x. The local variables x are further partitioned into p partitions. Partitioning by discipline (aspect decomposition) or by component (object decomposition) is suggested, but the partitions can also be arbitrary. Similarly, the constraints are partitioned into p sets, and the ith set of constraints is represented in the master problem with a single cumulative constraint, Qi.

A p-vector of multipliers α is introduced where the scalar αi is associated with the ith partition of variables. The coordinating variables are (y, α). The master problem minimizes the objective with respect to (y, α) subject to the p cumulative constraints. A master problem iteration yields a vector of Lagrange multipliers, μ, associated with the cumulative constraints. The local variables are represented in the master problem through the vector α. At each iteration the subproblem simply computes the derivative

   df/dxi = ∂f/∂xi + (∂Q/∂xi)^T μ,   (15)

and the local variables are represented in the master problem as

   xi^(k+1) = xi^k - αi (df/dxi).   (16)

The method has several interesting features. First, it is a model reduction method: the cumulative constraints serve to reduce the dimensionality of the constraint set, and the vector α reduces the dimension of the local variable space. Second, no optimization is done in the subproblems; they only report gradient information. Third, the derivative in Equation (15) is a measure of the local variable's effect on the objective, and the derivative's magnitude and sign are used to increase or decrease the value of the local variable. Fourth, the vector α is analogous to a set of step-sizes in a line search algorithm. Fifth, the method needs no move limits. The method was evaluated with numerous small to medium size problems from the literature. The accuracy of the solutions depends primarily on the accuracy of the cumulative constraints. The method is derived on the assumption that there are no changes in constraint activity, which may limit its generality.
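A sketch of the gradient-only subproblem report of Equations (15)-(16), with invented functions of our own; in Wu's method the multipliers μ come from the master problem's cumulative constraints, whereas here they are simply given.

```python
import numpy as np

# Hypothetical smooth functions for one local-variable partition x_i.
f  = lambda x: x[0] ** 2 + 2.0 * x[1] ** 2          # system objective term
Q  = lambda x: np.array([x[0] + x[1] - 1.0])        # one cumulative constraint
dQ = lambda x: np.array([[1.0, 1.0]])               # its Jacobian wrt x_i

def subproblem_report(x, mu):
    # Equation (15): total derivative of f as seen through the constraints.
    df_dx = np.array([2.0 * x[0], 4.0 * x[1]]) + dQ(x).T @ mu
    return df_dx

x_i   = np.array([1.0, 1.0])   # current local variables
mu    = np.array([0.3])        # multipliers from the master iteration
alpha = 0.1                    # master-controlled step size for this partition

# Equation (16): the master represents the local variables by a gradient step.
x_next = x_i - alpha * subproblem_report(x_i, mu)
print(x_next)
```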

Unger et al. [1992] proposed the use of variable complexity models to decompose the analysis sequence in a multi-disciplinary design problem. The problem of interest was the design of a transport wing for minimum weight, which required both structural analysis and aerodynamic analysis. The aerodynamics are modeled by vortex-lattice theory, and the structure is described at two levels of complexity. A simple algebraic equation model of structural weight is first used in a numerical optimization procedure to obtain a design that approximately accounts for effects of wing geometry on structural weight. The design is then refined based on a more complex finite-element model for the wing structure. They demonstrate how a variable level of detail in the iteration sequence leads to substantial savings in computational effort, reporting a 75% reduction in CPU time with this approach. The method can be interpreted as a model reduction decomposition technique. A function of many variables is replaced with a function of few; the many finite-element variables are reduced to a few geometric variables, and the required function evaluation is replaced with a simpler evaluation: an algebraic expression.

5 SUMMARY

The report reviewed and compared coordination strategies for hierarchically and non-hierarchically coupled optimization problems. The coordination strategy is the fundamental element of a decomposition method and depends very strongly on the defining properties of the problem. Figure 13 summarizes the FDT structures for most of the methods reviewed. The figure illustrates the defining properties of linking functions, linking variables, and separability. Moving down the figure, the problem coupling increases.

The coordination strategy of every method presented has the underlying generic strategy given in Algorithm 0. While coupling may be non-hierarchical, the coordination strategies are strictly hierarchical. A master problem coordinates the solution of the independent subproblems. The algorithms in the coordination strategy take explicit advantage of structure (found or imposed) in the primal problem statement. Dual methods accommodate linking functions by coordinating dual variables in the master problem; feasible methods accommodate linking variables by coordinating them in the master problem; problems with linking functions and linking variables can be accommodated with a recursive

5 SUMMARY

The report reviewed and compared coordination strategies for hierarchically and non-hierarchically coupled optimization problems. The coordination strategy is the fundamental element of a decomposition method and depends strongly on the defining properties of the problem. Figure 13 summarizes the FDT structures for most of the methods reviewed. The figure illustrates the defining properties of linking functions, linking variables, and separability; moving down the figure, the problem coupling increases. The coordination strategy of every method presented has the underlying generic strategy given in Algorithm 0. While the coupling may be non-hierarchical, the coordination strategies are strictly hierarchical: a master problem coordinates the solution of the independent subproblems. The algorithms in the coordination strategy take explicit advantage of structure (found or imposed) in the primal problem statement. Dual methods accommodate linking functions by coordinating dual variables in the master problem; feasible methods accommodate linking variables by coordinating them in the master problem; problems with both linking functions and linking variables can be accommodated with a recursive decomposition, or by incorporating into the coordinating problem a mechanism that accounts for the coupling. That mechanism may be a Lagrange multiplier (Pan and Diaz [1990]), a sensitivity coefficient (Sobieski [1990]), or a simpler model (Unger et al. [1992]).

Figure 13. Summary of FDT structures for the decomposition methods reviewed. Dual methods, separable: Dantzig-Wolfe, Benders, Rosen, Kirsch, Azarm and Li. Dual-feasible, non-separable: Ritter, Wismer and Chattergy. Feasible: Sobieski (non-hierarchical), Wu, Pan and Diaz (non-hierarchical).

The design interpretation of the decomposition methods lies primarily in the authority model of the master problem. In dual methods, the master problem sets priorities with the multipliers; in feasible methods, the master problem directly controls those variables common to all subproblems. In problems with non-hierarchical coupling, the master problem coordinates independent subproblems through some measure of the subproblems' effects on the system. The mechanisms vary, but several of the coordination strategies bear a strong resemblance to dual methods.

The assumption throughout this report has been that the defining properties for decomposition are known. For the general NLP this is not easy to establish. An undirected graph representation of the FDT, coupled with suitable partitioning algorithms, facilitates identification of defining properties in the general NLP. Wagner (1993) used a k-clique graph representation of the FDT and heuristics to find desired defining properties; Michelena and Papalambros (1995) used a hypergraph representation and spectral partitioning to find optimal defining properties; Krishnamachari and Papalambros (1996) used a bipartite graph of the FDT and integer programming to find optimal defining properties.
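To make the graph view concrete, the short Python sketch below treats the FDT as a Boolean matrix and flags the linking variables of a given function partition. The function name and the chosen partition are illustrative; this is not the k-clique, spectral, or integer-programming machinery of the cited works.

    import numpy as np

    def linking_variables(fdt, blocks):
        """Return columns (variables) shared by two or more function blocks.

        fdt    : Boolean matrix; fdt[i, j] true if function i uses variable j
        blocks : list of row-index lists, one per prospective subproblem
        Fixing the flagged variables leaves the subproblems independent.
        """
        fdt = np.asarray(fdt, dtype=bool)
        used = np.array([fdt[rows].any(axis=0) for rows in blocks])
        return np.where(used.sum(axis=0) > 1)[0]

    # FDT of the example problem: rows f1-f3, g1-g5; columns x1-x4.
    fdt = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0],
           [1, 0, 0, 1], [0, 1, 0, 1], [0, 0, 1, 1],
           [0, 0, 0, 1], [0, 0, 0, 1]]
    blocks = [[0, 3], [1, 4], [2, 5, 6, 7]]   # group rows by x1, x2, x3
    print(linking_variables(fdt, blocks))     # -> [3], i.e., x4 links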

REFERENCES

Azarm, S., and Li, W.C., 1988, "A Two-level Decomposition Method for Design Optimization", Engineering Optimization, Vol. 13, pp. 211-224.

Azarm, S., and Li, W.C., 1989, "Multi-level Design Optimization Using Global Monotonicity Analysis", ASME Transactions, Journal of Mechanisms, Transmissions, and Automation in Design, Vol. 111, pp. 259-263.

Benders, J.F., 1962, "Partitioning Procedures for Solving Mixed Variables Programming Problems", Numerische Mathematik, Vol. 4, pp. 238-252.

Bloebaum, C.L., 1991, "Formal and Heuristic System Decomposition Methods in Multidisciplinary Synthesis", NASA Contractor Report 4413.

Dantzig, G.B., and Wolfe, P., 1960, "Decomposition Principle for Linear Programming", Operations Research, Vol. 8, pp. 101-111.

Diaz, A.R., and Belding, B., 1991, "Modeling and Solution Strategies for Hierarchical Design Optimization", Engineering Optimization, Vol. 16, pp. 247-273.

Geoffrion, A.M., 1971, "Large Scale Linear and Nonlinear Programming", in Optimization of Large Systems, ed. D.A. Wismer, McGraw-Hill Book Company, New York.

Grigoriadis, M.D., and Ritter, K., 1968, "A Decomposition Method for Structured Linear and Nonlinear Programs", Computer Sciences Technical Report No. 10, Computer Sciences Department, University of Wisconsin.

Hajela, P., and Sobieszczanski-Sobieski, J., 1981, "The Controlled Growth Method - A Tool for Structural Optimization", Proceedings of the AIAA/ASME/ASCE/AHS 22nd Structures, Structural Dynamics, and Materials Conference, Atlanta, GA.

Johnson, R.C., 1984, "A Method of Optimal Design Extended to Systems of Mechanical Elements Having Interrelated Boundary Effects", Ph.D. Dissertation, Dept. of Mechanical Engineering, University of Rochester, Rochester, NY.

Johnson, R.C., and Benson, R.C., 1984a, "A Basic Two-stage Decomposition Strategy for Design Optimization", ASME Transactions, Journal of Mechanisms, Transmissions, and Automation in Design, Vol. 106, pp. 380-386.

Johnson, R.C., and Benson, R.C., 1984b, "A Multi-stage Decomposition Strategy for Design Optimization", ASME Transactions, Journal of Mechanisms, Transmissions, and Automation in Design, Vol. 106, pp. 387-393.

Kirsch, U., 1981, Optimum Structural Design, McGraw-Hill Book Company, New York, NY.

Kreisselmeier, G., and Steinhauser, R., 1979, "Systematic Control Design by Optimizing a Vector Performance Index", Proceedings of the IFAC Symposium on Computer Aided Design of Control Systems, Zurich, Switzerland, pp. 113-117.

Krishnamachari, R., and Papalambros, P., 1996, "Optimal Hierarchical Decomposition Synthesis Using Integer Programming", Advances in Design Automation, ASME, New York.

Lasdon, L.S., 1968, "Duality and Decomposition in Mathematical Programming", IEEE Transactions on Systems Science and Cybernetics, Vol. SSC-4, No. 2, July.

Lasdon, L.S., 1970, Optimization Theory for Large Systems, The Macmillan Company, London.

Lootsma, F.A., and Ragsdell, K.M., 1988, "State-of-the-art in Parallel Non-linear Optimization", Parallel Computing, Vol. 6, pp. 133-155.

Lootsma, F.A., 1989, "Exploitation of Structure in Non-linear Optimization", Parallel Computing, Vol. 6, pp. 31-45.

Michelena, N., and Papalambros, P., 1995, "Optimal Model-Based Decomposition of Powertrain System Design", ASME Journal of Mechanical Design, Vol. 117, No. 4, pp. 499-505.

Pan, J., and Diaz, A.R., 1990, "Some Results in Optimization of Non-hierarchic Systems", ASME Journal of Mechanical Design, Vol. 112, September, pp. 399-405.

Papalambros, P.Y., and Wilde, D.J., 1988, Principles of Optimal Design: Modeling and Computation, Cambridge University Press, New York, NY.

Parkinson, A., Balling, R., Wu, A., and Free, J., 1987, "A General Strategy for Decomposing Large Design Problems Based on Optimization and Statistical Test Plans", International Conference on Engineering Design 87, Boston.

Ritter, K., 1967, "A Decomposition Method for Linear Programming Problems with Coupling Constraints and Variables", Technical Report No. 739, Mathematics Research Center, University of Wisconsin.

Rogers, J.L., 1989, "A Knowledge Based Tool for Multi-level Decomposition of a Complex Design Problem", NASA TP 2903, Langley Research Center, Hampton, VA.

Rosen, J.B., 1963, "Convex Partition Programming", in Recent Advances in Mathematical Programming, pp. 159-176, R.L. Graves and P. Wolfe, eds., McGraw-Hill, Inc., New York, NY.

Sobieszczanski-Sobieski, J., Barthelemy, J.F., and Riley, K.M., 1981, "Sensitivity of Optimum Solutions to Problem Parameters", Proceedings of the AIAA/ASME/ASCE/AHS 22nd Structures, Structural Dynamics and Materials Conference, Atlanta, GA, April, AIAA Paper No. 81-0548.

Sobieszczanski-Sobieski, J., 1982, "A Linear Decomposition Method for Large Optimization Problems - Blueprint for Development", NASA TM 83248, Langley Research Center, Hampton, VA.

Sobieszczanski-Sobieski, J., James, B.B., and Dovi, A.R., 1985, "Structural Optimization by Multilevel Decomposition", AIAA Journal, Vol. 23, November, pp. 1775-1782.

Sobieszczanski-Sobieski, J., and Haftka, R.T., 1987, "Multi-disciplinary and Multi-level Optimum Design", in Computer Aided Optimal Design: Structural and Mechanical Systems, ed. C.A. Mota Soares, Springer-Verlag, Berlin.

Sobieszczanski-Sobieski, J., 1988, "Optimization by Decomposition: A Step from Hierarchic to Non-hierarchic Systems", NASA TM 101494, Langley Research Center, Hampton, VA.

Sobieszczanski-Sobieski, J., 1990, "Sensitivity of Complex, Internally Coupled Systems", AIAA Journal, Vol. 28, January, pp. 153-160.

Takahashi, I., 1964, "Variable Separation Principle for Mathematical Programming", Journal of the Operations Research Society of Japan, Vol. 6, No. 1, pp. 82-105.

Unger, E.R., Hutchison, M.G., Rais-Rohani, M., Haftka, R.T., and Grossman, B., 1992, "Variable Complexity Multi-disciplinary Design of a Transport Wing", Int'l Journal of Systems Automation: Research and Applications, Vol. 2, pp. 87-113.

Wagner, T.C., 1993, A General Decomposition Methodology for Optimal System Design, Ph.D. Dissertation, Department of Mechanical Engineering and Applied Mechanics, University of Michigan.

Wagner, T.C., and Papalambros, P., 1993, "A General Framework for Decomposition Analysis in Optimal Design", ASME Advances in Design Optimization, Vol. 2, pp. 315-325.

Watton, J.D., 1989, "Automatic Reformulation of Mechanical Design Constraints to Enhance Qualitative and Quantitative Design Decisions", Ph.D. Dissertation, Department of Mechanical Engineering, Carnegie Mellon University, Pittsburgh, PA.

Wismer, D.A. (ed.), 1971, Optimization Methods for Large Scale Systems, McGraw-Hill Book Company, New York.

Wismer, D.A., and Chattergy, R., 1978, Introduction to Non-linear Optimization: A Problem Solving Approach, North-Holland, New York, NY.

Wu, B.C., 1991, "Optimization Based Design of Non-hierarchical Systems", Ph.D. Dissertation, Dept. of Mechanical Engineering, University of Maryland, College Park, MD.