MONOTONICITY IN MATHEMATICAL PROGRAMMING

by

Panos Papalambros
Department of Mechanical Engineering and Applied Mechanics
The University of Michigan

M. B. Suryanarayana
Department of Mathematics
Eastern Michigan University

Ann Arbor, June 1981

ABSTRACT Coordinatewise monotonicity of the objective and constraint functions with respect to the decision variables in mathematical programs may be used to identify active and inactive constraints. When applicable, this may lead to the global optimum directly or at least provide significant simplifications. Monotonicity in multidimensional spaces is studied for systems of inequalities and two necessary conditions of optimality are derived. Several examples demonstrate how to use monotonicity analysis. The ideas discussed originated from engineering design applications.

1. Introduction. Mathematical programming models of engineering design problems generally involve nonlinearities, nonconvexity, discrete variables and multiple optima. Identification of the global optimum for problems involving more than just a few variables can be very difficult. Experience has shown, however, that these problems often have many inequality constraints which are active, i.e. satisfied as strict equalities at the optimum. Moreover, due to modeling practices, it is not unusual to have a constraint-bound optimum, i.e. for the number of active constraints to equal the number of design variables [1]. A reason for this widespread constraint activity is a frequently observed property of engineering design problems, namely that the objective and constraint functions are monotonic with respect to some or all of the design variables. This observation led to the development of monotonicity analysis, a methodology for constraint activity identification which has been used to study and solve several design problems [1-9]. Although the motivation comes from engineering design, monotonicity arguments may apply to mathematical programming problems in general. Linear programs, for example, always have constraint-bound solutions due to monotonicity. Nonlinear programs may be reduced significantly using monotonicity arguments, thus making subsequent numerical treatment easier and more reliable, particularly when a global optimum is sought. The present paper provides a mathematical background for monotonicity properties that were used rather

informally in the previous engineering publications [1-9]. A particular type of monotonicity is defined in multidimensional space, a theory of monotone inequalities that can be used for model manipulations is described, and two necessary conditions for optimality are presented. A general application procedure with some examples is included to demonstrate how the theory can be used. The references cited above can provide further insight for applications.

2. Monotonicity in Multidimensional Spaces. The classical notion of monotonicity for a real-valued function f(x) of a real variable x lends itself to a variety of generalizations when we consider real-valued functions of several variables. In this section we examine some of these and indicate the one used in later sections. Given a function f: A → R, where A is a subset of the real line R, we say that f is monotone nondecreasing, if for all x1, x2 in A,

x1 ≤ x2 implies f(x1) ≤ f(x2)   (1)

or equivalently

(x1 − x2)(f(x1) − f(x2)) ≥ 0.   (2)

We say that f is strictly nondecreasing, or increasing, if f is monotone nondecreasing and in addition f is one to one. Consequently, for increasing functions conditions (1) and (2) should be modified to have strict inequalities. To generalize the above concepts to functions of several variables, we introduce some notation: Let E^n denote the n-dimensional Euclidean space and E^n_+ the

positive orthant of E^n, that is, E^n_+ = {x = (x1,...,xn) ∈ E^n : xi ≥ 0, i = 1,2,...,n}. Clearly E^n_+ is a closed convex cone in E^n and gives rise to a partial ordering ≤ defined on E^n by x ≤ y if and only if xi ≤ yi, i = 1,...,n; that is, x ≤ y if and only if (y − x) ∈ E^n_+. In general, if C is a closed convex cone in E^n, then C induces a partial ordering ≤ defined by x ≤ y if and only if (y − x) ∈ C. Given a real-valued function f on E^n and a closed convex cone C in E^n, we can define f to be C-monotone if f(x1) ≤ f(x2) whenever (x2 − x1) ∈ C. Also, f is said to be strictly C-monotone if f is C-monotone and one to one, that is, (x2 − x1) ∈ C and x1 ≠ x2 imply f(x1) < f(x2). It is to be observed that we did not use the phrase "nondecreasing" because the cone C might define an ordering which could lead either to "nondecreasing" or to "nonincreasing". For example, if n = 1 and C = {x : x ≤ 0}, then (x2 − x1) ∈ C implies x2 ≤ x1, and if in this case f(x1) ≤ f(x2), then we are led to the "nonincreasing" function concept. Another point of interest is that, given any real-valued one to one function f on E^n, we can define a partial order ≤f in E^n as follows: x1 ≤f x2 if and only if f(x1) ≤ f(x2). However, this partial order need not be generated by a cone in the above sense. For a different notion of monotonicity we refer to Minty [12] and Rockafellar [13]. Let us now define another concept of monotonicity that we will use in the present discussion. Let X be an arbitrary (finite or infinite) subset of E^n and let Xi = {xi : there exist x1,...,xi−1, xi+1,...,xn with (x1,...,xn) ∈

X}; that is, Xi is the projection of X on the ith coordinate. In most applications, X ⊂ E^n_+. Given a real-valued function f on X, a point x ∈ X and an integer i with 1 ≤ i ≤ n, we shall say that f is increasing at x with respect to the ith coordinate, if for all real y, z with (x1,...,xi−1, y, xi+1,...,xn) ∈ X and (x1,...,xi−1, z, xi+1,...,xn) ∈ X, y < z implies f(x1,...,xi−1, y, xi+1,...,xn) < f(x1,...,xi−1, z, xi+1,...,xn). For convenience of notation we shall write (x^i, z) ≡ (x1,...,xi−1, z, xi+1,...,xn). Thus, f is increasing with respect to the ith coordinate in X means that at each point x ∈ X, if we fix all coordinates except the ith one, then in the remaining ith variable f is monotone increasing. We define f to be decreasing with respect to the ith coordinate if (−f) is increasing in the ith coordinate. In summary,

DEFINITION 2.1. A function f is coordinatewise monotone on X with respect to (wrt) the ith coordinate, if and only if f is either increasing or decreasing wrt the ith coordinate.

Note that the above definition corresponds to strict monotonicity. If the projection Xi of X is open in E^1, if f is continuously differentiable wrt xi and if f is increasing wrt the ith coordinate, then the partial derivative fi ≡ ∂f/∂xi is nonnegative at each point x ∈ X, since the partial derivative is defined in terms of difference quotients, each of which is positive. Conversely, if the partial derivative ∂f/∂xi is positive at x ∈ X (strictly greater than zero), then f is increasing wrt the ith

coordinate at the point x. Since the monotonicity of f may be in different directions wrt different coordinates, we shall use the superscript + (−) on the ith coordinate to mean increasing (decreasing), and the superscript 0 to mean independence. Thus, the notation f(x1+, x2−, x3^0, x4) would mean that f is increasing wrt the 1st coordinate, decreasing wrt the 2nd, independent of the 3rd and undetermined wrt the 4th coordinate. Note that the concept of coordinatewise monotonicity is used also in the theory of generalized convexity of composite functions (as in Mangasarian [10] and Avriel [11]). Two functions, one increasing and the other decreasing wrt a particular variable, are said to have opposite monotonicity or to be monotonic in the opposite sense. Two or more functions, all either increasing or decreasing, are said to have the same monotonicity or to be monotonic in the same sense. For a function of a single variable, strict monotonicity means that the function is one to one and hence invertible. It is of interest to note that the corresponding inverse function (which obviously exists) is also monotonic and in the same sense. We may view this as an implicit function result. For example, given a (one to one) increasing function f(x), let F(x,y) denote f(x) − y, so that F(x,y) = 0 is the same as f(x) = y. Then for each x there is at most one y (which we shall denote by φ0(x)) such that F(x, φ0(x)) = 0; in fact, φ0(x) = f(x) and clearly φ0 and f are monotonic in the same sense. Also, since f is one to

one, for each y in the range of f there is at most one x (which we shall denote by ψ0(y)) such that F(ψ0(y), y) = 0; in fact ψ0(y) = f^(−1)(y) and clearly ψ0, f^(−1) and f are all monotonic in the same sense. It is to be noted that if we have f(x+), then f(x) − y = F(x+, y−), and in this case both φ0(x) and ψ0(y) are increasing functions. On the other hand, if we have f(x−), then F(x−, y−), and in this case both φ0(x) and ψ0(y) are decreasing functions. We shall now state and prove a theorem which shows that the above comments are true in a more general setting.

THEOREM 2.1. Let R be the set of real numbers and let X, Y, S denote (finite or infinite) subsets of R. Let F: X×Y → S be a real-valued function (coordinatewise) monotone on X×Y. Then, for each s ∈ S, there exist monotone functions φs(x) and ψs(y) (alternatively written as φ(s;x) and ψ(s;y)) such that F(x, φ(s;x)) = F(ψ(s;y), y) = s, with the understanding that these equalities hold for all x in the domain of φs(·) = {x : there is a y with F(x,y) = s} and for all y in the domain of ψs(·). Furthermore, if F(x,y) is monotonic in the same (opposite) sense wrt x and y, then the functions φ(s;x) and ψ(s;y) are both decreasing (increasing).

PROOF. For convenience of notation and without loss of generality, let s = 0 ∈ S. For a fixed x ∈ X, the function F(x,y) is monotonic wrt y (and one to one) and thus there is at most one value of y, denoted by φ0(x), such that F(x, φ0(x)) = 0. Thus φ0(x) is a function of x with domain

{x ∈ X : there is a y such that F(x,y) = 0}. Reversing the roles of x and y we obtain a function ψ0(y) such that F(ψ0(y), y) = 0, with domain {y ∈ Y : there is an x such that F(x,y) = 0}. For the second part of the theorem, we assume F(x+, y+) and take x1, x2 to be any two elements from the domain of φ0 with x1 < x2. Let y1 = φ0(x1) and y2 = φ0(x2), so that F(x1,y1) = 0 and F(x2,y2) = 0. We wish to show that y1 > y2, i.e. that φ0 is decreasing wrt x. Assuming y1 < y2 and since F is increasing wrt y (for fixed x), we get F(x2,y1) < F(x2,y2) = F(x1,y1). But since F is increasing wrt x (for fixed y) and since x1 < x2, we get F(x1,y1) < F(x2,y1), a contradiction. Hence we must have y1 > y2, i.e. x1 < x2 implies φ0(x1) > φ0(x2), or φ0 is decreasing wrt x. Reversing the roles of x and y we obtain that ψ0(y) is decreasing wrt y. Next let us assume F(x−, y+). As before, let x1 < x2 and y1 = φ0(x1), y2 = φ0(x2). We observe again that assuming y1 > y2 leads to F(x1,y1) > F(x1,y2), and since F(x1,y1) = F(x2,y2) this implies F(x2,y2) > F(x1,y2), which contradicts F(x2,y2) < F(x1,y2) obtained from x1 < x2 and F decreasing wrt x. Thus, when F is monotonic in the opposite sense wrt x and y, x1 < x2 implies y1 < y2 and φ0(x) is increasing wrt x. Similarly ψ0(y) is increasing wrt y. This completes the proof.

REMARK 2.1. The main property of order in R that we needed in the above proof is that the set R is totally ordered under ≤. Thus, for y1 ≠ y2, we have either y1 < y2 or y2 < y1. The above theorem is valid for any totally ordered set

R. Theorem 2.1 can be generalized easily to functions of several variables to get the following

PROPOSITION 2.1. Let Xi ⊂ R, i = 1,...,n, be n subsets (finite or infinite) of R and let X = {x : x = (x1,...,xn), xi ∈ Xi, i = 1,...,n}. Let F: X → R be (coordinatewise) monotone on X. Then, for each s in the range of F and for each i = 1,...,n, there exists a function φ(i,s; x^i) of the variables x^i = (x1,...,xi−1, xi+1,...,xn), with F(x1,...,xi−1, φ(i,s;x^i), xi+1,...,xn) = s, such that φ(i,s;x^i) is (coordinatewise) monotone wrt each coordinate of x^i. Furthermore, for 1 ≤ j ≤ n and i ≠ j, if F is monotone in the same (opposite) sense wrt xi and xj, then φ(i,s;x^i) is decreasing (increasing) wrt xj.

REMARK 2.2. In all the above considerations, no assumption is made of continuity or differentiability, and in fact the domains of the functions need not be infinite, so that the above results can be stated in terms of sequences.

3. Monotone Inequalities. In this section we consider inequalities involving monotone functions. For a discussion of linear inequalities in the context of optimization we refer to Mickle and Sze [14] and Shefi [15]. Here we consider real-valued functions f(x) which are (coordinatewise) monotone on some subset X of E^n, the set X being finite or infinite. Following a design practice stemming from geometric programming, we shall consider inequalities written in the normalized form f(x) ≤ 1. An inequality is said to be tight at a point x if the

inequality is satisfied as an equality at x. Thus an inequality f(x) ≤ 1 is tight at x̄ if f(x̄) = 1. If f is a monotone function of a single variable, then by Definition 2.1 f is one to one, and thus there can be at most one solution of the equation f(x) = 1, so that there can be at most one point x̄ ∈ X where f(x) ≤ 1 is tight. An inequality is said to be redundant if it is not tight at any point. Thus f(x) ≤ 1 is redundant if the equation f(x) = 1 has no real solutions. An inequality of the form t ≤ a or t ≥ a, where a is a fixed real number, is called a simple bound on the (real) variable t. When a simple bound is written in the normalized form t a^(−1) ≤ 1 or a t^(−1) ≤ 1, it is referred to as a simple inequality. In the following we shall consider mainly normalized inequalities of the form f(x) ≤ 1. A normalized inequality is called monotone wrt a coordinate xi if the function f(x) is (coordinatewise) monotone wrt xi. Without loss of generality, in the statements involving a monotone function f we can assume that f is monotone increasing wrt any particular xi. In fact, if f is not increasing wrt xi, then f must be decreasing wrt xi, so that if we rewrite f using the reciprocal x̃i = xi^(−1) instead of xi, we obtain a function which is increasing wrt the new variable x̃i. The following theorem shows that a monotone inequality, however complicated its algebraic form, is equivalent to a simple inequality.

THEOREM 3.1. Every nonredundant increasing inequality in

one variable can be reduced to a simple inequality providing a simple upper bound on the variable.

PROOF. According to the above assumptions the inequality is of the form f(x+) ≤ 1. Since the inequality is not redundant, there exists a (single) point x̄ such that f(x̄) = 1 and f(x) ≤ f(x̄). But then x ≤ x̄, since otherwise x > x̄ and f(x+) would imply f(x) > f(x̄) = 1, a contradiction. Thus x ≤ x̄, or x x̄^(−1) ≤ 1.

REMARK 3.1. Clearly f(x+) ≤ 1 implies an upper bound on x, while f(x−) ≤ 1 implies a lower bound on x.

REMARK 3.2. The above theorem relies on the fact that ≤ is a total order on the real line, so that if x ≰ x0 then x > x0. Since coordinatewise ordering in the n-dimensional Euclidean space is only a partial order, the above theorem does not hold for inequalities with several variables. However, we can use the argument above to conclude the following:

PROPOSITION 3.1. If f(x) ≤ 1 for all x in X, the domain of f, with f increasing wrt each coordinate, and f(x̄) = 1 for some x̄ ∈ X, then x̄ is a maximal element in X wrt the coordinatewise ordering; that is, for each x ∈ X, if xi ≥ x̄i for each i, then x = x̄.

REMARK 3.3. By suitably modifying the definition of the partial order on R^n, the above proposition on the existence of a maximal element x̄ can be proved for functions which are coordinatewise monotone, even if the monotonicity is not in the same sense wrt all the coordinates. We shall now use the above observations on the

existence of maximal elements to derive some important results for the simplification of systems of inequalities.

THEOREM 3.2. Let f1,...,fn be n given increasing functions with a common domain D ⊂ R. Let Fi = {x : fi(x) ≤ 1}, 1 ≤ i ≤ n, and let there exist an x̄i ∈ Fi such that fi(x̄i) = 1. Let F = ∩ Fi be nonempty. If x̄1,...,x̄n are all distinct, then the following two statements are equivalent: (a) For some i0 with 1 ≤ i0 ≤ n, x̄i0 is the lowest of the numbers x̄1,...,x̄n. (b) For some i0 with 1 ≤ i0 ≤ n, x̄i0 ∈ F and for each x ∈ F, fi0(x) ≤ 1 and fi(x) < 1 for each i ≠ i0.

PROOF. (a) implies (b): Since each fi(x) is increasing wrt x, we have from Theorem 3.1 that x̄i is the maximum of Fi, so that, since F ⊂ Fi, we have x ≤ x̄i for each x ∈ F and each i. But since the x̄i are all distinct and x̄i0 is the lowest of them, we obtain that x̄i0 is the least upper bound of F, i.e. x ≤ x̄i0 and x < x̄i for i ≠ i0. But then, since fi is increasing (and one to one), we get for all x ∈ F, fi(x) < fi(x̄i) = 1 for i ≠ i0, while fi0(x) ≤ 1. Also, x̄i0 ∈ F, because for each i, x̄i0 ≤ x̄i and as such fi(x̄i0) ≤ fi(x̄i) = 1. (b) implies (a): For all x ∈ F let fi(x) < 1 for i ≠ i0 and fi0(x) ≤ 1. Also, for the same index i0, let x̄i0 ∈ F. If for some i ≠ i0 we had x̄i < x̄i0, then fi(x̄i0) > fi(x̄i) = 1, which contradicts the assumption that x̄i0 ∈ F ⊂ Fi = {x : fi(x) ≤ 1}. This completes the proof.

REMARK 3.4. In design optimization it is important to identify the design requirements which are critical at the

optimum. A critical requirement corresponds to an inequality constraint which is active, i.e. one whose presence defines the location of the optimal point. Usually an active inequality constraint is tight at the optimum. Theorem 3.2 above states that in a set of increasing inequalities, the active one has the smallest root for the corresponding (tight) equation. From a practical standpoint this is useful in eliminating several inactive constraints in a problem with monotone inequality constraints. One should notice the distinction between redundant and inactive: redundant inequalities are never tight, while inactive inequalities are not tight at the optimal point, though they may be tight at some other points. Inactive constraints can be ignored in locating an optimal point. This leads naturally to the concept of dominance, similar to that studied by Wilde [2].

DEFINITION 3.1. An inequality f(x) ≤ 1, x ∈ R^n, is dominant over the inequality g(x) ≤ 1, if and only if f(x) ≤ 1 implies g(x) ≤ 1 for all x.

LEMMA 3.1. Let f(x) ≤ 1 be dominant over g(x) ≤ 1 and let f and g be both increasing wrt x ∈ R. Let f(x̄f) = 1 and g(x̄g) = 1 for x̄f, x̄g ∈ R. Then x̄f ≤ x̄g.

PROOF. This follows immediately from Theorem 3.1, which shows that x̄g is an upper bound for G = {x : g(x) ≤ 1}, and x̄f ∈ G because f(x̄f) = 1 implies g(x̄f) ≤ 1 by dominance.

REMARK 3.5. In an optimization problem we can omit all nondominant constraints from consideration. We shall now return to inequalities with several

variables. In Proposition 3.1 we obtained the existence of maximal elements for such inequalities. We shall now consider bounds in a different sense.

DEFINITION 3.2. An inequality f(x) ≤ 1 is said to be partially simple wrt xi, if and only if f(x1,...,xn) can be written as xi g(x1,...,xi−1, xi+1,...,xn).

The following theorem shows how to simplify the constraint set using the fact that a tight multivariable inequality represents a hypersurface, so that there are infinitely many solutions of the corresponding equation. For brevity, we shall consider a function of two variables f(x,y).

THEOREM 3.3. Let f(x+, y+) and f(x0, y0) = 1. Then, for any (x,y) such that f(x,y) ≤ 1, we have x > x0 implies y < y0 and y > y0 implies x < x0.

PROOF. Let (x,y) be such that f(x,y) ≤ 1 and let x > x0. If possible let y ≥ y0. Then, since f is increasing wrt both x and y, we have f(x,y) > f(x0,y) ≥ f(x0,y0) = 1, a contradiction. Thus, y < y0. Similar arguments show that y > y0 implies x < x0.

REMARK 3.6. Using the above theorem, a simple inequality of the form x ≤ x0 can be combined with another monotonic inequality of the form f(x−, y−) ≤ 1 to yield another simple inequality. More precisely, let f(x,y) be monotone decreasing wrt x and y and let f(x,y) ≤ 1 for all x > 0, y > 0. Let there exist some x0 > 0, y0 > 0 such that f(x0, y0) = 1. Then, the pair of inequalities x ≤ x0 and f(x,y) ≤ 1 is equivalent to the pair x x0^(−1) ≤ 1 and y0 y^(−1) ≤

1. Similarly, if f(x−, y+) ≤ 1 and x ≤ x0 with f(x0, y0) = 1, then y ≤ y0. In fact, if y > y0 then f(x,y) > f(x,y0) ≥ f(x0,y0) = 1, a contradiction. These observations may help significantly in the simplification of a system of nonlinear monotone inequalities [5].

REMARK 3.7. If a nonredundant inequality is written in two different equivalent ways, such as f(x) ≤ 1 and g(x) ≤ 1 with f and g both monotone (and dominant over each other), then f and g must have the same monotonicity. In fact, if x̄f and x̄g are such that f(x̄f) = 1, g(x̄g) = 1 and f(x+), then f(x) ≤ 1 is equivalent to x ≤ x̄f, while if g were decreasing wrt x, then g(x) ≤ 1 would be equivalent to x ≥ x̄g. Since x ≤ x̄f and x ≥ x̄g cannot be equivalent, we conclude that g must be increasing if f is increasing. Similarly for f, g decreasing.

REMARK 3.8. Monotonicity of normalized inequalities is affected by the algebraic manipulations that convert the original inequality to a normalized one. Given two nonredundant equivalent inequalities f(x) ≤ 1 and g(x) ≤ 1, the fact that f is monotone does not imply that g has to be monotone. However, if both are monotone, then by Remark 3.7 they must be monotonic in the same sense. This observation has direct implications for the modeling of constraints that arise from physical considerations in design optimization problems. Careless algebraic manipulations may lead to models with lack of monotonicity or even with non-simply-connected feasible domains, while the original physics of the problem indicates otherwise.
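As a computational illustration of Theorems 3.1 and 3.2, the reduction of increasing inequalities to simple bounds can be sketched numerically. The following Python fragment is only illustrative (the helper tight_point and the three sample constraints are assumptions, not part of the analysis above): it finds the tight point of each nonredundant increasing inequality by bisection and then selects the constraint with the smallest root as the candidate active one.

```python
import math

def tight_point(f, lo, hi, tol=1e-10):
    # Bisection solution of f(x) = 1 for f increasing on [lo, hi].
    # By Theorem 3.1, the nonredundant increasing inequality f(x) <= 1
    # is then equivalent to the simple bound x <= tight_point(f, lo, hi).
    assert f(lo) < 1.0 < f(hi), "bracket must contain the tight point"
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Three increasing constraints in normalized form f_i(x) <= 1.
constraints = [
    lambda x: x / 2.0,            # tight at x = 2
    lambda x: x / 3.0,            # tight at x = 3
    lambda x: math.exp(x) / 3.0,  # tight at x = ln 3
]
roots = [tight_point(f, 1e-9, 10.0) for f in constraints]

# Theorem 3.2: the candidate active constraint is the one whose tight
# equation f_i(x) = 1 has the smallest root; that root is the least
# upper bound of the feasible set.
active = min(range(len(roots)), key=roots.__getitem__)
```

Here the roots are 2, 3 and ln 3 ≈ 1.0986, so the third constraint is the candidate active one; in the sense of Lemma 3.1 it is dominant over the other two, which are inactive.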

The next theorem is useful when an active inequality constraint that cannot be solved explicitly for a particular variable is to be used for eliminating this variable from the problem while retaining the existing monotonicities.

THEOREM 3.4. Every monotone nonredundant inequality can be transformed into a partially simple one wrt each monotonic variable, with the original monotonicities preserved.

PROOF. For convenience of notation, let us consider an inequality with only two variables, such as f(x+, y+) ≤ 1. We need to show that f(x+, y+) ≤ 1 is equivalent to x g(y+) ≤ 1. We assume as usual that the domain of f is {(x,y) : x > 0, y > 0}. Since f(x,y) ≤ 1 is assumed to be nonredundant, there exist x0 > 0, y0 > 0 such that f(x0, y0) = 1. Then by Theorem 2.1, there exists a function φ(y−) such that x0 (φ(y0))^(−1) = 1. Since x0 > 0, obviously φ(y0) > 0. Let us now consider x > 0, y > 0 with f(x,y) ≤ 1 and the product x (φ(y))^(−1). Since f is increasing wrt x and y and f(x0,y0) = 1 while f(x,y) ≤ 1, we must have x ≤ x0 and y ≤ y0. Since φ(y−), we get φ(y) ≥ φ(y0) > 0 and (φ(y))^(−1) ≤ (φ(y0))^(−1), so that x (φ(y))^(−1) ≤ x0 (φ(y0))^(−1) = 1. It suffices to take g(y) = (φ(y))^(−1) to have g(y+) and complete the proof.

4. Necessary Conditions at Optimality. In the present section we shall obtain two necessary conditions for mathematical programs that possess certain monotonicity properties. Principles exploiting systematically the monotone

properties of functions in the present context were first developed by Wilde [2,3]. The major thrust here is that we do not assume continuity or differentiability of the constraint functions, and thus the results are valid for discrete problems as well, where the independent variables may take discrete values, a situation very common in engineering design. Let us pose the following nonlinear monotone programming problem in normalized form:

Problem P1
minimize f0(x), x ∈ X ⊂ R+^n
subject to fi(x) ≤ 1, i = 1,...,m

where f0, f1,...,fm are all coordinatewise monotone and where R+^n denotes the strictly positive orthant of the n-dimensional Euclidean space, so that for x ∈ X we have x = (x1,...,xn) and xi > 0 for 1 ≤ i ≤ n. In the theorems below we assume that Problem P1 is feasible and that it has an optimal solution.

THEOREM 4.1. Let all the constraints in Problem P1 be nonredundant. If in Problem P1 the objective function f0 is monotone wrt xi, then there exists at least one active constraint with opposite monotonicity to f0 wrt xi.

PROOF. For convenience of notation, let i = 1 and assume that f0 is decreasing wrt x1. Then the value of f0 can be decreased indefinitely for increasing values of x1, unless x1 is bounded above, and this upper bound can be found only if there is an index j with fj increasing wrt x1, so that fj(x) ≤ 1 would lead to an upper bound on x1 of the form x1

≤ x̄1 (see Remark 3.1). If on the other hand f0 is increasing wrt x1, then there is some j, 1 ≤ j ≤ m, such that fj is decreasing wrt x1. Otherwise all fj are increasing wrt x1 and thus fj(x) ≤ 1 leads to an upper bound on x1 for each j, so that we obtain zero to be the only lower bound for x1, and f0 can be lowered indefinitely by taking x1 closer and closer to zero. Since no vector with a zero component can be a valid optimal solution, we get a contradiction to the existence of an optimal solution. We conclude that there is at least one fj which is decreasing wrt x1.

REMARK 4.1. The above theorem can be applied to any program that has an objective monotonic wrt one or more variables, in order to test whether a bounded (optimal) solution exists. The presence of a nonmonotonic constraint would generally ensure boundedness.

COROLLARY 4.1. If the objective f0 is monotone wrt xi, then the dominant inequality of the set of constraints with opposite monotonicity is active.

PROOF. The dominant constraint provides the glb or lub on xi (depending on whether the constraint is decreasing or increasing).

Now we pose the following Mayer-type problem in normalized form:

Problem P2
minimize f0(x), x ∈ X ⊂ R+^n
subject to fi(x,u) ≤ 1, i = 1,...,p

where x = (x1,...,xn), xi > 0 for 1 ≤ i ≤ n, and u =

(u1,...,um), uj > 0 for 1 ≤ j ≤ m.

THEOREM 4.2. Let there be some j, 1 ≤ j ≤ m, such that fi(x,u) is monotone wrt uj for all i. Let us assume that Problem P2 has a solution (x̄, ū). Then there exists an optimal solution (x̄, u*) of Problem P2 such that the following holds: Either all the constraints fi(x̄, u*) ≤ 1 are inactive, or there is a pair of constraints with opposite monotonicities wrt uj, both of which are active at (x̄, u*).

PROOF. (i) Let us assume that fi(uj+) for all i. (The argument is the same for fi(uj−) for all i.) Suppose (x̄, ū) is an optimal solution for which fi0 ≤ 1 is active, so that fi0(x̄, ū) = 1 for some i0. Then we choose u* such that 0 < uj* < ūj and ui* = ūi for i ≠ j. Since fi(uj+) for all i, we get fi(x̄, u*) < fi(x̄, ū) ≤ 1 for all i, and the value of the objective f0(x̄), being independent of u, is the same for (x̄, ū) and (x̄, u*). Thus, all constraints are inactive at (x̄, u*), which is also optimal. (ii) By renumbering if necessary, let us assume that f1,...,fk are decreasing wrt uj and fk+1,...,fp are increasing wrt uj, for 1 ≤ k < p. Let fi(x,u) ≤ 1 be equivalent to uj ≥ φij for 1 ≤ i ≤ k and to uj ≤ φij for k+1 ≤ i ≤ p. In fact, φij = φi(x, u1,...,uj−1, uj+1,...,um). Let us also assume, without loss of generality, that φ1j ≥ φij for 1 ≤ i ≤ k and φpj ≤ φij for k+1 ≤ i ≤ p. Then the above inequalities imply φ1j ≤ uj ≤ φpj. Let us now consider two cases. Case (a): φ1j < φpj. Then

there is some uj* such that φ1j < uj* < φpj. We choose u* so that ui* = ūi for i ≠ j. Then (x̄, u*) is optimal and at this point all fi are inactive. Case (b): φ1j = φpj. In this case the constraints f1 ≤ 1 and fp ≤ 1 are both active at (x̄, ū) and they have opposite monotonicities wrt uj.

REMARK 4.2. In Theorem 4.2 no assumption of differentiability was made. If the functions fi, 0 ≤ i ≤ p, are differentiable and the Kuhn-Tucker conditions are applicable, then we can say that for every optimal solution (x̄, u*) the conclusion of the theorem holds. Related results can be found in Wilde [2,3] for single-term continuous, differentiable functions and in Papalambros [7] for posynomials.

REMARK 4.3. In view of Theorem 4.2 we can simplify the process of locating optimal solutions, if any one solution would be acceptable. For example, in the case where there exists a control variable (that is, a variable wrt which the objective is independent) such that all constraints involving this variable are of the same monotonicity wrt it, all these constraints can be ignored.

5. Suggested Procedure. There are certain steps suggested by the discussion in the previous sections which can be taken in the study of programs in general. They are particularly important for problems arising from physical applications, where the adequacy and completeness of the optimization model are not known with certainty. Here we

give some suggestions that have been found very useful in practical applications.

1. Since a given inequality may be equivalent to several normalized forms, we choose, when possible, that form where the corresponding function is monotone. Thus, for example, if f1(x) ≤ 1 and f2(x) ≤ 1 are equivalent (dominant over each other) and if f1(x) is monotone wrt more coordinates than f2(x), then we prefer the representation f1(x) ≤ 1.

2. We look for simple inequalities of the form xi ai^(−1) ≤ 1 and use these to simplify, when possible, other monotone inequalities, in the manner described in Section 3. Thus, for example, if x a^(−1) ≤ 1 and f1(x−, y−) ≤ 1, then we can replace f1(x,y) ≤ 1 by a simple inequality of the form b y^(−1) ≤ 1.

3. Negative variables should be replaced by new variables with opposite sign. In a problem where variables may take both positive and negative values, we may transform them into strictly positive ones by a change of datum. When this is not possible, we can consider several subproblems in each of which the variables all keep the same sign. The globally optimal solution can then be found by comparing the solutions of the subproblems. This case decomposition can be applied also to problems with nonmonotonic functions, by examining subproblems defined over regions of the feasible space where monotonicities may be determined [6].

6. Examples. In this section we apply the ideas of monotonicity analysis to some simple examples of the type used to test numerical algorithms. Applications to

engineering design problems can be found in the references given in the introduction.

EXAMPLE 1 (F. J. Gould [18] in Lootsma [16]). The objective function is nonconvex and the constraint region is a narrow, half-moon shaped valley of the type described in Fletcher and McCann [17]. The normalized problem is

minimize f(x1, x2) = (x1 − 10)^3 + (x2 − 20)^3
subject to
R1(x1−) = 13 x1^(−1) ≤ 1
R2(x1, x2) = 100[(x1 − 5)^2 + (x2 − 5)^2]^(−1) ≤ 1
R3(x1, x2) = (1/82.81)[(x1 − 6)^2 + (x2 − 5)^2] ≤ 1

with x2 ≥ 0. To determine a smaller feasible domain we observe that

(x1 − 6)^2 + (x2 − 5)^2 = (x1 − 5)^2 + (x2 − 5)^2 + (11 − 2 x1)

and from constraints R2 and R3 we see that

100 ≤ (x1 − 5)^2 + (x2 − 5)^2 ≤ 82.81 − (11 − 2 x1)

or

x1 ≥ 14.095.   (3)

Since any feasible solution must satisfy (3), constraint R1 is redundant. From constraint R3 we have

(x2 − 5)^2 ≤ 82.81 − |x1 − 6|^2 ≤ 82.81 − |14.095 − 6|^2,

since the right-hand side is strictly decreasing wrt x1 (this being true because of (3)). Therefore

0.8429 ≤ x2 ≤ 9.1569.   (4)

To determine the monotonicities we observe that a difficulty arises because the square terms in parentheses in the problem statement can be either increasing or decreasing, depending on the sign of the quantity in

parentheses. But since x1 ≥ 14.095, all terms containing x1 are positive. This is not true for x2, so we circumvent this difficulty by decomposing the problem into Case A with 0.8429 ≤ x2 ≤ 5 and Case B with 5 ≤ x2 ≤ 9.1569.

Case A. We rewrite the problem as follows:

minimize f(x1+, x2+) = |x1 − 10|^3 − |20 − x2|^3
subject to
R2(x1−, x2+) = 100[|x1 − 5|^2 + |5 − x2|^2]^(−1) ≤ 1
R3(x1+, x2−) = (1/82.81)[|x1 − 6|^2 + |5 − x2|^2] ≤ 1

and the allowable range

14.095 ≤ x1 and 0.8429 ≤ x2 ≤ 5.   (5)

The objective is increasing wrt both x1 and x2. Constraint R2, being the only one with opposite monotonicity wrt x1, must be active. Then R2(x1, x2) = 1 implies x1 = φ2(x2+), and thus elimination of x1 from the problem gives an objective function increasing wrt x2. Then R3 must be active (whether it is monotonic or not) if the problem has a solution. Thus the optimum is given by the simultaneous activity of R2 and R3 and is

f* = −6961.8, x1* = 14.095, x2* = 0.8429.   (6)

We note that the allowable range (5) is equivalent to the two constraints R2, R3, and for f(x1+, x2+) the minimum is at the lower bounds for both x1 and x2.

Case B. The problem is the same as in Case A but with 5 ≤ x2 ≤ 9.1569. Again R2 must be active, but R3 cannot be simultaneously active, since the solution (6) resulting from the simultaneous activity of R2 and R3 gives x2 < 5. We observe that now R2(x1−, x2−) = 1 implies x2 = φ2(x1−), so the objective may become

This, however, is irrelevant, since the greatest lower bound on any solution in Case B is

    f_lb = (14.095 - 10)^3 + (5 - 20)^3 = -3306.3.            (7)

Clearly, since f_lb > f*, no better solution than (6) can be obtained in Case B, and the global optimum is (6).

EXAMPLE 2 (Collatz and Wetterling, cited by Eckhardt [19] in Lootsma [16]). The problem in normalized form is as follows:

    minimize f(x1^-, x2^-) = -2 x1 (24 - x1) - x2 (40 - x2)

subject to

    R1(x1^+, x2^+) :  0.125 x1 + 0.125 x2 ≤ 1
    R2(x1^+) :  0.1667 x1 ≤ 1
    R3(x1^+, x2^+) :  0.0556 x1 + 0.1667 x2 ≤ 1

and x1 ≥ 0, x2 ≥ 0. We note that since f is decreasing wrt both x1 and x2, the optimum would have x1 > 0, x2 > 0 unless x1 = x2 = 0 were the only feasible solution, which is clearly not true. Next we observe that, because of the monotonicity of f wrt x2, at least one of the constraints R1 and R3 must be active. To obtain a dominance condition we observe that

    R3 ≡ (0.125 x1 + 0.125 x2) + (0.0417 x2 - 0.0694 x1)
       = R1 + (0.0417 x2 - 0.0694 x1) = R1 + d                (8)

so that if d ≤ 0, R1 is dominant and if d ≥ 0, R3 is dominant. This leads to two cases:

Case A. The condition d ≤ 0 holds and corresponds to an added constraint R̄ : 0.6 x2 x1^-1 ≤ 1, while R3 is redundant. Since R1 is active, it can be used to eliminate x2, and the resulting problem is

    minimize f = 3 x1^2 - 24 x1 - 256

subject to

    R2 :  (1/6) x1 ≤ 1
    R̄  :  3 x1^-1 ≤ 1
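The elimination above can be double-checked numerically; in this sketch (function names are ours) we verify that substituting x2 = 8 - x1 into f reproduces the reduced quadratic, and that the added condition d ≤ 0 becomes x1 ≥ 3 up to the rounding already present in the coefficients 0.0417 and 0.0694:

```python
# Example 2, Case A: R1 active gives x2 = 8 - x1.
def f(x1, x2):
    return -2.0 * x1 * (24.0 - x1) - x2 * (40.0 - x2)

def f_reduced(x1):
    return 3.0 * x1 ** 2 - 24.0 * x1 - 256.0

# Substitution reproduces the reduced objective on sample points:
for x1 in [3.0, 4.0, 5.0, 6.0]:
    assert abs(f(x1, 8.0 - x1) - f_reduced(x1)) < 1e-9

# d <= 0 with x2 = 8 - x1:  0.0417*(8 - x1) <= 0.0694*x1, i.e. x1 >= 3
# (exactly 3 with the unrounded coefficients 1/24 and 5/72):
x1_cut = 0.0417 * 8.0 / (0.0417 + 0.0694)
print(round(x1_cut, 2))   # 3.0
```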

This problem has an interior minimum with R2 and R̄ inactive, i.e.

    f* = -304,  x1* = 4,  x2* = 4.                            (9)

Case B. The condition d ≥ 0 holds, corresponding to an added constraint

    R̄1 :  1.667 x1 x2^-1 ≤ 1                                  (10)

while R1 is redundant and R3 is active. Simultaneous activity of R3 and R2 leads to a violation of R̄1, while simultaneous activity of R3 and R̄1 leads to x1 = 3 and x2 = 5, giving f = -301. Thus the global optimum is given by (9). The minimum reported in the original reference above is f = -292, x1 = 6, x2 = 2. This is in fact a local (boundary) minimum for Case A above.

EXAMPLE 3 (Op. cit. [16,19]). The normalized problem is

    minimize f(x1^-) = -x1

subject to

    R1(x1^+, x2^-) :  x2^-1 exp(x1) ≤ 1
    R2(x2^+, x3^-) :  x3^-1 exp(x2) ≤ 1
    R3(x3^+) :  0.1 x3 ≤ 1

and x1, x2, x3 ≥ 0. Again we observe that only strictly positive values of the variables need be considered. Successive application of the two theorems in Section 4 shows that all constraints must be active and the optimum is given by

    x3* = 10
    x2* = ln x3* = ln 10 = 2.303                              (11)
    x1* = ln x2* = ln 2.303 = 0.834

The same problem is given also with the objective (min) f = 0.2 x3 - 0.8 x1 and the same constraints. This is more difficult. Constraints R1 and R2 are again active and the objective becomes

    f = 0.2 (x3 - 4 ln(ln x3))

which is nonmonotonic with an interior minimum. In fact, for x3 = 10 we get x3 - 4 ln(ln x3) = 6.664, while for x3 ≈ 2 it is only 3.466, so R3 is definitely inactive. The optimal value of x3 can be found by a one-dimensional search. Here monotonicity cannot solve the entire problem but reduces it to a simpler one.

EXAMPLE 4 (Pierre [20]). Now we examine a problem where the variables are unrestricted in sign. In such cases we may divide the entire space into regions, apply the theorems in each region, and at the end compare the different extrema obtained in the regions. It may occur, as in the present example, that in some region no extremum exists; in such a case that region does not contribute in locating the global optimum. The problem is to find the maximum of f on a sphere, subject to remaining on one side of a plane that passes through the sphere:

    maximize f = x2

subject to

    R1 :  x1^2 + x2^2 + x3^2 = 1
    R2 :  2 x2 - x1 ≤ 1

where x1, x2, x3 are coordinates with the origin coinciding

with the center of the sphere. We note that in dealing with equalities, such as R1 above, there are two specific considerations: (i) whether the constraint can be automatically satisfied by a judicious choice of the value of a free (control) variable which is not constrained in any other way; (ii) whether the equality can be replaced by an inequality, with the understanding that the (new) inequality will have to be active at the optimum, the direction of the inequality being determined by monotonicity rules. To illustrate the above point in the present example, we observe that the only constraint on x3 is R1, and thus for given x1, x2 we can always satisfy R1 by choosing

    x3 = ±(1 - x1^2 - x2^2)^(1/2)

(clearly it is always x1^2 + x2^2 ≤ 1). Thus the problem can be simplified by deleting R1 and replacing it by R̄1 as follows:

    maximize f = x2

subject to

    R̄1 :  x1^2 + x2^2 ≤ 1
    R2 :  2 x2 - x1 ≤ 1

In this new problem at least one constraint must be active because the objective is monotonic wrt x2. Moreover, since the objective is independent of x1, there must be two active constraints with opposite monotonicities wrt x1. Thus both R̄1 and R2 are active. Simultaneous solution of x1^2 + x2^2 = 1 and 2 x2 - x1 = 1 gives (x1, x2) = (3/5, 4/5) or (-1, 0). Obviously, (3/5, 4/5) yields the larger value for the objective, and the optimal solution is then (x1, x2, x3) = (3/5, 4/5, 0).
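The simultaneous solution can be verified directly (a sketch; names ours): substituting x1 = 2*x2 - 1 into the circle gives a quadratic in x2.

```python
# Active constraints: x1^2 + x2^2 = 1 and 2*x2 - x1 = 1.
# Substituting x1 = 2*x2 - 1 gives 5*x2^2 - 4*x2 = 0, so x2 = 0 or 4/5.
candidates = [(2.0 * x2 - 1.0, x2) for x2 in (0.0, 4.0 / 5.0)]

for x1, x2 in candidates:                   # both points lie on the circle
    assert abs(x1 ** 2 + x2 ** 2 - 1.0) < 1e-12

best = max(candidates, key=lambda p: p[1])  # f = x2 is being maximized
print(best)
```

The two candidates are (-1, 0) and (3/5, 4/5); the latter wins, in agreement with the text.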

Note that we obtained the value of x3 as (1 - x1^2 - x2^2)^(1/2) = 0.

The monotonicity rules have been applied above without regard to the fact that x1, x2, x3 are unrestricted in sign (while the theorems assumed nonnegativity of the variables). However, if we consider the above problem divided into several cases, then in each case except one we would observe that there is no extremum. In fact, let us consider first three cases: (i) x3 = 0, (ii) x3 > 0, (iii) x3 < 0.

In Case (i), with x3 = 0, we have x1^2 + x2^2 = 1. Thus, if x1 > 0, then x2 ≤ (1 + x1)/2 implies that for fixed x1 the maximum of the objective f is at x2 = (1 + x1)/2, so that x1^2 + x2^2 = 1 gives

    x1^2 + [(1 + x1)/2]^2 = 1,  or  5 x1^2 + 2 x1 - 3 = 0,

or x1 = 3/5 or -1. Since x1 > 0, the optimal solution in the region x1 > 0, x3 = 0 is (3/5, 4/5, 0). In the region x1 < 0, x3 = 0, since x2 ≤ (1 + x1)/2 implies x2 < 1/2, x2 can be made arbitrarily close to 1/2 and the supremum, namely 1/2, is never attained. Thus in Case (i) the only optimal solution is (3/5, 4/5, 0).

Next we consider Case (ii) with x3 > 0. We can apply the theorems as stated. Since the objective is independent of x3, we can always satisfy R1; for example, by choosing x3 = +(1 - x1^2 - x2^2)^(1/2). Thus the problem simplifies to maximizing x2 subject to 2 x2 - x1 ≤ 1 and the inactive x1^2 + x2^2 < 1. We examine the subcase where x1 > 0, x2 > 0. Applying monotonicity arguments we see that since f is independent of x1 and the second constraint is inactive, so should be the first constraint,

i.e., 2 x2 - x1 < 1. But a monotone objective with no active constraints cannot have an optimum. So no extremum exists in the subcase x3 > 0, x1 > 0, x2 > 0. All other cases, with x1 < 0, x2 < 0, or x3 < 0, can be handled in entirely the same manner, yielding no optimal solution. We conclude that the global optimum is the one found in Case (i).

Acknowledgement. The work of the first author was partially supported by the National Science Foundation Grant CME-80-06687 at the University of Michigan. This support is gratefully acknowledged.

References

[1] Wilde, D. J. (1978). Globally Optimal Design. Wiley-Interscience, New York.

[2] Wilde, D. J. (1975). Monotonicity and Dominance in Hydraulic Cylinder Design. Trans. ASME, J. Eng. Ind., Ser. B, 97, 4, pp. 1390-1394.

[3] Wilde, D. J. (1976). The Monotonicity Table in Optimal Engineering Design. Engineering Optimization, 2, 1, pp. 29-34.

[4] Papalambros, P. (1979). Monotonicity Analysis in Engineering Design Optimization. Unpublished Ph.D. Dissertation, Design Division, Stanford University.

[5] Papalambros, P. and Wilde, D. J. (1979). Global Non-Iterative Design Optimization Using Monotonicity Analysis. Trans. ASME, J. Mech. Design, 101, 4, pp. 645-649.

[6] Papalambros, P. and Wilde, D. J. (1980). Regional Monotonicity in Optimum Design. Trans. ASME, J. Mech. Design, 102, 3, pp. 497-500.

[7] Papalambros, P. (1981). Monotonicity in Goal and Geometric Programming. Trans. ASME, J. Mech. Design (to appear).

[8] Papalambros, P. and Bernitsas, M. M. (1981). Monotonicity Analysis in Optimum Design of Marine Riser. Trans. ASME, J. Mech. Design (to appear).

[9] Kontaratos, M. and Papalambros, P. (1981). Optimum Preliminary Design of a Non-Terminal Railroad Station. Progress in Engineering Optimization 1981, Design Division, ASME, New York (to appear).

[10] Mangasarian, O. L. (1970). Convexity, Pseudo-Convexity and Quasiconvexity of Composite Functions. Cahiers du Centre d'Etude de Rech. Oper., 12, pp. 114-122.

[11] Avriel, M. (1976). Nonlinear Programming. Prentice-Hall, Englewood Cliffs, N.J.

[12] Minty, G. (1962). Monotone (Nonlinear) Operators in Hilbert Space. Duke Math. J., 29, pp. 341-346.

[13] Rockafellar, R. T. (1970). On the Maximal Monotonicity of Subdifferential Mappings. Pacific J. Math., 33, pp. 209-216.

[14] Mickle, M. H. and Sze, T. W. (1972). Optimization in Systems Engineering. International Textbook, Penn.

[15] Shefi, A. (1969). Reduction of Linear Inequality Constraints and Determination of All Feasible Extreme Points. Unpublished Ph.D. Dissertation, Department of Engineering-Economic Systems, Stanford University.

[16] Lootsma, F. A., ed. (1972). Numerical Methods for Nonlinear Optimization. Academic Press, London.

[17] Fletcher, R. and McCann, A. P. (1969). Acceleration Techniques for Nonlinear Programming, in Optimization (R. Fletcher, ed.). Academic Press, London, p. 213.

[18] Gould, F. J. (1972). Nonlinear Tolerance Programming, in [16], p. 349.

[19] Eckhardt, U. (1972). Pseudo-Complementary Algorithms for Mathematical Programming, in [16], p. 301.

[20] Pierre, D. A. and Lowe, M. J. (1975). Mathematical Programming Via Augmented Lagrangians. Addison-Wesley, Reading, Mass.