A New Interior Variant of the Gradient Projection Method for Linear Programming

Katta G. Murty*
Department of Industrial and Operations Engineering
University of Michigan, Ann Arbor, MI 48109-2117, USA

Technical Paper 85-18, May 1985

*Research partially supported by NSF Grant No. ECS-8401081

ABSTRACT: This is a preliminary paper describing a new interior point variant of the gradient projection method for linear programming. The method is initiated with an interior point of the set of feasible solutions, K, and always moves along points in the interior of K. Appropriate starting procedures to apply when an initial interior point of the set of feasible solutions is not known at the beginning are discussed. The method terminates after at most a finite number of cycles. Each cycle consists of at most n steps (n is the number of variables in the problem). In each cycle the method makes a single move from an interior point of K, in the steepest descent direction at that point allowing a move of sufficient length within K (this direction is computed during the cycle), to the interior point of K just short of the boundary of K in this direction. In a cycle, each step computes a tentative steepest descent direction and tests it; if a move of sufficient length cannot be made in that direction within K, the method moves to the next step, and so on. If a direction for the move is not selected in a cycle after n steps, it is an indication that the current interior point of K is close to an extreme point optimum solution of the problem, and then a well-known subroutine can be used to move to an optimum extreme point. The method can also detect unboundedness of the objective function on K, and with suitable starting schemes it can also detect infeasibility of the system of constraints. Preliminary computational testing with the method has given very encouraging results. The method has a nice gravitational interpretation and is expected to perform very well.

KEY WORDS: Linear programming, gradient projection, steepest descent direction, interior point, moves of sufficient length.

1. INTRODUCTION

We consider the linear program (LP) in the following form

    minimize   z(x) = cx
    subject to Ax >= b                                              (1)

where A is of order m x n. Sign restrictions on the variables and any other lower or upper bound conditions on the variables are all included in the system of constraints in (1). Clearly, every LP can be put in this form by the well-known transformations discussed in most linear programming textbooks. We assume that all the data in the problem is integer. We let K denote the set of feasible solutions of (1). We further assume that K has a nonempty interior, that a point x^0 in the interior of K is available (that is, x^0 satisfies Ax^0 > b), and that K has at least one extreme point. In the next section, we discuss how any LP with integer data can be transformed into another in which all these assumptions are satisfied.

If D is any matrix, we denote its ith row by D_i. and its jth column by D._j. If F is a set, |F| denotes its cardinality.

2. TRANSFORMATION OF AN LP TO THE CANONICAL FORM

These transformations are similar to the ones used by N. Karmarkar for his algorithm ([4]; see also [7]). An arbitrary LP in minimization form may have three possible outcomes: infeasibility (the constraints have no feasible solution); feasibility with the objective function unbounded below on the set of feasible solutions; or feasibility with an optimum solution. The methods discussed below are mathematical procedures which transform an arbitrary LP into an equivalent one for which the set of feasible solutions is nonempty with a nonempty interior and a known interior feasible solution, and in which the optimum objective value is known to be zero.

Consider the LP in the symmetric form

    minimize   hX
    subject to EX >= p                                              (2)
               X >= 0

Let π denote the row vector of dual variables associated with (2). Solving (2) is equivalent to solving the following system of linear inequalities in (X, π):

    hX - πp <= 0
    EX >= p                                                         (3)
    πE <= h
    X, π >= 0

If (X, π) is feasible to (3), X is optimal to (2) and π is optimal to its dual, and vice versa. Conversely, if (3) is infeasible, (2) is either infeasible, or it is feasible and hX is unbounded below on its set of feasible solutions.
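
As a concrete illustration, the system (3) can be assembled mechanically from the data of (2). The following is a minimal sketch in Python, assuming E, p, h are given as NumPy arrays (E two-dimensional, p and h one-dimensional); the function name build_optimality_system and the stacking of the variables as (X, π) are illustrative choices, not from the paper.

```python
import numpy as np

def build_optimality_system(E, p, h):
    """Assemble (3) as one inequality system G @ (X, pi) >= g.

    A feasible (X, pi) certifies optimality: X solves (2) and pi solves
    its dual, since hX - pi p <= 0 together with weak duality closes the
    duality gap.
    """
    r, s = E.shape
    G = np.block([
        [-h.reshape(1, s), p.reshape(1, r)],   # -(hX - pi p) >= 0
        [E, np.zeros((r, r))],                 # EX >= p
        [np.zeros((s, s)), -E.T],              # pi E <= h, written as -E^T pi >= -h
        [np.eye(s + r)],                       # X >= 0, pi >= 0
    ])
    g = np.concatenate([np.zeros(1), p, -h, np.zeros(s + r)])
    return G, g
```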

To solve (3), convert the top three systems of constraints in it into systems of equality constraints by introducing appropriate slack variables. Then introduce the necessary non-negative artificial variables and construct the corresponding Phase I problem (see [2, 6]). Let this Phase I problem be

    minimize   gu
    subject to Fu = d                                               (4)
               u >= 0

where the vector u includes all the variables X, π, the slack variables, and the artificial variables for the Phase I problem. Since (4) is a Phase I problem, it is definitely feasible and has an optimum objective value >= 0. Let v denote the row vector of dual variables associated with (4). Now consider the LP

    minimize   gu - vd
    subject to Fu = d                                               (5)
               vF <= g
               u >= 0

This LP (5) has an optimum solution with an optimum objective value of zero. If (u, v) is optimal to (5), u is optimal to the Phase I problem (4). If gu (the optimum Phase I objective value in (4)) is > 0, the original LP (2) has no optimum solution (it is either infeasible, or the objective value hX is unbounded below on its set of feasible solutions). If gu = 0, the X-portion of u is an optimum solution of the original problem (2).

The top set of equality constraints in (5) can be transformed into an equivalent system of linear inequalities by replacing each equation by a pair of opposing inequalities (a better method is to use each equality constraint to eliminate one variable of u from the problem, one after the other, which at the end leaves an equivalent system of linear inequalities in the remaining variables). Also, express v as v^+ - v^- where v^+, v^- >= 0. This transforms (5) into a problem of the following form

    minimize   γy
    subject to Hy >= q                                              (6)

where y is the vector consisting of all the remaining variables (v^+, v^-, and the remaining variables from u, if any of them are left after the eliminations). The system (6) includes the non-negativity constraints on the variables (y >= 0). So the optimum objective value in (6) is zero, and since all the variables in it are restricted to be >= 0, its set of feasible solutions has at least one extreme point.

Let y^0 > 0 be an arbitrary positive vector in the y-space. If Hy^0 > q, then (6) is in canonical form (1), satisfying all the assumptions made in Section 1 (the set of feasible solutions has at least one extreme point and a nonempty interior, and an interior feasible solution y^0 is available). If Hy^0 > q does not hold, introduce an artificial variable y_{N+1} and transform (6) into the following problem

    minimize   γy + γ_{N+1} y_{N+1}
    subject to Hy + e y_{N+1} >= q                                  (7)
               y_{N+1} >= 0

where e denotes the column vector of all 1s of appropriate dimension. Making y^0_{N+1} sufficiently large and positive guarantees that (y^0, y^0_{N+1}) is an interior feasible solution to (7). So, (7) is in canonical form (1) satisfying all the assumptions made in Section 1. In addition, we also know that (7) has an optimum solution with an optimum objective value of zero if γ_{N+1} is sufficiently large (assuming that all the data in (6) is integer, mathematically it is sufficient if γ_{N+1} > 2^s, where s is the size of H).

Even though the transformations discussed above are mathematically nice and leave the data in the finally transformed problem (with the exception of the penalty parameter γ_{N+1}) of the same order of magnitude as that of the original problem, they may not be practically viable since they increase the order of the problem.

If E in the original problem (2) is of order r x s, the final transformed problem (6) or (7) may turn out to be of order (9r+9s+8) x (4r+4s+3) or larger, which is computationally undesirable. So, these transformations are basically useful only for a mathematical analysis of our algorithm. In practice one can use simpler transformations. Suppose one is trying to solve an LP

    minimize   aξ
    subject to Dξ >= d                                              (8)

If there are no sign restrictions on the variables in ξ, express ξ = ξ^+ - ξ^- where ξ^+, ξ^- >= 0, and transform (8) into the following form

    minimize   a(ξ^+ - ξ^-) + Mη
    subject to D(ξ^+ - ξ^-) + eη >= d                               (9)
               ξ^+, ξ^-, η >= 0

where η is an artificial variable and M is a positive penalty parameter. Let ξ^{+0}, ξ^{-0} be strictly positive vectors, and choose η^0 > 0 such that (ξ^{+0}, ξ^{-0}, η^0) satisfies all the constraints in (9) as strict inequalities. Then (ξ^{+0}, ξ^{-0}, η^0) is an interior point feasible solution for (9), and since all the variables are constrained to be >= 0 and (9) is feasible, the set of feasible solutions of (9) has at least one extreme point. When M is sufficiently large, (9) is the big-M formulation of (8). And (9) is in canonical form (1) satisfying all the assumptions made in Section 1.
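
The following is a minimal sketch, in Python with NumPy, of this practical big-M transformation; the names big_m_form and interior_point, and the default value of M, are illustrative choices, not from the paper.

```python
import numpy as np

def big_m_form(D, d, a, big_M=1e6):
    """Return data (A, b, c) of the canonical form (1) for problem (9),
    with variables stacked as (xi_plus, xi_minus, eta)."""
    r, s = D.shape
    e = np.ones((r, 1))
    # Constraints: D(xi+ - xi-) + e*eta >= d, plus nonnegativity of all variables.
    A = np.block([
        [D, -D, e],
        [np.eye(2 * s + 1)],
    ])
    b = np.concatenate([d, np.zeros(2 * s + 1)])
    c = np.concatenate([a, -a, [big_M]])
    return A, b, c

def interior_point(D, d):
    """An interior feasible point of (9): arbitrary xi+, xi- > 0, with eta
    large enough that every constraint of (9) holds strictly."""
    r, s = D.shape
    xp = np.ones(s)
    xm = np.ones(s)
    eta = max(1.0, float(np.max(d - D @ (xp - xm))) + 1.0)
    return np.concatenate([xp, xm, [eta]])
```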

3. TO GET AN OPTIMUM SOLUTION FROM A NEAR OPTIMUM SOLUTION

Consider the LP (1) satisfying all the assumptions made in Section 1. Introducing the slack vector s, the constraints in (1) are of the form

    Ax - Is = b
    s >= 0

where I is the unit matrix of order m. Since K has at least one extreme point, the set of column vectors of A must be linearly independent. Let x̄ be any feasible solution with objective value cx̄ = z̄. Let s̄ = Ax̄ - b. Let P = {i : s̄_i > 0}. If Γ = {A._j : j = 1 to n} ∪ {I._i : i ∈ P} is linearly independent, x̄ is an extreme point of K. Otherwise let

    Σ_{j=1}^{n} β_j A._j + Σ_{i∈P} γ_i I._i = 0

be a linear dependence relation for Γ. Define γ_i = 0 for i ∉ P. Let (x(θ), s(θ)) = (x̄ + θβ, s̄ + θγ). Let θ_1 <= θ <= θ_2 be the range of values of θ which satisfies s̄ + θγ >= 0. Then θ_1 < 0, θ_2 > 0, and at least one of θ_1 or θ_2 is finite. If cβ > 0 and θ_1 = -∞, or if cβ < 0 and θ_2 = +∞, then clearly z(x) is unbounded below on K. Otherwise choose θ = θ_2 if cβ < 0, θ = θ_1 if cβ > 0, or θ = a finite quantity among θ_1, θ_2 if cβ = 0. Then x̂ = x̄ + θβ is a feasible solution for (1) with cx̂ <= cx̄, and ŝ = s̄ + θγ = Ax̂ - b has at least one less positive component than s̄.

The above procedure can be repeated with the feasible solution x̂. After at most m repetitions we either conclude that z(x) is unbounded below on K, or obtain an extreme point x̂ of K which satisfies z(x̂) <= z̄.

Assume that all the data in (1) is integer. Let L denote the size of (1), that is, the total number of digits in all the data in (1). If x̄ is a feasible solution for (1) whose objective value is sufficiently close to z*, the optimum objective value in (1) (that is, cx̄ is within 2^{-L} of z*), then the extreme point of K obtained by applying the above procedure beginning with x̄ will be an optimum solution for the LP, by the results proved in connection with the ellipsoid algorithm (see [1, 3, 5, 6]).

This follows because, since L is the size of (1), any extreme point x̂ of K satisfying z(x̂) < z* + 2^{-L} has to be an optimum solution of (1) by the results proved under the ellipsoid algorithm.

[Figure 1 (caption): If x̄ is near optimal, any extreme point x̂ of K satisfying z(x̂) <= z(x̄) will be optimal, whether the problem has a unique optimum solution or not. The figure shows K and the objective decreasing direction.]

Just like in other interior point algorithms [4], when a near optimal feasible solution is reached, we use the procedure described above to move to an extreme point of K with the same or better objective value, and this is guaranteed to give an optimum solution of the problem. A sketch of this purification procedure is given below.
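
The following is a simplified sketch of the purification procedure in Python with NumPy. A dependence relation for Γ exists exactly when the tight rows of A have a nontrivial null space, so the sketch searches that null space directly instead of building the relation explicitly; the names purify and null_space_dir and the tolerances are illustrative choices.

```python
import numpy as np

def null_space_dir(M, n, tol=1e-9):
    """Return a nonzero beta with M @ beta = 0, or None if rank(M) = n."""
    if M.shape[0] == 0:
        return np.eye(n)[0]             # no tight rows: every direction is free
    _, sv, Vt = np.linalg.svd(M)
    if np.sum(sv > tol) >= n:
        return None
    return Vt[-1]                       # right singular vector of a zero singular value

def purify(A, b, c, x, tol=1e-9):
    """Slide a feasible x to an extreme point of K = {x : Ax >= b} with
    cx no larger, or detect that z(x) is unbounded below on K."""
    n = A.shape[1]
    while True:
        s = A @ x - b                   # slacks; tight rows have s_i ~ 0
        beta = null_space_dir(A[s <= tol], n, tol)
        if beta is None:
            return x                    # tight rows have rank n: x is extreme
        if c @ beta > 0:
            beta = -beta                # never increase the objective
        rates = A @ beta
        block = rates < -tol            # rows whose slack shrinks along beta
        if not block.any():
            if c @ beta < -tol:
                raise ValueError("z(x) is unbounded below on K")
            beta, rates = -beta, -rates # c beta = 0: take the blocked direction
            block = rates < -tol
            assert block.any(), "K contains a line, contrary to the assumptions"
        theta = np.min(s[block] / -rates[block])
        x = x + theta * beta            # at least one more constraint becomes tight
```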

4. THE ALGORITHM

Consider the LP (1) again. As before, let K denote the set of feasible solutions and, assuming that all the data is integer, let L denote the size of (1). Let x^0 be an initial interior point of K. The algorithm goes through several cycles. Each cycle consists of at most n steps. The first cycle begins with the initial interior feasible solution x^0. Later cycles begin with the terminal point of the previous cycle. In each cycle the algorithm makes a single move, along the steepest descent feasible direction allowing within K a move of sufficient length from the initial point in the cycle. The algorithm remains in the interior of K throughout, and at termination it either determines that z(x) is unbounded below on K, or determines the set of active constraints in (1) at an optimum solution.

Let ε be a positive number sufficiently small. Mathematically, taking ε to satisfy 0 < ε < 2^{-L} will suffice. A good practical value for ε is to be determined from numerical experimentation. If c = 0, x^0 is an optimum solution of (1); terminate. So, we assume that c ≠ 0.

Given a half-line starting at an interior point x^r of K in the direction y, the constraints fall into three classes:

    J1(x^r, y) = {ith constraint: A_i. y >= 0}
    J2(x^r, y) = {ith constraint: A_i. y < 0, and λ_i = (A_i. x^r - b_i)/(-A_i. y) > ε}
    J3(x^r, y) = {ith constraint: A_i. y < 0, and λ_i = (A_i. x^r - b_i)/(-A_i. y) <= ε}

J1(x^r, y) is the set of all constraints in (1) that remain feasible as you move along this half-line indefinitely. The set of all constraints in (1) which are violated after you move a finite distance along this half-line is partitioned into two classes: J2(x^r, y) is the class of all these constraints for which this distance is > ε, and J3(x^r, y) is the class of all these constraints for which this distance is <= ε. See Figure 2 and the sketch following it.

[Figure 2: Constraints 1 to 8 define K. Moving from x^r along the half-line (dashed) in the direction y, the constraints are classified into the sets J1(x^r, y), J2(x^r, y), and J3(x^r, y).]
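
A minimal sketch of this classification in Python, assuming NumPy arrays A, b, x, y; the name classify_constraints and the returned λ array are illustrative choices.

```python
import numpy as np

def classify_constraints(A, b, x, y, eps):
    """Partition constraint indices into the classes J1, J2, J3 of Section 4
    for the half-line {x + lam * y : lam >= 0}."""
    rates = A @ y                       # A_i. y : rate of change of each slack
    slack = A @ x - b                   # positive, since x is interior
    lam = np.full(len(b), np.inf)
    neg = rates < 0
    lam[neg] = slack[neg] / -rates[neg] # distance at which row i becomes tight
    J1 = np.where(~neg)[0]              # never violated along the half-line
    J2 = np.where(neg & (lam > eps))[0] # violated only after a move of length > eps
    J3 = np.where(neg & (lam <= eps))[0]
    return J1, J2, J3, lam
```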

We will now describe a general cycle.

GENERAL CYCLE k+1

Let x^k, contained in the interior of K, be the initial point for this cycle.

STEP 1: Choose the direction for the move to be y^0 = -c^T/||c||. Let J_t^1 = J_t(x^k, y^0) for t = 1, 2, 3. We consider several cases separately.

CASE 1: J_2^1 ∪ J_3^1 = ∅. In this case z(x) is unbounded below on K, and the half-line {x^k + λy^0 : λ >= 0} is a feasible half-line along which z(x) → -∞. Terminate.

CASE 2: J_3^1 = ∅, J_2^1 ≠ ∅. Define

    λ = Minimum {λ_i = (A_i. x^k - b_i)/(-A_i. y^0) : i ∈ J_2^1}
    x^{k+1} = x^k + (λ - ε) y^0

Go to the next cycle.

CASE 3: J_3^1 ≠ ∅. Let T_1 = J_3^1. Let D_1 denote the matrix of order |T_1| x n whose rows are A_i. for i ∈ T_1. Let ξ^1 be the orthogonal projection of -c^T in the subspace {x : D_1 x = 0}. Assuming that the set of row vectors of D_1 is linearly independent (if this property does not hold, eliminate some row vectors from D_1 until it holds) we have

    ξ^1 = (I - D_1^T (D_1 D_1^T)^{-1} D_1)(-c^T)

If ξ^1 ≠ 0, let y^1 = ξ^1/||ξ^1||, and go to Step 2. If ξ^1 = 0, let the row vector μ = (μ_i : i ∈ T_1) = -((D_1 D_1^T)^{-1} D_1 (-c^T))^T. Then μ D_1 = c. In this case, if μ >= 0, define the row vector π = (π_i) by

    π_i = 0,    if i ∉ T_1
    π_i = μ_i,  if i ∈ T_1

Then π is feasible to the dual of (1), and from the definition of T_1 = J_3^1 and π it is clear that cx^k - πb < 2^{-L}, which implies that x^k is near optimal to the LP (1). In this case, an optimum solution for (1) can be obtained from x^k using the procedure discussed in Section 3. (At this stage we have a dual optimum solution and a near optimal feasible solution to the primal LP (1). If we had applied the same algorithm on the dual problem associated with the LP (1), at a corresponding stage in that algorithm we would have an optimum solution for the primal LP (1) and a near optimum dual solution directly.)

If ξ^1 = 0 and μ ≱ 0, delete the i corresponding to the most negative μ_i from the set T_1 (any of the other commonly used rules for deleting one or more of the constraints from T_1 associated with negative μ_i can be applied in this case, too) and repeat the projection process with the new set.

STEP r IN THE CYCLE: Let y^{r-1} be the direction determined in the previous step. Define J_t^r = J_t(x^k, y^{r-1}), t = 1, 2, 3. Cases 1 and 2 are exactly the same as under Step 1, with J_2^r, J_3^r replacing J_2^1, J_3^1, and y^{r-1} replacing y^0. You go to Case 3 if J_3^r ≠ ∅. In this case define the matrix D_r to be the matrix whose rows are A_i. for i ∈ T_r = T_{r-1} ∪ J_3^r. Let ξ^r be the orthogonal projection of -c^T in the subspace {x : D_r x = 0}. If ξ^r = 0, proceed in a way similar to that suggested under Case 3 of Step 1. If ξ^r ≠ 0, let y^r = ξ^r/||ξ^r||, and go to the next step.
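
The projection computation used in Case 3, and again in the later steps, can be sketched as follows in Python with NumPy, assuming the rows of D are linearly independent; the name projection_step is an illustrative choice, not from the paper.

```python
import numpy as np

def projection_step(D, c):
    """Project -c onto {x : D x = 0}. Returns (xi, mu): the projected
    direction and the multipliers mu, with mu @ D = c whenever xi = 0."""
    DDt_inv = np.linalg.inv(D @ D.T)
    lam = DDt_inv @ (D @ -c)            # coefficients of -c on the rows of D
    xi = -c - D.T @ lam                 # xi = (I - D^T (D D^T)^{-1} D)(-c)
    mu = -lam                           # mu @ D = c holds whenever xi = 0
    return xi, mu
```

If ξ is nonzero, the next tentative direction is ξ/||ξ||; if ξ = 0, the sign of μ decides between near optimality and deleting a row of D, as described above.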

DISCUSSION: The constraint set T_r used in defining the projection builds up in each step. If the cycle continues until D_r becomes square and nonsingular, the projection in the last step will be zero, and the solution of the system A_i. x = b_i, i ∈ T_r, at that stage is a basic feasible solution of the LP (1). At that stage, if μ = (μ_i : i ∈ T_r) = -((D_r D_r^T)^{-1} D_r (-c^T))^T is >= 0, that basic feasible solution is optimal to the LP (1). Otherwise, delete the i corresponding to the most negative μ_i from the set T_r, as discussed above, and continue the process.

The usual gradient projection method selects the direction y to move from x^k, moves all the way to the boundary, and continues in this way. Thus, it is clear that the direction of move chosen by our algorithm is different from that chosen by the usual gradient projection method. It can be verified that the direction of move chosen by our algorithm is the steepest descent direction at the current interior point feasible solution allowing a move of length > ε.

5. GRAVITATIONAL INTERPRETATION

Jeff Alden has provided a natural interpretation for the path followed by this algorithm. Suppose the boundary of K is an impermeable layer separating the inside of K from the outside. Also, suppose there is a powerful gravitational force inside K, pulling everything down in the direction of -c. Suppose a small spherical drop of liquid mercury (of dimension n) with diameter 2ε is released at the initial point x^0, an interior point of K. The path traced by this algorithm will be the same as the path of the center of this drop as it falls in free fall under the influence of gravity. Whenever the drop hits a face of K, it continues to slide downwards along the face in the direction which is the orthogonal projection of -c^T on this face. The drop will come to a sudden halt when it hits the bottom-most point of K in the direction of -c^T, if there is a unique point like that, or the face of K containing the bottom-most points of K. If there is no bottom-most point of K, after travelling along the faces of K for some time, the drop will continue to fall forever along a half-line in K on which z(x) → -∞.

In the gravitational experiment, after the first move in the direction chosen in the first cycle, the center of the drop always stays at a distance of exactly ε from the boundary.

In the algorithm, since we always move to within ε of the boundary in the direction of the move, the actual distance of the current point from the boundary may vary; it always lies strictly between 0 and ε.

6. LIKELY ADVANTAGES OF THIS METHOD OVER THE OTHERS

Since the algorithm always moves along the interior of the set of feasible solutions, the problems created by the combinatorial structure of the boundary are avoided. Being a boundary method, the simplex method (see [2]) can at best choose the steepest descent edge direction in each step. But this method always chooses the steepest descent feasible direction allowing a move of length > ε in each step. Thus a properly implemented package for this method is expected to give superior performance over packages for the simplex method.

Choosing the value of ε < 2^{-L} is necessary for mathematical proofs of convergence, but it is not possible to operate with such small numbers on today's digital computers. Practical values for ε have to be determined using numerical experiments. E. L. Lawler has suggested that a sequential strategy for ε may be best. In this, one starts with a fairly large value for ε. At termination, if an optimum basic feasible solution is not obtained, one continues again from the terminal interior point with a reduced value of ε. A sketch of this strategy is given below.
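
The following sketch of the sequential strategy assumes two hypothetical helper functions: run_cycles (the algorithm of Section 4, run to termination for a fixed ε) and purify_and_test (the procedure of Section 3 together with an optimality check). The schedule constants are illustrative choices.

```python
def solve_with_eps_schedule(A, b, c, x0, eps0=0.1, shrink=0.1, eps_min=1e-8):
    """Run the algorithm with a decreasing sequence of eps values,
    restarting each time from the terminal interior point."""
    eps, x = eps0, x0
    while eps >= eps_min:
        x = run_cycles(A, b, c, x, eps)          # hypothetical: Section 4 algorithm
        ok, x_bfs = purify_and_test(A, b, c, x)  # hypothetical: Section 3 procedure
        if ok:                                   # an optimum BFS was reached
            return x_bfs
        eps *= shrink                            # continue with a reduced eps
    return x
```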

This method also lends itself very well to the development of heuristic rules which help improve the practical efficiency of packages for this algorithm. The final set T_r used in a cycle is the index set of constraints in (1) which are treated as active constraints in the orthogonal projection defining the direction for movement chosen in that cycle. By looking at changes taking place in this set over the various cycles, one can make a guess at the set of active constraints at an optimum solution for (1), and thereby reduce the problem to a smaller one.

The theoretical worst case computational complexity of this algorithm is currently under investigation. Initial computational trials with the method are very encouraging. Our computational project to assess the practical efficiency of this algorithm is progressing, and the results will be reported in a later publication.

REFERENCES

[1] R. G. Bland, D. Goldfarb and M. J. Todd, "The ellipsoid method: A survey", Operations Research 29, 6 (November/December 1981), 1039-1091.

[2] G. B. Dantzig, Linear Programming and Extensions, Princeton University Press, Princeton, NJ, 1963.

[3] P. Gacs and L. Lovasz, "Khachiyan's algorithm for linear programming", Mathematical Programming Study 14 (1981), 61-68.

[4] N. K. Karmarkar, "A new polynomial-time algorithm for linear programming", Combinatorica 4, 4 (1984), 373-395.

[5] L. G. Khachiyan, "Polynomial algorithms in linear programming", Zhurnal Vychislitel'noi Matematiki i Matematicheskoi Fiziki (in Russian) 20, 1 (1980), 51-68.

[6] K. G. Murty, Linear Programming, Wiley, New York, 1983.

[7] K. G. Murty, Linear Complementarity, Linear and Nonlinear Programming, Heldermann Verlag, West Berlin, 1985.