On-line Solution of Linear Programs Using Sublinear Functions

JOHN R. BIRGE
Industrial and Operations Engineering
The University of Michigan

ROGER J-B. WETS
Department of Mathematics
University of California at Davis

Technical Report #86-25
July 1986

On-line Solution of Linear Programs Using Sublinear Functions

John R. Birge*
Industrial and Operations Engineering
University of Michigan
Ann Arbor, MI 48109

Roger J-B. Wets*
Mathematics
University of California
Davis, CA 95616

Abstract: When linear programs must be solved many times with changes only in the right-hand-side vector, approximation by sublinear functions can provide solutions very efficiently. This paper presents this sublinear approximation and its properties. The method may be especially useful if implemented on parallel processors.

Keywords: linear programming, sublinear functions, real-time optimization, parallel processing.

*This research has been partially supported by the National Science Foundation.

1. Introduction

Many uses of linear programming require the repeated solution of a single linear programming model with varying parameters. It is often important in these situations to solve the problems extremely quickly. The control of a ship or plane, for example, requires repeated optimization in real time. A specific example of this need for precise, quick solutions is the model of ship control in restricted waters in Cuong [1980] and Al-Idrisi [1981]. In their model, several unknown parameters that depend on the waters must be identified to determine the ship's course. These parameters should be continuously updated as the ship moves. Rudder controls are also continuously determined to maintain the ship on a desired course. The parameters and controls can be found by solving a sequence of linear programs to minimize absolute deviation from observed data and the desired course. Only the right-hand-side vector changes as observations are made on the ship's actual trajectory.

The goal of this paper is to present an efficient method for solving these sequences of linear programs with varying right-hand sides. We therefore wish to evaluate the function

    \psi(t) = \inf_{x \in \mathbb{R}^n} \{\, cx \mid Ax = t,\ x \ge 0 \,\},                (1.1)

where $t \in \mathbb{R}^m$ represents the right-hand-side parameters, which may vary. We present a method for quickly evaluating or approximating $\psi(t)$ based on sublinear approximations. The method has especially strong potential for applications with parallel computation. We demonstrate this applicability by showing that our method's running time is lower even than average-case claims for general linear programming algorithms.
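To fix ideas, the on-line setting can be mimicked by simply re-solving (1.1) with a general-purpose LP code for each new right-hand side. The following minimal sketch does so in Python; the data ($A$, $c$, and the right-hand sides $t$) are invented for illustration, and the function name psi is ours. The remainder of the paper develops a far cheaper alternative to the repeated solves below.

    # Baseline: evaluate psi(t) = inf{cx | Ax = t, x >= 0} by a full
    # LP solve for every new right-hand side t (hypothetical data).
    import numpy as np
    from scipy.optimize import linprog

    rng = np.random.default_rng(0)
    m, n = 5, 12                        # invented problem dimensions
    A = rng.standard_normal((m, n))
    c = rng.uniform(1.0, 2.0, n)        # positive costs keep (1.1) bounded below

    def psi(t):
        """psi(t) via one LP solve; x >= 0 is linprog's default bound."""
        res = linprog(c, A_eq=A, b_eq=t)
        return res.fun if res.success else np.inf

    # Right-hand sides arriving on-line, e.g. new trajectory observations.
    for _ in range(3):
        t = A @ rng.uniform(0.0, 1.0, n)   # t in pos A, so psi(t) is finite
        print(psi(t))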

The method is based on an approximation scheme first developed for stochastic programs in Birge and Wets [1986a] and extended to a procedure for calculating upper bounds in Birge and Wets [1986b]. Its use with parallel processors was introduced in Wets [1985]. Section 2 presents useful properties of $\psi(t)$ and gives the basic sublinear function method. The accuracy and effort involved in the method are also discussed. Section 3 describes properties of the sublinear approximation and presents alternative strategies for its implementation. We conclude with a discussion of its running time on parallel processors in comparison with the simplex method and Karmarkar's projective method.

2. Sublinear Approximation Method

We assume that (1.1) is well-formulated, so that the linear program is bounded below for all $t$. In this case, $\psi(t)$ is a proper (nowhere $-\infty$ and somewhere finite), sublinear (positively homogeneous and convex) function. Another useful property of $\psi$ concerns its epigraph.

PROPOSITION 2.1. The epigraph of $\psi$ ($\operatorname{epi} \psi$) is a convex polyhedral cone (i.e., $\psi$ is a polyhedral convex function).

Proof: A point $(\alpha, t_1, \ldots, t_m) = (\alpha, t) \in \operatorname{epi} \psi$ if and only if $\alpha \ge \psi(t)$. By duality, if $\psi(t) < +\infty$, then

    \psi(t) = \max \{\, \pi t \mid \pi A \le c \,\},                (2.1)

where we need only consider the finite set of basic solutions, $\{\pi \text{ basic} \mid \pi A \le c\} = \{\pi^1, \ldots, \pi^K\}$. We then have

    \operatorname{epi} \psi = \{\, (\alpha, t) \mid (\alpha, t) \cdot (-1, \pi^k) \le 0,\ k = 1, \ldots, K \,\}.                (2.2)

This completes the proof. ∎

Since $\operatorname{epi} \psi$ is polyhedral, we can also write it as

    \operatorname{epi} \psi = \operatorname{pos} [\, (\alpha_\ell, t^\ell),\ \ell = 1, \ldots, L \,]
                            = \{\, (\alpha, t) \mid (\alpha, t) = \sum_{\ell=1}^L \mu_\ell (\alpha_\ell, t^\ell),\ \mu_\ell \ge 0,\ \ell = 1, \ldots, L \,\},                (2.3)

so that

    \psi(t) = \inf \{\, \alpha \mid (\alpha, t) = \sum_{\ell=1}^L \mu_\ell (\alpha_\ell, t^\ell),\ \mu_\ell \ge 0,\ \ell = 1, \ldots, L \,\}.                (2.4)

This leads to our first fundamental approximation. Suppose

    \{\, (\alpha_s, t^s) \in \operatorname{epi} \psi,\ s = 1, \ldots, S \,\}                (2.5)

is a collection of vectors in $\mathbb{R}^{m+1}$. We then have that

    \psi(t) \le \inf \{\, \alpha \mid (\alpha, t) = \sum_{s=1}^S \mu_s (\alpha_s, t^s),\ \mu_s \ge 0,\ s = 1, \ldots, S \,\}.                (2.6)

The right-hand side of (2.6) is another sublinear function that approximates $\psi(t)$ from above. We show how to choose the vectors $(\alpha_s, t^s)$ so that the approximation is reasonable and the computation of the approximation (finding the infimum) is straightforward.

First note that a good approximation should be exact along the directions $t^s$, $s = 1, \ldots, S$. Hence, we require that $\alpha_s = \psi(t^s)$, and assume that $\alpha_s$ and $t^s$ are defined in this way in our discussion below. We assume that the effort to calculate these $\alpha_s$ values is small in comparison with the number of times $\psi(t)$ must be found. The solutions $x^s$ such that $cx^s = \alpha_s$ are also stored, in general, so that a primal solution $x^* = \sum_{s=1}^S \mu_s x^s$ is obtained for each solution in (2.6). If the $x^s$ solutions are not available, then the dual may be used to find $x^*$ (see below). The bound (2.6) is itself a small linear program in the weights $\mu_s$, as the following sketch illustrates.
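For a given $t$, evaluating the right-hand side of (2.6) requires only $S$ variables and $m$ equality constraints. A minimal sketch, assuming the pairs $(\alpha_s, t^s)$ with $\alpha_s = \psi(t^s)$ have been computed and stored in advance (the function name and argument layout are ours):

    # Upper bound (2.6): minimize alpha . mu  s.t.  T mu = t,  mu >= 0,
    # where the columns of T are the stored directions t^s.
    import numpy as np
    from scipy.optimize import linprog

    def sublinear_bound(t, T, alpha):
        """T: m x S array of stored t^s; alpha: length-S array, alpha_s = psi(t^s)."""
        res = linprog(alpha, A_eq=T, b_eq=t)      # mu >= 0 is the default bound
        return res.fun if res.success else np.inf # +inf if t lies outside pos(t^1,...,t^S)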

The set of vectors $\{t^1, \ldots, t^S\}$ must be sufficient for the approximation to be reasonable. For $\operatorname{pos} A = \{\, y \mid Ax = y,\ x \ge 0 \,\}$, we generally assume that

    \operatorname{pos} (t^1, \ldots, t^S) \supseteq \operatorname{pos} A.                (2.7)

To ensure (2.7), we often have

    \operatorname{pos} (t^1, \ldots, t^S) = \mathbb{R}^m.                (2.8)

Conditions for the set to obtain equality in (2.6) are given in the following proposition.

PROPOSITION 2.2. If $\{t^1, \ldots, t^S\} \supseteq \{\lambda_1 A_1, \ldots, \lambda_n A_n\}$ for $\lambda_i > 0$, $i = 1, \ldots, n$, and $\alpha_s = \psi(t^s)$, $s = 1, \ldots, S$, then inequality (2.6) is satisfied as an equality for all $t$.

Proof: Suppose $\psi(t) = cx^* = \sum_{i=1}^n c_i x_i^*$, where $Ax^* = t$, $x^* \ge 0$. Let $t^{s(i)} = \lambda_i A_i$, and note that, by the assumptions, $\alpha_{s(i)} \le \lambda_i c_i$. Since $\sum_{i=1}^n A_i x_i^* = t$, we have $\sum_{i=1}^n (x_i^*/\lambda_i) t^{s(i)} = t$. Hence,

    \inf \{\, \alpha \mid (\alpha, t) = \sum_{s=1}^S \mu_s (\alpha_s, t^s),\ \mu_s \ge 0,\ s = 1, \ldots, S \,\}
        \le \sum_{i=1}^n (x_i^*/\lambda_i)\, \alpha_{s(i)} \le \sum_{i=1}^n c_i x_i^* = \psi(t).                (2.9)

Combining (2.9) and (2.6) yields the result. ∎

A difficulty in using the expression in (2.6) is that another optimization problem must be solved to compute the bound. Exact results could be obtained as in Proposition 2.2, but no clear time savings would be made. The situation is simplified if the expression in (2.6) has a unique solution for all $t \in \mathbb{R}^m$, i.e., if $t^1, \ldots, t^S$ form a positive linear basis for $\mathbb{R}^m$. Positive linear bases can be easily obtained from linear bases by including the negative of each vector in the linear basis. If $D = [D_1, \ldots, D_m]$ forms a linear basis for $\mathbb{R}^m$, then $[D, -D] = [D_1, \ldots, D_m, -D_1, \ldots, -D_m]$ forms a positive linear basis for $\mathbb{R}^m$. This is the form we shall use for approximations of $\psi(t)$.

For each $D_j$ used in the approximation, let $\delta_j^+ = \psi(D_j)$ and $\delta_j^- = \psi(-D_j)$. The sublinear approximation using basis $D$ is defined as

    \psi_D(t) = \inf \{\, \alpha \mid (\alpha, t) = \sum_{j=1}^m \mu_j^+ (\delta_j^+, D_j) + \sum_{j=1}^m \mu_j^- (\delta_j^-, -D_j),\ \mu_j^+, \mu_j^- \ge 0,\ j = 1, \ldots, m \,\}.                (2.10)

Note that $\psi(t) \le \psi_D(t)$ from (2.6), and that $\psi(t) = \psi_D(t)$ if $t = \pm D_j$, $j = 1, \ldots, m$. The expression for $\psi_D(t)$ can also be simplified to

    \psi_D(t) = \sum_{j=1}^m \psi_j(t),                (2.11)

where

    \psi_j(t) = \begin{cases} \delta_j^+ (D^{-1})_j t & \text{if } (D^{-1})_j t \ge 0, \\ -\delta_j^- (D^{-1})_j t & \text{if } (D^{-1})_j t < 0. \end{cases}                (2.12)

Computing $\psi_D(t)$, therefore, only requires multiplication of $t$ by the rows of $D^{-1}$, a sign check for each $j$, multiplication by $\delta_j^+$ or $\delta_j^-$, and addition of the $\psi_j$. This decomposition of computational effort leads to the parallel computation gains discussed in the next section. In general, $\psi_D(t)$ can be evaluated quickly given initial evaluation of $\delta_j^+, \delta_j^-$, $j = 1, \ldots, m$, as in the sketch below.
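A sketch of this evaluation scheme in Python; the factory-style packaging in make_psi_D is ours, not part of the method itself:

    # Evaluating psi_D(t) via (2.11)-(2.12): one back-solve lam = D^{-1} t,
    # a sign check per component, and a weighted sum. Assumes the values
    # delta_plus[j] = psi(D_j) and delta_minus[j] = psi(-D_j) were computed
    # once in advance by exact LP solves.
    import numpy as np

    def make_psi_D(D, delta_plus, delta_minus):
        D_inv = np.linalg.inv(D)            # invert once, reuse for every t
        def psi_D(t):
            lam = D_inv @ t                 # coordinates of t in the basis D
            terms = np.where(lam >= 0, delta_plus * lam, -delta_minus * lam)  # (2.12)
            return float(terms.sum())       # (2.11)
        return psi_D

Each subsequent evaluation costs one $m \times m$ matrix-vector product plus $O(m)$ comparisons and additions, in place of a full LP solve.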

A single basis $D$ does not in general provide good bounds on $\psi(t)$ for all $t$. A collection of bases, $\mathcal{D} = \{D(v),\ v = 1, \ldots, N\}$, can be used instead. The approximations derived from this collection are described in the next proposition.

PROPOSITION 2.3. Let $\{D(v),\ v = 1, \ldots, N\}$ be a collection of linear bases of $\mathbb{R}^m$; then

    \psi(t) \le \operatorname{co} \psi_{D(v)}(t) \le \inf_{v} \psi_{D(v)}(t),                (2.13)

where

    \operatorname{co} \psi_{D(v)}(t) = \inf [\, \alpha \mid (\alpha, t) \in \operatorname{co}(\operatorname{epi} \psi_{D(v)},\ v = 1, \ldots, N) \,]                (2.14)

and "co" denotes the convex hull.

Proof: Note that $\operatorname{epi} \psi \supseteq \operatorname{epi} \psi_{D(v)}$ for $v = 1, \ldots, N$, from (2.6). Hence, since $\operatorname{epi} \psi$ is convex, it contains the smallest convex set containing $\operatorname{epi} \psi_{D(v)}$, $v = 1, \ldots, N$, i.e.,

    \operatorname{epi} \psi \supseteq \operatorname{co}(\operatorname{epi} \psi_{D(v)},\ v = 1, \ldots, N).                (2.15)

From (2.15), we obtain $\psi(t) \le \operatorname{co} \psi_{D(v)}(t)$. The other inequality, $\operatorname{co} \psi_{D(v)}(t) \le \inf_v \psi_{D(v)}(t)$, follows from the definition in (2.14). ∎

By taking the convex hull of the functions $\psi_{D(v)}$, the approximation can improve and approach the exact value of $\psi$. The inclusion of more bases in the collection, $\{D(v),\ v = 1, \ldots, N\}$, also improves the approximation. Conditions for a collection to obtain equality in (2.13) are given in the following proposition.

PROPOSITION 2.4. Let the collection $\{D(v),\ v = 1, \ldots, N\}$ contain all dual feasible bases of $A$ (i.e., each basis $B$ such that $\pi B = c_B$, $\pi A \le c$); then

    \psi(t) = \operatorname{co} \psi_{D(v)}(t) = \inf_{v} \psi_{D(v)}(t).                (2.16)

Proof: Let $\psi(t) = cx^* = \pi^* t$, where $B x_B^* = t$, $\pi^* B = c_B$, and $\pi^* A \le c$. For $t^i = B_i$, $i = 1, \ldots, m$, we again have $\delta_i^+ = \psi(B_i) \le c_{B(i)}$, so for $D(j) = B$ (for which $B^{-1} t = x_B^* \ge 0$),

    \psi_{D(j)}(t) = \sum_{i=1}^m \delta_i^+ x_{B(i)}^* \le cx^* = \psi(t).

From (2.13), we obtain (2.16). ∎

Using all dual feasible bases in $\mathcal{D}$ would generally not be efficient. The convex hull function can also add a computational burden to the sublinear approximation method. The next section discusses these implementation

concerns and describes other properties of the sublinear approximation method.

3. Uses of the Approximation

To compute $\operatorname{co} \psi_{D(v)}(t)$, the optimization problem in (2.14) must be solved. Since this problem may, in general, be difficult, the simpler bound $\inf_v \psi_{D(v)}(t)$ is indicated. Another alternative, which is especially useful when $\psi(t)$ is part of a higher-level optimization problem, is to use the conjugate function of $\operatorname{co} \psi_{D(v)}$. The conjugate of $\psi$ is by definition

    \psi^*(u) = \sup_{t \in \mathbb{R}^m} [\, ut - \psi(t) \,].                (3.1)

The following proposition provides the conjugate of $\psi_{D(v)}$.

PROPOSITION 3.1. The conjugate of $\psi_{D(v)}$ is given by

    \psi_{D(v)}^*(u) = \begin{cases} 0 & \text{if } -\delta_j^- \le u (D(v))_j \le \delta_j^+,\ j = 1, \ldots, m, \\ +\infty & \text{otherwise.} \end{cases}                (3.2)

Proof: From the definition in (3.1) and for $D(v)$ a linear basis,

    \psi_{D(v)}^*(u) = \sup_{\lambda \in \mathbb{R}^m} [\, u D(v) \lambda - \psi_{D(v)}(D(v) \lambda) \,]
                     = \sum_{j=1}^m \sup_{\lambda_j \in \mathbb{R}} \begin{cases} (u (D(v))_j - \delta_j^+) \lambda_j & \text{if } \lambda_j \ge 0, \\ (u (D(v))_j + \delta_j^-) \lambda_j & \text{if } \lambda_j < 0. \end{cases}                (3.3)

Equation (3.2) follows immediately from (3.3). ∎

The conjugate functions of $\psi_{D(v)}$ can be used to obtain the conjugate function of $\operatorname{co} \psi_{D(v)}$ and a bound on $\psi^*(u)$.
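Since (3.2) is simply the indicator of a box in the coordinates $u D(v)$, testing whether $\psi_{D(v)}^*(u) = 0$ takes only $2m$ comparisons. A minimal sketch (the function name is ours):

    # Conjugate (3.2): 0 on the box -delta^- <= u D(v) <= delta^+, else +inf.
    import numpy as np

    def psi_D_conjugate(u, D, delta_plus, delta_minus):
        w = u @ D                               # the components (u D)_j
        inside = np.all((w >= -delta_minus) & (w <= delta_plus))
        return 0.0 if inside else np.inf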

PROPOSITION 3.2. The conjugate function of $\psi$ is bounded by

    \psi^*(u) \ge (\operatorname{co} \psi_{D(v)})^*(u) = \sup_{v = 1, \ldots, N} \psi_{D(v)}^*(u).                (3.4)

Proof: From the definition in (3.1) and (2.13),

    \psi^*(u) = \sup_{t \in \mathbb{R}^m} [\, ut - \psi(t) \,] \ge \sup_{t \in \mathbb{R}^m} [\, ut - \operatorname{co} \psi_{D(v)}(t) \,] = (\operatorname{co} \psi_{D(v)})^*(u).

From Theorem 16.5 in Rockafellar [1970],

    (\operatorname{co} \psi_{D(v)})^*(u) = \sup_{v = 1, \ldots, N} \psi_{D(v)}^*(u),

proving (3.4). ∎

From (3.4) and (3.2), the lower bounding function on $\psi^*(u)$ is

    \psi^*(u) \ge \sup_{v = 1, \ldots, N} \psi_{D(v)}^*(u) = \begin{cases} 0 & \text{if } -\delta_j^- \le u (D(v))_j \le \delta_j^+,\ j = 1, \ldots, m;\ v = 1, \ldots, N, \\ +\infty & \text{otherwise.} \end{cases}                (3.5)

The bound in (3.5) can be useful when a dual program is formulated in place of a primal problem involving $\psi(t)$. The optimization for $\psi(t)$ is replaced by additional feasibility conditions in the dual. Note also that (3.5) provides an alternative method for finding $(\operatorname{co} \psi_{D(v)})(t)$. We have

    (\operatorname{co} \psi_{D(v)})(t) = \sup_{u \in \mathbb{R}^m} \{\, ut - \sup_{v = 1, \ldots, N} \psi_{D(v)}^*(u) \,\}
        = \sup_{u \in \mathbb{R}^m} \{\, ut \mid -\delta_j^- \le u (D(v))_j \le \delta_j^+,\ j = 1, \ldots, m;\ v = 1, \ldots, N \,\}.                (3.6)

For problems with special structure, the dual formulation in (3.6) may provide an efficient method for calculating $\operatorname{co} \psi_{D(v)}$; a direct linear programming sketch follows.
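The sketch below evaluates (3.6) with a general LP solver by stacking the $2mN$ box constraints. It is an illustration rather than the special-structure implementation just mentioned, and the data layout (a list of $(D, \delta^+, \delta^-)$ triples) is our assumption:

    # co psi_{D(v)}(t) by the LP (3.6): maximize u.t subject to
    # -delta_minus <= u D(v) <= delta_plus for every basis in the collection.
    import numpy as np
    from scipy.optimize import linprog

    def co_psi(t, bases):
        """bases: list of triples (D, delta_plus, delta_minus), one per D(v)."""
        A_ub = np.vstack([blk for D, dp, dm in bases for blk in (D.T, -D.T)])
        b_ub = np.concatenate([vec for D, dp, dm in bases for vec in (dp, dm)])
        # linprog minimizes, so maximize u.t as minimize (-t).u with u free.
        res = linprog(-t, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * len(t))
        return -res.fun if res.success else np.inf   # +inf when (3.6) is unbounded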

The subgradients of $\psi_D$ are another useful feature of $\psi_D$ when it is used as an approximation in optimization problems involving $\psi$.

PROPOSITION 3.3. The subdifferential of $\psi_D$ at $t$ is given by

    \partial \psi_D(t) = \{\, \pi \mid \pi = \sum_{j=1}^m \epsilon_j (D^{-1})_j,\ \epsilon_j \in E_j,\ j = 1, \ldots, m \,\},                (3.7)

where

    E_j = \begin{cases} \{-\delta_j^-\} & \text{if } (D^{-1} t)_j < 0, \\ [-\delta_j^-, \delta_j^+] & \text{if } (D^{-1} t)_j = 0, \\ \{\delta_j^+\} & \text{if } (D^{-1} t)_j > 0. \end{cases}

Proof: From (3.6), for $N = 1$ and letting $t = D\lambda$,

    \psi_D(t) = \sup_{u \in \mathbb{R}^m} \{\, (uD) \lambda \mid -\delta_j^- \le (uD)_j \le \delta_j^+,\ j = 1, \ldots, m \,\}.                (3.8)

Thus $\psi_D$ is the support function of $U = \{u \mid -\delta_j^- \le (uD)_j \le \delta_j^+,\ j = 1, \ldots, m\}$, and $\pi \in \partial \psi_D(t)$ if and only if

    \pi t = \sum_{j: \lambda_j > 0} \delta_j^+ \lambda_j - \sum_{j: \lambda_j < 0} \delta_j^- \lambda_j = \psi_D(t)                (3.9)

and

    -\delta_j^- \le (\pi D)_j \le \delta_j^+,\ j = 1, \ldots, m.                (3.10)

The conditions in (3.9) and (3.10) determine $\partial \psi_D(t)$ in (3.7). ∎

The subdifferential of $\operatorname{co} \psi_{D(v)}$ is also useful in approximating optimization problems involving $\psi$. From (3.6), this is

    \partial (\operatorname{co} \psi_{D(v)})(t) = \{\, \pi \mid \pi \in \operatorname{argsup} \{\, ut \mid -\delta_j^- \le u (D(v))_j \le \delta_j^+,\ j = 1, \ldots, m;\ v = 1, \ldots, N \,\} \,\}.                (3.11)

As in the calculation of $\operatorname{co} \psi_{D(v)}$, the expression in (3.11) may be readily calculable when $\operatorname{co} \psi_{D(v)}$ has special structure.
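Computationally, (3.7) delivers a subgradient from the same back-solve $\lambda = D^{-1} t$ used to evaluate $\psi_D(t)$. A sketch; where $(D^{-1} t)_j = 0$ any $\epsilon_j \in [-\delta_j^-, \delta_j^+]$ is valid, and the code arbitrarily takes $\delta_j^+$:

    # One subgradient of psi_D at t, per (3.7).
    import numpy as np

    def psi_D_subgradient(t, D, delta_plus, delta_minus):
        D_inv = np.linalg.inv(D)
        lam = D_inv @ t
        # eps_j = delta_j^+ if lam_j > 0, -delta_j^- if lam_j < 0;
        # the tie lam_j == 0 is resolved at delta_j^+, one valid choice.
        eps = np.where(lam >= 0, delta_plus, -delta_minus)
        return eps @ D_inv                  # pi = sum_j eps_j (D^{-1})_j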

A subgradient of $\operatorname{co} \psi_{D(v)}$ is also valuable for calculating a primal solution $x^*$ such that $cx^* = \psi(t)$. For $\pi \in \partial (\operatorname{co} \psi_{D(v)})(t)$, let $B$ be a maximal, linearly independent set of columns chosen from $\bar{A} = \{\, A_i \mid c_i - \pi A_i \le 0 \,\}$. In general, $B$ has full rank, and we let $x_B^* = B^{-1} t$ (with the remaining components of $x^*$ set to zero). We then have $cx^* \le \operatorname{co} \psi_{D(v)}(t)$ for any such choice of $B$ from $\bar{A}$. If $\{D(v),\ v = 1, \ldots, N\}$ is sufficiently large, then $c_i - \pi A_i = 0$ for all $A_i \in \bar{A}$, and $cx^* = \psi(t)$. When $B$ does not have full rank, a looser approximation is obtained by sequentially including columns into $B$ in increasing order of $c_i - \pi A_i$. A sketch of this recovery step follows.
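For brevity, the sketch simply takes the first $m$ columns with nonpositive reduced cost and assumes they are linearly independent; a careful implementation would build $B$ column by column as described above:

    # Recover x* with x*_B = B^{-1} t from the columns with c_i - pi A_i <= 0.
    import numpy as np

    def primal_from_subgradient(t, A, c, pi, tol=1e-9):
        m, n = A.shape
        idx = [i for i in range(n) if c[i] - pi @ A[:, i] <= tol][:m]
        B = A[:, idx]                       # assumed square and nonsingular here
        x = np.zeros(n)
        x[idx] = np.linalg.solve(B, t)      # nonbasic components stay zero
        return x                            # compare c @ x with co psi_{D(v)}(t)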

Determining which bases to include in $\mathcal{D}$ is an important aspect of the evaluation of the sublinear approximation of $\psi$. From Proposition 2.4, if all dual feasible bases are included in $\mathcal{D}$, then the sublinear approximation is exact. In general, including all such bases would be inefficient. Instead, we can describe a sequential procedure in which new bases are added to $\mathcal{D}$ in order to improve the approximation as much as possible. A suitable beginning basis, such as $D(1) = I$, is chosen that provides a rough approximation of $\psi$. Assume that $N$ bases, $D(1), \ldots, D(N)$, have been chosen and that $\psi$ is inf-compact (i.e., has compact level sets; Wets [1973]). Consider the level sets $\operatorname{lev}_\alpha \psi := \{t \mid \psi(t) \le \alpha\}$ and $\operatorname{lev}_\alpha (\operatorname{co} \psi_{D(v)}) := \{t \mid \operatorname{co} \psi_{D(v)}(t) \le \alpha\}$. We would like to make $\operatorname{lev}_\alpha (\operatorname{co} \psi_{D(v)})$ as close as possible to $\operatorname{lev}_\alpha \psi$. To find these level sets, we use the following result.

PROPOSITION 3.4. If $f$ is a polyhedral function such that

    f(t) := \inf \{\, \alpha \mid (\alpha, t) = \sum_{i=1}^K \lambda_i (\tau_i, t^i),\ \lambda_i \ge 0,\ i = 1, \ldots, K \,\},

then

    \operatorname{lev}_\alpha f = \{\, t \mid t = \sum_{i=1}^K \lambda_i (\alpha/\tau_i) t^i,\ \sum_{i=1}^K \lambda_i \le 1,\ \lambda_i \ge 0,\ i = 1, \ldots, K \,\}
                                = \operatorname{co} \{\, (\alpha/\tau_i) t^i,\ i = 1, \ldots, K \,\}.                (3.12)

Proof: Since $f((\alpha/\tau_i) t^i) \le \alpha$, $\operatorname{lev}_\alpha f$ includes $(\alpha/\tau_i) t^i$ for each $i$ and, by convexity, $\operatorname{co}\{(\alpha/\tau_i) t^i,\ i = 1, \ldots, K\}$. Conversely, if $t \in \operatorname{lev}_\alpha f$, then $t = \sum_{i=1}^K \mu_i t^i$, where $f(t) = \sum_{i=1}^K \mu_i \tau_i$. Let $\lambda_i = \mu_i \tau_i / \alpha$. Since $f(t) \le \alpha$, $\sum_{i=1}^K \lambda_i \le 1$ and $t = \sum_{i=1}^K \lambda_i (\alpha/\tau_i) t^i$ belongs to $\operatorname{co}\{(\alpha/\tau_i) t^i,\ i = 1, \ldots, K\}$. ∎

From Proposition 3.4, we have that

    \operatorname{lev}_1 \psi = \operatorname{co} \{\, c_j^{-1} A_j,\ j = 1, \ldots, n \,\},                (3.13)

and

    \operatorname{lev}_1 (\operatorname{co} \psi_{D(v)}) = \operatorname{co} \{\, (\delta_j^+)^{-1} D(v)_j,\ (\delta_j^-)^{-1} (-D(v)_j),\ j = 1, \ldots, m;\ v = 1, \ldots, N \,\}.                (3.14)

To improve the approximation of $\psi$ by $\operatorname{co} \psi_{D(v)}$, we would like to minimize the (Hausdorff) distance between $\operatorname{lev}_1 \psi$ and $\operatorname{lev}_1 (\operatorname{co} \psi_{D(v)})$. We try to do this by creating new bases with

    t^\nu \in \operatorname{argmax} \{\, \operatorname{dist}(t, \operatorname{lev}_1 (\operatorname{co} \psi_{D(v)})) \mid t \in \operatorname{lev}_1 \psi \,\}.                (3.15)

The nonconvex optimization problem in (3.15) can be solved by searching through the extreme points of $\operatorname{lev}_1 \psi$, but, in practice, the search would only continue until a sufficiently large distance is found (according to some user-specified criterion). When $t^\nu$ is found, new matrices, $D(N+1), \ldots, D(N+q)$, are created so that

    \operatorname{pos} [\, t^\nu, D(v), -D(v),\ v = 1, \ldots, N \,] = \bigcup_{v=1}^{N+q} \operatorname{pos} [\, D(v), -D(v) \,].                (3.16)

Another possibility for finding new matrices for $\mathcal{D}$ is to choose the $D(v)$ as rotations of $D(1) = I$. This procedure provides a systematic way to investigate every region of $t$. A third alternative is to choose (randomly or according to a pattern) various values of $t$ and to let $D(v)$ be the basis that is optimal in finding $\psi(t)$. All of these procedures require further examination and testing before their relative merits can be established.

Given a collection of chosen bases $\mathcal{D}$, the calculations for a sublinear function approximation can be substantially performed in parallel (Wets [1985]). The steps of the method to find $\inf_v \psi_{D(v)}(t)$ (assuming $\delta_j^+, \delta_j^-$ known), with a serial sketch following the list, are:

1. Perform $mN$ inner products $(D(v)^{-1})_j t$ and determine the $\psi_j(t)$ as in (2.12).
2. Perform $N$ sums of $m$ elements to find $\psi_{D(v)}(t)$ as in (2.11).
3. Sort the $\psi_{D(v)}(t)$, $v = 1, \ldots, N$, to find the least element.

With $mN$ parallel processors, Steps 1 and 2 above can be performed in $O(m)$ time. With $N^2$ parallel processors, Step 3 can be performed in $O(\log N)$ time.
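In serial form, and with the inverses $D(v)^{-1}$ factored in advance, the three steps read as follows (the data layout is our assumption); on parallel hardware the loop over $v$ and the inner products within it run concurrently, giving the bound in the next proposition:

    # Steps 1-3 in serial form.
    import numpy as np

    def inf_psi(t, inv_bases):
        """inv_bases: list of triples (D_inv, delta_plus, delta_minus)."""
        best = np.inf
        for D_inv, dp, dm in inv_bases:
            lam = D_inv @ t                                      # Step 1: inner products
            val = np.where(lam >= 0, dp * lam, -dm * lam).sum()  # Step 2: sum of m terms
            best = min(best, val)                                # Step 3: keep the least
        return best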

PROPOSITION 3.5. The running time of the sublinear function method to find $\inf_v \psi_{D(v)}(t)$ is $O(m + \log N)$ on $\max(mN, N^2)$ parallel processors.

The time can vary from the result in Proposition 3.5, especially if fewer than $mN$ or $N^2$ processors are available. Taking advantage of sparsity, however, and the availability of more processors (to accelerate the inner products) could decrease the running time. The result in Proposition 3.5 demonstrates the potential for parallel processing with the sublinear function method given here. Approximate solutions of linear programs can be obtained in linear or better-than-linear time. This can be compared to parallel processing applied to other linear programming algorithms (Pan and Reif [1985]). From Pan and Reif, if Karmarkar's projective algorithm [1984] requires $O(m^{1/2})$ iterations in practice (as some conjecture), then its running time is $O(m^{1.5} \log m)$ with $m^2$ parallel processors. If the simplex method (see Dantzig [1963] for a discussion of average-case behavior) requires $O(m)$ iterations in practice, then its running time is $O(m^2)$ on $m$ parallel processors. The sublinear function method compares favorably with either of these general-purpose solution methods.

References

Al-Idrisi, M.M. [1981], "Unconstrained Minimization Algorithms for Functions with Singular or Ill-conditioned Hessian," Ph.D. Dissertation, The University of Michigan, Ann Arbor.

Birge, J.R. and R. J-B. Wets [1986a], "Designing Approximation Schemes for Stochastic Optimization Problems, in Particular, for Stochastic Programs with Recourse," Mathematical Programming Study 27, 54-102.

Birge, J.R. and R. J-B. Wets [1986b], "A Sublinear Approximation Method for Stochastic Programming."

Cuong, H.T. [1980], "Investigation of Methods for Adaptive Control of Surface Ships," Ph.D. Dissertation, The University of Michigan, Ann Arbor.

Dantzig, G.B. [1963], Linear Programming and Extensions, Princeton University Press, Princeton, New Jersey.

Karmarkar, N.K. [1984], "A New Polynomial Time Algorithm for Linear Programming," Combinatorica 4, 373-395.

Pan, V. and J. Reif [1985], "Fast and Efficient Algorithms for Linear Programming and for the Linear Least Squares Problem," Computer Science Department, State University of New York, Albany.

Rockafellar, R.T. [1970], Convex Analysis, Princeton University Press, Princeton, New Jersey.

Wets, R. J-B. [1973], "On Inf-Compact Mathematical Programs," Springer-Verlag Lecture Notes in Computer Science 3, 426-436.

Wets, R. J-B. [1985], "On Parallel Processors Design for Solving Stochastic Programs," Proceedings of the 6th Mathematical Programming Symposium, Japanese Mathematical Programming Society, Tokyo, 1985, pp. 13-36; also IIASA Working Paper WP-85-67, International Institute for Applied Systems Analysis, Laxenburg, Austria.