Decomposition and Partitioning Methods for Multi-Stage Stochastic Linear Programs

John R. Birge

Technical Report No. 82-6
Department of Industrial and Operations Engineering
The University of Michigan, Ann Arbor, Michigan 48109
May, 1982

Decomposition and Partitioning Methods for Multi-Stage Stochastic Linear Programs

John R. Birge
Department of Industrial and Operations Engineering
The University of Michigan, Ann Arbor, Michigan 48109

Abstract. Multi-stage stochastic linear programs can be formulated for problems in financial planning, corporate decision making, economic policy analysis, and many other areas. Deterministic equivalent linear program representations of these problems, however, become excessively large. This paper presents methods for solving these problems based on decomposition and partitioning. Computational results on a set of practical test problems are reported to show the algorithms' potential usefulness.

Subject classification: 635 Large scale systems, programming. 655 Stochastic programming.

Many practical problems require that optimal decisions be made periodically over time. These problems can often be formulated as multi-stage (or dynamic) linear programs. The resulting programs have a staircase structure that may be exploited by decomposition (Ho and Manne [1974]), basis factorization (Fourer [1979]), partitioning (Rosen [1963]), and other computational techniques. Difficulties may arise, however, in implementing solutions from these programs because of uncertainty about some parameters of the program. These random coefficients are often replaced by their expected values, but the solution of the corresponding expected value program may not be optimal for the stochastic program. In fact, it may be that no linear program fixing each coefficient at a single value yields the optimal solution (Birge [1981]). In this case, a stochastic program must be solved to obtain the optimal solution. The deterministic equivalent linear program is usually very large, and standard solution procedures may prove very costly. In this paper, we present two methods for solving these problems based on a nested decomposition (or outer linearization) approach and a piecewise linear partitioning strategy. Both methods can be viewed as approaches to the dynamic programming formulation presented in Section 1. The outer linearization approach of Section 2 is an extension of the Van Slyke and Wets [1969] method for the two-stage problem. The piecewise strategy in Section 3 is presented as a partial resolution of a degeneracy problem inherent in the method of Section 2. Section 4 describes computational experience on a set of practical multi-stage problems taken from Ho and Loute [1981].
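To illustrate this point on a small invented example (not one of the test problems considered later): consider min x + E_ξ[3y(ξ)] subject to y(ξ) ≥ ξ - x, x ≥ 0, y(ξ) ≥ 0, where ξ is 0 or 2 with probability 1/2 each. Replacing ξ by its mean ξ̄ = 1 gives the expected value solution x = 1 with apparent cost 1, but that solution has true expected cost 1 + (3/2)(2 - 1) = 2.5, while x = 2 has expected cost 2; the expected value program thus misses the stochastic optimum.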

1. Problem Formulation

The stochastic linear program with recourse was first formulated by Beale [1955] and Dantzig [1955]. We consider the multi-stage version of this problem given by

min z = c_1 x_1 + E_{ξ_2}[c_2 x_2 + E_{ξ_3}[c_3 x_3 + ... + E_{ξ_T}[c_T x_T] ... ]]

subject to:

A_1 x_1 = b_1,
-B_1 x_1 + A_2 x_2 = ξ_2,
    ...
-B_{T-1} x_{T-1} + A_T x_T = ξ_T,
x_t ≥ 0, t = 1, 2, ..., T;  ξ_t ∈ Ξ_t, t = 2, ..., T,   (1)

where c_t is a vector in R^{n_t} for t = 1, 2, ..., T, b_1 is a vector in R^{m_1}, ξ_t is a random m_t-vector defined on Ξ_t, t = 2, ..., T, and the A_t and B_t are correspondingly dimensioned known real-valued matrices. The vector x_1 represents decisions made in the first period, and the vectors x_t represent decisions made in future periods dependent on the outcomes ξ_2, ..., ξ_t, and the previous decisions x_1, ..., x_{t-1}. E_{ξ_t} represents the expectation with respect to ξ_t.
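For a discrete ξ_t with a small number of realizations, (1) can be written out explicitly as a deterministic equivalent linear program. The sketch below (our illustration in Python, not the paper's FORTRAN codes; all data are invented, and numpy/scipy are assumed available) builds that block structure for T = 2 to show how the program grows with the number of realizations.

```python
# A minimal sketch (not the paper's codes): building the deterministic
# equivalent LP of (1) for T = 2 and a discrete xi_2 with k realizations.
# All problem data below are invented.
import numpy as np
from scipy.optimize import linprog

A1 = np.array([[1.0, 1.0]]); b1 = np.array([2.0]); c1 = np.array([1.0, 0.5])
B1 = np.array([[1.0, 0.0]])                    # couples x1 into period 2
A2 = np.array([[1.0, -1.0]]); c2 = np.array([3.0, 0.0])
xis = [np.array([0.0]), np.array([2.0])]       # realizations of xi_2
probs = [0.5, 0.5]

k, n1, n2 = len(xis), A1.shape[1], A2.shape[1]
# Variables (x1, x2(xi^1), ..., x2(xi^k)); one block row per realization:
#   -B1 x1 + A2 x2(xi^j) = xi^j
c = np.concatenate([c1] + [p * c2 for p in probs])
A_eq = np.zeros((1 + k, n1 + k * n2))
A_eq[0, :n1] = A1
rhs = [b1]
for j, xi in enumerate(xis):
    A_eq[1 + j, :n1] = -B1
    A_eq[1 + j, n1 + j * n2 : n1 + (j + 1) * n2] = A2
    rhs.append(xi)
res = linprog(c, A_eq=A_eq, b_eq=np.concatenate(rhs), method="highs")
print(res.x[:n1], res.fun)    # first-period decision and optimal value
```

With k realizations per stage, the row and column counts multiply at each stage, which is the growth the decomposition and partitioning methods below are designed to avoid.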

The decision process in a practical problem is to determine a first period decision which minimizes the sum of current costs, c_1 x_1, and the expected values of all future costs until some time T. Solution methods for this problem have been most thoroughly explored for the case of discrete distributions, where Ξ_t = {ξ_t^1, ξ_t^2, ..., ξ_t^{k_t}}, and T = 2. Dantzig and Madansky [1961] proposed that the dual of this problem be solved using Dantzig-Wolfe [1960] decomposition. Another approach in Kall [1979] and Strazicky [1980] is to solve the dual using a basis factorization scheme. Van Slyke and Wets [1969] also proposed applying outer linearization, or Benders' decomposition (Benders [1962]), to the primal problem. In Section 2 we present an extension of this method for the case of general T. Other methods have been applied to the simple recourse problem in which T = 2 and A_2 = [I : -I]. Everett and Ziemba [1979], for example, suggest a modification of the Frank-Wolfe algorithm, and Wets [1975] has suggested a method based on approximation by splines. Methods for this class of problems are important since the general form fits many problems in dynamic economic planning and financial planning. For the multi-stage case, Beale, Forrest, and Taylor [1980] have considered a specific production model. In their model, several products are produced and either sold or kept in inventory according to the result of a stochastic demand vector realized after the production decision has been made. They show that an approximation procedure incorporated into a backward recursion can produce solutions very close to the exact dynamic program solutions.

We consider the general multi-stage problem (1) as a dynamic program with stages t = 1, 2, ..., T, and states y_t = B_t x_t, t = 1, 2, ..., T-1. The value at t = T is defined as

z_T(y_{T-1}) = E_{ξ_T}[φ_T(y_{T-1}, ξ_T)],   (2)

where

φ_T(y_{T-1}, ξ_T) = min c_T x_T   (3)
subject to: A_T x_T = y_{T-1} + ξ_T,
            x_T ≥ 0.

The solution to (1) can then be found recursively by

z_t(y_{t-1}) = E_{ξ_t}[φ_t(y_{t-1}, ξ_t)],   (4)

where

φ_t(y_{t-1}, ξ_t) = min_{(x_t, y_t)} c_t x_t + z_{t+1}(y_t)   (5)
subject to: A_t x_t = ξ_t + y_{t-1},
            B_t x_t = y_t,
            x_t ≥ 0.

At t = 1, we define y_0 = 0 and ξ_1 = b_1, and obtain

z_1(0) = min_{(x_1, y_1)} c_1 x_1 + z_2(y_1)   (6)
subject to: A_1 x_1 = b_1,
            B_1 x_1 = y_1,
            x_1 ≥ 0.
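As an illustration of this recursion, the following sketch (invented data, our Python notation, not the paper's implementation) evaluates the stage-T value function (2)-(3) for a discrete ξ_T by solving one small linear program per realization.

```python
# An illustrative sketch of the last-stage recursion: z_T(y) in (2) is the
# expectation over xi_T of phi_T(y, xi_T) in (3).  Data are invented.
import numpy as np
from scipy.optimize import linprog

A_T = np.array([[1.0, -1.0]]); c_T = np.array([3.0, 1.0])
xis = [np.array([0.0]), np.array([2.0])]; probs = [0.5, 0.5]

def phi_T(y, xi):
    """Optimal value of: min c_T x_T s.t. A_T x_T = y + xi, x_T >= 0 (eq. (3))."""
    res = linprog(c_T, A_eq=A_T, b_eq=y + xi, method="highs")
    return res.fun if res.status == 0 else np.inf   # +inf if infeasible

def z_T(y):
    """Expected stage-T value (eq. (2)) for the discrete distribution above."""
    return sum(p * phi_T(y, xi) for p, xi in zip(probs, xis))

print(z_T(np.array([1.0])))   # cost of carrying state y_{T-1} = 1 into stage T
```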

The optimal solution, x*_1, of (6) gives us the optimal set of first period decisions. Of course, the dimension of Ξ_t may make direct implementation of this procedure impossible. In the following sections, we present outer linearization and piecewise linear approximation methods for this dynamic programming framework.

2. A Nested Benders' Decomposition Method

Decomposition methods generally consist of outer linearization, as in Benders [1962], or inner linearization, as in Dantzig-Wolfe [1960] decomposition. Geoffrion [1970] has described these relationships, and Kallio and Porteus [1977] have discussed them in terms of dynamic linear programs. Ho and Manne [1974] have also formulated a nested decomposition approach of applying inner linearization to successive stages of dynamic linear programs. In this section, we describe the application of outer linearization to the primal problem in (5). This procedure can also be described as applying inner linearization to the dual of (5). The method is based on the following equivalence:

Proposition 1. The mathematical program (5) is equivalent to:

min c_t x_t + Q_{t+1}(x_t)   (6)
subject to: A_t x_t = ξ_t + B_{t-1} x_{t-1},   (6.1)
            D_t x_t ≥ λ_t,   (6.2)
            x_t ≥ 0,   (6.3)

where Q_{t+1}(x_t) is a real-valued convex function, D_t is a real m_{t+1} × n_t matrix, and λ_t is a real m_{t+1}-vector.

Proof. See Wets [1966], Proposition 6. ∎

The constraints (6.2) are known as feasibility cuts and are used to induce a feasible solution at stage t+1. They can be constructed sequentially by noting that if the problem at t+1 is not feasible for x̂_t, a solution of (6), then for some ξ̂_{t+1} there does not exist x_{t+1} ≥ 0 such that A_{t+1} x_{t+1} = ξ̂_{t+1} + B_t x̂_t and D_{t+1} x_{t+1} ≥ λ_{t+1}. In this case, there exists a vector (σ̂^1_{t+1}, σ̂^2_{t+1}) such that:

σ̂^1_{t+1} A_{t+1} + σ̂^2_{t+1} D_{t+1} ≤ 0,   (7)
σ̂^2_{t+1} ≥ 0,   (8)
σ̂^1_{t+1}(ξ̂_{t+1} + B_t x̂_t) + σ̂^2_{t+1} λ_{t+1} > 0.   (9)

A constraint of the form

-σ̂^1_{t+1} B_t x_t ≥ σ̂^1_{t+1} ξ̂_{t+1} + σ̂^2_{t+1} λ_{t+1}   (10)

can then be added to (6.2), and D_t and λ_t can be formed row by row as consecutive solutions x_t^1, x_t^2, ..., are used to test feasibility at stage t+1.

The convex objective function in (6) can also be handled by successive approximations. First, we rewrite (6) as:

min c_t x_t + θ_t   (11)
subject to: A_t x_t = ξ_t + B_{t-1} x_{t-1},   (11.1)
            D_t x_t ≥ λ_t,   (11.2)
            -Q_{t+1}(x_t) + θ_t ≥ 0,   (11.3)
            x_t ≥ 0.   (11.4)
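As an illustration of how a multiplier satisfying (7)-(9) might be obtained in practice (an assumption on our part, not the paper's implementation), the sketch below recovers σ̂^1_{t+1} from the duals of a phase-one problem in the case where no cuts D_{t+1} x_{t+1} ≥ λ_{t+1} are yet present, returning the data for a cut of the form (10).

```python
# A sketch of generating a feasibility cut (10) when lambda = 0: solve the
# phase-one problem  min 1'(u + v)  s.t.  A x + u - v = rhs,  x, u, v >= 0,
# whose optimal value is positive exactly when A x = rhs, x >= 0 is
# infeasible; its equality duals give sigma^1.  Data are invented.
import numpy as np
from scipy.optimize import linprog

def feasibility_cut(A, rhs):
    m, n = A.shape
    c = np.concatenate([np.zeros(n), np.ones(2 * m)])
    A_eq = np.hstack([A, np.eye(m), -np.eye(m)])
    res = linprog(c, A_eq=A_eq, b_eq=rhs, method="highs")
    if res.fun <= 1e-9:
        return None                     # the system is feasible: no cut needed
    return res.eqlin.marginals          # sigma^1; cut: -sigma1 B_t x_t >= sigma1 xi

# Infeasible example: x1 + x2 = -1 has no solution with x >= 0.
print(feasibility_cut(np.array([[1.0, 1.0]]), np.array([-1.0])))   # [-1.]
```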

We wish to show the following proposition.

Proposition 2. Constraint (11.3) in problem (11) for t = 1, 2, ..., T, may be replaced by linear constraints of the form

θ_t - π̄^{1,j}_{t+1} B_t x_t ≥ ρ̄^j_{t+1},  j = 1, 2, ..., p,   (12)

where p ≤ m_{t+1} + 1 < +∞, π̄^{1,j}_{t+1} is an m_{t+1}-vector, and ρ̄^j_{t+1} is a constant.

Proof. The significance of π̄^{1,j}_{t+1} and ρ̄^j_{t+1} will become clear in the proof. We assume that the proposition holds for all t ≥ τ + 1. It is vacuously true for t = T, since Q_{T+1}(x_T) ≡ 0 and we have θ_T ≥ 0. Assume that x̂_τ is an optimal solution to (11) for t = τ in which Q_{τ+1}(x_τ) has been replaced by linear constraints of the form (12). Let the optimal solutions of the stage τ+1 problems with x_τ = x̂_τ be (x*_{τ+1}(ξ_{τ+1}), θ*_{τ+1}(ξ_{τ+1})), and let the corresponding dual multipliers be (π̂^{1,1}_{τ+1}(ξ_{τ+1}), π̂^{2,1}_{τ+1}(ξ_{τ+1}), π̂^{3,1}_{τ+1}(ξ_{τ+1})), where

π̂^{1,1}_{τ+1}(ξ_{τ+1})(ξ_{τ+1} + B_τ x̂_τ) + π̂^{2,1}_{τ+1}(ξ_{τ+1}) λ_{τ+1} + π̂^{3,1}_{τ+1}(ξ_{τ+1}) ρ_{τ+2}(ξ_{τ+1}) = c_{τ+1} x*_{τ+1}(ξ_{τ+1}) + θ*_{τ+1}(ξ_{τ+1}),

and where ρ_{τ+2}(ξ_{τ+1}) is the vector of ρ̄^j_{τ+2}, j = 1, 2, ..., p. If we have some other feasible solution to (11), x_τ ≠ x̂_τ, which leads to optimal multipliers (π̂^{1,*}_{τ+1}(ξ_{τ+1}), π̂^{2,*}_{τ+1}(ξ_{τ+1}), π̂^{3,*}_{τ+1}(ξ_{τ+1})) at τ+1, then, since the multipliers found at x̂_τ remain dual feasible at x_τ,

π̂^{1,*}_{τ+1}(ξ_{τ+1})(ξ_{τ+1} + B_τ x_τ) + π̂^{2,*}_{τ+1}(ξ_{τ+1}) λ_{τ+1} + π̂^{3,*}_{τ+1}(ξ_{τ+1}) ρ_{τ+2}(ξ_{τ+1})
    ≥ π̂^{1,1}_{τ+1}(ξ_{τ+1})(ξ_{τ+1} + B_τ x_τ) + π̂^{2,1}_{τ+1}(ξ_{τ+1}) λ_{τ+1} + π̂^{3,1}_{τ+1}(ξ_{τ+1}) ρ_{τ+2}(ξ_{τ+1}),   (13)

where ρ_{τ+2} consists of the union of linear constraints required at τ+1 for the x_τ and x̂_τ cases. Since Q_{τ+1}(x_τ) is the expected value of the left hand side of (13), we obtain

Q_{τ+1}(x_τ) ≥ ρ̄^{1,1}_{τ+1} + ρ̄^{2,1}_{τ+1} + ρ̄^{3,1}_{τ+1} + π̄^{1,1}_{τ+1} B_τ x_τ   (14)

for all x_τ, where π̄^{1,1}_{τ+1} = E_{ξ_{τ+1}}[π̂^{1,1}_{τ+1}(ξ_{τ+1})] and

ρ̄^{1,1}_{τ+1} = E_{ξ_{τ+1}}[π̂^{1,1}_{τ+1}(ξ_{τ+1}) ξ_{τ+1}],   (15.1)
ρ̄^{2,1}_{τ+1} = E_{ξ_{τ+1}}[π̂^{2,1}_{τ+1}(ξ_{τ+1})] λ_{τ+1},   (15.2)
ρ̄^{3,1}_{τ+1} = E_{ξ_{τ+1}}[π̂^{3,1}_{τ+1}(ξ_{τ+1}) ρ_{τ+2}(ξ_{τ+1})].   (15.3)

Letting

ρ̄^1_{τ+1} = ρ̄^{1,1}_{τ+1} + ρ̄^{2,1}_{τ+1} + ρ̄^{3,1}_{τ+1},   (16)

we have that for all (x_τ, θ_τ) feasible in (11) for t = τ,

θ_τ ≥ ρ̄^1_{τ+1} + π̄^{1,1}_{τ+1} B_τ x_τ.   (17)

This inequality leads to a method for constructing the constraints in (12). We successively add a constraint of the form (12) whenever we solve (11), with linear constraints replacing (11.3), and find an optimal solution (x^j_τ, θ^j_τ) such that θ^j_τ < ρ̄^j_{τ+1} + π̄^{1,j}_{τ+1} B_τ x^j_τ. The constraints we add lead to an outer linearization of Q_{τ+1}(x_τ).

Murty [1968] showed that there is an optimal solution of (11) in which x_τ has at most m_τ + m_{τ+1} + 1 nonzero elements. Therefore, the greatest number of extra linear constraints in (11.3) required at the optimum is m_{τ+1} + 1. This completes the induction step. ∎

The modified version of (11) is then

min c_t x_t + θ_t   (18)
subject to: A_t x_t = ξ_t + B_{t-1} x_{t-1},   (18.1)
            D_t x_t ≥ λ_t,   (18.2)
            θ_t - π̄^{1,j}_{t+1} B_t x_t ≥ ρ̄^j_{t+1},  j = 1, 2, ..., p;   (18.3)
            x_t ≥ 0.   (18.4)

The constraints in (18.3) are successively added to (18) until a solution (x*_t, θ*_t) is found that satisfies Q_{t+1}(x*_t) ≤ θ*_t. The outer linearization interpretation of the method can be seen in Figure 1. Linear supports are placed under the convex function Q_{t+1}(x_t) until the value of θ*_t from (18) is such that θ*_t = Q_{t+1}(x*_t). The method can also be shown to be an application of Dantzig-Wolfe decomposition to the dual of (5).
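A minimal sketch of forming one constraint of type (18.3) in the two-stage case follows; the data and the use of scipy's equality-constraint duals are our illustrative assumptions.

```python
# A sketch of one optimality cut: solve each period 2 LP at the current x_1,
# average the equality duals, and build  theta >= rhobar + pibar (B_1 x_1).
import numpy as np
from scipy.optimize import linprog

A2 = np.array([[1.0, -1.0]]); c2 = np.array([3.0, 1.0])
B1 = np.array([[1.0]])
xis = [np.array([0.0]), np.array([2.0])]; probs = [0.5, 0.5]

def optimality_cut(x1):
    pibar, rhobar = np.zeros(A2.shape[0]), 0.0
    for p, xi in zip(probs, xis):
        res = linprog(c2, A_eq=A2, b_eq=xi + B1 @ x1, method="highs")
        pi = res.eqlin.marginals        # duals of A2 x2 = xi + B1 x1
        pibar += p * pi
        rhobar += p * (pi @ xi)
    return pibar, rhobar                # cut: theta >= rhobar + pibar (B1 x1)

print(optimality_cut(np.array([1.0])))  # here: (array([3.]), 3.0)
```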

The complete algorithm consists of first obtaining a feasible solution and then successively solving (11), through the approximation (18), backward from period T to period 1.

Algorithm 1.

Step 0. Set up the following problem for all t = 1, 2, ..., T, and all ξ_t:

min c_t x_t + θ_t   (19)
subject to: A_t x_t = ξ_t,   (19.1)
            x_t ≥ 0.   (19.2)

If there is no constraint on θ_t, we let θ_t = -∞ and let x*_t solve the problem omitting θ_t.

Step 1. Solve (19) for period 1 and ξ_1 = b_1. Let the resulting optimal solution be x*_1. If the period 1 problem is infeasible, stop; the problem is infeasible. If not, replace each ξ_2 at period 2 by ξ_2 + B_1 x*_1. Solve each of these period 2 problems. If any is infeasible, then add a feasibility cut as in (10) to the period 1 problem in (19) and repeat. If every period 2 problem is feasible, then let t = 2 and go to Step 2.

Step 2. For all distinct solutions {x*_t(ξ_t)} at period t, solve the period t+1 problem by replacing each ξ_{t+1} in (19) for t+1 by ξ_{t+1} + B_t x*_t(ξ_t). If any of these is infeasible, then add a constraint of type (10) to the corresponding period t problem and solve that problem. If the corresponding period t problem is infeasible, then let t = t-1 and (a) repeat Step 2 if t-1 > 1, or (b) go to Step 1 if t-1 = 1. If the period t problem is feasible, repeat Step 2. If t ≤ T-2 and all period t+1 problems are feasible, then let t = t+1 and repeat Step 2. If t = T-1 and all period t+1 problems are feasible, then set θ*_t = -∞ to correspond with every solution x*_t, t = 1, 2, ..., T-1, and go to Step 3.

Step 3. Find π̄^1_{t+1}(x*_t) and ρ̄_{t+1}(x*_t) for the current solutions of the period t+1 problems for every period t solution x*_t. If θ*_t < ρ̄_{t+1}(x*_t) + π̄^1_{t+1}(x*_t) B_t x*_t, then add a constraint of type (12) to the period t problem (19) for which x*_t was optimal. Do this for each x*_t and solve each of these period t problems. Again, use the resulting solutions and solve all period t+1 problems. If t < T-1, let t = t+1 and go to Step 2. If t = T-1, repeat Step 3. If θ*_t = ρ̄_{t+1}(x*_t) + π̄^1_{t+1}(x*_t) B_t x*_t for all x*_t, then, if t > 1, let t = t-1 and repeat Step 3, and, if t = 1, stop; the solution is optimal.
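The following compact sketch traces Algorithm 1 for T = 2 on invented data, assuming the period 2 problems are feasible for every x_1 so that the feasibility cuts of Step 2 never fire; a lower bound of -10^6 on θ stands in for θ = -∞ in Step 0. It is an illustration in modern Python, not the NDST3 code.

```python
# A two-stage (Van Slyke-Wets) version of Algorithm 1 on invented data.
import numpy as np
from scipy.optimize import linprog

A1 = np.array([[1.0, 1.0]]); b1 = np.array([2.0]); c1 = np.array([1.0, 0.5])
B1 = np.array([[1.0, 0.0]])
A2 = np.array([[1.0, -1.0]]); c2 = np.array([3.0, 1.0])
xis = [np.array([0.0]), np.array([2.0])]; probs = [0.5, 0.5]
n1, cuts = 2, []                              # cuts hold (pibar, rhobar)

for it in range(20):
    # Master (18) in variables (x_1, theta); cut rows: (pibar B1) x1 - theta <= -rhobar.
    A_ub = [np.concatenate([pibar @ B1, [-1.0]]) for pibar, _ in cuts] or None
    b_ub = [-rhobar for _, rhobar in cuts] or None
    res = linprog(np.concatenate([c1, [1.0]]),
                  A_eq=np.hstack([A1, np.zeros((1, 1))]), b_eq=b1,
                  A_ub=A_ub, b_ub=b_ub,
                  bounds=[(0, None)] * n1 + [(-1e6, None)], method="highs")
    x1, theta = res.x[:n1], res.x[n1]
    # Period 2 subproblems and a cut of type (12):
    pibar, rhobar = np.zeros(1), 0.0
    for p, xi in zip(probs, xis):
        sub = linprog(c2, A_eq=A2, b_eq=xi + B1 @ x1, method="highs")
        pibar += p * sub.eqlin.marginals
        rhobar += p * (sub.eqlin.marginals @ xi)
    Q = rhobar + pibar @ (B1 @ x1)
    if theta >= Q - 1e-8:                     # theta* = Q(x1*): stop, optimal
        break
    cuts.append((pibar, rhobar))

print(x1, c1 @ x1 + Q)
```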

Steps 1 and 2 in the algorithm are called the forward pass through all periods to obtain feasibility. Step 3 is the backward pass in a dynamic programming type recursion. We note that we have made no mention of unbounded solutions in the algorithm. This may be resolved by the procedure in Van Slyke and Wets [1969], but, in practice, we replace (19.2) by ℓ_t ≤ x_t ≤ u_t, where u_t(i) < +∞ for all i and ℓ_t(i) > -∞ for all i. This bounded variable problem then does not admit an unbounded solution, although some adjustment for nonbasic variables at their upper bounds must be made in the feasibility cuts (10). We note also that the number of possible solutions in one period may be infinite if ξ_t is allowed a continuous distribution. In practice, we assume that Ξ_t = {ξ_t^1, ..., ξ_t^{k_t}} and that P(ξ_t = ξ_t^j) = p_t^j. We also assume that ξ_t and ξ_{t'} are independent random variables for all t ≠ t'. With these assumptions, it is clear that the algorithm proceeds back from t to t-1 after a finite number of steps, and, hence, that the algorithm converges finitely.

The repeated structure of the constraint matrix, A_t, in (19) for all realizations of the random variable ξ_t allows us to avoid solving each problem independently. Instead, a single problem may be solved to yield a dual feasible basis, and other problems for which this basis is also primal feasible may be found parametrically. This procedure is described in Garstka and Rutenberg [1973]. In our implementation of Algorithm 1, this has only been done for t = T, since the problems for t < T may have additional constraints. They could also be handled, however, using a more complex data structure in which a working basis corresponding to A_t is stored and updated for different additional constraints.
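A sketch of this basis-reuse idea on invented data follows: since an optimal basis found for one realization remains dual feasible for all of them, it also solves, with no pivots, every other realization whose basic solution is nonnegative.

```python
# Grouping ("bunching") realizations by the basis that solves them.
import numpy as np

def bunch(A_B_inv, rhs_list):
    """Split right hand sides into those solved by this basis and the rest."""
    solved, remaining = [], []
    for rhs in rhs_list:
        x_B = A_B_inv @ rhs                 # basic solution (A_B)^{-1} rhs
        (solved if np.all(x_B >= -1e-9) else remaining).append(rhs)
    return solved, remaining

A_B_inv = np.linalg.inv(np.array([[2.0, 0.0], [1.0, 1.0]]))   # invented basis
rhs_list = [np.array([2.0, 3.0]), np.array([-1.0, 0.0])]
solved, remaining = bunch(A_B_inv, rhs_list)
print(len(solved), len(remaining))   # 1 realization reuses the basis, 1 does not
```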

Another difficulty that arises in implementations of Algorithm 1 is that many passes between a master problem at period t and a subproblem at period t+1 may occur without creating substantial changes in the period t problem solution. One reason for this extra work is that additional constraints on the period t problem allow more than m_t period t columns to become basic. This causes degeneracy in the period t+1 problem. Dantzig and Abrahamson [1980] observed this behavior in their experiments on deterministic multi-stage linear programs. They noticed that the period t problem basis changed very slightly from one pass to the next, and they theorized that this process could be improved with more interaction between the period t master problem and the period t+1 subproblems.

The basic degeneracy results are in the following propositions.

Proposition 3. If a constraint of the form (10) is added to (19) and is binding for some optimal solution x*_t, then every primal basic feasible solution of (19) for t+1, where (19.1) is replaced by

A_{t+1} x_{t+1} = ξ̂_{t+1} + B_t x*_t,   (20)

is degenerate.

Proof: For the binding feasibility cut, we have

-σ̂^1_{t+1} B_t x*_t = σ̂^1_{t+1} ξ̂_{t+1} + σ̂^2_{t+1} λ_{t+1},   (21)

where it is assumed that λ_{t+1} = 0. Now, let A^B_{t+1} be a feasible basis of (19) with right hand side ξ̂_{t+1} + B_t x*_t, and let x^B_{t+1} = (A^B_{t+1})^{-1}(ξ̂_{t+1} + B_t x*_t). Multiply each side of (20) by σ̂^1_{t+1} to obtain

σ̂^1_{t+1} A^B_{t+1} x^B_{t+1} = σ̂^1_{t+1}(ξ̂_{t+1} + B_t x*_t) = 0.   (22)

Since σ̂^1_{t+1} was found previously such that σ̂^1_{t+1} A_{t+1} ≤ 0, then σ̂^1_{t+1} A^B_{t+1} ≤ 0. σ̂^1_{t+1} A^B_{t+1} ≠ 0, however, since A^B_{t+1} is a basis and σ̂^1_{t+1} ≠ 0. Hence, there exists some x^B_{t+1}(j) = 0. ∎

Proposition 4. If two constraints of the form in (12) are added to (19) and are binding at an optimal solution (x*_t, θ*_t), then every set of solutions of (19) for t+1 with right hand side in (19.1) ξ_{t+1} + B_t x*_t such that θ*_t = ρ̄_{t+1} + π̄^1_{t+1} B_t x*_t includes a degenerate solution for some ξ_{t+1}.

Proof. Let the binding constraints be

θ*_t - π̄^{1,1}_{t+1} B_t x*_t = ρ̄^1_{t+1}   (23)

and

θ*_t - π̄^{1,2}_{t+1} B_t x*_t = ρ̄^2_{t+1}.   (24)

Let the optimal bases at t+1 be the set {A^B_{t+1}(ξ_{t+1})} and let the corresponding prices be {π^{1,*}_{t+1}(ξ_{t+1})} (where again additional constraints have been omitted in the period t+1 problem). This set of prices must be distinct from one of the sets which generated π̄^{1,1}_{t+1} or π̄^{1,2}_{t+1} because, when constraint (24) was added to (19) for period t, (23) had to be satisfied and (24) had to be violated. Without loss of generality, let the current set of prices be distinct from those in (23) and the same as those which generated (24).

Let c^B_{t+1}(ξ_{t+1}) denote the objective coefficients of the basic variables in A^B_{t+1}(ξ_{t+1}), and let x^B_{t+1}(ξ_{t+1}) = (A^B_{t+1}(ξ_{t+1}))^{-1}(ξ_{t+1} + B_t x*_t). It follows that

E_{ξ_{t+1}}[c^B_{t+1}(ξ_{t+1}) x^B_{t+1}(ξ_{t+1})] = θ*_t = E_{ξ_{t+1}}[π^{1,1}_{t+1}(ξ_{t+1})(ξ_{t+1} + B_t x*_t)]
    = E_{ξ_{t+1}}[π^{1,1}_{t+1}(ξ_{t+1}) A^B_{t+1}(ξ_{t+1}) x^B_{t+1}(ξ_{t+1})].   (25)

From (25) and the definition of c^B_{t+1}(ξ_{t+1}), we obtain

∫ [c^B_{t+1}(ξ_{t+1}) - π^{1,1}_{t+1}(ξ_{t+1}) A^B_{t+1}(ξ_{t+1})] x^B_{t+1}(ξ_{t+1}) dF(ξ_{t+1}) = 0,   (26)

where F is the distribution function of ξ_{t+1}. Now, for π^{1,1}_{t+1}(ξ_{t+1}) dual feasible, c^B_{t+1}(ξ_{t+1}) - π^{1,1}_{t+1}(ξ_{t+1}) A^B_{t+1}(ξ_{t+1}) ≥ 0, and x^B_{t+1}(ξ_{t+1}) ≥ 0. Since {π^{1,1}_{t+1}(ξ_{t+1})} ≠ {π^{1,*}_{t+1}(ξ_{t+1})} as assumed above, there exists some ξ̂_{t+1} such that c^B_{t+1}(ξ̂_{t+1}) ≠ π^{1,1}_{t+1}(ξ̂_{t+1}) A^B_{t+1}(ξ̂_{t+1}), and, therefore, x^B_{t+1}(ξ̂_{t+1}) must be degenerate. ∎
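A small numeric check of Proposition 3 on invented data may make the mechanism concrete: with λ_{t+1} = 0 and the cut binding, σ̂^1_{t+1}(ξ̂_{t+1} + B_t x*_t) = 0, and since σ̂^1_{t+1} A^B_{t+1} ≤ 0 with σ̂^1_{t+1} nonzero, some basic variable must vanish.

```python
# Numeric illustration of Proposition 3 (all data invented).
import numpy as np

sigma1 = np.array([-1.0, 0.0])            # satisfies sigma1 @ A_B <= 0
A_B = np.array([[1.0, 0.0], [1.0, 1.0]])  # a feasible basis at period t+1
rhs = np.array([0.0, 3.0])                # xi + B_t x_t* with the cut binding
assert abs(sigma1 @ rhs) < 1e-12          # binding cut: sigma1 @ rhs = 0
print(np.linalg.solve(A_B, rhs))          # [0. 3.]: a degenerate basic solution
```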

This degeneracy can force the period t+1 subproblem to be overly restricted. Abrahamson [1980] suggests an alternative in the deterministic model in which columns are passed forward from the master problem to the subproblem. This method does not directly apply to the stochastic case, as it is not known beforehand which value of ξ_{t+1} will obtain a different optimal basis. A proposal to allow for this coupling is presented in Birge [1980], but it creates an additional intermediary problem between the master problem of period t and the period t+1 subproblems. This difficulty motivated the development of the following algorithm based on the piecewise linearity of Q_{t+1}(x_t) for discrete random vectors ξ_{t+1}.

3. A Piecewise Linear Partitioning Method

The approach followed in this section is restricted to the case where all random variables are discrete. We again optimize between a master problem in period t and a subproblem in period t+1, but the two problems are more closely connected. The basic idea is similar to that used by Rosen [1963]. A convex function is minimized on a subregion within the feasible region, and a boundary or interior point solution is found. If the optimum occurs at a boundary point, then optimizations are performed in all subregions adjacent to that point. If no adjacent region yields a better value or if the solution is interior, then the procedure stops and the current point is optimal.

We begin the development of this procedure by discussing the basic optimality conditions between the period t+1 subproblem and the period t master problem. Let the set of optimal bases of (6) for t+1 be

{A^{B,1}_{t+1}, A^{B,2}_{t+1}, ..., A^{B,k}_{t+1}}. We note that:

Q_{t+1}(x*_t) = Σ_{j=1}^{k} p^j_{t+1} c^{B,j}_{t+1} (A^{B,j}_{t+1})^{-1} (ξ^j_{t+1} + B_t x*_t)   (27)

for x*_t an optimal solution from period t. Because of dual feasibility,

Q_{t+1}(x_t) = Σ_{j=1}^{k} p^j_{t+1} c^{B,j}_{t+1} (A^{B,j}_{t+1})^{-1} (ξ^j_{t+1} + B_t x_t)   (28)

as long as

(A^{B,j}_{t+1})^{-1} (ξ^j_{t+1} + B_t x_t) ≥ 0.   (29)

We may therefore solve (6) on linear pieces of Q_{t+1}(x_t) by solving:

min z_t = [c_t + Σ_{j=1}^{k} p^j_{t+1} c^{B,j}_{t+1} (A^{B,j}_{t+1})^{-1} B_t] x_t + Σ_{j=1}^{k} p^j_{t+1} c^{B,j}_{t+1} (A^{B,j}_{t+1})^{-1} ξ^j_{t+1}   (30)

subject to: A_t x_t = ξ_t + B_{t-1} x_{t-1},   (30.1)
            (A^{B,j}_{t+1})^{-1} B_t x_t ≥ -(A^{B,j}_{t+1})^{-1} ξ^j_{t+1},  j = 1, ..., k,   (30.2)
            x_t ≥ 0.   (30.3)
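The sketch below (invented data, our illustration rather than the PCST2 code) assembles the pieces of (30): each period t+1 basis contributes a cost term to the objective and the block (A^{B,j}_{t+1})^{-1} B_t x_t ≥ -(A^{B,j}_{t+1})^{-1} ξ^j_{t+1} to (30.2).

```python
# Assembling the objective and constraint rows of (30) from given bases.
import numpy as np

def build_master_pieces(bases, cBs, xis, probs, B_t):
    obj_lin = np.zeros(B_t.shape[1])     # term added to c_t in (30)
    obj_const = 0.0                      # constant term of (30)
    G, h = [], []                        # rows of (30.2): G x_t >= h
    for A_B, cB, xi, p in zip(bases, cBs, xis, probs):
        A_inv = np.linalg.inv(A_B)
        obj_lin += p * (cB @ A_inv @ B_t)
        obj_const += p * (cB @ A_inv @ xi)
        G.append(A_inv @ B_t)
        h.append(-(A_inv @ xi))
    return obj_lin, obj_const, np.vstack(G), np.concatenate(h)

bases = [np.array([[1.0]]), np.array([[2.0]])]
cBs = [np.array([3.0]), np.array([4.0])]
xis = [np.array([0.0]), np.array([2.0])]
print(build_master_pieces(bases, cBs, xis, [0.5, 0.5], np.array([[1.0]])))
```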

The solution of (30) allows the choice of x_t to be optimal in one step for a given set of bases in t+1. The advantage over the decomposition method is that this process may avoid switching among period t solutions which correspond to a single set of period t+1 bases.

After solving (30), we obtain an optimal solution, x^0_t, and consider the set of binding constraints in (30.2). We define

T = {(i, j) | [(A^{B,j}_{t+1})^{-1} (ξ^j_{t+1} + B_t x^0_t)](i) = 0},   (31)

where i corresponds to the binding row and j corresponds to the binding subproblem at t+1; a row of a matrix M is denoted M(i, ·), and a column is denoted M(·, i). These are the places of degeneracy in the subproblems. We next order the pairs (i, j) ∈ T lexicographically and write T = {τ_1, τ_2, ..., τ_P}.

If T = ∅, then c_t x^0_t + Q_{t+1}(x^0_t) ≤ c_t x_t + Q_{t+1}(x_t) for all feasible x_t by convexity, so x^0_t is optimal in (6). If T ≠ ∅, consider τ_1 = (i_1, j_1). We have

x^{B,j_1}_{t+1} = (A^{B,j_1}_{t+1})^{-1} (ξ^{j_1}_{t+1} + B_t x^0_t),   (32)

and

x^{B,j_1}_{t+1}(i_1) + [(A^{B,j_1}_{t+1})^{-1}(i_1, ·) A^{N,j_1}_{t+1}] x^{N,j_1}_{t+1} = 0,   (33)

where x^{N,j_1}_{t+1} are the nonbasic variables in subproblem j_1 at t+1 and A^{N,j_1}_{t+1} is the partition of A_{t+1} corresponding to those variables. We want to force x^{B,j_1}_{t+1}(i_1) from the basis in order to enter an adjacent feasible region. We find the entering variable, x_{t+1}(s), according to the dual simplex criterion (Dantzig [1963]),

(c_{t+1}(s) - π_{t+1} A_{t+1}(·, s)) / (-Ā_{t+1}(i_1, s)) = min_j {(c_{t+1}(j) - π_{t+1} A_{t+1}(·, j)) / (-Ā_{t+1}(i_1, j)) : Ā_{t+1}(i_1, j) < 0},

where Ā_{t+1}(i_1, j) = [(A^{B,j_1}_{t+1})^{-1} A_{t+1}(·, j)](i_1).   (34)
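A sketch of the ratio test (34) on invented data: among nonbasic columns with a negative entry in pivot row i_1, the entering column minimizes reduced cost over the magnitude of that entry.

```python
# Dual simplex entering-variable selection, as in (34).  Data are invented.
import numpy as np

def dual_simplex_entering(c_N, pi, A_N, pivot_row):
    """c_N: nonbasic costs; pi: prices; A_N: nonbasic columns of A_{t+1};
    pivot_row: row i_1 of (A_B)^{-1} A_N.  Returns entering index or None."""
    reduced = c_N - pi @ A_N              # >= 0 at a dual feasible basis
    best_s, best_ratio = None, np.inf
    for s in range(A_N.shape[1]):
        if pivot_row[s] < -1e-9:          # candidates with Abar(i_1, s) < 0
            ratio = reduced[s] / (-pivot_row[s])
            if ratio < best_ratio:
                best_s, best_ratio = s, ratio
    return best_s                         # None signals no adjacent basis here

c_N = np.array([2.0, 5.0]); pi = np.array([1.0])
A_N = np.array([[1.0, -2.0]]); pivot_row = np.array([1.0, -2.0])
print(dual_simplex_entering(c_N, pi, A_N, pivot_row))   # column 1 enters
```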

Pivoting x_{t+1}(s) in to replace x^{B,j_1}_{t+1}(i_1) results in a new basis Ā^{B,j_1}_{t+1} = A^{B,j_1}_{t+1} Γ, where Γ is an elementary matrix. We next form the following auxiliary problem by replacing A^{B,j_1}_{t+1} with Ā^{B,j_1}_{t+1}:

min z̄_t = [c_t + Σ_{j=1}^{k} p^j_{t+1} c^{B,j}_{t+1} (Ā^{B,j}_{t+1})^{-1} B_t] x_t + Σ_{j=1}^{k} p^j_{t+1} c^{B,j}_{t+1} (Ā^{B,j}_{t+1})^{-1} ξ^j_{t+1}   (35)

subject to: A_t x_t = ξ_t + B_{t-1} x_{t-1},   (35.1)
            (Ā^{B,j}_{t+1})^{-1} B_t x_t ≥ -(Ā^{B,j}_{t+1})^{-1} ξ^j_{t+1},  j = 1, ..., k,   (35.2)
            x_t ≥ 0,   (35.3)

where Ā^{B,j}_{t+1} = A^{B,j}_{t+1} for all j ≠ j_1. Problem (35) has a different objective function and some different constraints from (30). x^0_t is feasible in (35) but may not be optimal. The algorithm proceeds by solving (35) to obtain x̄^0_t. If z̄^0_t < z^0_t, then we let x^1_t = x̄^0_t and z^1_t = z̄^0_t, form a new set T for the binding constraints in (35.2), and solve a new auxiliary problem for z̄^1_t to be compared with z^1_t.

If z̄^0_t ≥ z^0_t, then we proceed to τ_2 = (i_2, j_2) ∈ T and form a new auxiliary problem by pivoting to an adjacent basis in j_2. We increment the index on z_t whenever a better solution is found until we obtain some z^M_t such that no z̄^M_t < z^M_t. In this case, the solution x^M_t is optimal in (6). The procedure creates a decreasing sequence of values, z^0_t > z^1_t > ... > z^M_t, so that, because there are only a finite number of basis sets for the period t+1 problems, it must converge finitely. An example of the route of the optimization is in Figure 2. The problem is solved along piecewise linear sections of the curve Q_{t+1}(x_t). It terminates when no improving direction is found.

Substantial reductions in storage can also be obtained in this procedure by only using distinct bases in the sets of constraints (30.2) and (35.2). Many period t+1 problems may have the same optimal basis. We let B = {A^{B,ℓ}_{t+1}, ℓ = 1, 2, ..., q} be the set of distinct optimal bases in period t+1. We define

p̄^ℓ_{t+1} = Σ_{j ∈ J(ℓ)} p^j_{t+1},   (36)

where J(ℓ) = {j : A^{B,j}_{t+1} = A^{B,ℓ}_{t+1}}. We also need

ξ̄^ℓ_{t+1}(i) = max_{j ∈ J(ℓ)} {-[(A^{B,ℓ}_{t+1})^{-1} ξ^j_{t+1}](i)},   (37)

the i-th component of the right hand side in (30.2), and we store j(i, ℓ) for all i and ℓ. (30) then becomes:

min z_t = [c_t + Σ_{ℓ=1}^{q} p̄^ℓ_{t+1} c^{B,ℓ}_{t+1} (A^{B,ℓ}_{t+1})^{-1} B_t] x_t + Σ_{j=1}^{k} p^j_{t+1} c^{B,j}_{t+1} (A^{B,j}_{t+1})^{-1} ξ^j_{t+1}   (38)

subject to: A_t x_t = ξ_t + B_{t-1} x_{t-1},   (38.1)
            (A^{B,ℓ}_{t+1})^{-1} B_t x_t ≥ ξ̄^ℓ_{t+1},  ℓ = 1, ..., q;   (38.2)
            x_t ≥ 0.   (38.3)

The elements in T are sorted by (i, j(i, ℓ)) for binding constraints on (38.2), and, whenever a new basis is found, it must be added to B. We note that the constant term in the objective of (38) must be updated for each individual ξ^j_{t+1}, but that this does not affect the optimization. The same iterative procedure as above can again be applied. The modification can, however, substantially reduce the number of constraints. A parametric procedure can again be used to efficiently generate the set of optimal bases in the period t+1 problem. The entire multi-stage piecewise algorithm proceeds as in the decomposition by first obtaining a feasible solution and then iterating backward to obtain optimality from period t to T for t = T-1, T-2, ..., 1.
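A sketch of the bookkeeping in (36)-(37) on invented data follows: realizations sharing an optimal basis are grouped, their probabilities summed, and one constraint block per distinct basis is kept with a componentwise maximum right hand side.

```python
# Grouping realizations by distinct optimal basis, as in (36)-(37).
import numpy as np

def aggregate(basis_ids, probs, A_inv_by_basis, xis):
    groups = {}
    for j, (b, p) in enumerate(zip(basis_ids, probs)):
        g = groups.setdefault(b, {"pbar": 0.0, "rhs": None, "members": []})
        g["pbar"] += p                              # pbar_l, eq. (36)
        r = -(A_inv_by_basis[b] @ xis[j])           # row rhs for realization j
        g["rhs"] = r if g["rhs"] is None else np.maximum(g["rhs"], r)  # eq. (37)
        g["members"].append(j)                      # remembers j(i, l) candidates
    return groups

A_inv_by_basis = {0: np.array([[0.5]])}
print(aggregate([0, 0], [0.5, 0.5], A_inv_by_basis,
                [np.array([2.0]), np.array([4.0])]))
```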

Algorithm 2.

Step 1. Follow Steps 0, 1, and 2 of Algorithm 1 to obtain a feasible solution for the entire problem. Let t = T-1, let j_t = 1 for all t, and go to Step 2.

Step 2. Solve the j_t-th problem of the form in (38) for t and x*_{t-1} = x*_{t-1}(j_2, ..., j_{t-1}) in (38.1), where x*_{t-1}(j_2, ..., j_{t-1}) is the current solution corresponding to the realizations ξ^{j_2}_2, ..., ξ^{j_{t-1}}_{t-1} of the random variables. Construct T, let ℓ = 1, and go to Step 3.

Step 3. If ℓ > P (the cardinality of T), let Λ = max{1; λ s.t. j_λ < k_λ for 2 ≤ λ ≤ t-1}. If Λ = 1 and t = 1, stop; the current solution is optimal. If Λ = 1 and t > 1, let t = t-1, j_λ = 1 for all λ, and go to Step 2. If Λ > 1, let j_Λ = j_Λ + 1, let j_λ = 1 for all λ > Λ, and go to Step 2. If ℓ ≤ P, set up the auxiliary problem corresponding to (38) with Ā^{B,ℓ}_{t+1} substituted for A^{B,ℓ}_{t+1}. If the resulting objective value z̄^0_t < z^0_t, let z^0_t = z̄^0_t, x^0_t = x̄^0_t, and go to Step 2. If z̄^0_t ≥ z^0_t, let ℓ = ℓ + 1 and repeat Step 3.

Unboundedness is taken care of as in Algorithm 1 by bounding variables. The algorithm converges to an optimal or infeasible solution because each period t master-period t+1 subproblem pair is solved finitely, and t is decreased whenever all period t problems have been solved. We note that if several bases are adjacent to a single basis in forming the auxiliary problem, then we consider each of these and correspondingly increase P to allow for duplication. We note also that primal feasibility is maintained throughout Steps 2 and 3 of Algorithm 2, so that additional feasibility constraints are not required after Step 1. Furthermore, we observe that the bases for period t problems steadily

increase in size as they are passed back from period T, because they must ensure feasibility for all subsequent periods. In this case, then, Algorithm 2 solves much larger problems than Algorithm 1 as additional periods are incorporated into the solution. This also makes it difficult to use the method of only keeping distinct bases for any period but the last. Algorithm 2, therefore, has the advantage of maintaining some connections between master and subproblems and of preserving primal feasibility. It may, however, be less efficient on these problems because of their increased size. In the next section, we compare these algorithms on a set of test problems.

4. Computational Results

Two FORTRAN programs, NDST3 and PCST2, were written to implement Algorithms 1 and 2, respectively. The linear programs in (19) and (38) were solved using LPM-1 as a subroutine. LPM-1 was written by J. A. Tomlin and revised by G. Kochman at the Systems Optimization Laboratory, Stanford University. It employs compact storage of non-zero entries in the coefficient matrix, performs an LU-decomposition of the basis, and uses a merit counting sort to maintain sparsity in the inverse (Pfefferkorn and Tomlin [1976]). NDST3 currently handles problems with up to T = 4 periods and up to 1,000 realizations of the right hand side vector. Each period t problem is limited to 350 rows and 600 columns. The number of nonzero elements in any of these problems is less than or equal to 3,000. NDST3 solves period T problems by parametrizing the right hand side and only storing distinct bases. For t < T, however, each problem is solved independently.

PCST2 currently applies only to the two-stage case. In attempting to implement the code for T > 2, it was found that storage requirements were greater for the same size problem if the problem was broken into T > 2 periods instead of solving it with T = 2. The size of the individual problems at period t would also not be reduced. In the case of Algorithm 2, then, it was assumed that, unless a data structure is used which enables one to keep only distinct bases for t < T, the problem should only be solved in the two-stage format. PCST2 allows for 125 realizations of the right hand side in the subproblems and handles the same size subproblems as does NDST3. PCST2 also uses parameterization in the second period and stores only distinct bases.

The test problems used are taken from Ho and Loute [1981]. They are practical deterministic multi-stage problems with staircase structure, as in the constraint matrix of (1). These problems were chosen instead of randomly generated problems because it was thought they would be more representative of actual applications. For comparative purposes, all problems were limited to 3 periods, and the number of right hand side realizations was limited to 8. This size limitation enabled the problems to be solved using a standard linear programming code on the full deterministic equivalent problem and provided a check on solution values. It should be noted, however, that the greatest advantage from the techniques would probably come from problems whose full deterministic equivalent linear programs are too large for standard mathematical programming packages.

The problems were taken from the following areas of application:

Problem   Application
SC205     Economic development
SCAGR7    Agricultural planning
SCRS8     Dynamic energy modeling
SCSD8     Structural design optimization
SCTAP1    Dynamic traffic assignment
SCFXM1    Production scheduling

Their statistics, where "Totals" refers to the size of the deterministic equivalent linear program, appear in Table I. The optimal values in Table I are the result of solutions using the IBM mathematical programming package, MPS/360 (IBM [1973]), on The University of Michigan's Amdahl 470/V8 computer. The number of iterations for MPS/360 to obtain the optimal solution appears in Table II. Unfortunately, CPU times for the solutions are not available for MPS/360 on the Michigan MTS operating system. The number of iterations is given only for rough comparisons with the results for NDST3 and PCST2. It should be noted that these iterations are on larger problems than those solved within NDST3 and PCST2.

The test problems were each solved by NDST3 and PCST2 on the Amdahl 470/V8 using the standard FORTRAN-G compiler without optimization (The University of Michigan Computing Center [1980]). The results for breaking the problem into 2 and 3 period problems for NDST3 are both reported in Table II. CPU seconds do not include problem input or solution output. "Iterations" are the number of simplex pivots, and "passes" are the number of distinct subproblems solved. (The number of

passes is augmented only once for t = T.) Results for PCST2 are also listed in Table II.

In the first five problems, NDST3 with T = 2 performed the best. There appeared to be no advantage from further decomposing the problem into T = 3 periods or from employing the partitioning approach in PCST2. These results should not be considered conclusive, however, considering the sample size and the absence of advanced storage techniques for the t < T cases. The basic reason for good performance relative to MPS/360 appears to be the elimination of repeated solutions for different subproblems. In the first five problems, degeneracy also did not appear to be a problem in the nested decomposition approaches. PCST2 reduced the number of passes but not enough to compensate for its solving larger problems.

The example SCFXM1 displayed some behavior that can be especially confounding for both NDST3 and PCST2. In the process of obtaining a feasible solution, a large number of feasibility cuts (10) were generated. These cuts were very dense and hence required the storage of a large number of nonzero elements. In fact, after 23 cuts had been placed on the period 1 problem, the maximum number of nonzero elements, 3,000, had been exceeded. Since the period 1 problem could conceivably have m_2 completely dense cuts of this type, an upper bound on the number of elements for this period 1 problem is m_2 n_1, or 8,100 in this case. This number is greater than the number of nonzeroes in the deterministic equivalent linear program. This situation is one of the potential difficulties involved in using decomposition or partitioning procedures. It does not appear to be common, as smaller problems have not exhibited this property (Birge [1980]),

but, in solving programs as in (1), some check on the growth of nonzeroes should be incorporated into the solution procedure.

It should also be noted that, in general, the subproblems solved in NDST3 are much smaller than the full linear program. For T = 2, if (i) v(1) = number of nonzeroes in A_1, (ii) v(2) = number of nonzeroes in B_1, and (iii) v(3) = number of nonzeroes in A_2, then the full linear program has v(1) + k_2(v(2) + v(3)) nonzero elements (plus the objective row and right hand sides), whereas the period 1 problem in NDST3 has at most v(1) + n_1 Π nonzeroes, where Π is the number of passes. When Π is small, as in the first five examples, this value is small.

5. Summary

Two algorithms for multi-stage stochastic linear programs were presented. The methods were based on decomposition and partitioning procedures that are common in large-scale mathematical programming. Computer codes for these algorithms were applied to a set of practical problems from applications in a variety of areas. The methods solved five of these problems in substantially fewer iterations than a standard linear programming approach, but one problem required an excessive number of nonzero elements to be added to the master problem. The results seem to indicate that gains in efficiency come from reducing the number of bases stored for the subproblems and from decreasing the problem sizes.

[Figure omitted in this text version.]

Figure 1. Outer Linearization of Q_{t+1}(x_t).

[Figure omitted in this text version.]

Figure 2. Piecewise Linear Paths of Q_{t+1}(x_t).

Table I. Problem Statistics

                Within-period rows    Within-period cols                 Totals
Problem          Max       Min         Max       Min        Rows    Cols    Els(1)   Den(2)   Optimal Value
SC205             13        11          14        11         190     380      701     .97          60.4
SCAGR7            15        19          20        20         320     660     1693     .80      832903.5
SCRS8             31        28          38        37         254     342     1133     1.3         123.4
SCTAP1            30        30          48        48         511     817     3814     .91         240.5
SCSD8             10        10          70        70         171    1191     4867     2.4          16.5
SCFXM1            92        82         126        99        1277    1915     7952     .33        2877.6

(1) Nonzero elements.
(2) Percentage of elements that are nonzero.

Table II. Results

             MPS/360        NDST3 (T = 2)          NDST3 (T = 3)             PCST2
Problem       It(a)       CPU(b)   It    Ps(c)    CPU     It     Ps      CPU    It    Ps
SC205          139          .07    24      8       .26     28     25      .13    24     7
SCAGR7         409          .49   100     12       .61    117     49      .93    69    11
SCRS8          203          .11    28      4       .29     23     19      .25    28     3
SCSD8          381          .15    11      4       .39     52     19      .34    11     3
SCTAP1         337          .43    64      4       .76     92     35      .71    63     4
SCFXM1        1742          (d)   (d)    (d)      (e)     (e)    (e)     (d)   (d)   (d)

(a) Number of simplex method pivots.
(b) CPU seconds excluding problem input and solution output.
(c) Number of distinct subproblems solved.
(d) Exceeded the nonzero element limit after 40 passes and over 800 iterations; the last solution was 25% from optimal.
(e) Exceeded the nonzero element limit after 49 passes and over 1,000 iterations; the last solution was 35% from optimal.

REFERENCES

Abrahamson, P.G. 1980. A Nested Decomposition Approach for Solving Staircase Structured Linear Programs. In Proceedings of the Workshop on Large-Scale Linear Programming. IIASA, Laxenburg, Austria, June 2-6.

Beale, E.M.L. 1955. On Minimizing a Convex Function Subject to Linear Inequalities. Journal of the Royal Stat. Soc. B17, 173-184.

Beale, E.M.L., J.J.H. Forrest and C.J. Taylor. 1980. Multi-time Period Stochastic Programming. In Stochastic Programming, M.A.H. Dempster (ed.). Academic Press, New York.

Benders, J.F. 1962. Partitioning Procedures for Solving Mixed-variables Programming Problems. Numerische Mathematik 4, 238-252.

Birge, J.R. 1980. Solution Methods for Stochastic Dynamic Linear Programs. Technical Report SOL 80-29, Systems Optimization Laboratory, Stanford University.

Birge, J.R. 1981. The Value of the Stochastic Solution in Stochastic Linear Programs with Fixed Recourse. Technical Report 81-10, Department of Industrial and Operations Engineering, The University of Michigan. Also to appear in Mathematical Programming.

Dantzig, G.B. 1955. Linear Programming under Uncertainty. Mgmt. Sci. 1, 197-206.

Dantzig, G.B. 1963. Linear Programming and Extensions. Princeton University Press, Princeton, NJ.

Dantzig, G.B. and A. Madansky. 1961. On the Solution of Two-stage Linear Programs under Uncertainty. In Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability. University of California Press, Berkeley.

Dantzig, G.B. and P. Wolfe. 1960. Decomposition Principle for Linear Programs. Opns. Res. 8, 101-111.

Everett, R. and W.T. Ziemba. 1979. Two-Period Stochastic Programs with Simple Recourse. Opns. Res. 27, 485-502.

Fourer, R. 1979. Solving Staircase L.P.'s by the Simplex Method. Technical Report SOL 79-18, Systems Optimization Laboratory, Stanford University.

Garstka, S. and D. Rutenberg. 1973. Computation in Discrete Stochastic Programming with Recourse. Opns. Res. 21, 112-122.

Geoffrion, A.M. 1970. Elements of Large-Scale Mathematical Programming. Mgmt. Sci. 16, 652-675.

Ho, J.K. and E. Loute. 1981. A Set of Staircase Linear Programming Test Problems. Math. Program. 20, 245-250.

Ho, J.K. and A.S. Manne. 1974. Nested Decomposition for Dynamic Models. Math. Program. 6, 121-140.

IBM. 1973. Mathematical Programming System/360 Application Description (H20-0136-3).

Kall, P. 1979. Computational Methods for Solving Two-Stage Stochastic Linear Programming Problems. Journal of App. Math. and Physics 30, 261-271.

Kallio, M. and E.L. Porteus. 1977. Decomposition of Arborescent Linear Programs. Math. Program. 13, 348-355.

Murty, K.G. 1968. Two-Stage Linear Programming under Uncertainty: a Basic Property of the Optimal Solution. Z. Wahrsch. u. Verw. Geb. 10, 284-288.

Pfefferkorn, C.E. and J.A. Tomlin. 1976. Design of a Linear Programming System for ILLIAC IV. Technical Report SOL 78-7, Systems Optimization Laboratory, Stanford University.

Rosen, J.B. 1963. Convex Partition Programming. In Recent Advances in Mathematical Programming, R.L. Graves and P. Wolfe (eds.). McGraw-Hill, New York.

Strazicky, B. 1980. Some Results Concerning an Algorithm for the Discrete Recourse Problem. In Stochastic Programming, M.A.H. Dempster (ed.). Academic Press, New York.

The University of Michigan Computing Center. 1980. MTS Volume 6: FORTRAN in MTS.

Van Slyke, R. and R. Wets. 1969. L-Shaped Linear Programs with Applications to Optimal Control and Stochastic Linear Programs. SIAM J. on App. Math. 17, 638-663.

Wets, R. 1966. Programming under Uncertainty: The Solution Set. SIAM J. on App. Math. 14, 1143-1151.

Wets, R. 1975. Solving Stochastic Programs with Simple Recourse, II. Working Paper, Department of Mathematics, University of Kentucky.
