AN EXPLICIT FACTORIZATION FOR SOLVING MULTISTAGE STOCHASTIC LINEAR PROGRAMS USING INTERIOR POINT METHODS

Derek Holmes
Department of Industrial and Operations Engineering
The University of Michigan
Ann Arbor, MI 48109-2117

Technical Report 93-18
July 1993

Abstract: Multistage stochastic linear programs (MSLPs) have a deterministic representation which is a generalization of a dual block angular program. These programs have been shown in the literature to be difficult to solve using interior point methods, since they require solutions of very large and dense systems. Several methods for improving solution efficiencies for two-stage SLPs have been suggested in the literature, but not for multistage programs. Computational experience suggests that a direct factorization based on the Sherman-Morrison-Woodbury identity is superior in terms of numerical stability and speed. This paper derives a multistage generalization of the direct factorization technique and discusses its theoretical properties.

1 Introduction

Many practical problems that are influenced by uncertainty can be modeled as stochastic programs. Examples of problems that have been formulated include cash and portfolio management models (Mulvey [1]), electric power generation capacity planning (Louveaux [2]), and forestry management (Gassman [3]). Stochastic programs are often multistage in nature: decisions must be made sequentially, before the realization of stochastic parameters that unfold over time. Another common characteristic is that they relate stochasticity linearly to decision variables and costs. Such problems are called multistage stochastic linear programs with recourse (MSLPs).

One of the most undesirable characteristics of MSLPs is their excessive computational requirements. Since the size of these problems grows exponentially with the amount of stochastic information included in the formulation and the number of stages (discrete decision points), realistically sized problems can quickly become computationally intractable. The recent advent of interior point methods for the solution of large linear programs (Marsten et al. [4], Carolan et al. [5], and Monma and Morton [6]) has held great promise for the efficient solution of these problems. To be effective, however, these algorithms require the efficient solution of a sequence of large, symmetric, positive definite systems of linear equations. Generally, the solutions to these systems are obtained by factoring the coefficient matrix into some equivalent triangular matrix and back solving with some right hand side. The ease with which the factorizations are obtained decreases significantly as the density of the coefficient matrix increases. Unfortunately, the structure of MSLPs can lead to quite dense systems, limiting the effectiveness of interior point methods for their solution.

The purpose of this paper is to propose an efficient factorization technique for solving these systems. The factorization is an extension of the two-stage Birge-Qi factorization (Birge and Qi [7]), which is based on a generalized version of the Sherman-Morrison-Woodbury formula. The two-stage method has been shown to have a worst-case computational complexity at least an order of the number of variables better than that of the standard Karmarkar algorithm. We will show here that the multistage generalization has the same complexity in the number of variables, but grows exponentially in the number of stages. This is still troublesome, and methods for overcoming this restriction will be the focus of future research.

2 Preliminaries

2.1 Multistage Stochastic Linear Programs

Here, we will assume that we are trying to solve multistage stochastic linear programs with fixed recourse. If the number of stages H is finite, the general form of this problem is

    min_{y_0} c_0^T y_0 + E_{\xi_1} [ min_{y_1} c_1^T y_1 + ... + E_{\xi_H} [ min_{y_H} c_H^T y_H ] ... ]
    s.t.  A_0 y_0 = b_0
          T_0 y_0 + W_1 y_1 = \xi_1                                  a.s.
          ...
          T_{H-1} y_{H-1} + W_H y_H = \xi_H                          a.s.
          u_0 >= y_0 >= l_0,   u_t >= y_t >= l_t,  t = 1, ..., H     a.s.

where boldface vectors are (possibly) stochastic. Here, the stochastic elements are defined over a discrete canonical probability space (\Xi, \sigma(\Xi), P), where \Xi = \Xi_1 x ... x \Xi_H, and the S_t elements of \Xi_t are {(T_{ts}, W_{ts}, \xi_{ts}, c_{ts}), s = 1, ..., S_t}. The requirement that each stochastic constraint must hold almost surely may be written deterministically by defining a set of constraints for each realization.
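Since every almost-sure constraint is replicated for each realization, the deterministic form grows multiplicatively in the branching structure. The following small Python sketch (ours, not from the report; the per-node row and column counts are hypothetical placeholders) illustrates how quickly the deterministic equivalent grows with the number of stages.

```python
# Illustrative only: count scenario nodes and the (approximate) size of the
# deterministic equivalent for a tree with branching[t] realizations in
# stage t+1. The per-node dimensions m_node, n_node are made-up placeholders.

def deterministic_equivalent_size(branching, m_node=10, n_node=20):
    nodes_per_stage = [1]                      # stage 0: the root problem
    for N_t in branching:
        nodes_per_stage.append(nodes_per_stage[-1] * N_t)  # cumulative scenarios
    total = sum(nodes_per_stage)
    return total, total * m_node, total * n_node

for H in range(1, 7):
    nodes, rows, cols = deterministic_equivalent_size([3] * H)
    print(f"{H} stages of branching: {nodes} nodes, ~{rows} rows, ~{cols} columns")
```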

The data dependencies for a three-stage problem with two realizations in each stage are shown graphically in Figure 1. The paths of the tree shown in Figure 1 correspond to scenarios that describe complete events up to the last stage.

[Figure 1: Multistage deterministic equivalents. Scenario tree for a three-stage problem rooted at (b_0, A_0), with data (\xi_{j,t}, T_{j,t}, W_{j,t}) attached to each node.]

Scenarios can also be defined for an intermediate period t; these correspond to paths from the root node of the tree to nodes in the t-th stage. Defining a set of variables for each node in the decision tree, the deterministic equivalent program can be written as

    min  c_0^T y_{1,0} + \sum_{k=1}^{\bar{N}_1} p_{k,1} c_{k,1}^T y_{k,1} + ... + \sum_{k=1}^{\bar{N}_H} p_{k,H} c_{k,H}^T y_{k,H}
    s.t. A_0 y_{1,0} = b_0                                                                          (1)
         T_{j,t} y_{a(j,t),t-1} + W_{j,t} y_{j,t} = \xi_{j,t},   j = 1, ..., \bar{N}_t,  t = 1, ..., H
         u_{j,t} >= y_{j,t} >= l_{j,t},                          j = 1, ..., \bar{N}_t,  t = 1, ..., H

where

    N_t       = number of possible outcomes in stage t
    \bar{N}_t = cumulative number of scenarios through stage t, \bar{N}_t = N_1 x ... x N_t
    p_{i,t}   = probability that scenario i of stage t occurs, i = 1, ..., \bar{N}_t
    c_{i,t}   = cost vector for scenario i of stage t
    \xi_{i,t} = right hand side vector for scenario i of stage t
    T_{i,t}   = technology matrix for scenario i of stage t
    W_{i,t}   = recourse matrix for scenario i of stage t, W_{i,t} \in R^{m_{i,t} x n_{i,t}}
    a(j,t)    = \lfloor (j-1)/N_t \rfloor + 1

Here, a(j,t) finds the predecessor of node j in stage t. The block structure for a two-stage problem, as well as for the three-stage problem depicted in Figure 1, is shown in Figure 2. The two-stage deterministic equivalent is called a dual block angular program.
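For illustration, the predecessor index a(j,t) is simple to compute; the sketch below (our addition, not part of the report) applies it to the two-realizations-per-stage tree of Figure 1.

```python
# Our sketch of the node indexing used in (1): nodes of stage t are numbered
# j = 1, ..., cumulative N_t, and a(j, t) = floor((j - 1)/N_t) + 1 returns
# the predecessor of node j in stage t-1.

def a(j, t, N):
    return (j - 1) // N[t] + 1

N = {1: 2, 2: 2, 3: 2}                     # two realizations per stage (Figure 1)
print([a(j, 2, N) for j in range(1, 5)])   # stage-2 nodes map to [1, 1, 2, 2]
```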

[Figure 2: Block structure for the two-stage and three-stage deterministic equivalents, with first-stage blocks (A), technology matrices (T) coupling the stages, and recourse matrices (W) on the block diagonal.]

As can be seen from Figure 2, the multistage deterministic equivalent can be viewed as several dual block angular programs nested within each other. As will be seen in the next section, dual block angular programs are not well suited for interior point methods.

2.2 Interior Point Methods

In the last decade, several breakthroughs in general purpose linear programming algorithms have been made using path following methods (Gill et al. [8]). Karmarkar [9] pioneered these breakthroughs with the first algorithm that could be proven to converge to an optimal solution in polynomial, or O(n^2 m^2 L), time, where n is the size of the problem and L is a measure of the problem's data. Other variants such as the dual affine scaling algorithm (Adler et al. [10]) and the primal-dual method (Megiddo [11]) have also been shown to converge in polynomial time. By contrast, the worst-case complexity of the simplex method cannot be bounded by a polynomial.

For the purposes of discussion, we focus on the dual affine variant as applied to the dual block angular program described above. Consider a linear program in the following standard equality form:

    (P)  minimize    c^T x
         subject to  A x = b
                     x >= 0

where A \in R^{m x n} has full row rank, b \in R^m is a resource vector, and c \in R^n is the objective function vector. The dual affine variant finds an optimal solution to the dual (D) of the problem (P):

    (D)  maximize    b^T y
         subject to  A^T y <= c

where y \in R^m is a dual vector to the equality constraints of (P). The algorithm requires an initial interior point y^0 that satisfies dual feasibility (A^T y^0 < c), and successively iterates on a current point to find another interior point with a better objective function value.

The algorithm proceeds as follows:

Dual Affine Algorithm

    1. Set k = 0. Select a stopping criterion (e.g., stop if b^T y^{k+1} - b^T y^k < \epsilon, where \epsilon > 0 is a small positive number).
    2. Stop if the optimality criterion is satisfied; otherwise calculate the dual slack variables: v^k = c - A^T y^k.
    3. Calculate the search direction, which is a function of a Newton step taken toward the "center" of the feasible region and a steepest descent step in some transformed space: let

           D_k = diag{ 1/v_1^k, ..., 1/v_n^k },   dy = (A D_k^2 A^T)^{-1} b,   dv = -A^T dy.

    4. Calculate a step size:

           \alpha = \gamma x min{ v_i^k / -(dv)_i : (dv)_i < 0, i = 1, ..., n }.

       Here, 0 < \gamma < 1 is a step size parameter that insures that the next iterate will remain interior to the feasible solution set. For most practical purposes, 0.95 <= \gamma < 1 is sufficient.
    5. Update dual variables, primal variables, and counters:
       (a) Let y^{k+1} = y^k + \alpha dy and x^{k+1} = (D_k)^2 dv.
       (b) Let k = k + 1.
       (c) Go to 2.

The vast majority of the computational effort required in the above procedure is spent calculating the solution to the system (A D^2 A^T) dy = b (the iteration counter will be dropped whenever the context is clear), or calculating some factorization of the matrix to enable quick solution of the system. These computations are common to every interior point algorithm developed thus far (Shanno and Bagchi [12]). The matrix M = A D^2 A^T is a large (M \in R^{m x m}), symmetric, positive definite matrix for which several solution methods have been developed (see, e.g., Golub and Van Loan [13]). Generally speaking, a direct inversion of M is an inefficient solution method, and it is seldom used in large scale implementations. There are two main strategies for solving the system (A D^2 A^T) dy = b: iterative methods and direct methods.

1. Iterative methods generate a sequence of approximations to dy. Since these methods only involve vector-matrix multiplication, they are computationally attractive and require less storage than alternative solution procedures. Convergence to solutions of these linear systems can be unacceptably slow, however, unless special techniques (e.g., matrix preconditioners) are employed. Examples of iterative methods include the Jacobi, Gauss-Seidel, Chebychev, Lanczos, and Conjugate Gradient methods. Meijerink and van der Vorst [14] discuss the use of these methods in interior point algorithms.

2. Direct methods calculate the exact solution to the set of equations (A D^2 A^T) dy = b by factoring the matrix (A D^2 A^T) and using backward/forward substitution to find dy. The most common schemes in use are LU factorization and Cholesky (L L^T) factorization. The effectiveness of these methods depends on the use of special data structures and pivoting rules, and on the characteristics of the coefficient matrix itself. Examples of software implementations include YSMP (Eisenstat et al. [15]) and SPARSPAK (Chu et al. [16]). A sketch of a direct, Cholesky-based step appears after this list.
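The following numpy sketch (ours; the problem data are synthetic assumptions, not taken from the report) shows one dual affine scaling step, with the search direction obtained by a dense Cholesky factorization of A D^2 A^T, i.e., a direct method.

```python
import numpy as np

# One dual affine scaling step on synthetic data (illustration only).
rng = np.random.default_rng(0)
m, n = 5, 12
A = rng.standard_normal((m, n))        # full row rank with probability one
c = rng.random(n) + 1.0                # c > 0 makes y = 0 dual feasible
b = A @ rng.random(n)
y = np.zeros(m)
gamma = 0.95                           # step size parameter

v = c - A.T @ y                        # dual slacks (must stay positive)
d = 1.0 / v                            # diagonal of D_k
M = (A * d**2) @ A.T                   # A D_k^2 A^T, scaling columns of A
L = np.linalg.cholesky(M)              # direct factorization of M
dy = np.linalg.solve(L.T, np.linalg.solve(L, b))
dv = -A.T @ dy
neg = dv < 0
alpha = gamma * np.min(v[neg] / -dv[neg])   # ratio test keeps v > 0
y += alpha * dy
print("new dual objective:", b @ y)
```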
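As a concrete illustration (our sketch, with made-up sizes and sparsity), the dense first-stage columns of a dual block angular matrix produce a nearly dense A A^T:

```python
import numpy as np
import scipy.sparse as sp

# Build a sparse dual block angular matrix [A0; T_l ... W_l] and compare the
# density of A with that of A A^T. All dimensions and densities are made up.
m0, n0, m1, n1, N = 4, 6, 3, 5, 8
A0 = sp.random(m0, n0, density=0.4, random_state=1)
rows = [[A0] + [None] * N]
for l in range(N):
    row = [sp.random(m1, n0, density=0.3, random_state=2 + l)] + [None] * N
    row[1 + l] = sp.random(m1, n1, density=0.3, random_state=100 + l)   # W_l
    rows.append(row)
A = sp.bmat(rows, format="csr")

M = (A @ A.T).toarray()
print("density of A:    %.2f" % (A.nnz / (A.shape[0] * A.shape[1])))
print("density of AA^T: %.2f" % (np.count_nonzero(M) / M.size))
```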


Direct and iterative methods can also be combined by using ideas from the direct solution procedures to generate an effective preconditioner that improves the convergence of iterative methods.

The ease with which direct methods may be used depends heavily on the amount of fill-in, or density, of the factorized matrix. Matrices which can be rearranged to minimize fill-in can be stored using less memory, and hence require fewer operations to update the factorization or obtain a solution. However, matrices that are ill-structured will generate an extremely dense matrix M, and hence can be quite inefficient to solve. The density of the matrix (A D^2 A^T) largely depends on the number of dense columns contained in the original matrix A. Unfortunately, the dual block angular program that is the deterministic equivalent of a two-stage stochastic program contains many dense columns. To see this, let D_0^2 \in R^{n_0 x n_0} and D_l^2 \in R^{n_l x n_l} be defined by D_l^2 = diag{ (v_1^l)^{-2}, ..., (v_{n_l}^l)^{-2} }, l = 0, ..., N. Suppose further that T_l = T and W_l = W, l = 1, ..., N. Then the required system to solve can be calculated symbolically as

              [ A_0 D_0^2 A_0^T   A_0 D_0^2 T^T               A_0 D_0^2 T^T               ...  A_0 D_0^2 T^T               ]
              [ T D_0^2 A_0^T     T D_0^2 T^T + W D_1^2 W^T   T D_0^2 T^T                 ...  T D_0^2 T^T                 ]
    A D^2 A^T = [ T D_0^2 A_0^T   T D_0^2 T^T                 T D_0^2 T^T + W D_2^2 W^T   ...  T D_0^2 T^T                 ]
              [      :                 :                           :                            :                          ]
              [ T D_0^2 A_0^T     T D_0^2 T^T                 T D_0^2 T^T                 ...  T D_0^2 T^T + W D_N^2 W^T   ]

Clearly, the presence of the columns associated with the T matrices creates an extremely dense M matrix to factorize. For this reason, Arantes and Birge [17] have shown that dual block angular programs in the primal form are expensive to solve, even with basic preprocessing or row reordering to reduce fill-in.

Since multistage stochastic programs are also dual block angular problems, they suffer from the same fill-in problem that plagues two-stage stochastic programs. Figure 3 shows the block structure for the multistage program shown in Figures 1 and 2. Although the matrix is not completely dense like its two-stage counterpart, it still has a substantial number of nonzeros off the diagonal. Unfortunately, the A D^2 A^T matrices include columns which have nonzeros near the top and bottom of the matrix. As a result, we would expect the matrices to be resistant to sparsity-reducing techniques such as row and column reorderings.

[Figure 3: Constraint matrix and A D^2 A^T block structure for multistage problems.]

Practically, dense projection matrices occur even in problems with relatively sparse T matrices. For example, a three-stage (two scenario) version of the problem SCAGR7 (see, e.g., Gassman [18] or Birge [19]) is shown in Figure 4. The left-hand figure is the nonzero structure of the deterministic equivalent's A A^T matrix, and the right-hand figure is the nonzero structure of A A^T's Cholesky factorization (with a minimum degree heuristic applied). As can be seen, the Cholesky factorizations can be dense and hence computationally difficult to solve.

[Figure 4: SCAGR7, 3 stages, 2 realizations per stage: nonzero structure of A A^T (left) and of its Cholesky factorization (right).]

2.3 Explicit Factorizations of Stochastic Linear Programs

The solution to the set of equations that determine search directions in affine scaling algorithms may also be accomplished by decompositions specific to the dual block angular structure. Birge and Qi [7] propose using a generalized version of the Sherman-Morrison-Woodbury formula (reviewed below) for the inverse of a matrix to efficiently obtain the search direction dy. While the full calculation of the step in affine scaling may generally be performed in O(m^2 n) operations, the decomposition they propose reduces the computational complexity of the two-stage program to O(n^2) operations (assuming n_l ~ n for all l = 0, ..., N). After reviewing the two-stage factorization, we will posit its theoretical extension to multiple stages and discuss its implementation.

2.3.1 Two-stage Birge and Qi factorization

The main result obtained by Birge and Qi notes that the matrix A D^2 A^T may be written as the sum of a block diagonal matrix and the product of two similar matrices U and V (defined below). Given this representation, the Sherman-Morrison-Woodbury formula, which is reproduced here for convenience, may be used to find the inverse of the matrix. The notation used in the following follows that of Section 2.2.

Lemma 1: For any matrices A, U, V such that A and (I + V^T A^{-1} U) are invertible,

    (A + U V^T)^{-1} = A^{-1} - A^{-1} U (I + V^T A^{-1} U)^{-1} V^T A^{-1}.                       (2)

Proof: See, e.g., Golub and Van Loan [13].
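Lemma 1 is easy to check numerically; the following short sketch (ours, on random data) verifies the identity:

```python
import numpy as np

# Numerical check of the Sherman-Morrison-Woodbury formula (Lemma 1).
rng = np.random.default_rng(3)
n, k = 8, 3
A = np.eye(n) + 0.1 * rng.standard_normal((n, n))   # safely invertible
U = rng.standard_normal((n, k))
V = rng.standard_normal((n, k))

Ainv = np.linalg.inv(A)
inner = np.eye(k) + V.T @ Ainv @ U                  # the small k x k system
lhs = np.linalg.inv(A + U @ V.T)
rhs = Ainv - Ainv @ U @ np.linalg.inv(inner) @ V.T @ Ainv
print("max deviation:", np.abs(lhs - rhs).max())    # ~ machine precision
```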

Theorem 1 (Birge and Qi [7], Birge and Holmes [20]) Consider the feasible region of the dual block angular program

    minimize    c_0^T x_0 + \sum_{l=1}^{N} c_l^T y_l
    subject to  A_0 x_0 = b_0                                                                      (3)
                T_l x_0 + W_l y_l = b_l,   l = 1, ..., N
                x_0, y_l >= 0

and its dual solution (y^0, ..., y^N). Let M = A D^2 A^T and S = diag{ S_0, S_1, ..., S_N }, where S_l = W_l D_l^2 W_l^T, l = 1, ..., N, S_0 = I_2 \in R^{m_0 x m_0}, and D_l^2 = diag{ (v_1^l)^{-2}, ..., (v_{n_l}^l)^{-2} }. Furthermore, let I_1 and I_2 be identity matrices of dimension n_0 and m_0, respectively. Also, let

    G_1 = (D_0)^{-2} + \sum_{l=0}^{N} T_l^T S_l^{-1} T_l,   where T_0 = A_0,
    G_2 = -A_0 G_1^{-1} A_0^T,

          [ A_0   I_2 ]        [ A_0  -I_2 ]
          [ T_1    0  ]        [ T_1    0  ]
    U  =  [  :     :  ],   V = [  :     :  ].
          [ T_N    0  ]        [ T_N    0  ]

If A_0 and W_l, l = 1, ..., N, have full row rank, then G_2 and M are invertible and

    M^{-1} = S^{-1} - S^{-1} U [ I_1  G_1^{-1} A_0^T ] [ G_1^{-1}     0    ] [      I_1       0  ] V^T S^{-1}.    (4)
                               [  0      -I_2       ] [    0      G_2^{-1}] [ A_0 G_1^{-1}  I_2 ]

Proof: In the affine scaling algorithm, v^k is positive, so D is invertible. By assumption, W_l has full row rank, so S_l is invertible for l = 1, ..., N, and S is invertible. Let \bar{D} = diag{ D_0, I_2 } and

               [ A_0 D_0   I_2 ]               [ A_0 D_0  -I_2 ]
               [ T_1 D_0    0  ]               [ T_1 D_0    0  ]
    \bar{U} =  [    :       :  ],   \bar{V} =  [    :       :  ].
               [ T_N D_0    0  ]               [ T_N D_0    0  ]

Note that \bar{U} = U \bar{D} and \bar{V} = V \bar{D}. By construction, M = S + \bar{U} \bar{V}^T. Applying Lemma 1 (the Sherman-Morrison-Woodbury formula), M is invertible and

    M^{-1} = (S + \bar{U} \bar{V}^T)^{-1}
           = S^{-1} - S^{-1} U \bar{D} (I + \bar{D} V^T S^{-1} U \bar{D})^{-1} \bar{D} V^T S^{-1}
           = S^{-1} - S^{-1} U \bar{D} \bar{D}^{-1} (\bar{D}^{-2} + V^T S^{-1} U)^{-1} \bar{D}^{-1} \bar{D} V^T S^{-1}
           = S^{-1} - S^{-1} U (\bar{D}^{-2} + V^T S^{-1} U)^{-1} V^T S^{-1}
           = S^{-1} - S^{-1} U G^{-1} V^T S^{-1},

where

    G = [ G_1   A_0^T ]
        [ -A_0    0   ].

Indeed,

    V^T S^{-1} U = [ A_0^T A_0 + \sum_{l=1}^{N} T_l^T S_l^{-1} T_l    A_0^T ]
                   [ -A_0                                             -I_2  ]

and \bar{D}^{-2} + V^T S^{-1} U = G. So,

    M^{-1} = S^{-1} - S^{-1} U G^{-1} V^T S^{-1}                                                   (5)

if and only if (I + \bar{V}^T S^{-1} \bar{U}), or G, is invertible. G_1 can be expanded as

    G_1 = (D_0)^{-2} + A_0^T A_0 + \sum_{l=1}^{N} T_l^T S_l^{-1} T_l.

By construction, (D_0)^{-2} is positive definite, and A_0^T A_0 is positive semidefinite and symmetric. Since S_l is positive definite for all l = 1, ..., N, each T_l^T S_l^{-1} T_l is positive semidefinite and symmetric. The sum is therefore positive definite and symmetric, so G_1^{-1} exists, is symmetric, and can be written as G_1^{-1} = G_1^{-1/2} G_1^{-1/2}, where G_1^{-1/2} is also symmetric. By assumption, A_0 has full row rank, so A_0 G_1^{-1/2} has full row rank. Consequently, G_2 = -A_0 G_1^{-1} A_0^T is invertible, and

    G^{-1} = [ I_1   G_1^{-1} A_0^T ] [ G_1^{-1}     0    ] [      I_1       0  ]
             [  0       -I_2        ] [    0      G_2^{-1}] [ A_0 G_1^{-1}  I_2 ]

           = [ G_1^{-1} + G_1^{-1} A_0^T G_2^{-1} A_0 G_1^{-1}    G_1^{-1} A_0^T G_2^{-1} ]
             [ -G_2^{-1} A_0 G_1^{-1}                             -G_2^{-1}               ],

as may be verified by multiplying by G. Since G is invertible, (5) holds. Consequently, M is invertible and (4) holds.

Using Theorem 1 to explicitly compute the inverse of the matrix M is not the most efficient way to determine the search direction dy. However, the system of equations M dy = b may be efficiently solved using Theorem 1 by solving, in order,

    S p = b,    G q = V^T p,    S r = U q                                                          (6)

and setting dy = p - r. To verify that dy = M^{-1} b, note that

    dy = p - r = S^{-1} b - S^{-1} U q = [ S^{-1} - S^{-1} U G^{-1} V^T S^{-1} ] b = M^{-1} b.
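To make the three-solve scheme (6) concrete, here is a dense, unoptimized Python sketch (ours; synthetic data, and no claim of matching the authors' implementation) that forms S, U, V, and G for a small two-stage instance and checks dy against M dy = b:

```python
import numpy as np

# Dense illustration of scheme (6) on a random two-stage instance.
rng = np.random.default_rng(4)
m0, n0, m1, n1, N = 3, 4, 2, 5, 3
A0 = rng.standard_normal((m0, n0))
Ts = [rng.standard_normal((m1, n0)) for _ in range(N)]
Ws = [rng.standard_normal((m1, n1)) for _ in range(N)]
D0sq = rng.random(n0) + 0.5                       # diagonal of D_0^2
Dsqs = [rng.random(n1) + 0.5 for _ in range(N)]   # diagonals of D_l^2

# S = diag(I, W_1 D_1^2 W_1^T, ...), U and V as in Theorem 1.
S = [np.eye(m0)] + [W * d @ W.T for W, d in zip(Ws, Dsqs)]
U = np.block([[A0, np.eye(m0)]] + [[T, np.zeros((m1, m0))] for T in Ts])
V = np.block([[A0, -np.eye(m0)]] + [[T, np.zeros((m1, m0))] for T in Ts])
G1 = np.diag(1.0 / D0sq) + A0.T @ A0 \
     + sum(T.T @ np.linalg.solve(Sl, T) for T, Sl in zip(Ts, S[1:]))
G = np.block([[G1, A0.T], [-A0, np.zeros((m0, m0))]])

b = rng.standard_normal(m0 + N * m1)
Sfull = np.zeros((len(b), len(b)))                # block diagonal S
off = 0
for Sl in S:
    k = Sl.shape[0]
    Sfull[off:off + k, off:off + k] = Sl
    off += k

p = np.linalg.solve(Sfull, b)                     # S p = b
q = np.linalg.solve(G, V.T @ p)                   # G q = V^T p
r = np.linalg.solve(Sfull, U @ q)                 # S r = U q
dy = p - r

# Check against M = S + (U Dbar)(V Dbar)^T with Dbar = diag(D_0, I).
Dbar = np.diag(np.concatenate([np.sqrt(D0sq), np.ones(m0)]))
M = Sfull + U @ Dbar @ Dbar @ V.T
print("max residual of M dy = b:", np.abs(M @ dy - b).max())   # ~1e-12
```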

Further simplification of the second equation of (6) may be made by symbolically expanding G into its components G_1 and A_0. Let q^T = (q_1^T, q_2^T), where q_1 \in R^{n_0} and q_2 \in R^{m_0}, and let \bar{p} = (\bar{p}_1, \bar{p}_2) = V^T p, so that \bar{p}_1 = A_0^T p_0 + \sum_{l=1}^{N} T_l^T p_l and \bar{p}_2 = -p_0. Then solving

    [ G_1   A_0^T ] [ q_1 ]   [ \bar{p}_1 ]
    [ -A_0    0   ] [ q_2 ] = [ \bar{p}_2 ]                                                        (7)

implies that

    q_2 = (A_0 G_1^{-1} A_0^T)^{-1} ( \bar{p}_2 + A_0 G_1^{-1} \bar{p}_1 ) = -G_2^{-1} ( \bar{p}_2 + A_0 G_1^{-1} \bar{p}_1 ),
    q_1 = G_1^{-1} ( \bar{p}_1 - A_0^T q_2 ).

Solving (6) requires Cholesky factorizations of S_l, G_1, and G_2. At each iteration of the affine scaling algorithm, an update of S_l to reflect the current dual solution must precede the solution of M dy = b. Birge and Holmes [20] use these facts to describe a pseudo-code implementation for solving A D^2 A^T dy = b for two-stage stochastic programs. They also discuss the decomposition's theoretical and empirical properties.

2.3.2 Multistage Birge and Qi Factorization

The Birge and Qi factorization can be extended to multistage stochastic programs with discrete support. Multistage stochastic programs generalize two-stage stochastic programs by allowing more than one recourse decision to be made over the problem horizon. As was mentioned in Section 2, these programs also have deterministic equivalents which are "nested" dual block angular programs. This fact can be used to apply a nested version of the Birge-Qi factorization to find iterates of interior point algorithms.

Consider the deterministic equivalent linear program formulated in Section 2, as well as the following definitions:

    D_{i,t} = diagonal matrix for scenario i of stage t required by the interior point algorithm
    A_t     = the constraint matrix of the original problem truncated to t stages, A_t \in R^{m_t x n_t}, t = 1, ..., H
    D_t     = the diagonal matrix associated with A_t, t = 1, ..., H
    M_t     = A_t (D_t)^2 (A_t)^T, t = 1, ..., H

These definitions will be used to decompose the projection matrix A D^2 A^T for the nested dual block angular program into submatrices that can be fit into the generalized version of the Sherman-Morrison-Woodbury formula. To see this decomposition, let

    \bar{T}_{\beta(k,k'),t} = [ 0_A   T_{\beta(k,k'),t}   0_B ],   k' = 1, ..., N_t,   \beta(k,k') = N_t (k - 1) + k',

where 0_A is a zero matrix with n_A = n_{t-2} + \sum_{k''=1}^{k-1} n_{k'',t-1} columns and 0_B is a zero matrix with \sum_{k''=k+1}^{\bar{N}_{t-1}} n_{k'',t-1} columns, and let \bar{T}_{k,t} denote the matrix formed by stacking \bar{T}_{\beta(k,1),t}, ..., \bar{T}_{\beta(k,N_t),t}. Here, \beta(k,k') is the index of descendant k' of the k-th problem in stage t. \beta(k,k') is also a function of t, but this dependence will be left implicit to simplify the presentation.
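The descendant index \beta(k,k') and the predecessor map a(j,t) of Section 2.1 invert one another; a quick consistency check (our sketch):

```python
# Consistency of the tree indexing: beta(k, k') = N_t (k - 1) + k' enumerates
# the descendants of node k, and a(j) = floor((j - 1)/N_t) + 1 recovers k.
N_t = 3
beta = lambda k, kp: N_t * (k - 1) + kp
a = lambda j: (j - 1) // N_t + 1
assert all(a(beta(k, kp)) == k for k in range(1, 5) for kp in range(1, N_t + 1))
print("beta and a are consistent")
```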

Let \bar{W}_{k,H} = diag{ W_{\beta(k,1),H}, ..., W_{\beta(k,N_H),H} } for k = 1, ..., \bar{N}_{H-1}. For any 2 <= t <= H, the constraint matrix of the original linear program (1) may be rewritten recursively; for t = H,

          [ A_{H-1}                                                  ]
          [ \bar{T}_{1,H}               \bar{W}_{1,H}                ]
    A_H = [      :                              ...                  ]
          [ \bar{T}_{\bar{N}_{H-1},H}                \bar{W}_{\bar{N}_{H-1},H} ].

Having defined the new dual block angular version of the constraint matrix of (1), we can now apply Theorem 1 of the previous section. Let S_{k,H} = (\bar{W}_{k,H})(D_{k,H})^2 (\bar{W}_{k,H})^T for k = 1, ..., \bar{N}_{H-1}, with diagonal blocks S_{\beta(k,k'),H}, and let S_H = diag{ I_{H-1}, S_{1,H}, ..., S_{\bar{N}_{H-1},H} }, where dim I_{H-1} equals the number of rows of A_{H-1}. Let \bar{D} = diag{ D_{H-1}, I_{H-1} } and

           [ A_{H-1}                  I_{H-1} ]          [ A_{H-1}                 -I_{H-1} ]
    U_H =  [ \bar{T}_{1,H}               0    ],   V_H = [ \bar{T}_{1,H}               0    ].
           [      :                      :    ]          [      :                      :    ]
           [ \bar{T}_{\bar{N}_{H-1},H}   0    ]          [ \bar{T}_{\bar{N}_{H-1},H}   0    ]

If \bar{U}_H = U_H \bar{D} and \bar{V}_H = V_H \bar{D}, we can decompose M_H into

    M_H = A D^2 A^T = S_H + (\bar{U}_H)(\bar{V}_H)^T.

Developing a procedure for finding the interior point search direction for the multistage problem is now a matter of redefining the appropriate matrices that were used in the proof of Theorem 1. Key to this development is the definition of the G and G_1 matrices. To avoid yet another subscript, the G_1 and G_2 matrices used in the two-stage presentation will be rewritten here as G1_H and G2_H. Let

    G1_{k,H} = D_{k,H-1}^{-2} + \sum_{k'=1}^{N_H} (T_{\beta(k,k'),H})^T (S_{\beta(k,k'),H})^{-1} T_{\beta(k,k'),H},   k = 1, ..., \bar{N}_{H-1},

with G1_{0,H} = D_{H-2}^{-2}, and let G1_H = diag{ G1_{0,H}, G1_{1,H}, ..., G1_{\bar{N}_{H-1},H} }. Unlike the G_1 matrix previously defined for the two-stage stochastic linear program, the multistage G1_H does not include (A_{H-1})^T (A_{H-1}). Instead, we will include (A_{H-1})^T (A_{H-1}) explicitly in the matrix G:

    G_H = [ G1_H + (A_{H-1})^T (A_{H-1})   (A_{H-1})^T ]
          [ -A_{H-1}                            0      ].

Following the discussion presented after Theorem 1, the equations M dy = b may be solved using

    S_H p_H = b,    G_H q_H = V_H^T p_H,    S_H r_H = U_H q_H                                      (8)

and setting dy_H = p_H - r_H. The most complex set of equations to solve is the second, G_H q_H = V_H^T p_H. Let \bar{p}_H = (\bar{p}_H^A, \bar{p}_H^B) = V_H^T p_H. Solving

    [ G1_H + (A_{H-1})^T (A_{H-1})   (A_{H-1})^T ] [ q_H^A ]   [ \bar{p}_H^A ]
    [ -A_{H-1}                            0      ] [ q_H^B ] = [ \bar{p}_H^B ]

for q_H = (q_H^A, q_H^B) gives

    q_H^B = -G2_H^{-1} ( \bar{p}_H^B + A_{H-1} [ G1_H + (A_{H-1})^T (A_{H-1}) ]^{-1} \bar{p}_H^A ),
    q_H^A = [ G1_H + (A_{H-1})^T (A_{H-1}) ]^{-1} ( \bar{p}_H^A - (A_{H-1})^T q_H^B ),

where G2_H = -A_{H-1} [ G1_H + (A_{H-1})^T (A_{H-1}) ]^{-1} (A_{H-1})^T. Let \bar{G1}_H = G1_H + (A_{H-1})^T (A_{H-1}). Then finding q_H requires finding solutions of systems of the form \bar{G1}_H x = y. Since G1_H is a block diagonal matrix, and A_{H-1} is the constraint matrix corresponding to another (smaller) multistage problem, \bar{G1}_H will have the same block structure as M_{H-1}, the projection matrix of the truncated problem. \bar{G1}_H may then be decomposed into the sum of a block diagonal matrix and the product of two similar matrices. Let

    S_{0,H-1} = I_{H-2} + D_{H-2}^{-2},
    S_{k,H-1} = (W_{k,H-1})^T (W_{k,H-1}) + D_{k,H-1}^{-2}
                + \sum_{k'=1}^{N_H} (T_{\beta(k,k'),H})^T (S_{\beta(k,k'),H})^{-1} T_{\beta(k,k'),H},   k = 1, ..., \bar{N}_{H-1},        (9)

and S_{H-1} = diag{ S_{0,H-1}, ..., S_{\bar{N}_{H-1},H-1} }. Also, if U_{H-1} and V_{H-1} are defined similarly as above (using A_{H-2} instead of A_{H-1}), then \bar{G1}_H = S_{H-1} + (\bar{U}_{H-1})(\bar{V}_{H-1})^T, and Theorem 1 may again be invoked. Since the basic step in this decomposition was proved in Section 2.3.1, we will skip a formal proof here. Instead, we proceed to define a pseudo-code algorithm to find interior point iterates. The algorithm is:

Procedure finddy ( S, A, U, V, t, b, dy )
begin
  1. (Solve S p = b.) Solve S_{k,t} p_{k,t} = b_{k,t} for p_{k,t}, k = 0, ..., \bar{N}_{t-1}.
  2. (Solve G_t q_t = V_t^T p_t.)
     (a) Solve S_{\beta(k,k'),t} (u_{\beta(k,k'),t})_i = (T_{\beta(k,k'),t})_i for (u_{\beta(k,k'),t})_i,
         i = 1, ..., n_{\beta(k,k'),t}, k' = 1, ..., N_t, k = 1, ..., \bar{N}_{t-1}.
     (b) Set (G1_{0,t})_{ii} = (D_{t-2}^{-2})_{ii} for i = 1, ..., n_{t-2}. Set
         (G1_{k,t})_{.i} = (D_{k,t-1}^{-2})_{.i} + \sum_{k'=1}^{N_t} (T_{\beta(k,k'),t})^T (u_{\beta(k,k'),t})_i,
         k = 1, ..., \bar{N}_{t-1}. Let G1_t = diag{ G1_{0,t}, ..., G1_{\bar{N}_{t-1},t} }.
     (c) Form \bar{p}^A and \bar{p}^B:
         \bar{p}^A = A_{t-1}^T p_{0,t} + \sum_{k=1}^{\bar{N}_{t-1}} \bar{T}_{k,t}^T p_{k,t},
         \bar{p}^B = -p_{0,t}.
     (d) Form \bar{G1}_t = G1_t + (A_{t-1})^T (A_{t-1}).
     (e) Let S' be defined as in (9).
     (f) Solve \bar{G1}_t u = \bar{p}^A: for t > 1, call finddy ( S', A_{t-1}, U_{t-1}, V_{t-1}, t-1, \bar{p}^A, u );
         if t = 1, solve \bar{G1}_t u = \bar{p}^A directly. Set v = -( \bar{p}^B + A_{t-1} u ).
     (g) Solve \bar{G1}_t w_i = (A_{t-1}^T)_i for each column i: for t > 1, call
         finddy ( S', A_{t-1}, U_{t-1}, V_{t-1}, t-1, (A_{t-1}^T)_i, w_i ); for t = 1, solve directly.
         Set G2 = -A_{t-1} [ w_1 ... w_{m_{t-1}} ].

     (h) Solve G2 q^B = v for q^B directly. Then solve \bar{G1}_t q^A = \bar{p}^A - (A_{t-1})^T q^B for q^A:
         for t > 1, call finddy ( S', A_{t-1}, U_{t-1}, V_{t-1}, t-1, \bar{p}^A - (A_{t-1})^T q^B, q^A );
         if t = 1, solve directly.
  3. (Solve S r = U q.) Let \bar{q}_{0,t} = A_{t-1} q^A + q^B and \bar{q}_{k,t} = \bar{T}_{k,t} q^A,
     k = 1, ..., \bar{N}_{t-1}. Solve S_{k,t} r_{k,t} = \bar{q}_{k,t} for r_{k,t}, k = 0, ..., \bar{N}_{t-1}.
  4. Set dy_{k,t} = p_{k,t} - r_{k,t} for k = 0, ..., \bar{N}_{t-1}.
end.

Forming the multistage dy using Birge and Qi's decomposition

Each iterate of the interior point algorithm is started by calling finddy ( S_H, A_H, U_H, V_H, H, b, dy ). Implementing the above algorithm requires setting up a "line" of Cholesky factorizations, with those involving S_{k,H} on one end and those involving S_{k,1} on the other. Finding the search direction then involves passing data from the S_{k,H} end to the S_{k,1} end, and collecting results in the reverse order. Parallel processes may be exploited at each stage by assigning sets of S_{k,t} factorizations to concurrent processing units.

Theorem 2 (Time complexity of the multistage Birge and Qi decomposition) Let N_t be the number of distinct possibilities in stage t and \bar{N}_t = N_1 x ... x N_t be the number of cumulative scenarios through stage t (N_1 = 1). Let

    N_A = \sum_{t=1}^{H} \bar{N}_t,          N_B = \sum_{t=1}^{H-1} \bar{N}_t,
    \tilde{N}_A = \sum_{t=1}^{H} \bar{N}_t^2,   \tilde{N}_B = \sum_{t=1}^{H-1} \bar{N}_t^2.

Assuming T_{k,t} and W_{k,t} have dimension m x n for all k, t, the overall time complexity of the Birge and Qi decomposition is

    O( N_A (n^3 + mn + mn^2) + N_B (n^3 + n^2) + \tilde{N}_A mn + \tilde{N}_B (m^3 + 2mn^2) ).

Proof: Let T(t) be the time complexity of the Birge and Qi decomposition for a problem with t stages. At each step in the algorithmic statement given above, we have the following time complexities:

    Step        Work                     Complexity
    1           Solve S p = b            O( \bar{N}_t (n^3 + mn) )
    2(a)        Solve S u = T            O( \bar{N}_{t-1} (n^3 + n^2) )
    2(b)        Find G1_t                O( \bar{N}_t mn^2 )
    2(c)        Form \bar{p}             O( \bar{N}_t mn^2 + \bar{N}_t^2 mn )
    2(d)        Form \bar{G1}_t          O( \bar{N}_t mn )
    2(e)        Form S'                  obtainable from 2(b)
    2(f), 2(g)  Find G2                  O( \bar{N}_t^2 m^3 + \bar{N}_t^2 mn + 2 T(t-1) )
    2(h)        Solve for q              O( \bar{N}_t^2 m^3 + \bar{N}_t^2 mn + T(t-1) )
    3           Solve S r = U q          O( \bar{N}_t mn )

Other computational efforts are small compared to the above operations. Adding the time complexities for a generic stage t gives

    O( \bar{N}_t (n^3 + 2mn + 2mn^2) + \bar{N}_{t-1} (n^3 + n^2) + \bar{N}_t^2 (m^3 + 2mn^2) + 3 T(t-1) ).

Since the procedure will be called H - 1 times, the time complexity of the algorithm can be simplified to

    T = O( \sum_{t=1}^{H} \bar{N}_t (n^3 + 2mn + 2mn^2) + \sum_{t=1}^{H-1} \bar{N}_t (n^3 + n^2)
           + \sum_{t=1}^{H} \bar{N}_t^2 mn + \sum_{t=1}^{H-1} \bar{N}_t^2 (m^3 + 2mn^2) ).

Redefining the sums as above simplifies the time complexity to

    O( N_A (n^3 + mn + mn^2) + N_B (n^3 + n^2) + \tilde{N}_A mn + \tilde{N}_B (m^3 + 2mn^2) ).
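The 3T(t - 1) term in the generic-stage count drives the exponential dependence on the number of stages noted in the introduction; a quick sketch (ours) of the resulting recurrence:

```python
# Evaluate T(t) = c + 3 T(t-1), T(1) = c, the recursion shape implied by the
# proof above (c stands in for the per-stage work). Growth is like 3^H.
def T(t, c=1.0):
    return c if t == 1 else c + 3 * T(t - 1, c)

for H in range(1, 8):
    print(f"H={H}: relative work {T(H):8.0f}")
```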

3 Conclusion

This technical report has presented a multistage generalization of the Birge-Qi factorization, which was originally developed for solving two-stage stochastic linear programs using interior point methods. The Birge-Qi factorization has been shown (Birge and Holmes [20]) to be computationally superior to, and numerically more stable than, the most common alternative solution techniques. The same benefits can be extended to multistage stochastic linear programs by noting that multistage problems can be written as nested two-stage problems and applying the same idea.

References

[1] J.M. Mulvey, 1984. "A Network Portfolio Approach for Cash Management," Journal of Cash Management 4, 46-48.

[2] F.V. Louveaux, 1980. "A solution method for multistage stochastic programs with recourse with application to an energy investment problem," Operations Research 28, 889-902.

[3] H.I. Gassman, to appear. "Optimal harvest of a forest in the presence of uncertainty," Canadian Journal of Forest Research.

[4] R. Marsten, R. Subramanian, M. Saltzman, I. Lustig, and D. Shanno, 1990. "Interior Point Methods for Linear Programming: Just Call Newton, Lagrange, and Fiacco and McCormick!," Interfaces 20:4, 105-116.

[5] W.J. Carolan, J.E. Hill, J.L. Kennington, S. Niemi, and S.J. Wichmann, 1990. "An empirical evaluation of the KORBX algorithms for military airlift applications," Operations Research 9:2, 169-184.

[6] C.L. Monma and A.J. Morton, 1987. "Computational Experience with a Dual Affine Variant of Karmarkar's Method for Linear Programming," Manuscript, Bell Communications Research.

[7] J.R. Birge and L. Qi, 1988. "Computing Block-angular Karmarkar Projections with Applications to Stochastic Programming," Management Science 34:12, 1472-1479.

[8] P.E. Gill, W. Murray, M.A. Saunders, J.A. Tomlin, and M.H. Wright, 1986. "On Projected Newton Methods for Linear Programming and Equivalence to Karmarkar's Projection Method," Mathematical Programming 36, 183-201.

[9] N. Karmarkar, 1984. "A New Polynomial Time Algorithm for Linear Programming," Combinatorica 4, 373-395.

[10] I. Adler, N. Karmarkar, M.G.C. Resende, and G. Veiga, 1986. "An Implementation of Karmarkar's Algorithm for Linear Programming," Report No. ORC 86-8 (Revised May 1987), Operations Research Center, University of California, Berkeley, California.

[11] N. Megiddo, 1986. "Pathways to the Optimal Set in Linear Programming," Technical Report RJ 5295, IBM Almaden Research Center, San Jose, California.

[12] D.F. Shanno and A. Bagchi, 1990. "A Unified View of Interior Point Methods for Linear Programming," Annals of Operations Research 22, 55-70.

[13] G.H. Golub and C.F. van Loan, 1983. Matrix Computations, The Johns Hopkins University Press, Baltimore, Maryland.

[14] J.A. Meijerink and H.A. van der Vorst, 1977. "An Iterative Solution Method for Linear Equation Systems of which the Coefficient Matrix is a Symmetric M-matrix," Mathematics of Computation 31, 148-162.

[15] S.C. Eisenstat, M.C. Gursky, M.H. Schultz, and A.H. Sherman, 1981. "The Yale Sparse Matrix Package, I. The Symmetric Codes," ACM Transactions on Mathematical Software.

[16] E. Chu, A. George, J. Liu, and E. Ng, 1984. "SPARSPAK: Waterloo Sparse Matrix Package User's Guide for SPARSPAK-A," Research Report CS-84-36, Department of Computer Science, University of Waterloo, Waterloo, Ontario.

[17] J. Arantes and J. Birge, 1989. "Matrix Structure and Interior Point Methods in Stochastic Programming," Presentation, Fifth International Stochastic Programming Conference, Ann Arbor, Michigan.

[18] H.I. Gassman, 1990. "MSLiP: A computer code for the multistage stochastic linear programming problem," Mathematical Programming 47, 407-423.

[19] J.R. Birge, 1985. "Decomposition and partitioning methods for multistage stochastic linear programs," Operations Research 33, 989-1007.

[20] J.R. Birge and D. Holmes, 1992. "Efficient solution of two-stage stochastic linear programs using interior point methods," Computational Optimization and Applications 1, 245-276.