A THREE STEP QUADRATICALLY CONVERGENT IMPLEMENTATION OF THE PRIMAL AFFINE SCALING METHOD Romesh Saigal Department of Industrial and Operations Engineering The University of Michigan Ann Arbor, MI 48109-2117 Technical Report 93-9 February 1993 Revised July 1993

A Three Step Quadratically Convergent Implementation of the Primal Affine Scaling Method Romesh SAIGAL Department of Industrial and Operations Engineering, The University of Michigan, Ann Arbor, Michigan 48109-2117, USA February 26, 1993 ( revised) Abstract Recently, Tsuchiva and Monterio have proposed an implementation of the affine scaling method that attains, asymptotically, a two step superlinear convergence rate of 1.3. This is achieved by a step size selection rule under which the affine scaling method can be viewed as a predictor-corrector method. In this paper, we propose a different, but related, step size selection strategy that asymptotically, attains a three step quadratic convergence rate. Key words: Linear Programming, affine scaling methods, quadratic convergence, interior point methods. Abbreviated title: Quadratically convergent affine scaling method 0

1 Introduction In this paper we consider the application of the (primal) affine scaling method for solving the following linear programming problem: minimize cTx Ax = b x > 0 where we assume that the matrix A is an m x n matrix, b is an m vector and c is an n vector. In addition we will assume that Assumption 1 The linear program is feasible and has an interior point. Assumption 2 A has full row rank m. Assumption 3 The objective function is not constant on the feasible region. There have been some significant recent developments in the convergence theory of this method. A notable development is the series of papers Tsuchiya [17], [18], Dikin [6], [7] and Tsuchiya and Muramatsu [19]. The last of these papers shows that if the step-size at each iteration, is restricted to 2 of the step to the boundary, the dual sequence converges to the analytic center of the optimal dual face, and the primal sequence to the interior of the optimal primal face. Prior to these results, the convergence of the primal sequence was known but the optimality of its limit point was known only under the non-degeneracy assumption, Vanderbei and Lagarias [22]. Simpler proofs of the 2 result can be found in limit at 0. This fraction can be easily increased to 2q, and it appears that this may be the largest constant step-size under which the convergence to optimality can be established. In another recent paper, Tsuchiya and Monterio [20] have achieved another milestone. By establishing a connection between the affine scaling step and the Newton step, they have 1

devised a strategy of adjusting step sizes under which the dual sequence converges to the analytic center of the optimal face and the primal sequence to the interior of the primal optimal face. Their method, asymptotically, attains a two step super-linear convergence rate of 1.3, and can be viewed as a predictor-corrector method. They show that asymptotically, one can take large step sizes ( predictor steps ) provided the step-size at the next step is restricted to 2 ( corrector steps ). In this paper we build on their work and generate a different step selection strategy, which also can be viewed as a predictor-corrector method. This step selection strategy asymptotically attains a three step quadratic convergence rate. In this method, we take two corrector steps between each pair of predictor steps. We also show that the primal converges to the interior of the optimal primal face and the dual to the analytic center of the optimal dual face. This paper is organized as follows. Besides the introduction it has 9 other sections. In Section 2, we introduce the affine scaling method, and state some of its known properties. In Section 3 we investigate the affine scaling step with a view to relating it to Newton's method. In Section 4, we study an analytic center problem, and the application of Newton's method for computing it. In Section 5, we relate the affine scaling and Newton directions for this center problem, and in Section 6 we the relate centers of two polytopes. We introduce the accelerated affine scaling method in Section 7, and in Section 8 we establish the preliminary results required for proving the three step quadratic convergence of the method, which is established in Sections 9 and 10. Finally in Section 11, we compare the theoretical efficiency of this method a two step quadratically convergent method and the superlinearly convergent method of Tsuchiya and Monterio [20]. 2 The Affine Scaling Method'In this section we present the method and known results ( generally without proof ) about the sequences generated by this method. We now present the affine scaling method we will deal with in this paper: Step 0 Let x~ be an interior point solution and let k = 0. 2

Step 1 Tentative Solution to the Dual: yk = (AXJAT)l-AXlc Step 2 Tentative Dual Slack: sk = c- ATyk If sk < 0 then STOP. The solution is unbounded. Step 3 Min-Ratio test: Ok = min kk: > 0} 3 IIXkSkII q (XkSk) where )(x) = maxjxj. Step 4 Step Size Selection: Choose, by some rule, the next step size 0 < ak < 1. Also, if Ok = 1 set ak = 1. Step 5 Next Interior Point: k+1 = k - ako k IlXkskll Step 6 Iterative Step: If k+1 = 0 for some j, then STOP. xk+1 is an optimal solution to the primal, yk the optimum solution to the dual. Otherwise set k = k + 1 and go to step 1. Assume that the method presented above generates infinite sequences {xk}, {yk} and {s"}. Then the following can be shown for these sequences, and the proofs can be found in the references cited: Theorem 1 Let the sequence {cTxk} be bounded. 1. {cTxk} is strictly monotone, decreasing, and thus converges, say to c*. 2. Xksk > 0. 3. {xk) converges, say to x*. 3

4. There is an L > 1 and a 6 > 0 such that for al k > L CTZk - C* > S. |IIxk - X*11 Proof: The proof can be found in Tsuchiya [17], Monterio, Tsuchiya and Wang [12] and Saigal [15]. 1 Given that the sequence {xk} converges, let the limit of this sequence be x*. Then we can define: N = {j: = 0}, B = {j: x > 0}, q = NI k, c-k k * (1dk X ksk _ k _ dk = Xksk U =k k CTxck -_ C* CTk - C* The following is a corollary to Theorem 1(4). Corollary 2 There exists an L > 1 and E6 > 0 and 62 > 0 such that for all k > L islxlXk11 < CTXk-* < _ |lX1n. cx - c*~ 62114XN. Proof: Readily follows by applying Theorem 1(4). U We now investigate the properties of the sequences {sk}, {dk}, {uk} and {vk}. Theorem 3 The sequence {dk} converges to 0. Proof: Follows from Theorem 1. Can also be found in Vanderbei and Lagarias [22] and [15]. Theorem 4 There is an L > 1 and a S > O such that for all k > L IIS1II < &(CTxk - c*)2. Proof: The proof follows from Corollary 2 and results found in [12] and [15]. 1 Theorem 5 Consider the sequence {uk}. 1. It is bounded. 2. There is an L > 1 and 6 > 0 such that for all k > L 4

(a) eTuk = 1 + k where 6k < ^(cTXk - c*)2. (b) (uk) = (U). (c) Lk 112 > 1. Proof: The proof can be found in [15]. U Theorem 6 There is an L > 1 a 6 > 0 such that for every k > L, \IIVNI <. Proof: Readily follows from Theorem 1 part 4 and Corollary 2. U 3 The Affine Scaling Step We will study the affine scaling step with a view to relating it to a Newton step for computing an analytic center of a certain polyhedal set. Towards this end, define D = {(y,s) ATy = CB, ANY + SN = cN} as the set that represents all dual estimates complementary to x*; i.e., xt > 0 implies sj = 0. Please note that in the definition of D, we have relaxed the non-negativity condition on SN required for dual feasibility. The following can be readily established: Theorem 7 The sequences {yk} and {sk} are bounded, and thus D: 0. Also, for every primal feasible pair x and x and every (y, s) E D T T A T1_ ) C X - CX == Sc T (XN - N) Proof: The proof readily follows by substituting c = s + AT and noting that membership in D implies sB = 0; and A(x - x) =. 0 As is now well known ( see for example, Barnes [2] ) the affine scaling direction X2sk IIXkSk 1i used in Step 5 of the method, is generated by solving the following Ellipsoidal Approximating Problem: minimize cTx Ax = b IIX;l(x- 5 k)|| < 1. 5

Substituting p = xk - x we obtain the equivalent problem: maximize cTp Ap =0 II1XklpI L 1 Now cTp = = P for (, s) E D. Thus, for some fixed (y, S), the above problem can be written as maximize gTNpN ANPN + ABPB = 0 Pv -2 + -2 NXk,NPN + PBk,BPB 1 By noting that the solution of this problem is on the boundary of the ellipsoid ( i.e., the second constraint is at equality ), the K.K.T. conditions for this problem are - -A - 2X,NPN = 0 (1) T -2 -ATBy-20XkBPB = 0 (2) ANPN+ ABPB = 0 (3) IIX1pIl = 1 (4) Now, let the sequences {ik} and {vk} be as defined in Section 2. Letting, for some given k, u and v represent.k and vk respectively, we define A = AV and s = VS, where V is the diagonal matrix whose jth diagonal entry is vj for each j = 1,,n. Now consider the system: _A11 2 -ATNY=- S = 0 (5) T- lu112?1SB AN + ARBPB = 0 (7) S N = 1 (8) 11I 112 The following theorem establishes a connection between the conditions represented by the above systems. Theorem 8 Consider the systems represented by the equations (1) - (4) and (5) - (8). 6

1. (1) - (4) have a unique solution which generates a solution to (5) - (8). 2. The solution to (5) - (8) is unique upto a choice of p'; and, there is a value for p' for which the resulting solution also solves (1) - (4). 3. When AB has full column rank, the two systems of equations are equivalent. Proof: Since the Equations (1) - (4) represent the solution to a strictly convex problem, they have a unique solution. Using this solution, define the vectors -y(cTxk - c*) 2011ull (CTXk - C*)lIUII P 20 =, PB PB (cTxk - c*)IIluI' It is now readily confirmed, by simple algebra, that u, y p and p' solve the System (5) - (8). Thus we have proved Part 1. From Part 1, it follows that (5) - (8) have a solution. Considering qN = 1M2, p = JU12X y and pB as variables, this system is linear in these variables. If AB has full column rank, the solution to (5) - (8) is unique, and Part 3 follows. Otherwise, since (5) - (8) can have a solution only if 5B lies in the row space 1Z(AT) of AB, the third condition must have redundant constraints readily identified by choosing any full column rank submatrix of AB. To see Part 2, let AB = (Ac, AD) where Ac has full column rank and spans the range ( or column space ) IZ(AB) of AB. Thus, for some unique matrix A, AD = ACA. Replacing equations (6) and (7) by k Ay = ull2 (9) and AN l1 +Acpc = 0 (10) respectively, we obtain a new system that has a unique solution. By setting p' = (p',P D), and letting p' = 0, the solution to Equation (10) generates a solution to (7). Now, let (qN,y,t, pIp) be any solution to (5) - (8). This then generates the unique solution (qN,,yPp - Api,) to (5) and (8) - (10). Since only the vector pB is modified in any solution to (5) - (8). Part 2 is established with the required pB = (cTkPl e 7

4 Newton's Method and an Analytic Center Problem In this section we consider the application of Newton's method for finding the analytic center of a certain polytope. By interpreting the affine scaling step with a step-size of 2 on this polytope, a close connection can be established between the two steps. This connection plays a central role in the accelerated implementations of the affine scaling method. We now introduce this polytope. Consider the sequence {rk} generated by thle affine scaling method. From Theorem 1, it converges to, say Ix, and as in Section 2, let N = {j:: = 0} and B = {j: x > 0}. Also, define the sequence Xk X* Vk - CT k - C* and note that for every k, A',k = ANV^ + ABVB = 0. Also, if (y, s) E D, then from Theorem 7: - T Tk STNN = CT _C and thus STvN = 1. Now define the polytope: P = v: ANVN + ABVB = O, STVN = 1,VN > 0} and its projection PN = {VN: v E P}. Its analytic center is the solution v* to the following problem: maximize ZJEN log(v^) ANVN + ABVB = 0 -TVN = 1. VN > 0 The K.K.T. conditions for this problem are: VI e-A- y-sNA = 0 (11) -ABy = 0 (12) ANVTN + ABVB = 0 (13) SVN = 1 (14) VN > 0 8

The solution to the K.K.T. conditions is unique, if it exists. The analytic center problem has a feasible solution, but the set may not be bounded, and thus the center may not exist. Indeed, it can be shown that, for a given N, the center exists if and only if x* is an optimal solution of the primal linear program. Since we have not shown this fact, we cannot claim the K.K.T. conditions have a solution. However, we will, in this section, assume that the center exists. Thus the results established as a consequence of this assumption can only be used when this existence has been shown. We now apply Newton's method to the system of equations (11) - (14) to determine its zero. The Newton direction (LAv, y, 9) at (v, y, 0) is obtained by solving the following system: -V4-2ZVN -A NY-SNA = -VNle + AT + SN0 (15) -ATY = ATy (16) AN/VN + ABAVB = 0 (17) STNAVN = 0 (18) Defining y = y + Ay, 0 = 0 + AO and substituting WN = VjN1AVN, AN = ANVN and SN = VN3N we can rewrite the system (15) - (18) as: wN+ AN + SNO = e (19) A - = 0 (20) ANWN + ABAVB = 0 (21) ATTON = 0 (22) We are now ready to prove an important result: Theorem 9 Let WN solve the system (19) - (22). Then 1. eTWN -= wTTWN. 2. IIwNII < vj, where q is the number of elements in N. Proof: (1) is obtained by multiplying equation (.19) by w and substituting W' AT 1 = O obtained from equations (20) and (21), and equation (22). (2) can be obtained by considering 9

the problem maximize zTz T _ eTz = 0 and noting that z = e solves this problem. U The system (19) - (22) is linear in the variables wN, AVB, y and 0, with the underlying matrix: I AT SN 0 0 AT 0 0 M(v) = AN 0 0 AB T 0 0 0 It can be readily verified that, when AB has full column rank, this matrix is non-singular for every VN > 0. In the other case, just as was done in Theorem 8, it can be shown that the solution to the system (19) - (22) is unique upto a choice of AivN. Now, consider a sequence {vk} in PN that converges to vN, the analytic center of PN. Also, let A/3= M(vN)-1. We state, without proof, the following standard result on the convergence and convergence rate of Newton's method. Lemma 10 There is an L > 1 andt p > 0, P2 > O such that for all k > L 1. ]IM(v )-1ll < 20. VIkav II 2. e = 1 + 8k where &Ski < piIv - VNII IIVN + AVN - VN < P2II1VNVN' 3. IIvH +,,v~ - v4l ~ P2IIlVN - v;II2. Proof: Can be found in standard texts in numerical mathematics. U Now consider the affine scaling step as determined by the system (5) - (8). We note that if we consider jfr, p and T4 as variables, this system is also linear with the underlying matrix M(vN). Thus we can prove the following theorem: Theorem 11 There exists an L > 1 and p > 0 such that for all k > L, e-N - _ 1N 0 -A 10

with IlAkll < pllccTxk - c*2 Proof: Consider the systems (5) - (8) and (19) - (22). By selecting possibly a submatrix of AB, we assume that AB has full column rank in these systems. In the later system, make the change of variable uW' = e- wN. Then STNVN = 1. Now, by defining vB = vB - /VB and T/- UN t VNlal = N w_ a2 = J - p a3 = - a4 = PB - vB generate the system M(vN)a = (0, - OO)T. This system is readily seen, with S' = - -, as the K. K. T. conditions of the following optimization problem: minimize SB Ta4 ANal + ABa4 = 0 sTal = 0 llV-l a112 < r2 for some r > 0. Consider the first constraint. Since AB has full column rank, by using the formula a4 = -(AnAB)1ABANal we can eliminate a4 to generate the following equivalent problem: -T maximize SB al ANal = 0 sTNal = O. IV1'laill2 < 1 where s = ATAB(ATAB)1-s' and A* = AN - AB(ATAB)-'ATAN. This is an affine scaling problem. It is well known that the K.K.T. multipliers (a2, a3) of this problem are bounded by a function of the form q(AN, SN).-lsBll independent of the diagonal matrix VN > O and r. ( See for example, Theorem 4, Saigal [15] ). Thus, for some q(A, SN), |(a2, a3)i < ((A, N)iI.I 11

Thus IIVNaill, < IIVN(ANa2 + sNa3)11 < 11,,Ill' By using Theorems 4 and 5 part 2(c), and that the bound is independent of a4. 5 Affine Scaling and Newton Directions in P In this section, we show the connection between the Newton direction Avk = Vk,Nw4 at vk determined in Theorem 11, and the affine scaling direction as interpreted in P. Consider the sequence {vk} in P generated by the affine scaling algorithm. Then we can show the following result: Theorem 12 The affine scaling direction at vn in P is'N -'N = 1 (,) ('N - VN k2 k N 2uN k k 2) where &(uk) = i;41. Also, the Newton direction AVk at Vk in P is: A = 1vN —Vk,Nl2 + Vk, N where Ak is as in Theorem 11. Proof: Since wt = Vk-1AvV the formula for the Newton direction follows from Theorem 11. To see the affine direction note that: k+1 k _ +lN XN i'N -VN - Ck+1 -* CT C C xN - c* xN - k akVk NUk VN - (1'k) Vk ~- 1 _- lt12- N 1 (u*) "klllj" 112k = N | - | k uk = ( L ) -(VN - Nk UI2) (23) I -?kIl"l'II 11V II and we are done. U Using Theorem 11, if ak is chosen such that the scaler term in the formula for the affine scaling direction is 1, the directions taken by Newton's method for computing the analytic center of P will be very close to the affine scaling direction. It is this observation that allows the interpretation of this step as a corrector step. We now present a relationship between two analytic center problems, and the version that exploits this property of the affine scaling step. 12

6 Relation between two analytic centers Consider the analytic center problem for D n {: SN > 0} maximize EEN log(sj) AT + SN = CN A yj = CB SN > 0 and its K.K.T. conditions: SNle —VN = 0 (24) ANVN+ ABVB = 0 (25) ANY + N = CN (26) A7By = CB (27) sN > 0 The following relates the two analytic centers defined by systems (11) - (14), and (24) - (27). Theorem 13 (y*,s*) is an analytic center of D n {s: SN > 0} if and only if v* is an analytic center of PN and qvN = SN e Proof: It is readily confirmed that if v* is the analytic center of PN, then for some 0 = 9* > 0 and y = u*, equations (11) -(14) are satisfied. Thus 1 1 1 SN = n: AT u + SN, v = -V, y = y- u 9* 0* 9* satisfy the equations (24) - (27). Also, if (y*, s*) is the analytic center of D n {s: SN > 0}, then for some v* equations (24) - (27) are satisfied. Thus 1 v = -v*,y = -q(y* - ), = q solve equations (11) - (14). We are now done by equation (24). U 13

7 Accelerated Primal Affine Scaling Method In this section we present a step-size selection strategy that asymptotically behaves as a predictor-corrector strategy for computing the analytic center of P. This method follows: Step 0 Let x~ be an interior point solution, 0 < a < 1 ( normally between 0.95 and 0.99 ) and let k = 0. Step 1 Tentative Solution to the Dual: yk = (AX2AT)-l AXc Step 2 Tentative Dual Slack: k = - ATyk If sk < 0 then STOP. The solution is unbounded. Otherwise, define dk = Xksk Step 3 Min-Ratio test: iIdkII k } Sk = min{t d:J > k q(dk) Also, if Ok = 1 set ak = 1, and go to Step 7. Step 4 Tentative Non-Basic set: Nk = j: < 4(d)} lk = leTdk I fNk ~ k Step 5 Estimate of Newton Step: Ek = II hN\kI 14

Step 6 Step-size calculation: 1. Normal Step: If 7a: > 1 then set ak = a 2. Predictor Step: If 7k < 1 and E' 75 < 7k then set: = 1 - max {Ik, 50} 3. Corrector Step: Otherwise, set Nfk 2 Ctk - mint 7k 2 20klldkI I3 Step 7 Next Interior Point: Xy2k Xk+l = xk _ k Ok IIXkk 11 Step 8 Iterative Step: If xzj = 0 for some j, then STOP. xk+l is an optimal solution to the primal, yk the optimum solution to the dual. Otherwise set k = k + 1 and go to step 1. We will show that the above implementation is three-step quadratically convergent. Before we prove this, some comments are in order. As a consequence of Theorem 3, q(dk) converges to 0. Thus, asymptotically, Nk will become some constant N. Also, at Step 5 eN = Td k TkN idki92 k ik = T k Vk,N III k (28) e1! 11UN II and using the Theorems 5 and 11, is readily seen as a very good estimate of AvN. We wish to apply the predictor step when IIhvII = O(cTxk - C*)2. Since eTdN goes to zero; asymptotically the acceptance condition for the predictor step will be satisfied. Since the resulting constant in this relation is unknown, a predictor step may be performed with a lower order of cTzk - c*. This will result in a lower convergence rate than two. We will show that this error can be corrected by taking three corrector steps following the first predictor 15

step selected by the rule. After this, two corrector steps after each predictor step can be taken to achieve two-step quadratic convergence. The corrector step chooses the step size 001 k T k fk _(YN )e 1N (29) 20oIdk l 2|11ukl|2 which approaches -. We are now ready to establish the preliminary results before proving the two step quadratic convergence of the sequence. 8 Preliminary Results In this section we will establish the preliminary results for the proof of the convergence of the sequence generated by the above implementation. Proposition 14 There exists an L > 1 such that for every k > L, Nk = N. Proof: As O(dk) > O, there exists an L > 1 such that for all k > L, x > (dk) for each j E B and we are done. 1 Let L be generated by Proposition 14. We henceforth restrict our attention to the sequence {xk }kL Proposition 15 There is an L > L and an c > 0 such that for all k > L, IIh - AV||1 < c (CTxk - c*)2 Proof: From equation (28) and Theorems 12 and 5 we obtain hk - Ak = 6vN V- N4 k -N N I- = 1 + kN and the result follows from the above theorems and Theorem 6. a Proposition 16 There is an L > L and 3 > 0 such that for every k > L 1. cTxk+ = 1 - ckk() where S(u ) = 2 2. 1 - 1III - | kil ~< 86(u) < 1 xwith ISkl < P(cTxk - c*)2. 16

Proof: From Steps 3 and 7 we obtain cT k+ = CTx k c rXsk - L (Xk k) and thus cTxk+l _ c*.12 cTxk - =c* 1 -k (k) (30) Because of Theorem 5, we have (1). Since (1) is true for every ack < 1, by setting ak = 1 we obtain the upper bound of (2). Now, from Theorem 11, k(U) = uN IILk112 I 11k112 = O(e + wk - k) < 1+ l ||wTI|| + 6k (31) where llAkil = Sk; and we have our result from Theorem 11. U Proposition 17 Let wk -- 0 on some subsequence K. Then, on K Pk pk = ^21jUkj2 2 Pk < (1 + +ll +IIz kIt) -k 2 and we get the result as 5k and I1kIll go to zero. 1 Proposition 18 There is an L > L such that for all k > L 0.50(ck - c*) < eTd < 1.5(cTk - c*) Proof: Note that eTdk = (cTxk - c*)eTuik, and so the result follows from Theorem 5. We are now ready to prove the optimality of the limit point x* of the primal sequence generated by the accelerated method. The convergence of the primal sequence follows from the Theorem 1. 17

Theorem 19 x* lies in the relative interior of the optimal primal face. Proof Let L be as in Proposition 14. In case the predictor step is only taken a finite number of times, there is an L > L such that for all k > L, ok < 3. Thus the result follows from Theorem 1.1 of Tsuchiya and Maramatsu [19] ( see also Theorem 16 of Saigal [15]). Now, assume the predictor step is taken an infinite number of times, and let K be the subsequence of such iterates. Let k E'K. Then, from Steps 4, 5 and 6 Part 2, IIh ll < (eTXk,N.S) < (eTXkSk) and, from Theorem 1, Part 2 h - 0 for k E K. Also, from Proposition 15, AvN -k 0, for k E K as IltAk 11 < llhNll + II||Av- hI11. But, v E PN is the only vector for which Av* = 0. Thus vN -> vN for k E K. As Av -- 0, w -) 0 for k E K. Thus from Theorem 11 UNl - e for k E K, and, from Theorem 5 Part 2(a), il|ll2 -l - fo r k E K. q Thus k_ 1 UN -e for k E K. q The sequences {yk} and {sk} are bounded and so, on some subsequence K' C K, they converge, say to (y*,s*). So V.,N S = 1 q Also, as v1 is the analytic center of PN, from Theorem 13, (y', s*) is an analytic center of D n {s:5N ~ 0}, and we are done. 1 18

9 On Predictor and Corrector Steps In this section we will establish the asymptotic rate of convergence of the sequence {cxk - c*} generated by the accelerated affine scaling method. We will show that this sequence converges three-step quadratically to 0. Because of Theorem 19, the analytic center v* of PN exists, and thus Newton's method, when initiated close to the center, will converge to it. Thus, there is a sufficiently large L > 1 for which the following properties hold: For all k > L, 1. v? > (cTk - c*)~025 for all i E N. 2. From Lemma 10 Part 2, related to Newton's method, there is a 0 > 0 such that (cTxk - c*)025 < 0; and, for all I|VN - vll < 0 1 3 IIVN - v*1 I XVNII < 3lJVN - v|II. 3. The conditions of Lemma 10 related to Newton's method; and, Proposition 14, are satisfied. Such an L exists as v, > 0 for each i, > 0 and cTxk - c*- 0. We can then prove the following straight forward lemmas: Lemma 20 There is an L > L such that for all k > L 0.50(cTxk - c*) < eTd < 1.5(cTxk - *). Proof From Theorem 5 Part 2(a), there is a L' > L and a 0 > 0 such that for all k > L' eTd = (1 + 6k)(cTxk - *) where ISk& < 0(cTxk - c*)2. Thus, there is a L > L' such that Ik1l < 0.50 and we are done.E Lemma 21 Let v IIv- v\ ll < f?(cTk - c*)P for some d > 0 and p > 0.25. There is an L L and a 0 > 0 such that for all k > L IIWI11 < 0O(CTxk - c)P-025 19

Proof Consider k L V |V -V | 1 (CTk - c*)0.25 _ (CT k - *) and thus there is a L > L such that for all k > L V > 0.5(cTxk _ c*)025. Thus, aIVNII < (T 0.2 and since zit = V Vk, we obtain our result from Lemma 10. We are now ready t.o investigate the predictor step and the corrector step of the accelerated method. The next theorem investigates the predictor step, Step 6, part 2 of the accelerated method. Theorem 22 Assume that for some k > L anrd 3 > 0, I\v - v\I < N(cTx - C*)2P for some 0.50 < p < 1. Then 1. There is a 0 > 0 such that 0.50(cTx - c*)2 < cTxk+1 - c* < (cTxk - c*)2p 2. There is a 0 > 0 such that lIVk+1 _ V*ll < O(CT xk+l _ C*)2 Proof To see (1), note that from Step 6, part 2; Lemmas 20, 21 and Proposition 16, we get 0.50(cTxk -c*) < eTd < i-ak __ Iauk71II2 _ 1-ak( < 1 -.~ ~( > CTxk+l _ C CTJxk - C* 1 -(1 - max {E50,7 k})(l - 2II|wl|) < max {ek,Y/k} + 2IIwII O(cTxk - *) 20

To see (2), from Theorem 12 Part 3 k+1 - k = 4 + ^ - 9OZr O - ^lA1 - k v N VN+1 - VN=V +AvN + V ( Uk) - VN + 1 k) ( - N,kkA IlUk l12From Propositions 15, 16 and Theorem 8, after some simple algebra, we obtain jvk' -vIIkIv +zv, I2ak- 1 IT k * 2 IVllv NA - < |lv 11 + A - 1|| + I ||hN| + (C X c 1 - k Choosing k > L and substituting the formula for ak from Step 6, Part 2; Part 1 of this theorem, and the Lemma 10,k+1 - N IIV - 2 + O(CTXk _ C*)P IIUN VNl < < (cTxk - c*)P < (CTxk+l C*) where S1 > 0, 0 > 0, and. > 0 are appropriate constants. We have used 2ak - 1 1 - 2max {Ek e5, Yk} ---- k -<- -- k 1 ak max {e 50~ k} < O.50 We now investigate the corrector step. Theorem 23 There is an L > L and c > 0 such that for some k > L, assume II||vN-vJ < a(cxk - c*)P where 0.25 < p < 0.50. 1. Let p = 0.50. Then two iterations of the corrector step, Step 6, Part 3, can be taken, and for some 0 > 0, IIV+2 -;vl[ I< O(cTxk+2 - c*)2 2. Let 0.25 < p < 0.50. Then three iterations of the corrector step can be taken, and for some 0 > 0, +-,vll (ck+3 - )2 Proof From Lemma 21, we note that for all k > L IIw4lI ~ 9(cTxk - c)0-~25. 21

Let L > L be such that for all k > L, (cTxk - C*)P-~025 < 0.25. Now, consider k > L. Then from Proposition 16 = k 7T eTt/L(u) 20k ||d || -I |L < (1 + 6k)(l + IIwlI1 + Yk) 2 3 where 6k and Yk are positive and of the order O(cTxk -c*)2. Thus, during the corrector step, ak = Sk, and thus, from Proposition 16 Part 1 we get CTxk+l - C* e Tuk T =k 1 - CTxk _ C* 2 and thus from Theorem 5, Part 2(a), for all k > L O.25(cTxk - c') < cTxk+l - c' < 0.75(cTk - c'). (32) For the first corrector step, after substituting ak Theorem 12, we get'TUk k k+1 k eN ( vV N T - N kNk112) VN - Vv + T k (e v-Vk,N jluknl[ ) Thls. by simple algebra and substitutions we get: IIV+1 - VNIl < + I 1A,+Lv +v- Ivl,AN~+l 1 -!1 IIVNAw v, Nk 1 - \0k\ I1 -ISkl < PlV - V;112 + /(CTXk _ C*)2 < pa2(c:Txk - c*)2P + /3(cTxk - c )2 (33) < pl(cTxk - c*)2p < (4)2pp(Txk+l _- *)2P. Applying the same argument as above to one more corrector step, we obtain IIvl+2 - v*1 < a2(cTxk+2 - c*)4P (34) Now, if p = 0.50, then 4p = 2 and we have part (1). Otherwise after one more corrector step, we obtain I~'~+; - z'\II ~ a3(CTZk+3 - c*')8 + 3(c.T.k+3 - c*)2. (35) Since 8p > 2 for 0.25 < p < 0.50, we obtain part (2) and we are done. U 22

10 Rate of Convergence of the Accelerated Method We are now ready to prove the main theorem. Theorem 24 Let the sequences {xk}, {yk} and {sk} be generated by the accelerated affine scaling method. Then 1. xk -x* k X* 2. k -- y* 3. sk ) s* where x* lies in the relative interior of the optimal face of the primal, (y*, s*) is the analytic center of the optimal face of the dual. In addition, asymptotically, the sequence {cTxk-cTx*} converges three-step quadratically to zero; i.e., there is an L > 1, 0 > 0 and K = {L, L + 3, L + 6, * } such that the subsequence {cTxk - cTx*}keK converges quadratically to zero, or cTx(k+1)' cT* < 0(cT - cTx*)2 for k' e K. Proof Using equation (32) and Lemma 20 we note that, during corrector steps, asymptotically the sequence {Yk} decreases linearly in cTxk - c*. Also, using equations (34), (35), Lemma 10 and Proposition 15 asymptotically, if no predictor step is taken, the sequence {ek} decreases at least as fast as O(cTxk - c*)2. Thus, for some large k, we must have 0.5 < k (36) and a predictor step must be taken. The first time the condition of equation (36) occurs, if Ek > Yk then we enter the predictor step with 1 > p > 0.50; and, after the predictor step we enter the corrector step with p < 0.50 and thus three steps may be required to guarantee, by Lemma 10 and Proposition 15 that we enter the next predictor step with p = 1. Then every subsequent corrector step will be entered with p = 0.50, and will thus require at most two corrector iterations to satisfy the conditions for the predictor step, i.e, enter the predictor step with p = 1. Thus, if during the first corrector step, three corrector iterations are taken, from Theorem 23, we guarantee that the predictor step is initiated with p = 1, and the three step quadratic convergence of {cTxk - c*} follows. 23

It is readily confirmed from Theorems 22 and 23 that I1o - V1 -- 0 and thus vv - v. But this can only happen, by Theorem 13, if (yk,sk) converges to the analytic center of D n {s: SN > 0} and we are done. U 11 Relative efficiency of Three Step Method Ostrowski [14] section 6.11, introduced the following measure to compare algorithms achieving different rates of convergence. As is evident, greater rates of convergence can be achieved by increasing the work. The simplest way to get order four convergence of a sequence generated by quadratically convergent Newton's method is to drop each odd element of the sequence. Ostrowski's measure is invariant under such manipulations. For a method which requires w units of work per iteration and achieves a convergence rate of p, the measure of efficiency is defined as log(p) IV In the table below, we summarize the efficiency of three methods. Algorithm Rate Work/Iter Efficiency Factor Tsuchiya and Monterio 1.3 2w 0.3 0.5678 I Saigal 2 3w 105 1.00 Two step Quadratic 2 2w 0.34657 1.50 The two step superlinear algorithm of Tsuchiya and Monterio [20] is about 44% slower than the algorithm presented here, which is 50% slower that a two step quadratic algorithm. 24

References [1] I. Adler, N. K. Karmarkar, M. G. C. Resende and G. Veiga, "An implementation of Karmarkar's algoritlhm for linear progra.mming," Mathematical Progamming 44 (1989) 297 - 335. [2] E. R. Barnes, "A Variation on Karmarkar's algorithm for solving linear programming problems," Mathematical Programming 36 (1986) 174 - 182. [3] J. R. Birge and C. Rosa, "A simplified proof of the convergence of affine scaling," Technical Report, Department of Industrial and Operations Engineering, University of Michigan, (Ann Arbor, Michigan 1991). [4] I. I. Dikin, "Iterative solution of problems of linear and quadratic programming," Soviet Mathematics Doklady 8 (1967) 674 - 675. [5] I. I. Dikin, " On the convergence of an iterative process," Upravlyaemye Sistemi 12 (1974) 54-60. (In Russian). [6] 1. I. Dikin, " The convergence of dua.l variables," Technical Report, Siberian Energy Institute, Irkutsk, Russia, December 1991. [7] I. I. Dikin, " Determination of interior point of one system of linear inequalities," Kibernetica and system analysis 1 (1992). [8] C. C. Gonzaga, " Convergence of the large step primal affine-scaling algorithm for primal nondegenerate linear programs," Technical Report ES-230/90, Dept. of Systems Engineering and Computer Science, COPPE Federal University of Rio de Janeiro, 21941 Rio de Janeiro, RJ, Brazil, September 1990. [9] L. A. Hall and R. J. Vanderbei " Two-thirds is sharp for affine scaling," Technical Report SOR-92/9, Department of Civil Engineering and Operations Research, Princeton University, Princeton, New Jersey, 08544, September 1992. [10] N. Karmarkar, "A new polynomial-time algorithm for linear programming," Combinatorica 4 (1984) 373 - 395. 25

[11] S. Mehrotra, " Quadratic convergence in a primal-dual method," Technical Report 9115, Dept. of Industrial Engineering and Management Science, Northwestern University, Evanston, IL 60208, USA, September 1991. [12] R. D. C. Monteiro, T. Tsuchiya, and Y. Wang, " A simplified global convergence proof of the affine scaling algorithm," Technical Report, Dept. of Systems and Industrial Engineering, University of Arizona, Tucson, AZ 85721, USA, October 1992. [13] M. Muramatsu and T. Tsuchiya, " A convergence analysis of a long-step variant of the projective scaling algorithm," Research Memorandum 454, The Institute of Statistical Mathematics, 4-6-7 Minami-Azabu, Minato-ku, Tokyo 106, Japan, November 1992. [14] A. M. Ostrowski, Solution of Equations and Systems of Equations, Academic Press, New York, (1960). [15] R. Saigal " A simple proof of the primal affine scaling method," Technical Report No. 92-60, Department of Industrial and Operations Engineering, University of Michigan, Ann Arbor, MI 48109, December 1992. [16] P. Tseng and Z. Q. Luo," On the convergence of the affine-scaling algorithm," Mathematical Programming 56 (1992) 301-319. [17] T. Tsuchiya, " Global convergence of the affine-scaling methods for degenerate linear programming problems," Mathematical Programming 52 (1991) 377-404. [18] T. Tsuchiya, " Global convergence property of the affine scaling method for primal degenerate linear programming problems," Mathematics of Operations Research 17 (1992) 527- 557. [19] T. Tsuchiya and M. Muramatsu, " Global convergence of a long-step affine scaling algorithm for degenerate linear programming problems," Research Memorandum 423, The Institute of Statistical Mathematics, 4-6-7 Minami-Azabu, Minato-ku, Tokyo 106, Japan, January 1992 (revised, September, 1992). [20] T. Tsuchiya and R. D. C. Monterio " Superlinear Convergence of the Affine Scaling Algorithm," Research Memorandum, November 28, 1992. 26

UNIVERSITY OF MICHIGAN 3 9015 04735 0445 [21] R. J. Vanderbei, M. J. Meketon and B. A. Freedman, "A Modification of Karmarkar's linear programming algorithm," Algorithmica 1 (1986) 395 - 407. [22] R. J. Vanderbei and J. C. Lagarias, " I. I. Dikin's convergence result for the affinescaling algorithm," In J. C. Lagarias and M. J. Todd, editors, Mathematical Developments Aris- ing from Linear Programming: Proceedings of a Joint Summer Research Conference held at Bowdoin College, Brunswick, Maine, USA, June/July 1988 volume 114 of Contemporary Mathematics, pages 109-119. American Mathematical Society, Providence, Rhode Island, USA, 1990. [23] C. Witzgall, P. T. Boggs, and P. D. Domich, " On the convergence behavior of trajectories for linear programming," In J. C. Lagarias and M. J. Todd, editors, Mathematical Developments Arising from Linear Programming: Proceedings of a Joint Summer Research Conference held at Bowdoin College, Brunswick, Maine, USA, June/July 1988, volume 114 of Contemporary Mathematics, pages 161-187. American Mathematical Society, Providence, Rhode Island, USA, 1990. [24] Y. Ye, O. Guler, R. A. Tapia, and Y. Zhang, " A quadratically convergent O(V/-L)- iteration algorithm for linear programming," Technical Report TR-91-26, Dept. of Mathematical Sciences, R.ice University, Houston, TX 77251, USA, August 1991. 27