EFFICIENT VARIANTS OF THE AFFINE SCALING METHOD

Romesh Saigal
Department of Industrial and Operations Engineering
The University of Michigan
Ann Arbor, MI 48109-2117

Technical Report 93-21
August 1993

Efficient Variants of the Affine Scaling Method

Romesh SAIGAL
Department of Industrial and Operations Engineering, The University of Michigan, Ann Arbor, Michigan 48109-2117, USA

August 20, 1993 (Revised October 22, 1993)

Abstract

In this paper we present a variant of the affine scaling method. This variant is defined by choosing a real r > 0.5, and is similar to the power barrier variant of the primal-dual homotopy methods considered by den Hertog, Roos and Terlaky and by Sheu and Fang. Here we analyse the variant for r > 1. The analysis for 0.5 < r < 1 is similar, and can be readily carried out with minor modifications. Under the non-degeneracy assumption, we show that the variant converges for any choice of the step size α < 1. To analyse the convergence without the non-degeneracy assumption we define the power center of a polytope. We use the connection between the computation of the power center by Newton's method and the steps of the variant to generalize the 2/3rd step size result of Tsuchiya and Muramatsu. We show that with a constant step size α satisfying α/(1 - α)^{2r} < 2/(2r - 1), and with variable step sizes α_k that asymptotically satisfy α_k/(1 - α_k)^{2r-1} ≤ 2/(2r - 1) (the bound is 2/3 when r = 1), the primal sequence converges to the relative interior of the optimal primal face, and the dual sequence converges to the power center of the optimal dual face. We also present an accelerated version of the variant, and derive its two step and three step superlinear convergence rates; the two step rate increases with r and approaches quadratic convergence (rate 2) as r approaches infinity. Using the efficiency measure of Ostrowski, we note that the three step method for r = 4 is more efficient than the two step quadratically convergent method, which is the limit of the two step method as r approaches infinity.

Key words: Linear Programming, affine scaling methods, interior point methods, power barrier method, power center, merit function, superlinear convergence, efficient acceleration.

Abbreviated title: Variants of affine scaling method

1 Introduction

We consider here the linear programming problem:

    minimize c^T x
    subject to Ax = b    (1)
               x ≥ 0

with its dual

    maximize b^T y
    subject to A^T y + s = c    (2)
               s ≥ 0

where A is an m × n matrix and b and c are appropriate vectors. We also assume that

Assumption 1 The primal linear program has an interior solution.

Assumption 2 The objective function is not constant on the primal feasible region.

Assumption 3 The matrix A has rank m.

In this paper we consider the application of the affine scaling method and its variants to solving this problem. The first (the primal method) was proposed by Dikin [6] in 1967, who subsequently proved its convergence under the non-degeneracy assumption, Dikin [7]. This method was rediscovered, Barnes [2], Karmarkar and Ramakrishnan [12]; and several of its variants, like the dual and the primal-dual methods, Vanderbei, Meketon and Freedman [25], were generated in the process of implementing the projective transformation method of Karmarkar [11]. The convergence of this method has been shown for step sizes α ≤ 2/3, Tsuchiya and Muramatsu [24], and simpler proofs of this result have been developed by Monteiro, Tsuchiya and Wang [13], and Saigal [16]. In addition, accelerated versions of this method, which attain superlinear convergence, have also been developed by Tsuchiya and Monteiro [23] and Saigal [17]. In the convergence rate analysis, a connection between the iterates of the method and Newton's method applied to computing the analytic center of a certain polyhedron is exploited. A local potential function, which is the objective function of the analytic center problem, is investigated in the proof of convergence, and it is shown that the primal

sequence converges to the relative interior of the optimal primal face while the dual sequence converges to the analytic center of the optimal dual face.

For each r > 0.5, we will consider here the variant based on the following approximating problem:

    minimize c^T x
    subject to Ax = b    (3)
               ‖X_0^{-r}(x - x^0)‖ ≤ 1

where x^0 > 0 is a given interior point of the linear program (1) and X_0 is the diagonal matrix whose diagonal is x^0. When r = 1, the above approximating problem generates the (primal) affine scaling method. The variant generated by choosing r > 0.5 is analogous to the power barrier variant of the primal-dual homotopy (barrier) method of den Hertog, Roos and Terlaky [5] and Sheu and Fang [19]. Under the non-degeneracy assumption on the primal, we prove the convergence of the primal and the dual sequences to respective optimal solutions for any step size α < 1. To prove convergence under degeneracy, we introduce the concept of the power center of a polytope. We define two polytopes whose power centers are defined by the maximization of concave functions; these functions are related to each other in the same manner as dual norms are. By using the connection with the iterates of Newton's method applied to computing the power center of the polytope associated with the primal problem, we prove two results. In the first result we consider the case of a constant step size, and prove that if the step size α satisfies α/(1 - α)^{2r} < 2/(2r - 1), then the primal sequence converges to the relative interior of the primal optimal face, and the dual sequence to the power center of the optimal dual face. In the second result, which gives the same convergence behavior of the sequences as the first, we consider the case of a variable step size, and implement the step size α_k at iteration k. We choose the sequence {α_k} so that it asymptotically satisfies α_k/(1 - α_k)^{2r-1} ≤ 2/(2r - 1), and require it to be uniformly bounded away from the value at which equality holds, which is 2/3 for r = 1.
Our result can thus be considered as a generalization of the 2/3rd result of Tsuchiya and Muramatsu [24]. In both these cases, the proof is obtained by considering a merit function (which is the objective function of the power center problem related to the primal); it plays the same role as the local potential function used in the analysis of the affine scaling method, Tsuchiya [21], Dikin [9] and Tsuchiya and Muramatsu [24].

By exploiting the relationship between the Newton and the affine scaling iterates, we present here an accelerated version of the variant, which generalizes the accelerated version of Saigal [17]. We prove its convergence, and show that the primal sequence converges to the relative interior of the optimal primal face while the dual sequence converges to the power center of the optimal dual face. In addition, for each r > 1, we obtain two step and three step superlinear convergence rates for the accelerated variant; the two step rate increases with r and approaches the quadratic rate 2 as r approaches infinity. Using the measure of Ostrowski [15], we investigate the efficiency of each of these methods, and note that for each r > 1, the three step version is more efficient than the two step version.

In this paper we restrict our attention to values of r > 1, and note that, with minor modifications in several formulae as well as in the objective functions of the power center problems, our analysis carries over to values 0.5 < r < 1 as well.

Besides the introduction, this paper has 15 other sections. In section 2 we present the variant, in section 3 we prove the convergence of the primal sequence, and in section 4 we prove some properties of the dual and the primal-dual sequences. In section 5, we prove the convergence of the sequences to optimality under the primal non-degeneracy assumption. In section 6 we introduce some translated sequences and prove some of their important properties. In section 7 we introduce the two power centers, and establish a connection between them. In section 8 we investigate the properties of Newton's method when computing the power center, and in section 9 we re-interpret the affine scaling steps in the same polytope. In section 10 we relate the Newton and affine scaling iterates. In section 11 we introduce the merit function and prove the convergence to optimality without the primal non-degeneracy assumption and with a constant step size, while in section 12 we prove the same result with a variable step size.
In section 13 we present the accelerated variant, and prove its convergence. In section 14 we derive the convergence rate of the accelerated variant, while in section 15 we investigate its efficiency. Finally, we end the paper with concluding remarks in section 16.

2 The Variant

We now present the variant generated when r > 0.5. Throughout, X_k denotes the diagonal matrix whose diagonal entries are the components of the interior point x^k.

Step 0 Let x^0 be an interior point solution, let 0 < α < 1, and set k = 0.

Step 1 Tentative Solution to the Dual:

    y^k = (A X_k^{2r} A^T)^{-1} A X_k^{2r} c.

Step 2 Tentative Dual Slack:

    s^k = c - A^T y^k.

If s^k ≤ 0 then STOP; the problem (1) is unbounded.

Step 3 Min-Ratio Test:

    θ_k = φ(X_k^{2r-1} s^k)/‖X_k^{2r-1} s^k‖, where φ(t) = max_j t_j.

Note that 0 < θ_k ≤ 1, and that 1/(θ_k ‖X_k^{2r-1} s^k‖) = min{t > 0 : x^k - t X_k^{2r} s^k has a zero component}. If θ_k = 1, set α = 1.

Step 4 Next Interior Point:

    x^{k+1} = x^k - α (X_k^{2r} s^k)/(θ_k ‖X_k^{2r-1} s^k‖).

Step 5 Iterative Step: If x_j^{k+1} = 0 for some j, then STOP; x^{k+1} is an optimal solution. Otherwise set k = k + 1 and go to Step 1.

Let the above variant generate the infinite sequence {x^k}, and let the sequence {c^T x^k} be bounded below. (In the contrary case, we can conclude that the dual of the linear program (1) is not feasible.) We can then prove:

Theorem 1 The sequence {c^T x^k} is monotone strictly decreasing, and thus converges, say to c*.

Proof: From Step 4, we note that

    c^T x^{k+1} = c^T x^k - α (c^T X_k^{2r} s^k)/(θ_k ‖X_k^{2r-1} s^k‖).

As can be readily established from the definitions, x^k > 0 and 0 < θ_k ≤ 1. Also,

    c^T X_k^{2r} s^k = c^T X_k^{2r}(c - A^T y^k) = c^T X_k^r (I - X_k^r A^T (A X_k^{2r} A^T)^{-1} A X_k^r) X_k^r c = ‖P_k X_k^r c‖²
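The steps above can be illustrated on a tiny instance. The following is a minimal sketch, not from the paper: it takes m = 1, so that the normal matrix A X_k^{2r} A^T of Step 1 is a scalar, and the data A = (1, 1), b = 1, c = (1, 2) and all function names are illustrative assumptions only.

```python
def variant_step(x, a, c, r=1.0, alpha=0.5):
    """One pass through Steps 1-4 of the variant for a single-constraint
    program: minimize c^T x subject to a^T x = b, x >= 0 (m = 1, so the
    normal matrix A X^{2r} A^T reduces to a scalar)."""
    x2r = [xi ** (2 * r) for xi in x]
    # Step 1: tentative dual solution y = (A X^{2r} A^T)^{-1} A X^{2r} c
    y = sum(ai * w * ci for ai, w, ci in zip(a, x2r, c)) / \
        sum(ai * ai * w for ai, w in zip(a, x2r))
    # Step 2: tentative dual slack s = c - A^T y
    s = [ci - ai * y for ci, ai in zip(c, a)]
    if all(si <= 0 for si in s):
        raise ValueError("the problem is unbounded")
    # Step 3: phi(X^{2r-1} s) = max_j x_j^{2r-1} s_j = theta_k ||X^{2r-1} s||
    phi = max(xi ** (2 * r - 1) * si for xi, si in zip(x, s))
    # Step 4: next interior point x - alpha X^{2r} s / phi(X^{2r-1} s)
    return [xi - alpha * xi ** (2 * r) * si / phi for xi, si in zip(x, s)]

x1 = variant_step([0.5, 0.5], a=[1.0, 1.0], c=[1.0, 2.0])
```

Starting from x^0 = (0.5, 0.5), one step with α = 0.5 keeps the iterate strictly positive and strictly decreases c^T x (from 1.5 to 1.25), as Theorem 1 asserts.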

where P_k = I - X_k^r A^T (A X_k^{2r} A^T)^{-1} A X_k^r is the matrix of the orthogonal projection onto the null space N(A X_k^r) of the matrix A X_k^r. Now, by a simple calculation, we see that ‖P_k X_k^r c‖ = ‖X_k^r s^k‖, and thus we have

    c^T x^{k+1} = c^T x^k - α ‖X_k^r s^k‖²/(θ_k ‖X_k^{2r-1} s^k‖).    (4)

From Assumption (2) the subtracted term in the above formula is non-zero, and we are done, since every bounded monotone sequence converges. ∎

3 Primal Sequence

In this section we consider some important properties of the primal sequence {x^k} and also establish its convergence. Consider the approximating problem defined, at x^k, by (3). Setting p = x - x^k, we obtain the equivalent problem:

    minimize c^T p
    subject to Ap = 0    (5)
               ‖X_k^{-r} p‖ ≤ 1.

The following result is stated without proof, which can be found in the cited reference.

Theorem 2 Let p^k solve the approximating problem (5). There is a ρ > 0 such that for each k = 1, 2, ..., -c^T p^k ≥ ρ ‖p^k‖.

Proof: See Corollary 6, Saigal [16]. ∎

We can now prove the convergence of the primal sequence.

Theorem 3 Let {c^T x^k} be bounded below. Then the sequence {x^k} converges, say, to x*.

Proof: It is readily seen that the solution to the approximating problem (5) is

    p^k = -X_k^{2r} s^k/‖X_k^r s^k‖.

For each k, define λ_k ≥ 0 such that x^{k+1} - x^k = λ_k p^k. From Theorem 2 we obtain the relation

    ∞ > c^T x^1 - c* ≥ Σ_{k=1}^∞ c^T(x^k - x^{k+1}) ≥ ρ Σ_{k=1}^∞ ‖x^{k+1} - x^k‖.

From the above relation, we see that the sequence {x^k} is a Cauchy sequence, and thus converges to some vector x*, and we are done. ∎

The following simple corollary is a consequence of Theorem 2.

Corollary 4 There is a ρ > 0 such that for all k = 1, 2, ...

    c^T x^k - c* ≥ ρ ‖x^k - x*‖.

Proof: Let k be arbitrary. The following relation is a consequence of Theorem 2 and the triangle inequality:

    c^T x^k - c* = Σ_{j=0}^∞ c^T(x^{k+j} - x^{k+j+1}) ≥ ρ Σ_{j=0}^∞ ‖x^{k+j+1} - x^{k+j}‖ ≥ ρ ‖x^k - x*‖,

and we are done. ∎

Given that the sequence {x^k} converges to x*, define

    B = {j : x*_j > 0},
    N = {j : x*_j = 0}.    (6)

4 Dual and Primal-Dual Sequences

In this section we consider the dual sequences {y^k} and {s^k} and the primal-dual sequence {X_k s^k}, and investigate their properties. The following result is well known and its proof can be found in the cited reference.

Theorem 5 The sequences {y^k} and {s^k} are bounded.

Proof: Follows from Step 1 and Theorem 4 of Saigal [16]. ∎

Now consider the sequence {X_k s^k}. We now prove that it converges to 0.

Theorem 6 X_k s^k → 0.

Proof: From Theorem 1, {c^T x^k} converges, and thus from relation (4), we note that

    α ‖X_k^r s^k‖²/(θ_k ‖X_k^{2r-1} s^k‖) → 0.

Since {x^k} converges and {s^k} is bounded, the denominator θ_k ‖X_k^{2r-1} s^k‖ = φ(X_k^{2r-1} s^k) of the above expression is bounded, and thus X_k^r s^k → 0. Since {s^k} is bounded, |x_j^k s_j^k| = ((x_j^k)^r |s_j^k|)^{1/r} |s_j^k|^{1-1/r} → 0 for each j, and we are done. ∎

5 Convergence of Dual Sequence

We now show that if the primal is non-degenerate, the dual sequences also converge, and the limit points are optimal for the respective problems. We do this in the next theorem.

Theorem 7 Let Assumptions (1) - (3) hold, and let the primal be non-degenerate. Then

1. x^k → x*,
2. y^k → y*,
3. s^k → s*,

where x* is an optimal solution to the primal linear program, and (y*, s*) is an optimal solution to the dual linear program.

Proof: Using Theorem 5, let (y*, s*) be a cluster point of the sequence {(y^k, s^k)}. From Theorem 6, s*_B = 0, and thus A_B^T y* = c_B. From the non-degeneracy assumption, A_B has full row rank m, and thus the above system has at most one solution. But each cluster point y* of {y^k} solves this system; thus the sequence has only one cluster point y*, and so y^k → y*

and thus s^k → s*. Now assume that for some j ∈ N, s*_j < 0. Then there is an L ≥ 1 such that for all k ≥ L, s_j^k < 0. Thus, from Step 4,

    x_j^{k+1} = x_j^k - α (x_j^k)^{2r} s_j^k/(θ_k ‖X_k^{2r-1} s^k‖) > x_j^k,

and thus x_j^k does not converge to 0, and we have a contradiction. Thus s*_N ≥ 0, and so (y*, s*) is dual feasible, and the theorem follows from the complementary slackness theorem. ∎

6 More on Sequences

Our aim is to investigate the convergence without the non-degeneracy assumption, as well as the convergence rate of the sequences generated by the variant, and, if possible, to accelerate the asymptotic convergence rate by controlling the step size selection, as in Tsuchiya and Monteiro [23] and Saigal [17]. For this purpose we derive some important properties of the sequences generated by the variant. Consider the translated sequences:

    v^k = (x^k - x*)/(c^T x^k - c*),
    u^k = X_k^r s^k/(c^T x^k - c*)^r,    (7)
    p^k = X_k^{2r} s^k.

The following are simple consequences of the results already established.

Proposition 8 The sequence {v^k} is bounded.

Proof: Follows from Corollary 4. ∎

Proposition 9 The sequence {u_N^k} is bounded.

Proof: Follows from the definition of u^k and Corollary 4. ∎

Proposition 10 For every k = 1, 2, ..., A p^k = 0.

Proof: Readily follows by substituting the definitions of p^k and y^k. ∎

Given B and N as defined by relations (6), we define the set of all dual estimates that are complementary to x* as the polyhedron:

    D = {(y, s) : A^T y + s = c, s_B = 0}.    (8)

We can then prove:

Proposition 11 D ≠ ∅.

Proof: From Theorem 5, the sequences {y^k} and {s^k} are bounded, thus on some common subsequence K, y^k → y* and s^k → s*. Using Theorem 6, it is readily established that (y*, s*) ∈ D. ∎

Consider an arbitrary (ȳ, s̄) ∈ D. We can show that

Theorem 12 For each k = 1, 2, ...

1. c^T x^k - c* = s̄_N^T x_N^k.
2. There are constants α > 0 and β > 0 such that α (c^T x^k - c*) ≤ ‖x_N^k‖ ≤ β (c^T x^k - c*).
3. s̄_N^T v_N^k = 1.

Proof: Part 1 of the theorem follows from the following identity:

    c^T x^k - c* = c^T(x^k - x*) = (A^T ȳ + s̄)^T(x^k - x*) = s̄_N^T x_N^k.    (9)

The upper bound of Part 2 follows from Corollary 4, and the lower bound from Part 1. Part 3 readily follows from the identity (9) and the definition of v^k. ∎

As a consequence of Theorem 12, we can define the polyhedron

    P = {v : A_N v_N + A_B v_B = 0, s̄_N^T v_N = 1, v_N ≥ 0}    (10)

and we note that the sequence {v^k} lies in P. We are now ready to prove two important results.

Proposition 13 There exist ρ_1 > 0 and ρ_2 > 0 such that for every k = 1, 2, ...

1. ‖X_k^r s^k‖² ≤ ρ_1 φ(x_{k,N})^r ‖X_{k,N}^r s_N^k‖.
2. ‖p_B^k‖ ≤ ρ_2 φ(x_{k,N})^r ‖X_{k,N}^r s_N^k‖.

Proof: Using the argument of the proof of Theorem 1 and Proposition 10, for (ȳ, s̄) ∈ D we obtain:

    ‖X_k^r s^k‖² = c^T p^k = (A^T ȳ + s̄)^T p^k = s̄_N^T p_N^k = s̄_N^T X_{k,N}^r (X_{k,N}^r s_N^k) ≤ ‖s̄_N‖ φ(x_{k,N})^r ‖X_{k,N}^r s_N^k‖,

and Part 1 follows with ρ_1 = ‖s̄_N‖. To see Part 2, note that from Theorem 2, for some ρ > 0,

    ‖p_B^k‖ ≤ ‖p^k‖ ≤ (1/ρ) c^T p^k = (1/ρ) ‖X_k^r s^k‖²,

and the result follows from Part 1 with ρ_2 = ρ_1/ρ. ∎

Proposition 14 There are constants ρ_1 > 0, ρ_2 > 0 and L ≥ 1 such that for all k ≥ L

1. ‖u_B^k‖ ≤ ρ_1 (c^T x^k - c*)^r ‖u_N^k‖.
2. ‖s_B^k‖ ≤ ρ_2 (c^T x^k - c*)^{2r}.

Proof: Since x^k → x* and x*_B > 0, there is a ρ > 0 and an L ≥ 1 such that for all k ≥ L, ‖X_{k,B}^{-1}‖ ≤ ρ. Let k ≥ L. Since u_B^k = X_{k,B}^{-r} p_B^k/(c^T x^k - c*)^r, Part 1 follows from Proposition 13 Part 2, the bound φ(x_{k,N}) ≤ β (c^T x^k - c*) of Theorem 12, and ‖X_{k,N}^r s_N^k‖ = (c^T x^k - c*)^r ‖u_N^k‖. To see Part 2, note that

    ‖s_B^k‖ = ‖X_{k,B}^{-r} (c^T x^k - c*)^r u_B^k‖ ≤ ρ^r ρ_1 (c^T x^k - c*)^{2r} ‖u_N^k‖.

Substituting the bound of Proposition 9, our result follows. ∎

7 Two Power Centers and their Relation

In this section we consider the situation when r > 1, and the polytopes P and D ∩ {s : s_N ≥ 0} defined by (10) and (8) respectively. We define the power center of P as the solution to the following concave maximization problem:

    maximize -Σ_{j∈N} v_j^{-2(r-1)}
    subject to A_N v_N + A_B v_B = 0    (11)
               s̄_N^T v_N = 1
               v_N > 0

where the K.K.T. conditions defining the center are:

    2(r - 1) V_N^{-(2r-1)} e - A_N^T z - θ s̄_N = 0    (12)
    -A_B^T z = 0    (13)
    A_N v_N + A_B v_B = 0    (14)
    s̄_N^T v_N = 1.    (15)

We also define the power center of D ∩ {s : s_N ≥ 0} as the solution to the concave maximization problem:

    maximize Σ_{j∈N} s_j^{2(r-1)/(2r-1)}
    subject to A_N^T y + s_N = c_N    (16)
               A_B^T y = c_B
               s_N > 0

where the K.K.T. conditions defining the center are:

    (2(r - 1)/(2r - 1)) S_N^{-1/(2r-1)} e - u_N = 0    (17)
    A_N u_N + A_B u_B = 0    (18)
    A_N^T(y - ȳ) + (s_N - s̄_N) = 0    (19)
    A_B^T(y - ȳ) = 0    (20)

where (ȳ, s̄) is an arbitrary element of D, so that s̄_B = 0. There is an intimate relationship between these two centers, and we explore this in the next theorem.
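To make the notion of a power center concrete, consider (an illustrative toy, not from the paper) the polytope {v ∈ R² : v_1 + v_2 = 1, v > 0}, i.e. s̄_N = e with no A-constraints, and take r = 2, so that the power center maximizes -v_1^{-2} - v_2^{-2}. Eliminating v_2 = 1 - v_1 reduces the K.K.T. system to one dimension, and Newton's method (studied in section 8) becomes a scalar iteration; a minimal sketch:

```python
def power_center_1d(r=2.0, v=0.3, iters=25):
    """Newton's method for the power center of {v1 + v2 = 1, v > 0}:
    minimize f(v1) = v1^{-p} + (1 - v1)^{-p}, p = 2(r - 1), over 0 < v1 < 1.
    By symmetry the center is v1 = v2 = 1/2."""
    p = 2.0 * (r - 1.0)
    for _ in range(iters):
        g = -p * v ** (-p - 1) + p * (1.0 - v) ** (-p - 1)          # f'(v)
        h = p * (p + 1) * (v ** (-p - 2) + (1.0 - v) ** (-p - 2))   # f''(v) > 0
        v -= g / h                                                   # Newton step
    return v

center = power_center_1d()
```

The iteration converges quadratically once it is close to the center, in line with Theorem 17 below; the same sketch with r = 4 (the value singled out in the efficiency comparison of section 15) also converges to 1/2.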

Theorem 15 (v*, z*, θ*) is a center of P if and only if the corresponding (s*, u*, y*) is a center of D ∩ {s : s_N > 0}. Indeed, if (v*, z*, θ*) solves the system (12) - (15), then

    s*_N = (2(r - 1)/θ*) V_{*,N}^{-(2r-1)} e,
    y* = ȳ + z*/θ*,
    u* = (2(r - 1)/(2r - 1)) (2(r - 1)/θ*)^{-1/(2r-1)} v*

will solve the system (17) - (20). Now let the center of D ∩ {s : s_N > 0} exist and the system (17) - (20) have the solution (s*, u*, y*). Then it can be verified that the transformation

    β = ((2r - 1)/(2(r - 1))) (Σ_{j∈N} (s*_j)^{2(r-1)/(2r-1)})^{-1},
    v = β u*

will solve the system (12) - (15), for appropriate z and θ. Here we have used s̄_N^T u*_N = (s*_N)^T u*_N, which is readily established using equations (18) - (20). ∎

8 Newton's Method and Power Center of P

The power center of P is defined by the system of equations (12) - (15). This is a non-linear system of equations, to which we can apply Newton's method to find its zero. The purpose of this section is to investigate this application. Given v, z, θ, with v ∈ P, the Newton direction (Δv, Δz, Δθ) is given by:

    -2(r - 1)(2r - 1) V_N^{-2r} Δv_N - A_N^T Δz - Δθ s̄_N = -2(r - 1) V_N^{-(2r-1)} e + A_N^T z + θ s̄_N    (21)

    -A_B^T Δz = A_B^T z    (22)
    A_N Δv_N + A_B Δv_B = 0    (23)
    s̄_N^T Δv_N = 0.    (24)

Consider the change of variables:

    ŷ = (z + Δz)/(2(r - 1)),
    θ̂ = (θ + Δθ)/(2(r - 1)),    (25)
    Δv̂_B = (2r - 1) Δv_B,
    w_N = (2r - 1) V_N^{-r} Δv_N.

Substituting this change in the system (21) - (24), we can derive the following equivalent system, with Ã_N = A_N V_N^r and s̃_N = V_N^r s̄_N:

    w_N + Ã_N^T ŷ + θ̂ s̃_N = V_N^{-(r-1)} e    (26)
    A_B^T ŷ = 0    (27)
    Ã_N w_N + A_B Δv̂_B = 0    (28)
    s̃_N^T w_N = 0.    (29)

We can then prove the following result:

Theorem 16 Up to the choice of Δv̂_B, the solution to the system (26) - (29) is unique. Also,

1. e^T V_N^{-(r-1)} w_N = w_N^T w_N.
2. ‖w_N‖² ≤ e^T V_N^{-2(r-1)} e.

Proof: The uniqueness follows from the fact that when the columns of the matrix A_B are linearly independent, the system has a unique solution. Otherwise, it is readily confirmed that a unique solution can be found by replacing A_B with a submatrix spanning the column space R(A_B) of A_B; this only changes the value of Δv̂_B. Multiplying equation (26) by w_N^T we obtain

    w_N^T w_N + w_N^T Ã_N^T ŷ + θ̂ w_N^T s̃_N = e^T V_N^{-(r-1)} w_N.

The second and the third terms on the left hand side of the above expression are, from equations (27) - (29), readily seen to be zero, and we have Part 1. To see Part 2, note that by Part 1 and the Cauchy-Schwarz inequality, w_N^T w_N = e^T V_N^{-(r-1)} w_N ≤ ‖V_N^{-(r-1)} e‖ ‖w_N‖, and thus ‖w_N‖² ≤ ‖V_N^{-(r-1)} e‖² = e^T V_N^{-2(r-1)} e. ∎

By an appropriate choice of a submatrix of A_B, we can guarantee that the system (26) - (29) is defined by a non-singular matrix, for every v > 0. In that case, the following is a well known result relating to the rate of convergence of Newton's method, which we state without proof. Assume that the sequence {v^k} in P converges to the power center v*. Then the following can be proved about the Newton steps taken at the iterates v^k:

Theorem 17 There is an L ≥ 1 and constants ρ_1 > 0, ρ_2 > 0 such that for all k ≥ L,

1. v_{k,j}/v_{*,j} = 1 + δ_{k,j} for each j ∈ N, with |δ_{k,j}| ≤ ρ_1 ‖v_N^k - v*_N‖.
2. ‖v_N^k + Δv_N^k - v*_N‖ ≤ ρ_2 ‖v_N^k - v*_N‖².

9 Affine Scaling Step in P

The affine scaling step is defined by solving the optimization problem (5). Using Part 1 of Theorem 12, we can restate the problem (5) as:

    minimize s̄_N^T p_N
    subject to A_N p_N + A_B p_B = 0
               p_N^T X_{k,N}^{-2r} p_N + p_B^T X_{k,B}^{-2r} p_B ≤ 1.

The K.K.T. conditions for the above problem are:

    s̄_N + A_N^T λ + 2δ X_{k,N}^{-2r} p_N = 0    (30)
    A_B^T λ + 2δ X_{k,B}^{-2r} p_B = 0    (31)
    A_N p_N + A_B p_B = 0    (32)
    ‖X_k^{-r} p‖ = 1.    (33)

Using the definitions of u^k and v^k given in (7), one first verifies, as in the proof of Theorem 1, that the solution of (30) - (33) is p^k = -X_k^{2r} s^k/‖X_k^r s^k‖, with multipliers 2δ = ‖X_k^r s^k‖ and λ = ȳ - y^k. Setting

    y' = λ/‖u^k‖²,
    θ' = 1/‖u^k‖²,    (34)
    p'_B = p_B^k/((c^T x^k - c*)^r ‖u^k‖),

we can rewrite the system (30) - (33) as:

    u_N^k/‖u^k‖² - Ã_N^T y' - θ' s̃_N = 0    (35)
    A_B^T y' = s_B^k/‖u^k‖²    (36)
    Ã_N u_N^k/‖u^k‖² - A_B p'_B = 0    (37)
    s̃_N^T u_N^k/‖u^k‖² = 1    (38)

where Ã_N = A_N V_{k,N}^r and s̃_N = V_{k,N}^r s̄_N, as in section 8.

Theorem 18 Consider the systems represented by the equations (30) - (33) and (35) - (38).

1. (30) - (33) have a unique solution, which generates a solution to (35) - (38).
2. The solution to (35) - (38) is unique up to the choice of p'_B; and there is a value of p'_B for which the resulting solution also solves (30) - (33).
3. When A_B has full column rank, the two systems of equations are equivalent.

Proof: Since the equations (30) - (33) represent the solution to a strictly convex problem, they have a unique solution. It is now readily confirmed, by simple algebra, that y', θ' and p'_B, defined by the change of variables (34), solve the system (35) - (38). Thus we have proved Part 1. From Part 1, it follows that (35) - (38) have a solution. Considering y', θ' and p'_B as variables, this system is linear in these variables. If A_B has full column rank, the solution to (35) - (38) is unique, and Part 3 follows. Otherwise, since (36) can have a solution only if s_B^k lies in the row space R(A_B^T) of A_B, the redundant constraints are readily identified by choosing any full column rank submatrix of A_B.

To see Part 2, let A_B = (A_C, A_D), where A_C has full column rank and spans the range (or column space) R(A_B) of A_B. Thus, for some unique matrix Λ, A_D = A_C Λ. Replacing equations (36) and (37) by

    A_C^T y' = s_C^k/‖u^k‖²    (39)

and

    Ã_N u_N^k/‖u^k‖² - A_C p'_C = 0    (40)

respectively, we obtain a new system that has a unique solution. By setting p'_B = (p'_C, p'_D) and letting p'_D = 0, the solution to equation (40) generates a solution to (37). Now, let (y', θ', p'_B) be any solution to (35) - (38). This then generates the unique solution (y', θ', p'_C - Λ p'_D) to (35) and (38) - (40). Since only the vector p'_B is modified in passing from one solution of (35) - (38) to another, Part 2 is established. ∎

We can also prove the following important property of the sequences.

Theorem 19 There exists a ρ > 0 and an L ≥ 1 such that for all k ≥ L,

    e^T V_{k,N}^{-(r-1)} u_N^k = 1 + δ_k, where |δ_k| ≤ ρ (c^T x^k - c*)^{2r}.

Proof: Multiplying equation (35) by e^T V_{k,N}^{-(r-1)} ‖u^k‖², and noting that e^T V_{k,N}^{-(r-1)} Ã_N^T = (A_N v_N^k)^T and e^T V_{k,N}^{-(r-1)} s̃_N = s̄_N^T v_N^k = 1, we obtain

    e^T V_{k,N}^{-(r-1)} u_N^k - ‖u^k‖² (A_N v_N^k)^T y' - 1 = 0.

Substituting A_N v_N^k = -A_B v_B^k and equation (36), we obtain

    e^T V_{k,N}^{-(r-1)} u_N^k + (v_B^k)^T s_B^k - 1 = 0.

The theorem now follows from Propositions 8 and 14. ∎

By making the change of variables

    w̃_N = V_N^{-(r-1)} e - w_N,
    Δṽ_B = v_B - Δv̂_B

in the system (26) - (29), we obtain the equivalent system:

    w̃_N - Ã_N^T ŷ - θ̂ s̃_N = 0    (41)
    A_B^T ŷ = 0    (42)
    Ã_N w̃_N + A_B Δṽ_B = 0    (43)
    s̃_N^T w̃_N = 1.    (44)

Comparing the systems (41) - (44) and (35) - (38), we note that the vector of differences

    a = (a_1, a_2, a_3, a_4) = (w̃_N - u_N^k/‖u^k‖², ŷ - y', θ̂ - θ', Δṽ_B - p'_B)

satisfies

    M(v) a = b,    (45)

where b = (0, -s_B^k/‖u^k‖², 0, 0)^T and

    M(v) = [ I        -Ã_N^T   -s̃_N   0   ]
           [ 0         A_B^T    0      0   ]
           [ Ã_N       0        0      A_B ]
           [ s̃_N^T    0        0      0   ].

We now prove an important result relating the solutions of the two systems.

Theorem 20 There is an L ≥ 1 and a β > 0 such that for all k ≥ L,

    w_N^k = V_{k,N}^{-(r-1)} e - u_N^k/‖u^k‖² + Δ^k, with ‖Δ^k‖ ≤ β ‖s_B^k‖/‖u^k‖².

Proof: It is readily confirmed that the system (45) represents the K.K.T. conditions of the following quadratic programming problem:

    minimize a_1^T a_1 + a_4^T s'_B
    subject to Ã_N a_1 + A_B a_4 = 0
               s̃_N^T a_1 = 0

where s'_B = s_B^k/‖u^k‖²,

and where we can assume, because of Theorems 16 and 18, that the columns of A_B are linearly independent. Thus, we can substitute a_4 = -(A_B^T A_B)^{-1} A_B^T Ã_N a_1 in the above optimization problem to obtain the equivalent problem

    minimize a_1^T a_1 + a_1^T ŝ
    subject to Â_N a_1 = 0
               s̃_N^T a_1 = 0

where Â_N = Ã_N - A_B (A_B^T A_B)^{-1} A_B^T Ã_N and ŝ = -Ã_N^T A_B (A_B^T A_B)^{-1} s_B^k/‖u^k‖². This is a quadratic programming problem whose K.K.T. multipliers (a_2, a_3) satisfy ‖(a_2, a_3)‖ ≤ η(A) ‖s_B^k‖/‖u^k‖² for a constant η(A) that is independent of the diagonal matrix V_{k,N}. Thus, defining

    Δ^k = -(Ã_N^T a_2 + s̃_N a_3),

we obtain from (35) and (41) that w_N^k = V_{k,N}^{-(r-1)} e - u_N^k/‖u^k‖² + Δ^k with ‖Δ^k‖ ≤ β ‖s_B^k‖/‖u^k‖², and we obtain our result. ∎

10 Affine Scaling Step and Newton Step

We now investigate the Newton step and the affine scaling step in the projected polyhedron P_N = {v_N : v ∈ P}. Consider the sequence {v^k} in P_N generated by the affine scaling method. Then:

Theorem 21 The affine scaling direction at v_N^k is

    v_N^{k+1} - v_N^k = -(α̂_k/(1 - α̂_k)) (V_{k,N}^r u_N^k/‖u^k‖² - v_N^k),

where α̂_k = α_k ‖u^k‖²/φ(V_k^{r-1} u^k).

Proof: We note that

    v_N^{k+1} = x_N^{k+1}/(c^T x^{k+1} - c*) = (v_N^k - α_k V_{k,N}^r u_N^k/φ(V_k^{r-1} u^k))/(1 - α_k ‖u^k‖²/φ(V_k^{r-1} u^k)),

since x_N^{k+1} = (c^T x^k - c*)(v_N^k - α_k V_{k,N}^r u_N^k/φ(V_k^{r-1} u^k)) and c^T x^{k+1} - c* = (c^T x^k - c*)(1 - α̂_k). Subtracting v_N^k, we are done. ∎

We can also define the Newton step at v_N^k.

Theorem 22 The Newton step at v_N^k is

    Δv_N^k = (1/(2r - 1)) (v_N^k - V_{k,N}^r u_N^k/‖u^k‖² + V_{k,N}^r Δ^k).

Proof: The result follows from the change of variables (25) and Theorem 20. ∎

11 On Convergence of Dual Sequence

We now investigate the convergence to optimality of the primal sequence and the convergence of the dual sequence, without the non-degeneracy assumption. We first prove a lemma and a proposition:

Lemma 23 There is an L ≥ 1 and a δ_1 > 0 such that for all k ≥ L, ‖u^k‖ ≥ δ_1.

Proof: Let q be the number of elements of N. From Theorem 19 there is an L' ≥ 1 such that for all k ≥ L', e^T V_{k,N}^{-(r-1)} u_N^k = 1 + δ_k with |δ_k| ≤ ρ (c^T x^k - c*)^{2r}; thus there is an L ≥ L' such that for every k ≥ L there is an index j ∈ N with

    v_{k,j}^{-(r-1)} u_{k,j} ≥ 1/(2q).

Now assume that, on some subsequence K, ‖u^k‖ → 0. Passing to a further subsequence K' ⊆ K on which the above index j is fixed, we have v_{k,j}^{r-1} ≤ 2q u_{k,j} → 0, and thus v_{k,j} → 0. But u_{k,j} = v_{k,j}^r s_j^k, and so

    s_j^k = (v_{k,j}^{-(r-1)} u_{k,j})/v_{k,j} ≥ 1/(2q v_{k,j}) → ∞,

a contradiction, as Theorem 5 implies that the dual sequence {s^k} is bounded, and we have our result. ∎

Proposition 24 There is an L ≥ 1, a β > 0 and a γ > 0 such that for every k ≥ L,

1. (c^T x^{k+1} - c*)/(c^T x^k - c*) = 1 - α_k δ(u_N^k) > 0, where δ(u_N^k) = ‖u^k‖²/φ(V_k^{r-1} u^k).
2. 1 - ‖V_{k,N}^{r-1} w_N^k‖ - δ̂_k ≤ δ(u_N^k) ≤ 1, with |δ̂_k| ≤ β (c^T x^k - c*)^{2r}.
3. φ(V_k^{r-1} u^k) = φ(V_{k,N}^{r-1} u_N^k).
4. δ(u_N^k) ≥ γ > 0.

Proof: Part 1 follows readily by substitution, Proposition 8 and Proposition 14 Part 1. To see Part 2, from Theorem 20,

    φ(V_{k,N}^{r-1} u_N^k)/‖u^k‖² = φ(e - V_{k,N}^{r-1} w_N^k + V_{k,N}^{r-1} Δ^k) ≤ 1 + ‖V_{k,N}^{r-1} w_N^k‖ + δ̂_k,    (46)

where the bound on δ̂_k follows from Propositions 8 and 14 and Lemma 23. The lower bound in Part 2 follows from (46), since for each real t > -1, 1/(1 + t) ≥ 1 - t; the upper bound follows from the fact that Part 1 holds for every α_k < 1. Part 3 readily follows from Propositions 8 and 14, and Part 4 from Lemma 23 and Propositions 8 and 9. ∎

We now show that, at least asymptotically, the sequence {c^T x^k} converges linearly. We do this in the next lemma.

Lemma 25 There is an L ≥ 1 and a 0 < β < 1 such that for each k ≥ L,

    (c^T x^{k+1} - c*) ≤ β (c^T x^k - c*),

and thus, for every σ > 0, Σ_{k=L}^∞ (c^T x^k - c*)^σ < ∞.

Proof: Follows from Proposition 24 Parts 1 and 4. ∎

When r = 1, the convergence to optimality is proved by using a merit function, called a local potential function, which is shown to decrease locally. This merit function also defines the analytic center. Here, we will use

    F_N(x) = Σ_{j∈N} v_j^{-2(r-1)}

as the merit function in our analysis; it is the negative of the objective function of the power center problem on P. We now prove a simple lemma related to this function:

Lemma 26 Let w and v > 0 be arbitrary q-vectors, let V be the diagonal matrix generated by v, and write Ṽ = V^{-2(r-1)}.

1. Let φ(-w) ≤ 0. Then

    Σ_{j=1}^q v_j^{-2(r-1)} ((1 + w_j)^{-2(r-1)} - 1) ≤ -2(r - 1)(e^T Ṽ w - ((2r - 1)/2) w^T Ṽ w).

2. Let 1 > φ(-w) > 0. Then

    Σ_{j=1}^q v_j^{-2(r-1)} ((1 + w_j)^{-2(r-1)} - 1) ≤ -2(r - 1)(e^T Ṽ w - ((2r - 1)/(2(1 - φ(-w))^{2r})) w^T Ṽ w).

Proof: Both parts follow from the mean value version of Taylor's formula: for each j there is a ξ_j between 0 and w_j with

    (1 + w_j)^{-2(r-1)} = 1 - 2(r - 1) w_j + (r - 1)(2r - 1)(1 + ξ_j)^{-2r} w_j².

In Part 1, w_j ≥ 0 and hence ξ_j ≥ 0, so that (1 + ξ_j)^{-2r} ≤ 1. In Part 2, w_j ≥ -φ(-w) and hence ξ_j ≥ -φ(-w), so that

    (1 + ξ_j)^{-2r} ≤ (1 - φ(-w))^{-2r},

and we have our result. ∎

We are now ready to prove the main theorem related to the decrease in the merit function:

Theorem 27 Let α̂_k = α_k δ(u_N^k) and w_N^k = (2r - 1) V_{k,N}^{-r} Δv_N^k. Then there is an L ≥ 1 and a β > 0 such that for every k ≥ L,

    F_N(x^{k+1}) - F_N(x^k) ≤ -(2(r - 1) α̂_k/(1 - α̂_k)) ((1 - (2r - 1) α̂_k/(2(1 - α̂_k)^{2r})) ‖w_N^k‖² + δ̂_k),

where |δ̂_k| ≤ β (min_{j∈N} v_{k,j})^{-(r-1)} (c^T x^k - c*)^{2r}.

Proof: From Step 4 and Theorem 20, for each j ∈ N,

    v_{k+1,j} = v_{k,j} (1 + ŵ_{k,j}), where ŵ_N^k = -(α̂_k/(1 - α̂_k)) V_{k,N}^{r-1}(w_N^k - Δ^k),

and, from Proposition 24 Parts 1 and 3, 1 + ŵ_{k,j} > 0. We now apply Lemma 26 with w = ŵ_N^k and v = v_N^k, noting, from Theorem 16 Part 1, that

    e^T V_{k,N}^{-2(r-1)} (V_{k,N}^{r-1} w_N^k) = e^T V_{k,N}^{-(r-1)} w_N^k = ‖w_N^k‖²

and (V_{k,N}^{r-1} w_N^k)^T V_{k,N}^{-2(r-1)} (V_{k,N}^{r-1} w_N^k) = ‖w_N^k‖². First consider the case 1 > φ(-ŵ_N^k) > 0. Then, from Lemma 26 Part 2,

    F_N(x^{k+1}) - F_N(x^k) = Σ_{j∈N} (v_{k+1,j}^{-2(r-1)} - v_{k,j}^{-2(r-1)})
    ≤ -2(r - 1) ((α̂_k/(1 - α̂_k)) ‖w_N^k‖² - ((2r - 1)/(2(1 - φ(-ŵ_N^k))^{2r})) (α̂_k/(1 - α̂_k))² ‖w_N^k‖² + ε_k),

where ε_k collects the terms generated by Δ^k. Using Theorem 20 and Proposition 24 Part 3, we obtain

    1 - φ(-ŵ_N^k) = 1 - (α̂_k/(1 - α̂_k)) φ(V_{k,N}^{r-1}(w_N^k - Δ^k))

    = 1 - (α̂_k/(1 - α̂_k)) φ(e - V_{k,N}^{r-1} u_N^k/‖u^k‖²),

and a computation using Proposition 24 Part 2 shows that, for all large k, 1 - φ(-ŵ_N^k) ≥ (1 - α̂_k)(1 - ε'_k) with ε'_k → 0. Thus, absorbing the ε-terms into δ̂_k and using α̂_k < 1, we obtain from Theorem 16

    F_N(x^{k+1}) - F_N(x^k) ≤ -(2(r - 1) α̂_k/(1 - α̂_k)) ((1 - (2r - 1) α̂_k/(2(1 - α̂_k)^{2r})) ‖w_N^k‖² + δ̂_k).

Now consider the case φ(-ŵ_N^k) ≤ 0. Then Lemma 26 Part 1 must be used to calculate this change. By an identical analysis, we obtain the same bound with (2r - 1)/2 in place of the factor (2r - 1)/(2(1 - φ(-ŵ_N^k))^{2r}); as α̂_k < 1, it is readily seen that the bound obtained from Part 2 of Lemma 26 is the larger of the two. Finally, writing v̲_k = min_{j∈N} v_{k,j}, Theorem 16 Part 2 gives

    |e^T V_{k,N}^{-(r-1)} Δ^k| ≤ √q v̲_k^{-(r-1)} ‖Δ^k‖ and |(w_N^k)^T Δ^k| ≤ √q v̲_k^{-(r-1)} ‖Δ^k‖,

and our result follows from Theorem 20 and Lemma 23. ∎

We are now ready to prove the main convergence theorem.

Theorem 28 Let α/(1 - α)^{2r} < 2/(2r - 1), and let Assumptions (1) - (3) hold. Then

1. x^k → x*,

2. y^k → y*,
3. s^k → s*,

where x* is an optimum solution of the primal, and (y*, s*) is an optimum solution of the dual. In addition, (y*, s*) is the power center of the optimal dual face, and thus strict complementarity holds between this pair of optimum solutions.

Proof: From Proposition 24 Part 4 we see that there is a γ > 0 such that α̂_k ≥ γα > 0, and from the upper bound in Proposition 24 Part 2, α̂_k ≤ α. Thus, since t/(1 - t)^{2r} is increasing in t,

    1 - (2r - 1) α̂_k/(2(1 - α̂_k)^{2r}) ≥ 1 - (2r - 1) α/(2(1 - α)^{2r}) = θ̄ > 0,    (47)

and, from Theorem 27, there are constants θ_1 > 0 and β' > 0 such that for all large k,

    F_N(x^{k+1}) - F_N(x^k) ≤ -θ_1 θ̄ ‖w_N^k‖² + β' v̲_k^{-(r-1)} (c^T x^k - c*)^{2r},

where v̲_k = min_{j∈N} v_{k,j}. Assume that on some subsequence K, v̲_k → 0, and let K be the largest such subsequence. Since Lemma 23 bounds u_N^k/‖u^k‖², Theorem 20 shows that on K the component of w_N^k corresponding to v̲_k grows like v̲_k^{-(r-1)}; in particular there is an L ≥ 1 such that ‖w_N^k‖ ≥ (1/2) v̲_k^{-(r-1)} ≥ 1/2 for all k ∈ K with k ≥ L. Consequently, on K the error term β' v̲_k^{-(r-1)} (c^T x^k - c*)^{2r} is, for all large k, at most half of θ_1 θ̄ ‖w_N^k‖². Now, from the definition of K, there is a δ > 0 such that v̲_k ≥ δ for every k ∉ K.

Thus, for each k ≥ L,

    F_N(x^{k+1}) - F_N(x^k) ≤ -(θ_1 θ̄/2) ‖w_N^k‖² + ε_k,

where ε_k = 0 for k ∈ K and ε_k = β' δ^{-(r-1)} (c^T x^k - c*)^{2r} for k ∉ K, so that, from Lemma 25,

    Σ_{k∉K} ε_k ≤ β' δ^{-(r-1)} Σ_k (c^T x^k - c*)^{2r} < ∞.

Summing the telescoping differences, and noting from the definition that F_N(x^k) is bounded below by zero, we obtain Σ_{k≥L} ‖w_N^k‖² < ∞, and thus ‖w_N^k‖ → 0. As w_N^k = (2r - 1) V_{k,N}^{-r} Δv_N^k, it follows from Proposition 8 that Δv_N^k → 0, and from Theorem 20 that

    V_{k,N}^{r-1} u_N^k/‖u^k‖² → e.

Since the only vector in P_N at which the Newton step Δv_N vanishes is the power center v*_N of P, we must have v_N^k → v*_N. Let K' be a subsequence on which (such a subsequence exists since all these sequences are bounded)

    s^k → s*, y^k → y*, u^k → u*,

and, so,

    V_{*,N}^{r-1} u*_N / ||u*_N||² = e.    (48)

Using Theorem 15, it is readily seen that (y*, s*) is the power center of the optimal dual face; and our result follows from the complementary slackness theorem. □

We get the following sharp bound on the linear convergence rate of {c^T x^k}.

Theorem 29 Let ᾱ/(1-ᾱ)^{2r-2} < 2/(2r-1). Then

    lim_{k→∞} (c^T x^{k+1} - c*)/(c^T x^k - c*) = 1 - α.

Proof: Follows readily from Proposition 24 Part 2 and Equation (48). □

12 Convergence of Dual with Variable Step Size

In the previous section, the step size implemented at each iteration was a constant α. In this section we allow the step size to vary, and in iteration k we choose the step size α_k by rules described below. This increases the step size implemented, which, as will be shown, is asymptotically given by the formula

    α_k/(1 - α_k) → 2/(2r - 1).

We obtain this increase by using the following estimate of δ(u^k_N) (of Proposition 24):

    τ_k = ||X_k^r s^k|| / ((x^k)^T s^k).    (49)

The lemma that follows is useful in understanding the above estimate.

Lemma 30 There exist an L ≥ 1 and a β > 0 such that for all k ≥ L

    (x^k)^T s^k / (c^T x^k - c*) = 1 + δ_k,

where |δ_k| ≤ β (c^T x^k - c*)^{2r-1}.

Proof: Since (x^k)^T s^k = (x^k_B)^T s^k_B + (x^k_N)^T s^k_N, from Proposition 14 Part 1 and the fact that x^k_B → x*_B > 0, there is an L ≥ 1 such that the first term is bounded by β₁(c^T x^k - c*)^{2r} for some β₁ > 0. Our result now follows from Theorem 19. □

We now establish the goodness of the estimate (49).

Proposition 31 Let τ_k be defined by Equation (49). There are an L ≥ 1 and a β > 0 such that for all k ≥ L

    δ(u^k_N) = τ_k + δ_k,

where |δ_k| ≤ β (c^T x^k - c*)^{2r-1}.

Proof: This result follows readily from Lemma 30, Proposition 14 Part 2 and the upper bound of Part 2 of Proposition 24. □

Since τ_k is, asymptotically, a very good estimate of δ(u^k_N), in place of using ᾱ = δ(u^k_N)α in the relations (47), we will use the estimate ᾱ = τ_k α. Also, from Proposition 24 Part 2, whenever ||w^k|| → 0, δ(u^k_N) → 1. Thus, in this case τ_k → 1, and the new estimate approaches α. We now present this step selection strategy. Define α* such that

    α*/(1 - α*)^{2r} = 2/(2r - 1).

Let σ > 0 be very small, and at iteration k choose the step size α_k by the following strategy:

Step 1 Let α' be such that

    τ_k α'(1 - τ_k α')^{2r-1} / (1 - α')^{2r} = 2/(2r - 1) - σ.    (50)

Step 2 Define

    α_k = α' if α' ≥ α*, and α_k = α* otherwise.    (51)

The above choice guarantees that the step size is not smaller than the one obtained from Theorem 28. The next lemma establishes a relationship between the α' computed in (50) and the related expression in the system (47).

Lemma 32 Let α' be computed by (50) and let ᾱ = δ(u^k_N)α'. Then, there are an L ≥ 1 and a β > 0 such that for all k ≥ L

    ᾱ(1 - ᾱ)^{2r-1} = τ_k α'(1 - τ_k α')^{2r-1} + δ_k,

where |δ_k| ≤ β (c^T x^k - c*)^{2r-1}.

Proof: Follows readily from Proposition 31, and the fact that for small ε > 0, (1 + ε)^{2r-1} ≤ 1 + 4rε. □

We are now ready to prove the main theorem of this section.

Theorem 33 For each k, let α_k be generated by the above rules, and let the assumptions (1) - (3) hold. Then,

1. x^k → x*

2. y^k → y*

3. s^k → s*

where x* is an optimum solution of the primal, and (y*, s*) is an optimum solution of the dual. In addition, (y*, s*) is the power center of the optimal dual face, and thus strict complementarity holds between this pair of optimum solutions.

Proof: Assume that α_k = α* for each k. Then this theorem follows from Theorem 28. Otherwise, for each k for which α_k = α' > α*, using the same argument as in Theorem 28, it is readily shown that for some β > 0

    F_N(x^{k+1}) - F_N(x^k) ≤ -θ||w^k||² + β_k,

where β_k ≤ β(c^T x^k - c*)^{2r}. The proof is completed in the same way as in Theorem 28. □

We now obtain the asymptotic behavior of α_k.

Corollary 34 α_k → ᾱ, where

    ᾱ/(1 - ᾱ) = 2/(2r - 1).

Proof: As a consequence of Theorem 33, we obtain the fact that ||w^k|| → 0. From Proposition 24, δ(u^k_N) → 1, from Proposition 31, τ_k → 1, and we have our result. □
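The step selection strategy above can be sketched numerically. The code below solves the reconstructed form of equation (50) for α' by bisection and applies rule (51); the function names, the bisection scheme, and the assumption τ_k ≤ 1 are illustrative choices, not part of the original text.

```python
def bisect(f, lo, hi):
    """Find a root of f on [lo, hi] by bisection, assuming f(lo) < 0 < f(hi)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def step_size(tau, r, sigma=1e-3):
    """Step size alpha_k from rules (50)-(51), for an estimate tau = tau_k <= 1."""
    target = 2.0 / (2 * r - 1) - sigma
    # Equation (50): tau*a*(1 - tau*a)^(2r-1) / (1 - a)^(2r) = 2/(2r-1) - sigma
    f50 = lambda a: tau * a * (1 - tau * a) ** (2 * r - 1) / (1 - a) ** (2 * r) - target
    a_prime = bisect(f50, 1e-9, min(1.0, 1.0 / tau) - 1e-9)
    # Fallback alpha*: a/(1 - a)^(2r) = 2/(2r-1)
    fstar = lambda a: a / (1 - a) ** (2 * r) - 2.0 / (2 * r - 1)
    a_star = bisect(fstar, 1e-9, 1.0 - 1e-9)
    return a_prime if a_prime >= a_star else a_star   # rule (51)
```

With τ_k = 1 and σ = 0, equation (50) reduces to α'/(1 - α') = 2/(2r - 1), that is α' = 2/(2r + 1), which gives the familiar step 2/3 at r = 1 and 0.4 at r = 2, consistent with Corollary 34.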

13 Accelerated Variant

We will now use the connection between the Newton and the affine scaling steps to accelerate the convergence of the algorithm. We now present the accelerated version of the method, and then investigate its convergence rate.

Step 0 Let x⁰ be an interior point solution; let δ > 0 and r > 1, and set k = 0.

Step 1 Tentative Solution to the Dual:

    y^k = (A X_k^{2r} A^T)^{-1} A X_k^{2r} c.

Step 2 Tentative Dual Slack:

    s^k = c - A^T y^k.

If s^k ≤ 0, then STOP. The solution is unbounded.

Step 3 Min-Ratio Test:

    θ_k = min{ ||X_k^{2r-1} s^k|| / (x_j^{2r-1} s_j^k) : s_j^k > 0 }.

If θ_k = 1, set α_k = 1 and go to Step 5.

Step 4 Step Size Selection: If e^T X_k s^k > 1, set α_k = α and go to Step 5. Otherwise, let α'_k be as defined by (51), and set

    N_k = {j : x_j^k ≤ e^T X_k s^k},

    γ_k = e^T X_{k,N_k} s^k_{N_k},

    h^k = (1/(2r-1)) X_{k,N_k} ( γ_k X_{k,N_k}^{2r-1} s^k_{N_k} / ||X_k^r s^k||² - e ),

    ε_k = ||h^k||,   ρ_k = log(ε_k)/log(γ_k).

1. Predictor Step: If ρ_k ≥ 1.5r, then α_k = 1 - max{ε_k, γ_k^{δρ_k}}.

2. Corrector Step: Otherwise,

    α_k = min{ γ_k (x^k_{N_k})^T s^k_{N_k} / (2r ||X_k^r s^k||²), α'_k }.

Step 5 Next Interior Point:

    x^{k+1} = x^k - α_k X_k^{2r} s^k / ||X_k^{2r-1} s^k||.

Step 6 Iterative Step: If x_j^{k+1} = 0 for some j, then STOP; x^{k+1} is an optimal solution. Otherwise set k = k + 1 and go to Step 1.

Some comments are in order here. This acceleration scheme is identical to the three step acceleration scheme of Saigal [17]. h^k, computed in Step 4, is a very good estimate of the Newton step Δv^k_N, and its magnitude ε_k is used to estimate the distance to the power center v*_N of P_N ∩ {v_N : v_N > 0}. Asymptotically, we apply a predictor step when the size ε_k of the Newton step is of the order O((c^T x^k - c*)^{2r}), and the corrector step otherwise. As is well known for the Newton step, ||Δv^k_N|| is a very good estimate of ||v^k_N - v*_N||. During the corrector step, α_k is chosen so that

    v^{k+1}_N - v*_N = Δv^k_N + O((c^T x^k - c*)^{2r}),

and thus the affine scaling step behaves like a Newton step. We now establish some preliminary results and prove the convergence of this accelerated method.

Proposition 35 There are an L ≥ 1 and a subset N̄ of {1, ..., n} such that for all k ≥ L, N_k = N̄.

Proof: From Theorem 6, γ_k → 0; and by definition x*_j > 0 for each j ∈ B. Thus for some L ≥ 1 and all k ≥ L, x_j^k > γ_k for all j ∈ B, and we are done. □

Let L be as generated by Proposition 35. We now prove a series of results using this property of L.

Proposition 36 There are an L̄ ≥ L and an a > 0 such that for all k ≥ L̄

    ||h^k - Δv^k_N|| ≤ a (c^T x^k - c*)^{2r}.

Proof: We note that, by substituting the results of Theorems 19 and 22 into the definitions, we get

    h^k = Δv^k_N - E^k,

where ||E^k|| is of the order (c^T x^k - c*)^{2r}. As v^k is bounded, by Theorem 8, our result follows from Theorems 19 and 22. □

Proposition 37 Let w^k → 0 on some subsequence K. Then, on K,

    π_k = γ_k (x^k_{N_k})^T s^k_{N_k} / (2r ||X_k^r s^k||²) → 1/(2r).

Proof: From Theorem 19 and Proposition 24 Part 2, 2r π_k ≥ e^T u^k_N = 1 + δ_k. From equation (46),

    π_k ≤ (1 + δ_k)(1 + ||w^k|| + ||Δ^k||)/(2r),

and we get the result as δ_k, ||w^k|| and ||Δ^k|| go to zero. □

Proposition 38 There is an L̄ ≥ L such that for all k ≥ L̄

    0.50 (c^T x^k - c*) ≤ γ_k ≤ 1.5 (c^T x^k - c*).

Proof: Note that e^T X_{k,N_k} s^k_{N_k} = (c^T x^k - c*) e^T u^k_N, and so the result follows from Theorem 19. □

We are now ready to prove the convergence to optimality of the limit point of the primal sequence generated by the accelerated algorithm.

Theorem 39 Let assumptions (1) - (3) hold. There exists a subsequence K such that for k ∈ K

1. x^k → x*

2. y^k → y*

3. s^k → s*

where x* and (y*, s*) are optimal solutions of the primal and the dual problems, respectively. Also, (y*, s*) is the power center of the optimal dual face.

Proof: Let L be as in Proposition 35. In case the predictor step is taken only a finite number of times, α_k is selected by the variable step size selection strategy, and our result follows from Theorem 33. Thus, let there exist a subsequence K such that for each k ∈ K a predictor step is taken. Then, for k ∈ K,

    ||h^k|| ≤ γ_k^{1.5r}.

But, from Theorem 6, γ_k → 0. Thus, as

    ||Δv^k_N|| ≤ ||h^k|| + ||Δv^k_N - h^k||,

Δv^k_N → 0, and the result follows by an argument identical to the proof of Theorem 28. □

14 Convergence Rate of Accelerated Variant

We will now investigate the convergence rate of the variant. This rate is a function of the choice of r, and approaches two-step quadratic as r approaches infinity. From Theorem 39, we can conclude that the power center of P exists. We now investigate the application of Newton's method to compute this center.

Proposition 40 There is a θ > 0 such that for all ||v_N - v*_N|| ≤ θ,

    0.50 ||v_N - v*_N|| ≤ ||Δv_N|| ≤ 1.5 ||v_N - v*_N||.

Proof: Follows from Theorem 17, Part 1. □

The following is a consequence of the above proposition:

Proposition 41 There is an a > 0 such that for every 0 < δ < 1 there is an L_δ ≥ 1 such that, if k ≥ L_δ, β > 0, p ≥ 1.5r and ||v^k_N - v*_N|| ≤ β(c^T x^k - c*)^p, then

    ||w^k|| ≤ a (c^T x^k - c*)^p.

Proof: Since v*_N > 0, for every 0 < δ < 1 there is an L_δ ≥ 1 such that for all k ≥ L_δ,

    a = (min_j v*_j)/2 ≥ β(c^T x^k - c*)^{1.5r}.

Thus, for each j, v^k_j ≥ v*_j - ||v* - v^k|| ≥ a, and our result now follows by observing that

    ||w^k|| ≤ (min_j v^k_j)^{-(r-1)} ||Δv^k_N||,

together with the result of Proposition 40. □

Lemma 42 Assume that for some k ≥ L_δ and 1.5r ≤ p ≤ 2r,

    ||v^k_N - v*_N|| ≤ β(c^T x^k - c*)^p.

Then there is an a > 0 such that ||h^k|| ≤ a(c^T x^k - c*)^p.

Proof: From Proposition 40, ||Δv^k_N|| ≤ 1.5β(c^T x^k - c*)^p. Also, by Proposition 36,

    ||h^k|| ≤ ||Δv^k_N|| + ||h^k - Δv^k_N|| ≤ 1.5β(c^T x^k - c*)^p + a'(c^T x^k - c*)^{2r},

and we are done. □

We are now ready to investigate the predictor step.

Proposition 43 Assume that 0 < δ < 1, k ≥ L_δ, β > 0, 1.5r ≤ p, and ||v^k_N - v*_N|| ≤ β(c^T x^k - c*)^p. Then

1. There are 0 < θ₁ < θ₂ such that

    θ₁(c^T x^k - c*)^{1+δp_k} ≤ (c^T x^{k+1} - c*) ≤ θ₂(c^T x^k - c*)^{1+δp_k}.

2. There is a θ₃ > 0 such that

    ||v^{k+1}_N - v*_N|| ≤ θ₃(c^T x^{k+1} - c*)^{(1-δ)p_k/(1+δp_k)}.

Proof: From Lemma 42, we obtain a β > 0 such that

    ε_k ≤ β(c^T x^k - c*)^p.

We note the following sequence of inequalities, which follow from Propositions 38 and 24:

    (0.50)^{δp_k}(c^T x^k - c*)^{δp_k} ≤ γ_k^{δp_k} ≤ 1 - α_k
      ≤ (c^T x^{k+1} - c*)/(c^T x^k - c*)
      ≤ 1 - (1 - max{ε_k, γ_k^{δp_k}})(1 - ||w^k_N|| - ||Δ^k||)
      ≤ max{ε_k, γ_k^{δp_k}} + ||w^k_N|| + ||Δ^k||
      ≤ (1.5)^{δp_k}(c^T x^k - c*)^{δp_k},

and we have Part 1. Now, from Theorems 21 and 22 and some simple algebra, we can write

    v^{k+1}_N - v*_N = A_k Δv^k_N + E^k,

where the matrix A_k is determined by α_k, δ(u^k_N) and r. Using the facts that δ(u^k_N) ≤ 1 and α_k ≤ 1, we obtain, for some θ₃ > 0,

    ||E^k|| ≤ θ₃(c^T x^k - c*)^{(1-δ)p_k},

and Part 2 follows. □

We are now ready to investigate the corrector step of the algorithm.

Proposition 44 Assume that for some k ≥ L_δ and β > 0, ||v^k_N - v*_N|| ≤ β(c^T x^k - c*)^p. Then

1. 3r/4 ≤ p < 3r/2: one iteration of the corrector step will be taken, and for some θ₁ > 0,

    ||v^{k+1}_N - v*_N|| ≤ θ₁(c^T x^{k+1} - c*)^{2p}.

2. 3r/8 ≤ p < 3r/4: at least one corrector step will be taken, and after at most two steps, for some θ₂ > 0,

    ||v^{k+2}_N - v*_N|| ≤ θ₂(c^T x^{k+2} - c*)^{4p}.

Proof: From Theorems 19 and 20 and Proposition 41, we can write

    γ_k (x^k_{N_k})^T s^k_{N_k} / (2r ||X_k^r s^k||²) = 1/(2r) + δ'_k + ρ'_k,

where |δ'_k| ≤ θ(c^T x^k - c*)^{2r} and |ρ'_k| ≤ θ(c^T x^k - c*)^p. Thus, as this quantity approaches 1/(2r) < 2/(2r+1), for all sufficiently large k the minimum in the corrector step is attained by its first term, that is,

    α_k = γ_k (x^k_{N_k})^T s^k_{N_k} / (2r ||X_k^r s^k||²).

Thus, from Proposition 14 Part 1, Theorems 21, 22 and simple algebra, we see that

    v^{k+1}_N - v*_N = (v^k_N + Δv^k_N - v*_N) + F^k - E^k,

where F^k and E^k collect the higher order terms. Using Propositions 8, 9 and Lemma 23, we can show that

    ||E^k|| ≤ β(c^T x^k - c*)^{2r}.

Thus, after one step with α_k, we see (using Theorem 17) that

    ||v^{k+1}_N - v*_N|| ≤ ||v^k_N + Δv^k_N - v*_N|| + ||E^k|| ≤ β'||v^k_N - v*_N||² + β(c^T x^k - c*)^{2r} ≤ a(c^T x^k - c*)^{2p}.

From Proposition 24,

    (c^T x^{k+1} - c*)/(c^T x^k - c*) = 1 - α_k δ(u^k_N) e^T u^k_N,

and, using Theorem 19, we see that for all sufficiently large k

    1 - 1.5/(2r) ≤ (c^T x^{k+1} - c*)/(c^T x^k - c*) ≤ 1 - 0.5/(2r).

Substituting the above inequality into the bound for ||v^{k+1}_N - v*_N||, we obtain Part 1 of the proposition with θ₁ = a(1 - 1.5/(2r))^{-2p}.

To see Part 2, we note that after one corrector step either 2p becomes greater than 1.5r, and we stop the corrector iterates and go to the predictor step; or, after one more corrector step, the desired result is obtained. □

We are now ready to prove the main convergence theorem.

Theorem 45 Let the sequences {x^k}, {y^k} and {s^k} be generated by the accelerated variant with r > 1, and let assumptions (1) - (3) hold. Then

1. x^k → x*

2. y^k → y*

3. s^k → s*

where x* lies in the relative interior of the optimal face of the primal, and (y*, s*) is the power center of the optimal face of the dual. In addition, asymptotically, the sequence {c^T x^k - c^T x*} converges to zero as follows:

1. For δ = 1/(2(r+1)), the convergence is two-step superlinear at the rate 1 + r/(r+1).

2. For δ = 3/(2(r+2)), the convergence is three-step superlinear at the rate 1 + 3r/(r+2).

Proof: Assume that only a finite number of predictor steps are taken. Using the argument of Theorem 28, it can be shown that there is a subsequence K along which Δv^k_N → 0. Using Theorem 17 Part 2, we can show that there are an L ≥ 1 and a θ > 0 such that for all k ≥ L with k ∈ K,

    ||v^k_N + Δv^k_N - v*_N|| ≤ θ||v^k_N - v*_N||².

For each k ∈ K sufficiently large, using an argument similar to that of Proposition 44, we can show that for some θ₁ > 0

    ||v^{k+1}_N - v*_N|| ≤ θ₁||v^k_N - v*_N||²,

and, from Proposition 40, ||Δv^{k+1}_N|| ≤ 2θ₁||Δv^k_N||². Thus a corrector step can also be performed at step k + 1, and we will obtain

    ||Δv^{k+2}_N|| ≤ θ₂||Δv^k_N||⁴.

Repeating this argument, we contradict the fact that only a finite number of predictor steps are taken.

Let k be an index, sufficiently large, at which a predictor step is performed. To investigate the convergence rate of the two-step method, assume p_k ≥ 1.5r, and let δ = 1/(2(r+1)). After one corrector step, from Propositions 43 and 44, we obtain

    ||v^{k+2}_N - v*_N|| ≤ θ₃(c^T x^{k+2} - c*)^{2(1-δ)p_k/(1+δp_k)}.

Since 2(1-δ)p_k/(1+δp_k) = 2r when p_k = 2r and δ = 1/(2(r+1)), we obtain Part 1, with the convergence rate

    1 + δp_k = 1 + r/(r+1).

For the three-step method, let δ = 3/(2(r+2)). After two corrector steps,

    ||v^{k+3}_N - v*_N|| ≤ θ₄(c^T x^{k+3} - c*)^{4(1-δ)p_k/(1+δp_k)}.

Part 2 now follows since 4(1-δ)p_k/(1+δp_k) = 2r if p_k = 2r, and the convergence rate is

    1 + δp_k = 1 + 3r/(r+2). □

15 Efficiency of Acceleration

Ostrowski [15], Section 6.11, introduced the following measure of efficiency for algorithms achieving different rates of convergence with different amounts of work per iteration. He suggested the measure

    log(p)/w,

where p is the convergence rate of the algorithm and w is a measure of the work per iteration. The larger this measure, the more efficient the algorithm. This measure has been used by Brent [4], who investigated the hybrids of Newton's method proposed by Shamanskii, and by Saigal and Todd [18], who investigated hybrids of fixed point computing methods with variants of Newton's method. The convergence rate of the accelerated variant depends on the choice of r and on whether the two-step or the three-step method is used. The table below shows these calculations for several choices.

                Two-Step                  Three-Step
                rate     efficiency       rate      efficiency
    r = 1.0     1.5      0.2027/w         2.0       0.2310/w
    r = 1.5     1.6      0.2350/w         2.2857    0.2756/w
    r = 2.0     1.67     0.2564/w         2.50      0.3054/w
    r = 4.0     1.80     0.2939/w         3.0       0.3662/w

1. Tsuchiya and Monteiro [23] obtain a rate of 1.3. This can be established by the method of Saigal [17].

2. This is obtained in Saigal [17].

3. The efficiency of the three-step cubic method is larger than that of the two-step quadratic method, which is 0.3466/w.

16 Concluding Remarks

In this paper we have shown that for every r > 0.50 there is a variant of the affine scaling method; the usual method is obtained when r = 1. We have analysed the convergence of these variants for r > 1. The analysis for 0.50 < r < 1 is analogous, with a few changes in the formulae to account for the sign changes and for the objective functions of the power center problems. Under the assumption of non-degeneracy, convergence to optimality of the primal sequence is shown for any step size less than 1. To investigate the convergence without the non-degeneracy assumption, the concept of a power center is introduced. The power centers of the primal and the dual optimal faces are related in an intimate way, and the objective functions defining these centers are related in the same sense as "dual norms" are. In this case, it is shown that if the step size α is chosen so that α/(1-α)^{2r-2} < 2/(2r-1) for r > 1, the primal sequence converges to the relative interior of the optimal primal face and the dual sequence converges to the power center of the optimal dual face. Also, a variable step selection strategy is presented, in which the sequence {α_k} of step sizes is asymptotically selected so that α_k/(1-α_k) < 2/(2r-1). This sequence is required to stay uniformly bounded away from 2/(2r+1); thus α_k < 2/(2r+1), and this result is a generalization of the 2/3rd result of Tsuchiya and
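The entries of the table can be reproduced from the rate formulas of Theorem 45 and Ostrowski's measure. A small sketch, taking the work of a single affine scaling iteration as the unit w, so that a two-step cycle costs 2w and a three-step cycle 3w (the function names are illustrative):

```python
import math

def rates(r):
    """Two-step and three-step superlinear rates of the accelerated variant."""
    return 1 + r / (r + 1), 1 + 3 * r / (r + 2)

def efficiency(p, steps):
    """Ostrowski's measure log(p)/w, with w = number of affine scaling steps."""
    return math.log(p) / steps

for r in (1.0, 1.5, 2.0, 4.0):
    p2, p3 = rates(r)
    print(f"r = {r}: two-step rate {p2:.4f}, eff {efficiency(p2, 2):.4f}/w; "
          f"three-step rate {p3:.4f}, eff {efficiency(p3, 3):.4f}/w")

# Two-step quadratic method (the limit of the two-step method as r -> infinity)
print(f"two-step quadratic eff {efficiency(2.0, 2):.4f}/w")
```

At r = 4 this gives a three-step rate of 3.0 with efficiency 0.3662/w, which exceeds the 0.3466/w of the two-step quadratic method, as noted in the text.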

Muramatsu [24] for r = 1. An accelerated variant is also presented. This variant achieves superlinear convergence, and the rate is higher for larger values of r > 1. This generalizes the work of Saigal [17] and of Tsuchiya and Monteiro [23]. This work opens up the study of hybrid variants of the affine scaling method, in which different values of r are implemented at different stages of the method. From Theorems 12 and 14, it is evident that the rate of convergence of ||x^k_N|| is O(c^T x^k - c*), while that of ||s^k_B|| is O((c^T x^k - c*)^{2r}). Implementing 0.50 < r < 1 in the early iterates will reduce this disparity between the accuracy of the primal and the dual sequences, and thus make the method behave more like the primal-dual methods, where the accuracy of the two sequences is similar. In the later iterations (when γ_k < 1), a value of r > 1 (say r = 1.5 or 2.0) can be implemented to obtain a higher rate of convergence. These hybrids have not yet been studied, and we expect to report computational experience with them at a later date.

References

[1] I. Adler, N. K. Karmarkar, M. G. C. Resende and G. Veiga, "An implementation of Karmarkar's algorithm for linear programming," Mathematical Programming 44 (1989) 297-335.

[2] E. R. Barnes, "A variation on Karmarkar's algorithm for solving linear programming problems," Mathematical Programming 36 (1986) 174-182.

[3] J. R. Birge and C. Rosa, "A simplified proof of the convergence of affine scaling," Technical Report, Department of Industrial and Operations Engineering, University of Michigan, Ann Arbor, Michigan, 1991.

[4] R. P. Brent, "Some efficient algorithms for solving systems of non-linear equations," SIAM Journal on Numerical Analysis 10 (1973) 327-344.

[5] D. den Hertog, C. Roos and T. Terlaky, "Inverse barrier methods for linear programming," Report No. 91-27, Delft University of Technology, Netherlands, 1991.

[6] I. I. Dikin, "Iterative solution of problems of linear and quadratic programming," Soviet Mathematics Doklady 8 (1967) 674-675.

[7] I. I. Dikin, "On the convergence of an iterative process," Upravlyaemye Sistemi 12 (1974) 54-60. (In Russian.)

[8] I. I. Dikin, "The convergence of dual variables," Technical Report, Siberian Energy Institute, Irkutsk, Russia, December 1991.

[9] I. I. Dikin, "Determination of an interior point of one system of linear inequalities," Kibernetika i Sistemnyi Analiz 1 (1992).

[10] L. A. Hall and R. J. Vanderbei, "Two-thirds is sharp for affine scaling," Technical Report SOR-92/9, Department of Civil Engineering and Operations Research, Princeton University, Princeton, New Jersey 08544, September 1992.

[11] N. K. Karmarkar, "A new polynomial-time algorithm for linear programming," Combinatorica 4 (1984) 373-395.

[12] N. K. Karmarkar and R. G. Ramakrishna, "Further developments in the new polynomial-time algorithm for linear programming," talk given at the ORSA/TIMS National Meeting, Boston, MA, April 1985.

[13] R. D. C. Monteiro, T. Tsuchiya and Y. Wang, "A simplified global convergence proof of the affine scaling algorithm," Technical Report, Department of Systems and Industrial Engineering, University of Arizona, Tucson, AZ 85721, USA, October 1992.

[14] M. Muramatsu and T. Tsuchiya, "A convergence analysis of a long-step variant of the projective scaling algorithm," Research Memorandum 454, The Institute of Statistical Mathematics, 4-6-7 Minami-Azabu, Minato-ku, Tokyo 106, Japan, November 1992.

[15] A. M. Ostrowski, Solution of Equations and Systems of Equations, Academic Press, New York, 1960.

[16] R. Saigal, "A simple proof of the primal affine scaling method," Technical Report No. 92-60, Department of Industrial and Operations Engineering, University of Michigan, Ann Arbor, MI 48109, December 1992.

[17] R. Saigal, "A three step quadratically convergent implementation of the primal affine scaling method," Technical Report No. 93-9, Department of Industrial and Operations Engineering, University of Michigan, Ann Arbor, MI 48109, February 26, 1993.

[18] R. Saigal and M. J. Todd, "Efficient acceleration techniques for fixed point algorithms," SIAM Journal on Numerical Analysis 2 (1978) 997-1007.

[19] R. L. Sheu and S. C. Fang, "On the generalized path-following methods for linear programming," Technical Report, 1992.

[20] P. Tseng and Z. Q. Luo, "On the convergence of the affine-scaling algorithm," Mathematical Programming 56 (1992) 301-319.

[21] T. Tsuchiya, "Global convergence of the affine-scaling methods for degenerate linear programming problems," Mathematical Programming 52 (1991) 377-404.

[22] T. Tsuchiya, "Global convergence property of the affine scaling method for primal degenerate linear programming problems," Mathematics of Operations Research 17 (1992) 527-557.

[23] T. Tsuchiya and R. D. C. Monteiro, "Superlinear convergence of the affine scaling algorithm," Research Memorandum, November 28, 1992.

[24] T. Tsuchiya and M. Muramatsu, "Global convergence of a long-step affine scaling algorithm for degenerate linear programming problems," Research Memorandum 423, The Institute of Statistical Mathematics, 4-6-7 Minami-Azabu, Minato-ku, Tokyo 106, Japan, January 1992 (revised September 1992).

[25] R. J. Vanderbei, M. J. Meketon and B. A. Freedman, "A modification of Karmarkar's linear programming algorithm," Algorithmica 1 (1986) 395-407.

[26] R. J. Vanderbei and J. C. Lagarias, "I. I. Dikin's convergence result for the affine-scaling algorithm," in J. C. Lagarias and M. J. Todd, editors, Mathematical Developments Arising from Linear Programming: Proceedings of a Joint Summer Research Conference held at Bowdoin College, Brunswick, Maine, USA, June/July 1988, volume 114 of Contemporary Mathematics, pages 109-119, American Mathematical Society, Providence, Rhode Island, USA, 1990.