A TIE-BREAKING ALGORITHM FOR INFINITE HORIZON OPTIMIZATION

Sarah M. Ryan
James C. Bean
Robert L. Smith

Department of Industrial & Operations Engineering
The University of Michigan
Ann Arbor, Michigan 48109

Technical Report 87-24
October, 1987

A Tie-Breaking Algorithm for Infinite Horizon Optimization
October 6, 1987

1 Introduction

Dynamic sequential decision problems are an important class of optimization problems. Applications include production and inventory control, capacity expansion, and equipment replacement. In some cases the decision variables are continuous, allowing a mathematical programming formulation with a continuous solution space. Often, however, a problem calls for a model with discrete decisions as, for example, in the choice among the various pieces of equipment available to replace one currently in use.

A great majority of these problems share the characteristic of an indefinite horizon. The economy, firm, or government involved has no predetermined moment of extinction. To model these problems appropriately, it is necessary to assume that the problem may continue indefinitely. The traditional and most commonly used solution approach for such problems is to assume some finite horizon, T, and simply proceed as if the world ended there. The hope is that information beyond T will have little or no effect on the optimal solution, at least for the first few decisions, since those first decisions will be implemented immediately. A finite horizon with this guarantee is known as a forecast horizon, and this solution method is known as a solution or forecast horizon approach (Lundin and Morton [1975]; Charnes, Dreze and Miller [1966]; Modigliani and Hohn [1955]). Nearly all forecast horizon existence results require a unique optimal initial decision (see Bean and Smith [1984]; Hopp, Bean, and Smith [1984]; Bes and Sethi [1986]; Schochetman and Smith [1987a]). Ryan and Bean [1987] show that in finite choice problems, such a requirement may be difficult to meet.
In this paper, we allow that more than one optimal initial decision may exist, and develop an algorithm to select one such decision. Forecast horizon approaches solve increasingly long finite horizon problems to approximate a sequence of decisions that is optimal over the infinite horizon. A common algorithm is to lengthen the horizon until the optimal initial decision stabilizes. When more than one optimal initial decision exists, it may happen that such a decision is optimal for some finite horizons but not others, no matter how long. In that case, the optimal initial decision may never stabilize. (This material is based on work supported by the National Science Foundation under Grants ECS-8409682 and ECS-8700836.) In this paper we describe conditions under which each

infinite horizon optimal initial decision is optimal for every sufficiently long finite horizon. For some types of problems, we can then select one optimal decision using a simple tie-breaking rule.

Section 2 includes a mathematical statement of the problem and assumptions. Section 3 is a discussion of the convergence of closed sets in a general compact space, and Section 4 discusses how closed set convergence can lead to an algorithm for infinite horizon optimization. In Sections 5 and 6 we present two alternate sets of conditions under which the sets of finite horizon optimal initial decisions converge to the set of infinite horizon optimal initial decisions. We illustrate the conditions with various capacity expansion models. Finally, Section 7 contains conclusions.

2 Problem Definition and Assumptions

Suppose we are faced with making a sequence of decisions over a continuous or discrete time frame. Each individual decision, denoted π_n, will be called a policy, and a sequence π = (π_1, π_2, ...) of policies constitutes a strategy. Let Π_n be the set of policies available after n − 1 decisions have been made. We assume policies may be indexed so that Π_n ⊆ {0, 1, 2, ...} and, for any π^1_n, π^2_n ∈ Π_n, |π^1_n − π^2_n| ≤ M < ∞, where M is uniform over n. Let Π ⊆ ×_{n=1}^∞ Π_n be the set of all feasible strategies. Let θ = (0, 0, ...). The strategy θ may or may not belong to Π.

Associated with each strategy, π, is a cumulative net cost function C_π(t). In order to compare costs incurred over time, we continuously discount them to time zero. Hence C_π(r) = ∫_0^∞ e^{−rt} dC_π(t) is the resulting infinite horizon discounted cost as a function of the discount rate, r. Also, C_π(r, T) = ∫_0^T e^{−rt} dC_π(t) is the discounted cost over the finite horizon T. Bean and Smith [1984] showed that infinite horizon discounted costs converge if r is larger than

γ = sup_{π∈Π} lim sup_{t→∞} (ln C_π(t)) / t.

The set {r | r > γ} is called the range of convergence for interest rates.
We keep the assumptions of Bean and Smith [1984]; in particular,

C_π(t) = the cumulative net cost for strategy π up to time t = K_π(t) − R_π(t),

where K_π(t) ≤ Be^{γt} and R_π(t) ≤ Be^{γt} for t ≥ T_0, with B, γ, T_0 independent of π, and K_π(t), R_π(t) nondecreasing. The functions K_π(t) and R_π(t) represent cumulative pure cost and cumulative pure revenue, respectively. Without loss of generality, we can assume T_0 = 0 (if not, replace B by Be^{γT_0}).

We define a metric similar to that of Bean and Smith [1984]: for π, π' ∈ Π,

ρ(π, π') = Σ_{n=1}^∞ β^n |π_n − π'_n|,

where β < 1/(M + 1). This metric has the property that, for any L, any two strategies which agree in the first L policies are closer than any two strategies which do not. This metric induces a topology on Π. Given an interest rate, r, the problem we wish to solve is:

min_{π∈Π} C_π(r).

As shown by Bean and Smith [1984], subject to an assumption that Π is complete, the minimum exists since Π is compact and C_π(r) is continuous in π. The forecast horizon approach involves solving finite horizon problems:

min_{π∈Π} C_π(r, T).

In the remainder of the paper we assume r is fixed, and omit it from the notation. We define C*(T) to be the minimum finite horizon value and C* to be the minimum infinite horizon value. A strategy π is termed infinite horizon optimal if it minimizes C_π, and finite horizon optimal, or T-optimal, if it minimizes C_π(T) for some T. Let Π* and Π*(T) be the sets of optimal strategies for the infinite horizon and finite horizon problems, respectively. An optimal initial policy is an initial policy included in some optimal strategy. The sets of optimal initial policies for the infinite and finite horizon problems, respectively, are denoted Π*_1 and Π*_1(T). A specific forecast horizon is a time, T̂, such that for T > T̂, Π*_1(T) = Π*_1, with |Π*_1| = 1. A general forecast horizon is a time, T̂, such that for T > T̂, Π*_1(T) ⊆ Π*_1. Most forecast horizons found in the literature have been specific ones. In this paper, we seek a general forecast horizon.

3 Hausdorff Convergence

Bean and Smith [1984] derive a specific forecast horizon from the convergence of π*(T) to π*, assuming π* is unique. In this paper, we allow Π* to contain more than one element and seek a general forecast horizon from the convergence of Π*(T) to Π*. In this section we describe the nature of the convergence we seek for the optimal sets of strategies. First, we present results concerning the convergence of closed sets in a general compact space. The next definitions are adapted from Schochetman and Smith [1987b].

The space (Π, ρ) is a compact metric space. Let C(Π) be the set of all non-empty closed (hence compact) subsets of Π. Define

d(π, P) = min_{φ∈P} ρ(π, φ), for π ∈ Π, P ∈ C(Π),

h(Q, P) = max_{π∈Q} d(π, P), for Q, P ∈ C(Π),
D(Q, P) = max(h(Q, P), h(P, Q)), for Q, P ∈ C(Π).

The function D is known as the Hausdorff metric on C(Π). The space (C(Π), D) is a compact metric space. Let {P_i} be a sequence in C(Π). As in Schochetman and Smith [1987b], define lim inf P_i (resp. lim sup P_i) to be the set of all π in Π such that every neighborhood of π is eventually (resp. frequently) intersected by P_i. In general, lim inf P_i ⊆ lim sup P_i.

A selection on C(Π) is defined to be a mapping S: C(Π) → Π such that S(P) ∈ P for P ∈ C(Π). Let p be an arbitrary point in Π, and, for P ∈ C(Π), let S_p(P) be any point π ∈ P such that ρ(p, π) = d(p, P). The function S_p is called the nearest-point selection defined by p. If p is such that there exists a unique such π in P, then p is a uniqueness point for P. In this case, π uniquely minimizes ρ(p, q) over all q ∈ P. Let P¹ denote the uniqueness set for P, i.e., the set of all uniqueness points for P. If p ∈ P¹ and S_p is the nearest-point selection defined by p, then S_p(P) is uniquely determined in P. Schochetman and Smith prove the following theorem:

Theorem 1 Let {P_i} be a collection of closed sets in Π, and P be a closed set in Π. Then the following are equivalent:
(i) P = lim inf_{i→∞} P_i = lim sup_{i→∞} P_i.
(ii) lim sup_{i→∞} P_i ⊆ P and, for each π ∈ P, there exist π_i ∈ P_i for all i such that π_i → π as i → ∞.
(iii) P_i → P in the Hausdorff metric.
(iv) S_p(P_i) → S_p(P) as i → ∞ for all nearest-point selections S_p on C(Π) defined by p ∈ P¹.

To apply Theorem 1 to our problem, set P equal to Π*, the set of infinite horizon optimal strategies, and let P(T_i) be the set of strategies that are optimal in some sense (to be defined below) for a finite horizon T_i. Assume T_i → ∞. In this context, lim sup_{i→∞} P(T_i) is the set of algorithmically "optimal" strategies, that is, "optimal" strategies which can be identified by solving finite horizon problems. Whether or not they are truly optimal depends on the definition of P(T_i). Results (i) and (ii) state that the algorithmically "optimal" strategies are optimal and, further, that each optimal strategy can be approximated by solving any sequence of increasingly long finite horizon problems. Result (iii) states that finite horizon "optimal" sets converge to the infinite horizon optimal set. This result is similar to one found by McKenzie [1976]. Finally, result (iv) suggests an algorithm for choosing one out of the possibly many optimal strategies. For p ∈ (Π*)¹, S_p selects a sequence of finite horizon "optimal" strategies that converges to the corresponding infinite horizon optimal strategy. From (iv), we can derive a forecast horizon by using the ρ-metric as in Bean and Smith [1984]. Convergence in this metric implies agreement in early policies. Therefore, since a convergent sequence of strategies eventually shares the same initial policy, a forecast horizon can be identified by extending the sequence far enough.

4 Application to Infinite Horizon Optimization

In this section we lay some groundwork for applying Theorem 1 to infinite horizon optimization.
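The metrics of Section 3 can be sketched numerically. The following is a minimal illustration, not part of the paper's development: strategies are truncated to finite tuples, and M and β are arbitrary choices satisfying β < 1/(M + 1).

```python
# Illustrative sketch of the metrics rho, d, h, and D of Section 3.
# Assumptions: strategies are truncated to finite tuples; M and BETA are
# arbitrary choices with BETA < 1/(M + 1).
M = 3
BETA = 1.0 / (M + 1)

def rho(pi1, pi2):
    # rho(pi, pi') = sum over n of BETA^n * |pi_n - pi'_n|
    return sum(BETA ** n * abs(a - b)
               for n, (a, b) in enumerate(zip(pi1, pi2), start=1))

def d(pi, P):
    # distance from a strategy to a closed (here: finite) set of strategies
    return min(rho(pi, phi) for phi in P)

def h(Q, P):
    return max(d(pi, P) for pi in Q)

def D(Q, P):
    # Hausdorff metric on C(Pi)
    return max(h(Q, P), h(P, Q))
```

The defining property of ρ is visible here: two strategies that agree in their first policy, e.g. (1, 0, 0) and (1, 3, 3), are strictly closer under ρ than any two strategies that differ in the first policy.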
We show that the finite horizon and infinite horizon sets of strategies involved in solving an infinite horizon problem are closed. Next we establish that the point θ = (0, 0, ...) is a uniqueness point for any closed set of strategies. Finally, we demonstrate that any set of strategies can be ordered according to their distance, measured by ρ, from θ. The ordering can be determined simply by examining the sequence of policies which compose each strategy.

Definition: Let P(T) ⊆ Π. We say that P(T) is a T-horizon set if membership in P(T) is determined by data up to time T (inclusive).

Lemma 2 The sets Π* and P(T), where P(T) is any T-horizon set, are closed.

Proof: Π* is closed since C_π is continuous in π and Π is compact. For P(T), note that there are a finite number of partial strategies up to time T. If P(T) is finite then it is closed. Suppose that P(T) is infinite and that π is a cluster point of P(T). Let {π^n}_{n=1}^∞ be such that π^n ∈ P(T) for all n and π^n → π in ρ. Then {π^n}_{n=1}^∞ is Cauchy, which implies that for some N, the strategies {π^n, n ≥ N} are in agreement with one another and with π up to time T. Therefore π ∈ P(T). Thus P(T) contains all its cluster points, and P(T) is closed. ∎

Lemma 3 For any closed set P ⊆ Π, θ = (0, 0, ...) is a uniqueness point for P, that is, θ ∈ P¹.

Proof: We will show that:

1. ρ(θ, π) attains its minimum on P,

2. The function ρ(θ, π), as a function of π, is one-to-one on Π.

(1) and (2) imply that ρ(θ, π) achieves its minimum at a unique point of P. Therefore, θ ∈ P¹.

1. We begin by showing that ρ(θ, π) is continuous on Π. Suppose π^n → π. By the triangle inequality, ρ(θ, π) ≤ ρ(θ, π^n) + ρ(π^n, π), and ρ(θ, π^n) ≤ ρ(θ, π) + ρ(π^n, π). Therefore, |ρ(θ, π) − ρ(θ, π^n)| ≤ ρ(π^n, π) → 0. Hence, ρ(θ, π) is continuous on Π. Now, since P is closed and P ⊆ Π, it follows that P is compact. Therefore, ρ(θ, π) attains its minimum on P.

2. Suppose π^1, π^2 ∈ Π and π^1 ≠ π^2. Then

ρ(θ, π^1) − ρ(θ, π^2) = Σ_{n=1}^∞ β^n (π^1_n − π^2_n).

Since π^1 ≠ π^2, there exists n such that π^1_n ≠ π^2_n. Let n_0 be the smallest such n. Assume without loss of generality that π^1_{n_0} > π^2_{n_0}. Then π^1_{n_0} − π^2_{n_0} ≥ 1. By assumption, π^1_n − π^2_n ≥ −M for n > n_0. Then

ρ(θ, π^1) − ρ(θ, π^2) ≥ β^{n_0} − Σ_{n=n_0+1}^∞ β^n M = β^{n_0} (1 − βM/(1 − β)) > 0,

since β < 1/(M + 1). Hence, ρ(θ, ·) is one-to-one on Π. ∎

Lemmas 2 and 3 imply that we can apply Theorem 1 to the question of whether any T-horizon set, P(T), converges to Π*. We can use θ as a uniqueness point for a nearest-point selection. The next result interprets the selection S_θ(·) as a simple tie-breaking rule, similar to one used for resolving degeneracy in linear programming (Murty [1983]). For finite choice infinite horizon optimization problems, we can define a lexicographic ordering of strategies, and use it to distinguish between alternate optimal strategies.

Definition: Let a = (a_1, a_2, ...) and b = (b_1, b_2, ...). We say that a ≺ b (a is lexicographically smaller than b) if and only if a ≠ b and, if n_0 is the smallest n such that a_n ≠ b_n, then a_{n_0} < b_{n_0}.

Lemma 4 For any closed set P ⊆ Π, S_θ(P) is the lexicographic minimum element of P.

Proof: We will show that ρ(θ, π^1) < ρ(θ, π^2) if and only if π^1 ≺ π^2. Suppose π^1 ≺ π^2. Then for some n_0 we have π^1_n = π^2_n for n < n_0 and π^1_{n_0} < π^2_{n_0}. Then

ρ(θ, π^2) − ρ(θ, π^1) = Σ_{n=n_0}^∞ β^n (π^2_n − π^1_n)

≥ β^{n_0} − Σ_{n=n_0+1}^∞ β^n M = β^{n_0} (1 − βM/(1 − β)) > 0,

since β < 1/(M + 1). Hence, π^1 ≺ π^2 implies ρ(θ, π^1) < ρ(θ, π^2). Now suppose π^1 ⊀ π^2. Either π^1 = π^2 or π^1 ≻ π^2. If π^1 = π^2 then ρ(θ, π^1) = ρ(θ, π^2). If π^1 ≻ π^2, then by the same argument as above, ρ(θ, π^1) > ρ(θ, π^2). Hence π^1 ⊀ π^2 implies ρ(θ, π^1) ≥ ρ(θ, π^2). ∎

Thus the nearest-point selection defined by θ simply selects the lexicographic minimum strategy. To summarize, we have:

Theorem 5 If {P(T)} is some collection of T-horizon sets, then P(T) → Π* if and only if the lexicographic minimum element of P(T) converges to the lexicographic minimum element of Π*.

Proof: Follows from Theorem 1 and Lemma 4. ∎

Assuming P(T) → Π*, Theorem 5 suggests the following algorithm: solve increasingly long finite horizon problems, identifying for each T the set P(T) and its lexicographic minimum element. The sequence of strategies thus generated converges to the lexicographic minimum infinite horizon optimal strategy.

5 Weakly Reachable Problems

Having established that Hausdorff convergence is equivalent to the existence of a tie-breaking algorithm, we would like to establish conditions under which Hausdorff convergence takes place. Then we can exploit the equivalence to derive an algorithm with a tie-breaking rule for identifying a forecast horizon and an optimal initial policy. To establish Hausdorff convergence, we define the finite horizon sets {P(T)} and specify assumptions on the problem's structure. In this section, we give a rather broad definition for P(T) and a condition under which P(T) → Π*. In the following section, we discuss a more restricted class of problems in which a tighter definition of P(T) allows a stopping rule for the tie-breaking algorithm. In that section we also give a necessary and sufficient condition for Hausdorff convergence to take place.

We now turn from our focus on strategies to consider the dynamic programming network defined by the problem. In modeling a problem, we may consider states to be defined by strategies.
In the extreme case, there may be a distinct state at time T for each different strategy up to that time. Often, however, more than one strategy will reach the same state at time T. If there is only one possible state at time T, that is, if the state depends only on the time and not on the sequence of decisions up to that time, the state is called a regeneration point. Alternatively, we can define strategies in terms of states. For deterministic problems, strategies can be viewed as paths through the dynamic programming network, where a path is a sequence of states.

The concept of reachability of states has been used to prove forecast horizon results (McKenzie [1976]; Lasserre, Bes, and Roubellat [1984]; Bean and Smith [1986]). Loosely speaking, a state is reachable if it is not too far from an optimal path. Assume that states can be numbered so that states associated with later decision epochs have higher numbers. Let there be a root node for the dynamic programming network, representing the initial state. A strategy covering the infinite horizon can be represented by the infinite sequence of states it traverses from the root node through the infinite horizon, and its discounted cost can be represented as the length of that path. The following definitions are adapted from Bean and Smith [1986]:

Definition: C(s) = the shortest distance from the root node to node s. C'(s) = the shortest distance from node s through the infinite horizon. C(·|s) = the length of a shortest path from the root node covering the infinite horizon which passes through node s. The principle of optimality may be stated as: C(·|s) = C(s) + C'(s).

Definition: A sequence of states {s_n}, s_n → ∞, is weakly reachable from an optimal path if for all ε > 0, there exists an N such that for n ≥ N, there exists a j(n) such that there is a path from s*_{j(n)} to s_n at cost c_n with C(·|s_n) ≤ C(·|s*_{j(n)}) + ε, where s*_{j(n)} is a state on some path {s*_j} optimal for the infinite horizon problem, s*_{j(n)} → ∞, and c_n → 0 as n → ∞.

Bean and Smith [1986] prove the following:

Theorem 6 C(·|s_i) → C* if and only if {s_i} is weakly reachable.

To apply Theorem 1 to finite choice infinite horizon optimization, we need to define a finite horizon set of strategies for each finite horizon, T. Let S(T) be some set of dynamic programming states at time T. Let

Π̂*(T) = {π | π is optimal for some s ∈ S(T) at time T}.

We would like to establish result (ii) of Theorem 1. We accomplish this in two steps.

Lemma 7 lim sup Π̂*(T) ⊆ Π* if and only if all infinite horizon state paths composed of states in {S(T)} are weakly reachable.

Proof: By definition, π ∈ lim sup Π̂*(T) if and only if there exist {T_i}_{i=1}^∞, T_i → ∞, and π^i ∈ Π̂*(T_i) such that π^i → π. Say π^i passes through state s_i at time T_i. By Theorem 4.5 of Schochetman and Smith [1987a], C_{π^i}(T_i) → C_π. We have C(s_i) = C_{π^i}(T_i). Therefore, by Theorem 6, C_{π^i}(T_i) = C(s_i) → C* = C_π if and only if {s_i} is weakly reachable. ∎

Lemma 8 If S(T) is the set of all possible states at time T, then Π* ⊆ lim inf Π̂*(T).

Proof: Suppose π ∈ Π*. Then, by the principle of optimality, π is optimal for its state at each time T. Thus, π ∈ Π̂*(T) for all T ≥ 0. Hence, there exist π^T ∈ Π̂*(T) such that π^T → π. Therefore, π ∈ lim inf Π̂*(T). ∎

We now have the following theorem.
Theorem 9 If S(T) is the set of all possible states at time T and if all infinite horizon state paths composed of states in {S(T)} are weakly reachable, then Π̂*(T) → Π*.
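The selection S_θ of Lemma 4, on which the algorithm of Theorem 5 rests, can be sketched directly. This is an illustration only: strategies are truncated to finite tuples, and M and β are arbitrary choices with β < 1/(M + 1).

```python
# Sketch of the nearest-point selection S_theta of Lemma 4 / Theorem 5.
# Assumptions: strategies are truncated to finite tuples of nonnegative
# policy indices; M and BETA are illustrative choices, BETA < 1/(M + 1).
M = 3
BETA = 1.0 / (M + 1)

def rho_to_theta(pi):
    # rho(theta, pi) with theta = (0, 0, ...)
    return sum(BETA ** n * abs(x) for n, x in enumerate(pi, start=1))

def select(P):
    """S_theta(P): the strategy in P nearest to theta in the rho metric."""
    return min(P, key=rho_to_theta)
```

By Lemma 4, this nearest-point selection coincides with the lexicographic minimum, which Python's built-in tuple ordering computes directly, so `select(P) == min(P)` for sets satisfying the policy bound.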

5.1 Capacity Expansion Illustration

In this section we give an example in which all dynamic programming states are weakly reachable. The problem is therefore amenable to solution by the algorithm suggested by Theorem 5. Consider the problem of optimally deploying facilities to meet growing demands for capacity. As in Bean and Smith [1985], assume we are given a cumulative continuous demand, D(t), for capacity at time t. Demand can be satisfied by any of a finite set of distinct replicable facility types indexed 1, 2, ..., n. Installing facility j incurs a discrete fixed cost F_j > 0 and provides X_j units of capacity. Assume that X_1 < X_2 < ... < X_n and, without loss of optimality, that F_1 < F_2 < ... < F_n. No undercapacity is allowed, and we assume for convenience that variable operating costs are unaffected by the choice of facilities. The problem is to choose an optimal sequence of facilities to meet the demand for capacity over an infinite horizon.

We can assume without loss of optimality that a new facility is never deployed until all existing capacity is exhausted. A point in time when it is necessary to add capacity is called an installation epoch. Bean and Smith [1985] show that, for this problem, installation epochs are regeneration points. We define a dynamic programming state to include the time, T, and the amount of excess capacity at T. At an installation epoch, excess capacity equals zero. The cumulative cost function for any strategy is a step function with a discontinuity at each installation epoch. Assume this function is left-continuous so that the fixed cost for a facility is added just after its installation epoch. Also note that if T_1 and T_2 are two installation epochs (possibly for different strategies) and T_1 < T_2, then the optimal cost for T_1 is less than or equal to the optimal cost for T_2. Otherwise, the cumulative demand up to T_1 could be satisfied by the decision sequence leading up to T_2 at lower cost.
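The structure above can be sketched concretely. The following illustration evaluates a single strategy; the demand function D(t) = t, the discount rate, and the facility data are all assumptions chosen for the sketch, not data from the paper.

```python
import math

# Sketch of evaluating one capacity expansion strategy (Section 5.1).
# Assumptions: cumulative demand D(t) = t, discount rate R, and facilities
# given as (capacity X_j, fixed cost F_j). A new facility is installed only
# when existing capacity is exhausted, so the installation epochs are the
# times at which cumulative capacity equals cumulative demand.
R = 0.1

def epochs_and_cost(strategy, facilities):
    """Return the installation epochs of a strategy (a sequence of facility
    indices) and its discounted cost, with D(t) = t."""
    t, cost, epochs = 0.0, 0.0, []
    for j in strategy:
        X, F = facilities[j]
        epochs.append(t)
        cost += math.exp(-R * t) * F  # fixed cost incurred at the epoch
        t += X                        # capacity X_j serves X_j time units of demand
    return epochs, cost
```

With D(t) = t, each installed capacity X_j lasts exactly X_j time units, so the epochs are partial sums of the installed capacities; the cost is the discounted sum of fixed costs at those epochs.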
Finally, assume D(t) is continuous, nonnegative, and eventually bounded by an exponential function with rate g. Bean and Smith [1985] prove that if costs are discounted at a rate r > g, then infinite horizon discounted costs are finite.

Lemma 10 If states are defined as (T, excess capacity at T), then all paths are weakly reachable.

Proof: Choose a state path {s_n}_{n=1}^∞. Let N be such that for n ≥ N, C'(s_n) < ε (N exists for interest rates r > g). Then C'(s) < ε for all s at times beyond T_n, where T_n is the time associated with s_n. Let t_n be the last installation epoch leading to s_n and t_{n+1} be the first installation epoch after s_n on a strategy that goes through s_n. Given some optimal path {s*_j}, let

j(n) = max{j | T_{s*_j} ≤ t_n, s*_j an installation epoch},

(see Figure 1). Then c_n = cost(s*_{j(n)}, s_n) → 0 by the convergence of costs. Hence

C(·|s_n) − C(·|s*_{j(n)}) = C(s_n) − C(s*_{j(n)}) + C'(s_n) − C'(s*_{j(n)}).

But C(s_n) − C(s*_{j(n)}) ≤ 0, since C(s_n) ≤ C(t_{n+1}) ≤ C(s*_{j(n)}), and C'(s_n) − C'(s*_{j(n)}) < ε by the choice of N. Therefore {s_n} is weakly reachable. ∎
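Because installation epochs are regeneration points, the finite horizon problem reduces to a recursion over epochs. A minimal sketch, under the same illustrative assumptions as above (linear demand D(t) = t, assumed discount rate and facility data):

```python
import math
from functools import lru_cache

# Sketch of the finite horizon capacity expansion problem solved by dynamic
# programming over installation epochs (the regeneration points of Section
# 5.1). Assumptions: D(t) = t, discount rate R, and two illustrative
# facility types (capacity X_j, fixed cost F_j).
R = 0.1
FACILITIES = [(1, 1.0), (3, 2.0)]

@lru_cache(maxsize=None)
def cost_to_horizon(t, T):
    """Minimum discounted cost of meeting demand from installation epoch t
    up to horizon T; with D(t) = t, a facility of capacity X installed at
    epoch t is exhausted at epoch t + X."""
    if t >= T:
        return 0.0
    return min(math.exp(-R * t) * F + cost_to_horizon(t + X, T)
               for X, F in FACILITIES)
```

The T-optimal initial facilities are the minimizers at t = 0; when several facilities attain the minimum, the tie-breaking rule of Theorem 5 selects the lexicographically smallest optimal continuation.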

[Figure 1. Weak Reachability in Capacity Expansion Problems: excess capacity versus time for an optimal strategy and an arbitrary strategy, showing s*_{j(n)}, t_n, and t_{n+1}.]

Bean and Smith [1985] present an algorithm for identifying a specific forecast horizon for this problem assuming nondegeneracy. By Theorem 5 and Lemma 10, we have an approach for identifying a general forecast horizon even in the presence of more than one optimal initial policy. In this instance, weak reachability follows from the regeneration point properties of optimal strategies. We first had to establish, by insight into the problem's structure, that it is sufficient to consider only strategies in which capacity is added only when all existing capacity is used up. A weak reachability result could be established similarly in a Wagner-Whitin type of production planning problem, in which production occurs only when inventory is zero. We expect that proving weak reachability for other problems will require some measure of insight into special problem characteristics.

6 Regeneration Point Problems

In this section we address Hausdorff convergence for problems with regeneration point structure. We give an algorithm with a stopping rule which is guaranteed to settle on an infinite horizon optimal initial policy in finite time. This is the first implementable algorithm guaranteed to converge in finite time without the restriction that π* be unique. Next we present a necessary and sufficient condition for Hausdorff convergence to occur in regeneration point problems. This condition is a type of reachability stronger than those defined by Bean and Smith [1986] and McKenzie [1976]. Finally, we illustrate the reachability condition by means of two examples. The knapsack problem solved by Shapiro and Wagner [1967] satisfies the reachability condition; the Shapiro-Wagner algorithm is actually a special case of our tie-breaking algorithm.
We also present an example of capacity expansion under exponential demand which has regeneration points but fails the reachability test.

Recall that a decision epoch T is a regeneration point if the optimal strategy beyond T is independent of the sequence of policies up to time T. A problem has regeneration point structure, or is a regeneration point problem, if the decision epochs for optimal policies are regeneration points. We can draw two conclusions which will be useful for proving results in this section. First, suppose T is a regeneration point for π* ∈ Π*. By the principle of optimality, π* is optimal to its state at time T and optimal from its state at time T through the infinite horizon. It follows that (i) any strategy truncation ending at T which is T-optimal can

be continued by π* beyond T. Otherwise, since optimality implies feasibility, the optimal strategy beyond T would depend on the policies before T. Also, (ii) π* ∈ Π*(T). If not, there is a less costly sequence of policies up to time T which can be continued by π* beyond T; this strategy would have lower cost than π*.

6.1 Tie-Breaking Algorithm

For regeneration point problems, we define the finite horizon set P(T) for each T to be Π*(T), the set of T-optimal strategies. Assume that Π*(T) → Π* in the Hausdorff metric. The following algorithm can be used to solve any regeneration point problem. Let τ be the maximum duration of any policy.

Algorithm
1. Let {T_n}_{n=1}^∞ be the sequence composed of all decision epochs for all strategies. Set n ← 1.
2. Solve the T_n-horizon problem to obtain [S_θ(Π*(T_n))]_1, the initial policy of the lexicographic minimum T_n-optimal strategy.
3. If [S_θ(Π*(T_n))]_1 = [S_θ(Π*(T_k))]_1 for all k such that T_n − τ ≤ T_k < T_n, then stop. Otherwise, set n ← n + 1 and go to step 2.

Theorem 11 If the algorithm is applied to a regeneration point problem then
(i) The algorithm will stop in finite time.
(ii) When the algorithm stops, [S_θ(Π*(T_n))]_1 = [S_θ(Π*)]_1.
Thus, the algorithm selects an optimal initial policy in finite time.

Proof (i): Since Π*(T) → Π*, we know that S_θ(Π*(T)) → S_θ(Π*). Hence there exists T* such that for T ≥ T*, ρ(S_θ(Π*(T)), S_θ(Π*)) is small enough that the two strategies agree in the first policy. Therefore, for T ≥ T*, [S_θ(Π*(T))]_1 = [S_θ(Π*)]_1.

(ii): Suppose not, i.e., suppose the algorithm stops at the finite horizon T_n, with [S_θ(Π*(T_n))]_1 = [S_θ(Π*(T_k))]_1 for all T_n − τ ≤ T_k < T_n, but [S_θ(Π*)]_1 ≠ [S_θ(Π*(T_n))]_1. Any π* ∈ Π* has a decision epoch in [T_n − τ, T_n]; call it T'. Now, Π*_1(T') ⊆ Π*_1 by the regeneration point property. Then [S_θ(Π*(T'))]_1 ∈ Π*_1, and hence [S_θ(Π*(T_n))]_1 ∈ Π*_1. Then, since [S_θ(Π*)]_1 ≠ [S_θ(Π*(T_n))]_1 and [S_θ(Π*)]_1 is the smallest initial policy in Π*_1, we have [S_θ(Π*)]_1 < [S_θ(Π*(T_n))]_1. But S_θ(Π*) has a decision epoch T'' ∈ [T_n − τ, T_n]. Then S_θ(Π*) ∈ Π*(T''). Therefore, S_θ(Π*) ≽ S_θ(Π*(T'')) lexicographically.
Hence, [S_θ(Π*)]_1 ≥ [S_θ(Π*(T''))]_1 = [S_θ(Π*(T_n))]_1. This is a contradiction. ∎
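The algorithm of Section 6.1 can be sketched as follows. The solver interface is a hypothetical assumption for illustration: `optimal_set(T)` stands in for a routine returning Π*(T) as a set of strategies encoded as tuples of policy indices.

```python
# Sketch of the tie-breaking algorithm of Section 6.1. Assumptions:
# `epochs` is the increasing sequence {T_n} of all decision epochs,
# `optimal_set(T)` (hypothetical) returns the set of T-optimal strategies
# as tuples of policy indices, and `tau` is the maximum policy duration.
def tie_break(epochs, optimal_set, tau):
    """Return the selected optimal initial policy, or None if the stopping
    rule is never met within `epochs`."""
    history = []  # pairs (T_k, [S_theta(Pi*(T_k))]_1)
    for T in epochs:
        first = min(optimal_set(T))[0]  # step 2: initial policy of lexico minimum
        recent = [f for (Tk, f) in history if T - tau <= Tk < T]
        if recent and all(f == first for f in recent):
            return first                # step 3: stopping rule satisfied
        history.append((T, first))
    return None
```

Note that `min` on tuples implements the lexicographic minimum of Lemma 4, so ties among T-optimal strategies are broken exactly as S_θ prescribes.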

6.2 Necessary and Sufficient Condition for Hausdorff Convergence

Let _LΠ* be the set of truncations of strategies in Π* to the first L policies. Let _LΠ*(T) be the analogous set for Π*(T). Since _LΠ* and _LΠ*(T) are finite, they are compact. To prove the main result in this section, we need the following lemma.

Lemma 12 Π* ⊆ lim inf Π*(T) if and only if for every L there exists T_L such that for T ≥ T_L, _LΠ* ⊆ _LΠ*(T).

Proof: Suppose Π* ⊆ lim inf Π*(T). Then for each π ∈ Π*, for each T, there exists π^T ∈ Π*(T) such that π^T → π. Choose L and consider the truncation _Lπ of π ∈ Π*. Since π^T → π, there exists T_L such that for T ≥ T_L, _Lπ^T = _Lπ. Thus _Lπ ∈ _LΠ*(T). But _Lπ was an arbitrary element of _LΠ*.

Now suppose Π* ⊄ lim inf Π*(T). Then for some π ∈ Π*, for some ε > 0, for every T, there exists T' ≥ T such that ρ(π, π^{T'}) > ε for each π^{T'} ∈ Π*(T'). Then for some K, for every T, there exists T' ≥ T such that for every π^{T'} ∈ Π*(T'), π^{T'}_n ≠ π_n for some n ≤ K. In other words, for some K, for every T, there exists T' ≥ T such that _Kπ ∉ _KΠ*(T'). The result holds by contraposition. ∎

Our necessary and sufficient condition stipulates that for any pair of infinite horizon optimal strategies, there exists, indefinitely far into the future, a sequence of policies that connects the two strategies optimally. That is, one can follow the first strategy up to an arbitrary decision epoch, then implement a sequence of policies to arrive with optimal cost at a decision epoch of the second strategy. A rigorous definition follows. The terminology "T on π" means T is a decision epoch for strategy π.

Definition: A strategy π^1 is optimally connected to a strategy π^2 if for any T on π^1, there exist T' > T on π^1 and S on π^2, S > T', such that there exists a sequence of policies (π̂_1, π̂_2, ..., π̂_n) with

C_{π^2}(S) = C_{π^1}(T') + {discounted cost of (π̂_1, π̂_2, ..., π̂_n) when implemented at time T'},

where (π̂_1, π̂_2, ..., π̂_n) continues at least up to S when implemented at T'.
We assume that, in the event that (π̂_1, π̂_2, ..., π̂_n) continues beyond S, the policies of π^2 which would be implemented starting at S may still be implemented at S. An example is capacity expansion (Bean and Smith [1985]), where (π̂_1, π̂_2, ..., π̂_n) may install more capacity than is actually needed up to time S. In that case, we assume the excess capacity is ignored, and a new facility can be installed at time S. We say that all strategies are optimally connected if every strategy is optimally connected to every other. The following is a statement of our necessary and sufficient condition for Hausdorff convergence.

Theorem 13 For a regeneration point problem, Π*(T) → Π* if and only if all strategies in Π* are optimally connected.

Proof: As shown by Schochetman and Smith [1987b], lim sup Π*(T) ⊆ Π*. We will show that optimal connectedness is necessary and sufficient for Π* ⊆ lim inf Π*(T).

Suppose Π* ⊆ lim inf Π*(T). Then by Lemma 12, for any π* ∈ Π* and any L, _Lπ* ∈ _LΠ*(T) for all T sufficiently large. In particular, there exists T_L such that for any π* ∈ Π*, _Lπ* ∈ _LΠ*(T_i) where T_i is a decision epoch for π*, T_i ≥ T_L. Now suppose that not all strategies are optimally connected. Then for some π* and π̄* ∈ Π*, for some T on π̄*, for every T' > T on π̄* and every S > T on π*,

C_{π*}(S) < C_{π̄*}(T') + {cost of any sequence of policies starting at T' and continuing at least up to S}.

Suppose T is the (m + 1)st decision epoch of π̄*. Then _mπ̄* ∉ _mΠ*(T_i) where T_i is a decision epoch for π*, T_i > T. This is a contradiction.

Now suppose every optimal strategy is optimally connected. We will show that for every L, there exists T_L such that for T ≥ T_L, _LΠ* ⊆ _LΠ*(T). Then Π*(T) → Π* by Lemma 12. Suppose not. Then for some L, there exists {T_i^L}_{i=1}^∞, T_i^L → ∞, such that _LΠ* ⊄ _LΠ*(T_i^L), i.e., for each i, there exists _Lπ^i ∈ _LΠ* such that _Lπ^i ∉ _LΠ*(T_i^L). By compactness, there exists a convergent

subsequence of {_Lπ^i}, call it {_Lπ^j}, such that _Lπ^j → _Lπ̄. That is, _Lπ^j = _Lπ̄ for j sufficiently large, and _Lπ̄ ∈ _LΠ*. Let π̄ be some continuation of _Lπ̄ which is contained in Π*. For each j, choose an element of Π*(T_j^L), call it π^j. There exists a convergent subsequence {π^k} of {π^j} such that π^k → π* ∈ Π*. We have _Lπ^k = _Lπ* for k sufficiently large. Summarizing, we have

_Lπ̄ ∈ _LΠ*, _Lπ̄ ∉ _LΠ*(T_j^L) for all j, and
_Lπ* ∈ _LΠ*, _Lπ* ∈ _LΠ*(T_k^L) for all k,

where {T_k^L} ⊆ {T_j^L}. There exists some T_L on π̄ such that L policies have been implemented by π̄ before T_L. By assumption, there exist T' on π̄, T' ≥ T_L, and T̄ on π*, T̄ > T', and a sequence of policies (π̂_1, π̂_2, ..., π̂_n) which, when implemented at T', extends at least up to T̄ and satisfies

C_{π*}(T̄) = C_{π̄}(T') + {cost of (π̂_1, π̂_2, ..., π̂_n) when implemented at T'}.

We can form a new strategy: follow π̄ up to T', then (π̂_1, π̂_2, ..., π̂_n) between T' and T̄, then the policies of π* from T̄ up to T̃ = min{T_k^L | T_k^L ≥ T̄}. This strategy is optimal for T̃. Then _Lπ̄ ∈ _LΠ*(T̃), which is a contradiction. ∎

6.3 Examples

The knapsack problem solved by Shapiro and Wagner [1967] is one example in which a forecast horizon exists regardless of degeneracy. Their paper gives an algorithm, together with a stopping rule guaranteed to be satisfied in finite time, for solving infinite horizon problems. They make no assumption concerning the number of optimal initial policies. In fact, the knapsack problem is an example of Hausdorff convergence in a regeneration point problem, and the Shapiro-Wagner algorithm is a special case of our tie-breaking algorithm. We can show, by extending the results of Shapiro and Wagner, that the knapsack problem satisfies the necessary and sufficient condition for Hausdorff convergence. Therefore, the tie-breaking algorithm is guaranteed to converge. First, we define the problem. Some of the notation and conventions from Shapiro and Wagner [1967] are altered to harmonize with the rest of this paper.
In particular, we reverse the policy indexing order. At any point in time, the same set of n policies, {u_1, u_2, ..., u_n}, is available. Policy j incurs cost F_j and has duration τ_j. Expressed as a dynamic programming recursion, the finite horizon problem is

G(T) = min_{j = 1, 2, ..., n : τ_j ≤ T} { F_j + α^{τ_j} G(T − τ_j) },

where α = e^{−r} and G(0) = 0. Let γ_j = F_j / (1 − α^{τ_j}). The infinite horizon problem can be expressed as

G = min_{j = 1, 2, ..., n} γ_j.

Any policy minimizing γ_j is called a turnpike policy. Then Π* is the set of turnpike strategies, that is, the set of all possible sequences of turnpike policies. Let {u_1, u_2, ..., u_t} be the set of turnpike policies. Assume policies are indexed in order of increasing γ_j. In the event of a tie, the lowest index is given to the policy with the smallest τ_j. Thus, u_1 is the turnpike policy with the smallest duration.
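To make the recursion concrete, the finite horizon values G(T) and the turnpike quantities γ_j can be tabulated directly when the durations τ_j are positive integers with min_j τ_j = 1 (so every integer horizon is reachable). This sketch is ours, not code from either paper; the function names are illustrative.

```python
import math

def knapsack_dp(F, tau, r, T):
    """Tabulate G(t) = min over {j : tau_j <= t} of F_j + alpha^{tau_j} G(t - tau_j),
    with alpha = e^{-r} and G(0) = 0.  Assumes integer durations, min duration 1."""
    alpha = math.exp(-r)
    G = [0.0] * (T + 1)
    for t in range(1, T + 1):
        G[t] = min(F[j] + alpha ** tau[j] * G[t - tau[j]]
                   for j in range(len(F)) if tau[j] <= t)
    return G

def turnpike_gammas(F, tau, r):
    """gamma_j = F_j / (1 - alpha^{tau_j}); turnpike policies minimize gamma_j."""
    alpha = math.exp(-r)
    return [F[j] / (1 - alpha ** tau[j]) for j in range(len(F))]
```

For instance, with a single policy of cost 1 and duration 1 and α = 1/2, G(3) = 1 + α + α² = 1.75, while the infinite horizon value is γ_1 = 1/(1 − α) = 2.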

As in Shapiro and Wagner [1967], let a_j and b_j be relatively prime positive integers such that a_j τ_j = b_j τ_1, for j = 1, 2, ..., t. Let M* be the smallest integer such that

γ_1 (1 − α^{L+1}) < γ_{k*} (1 − α^L) for all L > M*,

where k* is such that

F_{k*} / (1 − α^{τ_{k*}}) = min { F_j / (1 − α^{τ_j}) : F_j / (1 − α^{τ_j}) > F_1 / (1 − α^{τ_1}) }.

Let

N* = Σ_{j=2}^{t} (a_j − 1)

and

L* = Σ_{j=2}^{t} a_j τ_j.

Shapiro and Wagner show that for T > M* + N*, there exists a strategy π ∈ Π*(T) such that π_1 = u_1. Therefore, for any T > M* + N*, there exists a strategy π ∈ Π*(T) such that π_i = u_1 up to some epoch in [T − M* − N*, T − M* − N* + τ_1).

Lemma 14 In the knapsack problem, all infinite horizon optimal strategies are optimally connected.

Proof: Consider any π_1 and π_2 in Π*. We will show that π_1 and π_2 are optimally connected. Choose some decision epoch T for π_1. Up to time T, π_1 includes N_j^T replications of policy u_j, for j = 1, ..., t. Let m be the smallest positive integer such that m a_j > N_j^T for all j ∈ {1, ..., t}. Let S be a decision epoch for π_2, S > M* + N* + m L*. Then there exists a strategy π̂ ∈ Π*(S) with π̂_i = u_1 up to some S' ∈ [S − M* − N*, S − M* − N* + τ_1). We have S' > m L*. In strategy π̂, we can replace m b_j replications of u_1 by m a_j replications of u_j, for j = 2, ..., t, with no difference in cost. In so doing, we include at least N_j^T replications of u_j, j = 1, 2, ..., t. Rearrange the initial policies so that they agree with π_1 up to time T. The resulting strategy is optimal for S and has a decision epoch at T. Its policies between T and S form an optimal connecting path between π_1 and π_2. Therefore π_1 and π_2 are optimally connected. ∎

The Shapiro-Wagner algorithm solves increasingly long finite horizon problems, identifying an optimal initial policy for each horizon. If two policies tie, the one with the smaller index is chosen. The algorithm terminates at T_s if the optimal initial policy is u_1 for all T ∈ [T_s − τ̄, T_s], where τ̄ = max_{j=1,2,...,n} τ_j. Thus their algorithm is identical to our tie-breaking algorithm.
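The horizon-lengthening procedure just described can be sketched as follows for integer durations. This is our illustrative reading of the stopping rule, not code from either paper: the horizon cap T_max and the tie tolerance tol are our own assumptions, and the sketch stops once the smallest-index choice is the same single policy over a window of length τ̄, which is the state the text's rule detects.

```python
import math

def tie_breaking(F, tau, r, T_max=10_000, tol=1e-12):
    """Solve successively longer finite horizon problems; at each horizon pick
    the smallest-index optimal initial policy; stop at Ts once that choice has
    been identical for every horizon in [Ts - tau_bar, Ts], tau_bar = max_j tau_j.
    Returns (Ts, policy index), or (None, None) if T_max is reached."""
    alpha = math.exp(-r)
    tau_bar = max(tau)
    G = [0.0]        # G[0] = 0
    first = [None]   # first[T] = chosen initial policy for horizon T
    for T in range(1, T_max + 1):
        costs = [(F[j] + alpha ** tau[j] * G[T - tau[j]], j)
                 for j in range(len(F)) if tau[j] <= T]
        best = min(c for c, _ in costs)
        j_star = min(j for c, j in costs if c <= best + tol)  # smallest-index tie-break
        G.append(best)
        first.append(j_star)
        if T > tau_bar:
            window = first[T - tau_bar: T + 1]   # horizons Ts - tau_bar .. Ts
            if all(j == window[0] for j in window):
                return T, window[0]
    return None, None
```

With two policies of costs (1, 3) and durations (1, 2) and α = 1/2, policy 0 is chosen at every horizon, so the rule fires at the first horizon whose trailing window is full.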
One instance of the knapsack problem is capacity expansion with linear demand. The policies represent available facilities to augment capacity. The problem is to choose an optimal sequence of capacity additions, subject to the constraint that cumulative demand for capacity up to time t, D(t) = dt, be satisfied for all t. Facility j incurs a fixed cost F_j and adds x_j = τ_j d units of capacity.

An extension of this problem is the case where D(t) = D_0 e^{gt}, as in Smith [1979]. The extension preserves the regeneration point property. In the case of exponential demand, however, Hausdorff convergence is not guaranteed. In the following example, Hausdorff convergence fails to take place. Nonlinear D(t) implies that policy durations are nonstationary. Smith shows that for exponential demand,

τ_j(t) = (1/g) log(1 + x_j / (D_0 e^{gt}))

is the duration of policy j when installed at time t. Also, the cost of installing facility j forever starting at time t is

C_j(t) = e^{−rt} Σ_{n=0}^{∞} F_j (1 + n x_j / (D_0 e^{gt}))^{−r/g},

where r is the interest rate used to discount costs. Let

γ_j(t) = F_j / (1 − e^{−r τ_j(t)}).

This is a nonstationary version of the γ_j defined by Shapiro and Wagner [1967].

Example: Let facility 1 add capacity x_1 = 1 and incur fixed cost F_1 = 2. Facility 2 adds capacity x_2 ≈ 0.1052 and incurs fixed cost F_2 ≈ 0.3314. The values of x_2 and F_2 are designed so that τ_2(0) = 1 and F_2 = C_1(0) − C_1(1). Let π_1 = (1, 1, 1, ...) be the strategy of installing facility 1 indefinitely. Let π_2 = (2, 1, 1, 1, ...) be the strategy of installing facility 2 once, then facility 1 indefinitely. Then π_1 and π_2 have the same infinite horizon discounted cost. See Figure 2. Also, one can verify that γ_2(τ_2(0)) > γ_1(τ_2(0)). By Theorem 3 of Smith [1979], it follows that π_1 and π_2 are superior to all other strategies. Note that decision epochs for π_1 and π_2 never coincide after time 0: decision epochs for π_1 occur when D(t) = D_0 + n x_1 for some integer n, and those for π_2 occur at t = 0 or when D(t) = D_0 + x_2 + m x_1 for some integer m. We claim that each strategy is optimal to its own decision epochs and suboptimal to the other strategy's decision epochs. It follows that lim sup_{T→∞} Π*(T) = {π_1, π_2} = Π* but lim inf_{T→∞} Π*(T) = ∅. Therefore, by Theorem 1, Π*(T) ↛ Π*.
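The nonstationary duration and cost formulas above can be evaluated numerically. The sketch below is ours; the function names are illustrative, the demand parameters used in the sanity checks (D_0 = 1, g = 0.1, r = 0.2) are arbitrary assumptions rather than the parameters of the paper's example, and the infinite series is simply truncated (it converges when r > g).

```python
import math

def duration(x, D0, g, t):
    """tau_j(t) = (1/g) log(1 + x_j / (D0 e^{g t})): the time needed for demand
    D(t) = D0 e^{g t} to grow by x units of capacity after time t."""
    return math.log(1 + x / (D0 * math.exp(g * t))) / g

def cost_forever(Fj, x, D0, g, r, t, terms=100_000):
    """C_j(t) = e^{-r t} sum_{n>=0} F_j (1 + n x / (D0 e^{g t}))^{-r/g},
    truncated after `terms` terms; for r > g the tail is O(terms^{1 - r/g})."""
    base = D0 * math.exp(g * t)
    return math.exp(-r * t) * Fj * sum((1 + n * x / base) ** (-r / g)
                                       for n in range(terms))
```

Two checks follow directly from the definitions: demand at time t + τ_j(t) equals D_0 e^{gt} + x_j by construction, and with x = D_0 and r/g = 2 the series reduces to Σ_{n≥0} (1 + n)^{−2} = π²/6.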

Figure 2. Example of Capacity Expansion under Exponential Demand with no Hausdorff Convergence.

Proof of Claim: At a decision epoch T for π_1, D(T) = D_0 + k x_1 for some k. Since x_2 < x_1, the cumulative capacity installed by π_2 up to time T must be at least D_0 + x_2 + k x_1; that is, π_2 has installed one replication of facility 2 and k replications of facility 1. Let T_n be the installation epoch for the (n+1)st facility of π_1. Then

C_{π_1}(T) = F_1 Σ_{n=0}^{k−1} e^{−r T_n} = F_1 Σ_{n=0}^{k−1} (1 + n x_1 / D_0)^{−r/g}

and

C_{π_2}(T) = F_2 + F_1 Σ_{n=0}^{k−1} (1 + (x_2 + n x_1) / D_0)^{−r/g}.

By algebra, C_{π_1}(T) < C_{π_2}(T). Similarly, up to a decision epoch S for π_2, π_1 has installed k + 1 replications of facility 1 and

π_2 has installed one replication of facility 2 and k of facility 1. Then

C_{π_1}(S) = F_1 Σ_{n=0}^{k} (1 + n x_1 / D_0)^{−r/g}

and

C_{π_2}(S) = F_2 + F_1 Σ_{n=0}^{k−1} (1 + (x_2 + n x_1) / D_0)^{−r/g}.

From algebra, C_{π_1}(S) > C_{π_2}(S). ∎

7 Conclusions

In this paper we have abandoned the common assumption of a unique optimal initial policy in infinite horizon optimization. Instead, we have addressed the question of whether the set of finite horizon optimal strategies converges to the set of infinite horizon optimal strategies as the horizon is lengthened. We presented two alternative reachability conditions which are sufficient for set convergence. The second condition, for regeneration point problems, is also necessary for set convergence. We have shown that convergence of the optimal sets leads to a simple tie-breaking algorithm, and have given a stopping rule which is guaranteed to be satisfied in finite time. The Shapiro-Wagner algorithm for solving knapsack problems (Shapiro and Wagner [1967]) is a special case of our tie-breaking algorithm.

References

Bean, J. and R. L. Smith [1984], "Conditions for the Existence of Planning Horizons," Mathematics of Operations Research, Vol. 9, pp. 391-401.

Bean, J. and R. L. Smith [1985], "Optimal Capacity Expansion Over an Infinite Horizon," Management Science, Vol. 31, pp. 1523-1532.

Bean, J. and R. L. Smith [1986], "Conditions for the Discovery of Solution Horizons," Technical Report 86-23, Department of Industrial and Operations Engineering, The University of Michigan, Ann Arbor, Michigan 48109.

Bes, C. and S. Sethi [1986], "Concepts of Forecast and Decision Horizons: Applications to Dynamic Stochastic Optimization Problems," to appear in Mathematics of Operations Research.

Charnes, A., J. Dreze and M. Miller [1966], "Decision and Horizon Rules for Stochastic Planning Problems: A Linear Example," Econometrica, Vol. 34, pp. 307-330.

Hopp, W., J. Bean and R. L. Smith [1984], "Non-Homogeneous Markov Decision Processes," to appear in Operations Research.

Lasserre, J., C. Bes and F. Roubellat [1984], "Detection of Planning Horizons for the General Inventory Problem with Discounted Concave Costs," Laboratoire d'Automatique et d'Analyse des Systèmes du C.N.R.S., 7 Avenue du Colonel Roche, 31400 Toulouse, France.

Lundin, R. and T. Morton [1975], "Planning Horizons for the Dynamic Lot Size Model: Zabel vs. Protective Procedures and Computational Results," Operations Research, Vol. 23, pp. 711-734.

McKenzie, L. W. [1976], "Turnpike Theory," Econometrica, Vol. 44, pp. 841-844.

Modigliani, F. and F. E. Hohn [1955], "Production Planning over Time and the Nature of the Expectation and Planning Horizon," Econometrica, Vol. 23, pp. 46-66.

Murty, K. G. [1983], Linear Programming, Wiley, New York.

Ryan, S. M. and J. Bean [1987], "Degeneracy in Infinite Horizon Optimization," Technical Report 87-5, Department of Industrial and Operations Engineering, The University of Michigan, Ann Arbor, Michigan 48109.

Schochetman, I. and R. L.
Smith [1987a], "Infinite Horizon Optimization," Technical Report 87-3, Department of Industrial and Operations Engineering, The University of Michigan, Ann Arbor, Michigan 48109.

Schochetman, I. and R. L. Smith [1987b], "Set Convergence in Infinite Horizon Optimization," Technical Report 87-2, Department of Industrial and Operations Engineering, The University of Michigan, Ann Arbor, Michigan 48109.

Shapiro, J. F. and H. M. Wagner [1967], "A Finite Renewal Algorithm for the Knapsack and Turnpike Models," Operations Research, Vol. 15, pp. 319-341.

Smith, R. L. [1979], "Turnpike Results for Single Location Capacity Expansion," Management Science, Vol. 25, pp. 474-484.