COST-BASED MEANS FOR RESOLVING DEGENERACY IN INFINITE HORIZON OPTIMIZATION* Sarah M. Ryan James C. Bean Techinical Report 86-41 December 2, 1986 *This material is based on work supported by the National Science Foundation under Grant No. ECS-8409682.

Cost-Based Means for Resolving Degeneracy in Infinite Horizon Optimization t Sarah M. Ryan James C. Bean December 2, 1986 1 Introduction Dynamic sequential decision problems are an important class of optimization problems. Applications include production and inventory control, capacity expansion, and equipment replacement. In some cases the decision variables are continuous, allowing a mathematical programming formulation with a continuous solution space. Often, however, the situation calls for a model with discrete decisions as, for example, in the choice between the various pieces of equipment available to replace one currently in use. A great majority of these problems share the characteristic of an indefinite horizon. The economy, firm, or government involved has no predetermined moment of extinction. To appropriately model these problems, it is necessary to assume that the problem may continue indefinitely. The traditional and most commonly used solution approach for such problems is to assume some finite horizon, T, and simply proceed as if the world ended there. The hope is that information beyond T will have little or no effect on the optimal solution, for at tThis material is based on work supported by the National Science Foundation under Grant No. ECS8409682. 1

Cost-Based Means for Resolving Degeneracy in Infinite Horizon Optimization t Sarah S. McAllister James C. Bean December 2, 1986 1 Introduction Dynamic sequential decision problems are an important class of optimization problems. Applications include production and inventory control, capacity expansion, and equipment replacement. In some cases the decision variables are continuous, allowing a mathematical programming formulation with a continuous solution space. Often, however, the situation calls for a model with discrete decisions as, for example, in the choice between the various pieces of equipment available to replace one currently in use. A great majority of these problems share the characteristic of an indefinite horizon. The economy, firm, or government involved has no predetermined moment of extinction. To appropriately model these problems, it is necessary to assume that the problem may continue indefinitely. The traditional and most commonly used solution approach for such problems is to assume some finite horizon, T, and simply proceed as if the world ended there. The hope is that information beyond T will have little or no effect on the optimal solution, for at tThis material is based on work supported by the National Science Foundation under Grant No. ECS8409682. 1

least the first few decisions, since those first decisions will be implemented immediately. A finite horizon with this guarantee is known as a forecast horizon. This solution method is known as a planning or forecast horizon approach (Lundin and Morton [1975]; Charnes, Dreze and Miller [1966]; Modigliani and Hohn [1955]). We define degeneracy as the absence of any forecast horizon. Bean and Smith [1984] showed that forecast horizons exist when every optimal sequence of decisions includes the same initial decision. Recently, Bes and Sethi [1986] found an algorithm for the case of discrete time and discrete decision sets that is guaranteed to identify a forecast horizon when the optimal initial decision is unique. Schochetman and Smith [1986] extended their results to the case of continuous time and continuous decision sets. In this paper, we adapt the Bes and Sethi algorithm to the context of continuous time and discrete decision sets. More important, we present a scheme for perturbing time zero costs which guarantees that the optimal initial decision is unique. We also calculate, for any c, a horizon that yields an initial decision that is part of a strategy with cost within e of the optimal cost. 2 Problem Definition and Assumptions Suppose we are faced with making a sequence of decisions over a continuous or discrete time frame. Each individual decision, denoted nxi, may also be called a policy, and a sequence Ir = (r1,xrl,...) of policies constitutes a strategy. Hn is the set of decisions available after n - 1 decisions have been made. II = x 1fIn is the set of all feasible strategies. Associated with each strategy, ir, is a cumulative net cost function C,(t). In order to compare costs incurred over time, we continuously discount them to 2

time zero. C,(r) = fO7 ertdC,(t) is the resulting infinite horizon discounted cost as a function of the discount rate, r. Cr(r, T) = oT ertdC(t) is the discounted cost over the finite horizon T. Bean and Smith ([1984]) showed that infinite horizon discounted costs converge if r is larger than ln Ic(t)l = sup lim sup I ren t-xoo t The set {rlr > -y) is called the range of convergence for discount rates. Given a discount rate, r, the problem we wish to solve is: min C.(r). rErI As shown by Bean and Smith [1984], the minimum exists since II is compact and C1(r) is continuous in vr. The forecast horizon approach involves solving finite horizon problems: min C,(r, T). WEl We define C*(T) to be the minimum finite horizon value and C* to be the minimum infinite horizon value. A strategy X is termed infinite horizon optimal if it minimizes C,(r) and finite horizon optimal if it minimizes C,(r,T) for some T. II* and II*(T) are the sets of optimal strategies for the infinite horizon and finite horizon problems, respectively. An optimal initial decision is an initial decision included in some optimal strategy. The sets of optimal initial decisions for the infinite and finite horizon problems, respectively, are denoted and II* and HIIj(T). We keep the assumptions of Bean and Smith [1984]; in particular: 3

C,(t) = the cumulative net cost for strategy ir up to time t = K.(t) - -R(t), O<K(t)<Met ] where K(t) Me Vt > To, M,y, To independent of ir, 0<Rr(t) <Met J K,(t), R,(t) nondecreasing. Without loss of generality, we can assume To = 0 (if not, replace M by Me'7T). We also assume that IIn is finite for all n. Bes and Sethi ([1986]) proved that this assumption is necessary for H to be compact. Compactness allows the existence of an optimal strategy (Bean and Smith [1984]). As in Bean and Smith [1984] we define the metric 00oo p(xr,') = E (7r, )2, n=l where 1 if the nth policies in r, r are different, On (r, x X)= 0 if the nth policies in Xr, are the same. This metric has the property that, for any L, any two strategies which agree in the first L decisions are closer than any two strategies which do not. A forecast horizon is a point in time such that if the time horizon is truncated at or beyond this time, an optimal initial policy for the truncated problem will be optimal for the infinite horizon problem as well. Forecast horizon approaches consist of solving successively longer finite horizon problems until a forecast horizon is discovered. Bean and Smith [1984] showed that uniqueness of the optimal initial decision is sufficient for a forecast horizon to exist. Schochetman and Smith [1986] found a stronger result in the case of continuous decision sets. They define an algorithmically optimal (AO) strategy to 4

be an optimal strategy that is the limit of optimal strategies for finite horizon problems. This concept was originally introduced by Hopp, Bean, and Smith [1984] for discrete time and finite decision sets. There such strategies were called periodic forecast horizon optimal strategies. Schochetman and Smith demonstrated that the set of AO strategies can be considerably smaller than the set of all optimal strategies. An AO initial decision is an initial decision included in an AO strategy. Schochetman and Smith showed that uniqueness of the AO initial decision is necessary and sufficient for the existence of forecast horizons. We define degeneracy in infinite horizon optimization as the presence of multiple AO initial decisions, i.e., the absence of any forecast horizon. 3 Cost Function Characteristics Using the exponential bound on cumulative costs, we can derive bounds on discounted costs. Define rM a(t) = — e r-jf Note that a(t) -- 0 as t -- oo. The following theorem is proved in Schochetman and Smith [1986]: Theorem 1 For X E II, r > y, and T > 0, (i) 0 < K(r) - K,(r,T) < a(T) (ii) 0 < k, (r- (r,T) < (T) (iii) &C.(r) - C.(r, T)| < a(T) 5

In particular, it follows that as T -+ oo, jC,(r) - C&r(r,T) -_ 0 uniformly with respect to r E II. Corollary 2 For S > T, jCr(r,S) - C,(r,T)I < a(T). Proof: From Theorem 1, 0 < Kr(r,S) - K,(r,T) < K.(r) - K1(r,T) < a(T), 0 < R,(r,S) - R,(r,T) < R,(r) - R,(r,T) < a(T). Now, C,(r,) - C.(r, T) = K.(r, S) - K(r, T) - (R((r,S) - R (r, T)). Thus -a(T) < -(Ro(r,S) - R(r,T)) < C-(r,S) - C,(r,T) < K(r,S) - kr(r,T) < a(T). m Theorem 3 IC*(r) - C*(r,T) I < a(T). Proof: From Theorem 1, C.(r) < C1(r,T) + a(T), Vr E II. Taking the infimum on the left, we get C*(r) < C,(r,T)+ a(T), VE E n. Taking the infimum on the right yields C*(r)- C*(r,T) < a(T) 6

Similarly, from Theorem 1, C,(r) > Cr(r,T) - a(T), Vr E n. Taking the infimum on the right and then the left yields C (r) - C*(r,T) > -a(T). m 4 Continuity Results In order to demonstrate the existence of forecast horizons and develop an algorithm for finding them, we will require discounted costs to be continuous. Schochetman and Smith [1986] prove the following lemma: Lemma 4 Fix T > 0. Suppose the real sequence {rk} converges to r monotonically from the left or right. Then the sequence of functions {er-'t} of t converges uniformly to the function e-rt on [0,T]. Schochetman and Smith [1986] prove the following theorem for continuous strategy spaces subject to an additional assumption on the cost functions. We extend it to the discrete case. Theorem 5 The functions C,(r, T), Kf,(r,T), Ri(r,T) of (r, r) are continuous. Proof: Suppose rn -_ in II and rn -> r in (0, ). Suppose the theorem is false. Then 3e > 0 and a subsequence {(r, rm)} of {(rn, rn)} such that K.rn (rmT) - Kr(rT) > ~, m = 1,2,... 7

Since rm -- r, 3 a subsequence that converges from either the left or the right. There exists a subsequence of this that converges monotonically to r from either the left or the right. Call this subsequence {rk}. Then we have {(rk, rk)} - {((r,r)} and,(rk,T) - k(r,T)[ > E, k = 1,2,... Since rk - r monotonically, we have e -kt e-'r uniformly on [0,T] by Lemma 4. Hence max e-" - e -rt 0 as k -+ oo. O<t<Tr Then K, (r, T) - Kr (r, T)\ = | etdK, (t) - ertdKr(t) IT k rT T) < (e t-e-t dK, (t) + ertdKK (t)- e-tdKf(t) Let Var(f(-), [a, b]) denote the total variation of the function f on the interval [a, b]. Then r (e -rkt ert) dK, r(t) < max (e-kt- ert) Var (K, (.), [0, T]) - 0 O<t<T since max (e-rt ert) -0 o<t<T and Var(K^,(.),[0,T]) < Meit, Vk. For the second term, note that since rk --?r, Wk and Xr agree over an ever increasing time frame. Let Tk be the time covered by the decisions on which they agree. Tk -~ oo 8

as k -, oo. Then Tk > T for k > KT. Hence T fT i e-rtdK,,^ (t)- e-rtdK,,(t) = 0, k > KT. Thus rk(rk,T) - Kr (r,T)| - 0 as k -4 oo. This is a contradiction. The claim for RP(t) is proved similarly and thus follows for Cr(t). U The next two theorems are proved as in Schochetman and Smith [1986] using Theorems 1 and 5. Theorem 6 The functions Cr(r), K,(r), R,(r) of (r, r) are continuous. Theorem 7 Suppose.rn -, ir, rn - r, and Tn -+ oo as n -+ oo. Then CXn(rn,Tn) - C,(r) as n -, oo. The same is true for K and R. The following lemma is proved in Bean and Smith [1984], but is not stated explicitly there. We prove it here for the sake of completeness. Lemma 8 Let {Tn} be a sequence of positive times such that Tn -+ oo as n -+ oo. For each n = 1,2,3,..., let W" E fl*(Tn). Then 3 a subsequence {xrk} of {In} and an element r* E II* with the property that rk - 7r* as k -> oo. This property holds for any convergent subsequence of {irn}. Proof: Since II is compact, 3 a convergent subsequence {(k} of {(rn. Let a' = limkoo rk. By definition, Ck (Tk) < C.- (Tk). Taking limits of both sides, lim Crk(Tk) _ lim C&-(Tk) = Cr'. k —oo k-*oo 9

By Theorem 7, lim Crk(Tk) = Cr. k —oo But r* E II* = C,, = C,.. Hence r' E II*. ~ Theorem 9 C* (r) = limT-oo C (r, T). Proof: Follows from Theorem 7 and Lemma 8 as proved in Schochetman and Smith [1986].. 5 Forecast Horizon Results The existence of a forecast horizon and the algorithm for identifying it are based on the bounds on the discounted costs derived in Section 3. Lemma 10 allows us to restrict the set of potentially optimal initial decisions based on their minimum cost up to a finite time horizon. Theorem 11 expresses a stopping criterion for identifying a forecast horizon. Theorems 12 and 13 state that the stopping criterion will be satisfied if the optimal initial decision is unique. For the remainder of the paper we assume r > 7 is fixed. Define:,C*(T) = min{C(T) lr e I, r, = ri}. iC* = min{Cr E IIn,r = r1}. These minima exist since {Tr E nI| = 7rl} is compact. C(T) = min{iC*(T)i: i T n; (T))}. C =min{iC* li' nl)}. 10

Lemma 10 If for any strategy x, C*(T) - C*(T) > 2a(T) then *- ~ I1*(S) for all S>T. Proof: For each S > T, and Vr E II, C,(S) < C,(T) + a(T), by Corollary 2. Then by taking infimums on the right and then the left, C*(S) < C*(T) + a(T). We also have C*(S) > C*(T) - a(T) by Corollary 2. Then, by subtracting, C*(S) - C*(S) > C,(T) - C*(T) - 2a(T) > 0.. We generalize two theorems of Bes and Sethi [1985] to continuous time. Theorem 11 If C(T) - C*(T) > 2a(T) and HI(T) = {Ir(T)} then l (S) = {X1(T)} for all S > T. Proof: Suppose for S > T, 3rS E HI with IrS 7r 1 (T). By definition, C,.s (T) > C(T). Therefore, Cs (T) - C*(T) > 2a(T). Then by Lemma 10, irS ~ II;(S). Hence II;(S) C Il*(T). Theorem 12 If H1 = {Xr} then 3T such that VS > T, Il1(S)= {or}. 11

Proof: Suppose not. Then 3 a sequence {Tn} such that T, -- oo and one can find r E Il*(Tn) with?r': r;. By Lemma 8, 3 a subsequence {irk} of {r"n} such that rk - * E n*. Then, for k sufficiently large, 4r = -l. But *1 =;r, which is a contradiction. - Now let b(t) be any function satisfying b(t) > 0, for t > 1, and b(t) O- 0 as t -+ oo. Bes and Sethi [1986] prove the following result for discrete time and discrete decision sets. Schochetman and Smith [1986] extend it to continuous time and continuous decision spaces. The version here applies to continuous time and discrete decision sets. Theorem 13 If tIITI = 1, then there exists a positive integer N satisfying C(N) - C*(N) > b(N). Proof: Since In = 1, 3r* E n* and To such that H; = {X;} and ir(T) = r1,T > To (by Theorem 12). Suppose C(T) - C*(T) < b(T), VT > To (in particular, for all integers N > To). For each N, 3rN such that CrN = C(N) and 7N $ r. Then 0 < CN - C*(N) < b(N) V integers N > To. Since I is compact, we can extract from {rN} a subsequence {7rNn} such that n -X ir. Then 3no sufficiently large such that n > no implies N, > To so that 0 < Cfn(Nn) - C*(N) < b(Nn). But b(Nn) -> 0, CrNn (Nn) -. C. by Theorem 7, and C*(Nn) - C* by Theorem 9. Then C, = C*. Hence Xr E II and trl = -r;. Since limn.oo Nn = r, 3nl such that Vn > nl, p(rNn",) <. Thus r" = l = r;, which is a contradiction.. We can sum up our results so far: 12

Theorem 14 If HI is a singleton then 3T > 0 such that T is a forecast horizon. Proof: Follows from Theorems 11, 12, 13, with b(t) = a(t). u Theorem 14 was also proved by Bean and Smith [1984]. But from Theorems 11, 12, and 13 we also get an algorithm guaranteed to find a forecast horizon whenever IHII* = 1: Algorithm (1) Set T -1. (2) Solve the T-horizon problem to get Tl(T),C*(T), and C(T). (3) If I/L*(T)I = 1 and C(T) - C*(T) > 2a(T) then STOP: T is a forecast horizon. Else set T - T + 1 and go to (1). Theorem 15 If HI is a singleton then the Algorithm will terminate with an infinitehorizon optimal initial decision for finite T. Proof: Follows from Theorems 11, 12, 13.. 6 Degeneracy Resolution The results in Section 5 show that we can find a forecast horizon if the optimal initial decision is unique. Degeneracy can occur if there is more than one optimal initial decision. In this section we present two approaches to resolving degeneracy that follow from the results of the previous section. One is an exact algorithm for finding an optimal initial decision by perturbing costs. Together with the algorithm in Section 5, this constitutes the first method guaranteed to terminate in finite time when solving any of 13

a general class of infinite horizon problems. The second approach is the calculation of a forecast horizon with the guarantee that any finite horizon optimal first decision has infinite horizon discounted cost nearly as small as that of the infinite horizon optimal first decision. We make some additional definitions: G={l: E n}. ( = (C1, 2,...,,Ek) is a perturbation vector. Let C ) (t) = C(t) + i, t > 0, if r1 = \ri; that is, add an instantaneous, discrete cost of Ei for Ilr at time 0. C6(r), Ca', C"*, C' are defined as for Cr(t). II6* = {Xr: r = argminIE C4(r)} = the set of optimal strategies for the perturbed problem. II* = the set of initial decisions corresponding to r E II*. G = {: 4 e T}. e = C6 - C6* = the minimum penalty for choosing a suboptimal initial decision in the perturbed problem. Adjust M if necessary so that K,(t) < (M + )et R,(t) (M + )et, for t > 0. Lemma 16 Let e be such that ei < e Vi, and ei: ej for i s j. Then G_ C G and GIl = 1. 14

Proof: Suppose j ~ G. Then for any i E G, C*(j) - C*(i) > c. Then C" (j) - C*(i) = C*(j) - C*(i) + ej - ei > e + ej - ei. But 0 < Ei < e Vi = \e~ - EI < e. Hence, C`*(j) - CE*(i) > 0. Therefore, j V G6. Thus G C G. Now suppose i E Go. Suppose j E G' also. Then i E G and j E G. Hence C*(i) = C*(j). But ei ej => C*(i) 0 C'*(j). This is a contradiction. ~ Theorem 17 Let costs be perturbed as in Lemma 16. Then the Algorithm will terminate in finite time. Proof: After the perturbation, If* is a singleton. The result follows from Theorem 15. - Because calculating a requires all the data over the infinite horizon, this perturbation scheme cannot be implemented. We are currently working on an algorithm to implement these ideas without calculating a directly. But a consequence of Lemma 10, we can find a near-optimal forecast horizon. Let P(T) = {iliC*(T) - C*(T) < 2a(T)} be the set of potentially optimal initial decisions. By Lemma 10, if for any T, j V P(T), then Wr is not an optimal initial decision for the infinite horizon problem. We say that T as an c-forecast horizon if Co.(s) - C* < e for all S > T and all r*(S) E *(S). Theorem 18 If TE >, 1 log 4rM then Te is an c-forecast horizon. r-1 (,-~),T 15

Proof: For any 7r, Cr < C&(T) + a(T) by Theorem 1. Also, C* > C*(T) - a(T) by Theorem 3. For any S > T, let irs E fI*(S). Then?r = 7r\ for some i E P(T) by Lemma 10. Then C, - C* < C,(T) + a(T) - C*(T) + a(T) < 4a(T) 1 4rM < Eiff T > -log 4rM. r - -7 (r- )C Note that by Theorem 3, C* < a(0). Let p = = be the error expressed as a proportion of the maximum possible total discounted cost. Then Te = /% log 4. Thus the c-forecast horizon length grows proportionally to the log of the accuracy desired and inversely proportionally to the discount rate used. These properties are consistent with those of the E-forecast horizon found by Bean and Smith [1985] for capacity expansion problems. 16

References Bean, J. and R. L. Smith [1984], "Conditions for the Existence of Planning Horizons," Mathematics of Operations Research, Vol. 9, pp. 391-401. Bean, J. and R. L. Smith [1985], "Optimal Capacity Expansion Over an Infinite Horizon," Management Science, Vol. 31, pp. 1523-1532. Bes, C. and S. Sethi [1986], "Concepts of Forecast and Decision Horizons: Applications to Dynamic Stochastic Optimization Problems," Technical Report. Charnes, A., J. Dreze and M. Miller [1966], "Decision and Horizon Rules for Stochastic Planning Problems: A Linear Example," Econometrica, Vol. 34, pp. 307-330. Hopp, W., J. Bean and R. L. Smith [1984], "Non-Homogeneous Markov Decision Processes," Technical Report 84-24, Department of Industrial and Operations Engineering, The University of Michigan, Ann Arbor, MI 48109. Submitted to Operations Research. Lundin and Morton [1975], "Planning Horizons for the Dynamic Lot Size Model: Zabel vs. Protective Procedures and Computations Results," Operations Research, Vol. 23, pp. 711-734. Modigliani, F. and F. Hohn [1955], "Production Planning Over Time and the Nature of the Expectation and Planning Horizon," Econometrica, Vol. 23, pp. 46-66. Schochetman, I. and R. L. Smith [1986], "Infinite Horizon Optimization Over Continuous Policy Spaces," in preparation. 17