OPTIMAUTY CONDITIONS FOR MATCH-UP STRATEGIES IN STOCHASTIC SCHEDULING John R. Birge Department of Industrial and Operations Engineering The University of Michigan Ann Arbor, MI 48109-2117 M.A.H. Dempster Department of Mathematics, Statistics and Computing Science and School of Business Administration Dalhousie University Halifax, Nova Scotia, Canada B3H 3J5 and Balliol College, Oxford University Oxford, England OX1 3BJ Technical Report 92-58 November 1992

Optimality Conditions for Match-Up Strategies in Stochastic Scheduling John R. Birge* Department of Industrial and Operations Engineering The University of Michigan Ann Arbor, Michigan, USA 48109 M. A. H. Dempstert Department of Mathematics, Statistics and Computing Science and School of Business Administration Dalhousie University Halifax, Nova Scotia, Canada B3H 3J5 and Balliol College, Oxford University Oxford, England OX1 3BJ Abstract: Practical scheduling problems require periodic decisions that consider future random phenomena. They can be modeled as dynamic stochastic optimization problems. In this paper, we present conditions for optimal solutions of a general class of these problems. The optimality conditions are in turn used to obtain the asymptotic optimality of strategies that match up with a turnpike strategy. We present examples of these models and discuss their implications. Keywords: stochastic programming, optimality conditions, turnpike solutions, stochastic scheduling, match-up. * This author's work was supported in part by Office of Naval Research Grant N0001486-K-0628, by Dalhousie University and the Natural Sciences and Engineering Research Council of Canada during a visit in the Department of Mathematics, Statistics, and Computing Science, by the National Research Council under a Research Associateship at the Naval Postgraduate School, Monterey, California, and by the National Science Foundation under Grant ECS 88-15101. tf Now at Department of Mathematics, University of Essex, Colchester, England C04 3SQ. This author's work was supported in part by the Natural Sciences and Engineering Research Council of Canada under Grant A-5489. 1

1. Introduction Practical scheduling problems require periodic decisions that consider future random phenomena. Much of the scheduling literature, however, considers problems that are deterministic and static (see, e.g., Graves [1981], Rinnooy Kan [1976] and Dempster, et al. [1982]). Discussions of stochastic scheduling problems with machine disruptions have generally considered single machines and limited models (Glazebrook [1984], Pinedo and Ross [1980], Mittenthal [1986]). An approach for dealing with multiple machines was introduced in Bean and Birge [1986] for single disruptions or disruptions that are well-spaced apart. In this paper, we provide a justification for this match-up scheduling approach, elaborated in Bean et al. [1991], through a general stochastic model. This model includes a convex objective function and decisions at discrete time intervals. It is similar to the continuous time model in Solel [1987] (see also Dempster and Solel [1987]) but allows us to treat a wider class of problems. Our main results are optimality conditions that are used to obtain conditions for optimal cyclic policies and the asymptotic optimality of match-up policies. We motivate these results through examples from scheduling. Our results in fact apply to more general stochastic optimization models and extend the developments discussed in Dempster [1988]. They relate most closely to Flam [1983,1985]. We extend Flam's results by relaxing his conditions of interior optimal solutions and uniform convexity of the objective function. Our work is also related to the deterministic work of McKenzie [1976], the control results of Kushner [1972] and Arkin and Evstigneev [1979], and the finite horizon results in Rockafellar and Wets [1983] and Hiriart-Urruty [1982]. In our model, we assume a data process, w:= {ct: t = 0,...} in a (canonical) probability space (Q., E,/ ). We also assume a decision process x:= {xt: t = 0,...} such that x is a measurable function x:'w - x(w). The space of the decision processes is the space of essentially bounded functions, L~:= Loo(f x N, Z x P(N), p x #; An), where P is the power set and # is counting measure. Associated with the data process is a filtration 2

IF:= {Et}~~1, where Et:= a(wt) is the a-field of the history process t:= {Wo,...,w} and the Et satisfy {0,Q} C Eo C C E. A fundamental property of the decision process at time t is that it must only depend on the data up to time t, i.e. xt must be Et-measurable. An alternative characterization of this nonanticipative property is that xt = IE{xtlEt} a.s., t = 0,..., where IE{I-Et} is conditional expectation with respect to the a-field Et. Using the projection operator It: z- TZ:= IE{zlEt},t = 0,..., on L', this is equivalent to (I- IIt) = O, t= 0,.... (1) We then let KV denote the closed linear subspace of nonanticipative processes in L'o and denote by II:= (IIo, II1,...) the projection operator from Ln onto A. Our general optimization model is to find 00 infxegIE E ft(w, Xt(w), Xt+i ()), (2) t=o where "IE" denotes expectation with respect to E. We use the notation xt and ft to denote respectively xt and ft as functions of w, i.e. as random entities. Expression (2) then becomes 00 infxEIEE ft(xt, Xt+li), (3) t=0 with objective F(x):= IE E o ft(xt, xt+i). We assume in (3) that the objective components ft are proper convex normal integrands (see Rockafellar [1976]) with the following additional property: Assumption 1: For any y = (xt,xt+i), there exists 7 > 0 (independent of t) such that for 7r E Aft(y) C (L'o)*, the Banach dual space of Lno, and for all w, either (a) XrE A ft(w), or 3

(b) there exists z such that 7r E oft(z) and ft(z) + 7r(w - z) < ft(w) - 711z - yll a.s. (4) for all t > 0. ~ Note that uniform convexity implies the above assumption, which allows nonstrict convexity involving a von Neumann facet (see Figure 1). We use this more general assumption here because it allows us to use the common scheduling objectives which involve linear tardiness and earliness penalties. These objective functions are discussed in more detail in Section 4. f A t von Neumann facet Wb z Wa ^ ( xt+ ) Figure 1. Illustration of the convexity of a realization of the objective function ft implied by Assumption 1, Cases a) and b). Figure 1. Illustration of the convexity of a realization of the objective function ft implied

mality of cyclic policies and the asymptotic optimality of match-up strategies. Scheduling examples are discussed in Section 4. Section 5 presents our conclusions. 2. Optimality Conditions In general, the objective in (3) is infinite. We can avoid this difficulty by defining a policy x*:= {xo,x;,...} as (weakly) optimal, as in McKenzie [1976], if it is not overtaken by any other policy, i.e. if there does not exist x' such that limsup E [ft(x, x+) -ft(x, x+l)] <-6 (5) T -— = OO T-600 t=O where e > 0. We also assume that the objective functions satisfy a condition ensuring that no infinite terms are present in the sum in (5). Assumption 2: For any t and 6 < oo, there exists e < oo such that lIxtIl < 6 a.s. implies JE ft(xt,xt+l) > -e and l|xt+lll < e a.s. for xt+l feasible. I Given Assumption 2, we can subtract a constant from each ft and not change the weak optimality of x*. By setting this constant equal to the expected objective value in each period, we obtain an infimum of 0 in (3). We therefore impose Assumption 2 so that without loss of generality we can assume a finite infimum in (3). Our goal is to construct a sequence of prices supporting the objective terms in (3). These price supports provide the optimality conditions in the following theorem. Theorem 1. Suppose Assumption 2 holds and that x* is optimal in (3) with finite infimum, and (a) (nonanticipative feasibility) For any x E dom F (i.e., such that IE Et=o ft(xt,Xt+l) < (o), the projection of x onto, IIx, is such that IE ^oft(lI^xir t+ixt-+) < oo, (b) (strict feasibility) For some x E K, such that E,=0of,(xt,xt+i) < oo, there exists 6 > 0 such that for all IIy - xl|| < 6,y E L, IE Et=o ft(Yt, Yt+l) < 0. 5

(c) (finite horizon continuation approximation) There exists x' such that for all Tk in some sequence {T1, T2,... }, and, for any x E dom F, (xTk xTk+X, xk...) is also feasible, and the transition cost to x' is such that IIE[fTk-l(XTk-l, XTk)+fTk(XTk, xTk+l)]l -- 0 as - oox and |IE [fTk-l(xTk- 1, XT+fTk (xTk, XT+)]| > E[fTk-l(XTk-1,XTk)+fTk(XTk,XTk+l)]| for k = 1,... Then, x* is optimal with given initial conditions x0 if and only if there exist Pt E L?(E),t = 0,..., such that (i) pt is nonanticipative, i.e. pi = IE{ptlEt} a.s. fort = 0,..., (ii) IEo(fo(xo, xi)-poxo +pixl) is a.s. minimized by xi:= xl overxl = IE{x1 I E }, and, for t > 0, IEt(ft(xt,xt+j) -ptxt + pt+lxt+l) is a.s. minimized by (xt,xt+l):= (x,x x*) over Xt = IE{xt | Et} and xt+l = IE{xt+l | Et+l}, and (iii) E Pt(Xk - k) - 0 as tk -* oo, for all x E dom F. Proof: The proof follows from the theory of convex normal integrands (Rockafellar [1971]) as in Flam [1983] and [1985]. To prove sufficiency, given (i) - (iii) and denoting IE{IlEt} by IET[.], for any x E N' with xo:= x0, we have tk-i tk-i IE Z[ft(x*,x+j) -ft(xt, xt+l)] = IE > IEt[ft(xt,x t+1)-ft(xt,xt+i)] t=O t=O tk (6) <IE IE Et[pt(x - xt) - pt+i(x*+l - Xt+i)] t=O = Ept,, (xtk,, - ). Note that the last quantity in (6) approaches 0 as tk -+ oo by (iii). Hence, x* is weakly optimal, and optimal in (3) by the finiteness assumption. Now, to prove necessity of (i)-(iii), suppose x* is optimal in (3) and (a) - (c) hold. Define Tk -1 oo FTz'(x):= IE[ ft(xt,Xt+l) + fT,(xT,, xT,+l) + ft(x,, x,+l)] t=0 t=Tk +l 6

for x' as in (c). Note that the objective in (3) is F (x) and that it follows from (c) that F~~(x) < oo implies FTk(x) < oo for all k = 1,.. Suppose the functional xTk E aFTk(x*Tk) C (Ln )*, the Banach dual space of LZ, where F T(xT):= E[E=o ft(xt, xt+) + fTk(xTk,XTk+1)], which is the expectation of a convex normal integrand by assumption. By Rockafellar [1971, Corollary 1B], there exists a representation of wTk as 1Tk = TTk + a7Tk, where Tk e FTk(x*) n L (7) (i.e., there exists pTk such that rTk (xTk) = lEpTkxTk for all xTk E L" ) and aTk is a singular functional corresponding to a purely finitely additive measure (see Yosida and Hewett [1952]) such that 0Tk (x*Tk) = sup{Tk (XTk) I FTk(xT ) < 00}. (8) We now construct an appropriate 7Tk. For t = 1,...,Tk - 1, let t*_-(xt):= IE[ft-_(x*_-,xt) + ft(xt,x*+1)] and 7rt E 9_*-(z(x*). We can again apply Rockafellar's result as in (7), so that 7rt = rt + at, for?rt E Ln and singular ot. By the definition of subgradient, 7rt(xt - x;) + IE[ft-.(x_-1,x;) +ft(x,x+1)] < IE[ft,_(x1_, xt) + ft(xt,x*+1)], (9) for all essentially bounded xt. We also define TTk as a subgradient of IE[fTk -1 (xk_, x ) + f,(xT, xT,+lx)], defined as a function of xTk and evaluated at x*, involving a trajectory x' whose existence is guaranteed by (c), so that we also have 7r TZ(x -xT)+IE X < E[fT-l(xX 4 x Xi,xz) + fTk (XTk, XTk + 1 )], (10) for all essentially bounded XT,. We set rTk:= (7r1,...,Tk-1,rTTk). Combining (9) and (10), we have irT~ e &FTi(x*Tk). Using (c), F(x) = limko FT(x). Thus, 7t(xt - X) + klm (Tk(XTk - Xk)) + F(x*) < F(x), (11) t=O

for all x E L". From (10) and (c), the measure 7rTk must also satisfy lim T7rk(XTk -XT)I < lim |IE{[fTk-l(xvkT,XTk)+fTk(XTkX )] k oo ~~~~~~~k-co ~(12) -[fTk-l(Tk-1,XTk) + fT,(xTk, Tk+1)]}- =0, for all x E dom F. (To see this, add and subtract fTk - 1 (X 1,xTk) + fTk (xT - 1,x ) within the expectation on the right hand side.) From (11), (12) and, since F(x) = +00oo if x 0 dom F, 00 Z rt(xt - x) + F(x*) < F(x) (13) t=o for all x E L. Therefore, ir = (rl, 7r2,...) E 8F(x*). Now, define 6(x):= { if xt = IE{xtEt} a.s. oo otherwise. Let the projection from L" onto M be II, where Ht(xt) = IE[xt I| ], t = 0,1,... Then, 6 is the indicator function of the linear subspace M = ker(I - II) of Loo. Solving (3) is then equivalent to finding infXEL- (F + 6)(x) Since x* is optimal in (3), 0 E 8(F + b)(x*). Hence, -r E 96(x*), i.e. Xr E [R(I- H)]~, the subspace of (Ln )* of annihilators of the range of the complementary projection (I - II) of II which (in an abuse of notation) we shall denote by'-L. By (8) and (9), Tk-1 Tk-1 (x'T) - E [t (x;)] + aT(x (X) > [ZT(x)] +T:(xTk, (14) t=l t=l for all x E dom FTk D dom F. From (12) and (b), |lrT(xk - XTk)l -- 0. From (14), then, ^[at(x)]> Z[at(xt)] Vx E dom F. (15) t=O t=O 8

We can now follow the proof technique in Flam [1985] to obtain an equivalent subgradient without singular components. The (Banach) conjugate of the operator II is denoted by II*. Define 7':= I*Pa. We will show that T E Li. Since 7r E (Al)o, we have a(x) = r(x) for all x E K'-. Hence, for any x E -AlV, we can write u(x) as Et=0 f p(w,t)x(w,t)P(dw) where p E Ln. Let (., t):= IE{p(-,t) Et} for all t, so that p E Ln. Then, for all x E L"o, 00 oo = Es(Hx) = IEp(,)E[ p(., t) |(xt])|t] t=t=o 00 t=o 00 = IEE p1(., t)x(., t) = IEpx. t=O Hence, T E Ln. Let 7r*:= r+ r E Ln and note that, since - = ar on KVA, 7r* E (L')o. It also follows as in Flam [1985], since [(I - H*)r*]x = ir*[(I - I)x] = 0 for all x E L" and (n*7r)t = IE(7T|Et), that t = IE(7r* I Et) a.s. Vt. (16) From (15) and nonanticipative feasibility, Y(x*) = u(x*) > a(nx) = W(x), (17) for all x E dom F. We can again apply Rockafellar's result on convex normal integrands as in (7) so that r E OF Cl Ln. It follows that 3r*= 77 + E FnL. (18) Since lr* now excludes the singular multipliers, by (b), we can decompose lr* into multipliers for each component ft of the objective (by applying Rockafellar [1971, p. 442] to the finite sum 0 to T to decompose into a finite sum for almost all a, then decomposing this sum using Rockafellar [1969, Theorem 23.8], and noting the vanishing T terms as in (11), or, more directly, by using the results in Ioffe and Levin [1972]). Thus, 7r*(x) = IE ~_O pt(x, Xt+l), 9

where pi be written with two components as p' = (p, -Pi) E 9ft(x*, x*+) a.s., where p'(xt,xt+i) = p* xt - p*+ xt+1, and p Xt t- x)pt+ (Xp+l-xt+l ) + ft(xt,xt+l) < ft(xt, Xt+1), (19) a.s. for all xt = E{xt I| E} and Xt+l = IE{xt+l I Et+1}, t = 0,1,.... Since x* minimizes ft- (x*_,- ) + ft (-,x+l) a.s., 0 E 9[ft- (x 1,x) + ft(x*,x*+i] a.s., and we may choose -_p2 + p*1 = 0 a.s. By (16), we can define p* such that pt:= IE(p*2 Et) = IE(p;1 l t), t = 0,1,..., (20) proving (i). For xt = IE{xt | Et}, IEt[p*'(xt)] -= Et[p;*]xt = p*xt. Also, for xt+i = IE{Xt+l I Et+i}, IEt[pi1(xt+l)] = IEt[IEt+i[pi1(xt+i)]] = IEt[pi+xt+i]. Now, taking conditional expectations with respect to Et in (19), we obtain lEt [p (xt-x;)-p P+li(xt+l - x+i) + ft(x;,x+1)] < lEt[ft(xt,xt+i)], a.s. for all xt = IE{xt | EX} and xt+l = IE{xt+i I Et+i}, implying condition (ii). Condition (iii) follows from (c) as in (12). * Our results extend the price support results given in Flam [1983,1985]. In Flam [1983], strict feasibility of the optimal policy is required, while the infinite horizon results in Flam [1985] require the objective to be completely separable in t. We use the finite continuation property to obtain these more general results which provide a more realistic framework for infinite horizon problems. The finite continuation property states that one can pay finite and absolutely decreasing amounts to reach a fixed finite cost path at given periods that extend to the horizon. This payment may, for example, represent the cost of joining a cyclic feasible policy. Examples of nonremovable singular multipliers without similar assumptions to ours are given in Prescott and Lucas [1972]. Nonanticipative feasibility is related to other conditions, such as relatively complete recourse as used in Rockafellar and Wets [1976], see e.g. Dempster [1988] for more details. 10

3. Turnpike Results Our goal in this section is to show that optimal solutions approach a common facet - the von Neumann facet - from any given starting condition x0. The main implication of this result is that it is asymptotically optimal to match up with a decision process that is optimal for a specific initial condition even if that initial condition changes. This result is further enhanced by our showing that if the data process is cyclic then it is asymptotically optimal to return to an optimal cyclic policy even if other conditions temporarily obtain. These results lend justification to the match-up scheduling policy in Bean, et al. [1991] and extend the deterministic results in Bean and Birge [1986]. Discussion of specific scheduling policies is however postponed to the next section. In this section, we again use a general stochastic optimization model that may be applied in a variety of contexts. Proposition 1. Given Assumptions 1 and 2 and Conditions (a) - (c) in Theorem 1, let X* be the set of solutions (xt*,x*+) that are minimal in (ii) of Theorem 1 for p* a set of optimal supporting prices given the initial condition xo and let x' be an optimal decision process given the initial condition x'. Then, for any e > 0 and 6 > 0, there exists T < oo, such that, for all t > T, P{w: inf(x;,X:+)Ex I(x't, Xt+l) - (xt,x +1)ll > e} < 6 (21) Proof: By summing terms in (ii) of Theorem 1 for x* and x' for each ft, t > 1, we obtain IE<t{(p+1 - p+1)(x+l ) - (x - p,)(x' - x,)} < 0, (22) where p* is the price support process corresponding to x*, an optimal solution given x0, and p' is the sequence of prices supporting x'. Next let Vt(x ):= (p - pt)(x'~ - x), so that (22) becomes IEtvt+i(x;+l) < IEtvt(x;) for all t > 1, (23) 11

for any (xT,x*+l) E XT. By (iii) of Theorem 1, limt_oo JEvt(x*) = 0 for any xT. Note however that if (21) does not hold, then, for some e > 0, 6 > 0, for t = Tk,..., P{w: inf(xX;l)x(xt, X+l) - (Xt+,xt+x)11| > 6} > 6. (24) From Assumption 2, since (p9,-P +i) E Oft((x*, x*+1)), either (p*,-P+) E aft((x,,,x/+i)) (in which case (x',x'+1) E X*) or there exists (zt,Z+l1) E Xt such that f(zt, t+i) -Pt Zt P t+i <+ f(x,x+) )- Ptxt * t+1Xt+l (25) -7||(zt, Zt+i)- (x,x+1)Il a.s.. By the definition of Xt for given x0, for any (x,xt+) we have that IE{f(x*,xt+l) - ptx; + p*+lxt+l} = IE{ft(zt,Zt+i) - p;zt + p+lt+l}. Hence, for any T, summing (25) and taking expectations yields T T IE[ ft(x;,xIt+)] -IE[Eft(x,x'+1)] < lE((pO - p)(xo- x)) t=O t=O - IE((PT - PT)(XT - xT)) (26) T 7-Ell|(zt, z,+l)-( (Xl x',+). t=O For t = Tk, i = 1,..., using the Markov inequality and (24), we have that IEII(zt,Zt+i) - (xxt+,)I > cP{inf(x.,x)E I(xt,xt+,) - (x,x+x)ll > C} > e6. Since lEvt(x*) — O 0, the right-hand side in (26) therefore approaches -oo, contradicting the finiteness of F~(x'). U Theorem 2. Under the conditions of Proposition 1, we may conclude that as t -- oo, inf(x;,x+ )Ex ||(xt,x+l) -(x,x|+|)11 -- 0 a.s. (27) Proof: Proposition 1 states that m^.x~x~ ~12 - (x;,x+)| - 0 12

in probability. Since the ft, t = 0,1, 2,..., are proper convex normal integrands, the infima over the X* are actually achieved. Hence, we may use the vector form of Skorohod's Theorem (see, e.g., Billingsley [1979], Theorem 29.6, pp. 337-340) to adjust the paths of the data process so that (27) holds. ~ For Theorem 2 to be fully applicable, we would like to have a method for determining an optimal policy for some initial state so that the policy of matching up to that strategy can be implemented. This determination is simpler if we can show that cyclic policies are optimal. In this case, only a single cycle needs to be analyzed to determine the turnpike optimal policy. An example of such a policy is given in the next section. In this development, we follow a similar approach to Arkin and Evstigneev [1979]. We first assume that the data process has a left tail, i.e. that wo can be interpreted as...,w 1,w0. An alternative is to assume some type of Markovian property of the data process (see Arkin and Evstigneev). The data process is assumed to be cyclic with cycle k if the measure p is invariant with respect to the k-period forward shift operator Tk where Tkw = aw such that wt:= Wt+k, i.e., Wt = Tkwt = Wt+k a.s. It follows that we may define Tk Et:= Et for t = 0,..., k - 1. We also assume that the objective is invariant with respect to Tk so that ft+k(Tkxt,Tkxt+l) = ft(xt,xt+i) a.s., where Tkxt(w):= zt(Tkw). In this context, x is a cyclic policy if Xt+k = Tkxt a.s.. Corollary 1. Given conditions (a) - (c) of Theorem 1 and a cyclic data process with cycle k, then there exists a weakly optimal policy with cycle k for any initial condition x0. Proof: First consider the problem: k-2 inf E( ft(xt, xt+i) +fk-l(xk-1,Tkxo)) (28) t=O subject to xt nonanticipative in Ln for t = 0,...,k - 1 and where TkXo(w):= xo(T). We can interpret (28) as inf E(f(xo,...,xk-1)), (29) 13

over the same decisions. Program (29) is a finite horizon problem as in Flam [1985], hence there exist supporting prices p *,t = 0,..., k - 1, such that an optimal policy x* almost surely minimizes over nonanticipative policies in L': IEt(ft(xt,xt+l)-p Pxt + p+1xt+), t = 0,..., k-2, (30) and IEk-1(fk-l(Xk-1, Tkxo) - P*_lXk-1 + p*xo). (31) Define x((w):= x*(Tkw) and p*(w):= p*(T'lw). From the cyclic property of the data process, IE[x:(w) I k] = IE[x(Tkw) I Tko] = IE[x (w') I So] (32) = x0(w') = z (Tkw) = x(w) a.s.. Similarly, p: is nonanticipative. Continue to define Pt*+Lk(w) = p (TLk1 w) for 0 < t < k- 1 and L = 1,..., and xt+k(w) = (TLkw) for 0 < t < k- 1 and L = 1,. We then have conditions (i) and (ii) in Theorem 1 for optimality by applying (30), (31), and repetitions of (32). By summing the terms in (ii) up to r = Lk and noting that IExo = IExr, we obtain IE[E(ft (x',x,+) -ft(x;,x1+)] > 0 (33) t=o for all r. Inequality (33) implies that x* is weakly optimal by definition. U 4. Application Examples The primary concern of our'analysis is in application to stochastic scheduling problems. Our first example is a small problem that involves a general type of scheduling objective and meets our requirements for cyclic optimal policies. We show that these conditions are met and determine the optimal cyclic policy. Assume a single machine and that units of a single item produced on the machine. The state Xt of the process is the amount of inventory (positive or negative) of the item 14

at the beginning of each period. One unit is demanded in each period t. One unit may be produced in regular time in each period. An additional unit may be produced with overtime if the machine is available. The uncertainty is in the availability of the machine. We assume that the machine is available with probability 2/3 independent of the time period. This model is a small stochastic version of the general deterministic example in Bean et al. [1991]. The objective includes a penalty (equal 10) for any backordered products, unit holding costs for any positive inventory, regular time production at cost 2 per unit, overtime production at cost 4 per unit, and a possible outside vendor purchase at cost p per unit. The result is that, without production, the state moves from xt to xt+l = xt - 1 from t to t + 1. The cost of this transition is xt if xt > 0 or -10xt if xt < 0. The value of xt+l given Xt is the production/purchase decision. Production is only possible if the machine is available. The cost is 2(xt+l + 1 - xt) if xt > xt+l > xt - 1 or 2 + 4(xt+l - xt) if Xt + 1 > Xt+l >_ Xt. If the machine is not available, then outside purchases are possible at a cost p(Zt+l + 1- Xt). The objective function is then f-10t + p((t+l + 1- xt) if xt < 0 and -1 < xt+1 - xt < 0, ft(Wt+l, tz Xt+i) = zt + p(zt+i + 1 - Xt) if xt > 0 and -1 < xt+l - xt < 0, co0o otherwise, (34) if wt+1 corresponds to "machine unavailable." If wt+1 corresponds to an available machine, we have (-10 + 2 + 4(xt+l - t) if xt < 0 and 0 < xt+l - xt < 1, -10xt + 2(t+l + 1 - t) if xt < 0 and -1 < xt+l-Xt < 0, ft(Wt+l, zz, zt+l) = <t + 2 + 4(xt+l - zt) if St > 0 and 0 < 4t+l -z < 1, |zt + 2(xt+l + 1 - zt) if St >_ 0 and -1 _< t+l -- Zt < O, oo otherwise. (35) Note that the functions in (34) and (35) are convex and satisfy Assumption 1. Finite 15

values over the infinite horizon can be obtained as in Section 2 by subtracting constants in each period corresponding to the expected values of contributions from weakly optimal schedules. The data process here is Markovian, so we have the conditions for an optimal stationary strategy. We seek an optimal solution to (28) for k:= 1. The penalty term p is used as the price of obtaining one unit of the product elsewhere. For finite p, an optimal solution exists in Loo. To simplify the analysis, we consider optimal solutions for p -- oo. Results for finite p can be easily obtained as perturbations (depending on p) from the results below. The solution of these problems only requires the determination of additional parameters corresponding to the levels at which outside products should be purchased. This problem can be interpreted as an abstract linear program. Suppose wt+1 = 0 if the machine is unavailable, and wt+l = 1 if the machine is available. Problem (28) becomes: inf IE[10x- + x+ + 2y1+1 + 4y2+1 2 t+l] s. t. x+ -x- X= xt a.s., E[xt Et] = Xt a.s., (36) Xt+l = Txt a.s., xt + Yt+i + Yt+1 = xt+ - 1 a.s., 0 < x+,O < x,,0 X < y l, =,0 < Y2+1 < 1,t+,= a.s. An optimal solution occurs at an extreme point of the feasible region of stationary distributions for xt. In our case, the solution occurs at a discrete distribution with atoms spaced units apart. There are two classes of these extreme solutions:61: solutions corresponding to Xt+l = Xt + 1 if the machine is available and xt < a and Xt+l = Xt- 1 otherwise. ~2: solutions corresponding to Xt+l = xt + 1 if the machine is available and Xt <a C- 1, Xt+l = Xt if the machine is available and ca > Xt > ca - 1, and xt+l = Xt - 1 otherwise. Note that a stationary solution cannot have a wider range of regular time production of 16

units since we can never achieve any higher xt+l than a. The optimal value of a and the corresponding optimal strategy remain to be determined. For ~1, we consider a:= 3. This yields the stationary distribution: Xt machine status probability 4 up 1 4 down 1 3 up 1 3 down 1 2 up - 2 down 6 1 up 1 1 1 down 32 - up (1)(1)+ - I down 1) The expected objective value is then 67. For ~2, the stationary distribution for a:= 3 is: St* machine status probability 3 up 1 1 3 down 6 2 up 2 down l 1 1 up 1 1 1 down 24 - I up ( 1)( )1+1 - I down (() Here the expected value is 6-. 17

To show the optimality of x*, consider changes from the values of Y+li, which we can consider as nonbasic variables in (36). Note that, since x* is stationary, we must have IE[y1+l + y2+1] = 1. Since IE[y+l] = 2/3, it cannot be increased. The only possible changes are then to decrease Y+1i and to increase y2+1. We never would reduce Y+li before reducing y+i to 0, so consider first an E reduction of Y+i for any subset of Qt(3) of measure 6 > 0 where x4(w) = 3 for all w E Qt(3). Correspondingly, we must increase y2+ on Qt(3). The resulting stationary distribution xt < x* a.s. with P{x' = xt - e} = 6 and P{x' = x*} = 1/3 - 6. So, the difference in expected inventory cost from x* to x' is 6(-eP{x* > 0} + 10eP{x* < 0}) = b6(3/8). Thus, both inventory and production costs would increase in this case. The only other alternative change in the strategy is to keep y 1+1= 1=1 and to translate the distribution. In this case, however, any e increase in a yields an expected cost increase of j5c, and, as before, an c decrease in a yields an expected cost increase of -e. With a finite penalty p the results are similar. For example, if p = 1000, then the same a value is optimal and the distribution is truncated at 1 = 35. The results of the previous sections then show that it is asymptotically optimal to match up with this strategy regardless of our initial conditions. A match-up strategy to accomplish this is simply not to produce if inventory is greater than 3, to produce one unit for inventory equal 3, to produce in overtime one plus 6 for inventory of 2 + 6 for 6 > 0, and to produce two units for inventories of 2 or less. This first example includes an objective with tardiness or shortage penalties and holding costs that are common in scheduling models. Here the randomness was confined to the machine availability. Other scheduling models with varying degrees of uncertainty can also be incorporated into our general stochastic optimization model. For example, consider a k-period cyclic model in which each of several commodities i = 1,..., n has a random processing time p(i), a random release date r(i), a random due 18

date d(i), and a penalty weight of wi for every period after the due date in which processing is not completed. The evolution of the data process is such that p(i) is known given d(i). We also assume that r(i) < d(i) < k a.s. Decisions are the amount of processing performed on each item i in each period t. We may consider the state of each item to be reduced by the processing requirement at each due date, i.e. the state is changed by the total processing xt+((i) - xt(i) in period t if t is not a due date (t: d(i)) or by xt+l(i) + p(i)- xt(i) if t is a due date (t = d(i)). The decisions are constrained so that no processing can occur if an item is not released (t < r(i)) and processing in each period on each item is at most one (for simplicity). Other restrictions on feasible processes appear in an indicator function 6(w,xt,xt+1) that considers all resource availabilities. The only costs in this model are due to tardiness. The penalty wi is charged in each period for every unit of item i overdue (xt(i) < 0). The total tardiness cost at time t given w is then -i=l wi(-xt(w, i))+. The objective is to minimize the expected total tardiness. The single period objective contribution is then: n ft(xt,xt+l):= Eft(x(i),,Xt+l(i)) + (wxtxt+l), (37) i=1 where - wiXt(i) if -l{t=d(i)}p(i) < xt+(i) - Xt(i) _< l{t>r(i)} - l{t=d(i)}p(i),xt(i) < 0 a.s., ft(Xt(i),Xt+i(i)) = 0 if -l{t=d(i)}p() < Xt+l(i) - xt(i) < 1{t>r(i)} - 1{t=d(i)}P(i),xt(i) > 0 a.s., 0oo otherwise. The data process is assumed to determine the availability of the resources (such as machines, labor and tools) for processing all commodities. We allow the 6 indicator term to represent feasibility generally by assuming a value of "0" if xt+l is feasibly reached from Xt and "oo" otherwise. For example, suppose that the processing of each commodity i requires a resource m(i) where m(i) E {1,..., M}, the set of resources, and each resource can process at most one unit during a time interval if available and cannot process anything 19

if unavailable. In this case, and independent component Ot of Wt can be interpreted as an M-vector of ones and zeroes corresponding to availability and unavailability. We then have 0 if E-=1 l=m(i)(Xt+l(i) - Xt(i) + P(i)l{t=d(i)}) 6(w, Xt,Xt+i) = < < t(j) for j = 1,...,M, oo otherwise. Other constraints can also be represented in this way. Our only requirement is that 6(w,, ) must be convex. This model is the basic minimum expected weighted tardiness multiprocessor scheduling problem. With the convexity assumption, it meets the criteria for optimality, asymptotic stability and cyclic optimality given in Theorems 1 and 2 and Corollary 1. In some cases, an optimal stationary distribution for this model follows a path between disruptions such that an optimal match point is achieved as quickly as possible (cf. Bean et al. [1991]). To see this, let x* be an optimal cyclic turnpike schedule in X*, the set of optimal cyclic turnpike schedules. Assume that some state x' < x* a.s. is the initial state instead of x*. Let x' be an optimal trajectory given x'. We wish to show that a trajectory x that starts at x' and matches up with x* at the earliest feasible t can be constructed with the same objective value as x'. Theorem 3. Suppose x* is optimal from xo in the above tardiness model, find infxE,xo=xo a.s. IEZ=oft(xt,xt+l) given x4 > x' a.s., with ft defined in (37), and that there exists a feasible solution x such that xo = x' a.s., and XT = xT a.s. for some < 00o, then there exists an optimal solution x' given x' such that xt = x*,t > r a.s.. Proof: The existence of the feasible path x that matches up to x* at r is equivalent to the availability of n l l{j=m(i)} (0 (, i) - (w, i)), (38) i=1 additional units of capacity on every machine j beyond the processing required on machine j under policy x* for a.e. A, j = 1,..., M, before time r. This is true because we implicitly 20

assume that if x;(w, i) > x[(w, i), then r(w, i) < 0. Given this surplus capacity, consider x' optimal given x' and construct xt, 0 < t < r such that for all i = 1,..., n and all w: (i) t(, i):= (w, i) if x (w,i) < x (wi), (ii) X, (W, i) = (w, ) if(,,i) > xS (, i) Note that x:= x' A x*, i.e. a minimum of the two optimal policies in the sense of coordinatewise order in t, so that xt < Xx, t = 1,...,r - 1, X, = and no processing takes Xt(i) beyond x*(i) for any i. Since no more processing capacity than (38) has been used on any mahcine j = 1,..., M, x defined by (i) and (ii) can be defined as Xt:= x4 for t > r. Note also that for x' and x* nonanticipative, x is nonanticipative by construction. Suppose that x is not optimal from the initial point x'. Then, there exists a sequence 6x:= {Sx0:= 0, x1, 6x2,...} such that x' = x + 6x a.s. and F(x + 6x) < F(x). We will show that this implies that there exists 6x* such that F(x* +6 x*) < F(x*) in contradiction to optimality. First, define F(x'):= i= Fi(x'(i)), where Fi is the contribution of commodity i to the overall objective, and Fi(x'(i)):= IE[gi(w,x(i))], where gi(w,x(i)):= t=0 ft(xt (i), Xt+ l(i)). Now suppose Fi(x'(i)) < Fi(x(i)) (39) for some i (i = 1,..., n). Then, there exists Qi C Q, P{WQ} > 0, such that gi(w, x'(w, i)) < gi(w, x(w, i)) for all w E Qf. This in turn implies that there exists r(w, i) < r such that for all w E Qi, zT(,i)(w ) = X (,i)(, i) = r(W i)( i), and 00 00 yE /( t(w,(i,), t+ (w, i))< fti(w, t(w i,), t+(w i)) (40) t=r(w,i) t=r(w,i) since R:= x' A x* up to period r. Note that 6xr(,i)(w, i) = 0 and define the feasible new policy x* (w, i) + 6x* (w, i) for w 6 ~i by setting 21

and setting 6x (w, i):= 0, t = 1, 2,... for w E Q \ Qi. It follows, for all w E Qi, using (40) that 00 gi(w, x*(w, i) + 6x;(w, i))= fti(w,, x(w, i), *+1(w, i)) t=O + f'(Wi)- l(w, X;(,i)(w, i), XT(,.i)(w, i) + 6x.(,,i)(w, i)) 00 + E f(w,x,(w,i)+6S(w,,i),zt+(w,)i)+6x+((w,i)) t=Tr( (,i) 00oo < ft(w, Xt(W,i), Xt+l(o, i)) t=O + f'(w, i) l(w, Xr(,,i)(w, i), X(,i)(w, i) + x(W,,i)(w, i)) + ~ fti(w,x *(w,i), *+l(, i)) t=r(W,i) = gi(wA, *(w, i)) + fT(Wi).(W X(wi)- (w i), XT(W,)( i) + 6XT(Wi)(W, i)) -- (Wi)-I(X XT(uWi)-l(W i) ( X (W i)) = gi(w, x*(W, i)), since x.(W,,i)(w, i) = 6x,(,i=)(w,i) = O. This implies that Fi(x*(i) + bx*(i)):= E[gi(w, x*(i) + 6x*(i)] < Fi(x*(i)):= IE[gi(w, x(i))]. Making this construction for each i for which (39) holds and summing over i = 1,...,n gives F(x* + 6x*) < F(x'), the required contradiction. This tardiness model can be extended further by including other constraints in the 6 term of the objective. The only requirement is that the overall objective remain convex. Hence we need only require convex feasible sets. 5. Conclusions We have presented a general stochastic optimization model with discrete time periods and infinite horizon. We showed that optimality conditions in terms of supporting prices 22

can be derived for the model. This conclusion extends previous results by allowing extreme value optimal solutions and a more general convexity condition. These generalizations make the model more applicable to a variety of problems. They allow for characterizations of turnpike and cyclic optimal solutions that justify match-up strategies. In particular, we showed how these results apply to scheduling. In these problems, a cyclic schedule can be developed "off-line" using whatever computing resources are available and the cyclic policy is implemented whenever conditions evolve as predicted by the model. When other conditions obtain (for example, a catastrophic failure of a machining center), it is optimal to match- up with the cyclic schedule sometime in the future. Determining this match-up strategy can be accomplished "on-line" while the remainder of the cyclic schedule is maintained. In practice, the match-up point can be set so that matching up is both computationally feasible and cost effective. References V.I. Arkin and I.V. Evstigneev (1979). Stochastic Models of Control and Economic Dynamics. Nauka, Moscow. (In Russian). English translation by E.A. Medova-Dempster and M.A.H. Dempster. Academic, Orlando (1986). J.C. Bean and J.R. Birge (1986). Match-up real-time scheduling. In Real-Time Optimization in Automated Manufacturing Facilities, R.H.F. Jackson and A.W.T. Jones, eds., NBS Special Publication 724, National Bureau of Standards, pp. 197-212. J.C. Bean, J.R. Birge, J. Mittenthal, and C.E. Noon (1991). Match-up scheduling with multiple resources, release dates and disruptions. Operations Research 39, 470-483. P. Billingsley (1979). Probability and Measure. Wiley, New York. M.A.H. Dempster (1988). On stochastic programming: II. Dynamic problems under risk. Stochastics 25, 15-42. M.A.H. Dempster, J. K. Lenstra, and A.H.G. Rinnooy Kan (1982), Deterministic and 23

Stochastic Scheduling. Reidel, Dordrecht. M.A.H. Dempster and E. Solel (1987). Stochastic scheduling via stochastic control. In Proceedings 1st World Conference of the Bernoulli Society, K. Krickeberg and A. Shiryaev, eds. VNU Science Press, Utrecht, pp. 783-788. S.D. Flam (1983). Asymptotically stable solutions to stochastic problems of Bolza. Chr. Michelsen Institute Report, Fantoft, Norway. S.D. Flam (1985). Nonanticipativity in stochastic programming. Journal of Optimization Theory and Applications 46, 23- 30. K.D. Glazebrook (1984). Scheduling stochastic jobs on a single machine subject to breakdowns. Naval Research Logistics Quarterly 31, 251-264. S.C.Graves (1981). A review of production scheduling. Operations Research 29, 646-675. J-B. Hiriart-Urruty (1982). Extension of Lipschitz integrands and minimization of nonconvex integral functionals: Applications to the optimal recourse problem in discrete time. Probability and Mathematical Statistics 3, 19-36. loffe and Levin (1972). Subdifferentials of convex functions. Trans. Moscow Math. Soc. 26, 1-72. H.J. Kushner (1972). Necessary conditions for discrete parameter stochastic optimization problems. In: Proceedings Sixth Berkeley Symposium Math. Stats. and Probability, J. Neyman, ed. U. California, Berkeley, pp. 667-685. L. McKenzie (1976). Turnpike theory. Econometrica 44, 841-864. J. Mittenthal (1986). Single Machine Scheduling Subject to Random Breakdowns. Ph.D. dissertation, The University of Michigan, Ann Arbor, Michigan. M.L. Pinedo and S.M. Ross (1980). Scheduling jobs subject to nonhomogeneous Poisson shocks.Management Science 26, 1250-1257. E.C. Prescott and R.E. Lucas, Jr. (1972). A note on price systems in infinite dimensional space. International Economic Review 13, 416-422. 24

R.T. Rockafellar (1971). Integrals which are convex functionals, II. Pacific Journal of Mathematics 39, 439-469. R.T. Rockafellar (1976). Integral Functionals, Normal Integrands and Measurable Selections. Lecture Notes in Mathematics 543. Springer-Verlag, Berlin. R.T. Rockafellar and R. J-B. Wets (1976). Nonanticipativity and L'- martingales in stochastic optimization problems. Mathematical Programming Study 6, 170-187. R.T. Rockafellar and R. J-B. Wets (1983). Deterministic and stochastic optimization problems of Bolza type in discrete time. Stochastics 10, 273-312. A.H.G. Rinnooy Kan (1976). Machine Scheduling Problems: Classification, Complexity and Computations. Martinus Nijhoff, The Hague. E. Solel (1986). A Dynamic Approach to Stochastic Scheduling via Stochastic Control. Ph.D. dissertation, Dalhousie University, Halifax, Nova Scotia. K. Yosida and E. Hewett (1952). Finitely additive measures. Transactions of the American Mathematical Society 72, 46- 66. 25