OPTIMAL MATCH-UP STRATEGIES IN STOCHASTIC SCHEDULING

John R. Birge
Department of Industrial & Operations Engineering
University of Michigan
Ann Arbor, MI 48109-2117

M.A.H. Dempster
Department of Mathematics
University of Essex
Colchester, England CO4 3SQ

Technical Report 92-52
October 1992

Optimal Match-Up Strategies in Stochastic Scheduling

John R. Birge*
Department of Industrial and Operations Engineering
The University of Michigan
Ann Arbor, Michigan, USA 48109

M.A.H. Dempster†
Department of Mathematics
University of Essex
Colchester, England CO4 3SQ

Abstract: Practical scheduling problems require the consideration of random events that may affect planned starting and completion times. In general, these problems are quite difficult to analyze. In a previous paper, however, general conditions were developed for turnpike optimal strategies and for the asymptotic optimality of strategies that match up with a turnpike strategy. In this paper, we present examples of these models, make comparisons with deterministic approaches, and discuss implications for practical scheduling.

Keywords: stochastic programming, optimality conditions, turnpike solutions, stochastic scheduling, match-up.

*This author's work was supported in part by the National Science Foundation under Grant ECS 88-15101.
†On leave from Dalhousie University. This author's work was supported in part by the Natural Sciences and Engineering Research Council of Canada under Grant A-5489.

1. Introduction

Practical scheduling problems require periodic decisions that consider future random phenomena. Much of the scheduling literature, however, considers problems that are deterministic and static (see e.g. Graves [1981], Rinnooy Kan [1976] and Dempster et al [1982]). Discussions of stochastic scheduling problems with machine disruptions have generally considered single machines and limited models (Glazebrook [1984, 1987, 1991], Pinedo and Ross [1980], Birge and Glazebrook [1991], Mittenthal [1986]). An approach for dealing with multiple machines was introduced in Bean and Birge [1986] for single disruptions or disruptions that are well spaced apart. In a previous paper (Birge and Dempster [1992]), we provided a theoretical justification for this match-up scheduling approach, discussed for deterministic systems in Bean et al [1991], through a general stochastic model. In this paper, we elaborate on the conditions that lead to these results and on examples of their application, and make comparisons to deterministic models.

In contrast to many scheduling models, in which nonconvexities immediately appear through disjunctive constraints, our basic model relaxes this requirement to obtain a convex region and a convex objective function with decisions at discrete time intervals. It is similar to the continuous time model in Solel [1987] (see also Dempster and Solel [1987]) but allows us to treat a wider class of problems. We will, however, show the general nature of these results in our examples. Related results for stochastic optimization models appear in Dempster [1988], Flam [1983, 1985], Kushner [1972], Arkin and Evstigneev [1979], Rockafellar and Wets [1983] and Hiriart-Urruty [1982]. Turnpike theory for the deterministic case may be found in McKenzie [1976].

To set the stage for our model, we assume a data process w := {w_t : t = 0, ...} in a (canonical) probability space (Ω, Σ, μ).
We also assume a decision process x := {x_t : t = 0, ...} such that x is a measurable function x : ω ↦ x(ω). The space of the decision

processes is the space of essentially bounded functions, L_∞ := L_∞(Ω × ℕ, Σ × P(ℕ), μ × #; ℝⁿ), where P(ℕ) is the power set of ℕ and # is counting measure. Associated with the data process is a filtration F := {Σ_t}, where Σ_t := σ(wᵗ) is the σ-field of the history process wᵗ := {w_0, ..., w_t} and the Σ_t satisfy {∅, Ω} ⊂ Σ_0 ⊂ ... ⊂ Σ. A fundamental property of the decision process at time t is that it must depend only on the data up to time t, i.e. x_t must be Σ_t-measurable. An alternative characterization of this nonanticipative property is that x_t = E{x_t | Σ_t} a.s., t = 0, ..., where E{· | Σ_t} is conditional expectation with respect to the σ-field Σ_t. Using the projection operators π_t : z ↦ π_t z := E{z | Σ_t}, t = 0, ..., on L_∞, this is equivalent to

    (I − π_t) x_t = 0,  t = 0, ....                                  (1)

We let N denote the closed linear subspace of nonanticipative processes in L_∞ and denote by Π := (Π_0, Π_1, ...) the projection operator from L_∞ onto N. Our general optimization model is to find

    inf_{x∈N} E Σ_{t=0}^∞ f_t(w, x_t(w), x_{t+1}(w)),               (2)

where "E" denotes expectation with respect to μ. We use the notation x_t and f_t to denote respectively x_t and f_t as functions of w, i.e. as random entities. Expression (2) then becomes

    inf_{x∈N} E Σ_{t=0}^∞ f_t(x_t, x_{t+1}),                        (3)

with objective F(x) := E Σ_{t=0}^∞ f_t(x_t, x_{t+1}). We assume in (3) that the objective components f_t are proper convex normal integrands (see Rockafellar [1976]) with the following additional property:

Assumption 1: For any y = (x_t, x_{t+1}), there exists γ > 0 (independent of t) such that for π ∈ ∂f_t(y) ⊂ (L_∞)*, the Banach dual space of L_∞, and for all w, either (a) π ∈ ∂f_t(w),

or (b) there exists z such that π ∈ ∂f_t(z) and

    f_t(z) + π(w − z) ≤ f_t(w) − γ |z − y|  a.s.                    (4)

for all t ≥ 0. ∎

Note that uniform convexity implies the assumption, which allows nonstrict convexity involving a von Neumann facet. We use this more general assumption here because it allows us to use the common scheduling objectives that involve linear tardiness and earliness penalties. These objective functions are discussed in more detail in §3.

Section 2 presents the main results from Birge and Dempster [1992] on optimality conditions, turnpike results concerning the optimality of cyclic policies, and the asymptotic optimality of match-up strategies. Section 3 provides motivation for the results in §2 through examples in stochastic scheduling. Section 4 presents our conclusions and directions for extensions to nonconvex problems.

2. Optimality Conditions

The proofs of all these results may be found in Birge and Dempster [1992]. In general, the objective in (3) is infinite. We can avoid this difficulty by defining a policy x* := {x*_0, x*_1, ...} as (weakly) optimal, as in McKenzie [1976], if it is not overtaken by any other policy, i.e. if there does not exist x' such that

    lim sup_{T→∞} E Σ_{t=0}^T [f_t(x'_t, x'_{t+1}) − f_t(x*_t, x*_{t+1})] ≤ −ε,   (5)

where ε > 0. We also assume that the objective functions satisfy a condition ensuring that no infinite terms are present in the sum in (5).

Assumption 2: For any t and δ < ∞, there exists ε < ∞ such that ‖x_t‖ ≤ δ a.s. implies E f_t(x_t, x_{t+1}) ≥ −ε and ‖x_{t+1}‖ ≤ δ a.s. for x_{t+1} feasible. ∎
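Before continuing, the nonanticipativity projection π_t of (1) can be made concrete on a finite scenario set: conditioning on Σ_t amounts to averaging over the bundle of scenarios that share the same history through time t. A minimal sketch with invented data (the scenario set, probabilities and variable names are ours, not from the paper):

```python
# Four equally likely scenarios; each entry is the data history (w_0, w_1).
scenarios = [(0, 0), (0, 1), (1, 0), (1, 1)]
prob = {s: 0.25 for s in scenarios}

def project(x, t):
    """Nonanticipativity projection (pi_t x)(s) = E{x | Sigma_t}(s):
    average x over all scenarios sharing the history s[:t+1]."""
    out = {}
    for s in scenarios:
        bundle = [u for u in scenarios if u[:t + 1] == s[:t + 1]]
        mass = sum(prob[u] for u in bundle)
        out[s] = sum(prob[u] * x[u] for u in bundle) / mass
    return out

# A time-0 decision that illegally peeks at w_1 ...
x_peek = {s: float(s[1]) for s in scenarios}
# ... is repaired by the projection, which depends on w_0 alone:
print(project(x_peek, 0))   # every scenario maps to the bundle average 0.5
```

A decision that already depends only on w_0 is a fixed point of `project(., 0)`, which is the sense in which (1) characterizes the subspace N.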

Given Assumption 2, we can subtract a constant from each f_t without changing the weak optimality of x*. By setting this constant equal to the expected objective value in each period, we obtain an infimum of 0 in (3). Thus, without loss of generality, we assume a finite infimum in (3) in the sequel. The first result we give is that there exist prices supporting the objective terms in (3). These price supports provide the optimality conditions in the following theorem that allow decomposition of the conditions by time period.

Theorem 1. Suppose Assumptions 1 and 2 hold and that x* is optimal in (3) with finite infimum, and

(a) (nonanticipative feasibility) for any x ∈ dom F (i.e., such that E Σ_t f_t(x_t, x_{t+1}) < ∞), the projection of x onto N, Πx, is such that E Σ_t f_t(Π_t x_t, Π_{t+1} x_{t+1}) < ∞;

(b) (strict feasibility) for some x ∈ N such that E Σ_t f_t(x_t, x_{t+1}) < ∞, there exists δ > 0 such that for all ‖y − x‖ ≤ δ, y ∈ L_∞, E Σ_t f_t(y_t, y_{t+1}) < ∞;

(c) (finite horizon continuation approximation) there exists x' such that, for all T_k in some sequence {T_1, T_2, ...} and for any x ∈ dom F, (x_0, ..., x_{T_k}, x'_{T_k+1}, x'_{T_k+2}, ...) is also feasible, and the transition cost to x' is such that |E[f_{T_k}(x_{T_k}, x'_{T_k+1}) − f_{T_k}(x_{T_k}, x_{T_k+1})]| → 0 as k → ∞ and |E[f_{T_k}(x'_{T_k}, x'_{T_k+1})]| ≤ |E[f_{T_k}(x_{T_k}, x_{T_k+1})]| for k = 1, ....

Then x* is optimal with given initial conditions x_0 if and only if there exist p_t ∈ L¹(Σ), t = 0, ..., such that

(i) p_t is nonanticipative, i.e. p_t = E{p_t | Σ_t} a.s. for t = 0, ...;

(ii) E_0(f_0(x_0, x_1) − p_0 x_0 + p_1 x_1) is a.s. minimized by x_1 := x*_1 over x_1 = E{x_1 | Σ_1} and, for t > 0, E_t(f_t(x_t, x_{t+1}) − p_t x_t + p_{t+1} x_{t+1}) is a.s. minimized by (x_t, x_{t+1}) := (x*_t, x*_{t+1}) over x_t = E{x_t | Σ_t} and x_{t+1} = E{x_{t+1} | Σ_{t+1}}; and

(iii) E p_{t_k}(x_{t_k} − x*_{t_k}) → 0 as t_k → ∞, for all x ∈ dom F. ∎

These optimality conditions characterize optimal solutions which approach a common facet (the von Neumann facet) from any given starting condition x_0. The main implication of this result is that it is asymptotically optimal to match up with a decision process that is optimal for a specific initial condition even if that initial condition changes. This result is elaborated by showing that, if the data process is cyclic, then it is asymptotically optimal to return to an optimal cyclic policy even if other conditions temporarily obtain. These results justify the match-up scheduling policy in Bean et al [1991] and extend the deterministic results in Bean and Birge [1986]. Discussion of specific scheduling policies is, however, postponed to the next section. In this section, we continue to use the general stochastic optimization model (3), which may be applied in a variety of contexts.

Proposition 1. Given Assumptions 1 and 2 and Conditions (a)-(c) in Theorem 1, let X* be the set of solutions (x*_t, x*_{t+1}) that are minimal in (ii) of Theorem 1 for p* a set of optimal supporting prices given the initial condition x_0, and let x' be an optimal decision process given the initial condition x'_0. Then, for any ε > 0 and δ > 0, there exists T < ∞ such that, for all t ≥ T,

    P{ω : inf_{(x*_t, x*_{t+1}) ∈ X*} ‖(x'_t, x'_{t+1}) − (x*_t, x*_{t+1})‖ > ε} < δ.   (6)

Theorem 2. Under the conditions of Proposition 1, we may conclude that, as t → ∞,

    inf_{(x*_t, x*_{t+1}) ∈ X*} ‖(x'_t, x'_{t+1}) − (x*_t, x*_{t+1})‖ → 0  a.s.         (7)

For Theorem 2 to be fully applicable, we would like to have a method for determining an optimal policy for some initial state so that the policy of matching up to that strategy can be implemented. This determination is simpler if we can show that cyclic policies are optimal. In this case, only a single cycle needs to be analyzed to determine the turnpike optimal policy. An example of such a policy is given in the next section.

In this development, we follow an approach similar to Arkin and Evstigneev [1979]. We first assume that the data process has a left tail, i.e. that w_0 can be interpreted as ..., w_{−1}, w_0. An alternative is to assume some type of Markovian property of the data process (see Arkin and Evstigneev). The data process is assumed to be cyclic with cycle k if the measure μ is invariant with respect to the k-period forward shift operator T_k, where T_k w = w' such that w'_t := w_{t+k}, i.e., w'_t = T_k w_t = w_{t+k} a.s. It follows that we may define T_k Σ_t := Σ_t for t = 0, ..., k − 1. We also assume that the objective is invariant with respect to T_k, so that f_{t+k}(T_k x_t, T_k x_{t+1}) = f_t(x_t, x_{t+1}) a.s., where T_k x_t(w) := x_t(T_k w). In this context, x is a cyclic policy if x_{t+k} = T_k x_t a.s., and (3) becomes

    inf_{x∈N} E ( Σ_{t=0}^{k−2} f_t(x_t, x_{t+1}) + f_{k−1}(x_{k−1}, T_k x_0) ).   (8)

Corollary 1. Given conditions (a)-(c) of Theorem 1 and a cyclic data process with cycle k, there exists a weakly optimal policy with cycle k for any initial condition x_0. ∎

3. Application Examples

The primary concern of our analysis in §2 is its application to stochastic scheduling problems. Our goal in this section is to show how the optimality conditions given there can lead to solutions of specific stochastic scheduling problems and to characterizations of optimality that can aid in constructing heuristic solutions. We also wish to illustrate the differences between these problems and deterministic scheduling problems and how these differences are reflected in practical solution quality. Typically, a scheduling problem consists of a fixed set of known jobs, a fixed set of available resources, and costs associated with the time each job completes. The decision is then to sequence the jobs on the resources to minimize the overall cost. This leads to various combinatorial formulations; see e.g. Dempster et al [1982].
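The static combinatorial view can be sketched in a few lines: the cost of a fixed sequence is evaluated after the fact, and optimization runs over job permutations. The instance data below are invented purely for illustration:

```python
from itertools import permutations

# A tiny deterministic instance: (processing time, due date, weight) per job.
jobs = [(2, 3, 1.0), (1, 2, 1.0), (3, 9, 1.0)]

def weighted_tardiness(order):
    """Total weighted tardiness of processing jobs in the given sequence."""
    t, cost = 0, 0.0
    for k in order:
        p, d, w = jobs[k]
        t += p                        # completion time of job k
        cost += w * max(t - d, 0)     # tardiness penalty for job k
    return cost

best = min(permutations(range(len(jobs))), key=weighted_tardiness)
print(best, weighted_tardiness(best))   # the earliest-due-date sequence wins
```

This brute-force enumeration is exactly the combinatorial burden that the dynamic, convex formulation of the next paragraphs is designed to avoid.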

Our approach here is to view the problem dynamically so that decisions occur at each instant in time. On the surface this may appear more difficult because of the number of time intervals potentially involved, but our avoidance of prespecified sequencing rules has advantages apparent in the convex optimization problem in (3) and its properties. To illustrate some of these advantages, we first consider a small problem (which also appears in Birge and Dempster [1992]) that involves a general type of scheduling objective and meets our requirements for cyclic optimal policies. We show how the cyclic optimality conditions for (8) are met and then contrast this problem with a deterministic counterpart.

Assume a single machine on which units of a single item are produced. The state x_t of the process is the amount of inventory (positive or negative) of the item at the beginning of each period. One unit is demanded in each period t. One unit may be produced in regular time in each period. An additional unit may be produced with overtime if the machine is available. The uncertainty is in the availability of the machine. We assume that the machine is available with probability 2/3 independently in each time period. This model is a small stochastic version of the general deterministic example in Bean et al [1991].

The objective includes a penalty (equal to 10) for any backordered products, unit holding costs for any positive inventory, regular time production at cost 2 per unit, overtime production at cost 4 per unit, and a possible outside vendor purchase at cost p per unit. The result is that, without production, the state moves from x_t to x_{t+1} = x_t − 1 from t to t + 1. The cost of this transition is x_t if x_t ≥ 0 or −10 x_t if x_t < 0. The value of x_{t+1} given x_t reflects the production/purchase decision. Production is only possible if the machine is available. The cost is 2(x_{t+1} + 1 − x_t) if x_t > x_{t+1} ≥ x_t − 1 or 2 + 4(x_{t+1} − x_t) if x_t + 1 ≥ x_{t+1} > x_t.
If the machine is not available, then outside purchases are possible at a cost p(x_{t+1} + 1 − x_t). The objective function is then

    f_t(w_{t+1}, x_t, x_{t+1}) =
        −10 x_t + p(x_{t+1} + 1 − x_t)   if x_t < 0 and −1 ≤ x_{t+1} − x_t ≤ 0,
        x_t + p(x_{t+1} + 1 − x_t)       if x_t ≥ 0 and −1 ≤ x_{t+1} − x_t ≤ 0,    (9)
        ∞                                otherwise,

if w_{t+1} corresponds to "machine unavailable." If w_{t+1} corresponds to an available machine, we have

    f_t(w_{t+1}, x_t, x_{t+1}) =
        −10 x_t + 2 + 4(x_{t+1} − x_t)   if x_t < 0 and 0 < x_{t+1} − x_t ≤ 1,
        −10 x_t + 2(x_{t+1} + 1 − x_t)   if x_t < 0 and −1 ≤ x_{t+1} − x_t ≤ 0,
        x_t + 2 + 4(x_{t+1} − x_t)       if x_t ≥ 0 and 0 < x_{t+1} − x_t ≤ 1,     (10)
        x_t + 2(x_{t+1} + 1 − x_t)       if x_t ≥ 0 and −1 ≤ x_{t+1} − x_t ≤ 0,
        ∞                                otherwise.

Note that the functions in (9) and (10) are convex and satisfy Assumption 1. Finite values over an infinite horizon can be obtained as in §2 by subtracting constants in each period corresponding to the expected values of contributions from weakly optimal schedules. The data process here is Markovian, so we have the conditions for an optimal stationary strategy. We seek an optimal solution to (3) with a cycle of length k := 1.

The penalty term p is used as the price of obtaining one unit of the product elsewhere. For finite p, an optimal solution exists in L_∞. To simplify the analysis, we consider optimal solutions for p → ∞. Results for finite p can be easily obtained as perturbations (depending on p) from the results below. The solution of these problems only requires the determination of additional parameters corresponding to the levels at which outside products should be purchased.

This problem can be interpreted as an abstract linear program. Suppose w_{t+1} = 0 if the machine is unavailable, and w_{t+1} = 1 if the machine is available. The one period

optimization for (3) becomes:

    inf E[10 x_t⁻ + x_t⁺ + 2 y¹_{t+1} + 4 y²_{t+1}]
    s.t.  x_t⁺ − x_t⁻ = x_t  a.s.,
          E[x_t | Σ_t] = x_t  a.s.,
          x_{t+1} = T x_t  a.s.,                                               (11)
          x_t + y¹_{t+1} + y²_{t+1} = x_{t+1} + 1  a.s.,
          0 ≤ x_t⁺, 0 ≤ x_t⁻, 0 ≤ y¹_{t+1} ≤ 1_{w_{t+1}=1}, 0 ≤ y²_{t+1} ≤ 1_{w_{t+1}=1}  a.s.

An optimal solution occurs at an extreme point of the feasible region of stationary distributions for x_t. In our case, the solution occurs at a discrete distribution with equally spaced atoms. There are two classes of these extreme solutions:

C1: solutions corresponding to x_{t+1} = x_t + 1 if the machine is available and x_t ≤ a, and x_{t+1} = x_t − 1 otherwise;

C2: solutions corresponding to x_{t+1} = x_t + 1 if the machine is available and x_t ≤ a − 1, x_{t+1} = x_t if the machine is available and a ≥ x_t > a − 1, and x_{t+1} = x_t − 1 otherwise.

Note that a stationary solution cannot have a wider range of regular time production of units, since we can never achieve any higher x_{t+1} than a + 1. The optimal value of a and the corresponding optimal strategy remain to be determined. For C1, we consider a := 3. This yields the stationary distribution

    x_t   machine status   probability
     4    up               1/6
     4    down             1/12
     3    up               1/4
     3    down             1/8
     2    up               1/8
     2    down             1/16
     1    up               1/16
     1    down             1/32
    −l    up               (1/32)(1/2)^l,  l = 0, 1, ...
    −l    down             (1/64)(1/2)^l,  l = 0, 1, ...

The expected objective value is then 6 17/32. For C2, the stationary distribution for a := 3 is:

    x_t   machine status   probability
     3    up               1/3
     3    down             1/6
     2    up               1/6
     2    down             1/12
     1    up               1/12
     1    down             1/24
    −l    up               (1/24)(1/2)^l,  l = 0, 1, ...
    −l    down             (1/48)(1/2)^l,  l = 0, 1, ...

Here the expected value is 6 1/24. To show the optimality of x*, consider changes from the values of y^i_{t+1}, which we can consider as nonbasic variables in (11). Note that, since x_t is stationary, we must have

E[y¹_{t+1} + y²_{t+1}] = 1. Since E[y¹_{t+1}] = 2/3, it cannot be increased. The only possible changes are then to decrease y¹_{t+1} and to increase y²_{t+1}. We would never reduce y¹_{t+1} before reducing y²_{t+1} to 0, so consider first an ε reduction of y¹_{t+1} on any subset of Ω_t(3) of measure δ > 0, where x*_t(ω) = 3 for all ω ∈ Ω_t(3). Correspondingly, we must increase y²_{t+1} on Ω_t(3). The resulting stationary distribution satisfies x'_t ≤ x*_t a.s. with P{x'_t = x*_t − ε} = δ and P{x'_t = x*_t} = 1/3 − δ. The difference in expected inventory cost from x*_t to x'_t is δ(−ε P{x*_t > 0} + 10 ε P{x*_t ≤ 0}) = δε(3/8). Thus, both inventory and production costs would increase in this case. The only other alternative change in the strategy is to keep y¹_{t+1} = 1_{w_{t+1}=1} and to translate the distribution. In this case, however, any ε increase in a yields an expected cost increase of 5ε/16 and, as before, an ε decrease in a yields an expected cost increase of 3ε/8.

With a finite penalty p the results are similar. For example, if p = 1000, then the same a value is optimal and the distribution is truncated at l = 35. The results of §2 show that it is asymptotically optimal to match up with this strategy regardless of our initial conditions. A match-up strategy to accomplish this is simply not to produce if inventory is greater than 3, to produce one unit for inventory equal to 3, to produce 1 + δ units for an inventory of 2 + δ (0 < δ < 1), and to produce two units for inventories of 2 or less.

A typical approximation to solving a stochastic problem such as (11) is to form a deterministic model by replacing the random variables with their expectations. Unfortunately, this procedure is always optimistic (see, for example, Birge and Wets [1986]) and underestimates the true cost considerably. The mean value problem corresponding to this

deterministic version of (10) is:

    inf 10 x_t⁻ + x_t⁺ + 2 y¹_{t+1} + 4 y²_{t+1}
    s.t.  x_t⁺ − x_t⁻ = x_t,
          x_{t+1} = T x_t,                                           (12)
          x_t + y¹_{t+1} + y²_{t+1} = x_{t+1} + 1,
          0 ≤ x_t⁺, 0 ≤ x_t⁻, 0 ≤ y¹_{t+1} ≤ 2/3, 0 ≤ y²_{t+1} ≤ 2/3,

which clearly has an optimal solution y¹_{t+1} = 2/3 and y²_{t+1} = 1/3 with all other variables equal to zero. The optimal value of (12) is 2 2/3, which, as mentioned, far underestimates the true optimal expected cost of 6 1/24.

The above solution to (12) is not feasible for the original stochastic problem (11) in every period, since processing is not always possible. A policy with the same expected value for y_t as this solution is possible, using y¹_{t+1} = 1_{w_{t+1}=1} and y²_{t+1} = (1/2) 1_{w_{t+1}=1}. This policy, however, results in a stationary distribution for x_t concentrated on points at intervals of 1/2 satisfying the equations:

    P{x_t = i} = (1/3) P{x_t = i + 1} + (2/3) P{x_t = i − 1/2},  i = 0, ±1/2, ±1, ±3/2, ...,   (13)

    Σ_{i=0}^∞ ( P{x_t = i/2} + P{x_t = −(i+1)/2} ) = 1.                                        (14)

Equations (13) and (14) do not have a solution in the class of discrete probability measures, since they imply a uniform distribution on {0, ±1/2, ±1, ±3/2, ...}. Hence the expected cost of the mean value solution modified for feasibility in (11) is infinite. The difference between this cost and the expected cost of the stochastic solution to (11) is called the value of the stochastic solution (see Birge [1982]), which represents the additional objective value gained by modeling the randomness in the problem explicitly. The result here says that using the mean value problem solution (modified for feasibility) is infinitely worse in the stochastic model, although the mean value problem (12) gives a deceptively low value.
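The cost figures in this example can be checked numerically. The sketch below power-iterates a lazy version of the truncated inventory chain for the two stationary policy classes (the step functions are our shorthand for C1 and C2 with a = 3), and then evolves the exact distribution of the feasibility-corrected mean value policy, whose inventory cost grows without bound:

```python
def step_C1(x, up):
    # C1: produce two units (regular + overtime) whenever up and x <= 3
    return x + 1 if (up and x <= 3) else x - 1

def step_C2(x, up):
    # C2: two units when x <= 2, one unit (regular only) at x == 3
    if up and x <= 2:
        return x + 1
    if up and x == 3:
        return x
    return x - 1

def stationary(step, lo=-60, hi=4, iters=2000):
    """Damped power iteration: the lazy chain is aperiodic and shares
    the stationary distribution of the original (periodic) chain."""
    pi = {x: 1.0 if x == 0 else 0.0 for x in range(lo, hi + 1)}
    for _ in range(iters):
        nxt = dict.fromkeys(pi, 0.0)
        for x, p in pi.items():
            nxt[max(step(x, True), lo)] += p * 2 / 3   # machine up
            nxt[max(step(x, False), lo)] += p / 3      # machine down
        pi = {x: (pi[x] + nxt[x]) / 2 for x in pi}
    return pi

def cost(pi, prod):
    # holding/backorder charge on x_t plus expected production cost
    return sum(p * ((x if x > 0 else -10 * x) + 2 / 3 * prod(x))
               for x, p in pi.items())

c1 = cost(stationary(step_C1), lambda x: 6 if x <= 3 else 0)
c2 = cost(stationary(step_C2), lambda x: 6 if x <= 2 else (2 if x == 3 else 0))
# c1 ~ 6 + 17/32 and c2 ~ 6 + 1/24, as in the text (p -> inf: no purchases)

# Feasibility-corrected mean value policy: x rises 1/2 w.p. 2/3 and falls 1
# w.p. 1/3, a zero-drift walk on the half-integer lattice (13)-(14).
dist, growth = {0.0: 1.0}, []
for t in range(1, 201):
    nxt = {}
    for x, p in dist.items():
        nxt[x + 0.5] = nxt.get(x + 0.5, 0.0) + p * 2 / 3
        nxt[x - 1.0] = nxt.get(x - 1.0, 0.0) + p / 3
    dist = nxt
    if t in (50, 200):
        growth.append(sum(p * (x if x > 0 else -10 * x)
                          for x, p in dist.items()))
print(c1, c2, growth)   # the last list is increasing: cost grows with t
```

The truncation at lo = −60 is harmless here because the stationary tails decay geometrically; the diverging `growth` values illustrate why the corrected mean value policy has infinite long-run expected cost.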

Even truncating the distribution at some point leads to extremely high storage and penalty costs.

Our first example included an objective with tardiness or shortage penalties and holding costs that are common in scheduling models, with randomness confined to the machine availability. Other scheduling models with varying degrees of uncertainty can also be incorporated into our general stochastic optimization model. For example, consider a model with several commodities i = 1, ..., n processed according to random processing times p(i), random release dates r(i) and random due dates d(i), with a penalty weight of w_i for every period after the due date in which processing is not completed. We wish to model a situation in which orders for each item arrive randomly (according to r(i)), in varying amounts (according to p(i)), and with random due dates (d(i)). The random entities r(i), p(i), d(i) correspond to sequences of times, {r(ω,i,1), r(ω,i,2), ...}, {p(ω,i,1), p(ω,i,2), ...}, {d(ω,i,1), d(ω,i,2), ...}, for each order number 1, 2, .... The data process is defined so that when the jth order for i arrives at t = r(ω,i,j), then the processing time p(ω,i,j) and due date d(ω,i,j) are also known. Thus Σ_t distinguishes r(ω,i,l), p(ω,i,l), d(ω,i,l) for 1 ≤ l ≤ j, but not for l > j.

Decisions are the amount of processing performed on each item i in each period t. Since the state of each item is reduced by the processing requirement at each due date, the total processing in period t is x_{t+1}(i) − x_t(i) if t is not a due date (t ≠ d(i,j) for any j = 1, 2, ...) or x_{t+1}(i) + p(i,j) − x_t(i) if t is a due date (t = d(i,j)). The decisions are constrained so that no processing can occur if an item is not released (t < r(i,j)), and processing in each period on each item is at most one. Other restrictions on feasible processes appear in an indicator function δ(w_t, x_t, x_{t+1}) which considers all resource availabilities. The only costs in this model are due to tardiness.
A penalty w_i is charged in each period for every unit of item i backordered (x_t(i) < 0). The total tardiness cost at time t given ω is then Σ_{i=1}^n w_i (−x_t(ω,i))⁺. The objective is to minimize the expected total tardiness.
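The per-period tardiness charge just defined is a simple convex function of the state vector; a minimal sketch (vector names are ours):

```python
def tardiness_cost(x, w):
    """Weighted backorder charge sum_i w_i * (-x_t(i))^+ for one period."""
    return sum(wi * max(-xi, 0.0) for wi, xi in zip(w, x))

# item 2 is two units backordered, item 3 on target, item 1 has surplus
print(tardiness_cost([1.0, -2.0, 0.0], [1.0, 3.0, 2.0]))   # 6.0
```

Only the negative part of each component is penalized, which is what keeps the objective piecewise linear and convex.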

The single period objective contribution is thus:

    f_t(x_t, x_{t+1}) := Σ_{i=1}^n f_t^i(x_t(i), x_{t+1}(i)) + δ(w_t, x_t, x_{t+1}),   (15)

where

    f_t^i(x_t(i), x_{t+1}(i)) :=
        −w_i x_t(i)  if −Σ_{j=1}^∞ 1_{t=d(i,j)} p(i,j) ≤ x_{t+1}(i) − x_t(i)
                        ≤ Σ_{j=1}^∞ 1_{t≥r(i,j)} p(i,j) − Σ_{j=1}^∞ 1_{t=d(i,j)} p(i,j) and x_t(i) < 0 a.s.,
        0            if the same bounds hold and x_t(i) ≥ 0 a.s.,
        ∞            otherwise.

This noncyclic model is a generalization of the model with cyclic data process (with cycle k) in Birge and Dempster [1992]. In that model, it is assumed that r(i,j+1) − r(i,j), d(i,j) − r(i,j), and p(i,j) are all identically distributed and that r(i,j) ≤ d(i,j) ≤ k a.s. for all j. The data process is assumed to determine the availability of the resources (such as machines, labor and tools) for processing all commodities. We allow the δ indicator term to represent feasibility generally by assuming a value of 0 if x_{t+1} is feasibly reached from x_t and ∞ otherwise. For example, suppose that each process i requires a resource m(i), where m(i) ∈ {1, ..., M}, the set of resources, and each resource can process at most one unit during a time interval if available and cannot process anything if unavailable. In this case, w_t can be interpreted to have several components such that the first M components form an M-vector of ones and zeroes corresponding to availability and unavailability of resources. We then have

    δ(w_t, x_t, x_{t+1}) :=
        0  if Σ_{i=1}^n 1_{j=m(i)} (x_{t+1}(i) − x_t(i) + p(i) 1_{t=d(i)}) ≤ w_t(j) for j = 1, ..., M,
        ∞  otherwise.

Other constraints can also be represented in this way. Our only requirement is that δ(w_t, ·, ·) is convex. This model is a basic multiple-processor, minimum expected weighted tardiness problem. With the convexity assumption, it meets the criteria for optimality and asymptotic

stability given in Theorems 1 and 2. In some cases, the stationary distribution for this model follows a deterministic path between disruptions, and an optimal match point is achieved as quickly as possible (cf. Bean et al [1991]). To see this, let x* be an optimal turnpike schedule in X*, the set of optimal turnpike schedules. Assume that some state x'_0 ≤ x*_0 a.s. is the initial state instead of x*_0. Let x' be an optimal trajectory given x'_0. In Birge and Dempster [1992], it is shown that a trajectory x that starts at x'_0 and matches up with x* at the earliest feasible t can be constructed with the same objective value as x' for the cyclic problem. Essentially the same proof mutatis mutandis yields this result for the current noncyclic problem.

Theorem 3. Suppose that x* is optimal from x*_0 in the tardiness model above, i.e. attains inf_{x∈N, x_0=x*_0 a.s.} E Σ_{t=0}^∞ f_t(x_t, x_{t+1}) with f_t defined in (15), that x*_0 ≥ x'_0 a.s., and that there exists a feasible solution x such that x_0 = x'_0 a.s. and x_τ = x*_τ a.s. for some τ < ∞. Then there exists an optimal solution x' given x'_0 such that x'_t = x*_t, t ≥ τ, a.s.

To illustrate the use of this result, we consider a one-machine, multiple commodity version of the model with single period objective (15). In this case, M := 1 and m(i) := 1 for i = 1, ..., n. For simplicity, we also assume that w_i := 1 for i = 1, ..., n. With these assumptions, the familiar earliest due date scheduling policy is optimal. To define this policy, suppose that r(ω,i,j(ω,t,i)) ≤ t < r(ω,i,j(ω,t,i)+1) and that d(ω,i_1,j(ω,t,i_1)) ≤ d(ω,i_2,j(ω,t,i_2)) ≤ ... ≤ d(ω,i_n,j(ω,t,i_n)), and define x_{t+1}(ω,i_l) recursively from l = 1 to l = n by:

    x_{t+1}(ω,i_l) = x_t(ω,i_l) − 1_{t=d(ω,i_l,j(ω,t,i_l))} p(ω,i_l,j(ω,t,i_l))
        + 1_{w_t(1)=1} min{ 1 − Σ_{s=1}^{l−1} ( x_{t+1}(ω,i_s) − x_t(ω,i_s)
            + 1_{t=d(ω,i_s,j(ω,t,i_s))} p(ω,i_s,j(ω,t,i_s)) ),
            p(ω,i_l,j(ω,t,i_l)) − x_t(ω,i_l) }.                                   (16)
Equation (16) then forces production to occur, up to the machine availability, in due date order on any items that have not yet reached the order quantity p(ω,i_l,j(ω,t,i_l)). This

definition implies that order j for item i is always completed before order j + 1 is released, i.e. that x_{r(i,j)}(i) ≥ 0 a.s. This assumption can be relaxed by allowing due date order to apply also to previous orders j < j(ω,t,i) that are not yet complete.

Theorem 4. An optimal solution to problem (3) with objective defined by (15), M := 1, x_0 := 0, and m(i) := 1, w_i := 1 for i = 1, ..., n, is to process items according to earliest due date of released items first, i.e. according to (16).

Proof: The optimality of this policy can be proved by assuming an optimal policy x* such that some item is processed out of due date order and showing that processing can always be shifted to an earlier due date item without increasing total cost at any horizon point under any outcome ω. Let processing occur on item i' at time t under ω although i is available, incomplete (x*_t(ω,i) < p(ω,i,j(ω,t,i))) and has an earlier due date than i'. To simplify notation and without loss of generality, we assume that t ≤ d(ω,i,j(ω,t,i)) ≤ d(ω,i',j(ω,t,i')), x*_t(ω,i) ≥ 0, x*_t(ω,i') ≥ 0, and that there exists δ > 0 such that

    p(ω,i',j(ω,t,i')) − x*_t(ω,i') ≥ δ,
    x*_{t+1}(ω,i') − x*_t(ω,i') ≥ δ,
    p(ω,i,j(ω,t,i)) − x*_t(ω,i) ≥ δ,                                  (17)
    x*_{τ+1}(ω,i) − x*_τ(ω,i) = 0  for τ = t, ..., t' − 1,
    x*_{t'+1}(ω,i) − x*_{t'}(ω,i) ≥ δ.

We assume in (17) that some point t' exists with x*_{t'+1}(ω,i) − x*_{t'}(ω,i) ≥ δ. This may be relaxed by allowing t' → ∞. We use the assumption on the initial state, x_0 = 0, to ensure that states with x_t(ω,i) ≥ 0 and x_t(ω,i') ≥ 0 can be obtained. Suppose the set of ω satisfying (17) is Ω_17. Then define a policy x' such that

    x'_τ := x*_τ,  τ ≤ t,
    x'_{τ+1} := x*_{τ+1} + (e_i − e_{i'}) δ 1_{Ω_17},  τ = t, ..., t' − 1,   (18)
    x'_{τ+1} := x*_{τ+1},  τ ≥ t',

where e_i is the ith unit vector in ℝⁿ. In this way, x' is obtained from x* by shifting production from i' to i in period t and from i to i' in period t'. This switch is feasible

because δ units of additional production are available for i in period t according to (17) (and for i' in period t'). The objective change from x* to x' is

    F(x*) − F(x') = E[ 1_{ω∈Ω_17} δ Σ_{τ=t}^{t'−1} ( 1_{τ≥d(ω,i,j(ω,t,i))} − 1_{τ≥d(ω,i',j(ω,t,i'))} ) ] ≥ 0,   (19)

since d(ω,i,j(ω,t,i)) ≤ d(ω,i',j(ω,t,i')). Thus F(x') ≤ F(x*). Indeed, the objective contribution over the finite horizon to t' does not increase. The amount of switched processing δ can be increased until either x'_{t'+1}(ω,i) = p(ω,i,j(ω,t,i)) or x'_{t+1}(ω,i') − x'_t(ω,i') = 0. In either case, this violation of due date order is corrected without increasing objective costs. The procedure can be repeated on all sets Ω_17 satisfying (17) with positive probability for any t, i and i'. Hence all violations of due date order can be removed without increasing costs, which proves the result. ∎

The result of Theorem 4 for due date order is not valid (even with equal weights) in cases where penalties are charged only when jobs are finished. In such cases, the optimal order follows due dates if all jobs can complete on time, but the optimal order switches to shortest processing time if all jobs are late. The result of Theorem 4 does apply, however, if processing times and due dates follow the same order (see Birge et al [1990], Lemma 2.3). It also applies if the weights are in decreasing order from earliest due date to last due date. Due date order is optimal here regardless of processing time because, according to (15), charges are incurred only on the incomplete portion of each job. This assumption is practical if an order is large and small batches within the large order can be shipped to the customer as they are finished. This ability to break up jobs is the critical factor in our convexity assumptions.
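The dispatch rule (16) can be sketched for divisible orders as follows: in each period with capacity, one unit of processing is split across released, incomplete orders in due date order. This is a simplified single-machine version with one outstanding order per item; the data and function names are ours:

```python
def edd_schedule(orders, avail):
    """Each period t with avail[t] == 1, split one unit of processing
    across released, incomplete orders, earliest due date first, in the
    spirit of (16); orders are (release, due, size) triples."""
    done = [0.0] * len(orders)         # amount processed per order
    completion = [None] * len(orders)  # period in which each order finishes
    for t, cap in enumerate(avail):
        if not cap:
            continue                   # machine down: no processing
        ready = [k for k, (r, d, p) in enumerate(orders)
                 if r <= t and done[k] < p]
        ready.sort(key=lambda k: orders[k][1])   # due date order
        left = 1.0                     # one unit of capacity per period
        for k in ready:
            amt = min(left, orders[k][2] - done[k])
            done[k] += amt
            left -= amt
            if done[k] == orders[k][2]:
                completion[k] = t      # order k completes in period t
            if left == 0.0:
                break
    return completion

orders = [(0, 3, 2.0), (0, 2, 1.5), (1, 5, 1.0)]
print(edd_schedule(orders, [1] * 8))   # -> [3, 1, 4]
```

Note how the divisibility of orders lets a period's capacity be split across two orders; this is exactly the job break-up that underlies the convexity assumption in the text.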
Processing available jobs in due date order according to Theorem 4 provides a long run optimal solution provided the initial system is empty (or that we can assume some point in time at which processing on all released jobs is nonnegative). We would like

to show that this policy satisfies the conditions for match-up optimality in Proposition 1 and Theorem 2. Assumptions 1 and 2 are valid since the costs are just piecewise linear with fixed increment in each period, and the set of possible states is at most a unit L¹-distance from the current state. For condition (a) in Theorem 1 (nonanticipative feasibility), note that the feasibility conditions only depend on information available when an order is accepted. The second condition (b), strict feasibility, is satisfied if we assume that the system has sufficient slack such that the objective can be obtained without completely using all available capacity in some period. We assume this is possible (although an optimal solution may use all capacity). The third condition (c), finite horizon continuation, is that we can at some point reach a trajectory starting from, for example, the empty inventory state. With these assumptions and following Theorem 3, the optimal policy for any initial inventory state is to match up with the state from the empty inventory position as quickly as possible.

The alternative initial states in Theorem 3 would correspond to entering period 0 with some overdue orders causing initial negative inventories for these items. The optimal match-up response then corresponds to processing any items with negative inventories before proceeding to items with zero or positive inventories. The result is that one reaches by time t the same state as in the zero initial inventory state whenever the cumulative excess capacity up to time t is greater than the total negative inventory at time 0.

4. Conclusions

We have described a general stochastic optimization model with discrete time periods and an infinite horizon. We showed that optimality conditions allow for characterizations of turnpike and cyclic optimal solutions that justify match-up strategies, in particular for scheduling problems. We showed how optimal policies can be derived in two examples.
In one case, the

optimal policy is found through a stochastic linear program (with recourse) that yields a low-cost solution when the stochastic parameters are included. However, when deterministic mean values are substituted for the random parameters in this program, the resulting linear program yields a solution which, after correction for feasibility in the stochastic program, gives a very poor policy indeed! In the other case, we showed how a simple policy based on the model structure is optimal. In each case, we showed how match-up strategies could be derived.

The results to date on match-up scheduling are based on the ability to obtain equivalent convex optimization problems. The fundamental element of this assumption is the divisibility of orders into small shipping quantities. A remaining question is whether these results carry over to problems with integer variables or other nonconvexities. In these cases, duality gaps appear that prevent the direct use of supporting prices. The addition of facet-defining constraints may allow similar results, but will require considerable further study of the structure of dynamic scheduling polyhedra.

References

V.I. Arkin and I.V. Evstigneev (1979). Stochastic Models of Control and Economic Dynamics. Nauka, Moscow. (In Russian). English translation by E.A. Medova-Dempster and M.A.H. Dempster, Academic Press, Orlando (1986).

J.C. Bean and J.R. Birge (1986). Match-up real-time scheduling. In Real-Time Optimization in Automated Manufacturing Facilities, R.H.F. Jackson and A.W.T. Jones, eds., NBS Special Publication 724, National Bureau of Standards, pp. 197-212.

J.C. Bean, J.R. Birge, J. Mittenthal and C.E. Noon (1991). Match-up scheduling with multiple resources, release dates and disruptions. Operations Research 39, 470-483.

J.R. Birge (1982). The value of the stochastic solution in stochastic linear programs with fixed recourse. Mathematical Programming 24, 314-325.

J.R. Birge and M.A.H. Dempster (1992). Optimality for match-up strategies in stochastic scheduling. Technical Report, Department of Industrial and Operations Engineering, The University of Michigan. Submitted to: Mathematics of Operations Research.

J.R. Birge, J.B.G. Frenk, J. Mittenthal and A.H.G. Rinnooy Kan (1990). Single machine scheduling subject to stochastic breakdowns. Naval Research Logistics 37, 661-677.

J.R. Birge and K.D. Glazebrook (1988). Assessing the effects of machine breakdowns in stochastic scheduling. Operations Research Letters 7, 267-271.

J.R. Birge and R. J-B. Wets (1986). Designing approximation schemes for stochastic optimization problems, in particular, for stochastic programs with recourse. Mathematical Programming Study 27, 54-102.

M.A.H. Dempster (1988). On stochastic programming: II. Dynamic problems under risk. Stochastics 25, 15-42.

M.A.H. Dempster, J.K. Lenstra and A.H.G. Rinnooy Kan, eds. (1982). Deterministic and Stochastic Scheduling. Reidel, Dordrecht.

M.A.H. Dempster and E. Solel (1987). Stochastic scheduling via stochastic control. In Proceedings 1st World Congress of the Bernoulli Society, K. Krickeberg and A. Shiryaev, eds., VNU Science Press, Utrecht, pp. 783-788.

S.D. Flam (1983). Asymptotically stable solutions to stochastic problems of Bolza. Chr. Michelsen Institute Report, Fantoft, Norway.

S.D. Flam (1985). Nonanticipativity in stochastic programming. Journal of Optimization Theory and Applications 46, 23-30.

K.D. Glazebrook (1984). Scheduling stochastic jobs on a single machine subject to breakdowns. Naval Research Logistics Quarterly 31, 251-264.

K.D. Glazebrook (1987). Evaluating the effects of machine breakdowns in stochastic scheduling problems. Naval Research Logistics 34, 319-335.

K.D. Glazebrook (1991). On non-preemptive policies for stochastic single machine scheduling with breakdowns. Probability in the Engineering and Informational Sciences 5, 77-87.

S.C. Graves (1981). A review of production scheduling. Operations Research 29, 646-675.

J-B. Hiriart-Urruty (1982). Extension of Lipschitz integrands and minimization of nonconvex integral functionals: applications to the optimal recourse problem in discrete time. Probability and Mathematical Statistics 3, 19-36.

H.J. Kushner (1972). Necessary conditions for discrete parameter stochastic optimization problems. In Proceedings Sixth Berkeley Symposium on Mathematical Statistics and Probability, J. Neyman, ed., University of California Press, Berkeley, pp. 667-685.

L. McKenzie (1976). Turnpike theory. Econometrica 44, 841-864.

J. Mittenthal (1986). Single Machine Scheduling Subject to Random Breakdowns. Ph.D. dissertation, The University of Michigan, Ann Arbor, Michigan.

M.L. Pinedo and S.M. Ross (1980). Scheduling jobs subject to nonhomogeneous Poisson shocks. Management Science 26, 1250-1257.

A.H.G. Rinnooy Kan (1976). Machine Scheduling Problems: Classification, Complexity and Computations. Martinus Nijhoff, The Hague.

R.T. Rockafellar (1976). Integral Functionals, Normal Integrands and Measurable Selections. Lecture Notes in Mathematics 543. Springer-Verlag, Berlin.

R.T. Rockafellar and R. J-B. Wets (1983). Deterministic and stochastic optimization problems of Bolza type in discrete time. Stochastics 10, 273-312.

E. Solel (1986). A Dynamic Approach to Stochastic Scheduling via Stochastic Control. Ph.D. dissertation, Dalhousie University, Halifax, Nova Scotia.