INVENTORY CONTROL AND SAFETY CAPACITY SCHEDULING IN A JIT ENVIRONMENT Izak Duenyas Department of Industrial and Operations Engineering The University of Michigan Ann Arbor, MI 48109-2117 and Wallace J. Hopp Department of Industrial Eng and Mgmt Sciences Northwestern University Evanston, IL 60208 Technical Report 93-39 December 1993

INVENTORY CONTROL AND SAFETY CAPACITY SCHEDULING IN A JIT ENVIRONMENT Izak Duenyas and Wallace J. Hopp Department of Industrial and Operations Engineering University of Michigan, Ann Arbor, MI 48109 July 1992 Revised July 1993 This work was supported in part by a grant from the IBM Corporation and the National Science Foundation under Grants DDMI-8905638 and SES-9119621.

INVENTORY CONTROL AND SAFETY CAPACITY SCHEDULING IN A JIT ENVIRONMENT IZAK DUENYAS Department of Industrial and Operations Engineering University of Michigan Ann Arbor, Michigan, 48109 WALLACE J. HOPP Department of Industrial Engineering and Management Sciences Northwestern University Evanston, Illinois, 60208 Abstract We consider the problem of establishing a target inventory level in a manufacturing system where both production and demand are stochastic. Under the assumption that "safety capacity" (i.e., overtime or a vendoring option) is available, we develop two models to address this problem. The first model assumes that quota shortfalls cannot be carried over to the next regular time production period and are made up with overtime/vendoring, which incurs fixed or fixed plus variable costs. The second model assumes that shortages can be backlogged to the next regular time production period at a cost. For this model, we demonstrate the optimality of a (Q, s, S) policy, where Q represents the target inventory for regular time production, s represents a safety capacity "trigger" and'S represents the safety capacity inventory target. 1

1 Introduction The growing popularity of pull production systems has presented many manufacturing plants with a difficult inventory control dilemma. On one hand, the JIT literature admonishes these plants to reduce inventory levels to expose problems and engender continual improvements ([19], [20]). On the other hand, as JIT suppliers to other plants, these plants are often expected to be able to fill orders rapidly and in variable quantities. Striking an appropriate balance between low inventory and high responsiveness requires a finely-tuned inventory control policy. Further complicating the situation is the fact that a pull system is rate- driven and hence requires some form of "safety capacity" to protect against variations in the demand process or production rate [22]. This capacity can be provided through scheduled preventive maintenance periods, which can be used for overtime when necessary. It can also be provided by outside vendors, who agree to make up deficiencies on short notice. In either case, the plant effectively uses higher cost capacity to level out randomness in the system. How to use this capacity must be an integral part of an effective inventory control policy. Recently, researchers have begun to address the problems of establishing inventory levels and production quotas in pull production systems. In pull systems, inventory control is a matter of establishing appropriate card counts. (We speak as if physical cards are indeed used, although in practice the signals may be electronic.) The work on card count setting ([3], [4], [7], [23]) has not addressed the problem of using safety capacity. The safety capacity issue arises naturally in the context of setting production quotas for pull production systems. Hopp, Spearman and Duenyas [15] address this problem by assuming that regular time production is normally distributed with a known mean and variance and that the company can sell everything it can possibly make. At the end of regular-time production, overtime can be scheduled to catch up. They develop four models to calculate production quotas and optimal overtime 2

policies. Duenyas, Hopp and Spearman [7] extend this approach to demonstrate the relationship between the quota setting and the card count setting problems and propose an algorithm to calculate card counts and optimal quota for a CONWIP (constant-work-in-process) system. The above quota-setting models avoid the issue of demand variability by assuming that the plant can sell everything it makes. In the scenario we have outlined above (i.e, where a JIT plant is also a JIT supplier to another plant), it is quite likely that the plant faces uncertainity about demand as well as production. Hence, we require a more sophisticated policy than that represented by a simple periodic production quota. A reasonable class of policies to consider are the "order-up-to" policies that specify periodic target inventory levels. In fact, we will show that an order-up-to policy is optimal under specific cost assumptions. Our goal, then, is to simultaneously find the optimal order-up-to policy and the optimal scheduling policy for safety capacity. Our modeling approach is heavily influenced by our involvement with a circuit board plant of a major computer manufacturer. The demand our plant "sees" in a period is essentially the production of the downstream plants during that period. Due to the usual contingencies (e.g., machine failures, process problems, staffing, fluctuations in external demand, etc.) this production, and hence our demand process, is stochastic. While some visibility to the downstream processes is possible, the actual weekly demand for finished product is not precisely known until the end of the week. To model this situation, we will assume throughout this paper that demand in a period can be represented by a single random variable whose value is realized at the end of the period. Also, in recognition of the fact that JIT is most applicable to repetitive manufacturing systems (see, e.g., [22]), we will assume that the demand process is time homogeneous. Finally, we assume that the planning horizon is sufficiently long to justify use of an infinite horizon. In our client's circuit board plant and in all of our models, the variability in the production process is due to processing time variability. That is. each operation 3

that a job has to go through takes a (random) amount of time. This can be due to machine failures, differences in operator skill, shortages of supplies, etc. Given that each operation takes a random amount of time, the amount of product that we can produce in a fixed amount of time (e.g., during regularly scheduled production time) is also random. Note that the type of randomness we are considering is very different from randomness in production due to random yield. In random yield models (e.g., Henig and Gerchak, [11]), a certain amount of raw material is released into the plant at the beginning of the period, and at the end of the period, a (random) amount of good units are produced. In this situation, the number of good units clearly depends on the number of units of raw materials released at the beginning of the period. However, since we are modelling inventory control in a plant that has implemented a pull system, such as CONWIP or kanban, we do not release a certain quantity in the beginning of a period. Rather, work release is dictated by the cards (kanbans). For example, in a CONWIP system, we release a new unit to the first machine in the line whenever a unit has been finished at the last machine (or a unit has been scrapped somewhere in the process), thus keeping the total WorkIn-Process Inventory in the line constant. Hence, the amount that can be produced in a pull system depends on the WIP level in the system, but only depends on the production target in that we stop if we meet this target during regular time. This difference makes our model distinctly different from those used to study random yield problems. In our client's plant, safety capacity consisted of overtime (i.e., production over the weekend) and therefore was able to deliver products before the beginning of the next regular time period. Throughout this paper, we have modeled safety capacity under this type of rapid delivery assumption, making our models applicable to the overtime case or the case where a vendor provides very quick turnaround. A vendor might be able to do this by either essentially using overtime or by carrying inventory. and of course, we would expect to pay a premium for this type of service. We do not address the situation where vendored capacity requires significant delivery time. 4

This is a substantially different problem, which should be addressed at the planning level (e.g., during development of a master production schedule) rather than at the execution level. The type of randomness we are considering in this paper has been addressed in the literature. Duenyas et al. [7], Duenyas and Hopp [8], [9] have characterized the output process from a tandem CONWIP line and have approximated the output of CONWIP lines feeding an assembly operation under different processing time distributions as a function of WIP level. Mitra and Mitrani [18] have developed approximations for the throughput of exponential kanban systems. These approximations can be used along with simulation to characterize the distribution of the amount produced during regular time given the particular pull system implemented in the plant. There is a large literature on the problem of establishing inventory targets in the face of stochastic demand (see e.g., [2, Chapter 4] for a survey). However, to our knowledge, none of the above models have explicitly incorporated the issue of using overtime/vendoring to supplement capacity in environments where regulartime capacity is uncertain. In an early paper, Karlin [17] considered a single period model where demand and amount produced are random, however his model does not have safety capacity. Barankin [1] considered a single period model in which a regular and an emergency order are placed simultaneously. The regular order arrives one period later and the emergency order arrives simultaneously. However, Barankin's model does not include uncertainty in production capacity (i.e., any regular or emergency orders that are placed are delivered in full and on time) and also there is no fixed cost of placing an emergency order. Daniel [5] generalizes Barankin's model to the multi-period case. Daniel's model also does not include production capacity uncertainty and fixed costs for emergency orders. Perhaps the work closest in spirit to our work is a recent paper by Zheng [24] which considered a single-item continuous-review system where demands form a Poisson process. His model includes the usual backordering, holding and replenish-.j

ment ordering costs, but goes beyond these by assuming that discount opportunities are generated according to a Poisson process where the firm has the option to place a special order with a discounted fixed cost of replenishment. Zheng proves the optimality of an (s, c, S) policy for this problem where the firm places an order and brings the inventory position up to S if (1) the inventory position is below s and there is no discounting opportunity, or (2) the inventory position is below c and there is a discounting opportunity. Our models differ from this work in that (1) Zheng's model does not include replenishment uncertainty, (2) in our models discount opportunities occur regularly (i.e., producing during regular time is always cheaper than producing during overtime), and (3) we do not assume Poisson demand. In the following sections, we develop two models for addressing the problem of setting target inventory levels in systems with stochastic production and demand. We refer to these levels as "quotas" to emphasize their analogy with the production quotas for pull systems. In Model 1, we assume that safety capacity must be used whenever a quota shortfall occurs. In Model 2, we consider the case where the decision-maker can choose whether or not to use safety capacity when quota is not met and integrate the quota-setting problem with the safety capacity scheduling problem. The models in this paper can be used in conjunction with results on the output process from pull systems as well as part of a combined quota and WIP setting procedure described in Duenyas et al. [7]. 2 Model 1: Fixed Cost of Overtime We begin by considering the case where we always use safety capacity if the quota is not met during regular time. The marginal cost of producing a single unit during regular time is given by cR. If the quota is not achieved, then we assume that overtime must. be used, and results in finished goods inventory being brought up to the quota. We let K denote the fixed cost of safety capacity. We denote the variable cost of producing a unit. during overtime by cR + c, so that c represents 6

the unit penalty for using safety capacity. This penalty may be due to the workers being paid a premium for overtime work, if safety capacity consists of overtime, or a price differential for vendoring the product instead of producing it in-house, if safety capacity is provided by a vendor. Similarly, the fixed cost of safety capacity may represent the cost of a minimal overtime shift, in the case of overtime, or the fixed cost of an emergency shipment, in the case of vendoring. Although this quota/safety capacity protocol may seem simplistic, it is well defined and easy to follow. For this reason, it is used in a fairly literal fashion in some plants (e.g., Inman [16] gives a discussion of the use of this policy in automotive manufacturing plants). Of course, in many situations this assumption will be too strong - and we will relax it later - but its simplicity leads to clean insights and intuition that will carry over to more realistic models, so we begin our analysis with this case. For modeling purposes, we assume that demand for the period is revealed after the use of safety capacity. What we are really modeling here is the situation where overtime commitments must be made before demand is fully known. We further assume that if demand exceeds the quota, then sales are lost; if quota exceeds the demand, then a holding cost is incurred. We let p represent the unit profit, h denote the holding cost, D represent the (random) amount demanded (with density function fD(x) and distribution function FD(X)), and Y denote the (random) amount of production during regular time (with density function fp(x) and distribution function Fp(x).) Note that Y denotes the (random) amount that could be produced if production continued throughout the regular time period. However, production is stopped if at any time the quota is reached. Therefore, Y denotes the (random) production capacity during the regular time period. Finally, we let Q be the production quota (i.e., the decision variable) that maximizes the long-term average profit. Since we are focusing on a multi-period problem with an average profit maximizing objective, the amount of initial inventory does not affect the solution (we assume that if the initial inventory is greater than Q, no production takes place until the

inventory level is less than Q). We let g(Q) represent the expected profit per period with quota Q, which can be expressed as follows: g(Q) = pE(min(Q, D))- cRE[(min{(Q - D)+ + Y, Q}) - (Q - D)+] -Pr{(Q- D)+ + Y < Q}K -(CR + c)E[Q - (mn{(Q - D)+ + Y, Q}] - hE(Q - D)+. (2.1) where x if x > 0 0 otherwise The first term in (2.1) denotes the expected revenue per period. Each period inventory is brought up to Q, and if demand is D, then the minimum of D and Q units are sold. The number of units left after demand is realized is (Q - D)+. Hence, the amount of inventory carried over to the next period is (Q - D)+, and holding costs are charged on that inventory. If the amount produced during regular time plus the initial inventory is less than Q, safety capacity has to be used, and the third and fourth terms express the expected fixed and variable costs of safety capacity. Finally, the second term of (2.1) gives the regular-time production costs. rWe can simplify (2.1) to g(Q) = (p - cR)E(min(Q, D)) - Pr{(Q - D)+ + Y < Q}K -cE[Q - (min{(Q - D)+ + Y, Q}] - hE(Q - D)+. (2.2) since min(Q,D) = Q - (Q - D)+. Also min(Q,D) = D - (Q - D)+, so that maximizing g(Q) is equivalent to minimizing w(Q) = -g(Q) + (p- cR)ED since the second term is a constant. We also let pi = p - cR and get w(Q) = plE(D - Q)+ + hE(Q - D) + Pr{(Q - D)+ + Y < Q}K +cE[Q - (min{(Q - D)+ + YQ}]. (2.3) The first two terms of (2.3) are convex, but the last two terms are in general not convex. Differentiation of (2.3) yields the following first-order condition: (I fp(Q) + cFp(Q) - pl)(l - FD(Q)) + h(1 - P(D > Q)) = 0. (2.4) 8

which we can rewrite as FD(Q)= 1- (2.5) pi + h - Kfp(Q) - cFp(Q) Differentiating (2.5) yields the second-order condition f(Q) > hfD(Q) (2.6) () K(1 - FD(Q))2 where fp(Q) is the derivative of the density function of production. Hence, we can set the quota by solving (2.5) for Q. Even though a closedform expression for Q* is not possible in general for all demand and production distributions, solving for Q by convergence methods is simple. Note that in general there may be more than one value of Q that satisfies (2.5). Hence we need to check whether (2.6) is satisfied to guarantee a local minimum. Notice that because of the nonconvexity of the objective function we cannot guarantee that the local minimum is a global minimum in the general case. However, we can use these optimality conditions to examine the nature of the global optimum and, as we will show below, there are conditions under which the solution to (2.5) and (2.6) is unique and therefore globally optimal. Implicit in this formulation is the assumption that any quota shortage is made up with safety capacity. Obviously, if we choose quota to be very large, we will typically miss quota by an amount that is too large to make up in a single overtime period, or is beyond our vendor's capabilities. This will invalidate the formulation. Hence, if we let M be the maximum available safety capacity then, after computing Q* we need to check the condition Pr{Q* - ((Q* - D)+ + Y) > M} < a (2.7) where a is some acceptably small value. If this condition is satisfied then the model is reasonable. If not, then the costs are such that it is attractive to routinely use safety capacity. In this case it may make sense to replace this model by one without makeup periods (see Hopp, Spearman and Duenyas [15] for such a model). 9

Note that we are not considering randomness in safety capacity production. In the case where this capacity is vendored, this is a reasonable assumption, since our contract with the vendor will probably require such regularity. In the case where this capacity represents overtime, this is only reasonable if M is set sufficiently below actual maximum overtime capacity to enable it to be achieved consistently. In addition to providing a computational tool for quota setting, the above model can be used to gain insight into the effect of production and demand variance on the optimal quota. To illustrate this, we make some additional mild assumptions about our demand and production process. We assume that production capacity during regular time has a continuous unimodal distribution. We also assume that the distribution of demand is continuous and demand can be expressed as D = /tDD+aDX where X is a random variable with mean 0 and variance 1. We also assume that P(D > 0) = 1. We note that the assumption that production capacity during regular time has a unimodal distribution is a natural assumption, since by the central limit theorem, if the regular time duration is long enough, the distribution will converge toward the normal distribution. In fact, Duenyas et al. observed that for a CONWIP system with deterministic processing times and random failures of machines, the output process seems to converge to normal rather rapidly. The assumptions on demand are similarly rather mild. Under the above conditions, we first characterize sufficient conditions under which a unique solution to (2.5) exists. Lemma 1 Let a = pl/(pl + h). Let Q = FD1(a). Then, the optimal quota, Q*, has the property that Q* < Q. If the support of Y is contained in (Q,oo), then the optimal quota equals Q. If the distribution of production capacity during regular time is unimodal with mode Alp and either one of these conditions hold; 1. MAp >Q 2. pi < K/fp(Mp) then there exists a unique solution to (2.5). 10

Proof: The first part of the lemma is easily seen by noticing that if K = 0, and c = 0, then Q = FD1(p1/(pi +h)) minimizes w(Q). Since the third and fourth terms on the right hand side of (2.3) are increasing in Q, the quota that minimizes (2.3), QN L Q. Furthermore, if the support of Y is contained in (Q, oo), Q is the unique solution that satisfies (2.5). If Y is unimodal with mode Alp > Q, then fy(Q) is increasing in the region [0, Q]. Note that in this region, the left-hand side of (2.5) is decreasing in Q, while the right-hand side is increasing in Q. Therefore there is a unique solution in [0, Q]. Consider the case where h = 0, and c = 0 in (2.3). Then, the first-order condition, (2.4) reduces to Kfp(Q1) - Pi = 0 (2.8) and the second-order condition is Kfp(Q) > 0 (2.9) If Pi < K/fp(Mp), then there exists a solution to (2.8), Q1, and (2.9) implies that this (unique) solution must lie in [0, Mp]. Now, notice that the second and fourth terms in (2.3) are increasing in Q, therefore the quota that minimizes (2.3), Q* < Q1 < Mp, and is unique. M. Having derived sufficient conditions for a unique quota, we next explore how the optimal quota is affected by changes in the problem parameters. Theorem 1 Under the conditions of Lemma 1, the optimal quota, Q*, is nonincreasing in K, nonincreasing in h, nonincreasing in c, nondecreasing in pi. Proof: We prove the monotonicity result for K', the proofs of the other results are entirely similar. Let Q* be the optimal quota for a given set of problem parameters. Then, by (2.5), we have P(D > Q) p+ fQ)-cF(Q) Now, if K is increased, then the right hand side will be greater than the left hand side, i.e. for K' > K, we will have P(D > Q) < phh-F Since decreasing Q will increase the pl+h-K'fp(Q)-1cFp(Q) 11

left-hand side and will decrease the right-hand side, and equality will be achieved by a lower quota. D. Theorem 1 characterizes the behavior of the optimal quota with respect to changes in the cost parameters. We next describe how the optimal quota is affected by changes in demand, and production. Theorem 2 Under the conditions of Lemma 1, let Q* be the optimal quota when demand is a random variable D1, and production during regular time is a random variable Y1. 1. Let the optimal quota when demand equals D2 >st D1, be Q*. Then, Q* > Q7. 2. Let the optimal quota for the case where production capacity equals Y2 >st Y1 be Q. Then Q3 > Q* if K = 0. Proof: D2 >st D1 implies P(D2 > Q) > P(D > Q) h p( (2.10) which implies Q: > Q7. The proof of the second part is entirely analogous. D. Theorem 2 states that an increase in demand (in the stochastic sense) will cause an increase in the optimal quota, and an increase in possible production during regular time will also cause an increase in quota when K = 0. We are also interested in how the optimal quota is affected by variability. The following theorems describe the effect of demand variability on the optimal quota. Theorem 3 Under the conditions of Lemma 1, and tith demand equal to D = AID + cDX, where X is a random variable with mean 0 and variance 1, if Q* < I'D, then Q' is nonincreasing in rD. If Q* > ILD, then Q* is nondecreasing in aD. Proof: If D = iD + CDX, then we have P(D > Q') = P(X > Q- ). Notice that when Q* < PDo, an increase in cD will make (Q* - I'D)/(oD) larger, and therefore P(D > Q*) will become smaller. This means that the left hand side of (2.5) will 12

become smaller than the right-hand side of (2.5). Under the conditions of Lemma 1, equality in (2.5) can only be achieved by decreasing Q*. The proof for the case where D > JLD is entirely analogous. O One way to interpret the above result is that as the variance of demand increases, the spread between the quota and the mean demand is increasing. This can be observed in an easier way when both K = 0, and c = 0, since in this case IQ*-UDI = aDIFxl(pl/(p1 + h))l, where Fx(.) denotes the cumulative density function of X. In general, if the marginal profit p is high enough, Q* > ID holds. In that case, the above theorem implies that as demand variability increases, it is optimal to build up more inventory as a hedge against uncertain demand. However, if K and h are high enough so that Q* < JlD, then greater demand variability will cause higher holding costs when demand is low, and hence the optimal policy is to decrease quota. Interestingly, if we alter both the variance and the mean, but hold the coefficient of variation constant, the optimal quota always increases as stated in the following theorem. Theorem 4 Under the conditions of Lemma 1, and given D = - D + aDX, where X is a random variable with mean 0 and variance one, let cv = oD/lD. Q* is nondecreasing in aD for a fixed c,. Proof: Since P(D > Q*) = P(X > Q - D ), when CD is increased while keeping LD/ cD constant, the term (9I - I2) decreases, therefore P(D > Q*) increases. Therefore Q* must be increased to achieve equality in (2.5). We end this section by noting two special cases of our model. First, if the production capacity is very large compared to demand, (i.e., we can produce any amount during regular time), then we are left with the same newsboy result as when K = 0, and c = 0, (i.e., the optimal quota equals FD (p/(p + h)). Similarly, if demand is very large compared to production capacity (i.e., we can sell anything we make), we get an upper bound on Q', by Theorem 2. 13

3 Model 2: A Model with Backlogging The model in the previous section assumes that safety capacity is used whenever quota is missed. Because decision-makers might reasonably choose not to use overtime or a vendor if the shortage is very small, we now relax this assumption and consider the case where the plant can opt to backlog a shortage and make it up during the next regular time period. We assume that safety capacity cost has both fixed and variable components. Further, we allow for a variable backordering cost, b, to penalize units that are completed in periods following the one for which they were scheduled. Specifically, the system works like this. The plant tries to produce quota during regular time so that demand is met with Q units on hand. We assume that quota is rigidly observed so that when there are Q units on hand the plant stops producing. We further assume that uncertainty in downstream requirements prevents demand from being fully revealed to the plant until the end of regular time (i.e., at which time, the downstream plant will have completed its regular time production and made a decision about overtime, thereby solidifying demand on our plant). However, unlike in Model 1, since backlogging shortages to the next period is an option, we assume that the plant takes note of demand prior to making the decision of whether or not to use overtime. If demand is less than Q, a holding cost is incurred and the excess units are carried over to the next period. If demand is more than Q, the plant can either backorder (i.e., carry the shortage to the next period) or use safety capacity. If the plant uses safety capacity, it must also decide how many units to order or produce. Notice that we are assuming in our description that the decision-maker uses an "order-up-to" policy during regular time. In the development that follows, we initially make this assumption. However, in Section 3.3, we demonstrate that restricting attention to "order-up-to" policies is without loss of optimality. 14

3.1 Model Formulation We model the problem of finding an optimal safety capacity scheduling policy as a discrete time Markov decision process (MDP). We suppose that at the end of each regular time period, the decision-maker observes the shortage (or excess) level and decides whether or not to use safety capacity and, if it is used, how much to order or produce. In MDP terminology, the stages are defined by epochs representing the end of each regular time period, the state of the process is defined as the level of inventory and the actions available in state x are given by defined as Ax = {x, +1, +2,..., Q} where taking action x in state x means that we do not use safety capacity while taking action a > x means that we use safety capacity and raise our net inventory to a units. Since Pl includes variable production costs, the only costs we must consider are fixed and variable overtime costs and backorder costs. These depend on the state and action. If action a is used in state x then the one period cost is given by c(x, a) = KI{a>x} + c(a - x) + G(a), a > x where G(a) = -bala<o} + haI{a>o} and I is the indicator function. Finally, to specify transition probabilities, we discretize production and demand into convenient units and let qi represent the probability of (potentially) completing i units during regular time. Similarly, we let dj be the probability that demand equals j units. To ensure a finite state space, we approximate production and demand distributions with a discrete distribution with finite support. Also, we assume that the maximum possible production is greater than the minimum possible demand. (Note that if this is not the case, then we have the "sell-all-you-can-make" situation studied in Hopp, Spearman and Duenyas [15].) The amount we produce during regular time depends on how many units we start the period with. Hence, we can 15

denote by Y(a) the number of units produced during regular time starting from an inventory position of a units. Hence Y(a) = minY, Q - a}. We next state the following purely technical lemma, which we will need later in characterizing the optimal policy. Lemma 2 EY(a) is concave in a. Proof: The proof is omitted. We can now formulate the dynamic programming problem. We first formulate the finite horizon discounted case, then show that the infinite horizon problem is well-defined, and finally develop the average cost case. Letting Xn and an represent the inventory position and action in period n, and a represent the discount factor, we can define Kn, cn and Gn(an) as, respectively, the present value of the fixed cost of safety capacity at period n, the present value of the variable cost of safety capacity at period n and the present value of the inventory charges at period n. That is, Kn = a'A'K, c = a&c, and G,(an) = anG(an). Finally, let cN+1 denote the salvage value of goods at period N + 1. Starting with period 1, with an inventory position of x1 before making the safety capacity decision, we have the following recursion for xn and an xn+l = an + Yn+l(an)- Dn+l Let B denote the present value at the beginning of period 1, of all costs that are incurred during periods 1,..., N + 1. Then, we have N B = [Kr'I{an>xn} + Cn(an - Xn) + Gn(an)] - CN+IXN+1. n=l Substituting the expression for xz, we get N N B = Z[Inl{an>Xrn + Gn(an) + (Cn - Cn+l)an - Cn+lYn+l(an)] - C11 + E CnDn n=l n=l Letting Hn(l.) = (c, - Cn,+i ) + G',,(a) - c+lEYn+l(a) 16

we have N N E(B) = Z E[KnI{an>Xn} + Hn(a)] - clxl + cnE(Dn). n=l n=l Since the second and third terms in the above equation are not affected by our policy, we need only to minimize the first term, which is equivalent to solving the following dynamic program: fn(x) = minaEAx(InI{a>} + Jn(a)) (3.1) Jn(a) = Hn(a) + aE[fn+i(a + Yn(a) - Dn)] for n = 1,2,3,....N with fN+l(x) = 0 for all x. 3.2 Structural Results for Safety Capacity Scheduling The above MDP formulation provides a natural mechanism for examining the structure of the optimal safety capacity scheduling policy. We begin by treating the finite horizon case and then extend our results to the infinite horizon average cost case. Theorem 5 The optimal solution to (3.1) is an (s,S) policy. Proof: The proof is exactly analogous to that of Theorem 7-1 in Heyman and Sobel [14] and is therefore omitted. O Although we have assumed stationary demand and cost functions, we note that this is not necessary. Heyman and Sobel's proof allows nonstationary demand, as long as periodic demands are independent, and nonstationary costs, as long as fixed costs are monotonically decreasing with the period index. An entirely analogous proof can be used to extend the above theorem to the nonstationary case. We can extend our structural results to the infinite horizon discounted case with the following Lemma 3 f,(x) - f(x) as n -i oc for all x. 17

Proof: The proof is omitted. Since the f(.) functions inherit the structure of the fn(') functions, the optimal policy in this infinite horizon case is also (s, S). We can also extend our results to the average cost case as follows. Letting w(x) denote the relative value function (i.e., w(x) = lim_,o f(x)), and g denote the gain rate, we can state the following Lemma 4 There exist a bounded function w(x) and a constant g such that for all x w(x) + g = min {I I{a> + H(a) + E[w(a + Y(a) - D)]} (3.2) aEAx where H(a) = G(a)- cEY(a). Proof: By Corollary 2.5 in Chapter V of Ross [21], it is enough to show that there exists a state that is reachable from any other state. Let k = argmin{ri: ri > 0}. That is, k is the minimum possible demand. Then, clearly the state Q - k can be reached from any other state because of the assumption that maximum possible production is greater than minimum possible demand. C Therefore, the average cost case also inherits the (s, S) structure and we have proved the following: Theorem 6 The optimal solution to (3.2) is an (s,S) policy. 3.3 Optimality of "Order-Up-To" Policies In this section, we prove that the "order-up-to" policy assumed for regular time up to this point is indeed optimal for the infinite horizon problem. To do this, we begin by considering an N period problem. In each period there are two decisions to make; first the decision maker must choose the quantity to produce in regular time, then, after observing demand, he/she must decide whether or not to use safety capacity and how much to produce/order. 18

We formalize our model by defining f,(x) to be the expected cost from the beginning of regular time in period n through the end of period N discounted back to period n, given that there are x units on-hand at the beginning of the period. Similarly, we let 3,n(x) be the expected discounted total cost from the end of regular time in period n after demand has been realized and the net inventory position is x. We can write Pn(x) = min KnI{a,>} + c(a - x)+ G,(a)+ fn+l(a)} (3.3) aEAx Because we are not restricted to an order-up-to policy, the decision variable at the beginning of regular time is the quantity we attempt to produce (note that we may not achieve this target), which we denote by z. Hence we can write fn(x) = Jo J OI /3n(x+y -u)fp(y)dyfD(u)du+(l-Fp(z)) j n(x+z-u)fD(u)du (3.4) Lemma 5 There exist numbers Q, such that in period n, the optimal regular time policy is to bring the inventory up to Qn. Proof: We proceed recursively on n, first showing that and /n(x) is Kn-convex (see [14] for a summary of the properties of K-convex functions), then using this to demonstrate the existence of Qn, and finally using Qn to continue the recursion by showing that fn(x) and Pn-1(x) are Kn_ —convex. We initiate the recursion by letting fN+l(x) = 0, which is obviously KN+l-convex and, along with (3.3), implies that /lN+(x) is KN+l(x)-convex. Now, suppose that Pn(x) is Kn-convex. Since a A-convex function is continuous and differentiable everywhere except at most a countable number of points (see [14]), it follows that that the integral inside the minimization in (3.4) is differentiable. Differentiating this expression with respect to z yields the following first order optimality condition: (1 - FPz) /3,( z - u)fD(u)du = 0 (3.5) 19

There are two possibilities for satisfying (3.5). Either z = oo, so that 1 - Fp(z) = 0, or d I / n(x + z- u)fD(u)du= 0 (3.6) do Jo So, either Q, = oo, which is trivially order-up-to, or we choose Q, such that it satisfies d fo /3n(Qn - U)fD(u)du = 0 and achieves the global minimum of the function in (3.4). In this second case clear that z* = Qn - x, and hence the optimal policy is to try to bring the inventory level up to Qn, provided that x < Qn. Now to continue the recursion, note that because l,3(x) is K, convex and the optimal policy is order-up-to, (3.4) implies f,(x) is KI-convex. Let J,r-l(a) = Gn-l(a) + fn(a) + Cn-l(a). Then, we have 3n-l(x) + Cn-lx = mina>x{K'n-l/a>x + Jn_(a)}. Since fn(a) is Kn-convex, then Jni(a) is It'-convex, and hence Kn-iconvex, since Kn-i > KI. Hence, by Lemma 7-2 of Heyman and Sobel, O3n-l(x) + cnlx will be KInl-convex, and hence 3n-l(x) is Kin_-convex. The result follows by induction. C Notice that the above lemma does not say anything about the optimal policy for the case where x > Q,. It is possible that the optimal policy would order up to some other inventory level corresponding to a local minimum. However, the following results allow us to conclude that, provided we start with a sufficiently small initial inventory level, the optimal policy in the infinite horizon case will truly be "order-up-to." Of course, if we can discard inventory at no cost, then the finitehorizon optimal policy will be "order-up-to" as well. Theorem 7 The optimal policy is specified by numbers Qn, Sn, sn such that regular time production in period n attempts to bring inventory to Qn, and safety capacity is used to bring inventory up to Sn, if after demand is revealed, the inventory position is below s,. Furthermore, sn < Sn < Qn+l. Proof: The regular time result follows directly from Lemma 5. The proof of the (s, S) property is identical to that of Theorem 5. To prove that Qn > Sn, suppose not, i.e., assume that for some n, Sn > Q,+l. When safety capacity is used at x, 20

(i.e, when x < Sn), we have 3n(x) = I', + n(n-x) + Gn(Sn)+fn+l(Sn)' However, if we compare this to the policy that brings inventory up to Qn+1 whenever excess capacity is used, we find that the cost becomes Pn(x) = n', + cn(Qn+l - x) + Gn(Qn+l) + fn+i(Qn+i). Since Q,+l is the optimal order-up-to point for regular time period n+1, fn+l(Qn,+) < fn+l(Sn)' It is easy to show that for each n, Qn, 0, hence G,(Sn) > Gn(Qn+l). Finally we have cnQn+l < cnSn, so Pj(x) < 0n(x). But this is a contradiction; hence we have shown that sn < Sn < Qn+i. ~ In a similar fashion to that used to demonstrate Lemma 5 and Theorem 7, we can derive results for the infinite-horizon and average-cost cases. The infinite-horizon optimal policy inherits the structure of the finite horizon optimal strategy and hence is also "order-up-to." Specifically, this leads to the following Theorem 8 The optimal policy to the infinite horizon average cost inventory control/safety capacity scheduling problem is specified by numbers Q, S, s such that regular time production attempts to bring inventory up to Q, and safety capacity is used to bring inventory up to S if, after demand is revealed, inventory position is below s. Furthermore, s < S < Q. This Theorem implies that if x, < Q (where xn represents the inventory level in period n) then, since s < S < Q, we are guaranteed that Xn+1 will be less than Q. Hence, as long as x1 < Q, the optimal policy will be to order up to Q in period n for all n. 3.4 Computational Issues The above results imply that we can restrict attention to policies of the form R = (Q, s, S), where Q, s, and S are defined in Theorem 8. To develop an expression for computing the average cost of policy R, let x = inventory at end of regular time after demand has been revealed y = inventory after safety capacity has been used 21

so that S, if x < s y = x, if x > s and let yR(x) represent the value of y specified by policy R. Using our discrete approximations of production and demand distributions pi = Pr{regular time production capacity = i} dj = Pr{demand = j} we can define transition probabilities for an MDP whose decision epochs are at the end of regular time as qR(i,j) = Pr{x, = jlyn-i = i} Q-i Xo = Z pkdi+k-j + E PkdQj k=O k=Q-i+l Notice that j can take on values {-oo,...,Q}. In practice, of course, we would have to truncate the state space at some minimum allowable inventory level. However, we will ignore this implementation issue in order to keep our presentation simple. We can now express the one-period cost of having x at the end of period n and y at the beginning of period n + 1 as 7(x, y)= KI{y>,) + c(y - x) + G(y) Letting YR(X) = (X, YR(X)) represent the one-period cost of using policy R in state x, can now write the recursion to compute the average cost of using policy R. denoted by gR, as QR VR(X) = yR(x)-gR + E qR(yR(x),j)vR(j), x E I (3.7) 3 = - aC where I denotes our state space (i.e., set of allowable inventory levels at the end of regular time) and QR denotes the quota under policy R. We also need a normalizing 22

condition, such as v(S) = 0 (3.8) to give us a well-defined system of equations. To implement a policy improvement algorithm we want to check whether there exists some policy R' such that QR' YR,(x)- R + z qR'(YR'(x),j)vR(j) < vR(x) (3.9) j=-oo for some x E I, where VR and gR have been computed for some policy R using (3.7). If this is the case, then policy R' is better in state x than policy R. The standard policy improvement algorithm for Markov decision processes does not make use of the special structure of the optimal policy, and indeed is not even guaranteed to terminate with a (Q,s, S) policy. Our previous results guarantee that there is an optimal (Q, s, S) policy but not that all optimal policies are of this form. Therefore, we would like to modify the policy improvement algorithm to only consider (Q, s, ) policies. To do this, we make the following observations: (A) If we alter R by increasing (decreasing) s by one, then yR(x) and yR(x) change only for x = s + 1 (x = s). This means that an increase (decrease) in s can satisfy (3.9) only for state s + 1 (s). If no improvement is possible in this state, s is optimal for the current values of Q and S. (Note that if altering s to s + 1 (s - 1) results in the same value for state s + 1 (s) as before, then we also need to check the effect of changing s to s + 2 (s - 2), and so on.) (B) If we alter R by adding A (where A is positive or negative) to S, we do not change yR(x) and yR(x) for x > s. However, we change yR(x) and yR(x) for all x < s as follows: YRi(X) = S+6 -R,(X) = I + c(S + A - x) + G(S + A) 23

Hence, for R' to satisfy (3.9) in state x < s requires Q Q 7R/(x)-gR+ E qR(yR'(x),j)vR(j) < R(X)-9R+ E qR(yR(x),j)V)VR(j) 3=-oo00 J=-00 Since we only altered S and did not alter Q, qR'(.) = qR(.) and we have A + c(S + A - x) + G(S + A) - 9R Q + E qR(S+a,j)vR(j) < K+c(S-x)+G(S)-gR Q=+ E qR(S,j)vR(j). (3.10) Since (3.10) does not depend on x (the cx terms on both sides cancel), if this condition is satisfied for any x < s, then it is satisfied for all x < s. Hence, we can determine whether S is optimal for the current values of s and Q by checking (3.9) for a single value of x, say x = s. (C) It is straightforward to show (in a similar manner to Lemma 5) that given an (s, S) policy is being used during overtime, the optimal policy during regular time is an order-up-to policy. Hence, we can look for an improved policy, R', by checking whether there exists a quota Q' that improves cost (i.e., satisfies (3.9)) for all states x. If we alter R by changing Q to Q', then the qR(i, j) values change, which affects the condition in (3.9). However, note that since s and S are the same under R and R', we have 7R'(X) = YR(x), and, yRI(x) = yR(x). Hence, in this case condition (3.9) becomes -Q' Q q qR(yR(x),j)vR(j) < 3 qRyR(R(x),j)R(j) (3.11) = - =-00 Note that for all x < s, (3.11) results in the same inequality since yR(x) = YRI(x) = S if x < s. However, if x > s, then we obtain different equations for each x value. Hence, we need to search over the set of all possible values for the quota to find out if there exists a Q' that results in an improvement for 24

each value of x > s, and a single value of x < s. If we can not find a value of Q' resulting in an improvement for all x, then the current Q is optimal. Using the above observations, we can specify an algorithm as follows. Start with an arbitrary (Q, s, S) policy. Use results (A) and (B) to modify s and S until they are optimal for this value of Q. (Notice for a given Q, we need only consider states x = s and x = s + 1 to check the optimality of s and only a a single state, x when checking the optimality of S.) Now look for a value of Q that improves the cost in all states x. As we argued above, for any s and S, the optimal regular-time policy is order-up-to, so it is not possible that a policy that uses different Q values in different states x can produce a lower average cost than one using the same Q in every state. If no better Q value can be found, then the current policy is optimal. If a better Q is found, then change Q and continue the algorithm by finding the optimal s and S values for the new Q. Since this algorithm finds values of Q that reduce average cost in each iteration, it is not possible to cycle. Therefore, provided the Q values are restricted to a finite set (e.g., a grid consistent with the discretized distributions for production and demand), this algorithm will terminate finitely. The above outline of an algorithm is presented to illuminate the structure of the computational problem, not as an efficient methodology. Clearly, because of the requirement to evaluate (3.7) at each policy improvement iteration, the algorithm is far less efficient than modern algorithms for computing (s, S) policies (e.g., [10] and [25]). These algorithms use a renewal approach for computing the average cost of an (s, S) policy, where renewals occur each time an order brings the inventory level up to S. Because the inventory level only decreases between orders, the system of equations that must be solved to compute average cost are triangular and can be solved efficiently. In our problem, the inventory level can increase between decision epochs even when safety capacity is not used, due to the intervening regular time production period. Hence, a renewal approach to computing the cost of a (Q, S, s) policy does not result in a triangular system of equations and is therefore much more expensive to solve. It remains to be seen whether an efficient renewal approach is 25

possible for the situation treated in this paper. 3.5 State Dependent Demand Up to this point, we have assumed that demand is independent of the net inventory level. However, this may not be true in some realistic cases. If the plant has a large positive inventory, for example, the sales department may increase its efforts to increase the sales. Alternatively, in the case where the demand is generated by a downstream plant, a poor production week may induce the downstream plant to obtain more of its supply from other sources. In either case, it is reasonable to assume a demand function D(a) such that ED(a) is increasing in a. In this case, we can formulate the following dynamic problem to replace (3.1). fn(x) = min {InI{a>x} + H,(a) + E[f,+l(a + Yn(a) - Dn(a))]} (3.12) for n = 1,2,3,.....N with fn+l(x) = 0 for all x and Hn(a) = (c, - cn+i)a + Gn(a) - cn+E[Yn(a)] + cn+C E[Dn(a)] Theorem 9 If E[D(a)] is convex in a, then the optimal solution to (3.12) is an (s, S) policy. Proof: The proof is similar to the proof of Theorem 5. 3.6 A Special Case Lastly, we consider a special case of the above model, where capacity is much higher than demand so that for practical purposes, we can make any quota we set and hence we make quota in every period. Then (3.2) becomes G(x)+ Ew(Q - D) if a = x w(x) + g - (3.13) 1 G(a) + K + c(a - x) + Ew(Q - D) if a > x 26

Now, clearly, we will not choose to use safety capacity when we have a positive inventory since we make quota during regular time. Under these conditions, there are two cases to consider, namely when b > c and when b < c. First, suppose b > c. Then, when we choose to use safety capacity, we will bring the net inventory position up to 0, since the variable cost of producing a unit is less than backlogging it once we use overtime and this has no effect on holding costs next period. Hence overtime decision will be used when -bx > K - cx, or equivalently when x < K/(c - b). The total cost due to this policy has three components. If D < Q, then we have holding costs, if Q < D < Q + bc, backorder costs are incurred, and otherwise we have overtime costs. If we let z(Q) be the expected cost of using Q as the quota and let p = K/(b - c) then we can write: z(Q) = fo h(Q - x)fD(x) + f+ ( b(x - Q)fD(x) + K(1 - FD(Q + ())+ (3.14) fQ+, c(x - Q)fD(x). We can differentiate z(Q) with respect to Q and set the result equal to 0 and solve to get (h + b)FD(Q) + (c- b)FD(Q + K/(b - c)) = c (3.15) Solving (3.15) for Q, we get the optimal quota. If b < c, then we never use safety capacity. In this case, we get the familiar result that FD(Q) = b/(b + h) (3.16) which is the solution to the newsboy problem. If ALp >> lD, then this solution should provide a very good estimate of Q*. If YP > PD, this solution may still be close enough to serve as a starting point for the original model (i.e., search over Q near the solution to (3.15) or (3.16)). 27

4 Conclusions In this paper, we have formulated two models for determining an optimal inventory control policy for production systems with stochastic production and demand and have integrated this quota-setting problem with the problem of optimally using safety capacity. Under the assumption that safety capacity must be used when quota is not achieved during regular time, we have derived simple expressions for the optimal quota and have used these expressions to derive a variety of structural results, including: 1. The optimal quota is decreasing in the fixed and variable safety capacity costs and the inventory carrying cost and increasing in the unit profit. 2. The optimal quota increases as demand increases in the stochastic sense. 3. The optimal quota increases as production capacity increases in the stochastic sense. 4. The optimal quota decreases with the variance of demand when the optimal quota is greater than mean demand and increases with the variance of demand when it is smaller than mean demand. 5. The optimal quota increases in the variance of demand when the coefficient of variation of demand is held constant. Under the assumption that quota shortfalls can be backlogged to the next regular time period, we have shown that the optimal inventory control policy has an "orderup-to" structure, and the optimal safety capacity scheduling policy has an (s,S) structure, so that the overall policy is of a (Q, s, S) type. We also gave the outline of a policy improvement algorithm for computing the optimal (Q, s, S) policy. Acknowledgements: 28

We are grateful to Yehuda Bassok for his many helpful comments and suggestions. We also thank an anonymous referee for an extremely careful reading of the manuscript and several insightful suggestions and corrections. This work was supported in part by Grants DDM-8905638 and SES-9119621 from the National Science Foundation and by a grant from the IBM Corporation. Bibliography [1] Barankin, E.W., "A Delivery-Lag Inventory Model with an Emergency Provision (The single-period case)" Naval Research Logistics Quarterly, 8 (1961): 285-313. [2] Bensoussan, A., Crouhy, M., and Proth J.M., Mathematical Theory of Production Planning, North-Holland, Amsterdam, 1983. [3] Bielecki T., and Kumar P.R., "Optimality of Zero-Inventory Policies for Unreliable Manufacturing Systems," Operations Research, 36 (1988): 532-541 [4] Bitran, G.R., and Chang L.,, "A Mathematical Programming Approach to a Deterministic Kanban System," Operations Research 30 (1982): 232-251. [5] Daniel, K.H., "A Delivery-Lag Model with Emergency," Chapter 2 in Multistate Inventory Models and Techniques, edited by H.E. Scarf, D.M. Gilford, and M.W. Shelly. Stanford University Press, Stanford, CA, (1963). [6] Duenyas, I., and Hopp, W.J., "Estimating Variance of Output from Cyclic Exponential Queueing Systems," Queueing Systems 7 (1990): 337-355. [7] Duenyas, I., Hopp W.J., Spearman M.L., "Characterizing the Output Process of a CONWIP Line with Deterministic Processing and Random Outages," to appear in Management Science, 1993. [8] Duenyas, I., and Hopp W.J., "CONWIP Assembly with Deterministic Processing and Random Outages," IIE Transactions, 24 (1992): 97-109. [9] Duenyas, I., and Hopp W.J., "Estimating the Throughput of an Exponential CONWIP Assembly System," to appear in Queueing Systems, Theory and Applications 1993. 29

[10] Federgruen, A. and P. Zipkin, "An Efficient Algorithm for Computing Optimal (s,S) Policies," Operations Research 34 (1984), 1268-1285. [11] Henig, M., and Gerchak, Y., "The Structure of Periodic Review Policies in the Presence of Random Yield," Operations Research, 38 (1990) 634-643. [12] Groenevelt, H., and Karmarkar U.S., " A Dynamic Kanban System Case Study," Production and Inventory Management Journal, 29 (1988): 46-51. [13] Hall, R.W., Zero Inventories, Dow Jones- Irwin, Homewood, IL, 1983. [14] Heyman P.D., and Sobel J. M., Stochastic Models in Operations Research, Volume 2, McGraw-Hill, New York, 1984. [15] Hopp W.J., Spearman M.L., and Duenyas I., " Economic Production Quotas for Pull Manufacturing Systems," IIE Transactions, 25 (1993): 71-79. [16] Inman, R.R., "Quantifying the Advantages of Production Quota Flexibility," working paper, Operating Sciences Department, General Motors Research and Environmental Staff, Warren, MI, 1992. [17] Karlin, S., "One Stage Inventory Models with Uncertainty," Chapter 8 in Studies in the Mathematical Theory of Inventory and Production, edited by K.J. Arrow, S. Karlin, and H. Scarf. Stanford University Press. Stanford, CA, (1958). [18] Mitra, D., and I. Mitrani, "Analysis of a Kanban Discipline for Cell Coordination in Production Lines," Management Science, 36 (1990): 1548-1566. [19] Monden, Y., Toyota Production System: Practical Approach to Management, Industrial Engineering and Management Press, Norcross, GA, 1983. [20] Ohno, T., Toyota Prodution System: Beyond Large Scale Production, Productivity Press, Cambridge, Mass, 1988. [21] Ross, S., Introduction to Stochastic Dynamic Programming, Academic Press, New York, 1983 [22] Schonberger, R.J., World Class Manufacturing: The Lessons of Simplicity Applied, The Free Press, New York, 1986. [23] Wang, H., and Wang H.B., "Determining the Number of Kanbans: a Step Toward NonStock Production," International Journal of Production Research, 28 (1990): 125-147. 30

3 90Zheng, Y.S5 04735 0205 [241 Zheng, Y.S.'Optimal Control Policy for Stochastic Inventory Systems with Markovian Discount Opportunities," to appear in Operations Research (1992). [25] Zheng, Y.S., and A. Federgruen, "Finding Optimal (s,S) Policy is about as Simple as Evaluating a Single Policy," Operations Research 39 (1991), 654-665. 31