The Continuous Review (s,S) Policy for Production/Inventory Systems with Poisson Demands and Arbitrary Processing Times Hyo-Seong Lee Mandyam M. Sinivasan Department of Industrial and Operations Engineering The University of Michigan Ann Arbor, MI 48109-2117 Technical Report 87-33 December 1987 (Revised March 1989)

Abstract A production / inventory system is considered, in which a single production facility is engaged in producing items that are held in inventory. The inventory of items is thus replenished one item at a time, and the processing time required to produce an item is assumed to follow an arbitrary distribution. The demand for this item occurs according to a Poisson process. An (s,S) policy is considered in which the production stops at the instant that the inventory level is raised to S and production begins again at the instant that the inventory level drops to s. Under a cost structure that includes a set-up cost, a linear holding cost, and a linear backorder cost, an expression for the expected cost per unit time is obtained for given control values. Using convexity and unimodality properties of the cost functions, an extremely simple and efficient procedure to find the optimal (s,S) policy is presented

1. Introduction In this paper, we consider the optimal control strategy for a production/inventory system wherein a single production facility produces items of a given type. The demand for this item is assumed to arrive according to a Poisson process with rate X. The processing time for producing (replenishing) an item is assumed to be an independent, identically distributed, random variable U which follows an arbitrary distribution. For stability of the system, we assume that XE(U)<1 and E(U2)<oo. If an item is demanded, it is supplied directly from the inventory, if it is available. If the item is not available, it is backordered. We assume that i) inventory holding costs are incurred linearly over time with respect to the inventory level ii) backorder costs are incurred linearly over time with respect to the backorder level and iii) a set-up cost is incurred each time the production facility is turned on. The objective is to find a continuous review production/inventory policy to minimize the expected cost per unit time. The operating policy considered in this paper is an (s,S) policy. Such policies are known to be effective in a variety of situations (Bell 1971, Sobel 1969, Veinott 1966, Veinott 1967). The characteristics of the policy considered in this paper are now described. As soon as the inventory level reaches a prespecified value S, the production facility is turned off and a non-production period begins. During the non-production period, the inventory level is continuously monitored to determine whether the inventory level has reached a prespecified value s or not. At the instant the inventory level drops to s, the non-production period ends and production begins immediately. During the production period, the inventory is replenishc l on an item-by-item basis while the demand continues to be made on these items. When the inventory level is raised to S, the production period ends and the next non-production period begins, initiating another cycle in the (s,S) policy system (refer to figure 1). There is considerable work on (s,S) inventory policies. Such policies are studied by Beckmann (1961), Johnson (1966), Veinott and Wagner (1965), Veinott (1967), Archibald and Silver (1978), and Sahin (1979), to name but a few. However, most of the (s,S) policies present in the literature assume that any amount of inventory can be replenished all at once. Our model is different in the sense that the inventory can be replenished only on an item-by-item basis. Since the replenishment in our model is made on an item-by-item basis, it is easy to note that our model has an analogy with queueing systems. In fact, the work on control of queueing systems, for instance, the papers by Heyman (1968), Yadin and Naor (1963), Bell (1971), Sobel (1969), and Lee and Srinivasan (1989), have a close relationship with production/inventory 1

systems. Using this inventory-queueing analogy, some studies have been done on (s,S) policies for the production/inventory systems in which the inventory is replenished item-by-item. N: non-production period P:production period S <^^ ------ ^|<-^<-1 ---—!!!-1_1 — 1iu le i ic w it N P N P N P Figure 1. The Continuous Review (s,S) Policy with Poisson Demands and Arbitrary Processing Times Heyman (1968) considers the operating policy for an M/G/1 queueing system which is a special case of the (s,S) policy where the upper control limit S is set to zero. Gavish and Graves (1980) consider an (s,S) continuous review production/inventory system with unit Poisson demand arrivals and deterministic processing times. They develop a procedure to find the steady state probability of each inventory level using an underlying M/D/1 queueing system. To obtain the optimal control values, they use a search procedure in which some properties for cost functions are effectively exploited. In a subsequent paper, Gavish and Graves (1981) extend the analysis to consider general processing times. Although their search procedure is efficient, they do not prove that the policy found by their algorithm is the optimal. Tijms (1980) considers a system with arbitrary processing times and finds the optimal control values by using a denumerable state SemiMarkov decision process. In his model, a start-up time is allowed which can model the time taken to turn on the production facility. Altiok (1986) considers a system where the demand follows a compound Poisson demand process. The service times in his model are assumed to be phase-type. He uses an underlying continuous-time Markov chain to obtain the steady state probability of each inventory level for a given policy. However, in this paper no special properties for the cost function are proved, that would assist in the search for the optimum. The (s,S) policy in this paper uses a fundamentally different approach from those used in the past. In addition, this paper also develops proofs of convexity and unimodality of certain functions, which leads to a very efficient algorithm to find the optimal policy. This paper considers, in fact, the same problem that was studied by Tijms except that a start-up time is not 2

considered in our model. However, as pointed out by Gavish and Graves (1980), the computational time using a semi-Markov decision process for solving production/inventory systems is much higher than that using an intelligent search technique. In the following sections, first an expression for the expected cost per unit time for given control values will be derived and following that an efficient procedure to find the optimal control values, s and S, will be presented. 2. The Analysis 2.1. Notation For analytical convenience, we set r = S-s, and throughout the paper we will use (r,S) instead of (s,S) to describe a policy. Thus, the (r,S) policy represents the policy with S-r as a lower control value and S as an upper control value. Let U U(.),U*(.) Q qj K Ch Cb CN(r,S) Cp(r,S) C(r,S) L(r,S) TC(r,S) = the processing time to produce a unit item, XE(U) < 1, = the c.d.f. and the Laplace Stieltjes Transform of U, respectively, = the number of demands which arrive during a processing time U, = Pr( Q=j), j=0,1,2,... = set-up cost, = holding cost / item / unit time, = back order cost / item / unit time, = expected cost during a non-production period with control values r and S, = expected cost during a production period with control values r and S, = CN(r,S) + Cp(r,S) = sum of expected holding and backorder costs during a cycle with control values r and S, = expected length of a cycle with control values r and S, = expected cost per unit time with control values r and S. 2.2. General Approach Our first objective in this paper is to obtain an expression for TC(r,S), the expected cost per unit time for given control values r and S. To obtain this, note that the epoch marking the start of a cycle forms a regeneration point. It follows that we have a renewal reward process, and thus from the renewal reward theorem (see, for example, Ross 1970), the expected cost per unit time, when r and S are used as control values, is obtained by 3

TC(r,S) C(r,S) + K (2.1) TC(r,S) -= rS) L(r,S) ( Since the term C(r,S) is the sum of CN(r,S) and Cp(r,S), to obtain TC(r,S) we need to find the terms CN(r,S) and Cp(r,S) as well as the term L(r,S). We now show how the terms CN(r,S), Cp(r,S) and L(r,S) are determined. 2.3. Computing the term CN(r,S) The expected cost during a non-production period is easily determined. Let gkc,-l denote the expected cost incurred from the epoch at which the inventory level becomes k to the epoch when the inventory level drops to k-1 during a non-production period. Then gk,k-l is given by gkk-1 = C k if k > 0, (2.2a) -b k, if k < 0. (2.2b) The expected cost during a non-production period when r and S are used as control values, is then expressed by S CN(r,S) = I gk,k-i. (2.3) k=S-r+l 2.4. Computing the term Cp(r,S) During the production period, the production completion epochs are the times at which the inventory is replenished. To compute Cp(r,S), we therefore restrict our attention only to these epochs. Let fij denote the expected cost from the epoch at which the inventory level reaches i, to the epoch at which the inventory level is raised to j (j2i) for the first time with fi,i=0 for any i. Then the expected total cost incurred during the production period, Cp(r,S), is just fs-r,s which, in turn, is expressed as (note that S-r = s) S-1 Cp(r,S) = fs-r,s = I fk,k+l- (2.4) k=S-r From equation (2.4), we see that the term Cp(r,S) can be determined if we can calculate each value of fkJk+1 Let Ek denote the expected cost incurred during a processing time that is initiated with k items in inventory. Then fk,k+l is expressed as 00 fk,k+l = Ek + qjfk+l-j,k+l. (2.5) j=l 4

In equation (2.5), the term Ek is the expected cost incurred during the time required to produce the first item following the initiation of production. During this time, j items are demanded with probability qj and this takes the inventory level to k+l-j at the end of the production period. The second term in equation (2.5) is thus the expected cost incurred from the end of that processing time until the time at which the inventory level is first raised to k+1. The qi values can, in general, be computed from the following expression: 00 qi = J ( e dU(t). (2.6) Remark: Note that the computation of the qi values is considerably simplified in many cases. For example, when the production times are of phase-type, the qi's can be obtained in closed form (Neuts 1981, page 59). Also, these computations are trivial for the case of deterministic processing times. To obtain Ek, we let hk denote the expected time that the kth item is held in inventory during a processing time. (It is implicit that this processing time is initiated with at least k items in inventory.) If the number of items demanded during a processing time is less than k, then the kth item will be held during this entire processing time. The term Ek is obtained from Lemma 2.1. A proof of Lemma 2.1 is given in Appendix A. Lemma 2.1 The term Ek is expressed recursively as Ek+i = Ek + hk+l(ch + Cb) - E(U)cb, (2.7) where 1 k hk+l = hk+ (1 - qj), with hk=O for k < 0. (2.8) X j=0 XE(U2) As an initial value for equation (2.7), we use Eo = 2 cb, which can be obtained from equation (A. 13) in appendix A by setting k = 0. 2.4.1. Computing the term fk,k+i We now obtain a recursive expression for the term fk,k+i. First, we develop an expression for the term fl1,o, which denotes the expected total cost incurred from the epoch at which the inventory level reaches -1 units till the time that the inventory level is first raised to 0. Note that this is 5

equivalent to the expected total cost incurred during the busy period in the M/G/1 queueing system where the waiting cost per customer is Cb. Thus, if we know the expected total waiting time for customers during a busy period in an M/G/1 queueing system then we can easily obtain fl,o. Lemma 2.2 gives this result. Since this is a well known result, we omit the proof (note that the expected number of customers served in a busy period is just 1/(l-p)). Lemma 2.2 Consider an M/G/1 queueing system with arrival rate X and service time U. Then, the expected total waiting time, WT, of all the customers during the busy period is given by Wt = 1 )[E(U2)+E(U)] (l-p) 2(l-p) From Lemma 2.2, f-1,o can be obtained simply as f 1,o = C [ XE(U2) + E(U)] cb (2.9) - W~cb - (l-p) 2(1-p) Define Afk = fkk+ - fk-i.k, (2. 10a) and AE = Ek - Ek- = hk(Ch + cb) - E(U)cb. (2.1 Ob) Note, from equations (2.8) and (2.10b), that 1 k-i AEk-AEk-1 = -( qj)(Ch + b), k>0, (2.10c) ij=o = 0, k< 0. (2.10d) The recursive expression for fk,k+l is obtained from Lemma 2.3. A proof of Lemma 2.3 is given in Appendix B. Lemma 2.3 The term fkk+ is obtained from the recursive expression ~1 ~k k E(U) fk,k+l= fk-,k + {Afk-1 + AEk- AEk- - qj Afk-j + (1 - qj) Cb}, k > 0, (2.11a) 40 j=i j=o (l-p) E(U) k < 0. (2.11 b) = fk-lk - Cb, k. (2. b) (l-p) 1 6

Let Tk = gk+l,k + fk,k+l- (2.12) Using equations (2.3), (2.4) and (2.12), C(r,S) is expressed as S-1 C(r,S) = CN(r,S) + Cp(r,S) = Tk. (2.13) k=S-r 2.5. Computing the terms L(r,S) and TC(r,S) The expected cycle time in the (r,S) system can be obtained by using the relationship between production/inventory systems and queueing systems. To obtain L(r,S), we make an important observation that the length of a production period in our (r,S) system is the convolution of r busy periods in an M/G/1 queueing system. Therefore, the expected length of a production period is directly obtained from the busy period analysis as r E(U)Since the expected length of a non(l-p) production period when r and S are used as control values is, the expected length of a cycle is given by L(r,S) = rE(U)+ L r (2.14) (l-p) X (l-p)? Hence, from equations (2.1), (2.13), and (2.14), s-1 Ztk + K TC(r,S) = (l-p) ks —r (2.15) r 3. The Optimal Control Values In order to find the optimal control values (r*,S*), a two-dimensional search over the integer parameter space must be made. Usually, the search for the optimal point in two dimensional space is not easy. However, if we exploit some properties of the cost functions and the recursive nature of gk+l,k and fk,k+l, we can find an extremely efficient search procedure. Let us denote the optimal S value for a given r by S*(r) and the optimal r value for a given S by r*(S). We now demonstrate some properties that are possessed by this system. These properties are used in devising the search procedure. 7

Theorem 3.1: Tk is convex with respect to k. Proof: Since Tk = gk+l,k + fk,k+l, Tk is convex if we can show that gk+l,k and fk,k+l are both convex. Note from equation (2.2), that the function gk+l,k - gk,k-l is a non-decreasing function with respect to k. This proves that gk+l,k is convex. To show convexity of fk,k+l, we need to prove that Afk - Afk-l > 0 for all k. We will prove this by induction on k. From Lemma 2.3, Afk E(U)cb k < 0, and so Afk - Afk- = O for all k < 0. (l-p) Suppose Afk - Afk1 2> 0 holds for k < n, for some n > 0. We now prove that Afk - Afk-1 > 0 for 00 k=n+l using the induction hypothesis. From equation (2.5), Afn = AEn + X qj (fn+l-j,n+l - fn-j,n) j=1 and so, after some algebra, Afn+l - Afn is expressed as (also refer equation (B.2)) 00 Afn+ - Afn = AEn+ - AEn + qj(Afn+l - Afn+-j). (3.1) j=1 Equation (3.1) can be rewritten as 1 00oo Afn+i - Afn = AEn+ - AEn + q( qj(Afn - Afn+l-j)} 3.2) 40 j=l 1 n From equation (2.10c), AEn+I - AEn =-(1 - ~ qj )(ch + cb) > 0 for all n. Also by the induction ij=0 hypothesis, Afn - Afn+l-j > 0 for all j > 1. Since qj 2 0 for j 2 0, we must have Afn+1 - Afn 2 0. So, by the principle of mathematical induction, Afk - Afk-l 2 0 for all k and hence fk,k+l is convex with respect to k. 1 Figure 2 gives one possible realization for the function Tk. Note that C(r,S*(r)) is nothing but a minimum value among all possible sums of r adjacent Tk's. In figure 2, for example, C(3,S*(3)) is just the f 12 and 3. Furthermore, since k is suconvex, C(4,S*(4)) in figure 2 can be obtained directly from C(3,S*(3)) as C(4,S*(4))=C(3,S*(3))+min{To,T4} = C(3,S*(3))+T4. Corollary 3.2 generalizes this result. 8

Figure 2. One realization for the function Tk C(r+l,S*(r+l)) = If XS*(r) < x If ts*(r) > TS C(r,S*(r)) + min {Ts* ),tS(r )-r-1}. (3.3) i*(r)-r-1, then S*(r+l) = S*(r) + 1. i*(r)-r-1, then S*(r+l)= S*(r). Proof: Corollary 3.2 is a direct consequence of Theorem 3.1. Theorem 3.3 presents an important characteristic of the cost function which is very useful in the search procedure, and in guaranteeing optimality. Theorem 3.3: TC(r,S*(r)) is unimodal with respect to r. Proof: In order to prove the unimodality of TC(r,S*(r)), it is sufficient to show the following: If TC(n,S*(n)) then, TC(n+l,S*(n+l)) < TC(n+l,S*(n+l)), < TC(n+2,S*(n+2)). (3.4) The term TC(n,S*(n)) can be expressed as 9

TnS(n)(K + P(n)) (1-p)k i+n-1 TC(n,S*(n)) = (K + 5(n)) (, where P(n)= min{ ( tk }) n i k=i Using this in (3.4), observe that in order to show the unimodality of TC(r,S*(r)), we only need to show(K + t(n)) (K + (n+ 1)) (K + P(n+1)) (K + f(n+2)) show that if n — < -- - nl,then < n2 * n n+' n+1 n+2 Let 8k = 13(k) - 3(k-l) and c = K+3(n). To show unimodality, we then need to show that if C c + 6n+l c + Sn+l C + 8n+l + Sn+2 n n+l' n+l n+2 (3.5) But from the fact that 8+1 < 6n+2, relationship (3.5) is obvious. We now demonstrate two properties that the control limits possess, which assist in restricting the search. Theorem 3.4 first indicates that the optimal production quantity per cycle is bounded from below by the threshold value obtained from Heyman's N-policy formula (1968). A proof of Theorem 3.4 is given in Appendix C. Theorem 3.4: If we denote the value of r in the optimal policy by r*, then r* > r*(0), where r*(0) is one of the neighboring integers of 2(-p)K Cb Theorem 3.5: S*(r) > forany r 1. Proof Consider a policy (r,S) where S<0. Clearly the policy (r,0) always dominates this policy since only non-positive inventory levels exist for these two policies and the expected shortage level of policy (r,0) is always less than that of policy (r,S) by -S. Thus, S*(r) cannot be negative for any r. 1 We use these properties as follows. First note, from Corollary 3.2, that once we find an optimal control value, S*(r), for some r, say r=k, then we can find an optimal control value for r=k+l very quickly. In our algorithm, we make use of this fact and start with r=l, that is, we find 10

C(1,S*(1)) first. To find C(1,S*(1)), noting that Tk is convex in k, we just search for the minimum value of Tk. Since Tk > T-1 for k < -1, the search starts from k = -1, and hence we compute Tk from k=-l up to the point at which the value of Tk is first increased. Thus, if min{Tk}=q, the value of S*(1) can be obtained as q+l. The cost function TC(1,S*(1)) may then k be obtained from C(l,S*(1)) by using equation (2.1). (However, notice that by Theorem 3.4, TC(r,S*(r)) need not be calculated for r <L 2( J)K where LPj is the largest integer that Cb does not exceed p.) Once C(1,S*(1)) is found, we can find C(r,S*(r)) and, therefore, TC(r,S*(r)), sequentially in the order r=2,3,4,,*-. Note that the values of Tk that are evaluated in order to obtain C(r,S*(r)) for r=l, will also be used to obtain C(r,S*(r)) for r > 1. Since TC(r,S*(r)) is unimodal with respect to r, we can now develop an extremely simple search procedure: find C(r,S*(r)) for r=l,..., L 2(- KJ as indicated above. Following this, Cb we increase r by 1 and compute C(r,S*(r)) and TC(r,S*(r)) for this new r, repeating this process until TC(r,S*(r)) is first increased. Then, by Theorem 3.3, the local minimum point obtained in the previous step is a global minimum and the optimal control values, r* and S*, are obtained. The algorithm to find the optimal control values is described below: Algorithm to find TC(r*,S*) 1. Determine n*=L 2(1-p)K Cb 2. Calculate C(r,S*(r)) for r = 1,...,n*, and obtain TC(r,S*(r)) for r = n* 3. Set r = r+l and calculate C(r,S*(r)) and TC(r,S*(r)). If TC(r,S*(r)) > TC(r-l,S*(r-l)), then return the optimal policy as (r-l,S*(r-l)). Otherwise, repeat step 3. This algorithm is simple and efficient: only one new Tk needs to be calculated each time that r is incremented. Moreover, due to the recursive nature of fkk+l, tk is computed very quickly from k-l for k > 0, or tk+l for k < 0 (either of which would have been obtained at the previous step). In this algorithm, most of the computational effort to obtain the optimal control values is spent on calculating values of tk for k > 0 (see remark in Section 2.4). Suppose we know the optimal control values (r*,S*) beforehand and that we just need to calculate TC(r*,S*). In this case, we 11

must calculate Tk for S*-r*+l < k < S*. On the other hand, suppose we need to find both the optimal policy and its cost using this algorithm. In this case, it may be observed that at most only two additional values of tk need to be calculated as compared to the case when the optimal control values are known. This fact demonstrates the efficiency of our algorithm: very little calculation is wasted on computing the points other than those required to obtain the optimal control values. In fact, this algorithm finds the optimal solution in almost one shot. 4. Numerical Examples We now present some numerical examples to illustrate this technique. In the first example, the processing time is assumed to follow a special distribution, which is encountered frequently in manufacturing situations. The processing time has a deterministic value to if the production facility does not fail during the processing time. However, if the production facility fails during the processing time, which is assumed to occur with probability p, the processing time becomes to plus a repair time R which follows an exponential distribution with parameter g. Other parameter values are given as follows: x = 0.15, K = 500, ch = 2, cb = 10. U = to with probability l-p, = to + R with probability p, where p=0.02, to=5 and g=0.05. In the second example, the processing time is assumed to follow a uniform distribution on [2,4]. Other parameter values are given as X= 0.1, K=3000, ch=2, cb = 20. The results of the policy comparisons for these examples are presented in tables 1 and 2. In these tables, the values of r, s*(r), S*(r) and TC(r,S*(r)) are shown for each value of r. Although only two distributions for the processing time are demonstrated in the examples, many other distributions can be easily implemented if their Lapace-Stieltjes transform functions are well diffenrentiable. 5. Conclusions We have considered the (s,S) control policy for production/inventory systems. We obtained an expression for the expected cost per unit time for given values s and S and using a convexity property of the function Tk, and the unimodality property of the cost function TC(r,S*(r)), we have developed an extremely efficient procedure to find the stationary optimal (s,S) policy. 12

Table 1. Result of example 1 riS-s *(r) S*(r) TC(r,S*(r)) 1 5 6 29.8176 2 5 7 22.7503 3 4 7 20.4731 4 4 8 19.3938 5 3 8 18.8947 6 3 9 18.5638 7@ 3 10 18.4672 8 2 10 18.5041 9 2 11 18.5643 10 2 12 18.7432 (@ indicates the optimal policy) Table 2. Result of example 2 r=S-s s*(r) S*(r) TC(r,S*(r)) 10 -1 9 30.2455 11 -1 10 29.2474 12 -1 11 28.5824 13 -1 12 28.1735 14 -1 13 27.9658 15 -1 14 27.9192 16@ -2 14 27.8826 17 -2 15 27.9640 18 -2 16 28.1475 19 -2 17 28.4169 20 -2 18 28.7594 (@ indicates the optimal policy) 13

Appendix A Proof of Lemma 2.1 Lemma 2.1 The term Ek is obtained recursively as Ek+l = Ek + hk+l(Ch + b) - E(U)cb, (A. 1) where 1 k hk+l = hk+-(1 - qqj), with hk = for k < 0. (A.2) x j=0 Proof Let Ui be the length of a processing time given that i demands arrived during that processing time and let ui be E[Ui]. If we apply Bayes' formula, ui is expressed as 00 ui = E[ Ui] = (dU) (A.4) From equations (A.4) and (2.6), after some algebraic manipulation we obtain: i+l - x' ai+ = 9 (A.5) During a processing time, demands continue to occur and these demands are met from the inventory on hand if any. Let Ii denote the item in inventory which is used to satisfy the ith occurence of a demand during a processing time. The term hi is the expected amount of time that Ii is held in inventory during a processing time. If the number of demands which arrived during the processing time is less than i, say j, the Ii will not be used and will be held during the entire processing time Uj. In that case, the expected holding time of Ii is uj. On the other hand, if the number of demands during the processing time is more than i, say j, then Ii will be used when the ith demand arrives. Also note that given j arrivals during U, the joint distribution of these arrival epochs have the same distribution as the order statistics of j independent random variables uniformly distributed on [0, Uj] (see, for example, Ross 1970). Hence the expected holding time of Ii when the number of demands, j, during a processing time is greater than i, is expressed as i _ j+l Uj. Consequently hi is represented as i-i ooXhi = qjuj + i + i=1,2,. (A.6) j=0 j=i j+l i i { j qj + i (1- qj) ), i=1,2,9, (A.7) X j=1 j=O 14

where the second equality follows from equation (A.5). From equation (A.7), we observe that hk+l = 1 k 1 hk +-(1- ~ qj), for kO, with hk=O for k<O. X j=O (A.8) Let the expected time from the instant the ith demand arrives during a processing time onwards until the instant that the processing is completed be bi. Let Hk denote the sum of the expected holding time during a processing time that is initiated with k items in inventory. Correspondingly, let Bk denote the sum of the expected backorder time during a processing time that is initiated with k items in inventory. Then we have Hk k = hi, i=l and Bk 00 = bi. i=k+l (A.9) Since hk can be obtained from equation (A.8), Hk is easily obtained. To obtain Bk, note first that since the sum of bi and hi is an expected processing time, bi is given by = E(U)- hi, i=1,2,.... (A. 10) Also note that for i = 0, both ho and bo are equal to 0. From the renewal theorem, the expected total time from the instant each demand arrives until the instant that processing is completed is 00 bi = i=0 XE(U2) 2 (A.11) From equation (A.9), (A.10) and (A. 11), Bk is expressed as Bk 00 = bii=O k ~ bi i=O -kE( U) - = 2 - {kE(U)-Hk}. (A. 12) Now note that the term Ek consists of two components: the expected holding cost, chHk, and the expected backorder cost, cbBk, during a processing time. So, from equation (A.12), Ek = chHk+ CbBk Hk(h + 2) Hk(Ch + Cb) + Cb{ 2 -kE(U) }. (A.13) From equation (A.13), Ek+l is given by Ek+l = Ek + hk+l(ch + Cb) - E(U)cb. (A.14) Since hk+i = 0 for k < 0, from equation (A.15) we get Ek = Ek+l + E(U)cb, k<0.. 15

Proof of Lemma 2.3 Lemma 2.3 The term fk,k+l is obtained from the recursive expression 1 k k E(U) fk,k+l = fk-l,k + - {Afk-1 + AEk -AEk- - ~ qj Afk-j + (1 - qj) Cb, k > 0, q Oj= j=o (l-p) E(U) = fk-l,k - b, k<0. (l-p) Proof From equation (2.5), Afk = AEk + qj (fk+l-j,k+l - fk-j,k) (B.1) j=1 From equation (B. 1), after some algebra we get Afk- Afk-i = AEk- AEk- + qj (Afk - Afkj). (B.2) j=l The last term in equation (B.2) consists of infinite terms. However, as shown below, we can express Afk - Afk-1 without these infinite terms. Let Dk denote the time period from the epoch when the inventory level reaches k to the epoch when the inventory level is raised to k+1 for the first time. Note that the length of Dk is equivalent to one busy period in an M/G/1 queueing system, hence, from the well known busy period E(U) analysis, the expected length of Dk is ) where p=XE(U). (l-p) Comparing the inventory levels during Dk and Dk+l, we observe that the inventory level during the period Dk follows the same stochastic path as the inventory level during the period Dk+l if one item of inventory is added to the inventory level during Dk throughout this period. Consequently, if k < 0, then the inventory level during Dk has only one more shortage than the inventory level during Dk+l on the average. Thus, Afk for k < 0 is simply E(U) Afk = fck+l -fk-l,k = b, k~ 0. (B.3) From equation (B.2), collecting Afk terms, 16

00 qo fk = Afk- + AEEk -Ek - i qj Afk-j. (B.4) j=l For k > 0, we can express equation (B.4) as k oo q Afk = Afk- + AEk - AEk- - qj Afk-j - qj Afkj. j=1 j=k+i Applying equation (B.3) to the last term on the right hand side of the above equation, 1 k k E(U) Afk -{Afk- + AEk- AEk-l - qj Afk-j + ( - qjb k>0. (B.5) qo0 ji j=o (l-p) where the term AEk - AEk-l is given by equation (2.10c). From equation (B.5), the term Afk for k>O, can be computed recursively using the initial value =- E(U) fo = -) Cb. Since f-,o0 can be obtained explicitly from Lemma 2.2, we can now obtain all the (l-p) fkk+i terms. Noting that Afk = fk,k+i - fk-i,k, we get the desired result * 17

Appendix C Proof of Theorem 3.4 Theorem 3.4 If we denote the value of r in the optimal policy by r*, then r* 2 r*(0), where r*(O) is one of the neighboring integers of 2 2(l-p) K Cb Proof: We need to show that for O < n < r*(0), TC(r*(O),S*(r*(O)) < TC(n,S*(n)). By definition, we have TC(r*(O),O) < TC(n,O). Set -1 i) A = Y'rk, k=-r*(O) -1 i+r*(O)-l ii) a(n)= X Tk, iii) B = min{, Xk }, and iv) P(n)= min{ k=-n i k=i i i+n-1 IZk }. k=i Then, using equation (2.21), the problem can be restated as follows. For 0 < n < r*(0), given (K+A) r7(O) (K+a(n)) n show that (K+B) r*(O) (K+P(n)) n This statement, in turn, can be restated as follows. For 0 < n < r*(0), given K K n r*(0) A a(n) > (A(n) showthat r*(O) n K K n r (O) B r (0) P(n) n So it is enough to show that A a(n) r*(O)- n B r (0) o(n,) equivalen,th A-B nor, equivalently, that n or(0) a(n)-P(n) n i+r*(O)-l To show this, let m = arg min{ Xk }, and define SA = { Tk: -r*(O) < k < -1 }, and SB = i k=i { Tk:m < k < m + r*(0) -1 }. Thus, note that'XTk = A and ke SA ~ Tk = B. keSB Let ai be the ith minimum value in the set {tk: TkC SA) and similarly bi be the ith minimum value in the set {Tk: TkE SB). Note that, by definition, ai 2 bi, for 1< i< r*(O). Figure C. 1 shows one realization when r*(O)=4. 18

SA SB Figure C.1. One realization for the function Tk when r*(0)=4 Also, from convexity of the T's, note that -B > (n)-n) if the following inequality holds: ai - bi > ai- - bi-1, for < i < r*(0). (C. 1) A-B a(n)-B(n) Thus, in order to prove that — > (n-, we have only to show that the inequality given by r(O) n' (C. 1) holds. For this, we consider the following four cases: Case 1 bi-l, bi E SA: Suppose bi = aq. Then, bi-I = aq-l holds in this case. From this relationship together with the convexity of Tk, inequality (C. 1) is proved. Case 2 bi E SA, bi-1 ~ SA: Suppose bi = Tq. Then, Tq+l < bi-i holds. From this together with the convexity of Tk, we can show that ai - bi > ai-1 - tq+l > ai-1 - bi-1. Case 3 bi e SA, bi-1 E SA: Let min { Tk:Tk e SA, Tk > bi } = Tq. Then the following relationships hold: i) ai - bi > ai - Tq, ii) ai- - tq+l 2 ai- - Tq+l (since ai = Tq), and iii) bi-1 = Tq+l. (C.2) From (C.2), we have ai - bi > ai-l - Tq+l = ai- - bi-1. Case 4 bi e SA, bi-1 o SA: Let min { Tk:Tk E SA, Tk > bi } = Tq. Then the following relationships hold: i) ai - bi > ai - Tq, and ii) ai-1 - bi-1 < ai-1 - Tq+l (since Tq+l<bi-l). (C.3) From (C.3) together with the convexity of Tk, we have: ai - bi > ai - Tq > ai-1 - Tq+l > ai-l - bi-i.. 19

References Altiok, T., "(R,r) Production/Inventory System," Technical Report, Industrial and Systems Engineering (1986), Rutgers University. Archibald, B.C. and. Silver, E.A., "(s,S) Policies under Continuous Review and Discrete Compound Demand," Mgmt. Sci., v 24, pp. 899-909. Beckmann, M.(1961), "An Inventory Model for Arbitrary Interval and Quantity Distributions of Demands," Mgmt. Sci., v 8 (1978), pp. 35-57. Bell, C.E., "Characterization and Computation of Optimal Policies for Operating an M/G/1 Queueing System with Removable Server," Opns. Res., v 19 (1971), pp. 208-218. Gavish, B. and Graves, S.C., "A One Product Production/Inventory Problem Under Continuous Review Policy," Opns. Res., v 28 (1980), pp. 1228-1236. Gavish, B. and Graves, S.C., "Production/Inventory Systems with a Stochastic Production Rate under a Continuous Review Policy," Comp. & Opns. Res., v 8, no 3 (1981), pp. 169-183. Heyman, D.P., "Optimal Operating Policies for M/G/1 Queueing Systems," Opns. Res., v 16 (1968), pp. 362-382. Johnson, E.L., "Optimality and Computation of (s,S) Inventories, New Conditions and a New Proof," SIAM J. Appl. Math., v 14 (1966), pp. 1067-1083. Lee, H.S. and Srinivasan, M.M., " Control Policies for the MX/G/1 Queueing System," To appear in Mgmt. Sci., June 1989. Ross, S.M., Applied Probability Models with Optimization Applications, Holden-Day, Inc., 1970. Sahin, I., "On the Stationary Analysis of Continuous Review (s,S) Inventory Systems with Constant Leadtimes," Opns. Res., v 27, (1979), pp. 717-729. Sobel, M.J., "Optimal Average-Cost Policy for a Queue with Start-up and Shut-down Costs," Opns. Res., v 17, (1969), pp. 145-162. Tijms, H.C., "An Algorithm for Denumerable State Semi-Markov Decision Problems with Application to Controlled Production and Queueing Systems," In Recent Developments in Markov Decision Theory, ed. R. Hartley et al., Academic Press, New york (1980), pp. 143179. Veinott, A.F., Jr.,"The Status of Mathematical Inventory Theory," Mgmt. Sci. v 12 (1966), pp. 745-777., "On the Optimality of (s,S) Policies in Multi-item Infinite Horizon Inventory Problems," Mgmt. Sci., v 13 (1967), pp. 475-491. and Wagner, H.M., "Computing Optimal (s,S) Inventory Policies," Mgmt. Sci. v 11 (1965), pp. 525-552. Yadin, M. and Naor, P., "Queueing Systems with a Removable Service Station," Opns. Res. Quart., v 14 (1963), pp. 393-405. 20