Division of Research
Graduate School of Business Administration
The University of Michigan
November 1980

STATISTICAL ESTIMATION OF A PIECEWISE LINEAR BARRIER FOR BROWNIAN MOTION BASED ON FIRST PASSAGE TIMES

Working Paper No. 240

Clifford Ball
The University of Michigan

FOR DISCUSSION PURPOSES ONLY: None of this material is to be quoted or reproduced without the express permission of the Division of Research.

Section 1. Introduction

Psychologists are very interested in examining the processes that govern the body's reaction to stimulus. One theory, developed by Grice [7], postulates the existence of a growth function which develops after the onset of stimulus. A response is evoked when the strength of the stimulus attains a specific level, or criterion. Grice argues that the criterion is random in nature and, in his "variable criterion" theory, assumes it to be normally distributed. In light of this, we consider the growth function to act as a barrier and we model the criterion by a stochastic process. The reaction time is viewed as the first passage time of the stochastic process to the barrier. We are interested in estimating the barrier on the basis of reaction time data. The following technique is statistically equivalent to a procedure adopted by Emerson [5].

Assume the criterion $X(t)$, $t \geq 0$, to be a Brownian motion process, i.e., a zero-mean separable Gaussian process with $P[X(0) = 0] = 1$ and $E[X(t_1)X(t_2)] = \min(t_1, t_2)$. Let the barrier be a straight line with intercept $-a$, $a > 0$, and slope $\mu > 0$. By the method of images (see Cox and Miller [3], pp. 220-223), the density of the first passage time, $g(t; a, \mu)$, is shown to be the inverse Gaussian density

(1.1)  $g(t; a, \mu) = a(2\pi t^3)^{-1/2} \exp\left(-(a - \mu t)^2 / 2t\right)$ for $t > 0$, and $0$ otherwise.

Given a sample of first passage times $t_1, t_2, \dots, t_N$ with mean $\bar{t}$, the maximum likelihood estimates of $a$ and $\mu$ are, respectively,

(1.2)  $\hat{a} = \left[ N^{-1} \sum_{i=1}^{N} \left( t_i^{-1} - \bar{t}^{\,-1} \right) \right]^{-1/2}$

and

(1.3)  $\hat{\mu} = \hat{a} N \Big/ \sum_{i=1}^{N} t_i = \hat{a} / \bar{t}$.

The object of this paper is to generalize the estimation procedure to a continuous piecewise linear barrier while maintaining the Brownian motion criterion assumption. With this generalization, only an integral representation for the first passage time distribution can be derived. A computer algorithm, which provides maximum likelihood estimates (m.l.e.'s) of the barrier on the basis of first passage times, is described in Section 3. The likelihood equations are solved numerically, and conditions ensuring a solution are discussed in Section 4. The asymptotic statistical properties of the resultant estimators are examined analytically in Section 5. In Section 6, computer experiments are performed to establish the small and moderate sample size properties of the estimators. Finally, the estimation procedure is applied to the problem in psychology which was the initial impetus for this work.
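As a concrete illustration, the one-stage estimates (1.2) and (1.3) are simple to compute from a sample of first passage times. The sketch below is my own and not from the paper; the function name and sample data are illustrative only:

```python
def one_stage_mle(times):
    """Maximum likelihood estimates (1.2)-(1.3) for the straight-line barrier.

    The first passage time to the line -a + mu*t is inverse Gaussian, so a-hat
    is the reciprocal square root of the mean excess reciprocal passage time,
    and mu-hat = a-hat / t-bar.
    """
    n = len(times)
    t_bar = sum(times) / n
    # (1.2): a-hat = [N^{-1} * sum(1/t_i - 1/t_bar)]^{-1/2}
    a_hat = (sum(1.0 / t - 1.0 / t_bar for t in times) / n) ** -0.5
    # (1.3): mu-hat = a-hat * N / sum(t_i) = a-hat / t_bar
    mu_hat = a_hat / t_bar
    return a_hat, mu_hat

# Hypothetical sample of passage times:
a_hat, mu_hat = one_stage_mle([2.0, 3.0, 4.0])
```

For this particular sample the estimates come out exactly, since $\bar{t} = 3$ and $N^{-1}\sum (t_i^{-1} - \bar{t}^{\,-1}) = 1/36$, giving $\hat{a} = 6$ and $\hat{\mu} = 2$.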

Section 2. Definition of the Problem

We consider the first passage time distribution of Brownian motion to a continuous piecewise linear barrier with intercept $-a$, $a > 0$, and $M$ stages where, at the $k$th stage between time points $T_{k-1}$ and $T_k$, the slope of the barrier is $\mu_k$. For notational convenience, let $h_k = T_k - T_{k-1}$, $k = 1, 2, \dots, M-1$; $C_J = -a + \sum_{i=1}^{J} \mu_i h_i$, $J = 1, 2, \dots, M-1$; and $T_0 = 0$.

Theorem 2.1

The density of the first passage time, $t$ time units into the $J$th stage of the barrier, $g_J(t; a, \mu)$, where $\mu = (\mu_1, \mu_2, \dots, \mu_M)$, is given by

(2.1)  $g_J(t; a, \mu) = \int_{[y^{J-1} > 0]} G_J(t; y_{J-1}) \prod_{k=1}^{J-1} F_k(y_k, y_{k-1}) \, dy^{J-1}$

for $0 < t < h_J$, $J = 2, 3, \dots, M$, and $g_1(t; a, \mu) = G_1(t; y_0)$ for $0 < t < h_1$. Notationally, $y^{J-1} = (y_1, y_2, \dots, y_{J-1})$, $dy^{J-1} = dy_1 \, dy_2 \cdots dy_{J-1}$, $[y^{J-1} > 0] = \{ y_1 > 0, y_2 > 0, \dots, y_{J-1} > 0 \}$,

$F_k(y_k, y_{k-1}) = (2\pi h_k)^{-1/2} \exp\left( -(y_k - y_{k-1} + \mu_k h_k)^2 / 2h_k \right) \left( 1 - \exp(-2 y_k y_{k-1} / h_k) \right)$,

and

$G_J(t; y_{J-1}) = y_{J-1} (2\pi t^3)^{-1/2} \exp\left( -(y_{J-1} - \mu_J t)^2 / 2t \right)$

for $k = 1, 2, \dots, M$ and $J = 2, 3, \dots, M$. Furthermore, we make the restrictions $t > 0$; $y_J > 0$, $J = 1, 2, \dots, M$; and $y_0 = a$ throughout.

Proof

First, the result that $g_1(t; a, \mu) = G_1(t; y_0)$ is merely a restatement of equation (1.1). Letting $T$ denote the random first passage time, we see that

(2.2)  $P[X(T_1) < y_1] = P[X(T_1) < y_1, T > T_1] + P[X(T_1) < y_1, T \leq T_1]$

for any $y_1$. Now, by the strong Markov property of Brownian motion,

(2.3)  $\frac{\partial}{\partial y_1} P[X(T_1) < y_1 \mid T = t] = (2\pi (T_1 - t))^{-1/2} \exp\left( -(y_1 + a - \mu_1 t)^2 / 2(T_1 - t) \right)$

for $y_1 > C_1$. Also, by the properties of conditional probability,

(2.4)  $\frac{\partial}{\partial y_1} P[X(T_1) < y_1, T \leq T_1] = \int_0^{T_1} g_1(t; a, \mu) \, \frac{\partial}{\partial y_1} P[X(T_1) < y_1 \mid T = t] \, dt$.

By differentiating (2.2) with respect to $y_1$ and using (2.3) and (2.4), it can be shown with some algebraic manipulation that

(2.5)  $\frac{\partial}{\partial y_1} P[X(T_1) < y_1 + C_1, T > T_1] = F_1(y_1, y_0)$ for $y_1 > 0$.

We now have the distribution of $X(T_1)$ on the event $T > T_1$.

By stopping the Brownian motion at $T_1$ and using its Markovian nature to continue to the second stage of the barrier, it is seen that

(2.6)  $g_2(t; a, \mu) = \int_0^{\infty} G_2(t; y_1) F_1(y_1, y_0) \, dy_1$.

The proof of the theorem for the $M$-stage problem is established inductively using similar arguments. □
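Since each kernel in Theorem 2.1 links only consecutive levels, the representation (2.6) is easy to evaluate numerically. The sketch below is my own illustration, not the paper's Fortran implementation, and uses the image-method kernels with the convention that the gap above the barrier drifts at rate $-\mu_k$. For a straight barrier ($\mu_1 = \mu_2$) the result must agree with the one-stage inverse Gaussian density (1.1) evaluated $h_1 + t$ time units after onset, which gives a built-in check:

```python
import math

def F1(y1, y0, mu1, h1):
    # Transition kernel F_1(y_1, y_0): density of sitting y_1 above the
    # barrier at T_1 with no passage during stage 1 (method of images).
    return ((2 * math.pi * h1) ** -0.5
            * math.exp(-(y1 - y0 + mu1 * h1) ** 2 / (2 * h1))
            * (1.0 - math.exp(-2.0 * y1 * y0 / h1)))

def G2(t, y1, mu2):
    # G_2(t; y_1): first passage density from level y_1 above a barrier
    # segment of slope mu_2, i.e., an inverse Gaussian density.
    return y1 * (2 * math.pi * t ** 3) ** -0.5 * math.exp(-(y1 - mu2 * t) ** 2 / (2 * t))

def g2(t, a, mu1, mu2, h1, upper=20.0, n=2000):
    # Equation (2.6): integrate G_2(t; y) F_1(y, a) over y > 0 by a repeated
    # Simpson's rule; `upper` truncates the Gaussian-decaying tail.
    h = upper / n
    total = 0.0
    for i in range(n + 1):
        y = i * h
        w = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += w * G2(t, y, mu2) * F1(y, a, mu1, h1)
    return total * h / 3.0
```

With $a = 3$, $\mu_1 = \mu_2 = 1$, and $h_1 = 1$, the value `g2(0.5, 3.0, 1.0, 1.0, 1.0)` reproduces $g(1.5; 3, 1)$ from (1.1) to several decimal places.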

Section 3: The Statistical Estimation Technique and Its Numerical Implementation

Suppose that the reaction time data have the following form: $n_1$ observations, $t_{1,1}, t_{1,2}, \dots, t_{1,n_1}$ time units into stage 1; $n_2$ observations, $t_{2,1}, t_{2,2}, \dots, t_{2,n_2}$ time units into stage 2; and so on, up to $n_M$ observations, $t_{M,1}, t_{M,2}, \dots, t_{M,n_M}$ time units into stage $M$. We shall assume $a$ is known and seek the m.l.e. $\hat{\mu}$ of $\mu$. A method of estimating $a$ is discussed in Sections 6 and 7. Denote the log-likelihood of the sample by

(3.1)  $\ln L(\mu) = \sum_{J=1}^{M} \sum_{j=1}^{n_J} \ln\left( g_J(t_{J,j}; a, \mu) \right)$.

For a maximum of $\ln L(\mu)$, the likelihood equations must be satisfied; i.e.,

(3.2)  $\frac{\partial \ln L(\mu)}{\partial \mu_i} = 0$ for $i = 1, 2, \dots, M$.

Conditions ensuring that a global maximum of $\ln L(\mu)$ is obtained are examined in Section 4. Integral representations for the partial derivatives of the first passage time density, with respect to $\mu$, are now presented. This theorem, coupled with (3.1), provides an integral representation for the likelihood equations (3.2).

Theorem 3.1

The notation is that of Theorem 2.1. For $J = 1$,

$\frac{\partial \ln g_1(t; a, \mu)}{\partial \mu_i} = a - \mu_1 t$ if $i = 1$, and $0$ otherwise.

For $J = 2, 3, \dots, M$,

$\frac{\partial \ln g_J(t; a, \mu)}{\partial \mu_i} = \left( \int_{[y^{J-1} > 0]} G_J^{(i)}(t; y_{J-1}) \prod_{k=1}^{J-1} F_k^{(i)}(y_k, y_{k-1}) \, dy^{J-1} \right) \Big/ g_J(t; a, \mu)$ if $i \leq J$, and $0$ otherwise,

where, for $k = 1, 2, \dots, J-1$,

$F_k^{(i)}(y_k, y_{k-1}) = \begin{cases} -(y_k - y_{k-1} + \mu_k h_k) \, F_k(y_k, y_{k-1}) & \text{if } i = k \\ F_k(y_k, y_{k-1}) & \text{if } i \neq k \end{cases}$

and

$G_J^{(i)}(t; y_{J-1}) = \begin{cases} (y_{J-1} - \mu_J t) \, G_J(t; y_{J-1}) & \text{if } i = J \\ G_J(t; y_{J-1}) & \text{if } i \neq J. \end{cases}$

Proof

For $J = 1$, the proof follows by elementary differentiation of $g_1(t; a, \mu)$. For $J = 2, 3, \dots, M$, the dominated convergence theorem is applied to allow differentiation through the integral. Examination of the derivative of the integrand, $G_J(t; y_{J-1}) \prod_{k=1}^{J-1} F_k(y_k, y_{k-1})$, yields the proof of the theorem. □

The above representation permits the development of a computer algorithm to generate the likelihood equations. The algorithm

proceeds to solve the likelihood equations numerically to produce the m.l.e.'s. A listing of the Fortran program used to implement the algorithm is available from the author. The basic procedures followed are outlined below.

The partial derivatives of $\ln L(\mu)$ involve sums of ratios of multiple integrals, which creates a potentially difficult problem computationally. Fortunately, the Markovian nature of Brownian motion ensures that each kernel factor, $F_k$ and $G_J$, in the multiple integral contains at most two variables of integration. Consequently, the multiple integrals, expressed iteratively, can be effectively computed as a sum of double integrals. The numerical integration procedure adopted is a repeated Simpson's rule.

The likelihood equations are nonlinear, and so a multivariate Newton-Raphson scheme is implemented to solve the system numerically. In order to apply this technique, the mixed partial derivatives of $\ln L(\mu)$ must be computed. A theorem giving the exact representation is stated below. Its proof is omitted since it is similar to that of Theorem 3.1.

Theorem 3.2

The notation is that of Theorem 3.1. For $J = 1$,

$\frac{\partial^2 \ln g_1(t; a, \mu)}{\partial \mu_i \, \partial \mu_s} = -t$ if $i = s = 1$, and $0$ otherwise.

For $J = 2, 3, \dots, M$,

$\frac{\partial^2 \ln g_J(t; a, \mu)}{\partial \mu_i \, \partial \mu_s} = \left( \int_{[y^{J-1} > 0]} G_J^{(i,s)}(t; y_{J-1}) \prod_{k=1}^{J-1} F_k^{(i,s)}(y_k, y_{k-1}) \, dy^{J-1} \right) \Big/ g_J(t; a, \mu) \;-\; \frac{\partial \ln g_J(t; a, \mu)}{\partial \mu_i} \cdot \frac{\partial \ln g_J(t; a, \mu)}{\partial \mu_s}$

if $i \leq J$ and $s \leq J$, and $0$ otherwise, where

$F_k^{(i,s)}(y_k, y_{k-1}) = \begin{cases} F_k^{(i)}(y_k, y_{k-1}) & \text{if } i = k,\; i \neq s \\ F_k^{(s)}(y_k, y_{k-1}) & \text{if } s = k,\; i \neq s \\ \left[ (y_k - y_{k-1} + \mu_k h_k)^2 - h_k \right] F_k(y_k, y_{k-1}) & \text{if } i = k = s \\ F_k(y_k, y_{k-1}) & \text{otherwise} \end{cases}$

and

$G_J^{(i,s)}(t; y_{J-1}) = \begin{cases} G_J^{(i)}(t; y_{J-1}) & \text{if } i = J,\; i \neq s \\ G_J^{(s)}(t; y_{J-1}) & \text{if } s = J,\; i \neq s \\ \left[ (y_{J-1} - \mu_J t)^2 - t \right] G_J(t; y_{J-1}) & \text{if } i = s = J \\ G_J(t; y_{J-1}) & \text{otherwise.} \end{cases}$

With both real and simulated data the algorithm has proven very successful. Convergence of the numerical procedure is very fast and usually takes place within three iterations, as will be illustrated in Sections 6 and 7.
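The Newton-Raphson step itself can be sketched generically. The code below is my own illustration, not the paper's Fortran listing: `score` should return the gradient of the log-likelihood and `hessian` the matrix of mixed partials from Theorem 3.2, and each iteration solves $H d = -\nabla \ln L$ for the update. As a check, for the one-stage model with $a$ known the score is $Na - \mu \sum t_i$ with Hessian $-\sum t_i$, so the scheme reaches $\hat{\mu} = aN / \sum t_i$, the value given by (1.3), in a single iteration.

```python
def solve(A, b):
    # Small dense linear solve (Gaussian elimination with partial pivoting),
    # enough for the M x M Newton systems considered here.
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def newton_raphson(score, hessian, mu0, tol=1e-10, max_iter=50):
    # Multivariate Newton-Raphson for the likelihood equations score(mu) = 0:
    # each step solves H d = -score(mu) and updates mu <- mu + d.
    mu = list(mu0)
    for _ in range(max_iter):
        d = solve(hessian(mu), [-g for g in score(mu)])
        mu = [m + di for m, di in zip(mu, d)]
        if max(abs(g) for g in score(mu)) < tol:
            break
    return mu
```

For the one-stage example with $a = 3$ and passage times $2, 3, 4$, a single Newton step from any starting value lands on $\hat{\mu} = 1$, since the score is linear in $\mu$.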

Section 4: Sufficient Conditions for a Maximum of the Log-Likelihood Function

Let $H_t(\mu)$ be an $M \times M$ matrix whose $i$-$j$th element is given by $H_t(\mu)_{ij} = \frac{\partial^2 \ln g(t; a, \mu)}{\partial \mu_i \, \partial \mu_j}$, where the omitted stage subscript of $g$ is determined by the time value $t$, measured from 0. Observe that

$\frac{\partial^2 \ln L(\mu)}{\partial \mu_i \, \partial \mu_j} = \sum_{\text{all data points } t} H_t(\mu)_{ij}$.

Now, since negative definiteness of matrices is closed under addition, the following conjecture would ensure a global maximum of $\ln L(\mu)$ at $\hat{\mu}$.

Conjecture

$H_t(\mu)$ is negative definite for all $\mu$ and for all data points $t$.

From the computer algorithm the eigenvalues of $H_t(\hat{\mu})$ could easily be computed. In every case examined, the eigenvalues were all found to be negative, implying negative definiteness of $H_t(\hat{\mu})$. Despite this strong numerical evidence, no general mathematical verification of the conjecture has been obtained; however, some partial results have been derived. For the one-stage problem, the conjecture is easily verified by elementary calculus. We now address the problem in the two-stage case. We must show that

$-H_t(\mu) = -\begin{pmatrix} \dfrac{\partial^2 \ln g_2(t; a, \mu)}{\partial \mu_1 \, \partial \mu_1} & \dfrac{\partial^2 \ln g_2(t; a, \mu)}{\partial \mu_1 \, \partial \mu_2} \\[2ex] \dfrac{\partial^2 \ln g_2(t; a, \mu)}{\partial \mu_2 \, \partial \mu_1} & \dfrac{\partial^2 \ln g_2(t; a, \mu)}{\partial \mu_2 \, \partial \mu_2} \end{pmatrix}$

is positive definite. As a corollary to Theorem 2.1, the joint density of $Y_1, Y_2, \dots, Y_{J-1}$ and $T - T_{J-1}$ (where $Y_i$ is the level of the Brownian motion above the barrier at time $T_i$, $i = 1, 2, \dots, J-1$, and $T$ is the first passage time in the $J$th stage) is given by

$G_J(t; y_{J-1}) \prod_{k=1}^{J-1} F_k(y_k, y_{k-1})$.

Consequently, for $J = 2, 3, \dots, M$, the joint conditional density of $Y_1, Y_2, \dots, Y_{J-1}$, given $T - T_{J-1} = t$, is

$f_{J-1}(y^{J-1} \mid T - T_{J-1} = t) = G_J(t; y_{J-1}) \prod_{k=1}^{J-1} F_k(y_k, y_{k-1}) \Big/ g_J(t; a, \mu)$.

Letting $Z_i = Y_i - Y_{i-1} + \mu_i h_i$ for $i = 1, 2, \dots, J-1$ and $Z_J = \mu_J t - Y_{J-1}$, Theorem 3.2 implies

$\frac{\partial^2 \ln g_J(t; a, \mu)}{\partial \mu_i \, \partial \mu_s} = \begin{cases} \mathrm{Cov}_t[Z_i, Z_s] & \text{if } i \neq s \\ \mathrm{Var}_t[Z_i] - h_i & \text{if } i = s,\; i \neq J \\ \mathrm{Var}_t[Z_J] - t & \text{if } i = s = J, \end{cases}$

where the variances and covariances are taken with respect to the conditional density $f_{J-1}(y^{J-1} \mid T - T_{J-1} = t)$.

For the two-stage problem, $Z_1 + Z_2 = -a + \mu_1 h_1 + \mu_2 t$ is a constant, so $V \equiv \mathrm{Var}_t[Z_1] = \mathrm{Var}_t[Z_2] = -\mathrm{Cov}_t[Z_1, Z_2]$. Therefore,

$-H_t(\mu) = \begin{pmatrix} h_1 - V & V \\ V & t - V \end{pmatrix}$.

This matrix is positive definite if and only if $V < (t^{-1} + h_1^{-1})^{-1}$.

To find $V$, we need an expression for $g_2(t; a, \mu)$. Algebraically reformulating the representation for $g_2(t; a, \mu)$ from Theorem 2.1 by completing the square inside the exponential and collecting terms, we obtain

(4.1)  $g_2(t; a, \mu) = K \int_{y=0}^{\infty} y \, \exp\left( -\left[ (t^{-1} + h_1^{-1}) y^2 + 2y(\mu_1 - \mu_2) + \mu_2^2 t + (\mu_1 - a/h_1)^2 h_1 \right] / 2 \right) \left( e^{a y / h_1} - e^{-a y / h_1} \right) dy$,

where $K$ is a constant. Letting $k(\mu) = K (a / h_1) \exp\left( -\left[ \mu_2^2 t + (\mu_1 - a/h_1)^2 h_1 \right] / 2 \right)$ and expressing the difference of exponentials as an integral with respect to a new variable $w$, we may rewrite (4.1) as

(4.2)  $g_2(t; a, \mu) = k(\mu) \int_{w=-1}^{1} \int_{y=0}^{\infty} y^2 \, \exp\left( -\left[ (t^{-1} + h_1^{-1}) y^2 + 2y(\mu_1 - \mu_2 - a w / h_1) \right] / 2 \right) dy \, dw$.

By the change of variables $z = y/s$ and $m = -(\mu_1 - \mu_2 - a w / h_1) s$, where $s = (t^{-1} + h_1^{-1})^{-1/2}$, and setting $m_1 = -(\mu_1 - \mu_2 + a/h_1) s$ and $m_2 = -(\mu_1 - \mu_2 - a/h_1) s$, we have

(4.3)  $g_2(t; a, \mu) = k(\mu) \int_{m=m_1}^{m_2} \int_{z=0}^{\infty} z^2 \exp\left( -(z^2 - 2zm)/2 \right) dz \, dm$,

where the constant factors from the change of variables have been absorbed into $k(\mu)$. Define

(4.4)  $I_n(m) = \int_{z=0}^{\infty} z^n \exp\left( -(z^2 - 2zm)/2 \right) dz$.

Expressing $V$ in terms of $g_2(t; a, \mu)$ and using (4.3) and definition (4.4), we see that $V < (t^{-1} + h_1^{-1})^{-1}$ if and only if

(4.5)  $\left( I_3(m_2) - I_3(m_1) \right)\left( I_1(m_2) - I_1(m_1) \right) - \left( I_2(m_2) - I_2(m_1) \right)^2 - \left( I_1(m_2) - I_1(m_1) \right)^2 < 0$.

For notational convenience let $[I_n] = I_n(m + x) - I_n(x)$, where $m, x \in \mathbb{R}$ and $m > 0$. Inequality (4.5) may be expressed as

(4.6)  $p(m, x) = -[I_3][I_1] + [I_2]^2 + [I_1]^2 > 0$.

We shall have positive definiteness of $-H_t(\mu)$ for all $\mu$ and $t$ if (4.6) holds on the half-plane $m > 0$, $x \in \mathbb{R}$. It is possible to show $p(m, x) > 0$ for $x > \sqrt{2}$, and also for $m > 3|x|(x^2 - 1)/(x^2 - 3)$ when $x < -\sqrt{3}$. The region of positivity of $p$ was extended using a computer grid. For details of these results, see Ball [1]. Unfortunately, it could not be shown mathematically that $p(m, x) > 0$ for all $m > 0$ and all $x$. In the following theorem, however, it is shown that $p(m, x)$ is positive in a neighborhood of the $x$-axis.

Theorem 4.1

For arbitrary $x$ and for $m$ sufficiently close to zero, we have $p(m, x) > 0$.

Proof

First note that $\frac{d}{dm} I_n(m) = I_{n+1}(m)$. Using this fact together with (4.6) and the mean value theorem, $p(m, x)$ will be positive for arbitrary $x$, provided $m$ is sufficiently close to zero, if

(4.7)  $g(x) = -I_4(x) I_2(x) + I_3^2(x) + I_2^2(x) > 0$ for all $x \in \mathbb{R}$.

Define $\phi(x) = e^{x^2/2} \int_{-\infty}^{x} e^{-t^2/2} \, dt$. Observe that $I_0(x) = \phi(x)$, $I_1(x) = x\phi(x) + 1$, and, by repeated differentiation, $I_n(x)$ can be expressed in terms of $\phi(x)$ for all integers $n$. We may therefore algebraically re-express (4.7), up to a positive factor, as

(4.8)  $g(x) = (x^2 - 1)\phi^2(x) + 3x\phi(x) + 2 > 0$ for all $x \in \mathbb{R}$.

Since $\frac{d}{dx} g(x) = 2x^3 \phi^2(x) + (5x^2 + 1)\phi(x) + 3x > 0$ for $x > 0$ and $g(0) = 2 - \pi/2 > 0$, it follows that $g(x) > 0$ for all $x \geq 0$. Now, since $g$ is a continuous function with $g(0) > 0$, a sufficient condition for establishing the theorem is that $g(x) \neq 0$ for any $x < 0$. We see from (4.8) that $g(x) = 0$ only when $\phi(x) = \left( -3x \pm \sqrt{x^2 + 8} \right) \big/ 2(x^2 - 1)$. In fact, it is sufficient to show $\phi(x) < \left( -3x - \sqrt{x^2 + 8} \right) \big/ 2(x^2 - 1)$ for $x < 0$. This result, which

is proved in Ball [1], is equivalent to a conjecture of Birnbaum [2]. A partial verification was constructed by Murty [9], and the whole conjecture was independently established by Sampford [11].

Except for numerical evidence, little progress was made on the general $M$-stage problem.
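The grid verification of (4.6) is easy to reproduce. The following sketch is my own (the quadrature truncation is an assumption, not from the paper): it evaluates $p(m, x)$, computing each $I_n$ by Simpson's rule, and checks positivity at a few points of the half-plane $m > 0$:

```python
import math

def I(n, m, upper=30.0, steps=3000):
    # (4.4): I_n(m) = integral over z > 0 of z^n * exp(-(z^2 - 2zm)/2) dz.
    # The integrand is proportional to a Gaussian centered at m, so a
    # truncated Simpson's rule over [0, upper] suffices for moderate m.
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        z = i * h
        w = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += w * z ** n * math.exp(-(z * z - 2.0 * z * m) / 2.0)
    return total * h / 3.0

def p(m, x):
    # (4.6): p(m, x) = -[I_3][I_1] + [I_2]^2 + [I_1]^2,
    # where [I_n] = I_n(x + m) - I_n(x).
    br = [I(n, x + m) - I(n, x) for n in range(4)]
    return -br[3] * br[1] + br[2] ** 2 + br[1] ** 2

# Coarse grid over the half-plane m > 0; positivity everywhere here is
# consistent with the conjecture that H_t(mu) is negative definite.
grid = [(m, x) for m in (0.5, 1.0, 2.0) for x in (-1.0, 0.0, 1.0)]
```

On this grid every value of `p` comes out positive, matching the numerical evidence reported above.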

Section 5: Statistical Properties of the Estimators

The question of sufficiency cannot be conveniently addressed in the present context, since the density can be expressed only as an integral representation. According to Kendall and Stuart [8], without sufficiency the most important properties of the m.l.e.'s are the asymptotic ones; consequently, we study them. It is well known that under fairly general conditions the likelihood equations admit a consistent estimator of the true parameter; for example, see Cramer [4]. For the present context, we state without proof the optimal asymptotic properties of the m.l.e.'s and list the regularity assumptions which were required. The proof is a multivariate generalization of the work of Cramer [4] and Rao [10]. For details, see Ball [1].

Regularity Assumptions

Let $f = f(y; \mu)$ be a univariate density in $y$ depending on the multivariate parameter $\mu = (\mu_1, \mu_2, \dots, \mu_M)$.

A1. For all $y$, $\frac{\partial \ln f}{\partial \mu_i}$, $\frac{\partial^2 \ln f}{\partial \mu_i \, \partial \mu_j}$, and $\frac{\partial^3 \ln f}{\partial \mu_i \, \partial \mu_j \, \partial \mu_k}$ exist for $i, j, k = 1, 2, \dots, M$ and for $\mu$ in some open neighborhood, $A$, containing the true parameter $\mu_0$.

A2. For all $\mu \in A$, $\left| \frac{\partial f}{\partial \mu_i} \right|$ and $\left| \frac{\partial^2 f}{\partial \mu_i \, \partial \mu_j} \right|$ are bounded by integrable functions for $i, j = 1, 2, \dots, M$, and $\left| \frac{\partial^3 \ln f}{\partial \mu_i \, \partial \mu_j \, \partial \mu_k} \right|$ is bounded by a function $H$ which has finite expectation with respect to the density $f(y; \mu_0)$.

A3. $E_{\mu_0}\!\left[ \frac{\partial^2 \ln f(y; \mu_0)}{\partial \mu_i \, \partial \mu_j} \right]$ is the $i$-$j$th element of a negative definite matrix.

A4. $E_{\mu}\!\left[ \frac{\partial^2 \ln f(y; \mu)}{\partial \mu_i \, \partial \mu_j} \right]$ is continuous in $\mu$ at $\mu = \mu_0$.

The notation $E_{\mu}[\,\cdot\,]$ refers to expectation with respect to the density $f(y; \mu)$. According to Zacks [13], a consistent asymptotically normal estimator is best asymptotically normal if its asymptotic variance-covariance matrix approaches $I^{-1}/n$ as the sample size $n \to \infty$. The symbol $I$ denotes the Fisher information matrix, and its $i$-$j$th element is given by

$I_{ij} = -E_{\mu_0}\!\left[ \frac{\partial^2 \ln g(t; a, \mu_0)}{\partial \mu_i \, \partial \mu_j} \right]$.

We now state our optimality theorem.

Theorem 5.1

For the continuous $M$-stage piecewise linear barrier, the likelihood equations $\frac{\partial \ln L(\mu)}{\partial \mu_i} = 0$, $i = 1, 2, \dots, M$, admit a strongly consistent, eventually unique set of joint best asymptotically normal estimates of the true parameter $\mu_0$.
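In the notation above, the conclusion of Theorem 5.1 can be displayed explicitly. The following is a standard restatement of the best asymptotically normal property (my summary, not a display from the paper): with $\hat{\mu}_n$ the solution of the likelihood equations for sample size $n$,

```latex
\sqrt{n}\,(\hat{\mu}_n - \mu_0) \;\xrightarrow{d}\; N_M\!\left(0,\; I(\mu_0)^{-1}\right),
\qquad
I(\mu_0)_{ij} \;=\; -\,E_{\mu_0}\!\left[\frac{\partial^2 \ln g(t; a, \mu_0)}{\partial \mu_i \,\partial \mu_j}\right],
```

so that for large $n$ the variance-covariance matrix of $\hat{\mu}_n$ is approximately $I(\mu_0)^{-1}/n$, as required by Zacks's definition.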

Section 6: Experimental Study on the Statistical Properties of the Estimators for Moderate and Small Sample Size

Although the asymptotic properties of the m.l.e.'s are highly optimal, we cannot be sure that the estimation procedure produces satisfactory estimates for a small or moderate sample size. In order to examine this question, a Monte Carlo simulation experiment has been performed.

Pseudo-random numbers were generated by the Fortran subroutine URAND, which is described in Forsythe, Malcolm, and Moler [6], Chapter 10. This subroutine is a congruential generator; its integer sequence on the IBM system is generated by $x_{i+1} = 834{,}314{,}861 \, x_i + 453{,}816{,}693 \pmod{2^{31}}$. This subroutine yields pseudo-random numbers with a uniform distribution on $[0, 1]$. By appropriate transformation, pseudo-random variates from a one-stage first passage time density with parameters $a = 3.0$ and $\mu = 1.0$ were produced.

Two particular experiments were examined, one with a small sample size of twenty-five and the other with a moderate sample size of fifty. Both experiments involved a five-stage barrier where the first four stages had a length of one time unit and the last stage was open-ended. In each case, 200 repetitions of the experiment were performed. On every run the intercept of the barrier was estimated by using (1.2), and the initial slopes of the barrier, before iteration, were set equal to the $\hat{\mu}$ given by (1.3). Careful examination revealed that on every run satisfactory convergence of the algorithm occurred within three iterations of

the Newton-Raphson procedure. A summary of the results of the experiments is now tabulated: Table 1 contains elementary statistics, and Table 2 lists the correlation matrices of the resultant estimates. For both experiments, and especially for the one using the moderate sample size, the estimates are quite satisfactory, with small bias and variance. This indicates that for both small and moderate sample sizes the algorithm furnishes reasonable solutions to the problem.
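The "appropriate transformation" used to produce the one-stage variates is not spelled out in the paper. One standard choice is the Michael-Schucany-Haas method, sketched below for the inverse Gaussian parameterization (mean $a/\mu$, shape $a^2$) implied by (1.1); the function name and the use of Python's built-in generator in place of URAND are my own:

```python
import math
import random

def sample_passage_time(a, mu, rng=random):
    """Draw a one-stage first passage time for the barrier -a + mu*t.

    The time is inverse Gaussian with mean nu = a/mu and shape lam = a^2,
    sampled by the Michael-Schucany-Haas transformation of a chi-square(1)
    variate followed by the usual accept/flip step.
    """
    nu = a / mu
    lam = a * a
    w = rng.gauss(0.0, 1.0) ** 2                     # chi-square with 1 d.f.
    y = nu + nu * nu * w / (2.0 * lam) \
        - nu / (2.0 * lam) * math.sqrt(4.0 * nu * lam * w + (nu * w) ** 2)
    # Return the smaller root with probability nu / (nu + y); otherwise
    # return the reflected root nu^2 / y.
    if rng.random() <= nu / (nu + y):
        return y
    return nu * nu / y

# Simulated sample with a = 3.0, mu = 1.0, as in the experiments above:
random.seed(1)
times = [sample_passage_time(3.0, 1.0) for _ in range(10000)]
```

For $a = 3$ and $\mu = 1$ the sample mean should settle near $a/\mu = 3$ and the sample variance near $a/\mu^3 = 3$, which provides a quick sanity check on the transformation.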

Section 7: Application

We now present a brief analysis of one experiment performed by John Schnizlein of the Department of Psychology, the University of New Mexico. For further details, see Schnizlein [12]. Human subjects in a sound-proof chamber receive a sonic stimulus in the form of a 60 db (decibel) tone, and their reaction times are recorded electronically to millisecond accuracy. The sample size in this particular experiment was 360, but, as is typical, some of the data were eliminated by practical considerations. Fortunately, the invalid measurements constituted only a small percentage (less than 5 percent) of the sample.

On psychological grounds, for this experiment no reaction time was expected to be less than 100 m.s. (milliseconds) or greater than 200 m.s. Accordingly, the effective region was begun at time 0 and split into four equal stages of length 20 m.s. each, with the fifth stage being left open-ended. The intercept of the barrier was estimated by (1.2), and the initial slope values were all set equal to the $\hat{\mu}$ given by (1.3). Satisfactory convergence of the numerical procedure was attained within three iterations; the results are given in Table 3. This physical example indicates the utility of the computer algorithm in an experimental setting.

Section 8: Summary

The study of reaction to stimulus encompasses a wide area in the field of psychology. This paper has investigated the special problem of estimating growth functions (barriers) in response to stimulus. A statistically sound procedure has been developed to estimate these growth functions, and the computer algorithm used to implement the procedure is available.

Acknowledgement

This work is based on a Ph.D. dissertation written under the guidance of Professor Clifford Qualls. The author is deeply indebted to him for his advice and encouragement.

References

[1] Ball, C. A. "Statistical Estimation of a Piecewise Linear Barrier for Brownian Motion Based on First Passage Times." Doctoral thesis, Department of Mathematics and Statistics, University of New Mexico, 1980.
[2] Birnbaum, Z. W. "An Inequality for Mill's Ratio." Ann. Math. Stat. 13 (1942), p. 245.
[3] Cox, D. R., and Miller, H. D. The Theory of Stochastic Processes. London: Chapman and Hall Ltd., 1972.
[4] Cramer, H. Mathematical Methods of Statistics. Princeton, N.J.: Princeton University Press, 1946.
[5] Emerson, P. L. "Simple Reaction Time with a Markovian Evolution of Gaussian Discriminal Processes." Psychometrika 35, no. 1 (1970), p. 99.
[6] Forsythe, G. E.; Malcolm, M. A.; and Moler, C. B. Computer Methods for Mathematical Computations. Englewood Cliffs, N.J.: Prentice-Hall, 1977.
[7] Grice, G. R.; Nullmeyer, R.; and Spiker, V. A. "Application of Variable Criterion Theory to Choice Reaction Time." Perception & Psychophysics 22, no. 5 (1977), p. 431.
[8] Kendall, M. G., and Stuart, A. The Advanced Theory of Statistics, Vol. 2. New York: Hafner Publishing Company, 1967.
[9] Murty, V. N. "On a Result of Birnbaum Regarding the Skewness of X̄ in Bivariate Normal Populations." J. Indian Soc. Agric. Stat. 4 (1952), p. 85.
[10] Rao, C. R. Linear Statistical Inference and Its Applications. New York: John Wiley and Sons, 1965.
[11] Sampford, M. R. "Some Inequalities on Mill's Ratio and Related Functions." Ann. Math. Stat. 24 (1953), p. 130.
[12] Schnizlein, J. Doctoral thesis, Department of Psychology, University of New Mexico, 1980.
[13] Zacks, S. The Theory of Statistical Inference. New York: John Wiley and Sons, 1971.

Table 1
Elementary Statistics

Sample Size N = 25

Statistic   Mean    Variance   Root Mean Square Error   Skew     Percentage Bias
μ̂1         1.000     .127       .356                    -.069       0%
μ̂2         1.150     .415       .661                     .667     +15%
μ̂3         1.060     .354       .598                    -.106      +6%
μ̂4          .936     .553       .746                    -.047      -6%
μ̂5         1.255     .448       .716                    1.847     +25%

Sample Size N = 50

Statistic   Mean    Variance   Root Mean Square Error   Skew     Percentage Bias
μ̂1          .976     .073       .271                    -.497      -2%
μ̂2         1.056     .192       .442                     .639      +6%
μ̂3         1.028     .140       .375                     .169      +3%
μ̂4          .938     .246       .500                     .422      -6%
μ̂5         1.117     .148       .402                    1.438     +12%

Table 2
Correlation Matrices for μ̂i, i = 1, 2, ..., 5

Sample Size N = 25

         μ̂1      μ̂2      μ̂3      μ̂4      μ̂5
μ̂1    1.000   -.567    .461    .315    .378
μ̂2    -.567   1.000   -.293   -.064    .058
μ̂3     .461   -.293   1.000   -.235    .073
μ̂4     .315   -.064   -.235   1.000   -.186
μ̂5     .378    .058    .073   -.186   1.000

Sample Size N = 50

         μ̂1      μ̂2      μ̂3      μ̂4      μ̂5
μ̂1    1.000   -.589    .421    .404    .431
μ̂2    -.589   1.000   -.402   -.015    .018
μ̂3     .421   -.402   1.000   -.186   -.059
μ̂4     .404   -.015   -.186   1.000    .011
μ̂5     .431    .018   -.059    .011   1.000

Table 3
Maximum Likelihood Estimates

â  = 3.326
μ̂1 = 1.407
μ̂2 =  .909
μ̂3 = 1.578
μ̂4 = 1.826
μ̂5 = 3.098

Sample Size N = 347