A HIERARCHICAL MODEL FOR SERIALLY CORRELATED DATA: ANALYSIS OF BIOLOGICAL RHYTHM DATA

Joel B. Greenhouse and Robert E. Kass, Department of Statistics, Carnegie Mellon University, Pittsburgh, PA 15213
C. Y. Teresa Lam, Department of Industrial & Operations Engineering, University of Michigan, Ann Arbor, MI 48109-2117
Ruey S. Tsay, Graduate School of Business, University of Chicago, Chicago, IL 60637

Technical Report 90-13
April 1990


Author's Note. Joel B. Greenhouse and Robert E. Kass are both Associate Professors in the Department of Statistics, Carnegie Mellon University, Pittsburgh, PA 15213. Teresa Lam is an Assistant Professor in the Department of Industrial and Operations Engineering, University of Michigan, Ann Arbor, MI 48109. Ruey S. Tsay is a Professor in the Graduate School of Business, University of Chicago, Chicago, IL 60637. The authors would like to thank David B. Jarrett, M.D., of the Western Psychiatric Institute and Clinic, University of Pittsburgh, for many stimulating discussions concerning biological rhythms and for the use of the data set. This work was supported in part by NIMH Grant No. MH30915 (JBG), by the National Science Foundation under Grants No. MICSS30 1:$31 (REK) and DMS-8902177 (RST), by the William S. Fishman Research Fund at the Graduate School of Business, the University of Chicago (RST), and by a grant from the John D. and Catherine T. MacArthur Research Network on the Psychobiology of Depression (TL). This work was begun when all four authors were at Carnegie Mellon University.

A Hierarchical Model for Serially Correlated Data: Analysis of Biological Rhythm Data

Abstract

Cortisol is a hormone secreted by the adrenal gland, and, like other physiological measurements, the concentration of cortisol in the blood over time is periodic with an approximate 24 hour cycle. For descriptive purposes, linear regression using a simple sinusoidal curve with a fixed 24 hour period is sometimes an adequate tool for analyzing data from such processes. Inferences based on regression models under the assumption of independent and identically distributed errors, however, can often be seriously misleading. We have previously presented a general class of models for fitting a single biological rhythm, with use of higher order harmonic terms of one or more unknown fundamentals and ARMA processes for the errors (Greenhouse, Kass, and Tsay, 1987). Since comparative studies of changes in biological rhythms due to a change in disease state often involve many subjects, it is likely that there exists sufficient inhomogeneity among subjects that a single model would be inadequate, and a model that incorporates variation across subjects is necessary. We present an application of the use of a two-stage hierarchical model to study differences in the circadian variation of plasma cortisol concentration between depressed patients and healthy controls. In our harmonic regression model the level, amplitudes, and phases are treated as randomly varying across individuals, and the frequency and the ARMA parameters are taken as common to all subjects.

Key Words: ARMA Process; Depression; Harmonic Regression Model; Parametric Empirical Bayes Estimation; Plasma Cortisol.

1 Introduction

Sequential measurements of physiological processes follow patterns that typically repeat themselves approximately every 24 hours (Moore-Ede et al., 1982). Examples in humans include the sleep-wake cycle, core body temperature, and neuroendocrine activity, such as the secretion of plasma cortisol. These patterns of measurements, called biological or circadian rhythms, are endogenous and have been observed not only in physiological systems but in behavioral activities as well. Circadian rhythms, according to Moore-Ede et al. (1982), are the outputs of a biological system whose main function is to measure time, and the functioning of this system has important implications for physiological regulation and for health. These rhythms are surprisingly stable, in the sense of being regular and repeatable, and often pass unnoticed until they become disturbed, as might occur due to a change in the environment, such as flying across several time zones resulting in "jet lag", or due to a disease state. For example, patients suffering from a major depressive disorder are consistently reported to have disturbances in their rest-activity cycle, disturbances which are often characterized by a phase advance of the sleep-wake cycle (Kupfer and Foster, 1972). Just as jet lag may arise because of an acute disruption of an individual's circadian rhythms (that is, their internal biological clock is "out of sync" with the external clock), it is now widely believed that patients with major affective disorders have disturbances in the regulation of their biological rhythms that play an important role in the clinical course and pathophysiology of their disease (Moore-Ede et al., 1983; Sack et al., 1987). To better understand the role these disturbances play in the pathology and etiology of depression, psychiatrists and others are attempting to characterize changes in the biological rhythms of depressed patients. It is hoped that by observing consistent changes in these rhythms, new strategies for the treatment of affective disorders will be suggested, and new understandings of the mechanisms by which physiological systems are regulated will be obtained. Fundamental to the study of biological rhythms is the need to characterize a rhythm accurately, so that changes in a rhythm from a baseline state following an intervention can be detected, or abnormalities in a rhythm between a group of patients and controls can be assessed. In this paper we develop statistical models for studying changes in the circadian variation of the hormone cortisol in a group of depressed patients as compared to a group of healthy control subjects. The data described here are part of a larger study conducted at the Western Psychiatric Institute and Clinic,

University of Pittsburgh, under the direction of Dr. David Jarrett. Cortisol is a hormone secreted by the adrenal gland, and, like other physiological and behavioral measurements, the concentration of cortisol in the blood over time is periodic with an approximate 24 hour cycle. Plasma cortisol concentration reaches a peak at or near the time of waking and then falls off over the course of the day, reaching the lowest level shortly after sleep onset (see Figure 1). In a review of 14 studies of the secretory activity of cortisol, Sack et al. (1987) found that changes in the circadian rhythm of plasma cortisol levels in depressed patients have not been conclusively established. Although most studies report an increase in the mean level of cortisol concentration in depressed patients, there is disagreement as to whether some features of the cortisol circadian rhythm are advanced. For example, in six relatively carefully controlled studies, three found a phase advance in the cortisol rhythm and three did not. These diverse results with respect to cortisol are typical of problems with studies of biological rhythms in general and are due in part to poor designs (e.g., many studies only observe subjects for 24 hours or less), to the use of inefficient statistical methods, and to the problem of substantial variation among subjects. In the University of Pittsburgh study, blood samples were obtained from both healthy and depressed subjects every 20 minutes for approximately 48 hours. The purpose of this paper is to develop statistical methods for analyzing periodic data such as these that will allow us to make comparisons between groups of subjects while accounting for intersubject variability. We will analyze the data collected to investigate the effect of depression on the circadian variation of cortisol. In the next section we describe a model for characterizing a single rhythm based on a harmonic regression model with an ARMA process for the errors. We then generalize that model to incorporate a two-stage model with hierarchically structured parameters. At the first stage, the model is assumed to be identical for all individuals, and variability among individuals is accounted for by the distribution of the parameters specified at the second stage. This approach is similar to the random-effects model for longitudinal data considered in Laird and Ware (1982). The likelihood function for the parameters is presented in section 2, and issues related to the estimation of parameters and to computation are discussed in section 3. In section 4, we present the analysis of the cortisol data for the depressed patients and controls in the Pittsburgh study. We conclude in section 5 with a discussion of the results of the data analysis and of the methodology. We note that the family of models considered in this article has a wide range of application, such as to

the investigation of the impact of chlorofluoromethanes on stratospheric ozone (e.g., Reinsel and Tiao, 1987) and to the analysis of gait, that is, free-speed walking on a level surface, as discussed in Olshen, Biden, Wyatt and Sutherland (1989).

2 Model Specification and the Exact Likelihood Function

2.1 First-Stage Model

Statistical models for characterizing biological rhythm data, or "cosinor" analysis (Tong, 1976; Minors and Waterhouse, 1988), have generally consisted of a single sinusoidal regression function, having a fixed period of 24 hours, together with independent and identically distributed (i.i.d.) errors. Greenhouse, Kass and Tsay (GKT) (1987) describe a more general class of models for fitting biological rhythm data. This class contains models that include higher-order harmonics of one or more unknown fundamentals and, in addition, have ARMA processes for the errors. Specifically, the GKT model for an observation, Y_t, at time t of a rhythm for a single individual, is of the additive form

    Y_t = f(t) + Z_t                                                            (1)

where f(t) denotes the deterministic component of the data and Z_t the stochastic component. In applications to biological rhythm data, trigonometric series are used for f(t). Since the k-th harmonic term of a fundamental frequency ω can be written as

    R_k cos{2πk(ωt - φ_k)} = A_k cos(2πkωt) + B_k sin(2πkωt),

where A_k = R_k cos(2πkφ_k) and B_k = R_k sin(2πkφ_k), we may write f(t) in (1) as

    f(t) = μ + Σ_{k=1}^{K} {A_k cos(2πkωt) + B_k sin(2πkωt)}                    (2)

where μ is an added constant which we call the level of Y_t, ω is the fundamental frequency, and kω (k = 1, ..., K) are the harmonics of ω. Here the unknown parameters are μ, ω, the A_k's and the B_k's, and in this form the only nonlinear parameter in f(·) is ω. Further, the stochastic component, Z_t, in (1) is taken to be an autoregressive moving average, ARMA(p,q), process. That is, Z_t satisfies

    Z_t = Σ_{i=1}^{p} φ_i Z_{t-i} - Σ_{j=1}^{q} θ_j a_{t-j} + a_t               (3)

where the φ_i's and θ_j's are real-valued parameters, and {a_t} is a sequence of independent, identically distributed Gaussian random variables with mean zero and variance σ^2. Methods for specifying K, p and q for model (1)-(3) are described in Greenhouse, Kass and Tsay (1987) and illustrated with an example of the circadian variation of core body temperature in a healthy adult. Model (1)-(3) is useful as a description of a particular circadian rhythm in a single individual. Comparative studies often involve many subjects, however, and it is likely that there exists sufficient intersubject variability that a single model of the form (1)-(3) would be inadequate. In this case, it is necessary to introduce a new model that incorporates variation across subjects of the parameters appearing in model (2).
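To make the first-stage model concrete, the sketch below simulates one subject's series from (1)-(3) with K = 2 harmonics and AR(2) errors, the structure used for the cortisol data in section 4. It is an illustrative sketch only: the sampling grid (144 observations, i.e., roughly 48 hours at 20-minute intervals), the parameter values, and the function name are ours and are not taken from the original analysis.

    import numpy as np

    def simulate_gkt(n=144, omega=1.0 / 72, mu=7.0,
                     A=(1.0, 0.5), B=(4.0, 1.5),
                     phi=(0.8, -0.15), sigma=2.0, seed=0):
        """Simulate Y_t = f(t) + Z_t with K harmonics and AR(p) errors (illustrative values)."""
        rng = np.random.default_rng(seed)
        t = np.arange(1, n + 1)
        # Deterministic component f(t), equation (2)
        f = mu * np.ones(n)
        for k, (a_k, b_k) in enumerate(zip(A, B), start=1):
            f += a_k * np.cos(2 * np.pi * k * omega * t) + b_k * np.sin(2 * np.pi * k * omega * t)
        # Stochastic component Z_t, equation (3) with q = 0 (pure autoregression)
        z = np.zeros(n)
        shocks = rng.normal(0.0, sigma, n)
        for i in range(n):
            z[i] = shocks[i] + sum(phi[j] * z[i - j - 1]
                                   for j in range(len(phi)) if i - j - 1 >= 0)
        return t, f + z

    t, y = simulate_gkt()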

2.2 Second-Stage Model

To accommodate the intersubject variability of cortisol secretion we generalize the GKT model by considering two-stage hierarchical or "random-effects" models, in which certain parameters in (2) are assumed to follow probability distributions. For generality, we allow subjects to belong to one of g distinct groups, though in the application of section 4 we will have g = 2. We use h as the index for group, h = 1, 2, ..., g. We denote the j-th observation for the i-th subject in group h by Y_{ijh}. We let m_h be the number of subjects in the h-th group and let n_{ih} be the number of observations on the i-th subject. The first stage of the hierarchical model is

    Y_{ijh} = μ_{ih} + Σ_{k=1}^{K} [A_{kih} cos(2πkω_h j) + B_{kih} sin(2πkω_h j)] + Z_{ijh},        (4)

j = 1, ..., n_{ih}, i = 1, 2, ..., m_h, h = 1, 2, ..., g, where ω_h is the frequency for the h-th group, and μ_{ih}, A_{kih} and B_{kih} are the level and the coefficients of the k-th harmonic of ω_h for the i-th subject in the h-th group. In (4) we now use the subscript j in place of t to index the repeated measurements on the i-th subject. When it is clear from the context that we are referring to the h-th group, we will for simplicity denote the total number of observations for the i-th subject by n_i. For the stochastic component of (4), we assume, as in GKT, that {Z_{ijh}}_{j=1}^{n_{ih}} is an ARMA(p,q) process, and that the {Z_{ijh}} are all independent across subjects, i = 1, 2, ..., m_h and h = 1, 2, ..., g. At the second stage of the hierarchical model, we will assume that the μ's, A_k's and B_k's are jointly normally distributed, and are distributed independently across subjects.

This specification provides a simple, manageable description of the variation among individuals with respect to the mean level, amplitude and phase of the circadian rhythm for the nonlinear harmonic regression model in (2). In this paper, the fundamental frequency ω_h and the φ's, θ's, and σ of the stochastic component will be treated as common to all subjects, or as "fixed" effects.

It is convenient to write Equation (4) in matrix form:

    Y_{ih} = X_{ih} β_{ih} + Z_{ih}                                             (5)

where

    Y_{ih} = (Y_{i1h}, Y_{i2h}, ..., Y_{i n_{ih} h})^T,
    Z_{ih} = (Z_{i1h}, Z_{i2h}, ..., Z_{i n_{ih} h})^T,
    β_{ih} = (μ_{ih}, A_{1ih}, B_{1ih}, ..., A_{Kih}, B_{Kih})^T,

and X_{ih} is the n_{ih} × (2K + 1) matrix whose j-th row is

    (1, cos(2πω_h j), sin(2πω_h j), ..., cos(2πKω_h j), sin(2πKω_h j)),

so that X_{ih} depends on the frequency ω_h. Letting λ_h be a vector of the fixed effects, we can express the distributional assumptions for the two-stage hierarchical model as:

    Stage I:  Y_{ih} | X_{ih}, β_{ih}, λ_h ~ N(X_{ih} β_{ih}, Σ_W(φ, θ, σ^2))
    Stage II: β_{ih} | λ_h ~ N(β_h, Σ_B)

where Σ_W(φ, θ, σ^2) denotes the within-subject component of variance expressed as a function of the parameters of the stochastic component, and

    β_h = (μ_h, A_{1h}, B_{1h}, ..., A_{Kh}, B_{Kh})^T,    Σ_B = DIAG(σ_0^2, σ_1^2, ..., σ_{2K}^2).
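As a companion sketch (again with made-up values and function names of our own), the code below builds the design matrix X_{ih} defined above for a given frequency and number of harmonics, and draws subject-level coefficient vectors from the Stage II normal distribution; adding ARMA errors Z_{ih} as in the previous sketch would complete a draw from the two-stage model.

    import numpy as np

    def design_matrix(n, omega, K):
        """n x (2K+1) harmonic regression design matrix X_ih for frequency omega."""
        j = np.arange(1, n + 1)
        cols = [np.ones(n)]
        for k in range(1, K + 1):
            cols.append(np.cos(2 * np.pi * k * omega * j))
            cols.append(np.sin(2 * np.pi * k * omega * j))
        return np.column_stack(cols)

    # Stage II: beta_ih ~ N(beta_h, Sigma_B) for m_h subjects (illustrative values)
    rng = np.random.default_rng(1)
    beta_h = np.array([7.0, 1.0, 4.0, 0.5, 1.5])      # (mu, A_1, B_1, A_2, B_2)
    Sigma_B = np.diag([1.0, 0.3, 0.3, 0.2, 0.2])
    m_h, n_ih, omega_h, K = 6, 144, 1.0 / 72, 2
    X = design_matrix(n_ih, omega_h, K)
    betas = rng.multivariate_normal(beta_h, Sigma_B, size=m_h)
    # Stage I systematic component for each subject: X_ih beta_ih
    means = betas @ X.T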

It will be useful for future reference to write λ_h = (λ_c^T, λ_h^{aT})^T, where λ_c = (φ_1, ..., φ_p, θ_1, ..., θ_q, σ^2)^T are the parameters common to all subjects, and λ_h^a = (ω_h, β_h, Σ_B)^T are the parameters common to all subjects in group h. Finally, let λ = (λ_1, λ_2, ..., λ_g).

2.3 The Exact Likelihood Function

In this section we derive the exact likelihood function for λ. First, we will obtain the joint distribution of y_h and β_h given λ_h, which we express for the h-th group as

    p(y_h, β_h | λ_h) = Π_{i=1}^{m_h} p(y_{ih} | β_{ih}, λ_h) p(β_{ih} | λ_h).        (6)

We will then find the marginal distribution of y_h given λ_h by integrating over β_{ih}, i.e.,

    L(λ_h) = Π_{i=1}^{m_h} ∫ p(y_{ih} | β_{ih}, λ_h) p(β_{ih} | λ_h) dβ_{ih}.         (7)

In the remainder of this section we consider the details for the derivation of the exact likelihood function for λ, L(λ), where L(λ) = Π_{h=1}^{g} L(λ_h).

Let N = Σ_{h=1}^{g} Σ_{i=1}^{m_h} n_{ih} be the total number of observations on all the subjects in all g groups. Using the results in the Appendix for the exact likelihood function of an ARMA(p,q) process, we have

    p(z_{ih} | λ_c) = (2πσ^2)^{-n_{ih}/2} |W_{ih}^T W_{ih}|^{-1/2} exp{ -(1/(2σ^2)) z_{ih}^T Λ_{ih}^T Λ_{ih} z_{ih} }        (8)

where W_{ih} and Λ_{ih}, i = 1, 2, ..., m_h, are the corresponding time series matrices given in the Appendix, associated with the n_{ih} observations for the i-th subject in group h. These two matrices depend only on the time series parameters (φ_1, ..., φ_p, θ_1, ..., θ_q) and on n_{ih}. It follows from (5) and from (8) above that

    p(y_{ih} | λ_h, X_{ih}, β_{ih}) = (2πσ^2)^{-n_{ih}/2} |W_{ih}^T W_{ih}|^{-1/2} exp{ -(1/(2σ^2)) (y_{ih} - X_{ih} β_{ih})^T Λ_{ih}^T Λ_{ih} (y_{ih} - X_{ih} β_{ih}) }.

The joint probability density function of the data y = (y_{11}^T, ..., y_{m_1 1}^T, ..., y_{1g}^T, ..., y_{m_g g}^T)^T and β = (β_{11}^T, ..., β_{m_1 1}^T, ..., β_{1g}^T, ..., β_{m_g g}^T)^T given λ and X is

    p(y, β | λ, X) = (2πσ^2)^{-N/2} [ Π_{h=1}^{g} Π_{i=1}^{m_h} |W_{ih}^T W_{ih}|^{-1/2} ] exp{ -(1/(2σ^2)) Σ_{h=1}^{g} Σ_{i=1}^{m_h} (y_{ih} - X_{ih} β_{ih})^T Λ_{ih}^T Λ_{ih} (y_{ih} - X_{ih} β_{ih}) }
                      × (2π)^{-M/2} [ Π_{h=1}^{g} Π_{i=1}^{m_h} |Σ_B|^{-1/2} ] exp{ -(1/2) Σ_{h=1}^{g} Σ_{i=1}^{m_h} (β_{ih} - β_h)^T Σ_B^{-1} (β_{ih} - β_h) }        (9)

where M = (Σ_{h=1}^{g} m_h) × (2K + 1). Letting

    β̂_{ih} = (X_{ih}^T Λ_{ih}^T Λ_{ih} X_{ih})^{-1} (X_{ih}^T Λ_{ih}^T Λ_{ih} y_{ih}),

we have

    (y_{ih} - X_{ih} β_{ih})^T Λ_{ih}^T Λ_{ih} (y_{ih} - X_{ih} β_{ih})
        = (y_{ih} - X_{ih} β̂_{ih})^T Λ_{ih}^T Λ_{ih} (y_{ih} - X_{ih} β̂_{ih}) + (β_{ih} - β̂_{ih})^T X_{ih}^T Λ_{ih}^T Λ_{ih} X_{ih} (β_{ih} - β̂_{ih}).        (10)

Using the following matrix identity (see, e.g., the Appendix of Chapter 7 of Box and Tiao, 1973),

    (x - a)^T A (x - a) + (x - b)^T B (x - b) = (x - c)^T (A + B)(x - c) + (a - b)^T A (A + B)^{-1} B (a - b),

where c = (A + B)^{-1}(Aa + Bb), we obtain

    (1/σ^2)(β_{ih} - β̂_{ih})^T X_{ih}^T Λ_{ih}^T Λ_{ih} X_{ih} (β_{ih} - β̂_{ih}) + (β_{ih} - β_h)^T Σ_B^{-1} (β_{ih} - β_h)
        = (β_{ih} - c_{ih})^T (A_{ih} + B_{ih})(β_{ih} - c_{ih}) + (β̂_{ih} - β_h)^T A_{ih} (A_{ih} + B_{ih})^{-1} B_{ih} (β̂_{ih} - β_h)        (11)

where A_{ih} = (1/σ^2) X_{ih}^T Λ_{ih}^T Λ_{ih} X_{ih}, B_{ih} = Σ_B^{-1}, and c_{ih} = (A_{ih} + B_{ih})^{-1}(A_{ih} β̂_{ih} + B_{ih} β_h). Putting (10) and (11) into (9), the joint probability density function of y and β given λ and X can be written as

    p(y, β | λ, X) = (2πσ^2)^{-N/2} (2π)^{-M/2} [ Π_{h=1}^{g} Π_{i=1}^{m_h} |W_{ih}^T W_{ih}|^{-1/2} |Σ_B|^{-1/2} ]
        × exp{ -(1/2) Σ_{h=1}^{g} Σ_{i=1}^{m_h} [ (1/σ^2)(y_{ih} - X_{ih} β̂_{ih})^T Λ_{ih}^T Λ_{ih} (y_{ih} - X_{ih} β̂_{ih})
            + (β_{ih} - c_{ih})^T (A_{ih} + B_{ih})(β_{ih} - c_{ih}) + (β̂_{ih} - β_h)^T A_{ih} (A_{ih} + B_{ih})^{-1} B_{ih} (β̂_{ih} - β_h) ] }.

Integrating out β_{ih} yields the marginal distribution of y given λ and X,

    p(y | λ, X) = (2πσ^2)^{-N/2} [ Π_{h=1}^{g} Π_{i=1}^{m_h} |W_{ih}^T W_{ih}|^{-1/2} |Σ_B|^{-1/2} |A_{ih} + B_{ih}|^{-1/2} ] exp{ -(1/2) Σ_{h=1}^{g} Σ_{i=1}^{m_h} (S_{1ih} + S_{2ih}) }

where

    S_{1ih} = (1/σ^2)(y_{ih} - X_{ih} β̂_{ih})^T Λ_{ih}^T Λ_{ih} (y_{ih} - X_{ih} β̂_{ih})

and

    S_{2ih} = (β̂_{ih} - β_h)^T A_{ih} (A_{ih} + B_{ih})^{-1} B_{ih} (β̂_{ih} - β_h).

The loglikelihood function for λ is therefore given by

    L(λ) = -(1/2) Σ_{h=1}^{g} Σ_{i=1}^{m_h} [ n_{ih} log(2πσ^2) + log|W_{ih}^T W_{ih}| + log|Σ_B| + log|A_{ih} + B_{ih}| + S_{1ih} + S_{2ih} ].        (12)
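For computation it is convenient to note that each subject contributes a pair of quadratic forms S_{1ih} and S_{2ih} together with three log-determinant terms to (12). The sketch below evaluates one subject's contribution; it assumes the matrices Λ_{ih} and W_{ih}^T W_{ih} have already been constructed from the ARMA parameters as in the Appendix, and the function name is ours.

    import numpy as np

    def subject_loglik(y, X, Lam, WtW, beta_h, Sigma_B, sigma2):
        """One subject's contribution to the loglikelihood (12); all inputs assumed given."""
        logdet = lambda M: np.linalg.slogdet(M)[1]
        n = len(y)
        G = Lam.T @ Lam                               # Lambda_ih' Lambda_ih
        A = (X.T @ G @ X) / sigma2                    # A_ih
        B = np.linalg.inv(Sigma_B)                    # B_ih
        beta_hat = np.linalg.solve(X.T @ G @ X, X.T @ G @ y)
        r = y - X @ beta_hat
        S1 = (r @ G @ r) / sigma2
        d = beta_hat - beta_h
        S2 = d @ A @ np.linalg.solve(A + B, B @ d)
        return -0.5 * (n * np.log(2 * np.pi * sigma2) + logdet(WtW)
                       + logdet(Sigma_B) + logdet(A + B) + S1 + S2)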

3 Estimation and Computation

3.1 Estimation

We wish to compare the rhythms of depressed patients and controls by fitting the data using the two-stage hierarchical model described in the previous section. Assuming the values of the parameters were known, the model would enable us to make comparisons in two ways. First, we could judge the magnitude of the discrepancies between the population parameters, λ_depressed and λ_control. Second, we could plot the systematic components of (4) for all individuals in the sample. This would help us understand the variability among subjects, as well as the distinctions between the rhythms in the two groups. Since we do not know the values of the parameters in the model, we will estimate them. The population parameters λ will be estimated by the value λ̂ that maximizes the likelihood function in (12). Then the β_{ih}'s will be estimated by their conditional posterior means, conditionally on λ = λ̂. These estimates are the usual "parametric empirical Bayes" (PEB) estimates of the quantities β_{ih}. According to standard large-sample theory, the assumption that the number of patients in each group is becoming infinite provides asymptotic Normality of the MLE's, with the observed information matrix (i.e., the negative Hessian of the loglikelihood, -D^2 L(λ)) furnishing the asymptotic precision matrix. We can then examine the MLE's of parameters of interest, together with their standard errors. We will also be able to produce plots of the individual systematic components, using the PEB estimates of the subject-specific parameters.
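In this formulation the conditional posterior mean of β_{ih} given λ = λ̂ is the quantity c_{ih} appearing in (11): a precision-weighted combination of the subject's own estimate β̂_{ih} and the group mean β_h. A minimal sketch of that shrinkage computation, with A_{ih} and B_{ih} assumed available as in the previous sketch, is

    import numpy as np

    def peb_estimate(beta_hat_ih, beta_h, A_ih, B_ih):
        """Parametric empirical Bayes estimate of beta_ih: the conditional posterior
        mean c_ih = (A_ih + B_ih)^{-1} (A_ih beta_hat_ih + B_ih beta_h)."""
        return np.linalg.solve(A_ih + B_ih, A_ih @ beta_hat_ih + B_ih @ beta_h)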

We note that both maximum likelihood and parametric empirical Bayes estimation may be justified as providing approximate fully Bayes estimates. That is, inference regions having confidence 1 - α based on the asymptotic Normal distribution of the MLE λ̂ also have posterior probability 1 - α based on the limiting Normal posterior distribution of λ. Similarly, the parametric empirical Bayes estimates of β_{ih} differ from fully Bayes posterior expectations by a term that vanishes at a rate inversely proportional to the number of patients (see Kass and Steffey, 1989).

3.2 Computation

To use the two-stage hierarchical model, we must first identify a common first-stage model that can adequately represent the rhythm under study, and then estimate the unknown parameters of the model. GKT describe a procedure for model specification and estimation for the first-stage model. We apply this procedure to each individual and obtain initial estimates for the second-stage model using steps 1 and 2 below, and then obtain final estimates of all the parameters simultaneously using step 3.

1. Tentative model selection for the deterministic process: Obtain initial estimates of the frequency for each subject in group h, either from a priori knowledge or from spectral analysis of Y_{ijh}. An initial estimate of ω_h in (4) can be obtained as the average of the individual frequencies. With this initial estimate of ω_h, we can then obtain initial estimates of β_{ih} and σ^2 using maximum likelihood estimation by treating the stochastic component of (4) as i.i.d. random variables. This step can be done using standard statistical packages, such as BMDP or ISP.

2. Tentative model selection for the stochastic component: Using the initial estimates of the deterministic component from step 1, form the estimated residuals Ẑ_{ijh} = Y_{ijh} - f̂_{ih}(j), where f̂_{ih}(j) is the deterministic component evaluated at the initial values obtained in step 1. A tentative ARMA model for the stochastic component Z_{ijh} can be specified for each subject i (for details, see GKT), and then an overall ARMA model can be chosen for Z_{ijh}. The maximum likelihood estimates of the common time series parameters of Z_{ijh} can then be obtained.

3. Full model fitting: By means of maximum likelihood estimation with the combined, tentative model obtained in steps 1 and 2, reestimate simultaneously all the parameters in the full model, i.e., both the deterministic and stochastic components. The likelihood function to be maximized is specified in section 2.3.

To carry out steps 2 and 3, we have written and used two FORTRAN programs. The first program minimizes the residual sum of squares of the time series Z_{ijh} in step 2. This allows us to obtain the maximum likelihood estimates of the time series parameters across all of the subjects. The second program simultaneously reestimates all of the parameters in the full model in step 3 and obtains standard errors of the parameters based on the inverse of the observed information matrix (Kass, 1987). An illustrative sketch of steps 1 and 2 in terms of standard numerical tools is given below.
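The following is a rough stand-in for steps 1 and 2 using standard numerical tools rather than BMDP, ISP, or the FORTRAN programs described above: a periodogram-based initial frequency, an ordinary least squares harmonic fit, and AR coefficients for the residuals obtained by Yule-Walker. It illustrates the flavor of the initial-value steps only and is not the exact likelihood procedure of section 2.3.

    import numpy as np

    def initial_estimates(y, K=2, p=2):
        """Step 1-2 style initial values: frequency from the periodogram,
        harmonic coefficients by OLS, AR(p) coefficients from the residuals."""
        n = len(y)
        t = np.arange(1, n + 1)
        # Step 1: crude frequency estimate from the largest periodogram ordinate
        freqs = np.fft.rfftfreq(n)                    # cycles per observation
        power = np.abs(np.fft.rfft(y - y.mean())) ** 2
        omega0 = freqs[1:][np.argmax(power[1:])]      # skip the zero frequency
        # Harmonic regression at omega0 fitted by ordinary least squares
        X = np.column_stack([np.ones(n)] +
                            [f(2 * np.pi * k * omega0 * t) for k in range(1, K + 1)
                             for f in (np.cos, np.sin)])
        beta0, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta0
        # Step 2: Yule-Walker estimates of AR(p) coefficients from the residuals
        acov = np.array([np.dot(resid[:n - k], resid[k:]) / n for k in range(p + 1)])
        R = np.array([[acov[abs(i - j)] for j in range(p)] for i in range(p)])
        phi0 = np.linalg.solve(R, acov[1:])
        return omega0, beta0, phi0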

4 Comparative Analysis of the Circadian Variation of Cortisol

As described earlier, we are concerned with studying changes in the circadian variation of the hormone cortisol in a group of depressed patients as compared to a group of healthy control subjects. The protocol for the University of Pittsburgh study required all subjects to enter the hospital for a period of 7 days. Following several adaptation days, blood sampling began on day 4 around 6:00 p.m. to obtain plasma cortisol measurements. Samples were obtained every 20 minutes thereafter for approximately the next 48 hours. The samples were labelled with respect to time, stored, and assayed within 72 hours to obtain the cortisol values. This analysis is based on 14 depressed outpatients (11 females; median age 50, range 30-63) and 6 normal controls (1 female; median age 28.5, range 24-54). All the patients satisfied research diagnostic criteria for a major depression, and the normal controls were volunteers from the community. The large proportion of females in the depressed group reflects the prevalence of the disease among men and women in the population. Figure 1 presents the cortisol levels for 4 healthy subjects as an illustration of the pattern of cortisol concentration in the blood over a 2 day period. We see that the general circadian pattern of cortisol concentration is similar in these subjects, but we also see a great deal of individual variation.

Using the iterative methods for model specification described in steps 1 and 2 of section 3.2, we determined that a suitable harmonic regression model for stage one of our two-stage hierarchical model based on these data should consist of, for the deterministic component, the fundamental and the first harmonic (K = 2), and, for the stochastic component, an autoregressive process of order 2. We treat as randomly varying across individuals, i.e., as "random effects" in this model, the level, the amplitudes, and the phases, as in (4) of section 2.2. The frequency and the autoregressive parameters are taken as common to all subjects, i.e., as "fixed effects." Given the sample size, this model with 20 parameters is the most general, with respect to the number of parameters, that we could seriously consider.

Tables 1 and 2 give the empirical Bayes or approximate Bayes estimates of β_{ih}, the parameters of the first-stage model, for each subject in the depressed group and the control group, respectively. For ease of interpretation, we have transformed the A_k's and the B_k's back to amplitudes and phases (in radians). Table 3 gives the estimates of the second-stage parameters for both groups. The estimates of the common AR parameters (standard errors in parentheses) are φ̂_1 = 0.792 (0.0196) and φ̂_2 = -0.140 (0.0194), with σ̂ = 2.17. From Table 3 we see that the differences between the two groups with respect to the estimates of the second-stage parameters, namely the level, the amplitudes (R's), the phases (φ's), and the frequency, are not large compared to their standard errors. In Figures 2 and 3 we plot the posterior fit for the deterministic component of the model for each individual along with the estimate of the overall cortisol rhythm for each group. It can be seen from the figures that the estimates of the two group curves for the cortisol rhythm are very similar. We also see that subject 1 in the control group seems to have a higher level of cortisol than the other subjects in that group. We reanalyzed the data without that subject and again found no substantial differences between the groups with respect to the amplitudes, phases, and frequency. However, there is an apparent difference between the groups with respect to level: when subject 1 is excluded, patients have higher average cortisol levels than controls (patients = 7.425, controls = 5.896, se(diff) = 0.688).
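The amplitudes and phases reported in Tables 1-3 are obtained from the estimated A_k and B_k by inverting the relation A_k = R_k cos(2πkφ_k), B_k = R_k sin(2πkφ_k) of section 2.1. A minimal sketch of that back-transformation is given below; we assume the tabulated phases are the angles 2πkφ_k in radians, since the exact convention is not spelled out in the text, and the function name is ours.

    import numpy as np

    def amp_phase(A_k, B_k):
        """Convert harmonic coefficients (A_k, B_k) to amplitude R_k and a phase angle
        in radians; the angle equals 2*pi*k*phi_k in the notation of section 2.1."""
        R_k = np.hypot(A_k, B_k)                      # sqrt(A_k**2 + B_k**2)
        theta_k = np.arctan2(B_k, A_k)
        return R_k, theta_k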

5 Discussion

5.1 Data Analysis

The understanding and analysis of biological rhythms has important implications for health and disease. In this paper we have been concerned with abnormalities in the circadian rhythm of plasma cortisol concentration in depressed patients. We have developed a statistical model for describing the circadian variation of plasma cortisol, using a harmonic regression model to fit the underlying circadian rhythm and an ARMA process to pick up fluctuations from it. To accommodate variation across subjects and to compare the cortisol rhythm in depressed patients to controls, we use a two-stage hierarchical model in which certain parameters in the harmonic regression model are assumed to follow probability distributions. The analysis of the cortisol data presented here suggests that differences in the circadian rhythm of plasma cortisol between depressed patients and healthy controls are probably small. In addition, using the parametric empirical Bayes estimates of the subject-specific parameters, we have been able to examine the posterior fit for the deterministic component of the model for each individual; this helped us to identify one subject in the control group who had a higher level of cortisol than the other subjects in the group.

There are three components in the model we have used: the deterministic part of the first stage, the stochastic part of the first stage, and the second stage. In the deterministic component we chose to fit a trigonometric series with a fundamental frequency and one higher harmonic. This harmonic picks up a substantial part of the departure from the model using only the fundamental frequency. However, our empirical fitting procedure did not include a careful attempt to determine the precise nature of this departure. Thus, although the extra harmonic might genuinely represent what is often called an "ultradian rhythm" with a period of about 12 hours, there is nothing in our analysis that provides real evidence for such a claim. Further effort to establish the existence of a 12-hour rhythm would require additional data so that higher-order harmonics could be considered as well. We also emphasize that our approach has been to model the concentration of cortisol in the blood over time rather than attempting to model the physiological process responsible for the secretion of the hormone. The latter is an interesting and challenging modeling problem addressing different substantive questions than are being asked in the Pittsburgh study.

Although some of our preliminary analysis included diagnostic checking of first-stage

assumptions, our choices for all three components were based largely on convenience, and we did not perform thorough sensitivity studies. Additional checks designed for two-stage models could be conducted as in, for example, Waternaux, Laird, and Ware (1989). Also, it would be possible to use a non-Normal second-stage distribution; this would entail a nontrivial computational burden, though we believe the problem could be solved with current numerical integration techniques.

5.2 Methodology

We have used a general ARMA model for the error structure in (1) to represent the dependence among the repeated measurements made on each subject. With sufficiently many subjects it would be possible to estimate a more general covariance structure. Use of the ARMA model, on the other hand, reduces the characterization of the within-subject covariance to a small number of parameters, and is therefore suitable for studies with relatively small numbers of both subjects and observations per subject. In a recent article, Chi and Reinsel (1989) consider a linear model with random effects and AR(1) errors for longitudinal data. They derive a score test to detect serial correlation in the within-individual errors for random-effects models that are "conditionally independent" in the sense that, conditional on the random effects, the observations on a given individual are assumed to be independent. They further use a scoring method for maximum likelihood estimation. However, there are several differences between the present article and that of Chi and Reinsel. First, the model used in this article is nonlinear; it estimates the fundamental frequency jointly with all the other parameters, and this frequency is often of major interest. Second, the error terms of the present article may follow general ARMA models; an AR(1) model often cannot capture the periodic pattern shown in the data. Third, this article emphasizes the comparison of a circadian rhythm between depressed patients and normal healthy controls, not the detection of serial correlation. Finally, Chi and Reinsel allowed for irregularly sampled data, whereas only equally spaced observations are considered in the present article. Of course, a combination of the ideas given in the two articles could be useful in applications.

An alternative method to compute the exact likelihood function of Z_t is to use a state-space formulation and the Kalman filter. Nice properties of the Kalman filter approach include the ease of handling missing observations and the explicit use of a diffuse prior in the case of unit-root

nonstationarity of Z_t. See Jones (1980, 1986), Harvey and Pierce (1984), Ansley and Kohn (1985), and Kohn and Ansley (1986), among others. We used a regression approach in this article for various reasons. First, the regression approach provides a quadratic expression that is handy in the hierarchical setting. Second, recent research shows that by treating missing observations as additive outliers the regression approach can also easily handle the missing value problem (see Ljung, 1989, 1990). Third, by adopting the techniques in Wincek and Reinsel (1986), it is not difficult to generalize the regression approach to the case of irregularly spaced observations.

APPENDIX: Exact Likelihood Function of an ARMA(p,q) Model

Consider a stationary ARMA(p,q) model

    Z_t - Σ_{j=1}^{p} φ_j Z_{t-j} = a_t - Σ_{j=1}^{q} θ_j a_{t-j}                       (A.1)

where the a_t's are independent normal random variates with mean zero and variance σ^2. Suppose the data z = (Z_1, ..., Z_n)^T are available and our goal is to derive the associated exact likelihood function. There are two approaches to evaluating the exact likelihood function of a stationary ARMA process. The first approach uses a state-space model representation and the Kalman filter recursion, e.g., Jones (1980). The second approach uses linear regression techniques to obtain estimates of the starting values of the process so that the exact likelihood function can be expressed as a quadratic function of the data, e.g., Box and Jenkins (1976), Hillmer and Tiao (1979), and Abraham and Ledolter (1983). The first approach is useful in computation, especially when missing observations exist. However, since we are interested in obtaining an exact expression for the likelihood of a two-stage hierarchical model, the second approach is used in this article.

Rewrite the ARMA(p,q) model (A.1) more compactly as φ(B)Z_t = θ(B)a_t, where φ(B) = 1 - φ_1 B - ... - φ_p B^p, θ(B) = 1 - θ_1 B - ... - θ_q B^q, and B is the usual backshift operator such that B^j Z_t = Z_{t-j}. Let π_i be the π-weights of the model, that is, π_i is the coefficient of B^i in π(B) = 1 - Σ_{i=1}^{∞} π_i B^i = φ(B)/θ(B). Let z* = (z_0, ..., z_{1-p}, a_0, ..., a_{1-q})^T be the vector of unknown starting values of the process Z_t. Furthermore, for 1 ≤ k ≤ p, define Φ(k) = (φ_k, ..., φ_p, 0, ..., 0) to be the p-dimensional row vector whose last k - 1 elements are zero. For k > p, Φ(k) is the p-dimensional null row vector. Similarly, define the q-dimensional row vector Θ(k) = (θ_k, ..., θ_q, 0, ..., 0) for 1 ≤ k ≤ q, and Θ(k) = 0 for k > q. Finally, let G(k) = (-Φ(k), Θ(k)). Then, simple algebra shows that

    a_t = Z_t - Σ_{j=1}^{t-1} π_j Z_{t-j} + h_t z*                                      (A.2)

where h_1 = G(1) and the other h_t's are defined recursively by

    h_t = G(t) + Σ_{j=1}^{t-1} θ_j h_{t-j}    for 2 ≤ t ≤ max{p, q},
    h_t = Σ_{j=1}^{q} θ_j h_{t-j}             for t > max{p, q},                        (A.3)

with θ_j = 0 if j > q. Putting equation (A.2) in matrix form, we have

    ( z* )   ( 0 )       ( I_{p+q} )
    ( a  ) = ( L ) z  +  (    D    ) z*                                                 (A.4)

where a = (a_1, ..., a_n)^T, D is the n × (p + q) matrix whose t-th row is h_t defined in (A.3), and L is the n × n lower-triangular matrix of π-weights of Z_t, with unity on the diagonal and with i-th row (-π_{i-1}, ..., -π_1, 1, 0, ..., 0). Let σ^2 Σ be the covariance matrix of z*, i.e., E(z* z*^T) = σ^2 Σ. Then one can find a matrix P such that P Σ P^T = I_{p+q}. Pre-multiplying equation (A.4) by the (n + p + q) × (n + p + q) matrix diag{P, I_n}, where I_n is the n × n identity matrix, we obtain

    ( u* )   ( 0 )       ( I_{p+q}  )
    ( a  ) = ( L ) z  +  ( D P^{-1} ) u*                                                (A.5)

where u* = P z*. Let

    W = ( I_{p+q}  )        and        L* = ( 0 )
        ( D P^{-1} )                        ( L )

so that (u*^T, a^T)^T = L* z + W u*. Now, it is important to note that (1) E(u* u*^T) = E(P z* z*^T P^T) = P E(z* z*^T) P^T = σ^2 P Σ P^T = σ^2 I_{p+q}, and that (2) u* involves only a_{1-q}, ..., a_0, z_{1-p}, ..., z_0, and so is independent of a. Thus the elements of (u*^T, a^T)^T are independent normal random variables with mean zero and variance σ^2. The joint probability density function of (u*^T, a^T)^T is given by

    p(u*, a | λ_c) = (2πσ^2)^{-(n+p+q)/2} exp{ -(1/(2σ^2)) (u*^T u* + a^T a) }

where λ_c = (φ_1, ..., φ_p, θ_1, ..., θ_q, σ^2)^T is the vector of common unknown parameters. Since the Jacobian of the transformation in (A.5) is unity, we have that

    p(z, u* | λ_c) = (2πσ^2)^{-(n+p+q)/2} exp{ -(1/(2σ^2)) S(λ_c, u*) }                 (A.6)

where

    S(λ_c, u*) = (L* z + W u*)^T (L* z + W u*)

is a sum of squares that depends on, among others, u*, which in turn is a function of the unknown starting values z*. Now it can be shown that

    S(λ_c, u*) = S(λ_c) + (u* - û*)^T W^T W (u* - û*)

where û* = -(W^T W)^{-1} W^T L* z and

    S(λ_c) = S(λ_c, û*) = (L* z + W û*)^T (L* z + W û*).

Equation (A.6) then becomes

    p(z, u* | λ_c) = (2πσ^2)^{-(n+p+q)/2} exp{ -(1/(2σ^2)) [ S(λ_c) + (u* - û*)^T W^T W (u* - û*) ] }        (A.7)

the left-hand side of which can be written as p(z, u* | λ_c) = p(z | λ_c) p(u* | z, λ_c). Integrating equation (A.7) over u*, we obtain

    p(z | λ_c) = (2πσ^2)^{-n/2} |W^T W|^{-1/2} exp{ -(1/(2σ^2)) S(λ_c) }                (A.8)

which is the exact likelihood function for an ARMA(p,q) process. Furthermore,

    L* z + W û* = L* z - W (W^T W)^{-1} W^T L* z = (I_{p+q+n} - W (W^T W)^{-1} W^T) L* z,

and letting

    Λ = (I_{p+q+n} - W (W^T W)^{-1} W^T) L*,

a (p + q + n) × n matrix, we have L* z + W û* = Λ z and therefore

    p(z | λ_c) = (2πσ^2)^{-n/2} |W^T W|^{-1/2} exp{ -(1/(2σ^2)) z^T Λ^T Λ z },

which is the result used in Section 2.
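As a small numerical companion to the Appendix, the sketch below computes the π-weights of π(B) = φ(B)/θ(B) and assembles the lower-triangular matrix L used above. The recursion is the standard one obtained by equating coefficients in θ(B)π(B) = φ(B); the code itself is our own illustration, not part of the original FORTRAN implementation.

    import numpy as np

    def pi_weights(phi, theta, n):
        """First n pi-weights, where pi(B) = 1 - sum_i pi_i B^i = phi(B)/theta(B)."""
        p, q = len(phi), len(theta)
        pi = np.zeros(n + 1)
        for j in range(1, n + 1):
            pi[j] = (phi[j - 1] if j <= p else 0.0) - (theta[j - 1] if j <= q else 0.0)
            pi[j] += sum(theta[k - 1] * pi[j - k] for k in range(1, min(j - 1, q) + 1))
        return pi[1:]

    def L_matrix(phi, theta, n):
        """n x n lower-triangular matrix with i-th row (-pi_{i-1}, ..., -pi_1, 1, 0, ..., 0)."""
        pi = pi_weights(phi, theta, n)
        L = np.eye(n)
        for i in range(1, n):
            L[i, :i] = -pi[:i][::-1]
        return L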

References

Abraham, B., and Ledolter, J. (1983), Statistical Methods for Forecasting, New York: Wiley.

Ansley, C. F., and Kohn, R. (1985), "Estimation, Filtering and Smoothing in State Space Models with Incompletely Specified Initial Conditions," Annals of Statistics, 13, 1286-1316.

Box, G. E. P., and Jenkins, G. M. (1976), Time Series Analysis: Forecasting and Control, 2nd Edition, San Francisco: Holden-Day.

Box, G. E. P., and Tiao, G. C. (1973), Bayesian Inference in Statistical Analysis, London: Addison-Wesley.

Chi, E. M., and Reinsel, G. C. (1989), "Models for Longitudinal Data with Random Effects and AR(1) Errors," Journal of the American Statistical Association, 84, 452-459.

Harvey, A. C., and Pierce, R. G. (1984), "Estimating Missing Observations in Economic Time Series," Journal of the American Statistical Association, 79, 125-131.

Hillmer, S. C., and Tiao, G. C. (1979), "Likelihood Function of Stationary Multiple Autoregressive Moving Average Models," Journal of the American Statistical Association, 74, 652-660.

Greenhouse, J. B., Kass, R. E., and Tsay, R. S. (1987), "Fitting Nonlinear Models with ARMA Errors to Biological Rhythm Data," Statistics in Medicine, 6, 167-183.

Jones, R. H. (1980), "Maximum Likelihood Fitting of ARMA Models to Time Series With Missing Observations," Technometrics, 22, 389-395.

Jones, R. H. (1986), "Random Effects and the Kalman Filter," in Proceedings of the Business and Economic Statistics Section, American Statistical Association, pp. 69-75.

Kass, R. E. (1987), "Computing Observed Information by Finite Differences," Communications in Statistics: Simulation and Computation, B16, 587-599.

Kass, R. E., and Steffey, D. (1989), "Approximate Bayesian Inference in Conditionally Independent Hierarchical Models (Parametric Empirical Bayes Models)," Journal of the American Statistical Association, 84, 717-726.

Kohn, R., and Ansley, C. F. (1986), "Estimation, Prediction, and Interpolation for ARIMA Models With Missing Values," Journal of the American Statistical Association, 81, 751-761.

Kupfer, D. J., and Foster, F. G. (1972), "Interval between Onset of Sleep and Rapid-Eye-Movement Sleep as an Indicator of Depression," Lancet, 2, 684-686.

Laird, N. M., and Ware, J. H. (1982), "Random-Effects Models for Longitudinal Data," Biometrics, 38, 963-974.

Ljung, G. M. (1989), "A Note on the Estimation of Missing Values in Time Series," Communications in Statistics, Series B, 18, 459-465.

Ljung, G. M. (1990), "Outlier Detection in Time Series," Working Paper, Statistics Center, Massachusetts Institute of Technology.

Minors, D. S., and Waterhouse, J. M. (1988), "Mathematical and Statistical Analysis of Circadian Rhythms," Psychoneuroendocrinology, 13, 443-464.

Moore-Ede, M. C., Czeisler, C. A., and Richardson, G. S. (1983), "Circadian Timekeeping in Health and Disease: Part 2. Clinical Implications of Circadian Rhythmicity," The New England Journal of Medicine, 309, 530-536.

Moore-Ede, M. C., Sulzman, F. M., and Fuller, C. A. (1982), The Clocks That Time Us: Physiology of the Circadian Timing System, Cambridge: Harvard University Press.

Olshen, R. A., Biden, E. N., Wyatt, M. P., and Sutherland, D. H. (1989), "Gait Analysis and the Bootstrap," The Annals of Statistics, 17, 1419-1440.

Reinsel, G. C., and Tiao, G. C. (1987), "Impact of Chlorofluoromethanes on Stratospheric Ozone," Journal of the American Statistical Association, 82, 20-30.

Sack, D. A., Rosenthal, N. E., Parry, B. L., and Wehr, T. A. (1987), "Biological Rhythms in Psychiatry," in Psychopharmacology: The Third Generation of Progress, edited by H. Y. Meltzer, New York: Raven Press.

Tong, Y. L. (1976), "Parameter Estimation in Studying Circadian Rhythms," Biometrics, 32, 85-94.

Waternaux, C., Laird, N. M., and Ware, J. H. (1989), "Methods for Analysis of Longitudinal Data: Blood-Lead Concentrations and Cognitive Development," Journal of the American Statistical Association, 84, 33-41.

Wincek, M. A., and Reinsel, G. C. (1986), "An Exact Maximum Likelihood Estimation Procedure for Regression-ARMA Time Series Models With Possibly Nonconsecutive Data," Journal of the Royal Statistical Society, Ser. B, 48, 303-313.

Table 1: Posterior Estimates of the Parameters of the First-Stage Model for Each Subject in the Depressed Group

Subject       μ           R_1         φ_1         R_2         φ_2
   1        5.75073     2.762761    2.313958    1.526266    -0.021732
   2        7.15767     4.795436    2.135773    1.669860    -0.019927
   3        7.60592     4.189216    2.298618    1.715918    -0.019342
   4        7.37679     3.847971    2.284967    1.710920    -0.019436
   5        6.07798     4.633564    2.363536    1.703300    -0.019489
   6        7.07930     3.961659    2.204823    1.619559    -0.020524
   7        9.53174     4.425398    2.285647    1.556337    -0.021368
   8        8.68716     4.016897    2.340257    1.575313    -0.021084
   9        6.79398     4.271711    2.255000    1.689855    -0.019683
  10        9.65074     4.132880    2.265791    1.673365    -0.019837
  11        8.67777     4.845589    2.313336    1.640384    -0.020247
  12        7.78879     4.586333    2.320656    1.722901    -0.019279
  13        6.26715     4.010485    2.162387    1.691691    -0.029659
  14        5.52294     4.119850    2.329102    1.626388    -0.020408

Table 2: Posterior Estimates of the Parameters of the First-Stage Model for Each Subject in the Control Group

Subject       μ           R_1         φ_1         R_2         φ_2
   1       11.41896     5.085662    2.565209    2.577948    -0.016294
   2        4.941418    3.490282    2.461203    2.293906    -0.018283
   3        5.357593    3.735369    2.386339    2.374498    -0.017650
   4        7.561059    4.597357    2.669817    2.331090    -0.017967
   5        5.508472    4.388673    2.625529    2.298040    -0.018226
   6        6.551861    3.941468    2.467842    2.247713    -0.018660

Table 3: Estimates of the Parameters of the Second-Stage Model for Each Group

Parameter    Patients    Controls    Difference    SE(Diff)
   μ          7.425       6.890        0.536        0.865
   R_1        4.177       4.186       -0.009        0.489
   φ_1        2.276       2.539       -0.263        0.155
   R_2        1.652       2.357       -0.705        0.378
   φ_2       -0.020      -0.018       -0.002        0.284
   ω          0.01402     0.01378      0.0002       0.0003

Figure Titles

Figure 1(a-d). Examples of the circadian pattern of cortisol secretion over an approximate 48 hour period in four healthy control subjects. The smooth curve is the posterior fit based on the two-stage hierarchical model (see text). Measurements are obtained three times an hour. The units of time are indexed by sequential measurement number.

Figure 2. Posterior fit of the deterministic component of the model for each subject in the depressed group (—) and the estimate of the overall population curve from the second stage of the hierarchical model for the depressed group (*—).

Figure 3. Posterior fit of the deterministic component of the model for each subject in the control group (—) and the estimate of the overall population curve from the second stage of the hierarchical model for the control group (O—).

[Figure 1a: cortisol concentration over time for a healthy subject, with posterior fit]

[Figure 1b: cortisol concentration over time for a healthy subject, with posterior fit]

[Figure 1c: cortisol concentration over time for a healthy subject, with posterior fit]

[Figure 1d: cortisol concentration over time for a healthy subject, with posterior fit]

[Figure 2: individual and overall fitted cortisol rhythms, depressed group]

[Figure 3: individual and overall fitted cortisol rhythms, control group]
