Faculty Research

Estimation in the continuous time mover-stayer model with an application to bond ratings migration*

Halina Frydman**
Stern School of Business, New York University

Ashay Kadam***
University of Michigan Business School

November 28th, 2002

*The authors are very grateful to Roger Stein, Managing Director of Quantitative Risk Analytics at Moody's Risk Management Services, for providing data and for useful discussions. We thank Petra Miskov for her very skillful work on organizing and analyzing the data. Halina Frydman's research on this paper was supported by the summer research grant from the Stern School of Business at New York University.
**Information, Operations and Management Sciences, email: hfrydman@stern.nyu.edu
***Statistics and Management Science Department, email: ashay@umich.edu

Abstract

The usual tool for modeling bond ratings migration is a discrete, time-homogeneous Markov chain. Such a model assumes that all bonds are homogeneous with respect to their movement behavior among rating categories and that the movement behavior does not change over time. However, among the recognized sources of heterogeneity in ratings migration is the age of a bond (time elapsed since issuance). It has been observed that young bonds have a lower propensity to change ratings, and thus to default, than more seasoned bonds. The aim of this paper is to introduce a continuous, time-nonhomogeneous model for bond ratings migration, which also incorporates a simple form of population heterogeneity. The specific form of heterogeneity postulated by the proposed model appears to be suitable for modeling the effect of the age of a bond on its propensity to change ratings. This model, called a mover-stayer model, is an extension of a time-nonhomogeneous Markov chain. This paper derives the maximum likelihood estimators for the parameters of a continuous time mover-stayer model based on a sample of independent continuously monitored histories of the process, and develops the likelihood ratio test for discriminating between the Markov chain and the mover-stayer model. The methods are illustrated using a sample of rating histories of young corporate issuers. For this sample, the likelihood ratio test rejects a Markov chain in favor of a mover-stayer model. For young bonds with the lowest rating, the default probabilities predicted by the mover-stayer model are substantially lower than those predicted by the Markov chain.

Keywords: Ratings migration, mover-stayer model, Markov chain, estimation
JEL Classification: C13, G33

1 Introduction

The usual tool for modeling bond ratings migration is a time-homogeneous, discrete Markov chain. Such modeling assumes that all bond issues are homogeneous with respect to their movement behavior among rating categories and also that their behavior does not change over time. However, there are many sources of heterogeneity in the rating behavior, of which the age of a bond has been recognized as an important one. Evidence presented by Altman (1998), Asquith et al. (1989), and Keenan, Sobehart, and Hamilton (1999) and references therein suggests that the propensity of bonds to change rating, and in particular to default, is lower during the early years after issuance than it is for seasoned bonds. However, this aspect of the aging effect has not yet been incorporated in any systematic way in modeling the evolution of ratings.

In this paper we propose a continuous time mover-stayer model to capture the aging effect described above.¹ This model is an extension of a time-nonhomogeneous, continuous Markov chain. It postulates a simple form of heterogeneity: a population of bonds is assumed to consist of two subpopulations, "movers" and "stayers". "Movers" evolve according to a continuous time Markov chain, whereas "stayers" stay in their initial states. The proportion of stayers in each rating state is a parameter of the model and can be interpreted as a measure of immobility for the bonds in a given rating state. If this parameter is zero in each rating, the mover-stayer model reduces to a Markov chain. The mover-stayer model, which allows for a greater degree of immobility of bonds than a Markov chain, may be a better description of ratings migration for younger bonds. This paper develops a methodology for estimation and testing of this model against the continuous Markov chain.

We consider a time-nonhomogeneous mover-stayer model with time measured since the issuance of the bond. More precisely, the age-nonhomogeneity is modeled by assuming that the parameter of the mover-stayer model is a piecewise constant function of age with a constant value for each year of life of the bond. In addition, according to the mover-stayer model, in each year of life a bond may exhibit a stayer or a mover type behavior.
Thus, the proposed model incorporates both time (age) nonhomogeneity and simple heterogeneity in the movement behavior of bonds that are of similar age.

Throughout we consider a continuous time framework, rather than a discrete one, because rating agencies (Moody's, Standard and Poor's) monitor changes in bond ratings on a daily basis, which gives rise to very detailed data. For a given bond issue the complete history of its rating changes is available, which includes the exact dates of rating changes, the types of the changes that occur, and the lengths of stay in different rating states. Modeling such data using only information obtained at discrete time points would necessarily entail a loss of information. In addition, a continuous time framework makes it possible to predict quantities of interest, such as future rating state probabilities, at any relevant time horizon, whereas predictions with a discrete version of the model can be made only at times that are multiples of the sampling interval. Until recently a discrete time framework has been employed almost exclusively to model rating migrations. The first statistical analysis of ratings migration with a continuous time Markov chain was given in Lando and Skodeberg (2002), to which the reader is referred for a more extensive discussion of the advantages of a continuous time framework and related recent references.

¹ The discrete time mover-stayer model, which was introduced by Blumen, Kogan and McCarthy (1955), has since been employed in many areas (e.g. Colombo and Morrison (1989), Sampson (1990), Chatterjee and Ramaswamy (1996), and Chen et al. (1997)). Frydman, Kallberg and Kao (1985) applied the estimation methods for the discrete time mover-stayer model developed in Frydman (1984) to the analysis of credit behavior. Subsequently, Altman and Kao (1991) used this methodology in their study of ratings migration.

To implement the continuous time mover-stayer model, we derive the maximum likelihood (ml) estimators of its parameters based on a sample of independent continuously observed realizations from this process. The ml estimation in the continuous time mover-stayer model has not been considered before.²,³ Based on the derived ml estimators we formulate the likelihood ratio test for discriminating between the Markov chain and the mover-stayer models. The main tool for obtaining the ml estimators is the EM algorithm (Dempster, Laird, and Rubin (1977)). However, we show that when realizations are observed over the same fixed time horizon, the ml estimators can be easily obtained by direct maximization of the likelihood function.
We illustrate our methods using the rating histories of a sample of 856 corporate bond issuers that were observed in the period from January 1985 to December 1995. On the basis of this sample, we estimate the continuous age-nonhomogeneous mover-stayer model and Markov chain. The age-nonhomogeneity is modeled by assuming that the parameter of the process is a step function which is constant within each one-year age interval, that is, the models are estimated separately for bonds in their first year of life, the second year of life, etc., yielding age specific one-year transition probability matrices.⁴ Because of data limitations we estimate the two models only up to the fifth year of life of the bond.

² Maximum likelihood estimation in the discrete time mover-stayer model has been discussed in Frydman (1984), Fuchs and Greenhouse (1988), and Swensen (1996).
³ Maximum likelihood estimation in the mixtures of continuous time Markov chains that generalize the mover-stayer model is considered in Frydman (2002). Another generalization of the discrete time mover-stayer model is in Cook et al. (2001).

We briefly summarize our empirical results reported in Section 4. The likelihood ratio test rejects the Markov chain in favor of the mover-stayer model in each of the one-year age intervals. The overall expected proportion of stayers estimated by the mover-stayer model is large for very young bonds and then decreases as bonds become more seasoned. This is consistent with the aging effect. The interesting aspect of our results is the large estimated proportion of stayers in the C rating for young bonds. Thus, according to the mover-stayer model, a large proportion of young bonds (up to 4 years after issuance) rated C today will stay rated C for a whole year, while the remaining proportion will evolve according to the estimated transition matrix for the movers. This has a substantial implication for the estimation of the probability of default from rating C: for young bonds, the one-year probabilities of default from rating C estimated by the mover-stayer model are substantially smaller than those estimated by a Markov chain. Thus, according to the mover-stayer model, the Markov chain substantially overestimates the probability of default from C for bonds up to 4 years after issuance. The difference in the default probabilities resulting from the two models is largest for very young bonds and then decreases with age. For both models the probability of default is largest in the fifth year and much smaller for younger bonds. We do note that our results are based on a small sample of bonds and thus we treat the empirical analysis solely as an illustration of the methodology. An application of the methodology developed here to a much larger sample is required to evaluate the usefulness of the continuous time mover-stayer model in modeling the aging effect.

This paper is organized as follows. In Section 2 we define the mover-stayer model in continuous time and its time-nonhomogeneous version, which we use to model the aging effect. Section 3 develops the ml estimation in the mover-stayer model from continuous observations. In Section 4 we report and discuss the results of the estimation of the mover-stayer model and the Markov chain for ratings migration of young bond issuers.

⁴ We note that the one-year transition matrix for a portfolio containing bonds of different ages can then be estimated by the weighted average of the age specific one-year transition matrices, with the weights representing the proportions of bond issuers of different ages.
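The weighted average described in the footnote on portfolio transition matrices can be sketched in a few lines. This is only an illustration under assumed inputs: the function name `portfolio_matrix`, the two 2-state matrices, and the weights are hypothetical and are not taken from the paper's data.

```python
def portfolio_matrix(age_matrices, age_weights):
    """Weighted average of age-specific one-year transition matrices.

    age_matrices: list of square matrices (lists of lists), one per age group.
    age_weights:  nonnegative weights, e.g. numbers of issuers of each age.
    """
    total = float(sum(age_weights))
    size = len(age_matrices[0])
    return [[sum(w * m[i][j] for w, m in zip(age_weights, age_matrices)) / total
             for j in range(size)]
            for i in range(size)]


# Hypothetical one-year matrices for two age groups (state 1 is absorbing).
first_year = [[0.9, 0.1], [0.0, 1.0]]
second_year = [[0.8, 0.2], [0.0, 1.0]]

# Hypothetical portfolio: three issuers in their first year for each one in the second.
blended = portfolio_matrix([first_year, second_year], [3, 1])
```

Because each input matrix has rows summing to one and the weights are normalized, the blended matrix is again a transition matrix.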

2 The mover-stayer model

To define a mover-stayer model in continuous time we first consider a Markov chain in continuous time with state space $W = \{1, 2, \ldots, w\}$. The states correspond to different rating categories. Such a chain is characterized by the generator matrix $Q$, which is the matrix with the following structure:

$$q_{ii} \le 0, \qquad q_{ij} \ge 0, \qquad \sum_{j \ne i} q_{ij} = -q_{ii} \equiv q_i, \qquad i \in W.$$

In the context of our application to ratings migration the entries in $Q$ have the following probabilistic interpretation: each time a bond enters rating $i$ it stays in it for a time that is exponentially distributed with parameter $-q_{ii}$. When it exits from rating $i$, it makes a transition to rating $j$, $j \ne i$, with probability $q_{ij}/(-q_{ii})$. In particular, $1/(-q_{ii})$ is the expected length of time for an issuer in rating $i$ to remain in that rating. Matrix $Q$ is called a generator, because it generates $M(t)$, the matrix of transition probabilities $m_{ij}(t)$ of a continuous time Markov chain. $M(t)$ is obtained, for every time $t$, by exponentiation of $tQ$, that is,

$$M(t) = \exp(tQ), \qquad t \ge 0.$$

For the definition of the matrix exponential and an exposition of Markov chains in continuous time see, for example, Norris (1997).

A continuous time mover-stayer model on state space $W = \{1, 2, \ldots, w\}$ is a mixture of two independent Markov chains, one which evolves according to some infinitesimal generator $Q$, and the other whose transition probability matrix is an identity matrix. The transition probability matrix, $P(t)$, of a continuous time mover-stayer model on state space $W$ is then defined as

$$P(t) = S + (I - S)\exp(tQ), \qquad t \ge 0, \tag{1}$$

where $S = \mathrm{diag}(s_1, s_2, \ldots, s_w)$, with $s_i$ = proportion of stayers in state $i$, $i \in W$.
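To make definition (1) concrete, the following minimal sketch computes $M(t) = \exp(tQ)$ and $P(t)$ for a small example. It assumes a hypothetical 3-state generator and hypothetical stayer proportions (neither comes from the paper), and it approximates the matrix exponential with a truncated Taylor series plus scaling and squaring, which is just one convenient pure-Python choice.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(a, squarings=10, terms=20):
    """Approximate exp(a) by a truncated Taylor series with scaling and squaring."""
    n = len(a)
    scaled = [[x / 2 ** squarings for x in row] for row in a]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        # term becomes scaled^k / k!
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [[r + t for r, t in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def mover_stayer_matrix(q, s, t):
    """P(t) = S + (I - S) exp(tQ), as in equation (1)."""
    n = len(q)
    m = mat_exp([[t * x for x in row] for row in q])
    return [[s[i] * (i == j) + (1.0 - s[i]) * m[i][j] for j in range(n)]
            for i in range(n)]


# Hypothetical generator: two ratings plus an absorbing default state.
Q = [[-0.10, 0.08, 0.02],
     [0.05, -0.25, 0.20],
     [0.00, 0.00, 0.00]]
S = [0.3, 0.6, 0.0]  # hypothetical stayer proportions

M = mat_exp(Q)                      # one-year Markov chain matrix
P = mover_stayer_matrix(Q, S, 1.0)  # one-year mover-stayer matrix
```

With the same generator, the mover-stayer default probability from the second rating, `P[1][2]`, is the Markov value `M[1][2]` scaled by `1 - S[1]`. In the paper the two models are estimated separately, so their generators generally differ.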

The Markov chain defined above is time homogeneous because its generator is constant in time. Similarly, the mover-stayer model defined above is time homogeneous because it involves a time homogeneous Markov chain and proportions of stayers that do not change over time. A time-nonhomogeneous Markov chain has a generator $Q(t)$ which is a function of time. A simple, but for our purpose very useful, time-nonhomogeneous Markov chain can be defined by assuming that its generator, $Q(t)$, is a piecewise constant function of time on some time interval $[0, T]$ that corresponds to the time of the study. The particular specification of $Q(t)$ of interest to us is

$$Q(t) = \begin{cases} Q(1), & 0 \le t \le 1, \\ Q(2), & 1 < t \le 2, \\ \;\vdots \\ Q(m), & m-1 < t \le m = T, \end{cases} \tag{2}$$

where $1, 2, \ldots, m-1$ are the times at which regime changes occur and $Q(k)$ is the generator in the $k$'th time subinterval, $1 \le k \le m$. With a view towards our applications, we assume that time is measured in years since the issuance of the bond, so that the generator in (2) depends on the age of the bond issuer and the one-year transition probability matrix for bonds of age $(k-1)$ is given by

$$M(k-1, k) = \exp(Q(k)), \qquad 1 \le k \le m-1.$$

More generally, by the Markov property, the transition probability matrix $M(s,t)$ of a Markov chain with the generator (2) can be easily computed for any choice of $s, t$ such that $0 \le s \le t \le m$. For example,

$$M(0.5, 2.5) = \exp(0.5\,Q(1)) \exp(Q(2)) \exp(0.5\,Q(3)).$$

We define the time-nonhomogeneous mover-stayer model by assuming that the generator of the Markov chain which describes the evolution of movers is as in (2), and that the $k$'th one-year age interval, $(k-1, k)$, has its own vector of proportions of stayers, denoted by $s(k) = (s_i(k), 1 \le i \le w)$, where $s_i(k)$ is the proportion of stayers in rating $i$ in the $k$'th age interval. Then the transition probability matrix of a mover-stayer model in the $k$'th one-year age interval is

$$P(k-1, k) = S(k) + (I - S(k)) \exp(Q(k)), \qquad 1 \le k \le m-1,$$

where $S(k) = \mathrm{diag}(s_1(k), s_2(k), \ldots, s_w(k))$. The transition matrices $M(k-1, k)$ and $P(k-1, k)$, $1 \le k \le m-1$, are age specific one-year transition matrices for the Markov chain and mover-stayer model respectively.

For both a Markov chain and a mover-stayer model the estimation of their parameters can be done separately in each one-year age interval. By assumption, in each such time interval both models are time homogeneous. Thus, to estimate the generator in (2) for a Markov chain, we use the ml estimate of $Q(k)$ based on the data on bonds with age (time since issuance) in the interval $(k-1, k)$. This is a well known ml estimator of the generator of a time homogeneous Markov chain and is presented in Section 4.1. Similarly, we estimate $Q(k)$ and $S(k)$ in the mover-stayer model based on bonds with age in the interval $(k-1, k)$. The ml estimators of these parameters, that is, of the parameters of a time homogeneous mover-stayer model, are derived below.

3 Maximum Likelihood Estimation

Let $X = (X_t, t \ge 0)$ be a mover-stayer model with transition probability function defined in (1). Assume that we observe $n$ independent realizations of $X$ and that the $k$'th realization, $X^k$, is observed continuously on some time interval $[0, T^k]$ with $T^k \le T$, where $T$ is the time horizon of the study. Thus, $X^k = (X_t^k, 0 \le t \le T^k)$ and individual realizations may be observed over time intervals of different lengths. This may be the case when right censoring is present or when the mixture process has an absorbing state. The right censoring is assumed to be independent. Let $A$ be the set of all realizations that stayed continuously in an initial state, and $B$ be the set of all realizations with at least one transition. We note that $A$ may contain movers as well as stayers. Let $L_Q^k$ be the likelihood of observing $X^k$ when it is generated by a Markov chain with an intensity matrix $Q$. Then, conditional on knowing an initial state (see e.g., Albert (1962)),

$$L_Q^k = \prod_{i \ne j} (q_{ij})^{n_{ij}^k} \prod_{i} \exp(-q_i T_i^k),$$

where $q_i = -q_{ii} = \sum_{j \ne i} q_{ij}$ and

$n_{ij}^k$ = the number of times $X^k$ makes an $i \to j$ transition, $i \ne j$,
$T_i^k$ = the total time $X^k$ spends in state $i$.
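The likelihood $L_Q^k$ depends on a history only through its transition counts and occupation times. The sketch below, which assumes a hypothetical representation of a rating history as a list of `(entry_time, state)` pairs, computes these statistics and the corresponding log-likelihood. The function names and the 2-state example are illustrative only.

```python
import math


def sufficient_stats(history, end_time, n_states):
    """Transition counts n[i][j] and occupation times T[i] for one history.

    history: list of (entry_time, state) pairs sorted by time, starting at time 0.
    end_time: time at which observation of this history stops.
    """
    n = [[0] * n_states for _ in range(n_states)]
    T = [0.0] * n_states
    for idx, (t0, state) in enumerate(history):
        is_last = idx + 1 == len(history)
        t1 = end_time if is_last else history[idx + 1][0]
        T[state] += t1 - t0
        if not is_last:
            n[state][history[idx + 1][1]] += 1
    return n, T


def markov_loglik(q, n, T):
    """log L_Q = sum_{i != j} n_ij log q_ij - sum_i q_i T_i, with q_i = -q_ii."""
    loglik = 0.0
    for i, row in enumerate(q):
        loglik += row[i] * T[i]  # row[i] is q_ii = -q_i
        for j, rate in enumerate(row):
            if j != i and n[i][j] > 0:
                loglik += n[i][j] * math.log(rate)
    return loglik


# Hypothetical history: starts in state 0, moves to 1 at t=0.4, back to 0 at t=1.5.
history = [(0.0, 0), (0.4, 1), (1.5, 0)]
counts, times = sufficient_stats(history, end_time=2.0, n_states=2)
Q = [[-0.5, 0.5], [0.2, -0.2]]  # hypothetical generator
value = markov_loglik(Q, counts, times)
```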

Thus, the likelihood of $X^k \in B$ under the mover-stayer model, conditional on knowing an initial state, is

$$L^k = \prod_{i=1}^{w} (1 - s_i)^{I_i^k} L_Q^k,$$

and the similar likelihood for $X^k \in A$ is

$$L^k = \prod_{i=1}^{w} (s_i)^{I_i^k} + \prod_{i=1}^{w} (1 - s_i)^{I_i^k} L_Q^k,$$

where $I_r^k = 1$ if $X_0^k = r$, and $0$ otherwise.

It is seen that the likelihood function of $n$ independent realizations, $L = \prod_{k=1}^{n} L^k$, is difficult to maximize directly. Instead we develop the EM algorithm for obtaining the mles of $Q$ and $s$. To implement this algorithm we require $\hat{Q}^c$ and $\hat{s}^c$, the mles of the parameters based on complete information. The derivation of $\hat{Q}^c$ and $\hat{s}^c$ is straightforward, but we include it below for completeness and also to introduce the notation needed for the formulation of the EM algorithm. Let $Y_k = 1$ if the $k$'th realization is generated by a stayer, and $0$ otherwise. Obviously, for any realization with a nonzero number of transitions, $Y_k = 0$, and for any realization with zero transitions, $Y_k$ is not observed. Assuming that all $Y_k$'s are observed, $L_c^k$, the likelihood function of the $k$'th realization of the mover-stayer model, is simply

$$L_c^k = \prod_{r=1}^{w} (s_r)^{I_r^k Y_k} (1 - s_r)^{I_r^k (1 - Y_k)} \Big[ \prod_{i \ne j} (q_{ij})^{n_{ij}^k} \prod_{i} \exp(-q_i T_i^k) \Big]^{1 - Y_k}$$

or

$$\log L_c^k = \sum_{r=1}^{w} I_r^k \log(1 - s_r) + Y_k \sum_{r=1}^{w} I_r^k \log[s_r/(1 - s_r)] + (1 - Y_k) \sum_{i \ne j} n_{ij}^k \log(q_{ij}) - (1 - Y_k) \sum_{i} q_i T_i^k,$$

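which for all realizations becomes

$$\log L_c = \sum_{r=1}^{w} m_r \log(1 - s_r) + \sum_{r=1}^{w} m_r^S \log[s_r/(1 - s_r)] + \sum_{i \ne j} n_{ij} \log(q_{ij}) - \sum_{i} q_i T_i + \sum_{i} q_i T_i^S,$$

where

$m_r = \sum_{k=1}^{n} I_r^k$ = total number of individuals that begin in state $r$,
$m_r^S = \sum_{k=1}^{n} I_r^k Y_k$ = number of stayers in state $r$,
$n_{ij} = \sum_{k=1}^{n} n_{ij}^k$ = total number of $i \to j$ transitions in the sample,
$n_i = \sum_{j \ne i} n_{ij}$ = total number of transitions out of state $i$,
$T_i = \sum_{k=1}^{n} T_i^k$ = total time in state $i$ for all individuals in the sample,
$T_i^S = \sum_{k=1}^{n} Y_k T_i^k$ = the total time in state $i$ for stayers, and $T_i^M = T_i - T_i^S$.

Solving the score equation $\partial \log L_c / \partial s_i = 0$ gives the natural estimator $\hat{s}_i^c = m_i^S / m_i$. Now setting $\partial \log L_c / \partial q_{ij} = 0$, we obtain

$$\hat{q}_{ij}^c = \frac{n_{ij}}{T_i - T_i^S} = \frac{n_{ij}}{T_i^M}, \tag{3}$$

and from $\sum_{j \ne i} q_{ij} = q_i$,

$$\hat{q}_i^c = \frac{n_i}{T_i - T_i^S} = \frac{n_i}{T_i^M}. \tag{4}$$

From (3) and (4) we also get

$$\hat{q}_{ij}^c = \frac{n_{ij}}{n_i}\, \hat{q}_i^c. \tag{5}$$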
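The complete-information estimators (3)-(5) are simple ratios of the sufficient statistics. A minimal sketch follows, assuming the aggregated quantities are already available as plain lists. The function name `complete_data_mle` and the numbers in the example are hypothetical.

```python
def complete_data_mle(m, m_stay, n, T, T_stay):
    """Complete-information estimates of the stayer proportions and the generator.

    m[i]      : number of histories starting in state i
    m_stay[i] : number of stayers starting in state i (known under complete data)
    n[i][j]   : total number of i -> j transitions
    T[i]      : total time in state i; T_stay[i]: the part contributed by stayers
    """
    w = len(m)
    s_hat = [m_stay[i] / m[i] if m[i] else 0.0 for i in range(w)]
    q_hat = [[0.0] * w for _ in range(w)]
    for i in range(w):
        mover_time = T[i] - T_stay[i]          # T_i^M
        if mover_time <= 0:
            continue                           # no mover exposure in state i
        n_i = sum(n[i][j] for j in range(w) if j != i)
        for j in range(w):
            if j != i:
                q_hat[i][j] = n[i][j] / mover_time   # equation (3)
        q_hat[i][i] = -n_i / mover_time              # minus equation (4)
    return s_hat, q_hat


# Hypothetical 2-state data: 100 histories start in state 0, 50 of them stayers.
s_hat, q_hat = complete_data_mle(
    m=[100, 0], m_stay=[50, 0],
    n=[[0, 40], [0, 0]],
    T=[75.0, 25.0], T_stay=[50.0, 0.0],
)
```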

3.1 The EM algorithm

Based on the mles assuming complete information we develop the EM algorithm for the estimation of the parameters $(s_i, q_i, i \in W)$. We note that because of (5) we do not have to update the value of $q_{ij}$, $j \ne i$, at each iteration of the algorithm. After the algorithm converges to $(\hat{s}_i, \hat{q}_i, i \in W)$, we compute $\hat{q}_{ij}$ using (5).

1. Initialize

At the $(p+1)$'st iteration, $p \ge 0$, set the values of $(s_i, q_i, i \in W)$ to $(s_i^p, q_i^p, i \in W)$. Define $Q^p$ to be the intensity matrix with the entries given by $q_{ij}^p = (n_{ij}/n_i)\, q_i^p$, $i \ne j$, and $q_{ii}^p = -q_i^p$.

2. Expectation step

For the $k$'th history, $1 \le k \le n$, which starts in state $r \in W$ and does not make any transition, compute the probability that it is generated by a stayer:

$$E^p(Y_k) = \frac{s_r^p}{s_r^p + (1 - s_r^p) \exp(-q_r^p T^k)}.$$

For the $k$'th history with at least one transition set $E^p(Y_k) = 0$. Then compute the following expectations:

$$E^p(T_i^S) = \sum_{k=1}^{n} T_i^k E^p(Y_k), \qquad E^p(T_i^M) = T_i - E^p(T_i^S), \qquad E^p(m_r^S) = \sum_{k=1}^{n} I_r^k E^p(Y_k).$$

3. Maximization step

Compute the quantities

$$s_i^{p+1} = \frac{E^p(m_i^S)}{m_i} \qquad \text{and} \qquad q_i^{p+1} = \frac{n_i}{E^p(T_i^M)}.$$

4. Iterate

Go back to Step 2 and iterate until convergence.

3.2 The special case of identical observation horizons

In the special case when all realizations are observed continuously during a fixed period of time $[0, T]$, the estimates of the parameters can be easily obtained by direct maximization of the likelihood function. In this case we define

$a_r$ = number of realizations that stay continuously in state $r$,
$b_r$ = number of realizations with at least one transition that start in state $r$,
$\tau_i^A$ = total time in state $i$ for histories with no transitions,
$\tau_i^B$ = total time in state $i$ for histories with at least one transition.

The likelihood function, $L_A(Q, s)$, of the realizations in set $A$ is

$$L_A(Q, s) = \prod_{k \in A} \prod_{r} [s_r + (1 - s_r) \exp(-q_r T)]^{I_r^k} = \prod_{r} [s_r + (1 - s_r) \exp(-q_r T)]^{a_r}, \tag{6}$$

and the likelihood function, $L_B(Q, s)$, of the realizations in $B$ is

$$L_B(Q, s) = \prod_{k \in B} \Big\{ \prod_{r} (1 - s_r)^{I_r^k} \prod_{j \ne i} (q_{ij})^{n_{ij}^k} \prod_{i} \exp(-q_i T_i^k) \Big\} = \prod_{r} (1 - s_r)^{b_r} \prod_{j \ne i} (q_{ij})^{n_{ij}} \prod_{i} \exp(-q_i \tau_i^B). \tag{7}$$
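For reference, the four EM steps of Section 3.1 can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `em_mover_stayer`, the input layout (histories without transitions passed individually, histories with transitions passed as aggregated counts and times), the stopping rule, and the example data are all assumptions made here.

```python
import math


def em_mover_stayer(stays, n, tau_b, b, s0, q0, tol=1e-12, max_iter=10000):
    """EM iteration for stayer proportions s_i and exit rates q_i.

    stays : list of (state, duration) for histories with no transition (set A)
    n     : n[i][j] = number of i -> j transitions in the sample
    tau_b : tau_b[i] = total time in state i over histories with transitions
    b     : b[i] = number of histories with transitions that start in state i
    s0,q0 : starting values for s_i and q_i
    """
    w = len(n)
    n_out = [sum(n[i][j] for j in range(w) if j != i) for i in range(w)]
    m = list(b)                 # m[i]: number of histories starting in state i
    total_time = list(tau_b)    # T_i: total time in state i over all histories
    for state, duration in stays:
        m[state] += 1
        total_time[state] += duration
    s, q = list(s0), list(q0)
    for _ in range(max_iter):
        exp_stayers = [0.0] * w     # E(m_i^S)
        exp_stay_time = [0.0] * w   # E(T_i^S)
        for state, duration in stays:  # expectation step
            denom = s[state] + (1.0 - s[state]) * math.exp(-q[state] * duration)
            prob = s[state] / denom if denom > 0 else 0.0
            exp_stayers[state] += prob
            exp_stay_time[state] += prob * duration
        # maximization step
        s_new = [exp_stayers[i] / m[i] if m[i] else 0.0 for i in range(w)]
        q_new = [n_out[i] / (total_time[i] - exp_stay_time[i]) if n_out[i] else 0.0
                 for i in range(w)]
        change = max(abs(x - y) for x, y in zip(s_new + q_new, s + q))
        s, q = s_new, q_new
        if change < tol:
            break
    # recover the off-diagonal rates from q_i via equation (5)
    q_matrix = [[0.0] * w for _ in range(w)]
    for i in range(w):
        if n_out[i]:
            for j in range(w):
                q_matrix[i][j] = -q[i] if i == j else n[i][j] / n_out[i] * q[i]
    return s, q, q_matrix


# Hypothetical sample: 60 histories stay in state 0 for one year; 40 histories move
# from state 0 to the absorbing state 1, spending 15 years in state 0 in total.
stays = [(0, 1.0)] * 60
n = [[0, 40], [0, 0]]
s_hat, q_hat, Q_hat = em_mover_stayer(
    stays, n, tau_b=[15.0, 25.0], b=[40, 0],
    s0=[0.6, 0.0], q0=[40 / 75, 0.0],
)
```

The starting values in the example follow the choice described later in the paper: the observed proportion of histories that stayed, and the Markov chain estimate $n_i/T_i$ of the exit rate. For states with no observed exits the sketch simply leaves $q_i = 0$.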

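Thus, the overall log-likelihood function becomes

$$\log L(Q, s) = \log L_A(Q, s) + \log L_B(Q, s) = \sum_{r} a_r \log[s_r + (1 - s_r) \exp(-q_r T)] + \sum_{r} b_r \log(1 - s_r) + \sum_{j \ne i} n_{ij} \log q_{ij} - \sum_{i} q_i \tau_i^B. \tag{8}$$

The score equation with respect to $s_r$,

$$\frac{\partial \log L}{\partial s_r} = \frac{a_r [1 - \exp(-q_r T)]}{s_r + (1 - s_r) \exp(-q_r T)} - \frac{b_r}{1 - s_r} = 0,$$

gives

$$\hat{s}_r = \frac{a_r - m_r \exp(-q_r T)}{m_r - m_r \exp(-q_r T)}. \tag{9}$$

Substituting (9) into (8), we obtain, up to the terms not depending on the parameters,

$$\log L(Q, s) = -\sum_{r} b_r \log(1 - \exp(-q_r T)) + \sum_{j \ne i} n_{ij} \log q_{ij} - \sum_{i} q_i \tau_i^B. \tag{10}$$

From the score equation

$$\frac{\partial \log L}{\partial q_{ij}} = \frac{b_i T}{1 - \exp(q_i T)} + \frac{n_{ij}}{q_{ij}} - \tau_i^B = 0,$$

we get

$$q_{ij} = \frac{n_{ij} (\exp(q_i T) - 1)}{T b_i + \tau_i^B (\exp(q_i T) - 1)} = \frac{n_{ij}}{T b_i (\exp(q_i T) - 1)^{-1} + \tau_i^B}. \tag{11}$$

Now taking into account that $\sum_{j \ne i} q_{ij} = q_i$ gives the following equation for $q_i$:

$$\frac{n_i}{T b_i (\exp(q_i T) - 1)^{-1} + \tau_i^B} = q_i, \tag{12}$$

which can be rewritten as

$$(n_i - q_i \tau_i^B)(\exp(q_i T) - 1) = q_i T b_i. \tag{13}$$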
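Equation (12) is a scalar fixed-point equation in $q_i$, so one simple way to solve it numerically is to substitute the current value into its left-hand side repeatedly. The sketch below does this and then applies (9) and (11). The function name `fixed_horizon_mle`, the starting value, the stopping rule, and the example data are assumptions for illustration. The sketch also assumes that (12) has a positive solution and that the resulting value of (9) lies in $[0, 1)$.

```python
import math


def fixed_horizon_mle(a, b, n, tau_b, horizon, tol=1e-12, max_iter=10000):
    """Direct ML estimates when every history is observed on [0, horizon].

    a[i], b[i] : histories starting in i without / with at least one transition
    n[i][j]    : number of i -> j transitions
    tau_b[i]   : total time in state i over histories with transitions
    """
    w = len(a)
    s_hat = [0.0] * w
    q_hat = [[0.0] * w for _ in range(w)]
    for i in range(w):
        n_i = sum(n[i][j] for j in range(w) if j != i)
        if n_i == 0:
            continue                      # no exits observed: rates stay at 0
        q = 0.5 * n_i / tau_b[i]          # start inside (0, n_i / tau_i^B)
        for _ in range(max_iter):         # iterate equation (12)
            q_next = n_i / (horizon * b[i] / math.expm1(q * horizon) + tau_b[i])
            converged = abs(q_next - q) < tol
            q = q_next
            if converged:
                break
        m_i = a[i] + b[i]
        if m_i:
            decay = math.exp(-q * horizon)
            s_hat[i] = (a[i] - m_i * decay) / (m_i - m_i * decay)  # equation (9)
        for j in range(w):                                         # equation (11)
            q_hat[i][j] = -q if i == j else n[i][j] / n_i * q
    return s_hat, q_hat


# Hypothetical sample observed over one year: 60 histories never leave state 0,
# 40 histories move from state 0 to state 1 after 15 years in state 0 in total.
s_hat, Q_hat = fixed_horizon_mle(
    a=[60, 0], b=[40, 0], n=[[0, 40], [0, 0]],
    tau_b=[15.0, 25.0], horizon=1.0,
)
```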

It follows from (13) that $q_i < n_i/\tau_i^B$. Equation (12) can be easily solved for $q_i$ in an iterative fashion. At the $n$'th iteration the left hand side of this equation is evaluated at $q_i^{(n)}$. This results in the value $q_i^{(n+1)}$. The iterations are repeated until convergence is achieved. The starting value for $q_i$ could be taken as any value in the interval $(0, n_i/\tau_i^B)$. By (11), the estimates of the transition rates $q_{ij}$ are given by $\hat{q}_{ij} = (n_{ij}/n_i)\,\hat{q}_i$, and the estimate of $s_i$ is obtained from (9).

3.3 The likelihood ratio test

We note that a Markov chain can be obtained from the mover-stayer model by setting all $s_i$ equal to zero, that is, a Markov chain is nested in the mover-stayer model. This allows us to use the likelihood ratio statistic to test a Markov chain model against a mover-stayer model. The hypothesis test is of the form

$$H_0: s = 0 \quad \text{versus} \quad H_1: s \ne 0,$$

where the equality $s = 0$ and the inequality should be understood in the vector sense. The likelihood ratio statistic is

$$\Lambda = \sup_{Q,\, s = 0} L(Q, s) \Big/ \sup_{Q,\, s} L(Q, s) = L(C, 0) / L(\hat{Q}, \hat{s}).$$

Here $C$ is the mle of the intensity matrix $Q$ under $H_0$, that is, when the process is assumed to be a Markov chain, and $\hat{Q}$, $\hat{s}$ are the mles of the intensity matrix and fractions of stayers, respectively, in the mover-stayer model. By the standard result, under $H_0$, the asymptotic distribution of $-2 \log \Lambda$ is $\chi^2$ with $w$ degrees of freedom.

We now compute $-2 \log \Lambda$. We write the likelihood function for the mover-stayer model as $L(Q, s) = L_A(Q, s) L_B(Q, s)$, where $L_B(Q, s)$ is given in (7), and

$$L_A(Q, s) = \prod_{k \in A} \prod_{r} \{ s_r + (1 - s_r) \exp(-q_r T^k) \}^{I_r^k}.$$

Note that this is more general than (6), because here we do not assume that realizations are observed over the same time horizons. The likelihood function of the observations under $H_0$ evaluated at $C$ is

$$L(C, 0) = \prod_{k} \Big\{ \prod_{i \ne j} (c_{ij})^{n_{ij}^k} \prod_{i} \exp(-c_i T_i^k) \Big\},$$

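where $c_{ij} = (n_{ij}/n_i)\, c_i$, and $c_i = n_i/T_i$. In order to simplify the expression for $-2 \log \Lambda$, we write $L(C, 0) = L_A(C, 0) L_B(C, 0)$, where, with $\tau_i^A$ and $\tau_i^B$ denoting the total times in state $i$ for the histories in $A$ and in $B$,

$$L_A(C, 0) = \prod_{k \in A} \prod_{i} \exp(-c_i T_i^k) = \exp\Big(-\sum_{i} c_i \tau_i^A\Big),$$

$$L_B(C, 0) = \prod_{k \in B} \Big\{ \prod_{j \ne i} (c_{ij})^{n_{ij}^k} \prod_{i} \exp(-c_i T_i^k) \Big\} = \prod_{j \ne i} (c_{ij})^{n_{ij}} \prod_{i} \exp(-c_i \tau_i^B).$$

Now, evaluating $L(Q, s)$ at $\hat{Q}$, $\hat{s}$, and noting (5), we obtain

$$\Lambda = \frac{L_A(C, 0)}{L_A(\hat{Q}, \hat{s})} \prod_{i} \Big(\frac{c_i}{\hat{q}_i}\Big)^{n_i} \frac{\exp[(\hat{q}_i - c_i) \tau_i^B]}{(1 - \hat{s}_i)^{b_i}} = \prod_{i} \Big(\frac{c_i}{\hat{q}_i}\Big)^{n_i} \frac{\exp(\hat{q}_i \tau_i^B - c_i T_i)}{(1 - \hat{s}_i)^{b_i}} \Big/ L_A(\hat{Q}, \hat{s}),$$

or, since $c_i T_i = n_i$,

$$-2 \log \Lambda = -2 \Big\{ \sum_{i} n_i \log\Big(\frac{c_i}{\hat{q}_i}\Big) + \sum_{i} (\hat{q}_i \tau_i^B - n_i) - \sum_{i} b_i \log(1 - \hat{s}_i) - \log L_A(\hat{Q}, \hat{s}) \Big\}. \tag{14}$$

For the case of realizations observed over a fixed time horizon $(0, T)$, using (6) and (9), we get

$$-2 \log \Lambda = -2 \Big\{ \sum_{i} n_i \log\Big(\frac{c_i}{\hat{q}_i}\Big) + \sum_{i} (\hat{q}_i \tau_i^B - n_i) - \sum_{i} b_i \log\Big(\frac{b_i}{m_i [1 - \exp(-\hat{q}_i T)]}\Big) - \sum_{i} a_i \log(a_i/m_i) \Big\}.$$

4 Application to bond ratings migration

4.1 The Data and the Methods

The data consist of the rating histories of 856 corporate bond issuers in the industrial sector that were observed for some time in the period from January 1985 to December 1995. The data was obtained from Moody's and thus uses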

Moody's rating system. As is customary, and in our case necessary due to the small sample size, we grouped the original ratings into eight rating states (Aaa, Aa, A, Baa, Ba, B, C, D) and the state WR, where the ratings are ordered from the highest to the lowest with Aaa being the top ranking, D being the default state and WR denoting the state of rating withdrawal. This resulted in an initial distribution of 7, 50, 131, 119, 264, 256, 29, 0, 0 bonds in these states, respectively. Thus a majority of the bonds under consideration were issued with a Ba or B rating. About 30% of issuers entered the sample after 1990, thus providing us with only relatively short times under study. Because of this sample limitation we decided to study the aging effect only in the first five years of life of the bonds.

To study this effect we estimate a time-nonhomogeneous Markov chain and a mover-stayer model defined in Section 2 on the age interval $(0, 5)$. In both processes time $t$ represents the age of the bond issuer. Thus, we assume that the generator of a Markov chain, $Q(t)$, is a piecewise constant function of age:

$$Q(t) = \begin{cases} Q(1), & 0 \le t \le 1, \\ Q(2), & 1 < t \le 2, \\ \;\vdots \\ Q(5), & 4 < t \le 5, \end{cases} \tag{15}$$

and, in the case of a mover-stayer model, that the $k$'th one-year age interval has its own vector of proportions of stayers, $s(k) = (s_i(k), 1 \le i \le 8)$, $1 \le k \le 5$, where $i$ refers to a rating.

We now discuss estimation of the age specific transition matrices under the two models. To obtain these matrices under a Markov model we first estimate $Q(k)$ for each age interval. The mle $C(k)$ of $Q(k)$ is given by (see for example, Andersen et al. (1993))

$$c_{ij}(k) = \frac{n_{ij}(k-1, k)}{\int_{k-1}^{k} Y_i(s)\, ds}, \quad i \ne j, \qquad c_i(k) = -c_{ii}(k) = \sum_{j \ne i} c_{ij}(k), \tag{16}$$

where $n_{ij}(k-1, k)$ is the total number of $i \to j$ transitions for all issuers in the $k$'th year of their life and $Y_i(s)$ is the number of issuers in rating category $i$ at time $s$. Thus, $\int_{k-1}^{k} Y_i(s)\, ds$ is the total exposure time in rating category $i$ in the age interval $(k-1, k)$. We note that the estimators $c_{ij}(k)$ use all of the available information and do not require that an issuer be present for the whole observation period.

To estimate the age specific transition matrices under the mover-stayer model we estimate for each age interval the generator for the movers and the vector of proportions of stayers using the EM algorithm developed in Section 3.1. After testing that the algorithm converged to the same final values for different initial values, we chose the initial values for the algorithm in the following way. For each one-year age interval we chose the observed proportion of bonds that stayed in a rating for that whole interval as an initial value for the proportion of stayers in this rating. For the $k$'th age interval we chose $c_i(k)$ as the initial value for $q_i$, $1 \le i \le 8$, where $c_i(k)$ is given in (16).

4.2 Estimation results

We summarize here the results of the estimation of the mover-stayer and the Markov chain models for the five age intervals. As default probabilities play an important role in the pricing of bonds and other related applications in finance, the main focus of the summary will be the comparison of default probabilities estimated by the two models.

First, for the purpose of illustration, we report and compare the results of the estimation of the two models for the age interval (2,3). The results are in Tables 1-6. We note (Table 4) that for this age interval there is a very high proportion of stayers in rating C (81%). As a result, the CC entry in the mover-stayer transition matrix (Table 6) is much larger (0.8594) than the corresponding one in the Markov chain transition matrix (Table 3) (0.7288). Since there is little movement out of the C rating to other nondefault ratings, this, in turn, results in a smaller default probability from the C rating in the mover-stayer model (0.0826) as compared to the Markov chain estimate (0.1569).
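As a rough consistency check on the figures just quoted, the age-interval form of the transition matrix, $P(k-1,k) = S(k) + (I - S(k))\exp(Q(k))$, can be applied to the C row for the third year of life. The calculation below is only a back-of-the-envelope sketch: it takes the rounded stayer proportion $s_C \approx 0.81$ at face value and uses $m_{CC}$ and $m_{CD}$ (notation introduced here) for the corresponding entries of the movers' one-year matrix $\exp(Q(3))$.

$$P_{CC} = s_C + (1 - s_C)\, m_{CC} \;\Rightarrow\; m_{CC} \approx \frac{0.8594 - 0.81}{1 - 0.81} \approx 0.26,$$

$$P_{CD} = (1 - s_C)\, m_{CD} \;\Rightarrow\; m_{CD} \approx \frac{0.0826}{1 - 0.81} \approx 0.43.$$

Under these assumptions, the low overall default probability from C in the mover-stayer model reflects the large share of stayers rather than a low default rate among the movers.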
Rating Baa also has a relatively high proportion of stayers (56%), but the BaaBaa entry is only somewhat larger (0.8715) in the mover-stayer matrix than in the Markov chain matrix (0.8558), and the default probabilities from this rating estimated by the two models are comparable. In fact the mover-stayer model predicts a somewhat higher probability of default from this rating. Here the small increase in the BaaBaa entry in the mover-stayer model as compared to the Markov chain results in the mover-stayer model's lower probability of the downgrade from Baa to Ba (0.0379) as compared to the Markov chain's estimate (0.0425). Thus, a large proportion of stayers in a rating may have a substantial (as in the case of the C rating) or relatively minor (as in the case of the Baa rating) implication for the default probability from that rating.

We now consider the estimation results for the five age intervals. For each one-year age interval, including the one just discussed, we carried out the likelihood ratio test for discrimination between the mover-stayer model and the Markov chain. The test rejected the Markov chain in favor of the mover-stayer model at a significance level below 0.1 percent in each of those intervals.

To obtain an overall heuristic measure of the proximity of the mover-stayer and Markov chain models we computed for each age interval the expected proportion of stayers as estimated by the mover-stayer model. This is defined for each one-year age interval as a weighted average of the estimated proportions of stayers in each state, with weights given by the initial distribution. The expected stayer proportions estimated for the first through fifth year of age were 0.3417, 0.2483, 0.1748, 0.1549 and 0.1345 respectively. The estimated proportion of issuers that are not going to move out of their ratings during a one-year horizon therefore decreases as bond issuers age, so that the two models differ most for young issuers and become more similar with age.

We now consider the dependence of the default probabilities on the age of a bond. In Table 7 we report the one-year default probabilities from speculative ratings for bonds 0-4 years old, as estimated by the two models. In the last column of this table we also report the proportion of stayers in each speculative rating for each age interval. The one-year probabilities of default from rating C estimated by the mover-stayer model are substantially smaller than those estimated by a Markov chain. (For bonds just issued the one-year probability of default is estimated to be zero by both models.)
The difference in the default probabilities estimated by the two models is largest for young bonds and then decreases monotonically with age. For both models the probability of default is largest in the fifth year and much smaller for younger bonds. The two models estimate similar probabilities of default from rating B for all considered ages, with the mover-stayer model providing somewhat larger estimates for 1-2 years old bonds. These probabilities increase with age in a monotonic fashion, reflecting the aging of the bonds. The estimated proportion of stayers in rating B declines from 15% in the first year of a bond issuer's life to 0 in the fifth year. This explains the virtual equality of probabilities of default from B estimated by the two models in the fifth year of life of the bonds. There is no clear age pattern in the estimated probabilities of default from rating Ba. However, the estimates of default probabilities from this rating in the two models are again closer for older bonds.

Our empirical results show that for 1-4 years old bonds the mover-stayer model estimates substantially lower default probabilities from rating C than a Markov chain. These probabilities are particularly different for 1 and 2 years old bonds. The mover-stayer model also estimates somewhat different default probabilities than a Markov chain from other speculative ratings for 1 and 2 years old bonds. The empirical discrepancy suggests that a mover-stayer model, as a model subsuming a Markov chain, may provide, in particular for younger bonds, more accurate estimates of the default probabilities than a Markov chain. However, a much larger sample of bonds is needed to further assess this possibility.

References

Albert, A. (1962) Estimating the Infinitesimal Generator of a Continuous Time, Finite State Markov Process. Annals of Mathematical Statistics, 33, 727-753.
Altman, E. I. (1998) The Importance and Subtlety of Credit Rating Migration. Journal of Banking and Finance, 22, 1231-1247.
Altman, E. I. and Kao, D. L. (1991) Examining and Modeling Corporate Bond Rating Drift. New York University Salomon Center Working Paper Series s-91-39.
Andersen, P. K., Borgan, O., Gill, R. D., and Keiding, N. (1993) Statistical Models Based on Counting Processes. New York, Springer-Verlag.
Asquith, P., Mullins, D. W., and Wolff, E. D. (1989) Original Issue High Yield Bonds: Aging Analyses of Defaults, Exchanges, and Calls. The Journal of Finance, 44, 923-952.
Blumen, I., Kogan, M., and McCarthy, P. J. (1955) The Industrial Mobility of Labor as a Probability Process. Cornell Studies of Industrial and Labor Relations, vol. 6, Ithaca, N.Y., Cornell University Press.

Chatterjee, R. and Ramaswamy, V. (1996) An Extended Mover-Stayer Model for Diagnosing the Dynamics of Trial and Repeat for a New Brand. Applied Stochastic Models and Data Analysis, 12, 165-178.

Chen, H. H., Duffy, S. W. and Tabar, L. (1997) A Mover-Stayer Mixture of Markov Chain Models for the Assessment of Dedifferentiation and Tumor Progression in Breast Cancer. Journal of Applied Statistics, 24 (3), 265-278.

Colombo, R. A. and Morrison, D. G. (1989) A Brand Switching Model with Implications for Marketing Strategies. Marketing Science, 8 (Winter), 89-99.

Cook, R. J., Kalbfleisch, J. D. and Yi, Y. (2001) A Generalized Mover-Stayer Model for Panel Data.

Dempster, A. P., Laird, N. M. and Rubin, D. B. (1977) Maximum Likelihood from Incomplete Data via the EM Algorithm (with discussion). Journal of the Royal Statistical Society, Series B, 39, 1-38.

Frydman, H. (1984) Maximum Likelihood Estimation in the Mover-Stayer Model. Journal of the American Statistical Association, 79, 632-637.

Frydman, H., Kallberg, J. G. and Kao, D. L. (1985) Testing the Adequacy of Markov Chains and Mover-Stayer Models as Representations of Credit Behavior. Operations Research, 33, 1203-1214.

Frydman, H. (2002) Estimation in the Mixture of Markov Chains Moving at Different Speeds. Manuscript available from the author.

Fuchs, C. and Greenhouse, J. B. (1988) The EM Algorithm for Maximum Likelihood Estimation in the Mover-Stayer Model. Biometrics, 44, 605-613.

Keenan, S. C., Sobehart, J. and Hamilton, D. T. (1999) Predicting Default Rates: A Forecasting Model for Moody's Issuer-Based Default Rates. Moody's Special Comment, August.

Lando, D. and Skodeberg, T. M. (2002) Analyzing Rating Transitions and Rating Drift with Continuous Observations. The Journal of Banking and Finance, 26, 423-444.

Norris, J. R. (1997) Markov Chains. Cambridge University Press.

Sampson, M. (1990) A Markov Chain Model for Unskilled Workers and the Highly Mobile. Journal of the American Statistical Association, 85, 177-180.

Swensen, A. (1996) On Maximum Likelihood Estimation in the Mover-Stayer Model. Communications in Statistics - Theory and Methods, 25, 1717-1728.

Table 1: Initial distribution of two-year-old bonds and observed number of stayers for each rating in the age interval (2,3)

Rating                        Aaa   Aa    A  Baa   Ba    B    C   WR
Initial distribution            6   39  126  104  253  193   23   45
Observed number of stayers      5   31  106   91  208  157   20   43

Table 2: Maximum likelihood estimate of Markov chain generator for the (2,3) age interval

          Aaa       Aa        A      Baa       Ba        B        C        D       WR
Aaa   -0.1593   0.0000   0.1593   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000
Aa     0.0283  -0.2262   0.1697   0.0000   0.0000   0.0000   0.0000   0.0000   0.0283
A      0.0000   0.0166  -0.1745   0.1329   0.0000   0.0000   0.0000   0.0000   0.0249
Baa    0.0000   0.0000   0.0296  -0.1580   0.0494   0.0296   0.0000   0.0000   0.0494
Ba     0.0000   0.0000   0.0000   0.0128  -0.1967   0.0684   0.0086   0.0043   0.1026
B      0.0000   0.0000   0.0000   0.0000   0.0816  -0.2214   0.0350   0.0466   0.0583
C      0.0000   0.0000   0.0000   0.0000   0.0000   0.0000  -0.3164   0.1808   0.1356
D      0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000
WR     0.0000   0.0000   0.0159   0.0000   0.0000   0.0637   0.0000   0.0318  -0.1114

Table 3: Maximum likelihood estimate of Markov chain transition probability matrix for the (2,3) age interval

          Aaa       Aa        A      Baa       Ba        B        C        D       WR
Aaa    0.8527   0.0011   0.1350   0.0090   0.0001   0.0001   0.0000   0.0000   0.0019
Aa     0.0233   0.7987   0.1411   0.0095   0.0002   0.0009   0.0000   0.0004   0.0259
A      0.0002   0.0136   0.8429   0.1127   0.0028   0.0024   0.0000   0.0004   0.0248
Baa    0.0000   0.0002   0.0255   0.8558   0.0425   0.0274   0.0006   0.0015   0.0465
Ba     0.0000   0.0000   0.0009   0.0108   0.8241   0.0585   0.0076   0.0075   0.0906
B      0.0000   0.0000   0.0004   0.0005   0.0663   0.8053   0.0270   0.0456   0.0549
C      0.0000   0.0000   0.0009   0.0000   0.0001   0.0035   0.7288   0.1569   0.1098
D      0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   1.0000   0.0000
WR     0.0000   0.0001   0.0138   0.0009   0.0022   0.0540   0.0009   0.0315   0.8965
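The link between Tables 2 and 3 can be checked numerically. The sketch below assumes that the one-year transition matrix in Table 3 is the matrix exponential of the generator in Table 2, which is the standard relation for a Markov chain that is time-homogeneous within the age interval. The helper names (`mat_mul`, `mat_exp`) and the scaling-and-squaring settings are illustrative choices, not taken from the paper.

```python
# Generator estimate copied from Table 2 (age interval (2,3)).
STATES = ["Aaa", "Aa", "A", "Baa", "Ba", "B", "C", "D", "WR"]
Q = [
    [-0.1593, 0.0000, 0.1593, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
    [0.0283, -0.2262, 0.1697, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0283],
    [0.0000, 0.0166, -0.1745, 0.1329, 0.0000, 0.0000, 0.0000, 0.0000, 0.0249],
    [0.0000, 0.0000, 0.0296, -0.1580, 0.0494, 0.0296, 0.0000, 0.0000, 0.0494],
    [0.0000, 0.0000, 0.0000, 0.0128, -0.1967, 0.0684, 0.0086, 0.0043, 0.1026],
    [0.0000, 0.0000, 0.0000, 0.0000, 0.0816, -0.2214, 0.0350, 0.0466, 0.0583],
    [0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, -0.3164, 0.1808, 0.1356],
    [0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
    [0.0000, 0.0000, 0.0159, 0.0000, 0.0000, 0.0637, 0.0000, 0.0318, -0.1114],
]


def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(q, squarings=10, terms=12):
    """Matrix exponential via scaling and squaring with a truncated Taylor series."""
    n = len(q)
    scale = 2 ** squarings
    a = [[x / scale for x in row] for row in q]          # a = q / 2^squarings
    result = [[float(i == j) for j in range(n)] for i in range(n)]  # identity
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        # term becomes a^k / k!
        term = [[x / k for x in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)                 # undo the scaling
    return result


P = mat_exp(Q)
c, d = STATES.index("C"), STATES.index("D")
print(f"P(C -> C) = {P[c][c]:.4f}")
print(f"P(C -> D) = {P[c][d]:.4f}")
```

Under this assumption the printed values should land close to the corresponding Table 3 entries, up to the four-decimal rounding of the published generator.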


Tables 7a-c: One-year default probabilities from ratings C, B and Ba using mover-stayer and Markov chain models.

Table 7a

Year   Markov Chain Model   Mover-Stayer Model   Proportion of Stayers in C
1            0.0000               0.0000                  0.88
2            0.1920               0.1153                  0.77
3            0.1569               0.0826                  0.81
4            0.1000               0.0737                  0.81
5            0.2118               0.1892                  0.43

Table 7b

Year   Markov Chain Model   Mover-Stayer Model   Proportion of Stayers in B
1            0.0039               0.0039                  0.15
2            0.0379               0.0435                  0.3
3            0.0456               0.0519                  0.07
4            0.1006               0.1038                  0.13
5            0.1173               0.1189                  0

Table 7c (Ba -> D)

Year   Markov Chain Model   Mover-Stayer Model   Proportion of Stayers in Ba
1            0.0000               0.0000                  0.86
2            0.0379               0.0064                  0
3            0.0075               0.0106                  0
4            0.0202               0.0214                  0
5            0.0077               0.0088                  0
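The gap between the two models for rating C can be read off Table 7a directly. The short sketch below computes the Markov chain estimate minus the mover-stayer estimate for each year. It assumes that the statement in the text about the difference shrinking with age refers to rating C. The name `model_gaps` is a hypothetical helper introduced only for this illustration.

```python
# One-year default probabilities from rating C, copied from Table 7a.
# Each tuple: (year, Markov chain estimate, mover-stayer estimate).
TABLE_7A = [
    (1, 0.0000, 0.0000),
    (2, 0.1920, 0.1153),
    (3, 0.1569, 0.0826),
    (4, 0.1000, 0.0737),
    (5, 0.2118, 0.1892),
]


def model_gaps(rows):
    """Return (year, Markov estimate minus mover-stayer estimate) per year."""
    return [(year, round(markov - ms, 4)) for year, markov, ms in rows]


gaps = model_gaps(TABLE_7A)
for year, gap in gaps:
    print(f"year {year}: gap = {gap:.4f}")
```

Both models give a zero estimate in year 1, so the decline in the gap is visible from year 2 onward.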