RESEARCH SUPPORT
UNIVERSITY OF MICHIGAN BUSINESS SCHOOL

THE STOCHASTIC MODELING OF PURCHASE INTENTIONS AND BEHAVIOR

Working Paper #9612-25

Martin Young
University of Michigan Business School

Wayne S. DeSarbo
Smeal College of Business
Pennsylvania State University

Vicki G. Morwitz
Leonard N. Stern School of Business
New York University

OCTOBER 1996


The Stochastic Modeling of Purchase Intentions and Behavior

To appear in Management Science

Martin R. Young
Department of Statistics & Management Science
University of Michigan School of Business Administration

Wayne S. DeSarbo
Department of Marketing
Smeal College of Business, Pennsylvania State University

Vicki G. Morwitz
Department of Marketing
Leonard N. Stern School of Business, New York University

October 1996

Abstract

A common objective of social science and business research is the modeling of the relationship between demographic/psychographic characteristics of individuals and the likelihood of certain behaviors for these same individuals. Frequently, data on actual behavior is unavailable; rather, one has available only the self-reported intentions of the individual. If the reported intentions imperfectly predict actual behavior, then any model of behavior based on the intention data should account for the associated measurement error, or else the resulting predictions will be biased. In this paper, we provide a method for analyzing intentions data that explicitly models the discrepancy between reported intention and behavior, thus facilitating a less biased assessment of the impact of designated covariates on actual behavior. The application examined here relates to modeling relationships between demographic characteristics and actual purchase behavior among consumers. A new Bayesian approach employing the Gibbs sampler is developed and compared to alternative models. We show, through simulated and real data, that, relative to methods which implicitly equate intentions and behavior, the proposed method can increase the accuracy with which purchase response models are estimated.

Key Words: Bayesian Methods, Hierarchical Bayes, Markov Chain Monte Carlo, Measurement Error, Probit Regression, Purchase Intentions, Stochastic Models

The authors wish to thank the Departmental Editor, Linda Green, and two anonymous reviewers for their insightful comments which improved the content of this manuscript.

1 Introduction

A common objective of applied social science and business research is the modeling of the relationship between the demographic/psychographic characteristics of individuals and the behavior of these individuals. For example, in many marketing research applications, the goal of concept and product tests is to understand the purchase behavior of a population of consumers, to determine whether or not to introduce a product to the market, and, if so, to which market segments. Typically in such concept and product tests, one cannot observe the actual purchases of individuals, so self-reported purchase intentions for the new product or concept are measured and used as a proxy variable. Intentions are also often used to predict sales over time for existing products among different segments of customers. In order to determine the purchase propensity of members of different market segments, models relating demographic, psychographic, or other characteristics of consumers to purchase intentions are often developed. When there is an imperfect correspondence between self-reported purchase intentions and actual behavior, estimates concerning consumer characteristics obtained from these models will be biased.

A key question, then, is what is the relationship between self-reported purchase intentions and actual behavior? Research in social psychology suggests that intentions should be the best predictor of an individual's behavior because they allow each individual to independently incorporate all relevant factors that may influence his or her actual behavior (Fishbein and Ajzen 1975). However, most empirical evidence suggests that purchase intentions cannot be taken literally.

Over the last fifty years, several studies have examined the relationship between stated purchase intentions and actual purchase behavior. The U.S. government conducted studies and experiments concerning purchase intentions between the 1940s and the 1970s.
Their goal was to predict short term movements in future expenditures based on stated purchase intentions and the financial position and demographic characteristics of American households. In many of these studies, significant relationships between intentions to buy durable goods and subsequent purchase were found using various econometric models on panel data (Juster 1966; Tobin 1959). Although the relationships are significant, Juster describes the performance of binary intentions measures as unimpressive. He notes that although intender purchase rates are higher than those of non-intenders, intender purchase rates are significantly lower than one, and non-intender purchase rates are significantly greater than zero. In fact, since for most durable goods the majority of households report that they are non-intenders, most of the actual purchases are made by non-intender households. Juster therefore advocated the use of purchase probability measures over direct binary intent measures. However, he noted that self-reported purchase probabilities also provide biased estimates of actual purchasing, typically underestimating the actual purchase rate.

Several other studies have examined the relationship between purchase intentions and purchase behavior for durable goods (Adams 1974; Clawson 1971; Ferber and Piskie 1965; Granbois and Summers 1975; McNeil 1974; Pickering and Isherwood 1974) and for nondurable goods (Gormley 1974; Tauber 1975; Taylor, Houlahan, and Gabriel 1975; Warshaw 1980). The observed relationship between intentions and purchase is generally positive and significant; however, the strength of the relationship varies from study to study. For example, in Juster (1966), of those respondents who claimed they would "definitely or probably" purchase a car in the next six months, only 50 percent actually purchased. Jamieson and Bass (1989) and Morwitz and Schmittlein (1992) found similar results in different marketing contexts. Table 1 summarizes the results from these studies.

[Insert Table 1 Here]

Overall, based on the empirical evidence, intentions appear to almost always provide biased measures of purchase propensity, sometimes underestimating actual purchasing and other times overestimating actual purchasing.
Manski (1990) provides further support that intentions data should not be taken literally. Manski develops a model of the relationship between binary intentions measures and subsequent behavior under the "best-case" situation where respondents have rational expectations. In this situation, respondents' reported intentions are their best predictions of their future behavior. However, respondents typically do not have perfect information about changes that may occur in the future that may affect their probability of purchase. Based on his model, Manski demonstrates that intentions data do not identify a specific probability of behavior, but rather bound the probability of behavior, and that the bounds are nonparametrically estimable. Manski notes that, in contradiction to prior assertions, there is no reason to expect individual-level differences between intentions and behavior to average out in the aggregate. In other words, we should expect to observe that not all intenders purchase and that some non-intenders do purchase even with perfectly rational respondents.

More recent studies of purchase intentions have developed models that incorporate the discrepancies between stated intentions and actual behavior. The psychometric beta binomial model of Morrison (1979) is a descriptive model of the relationship between stated purchase intent and subsequent purchase. Morrison incorporated the observed discrepancies between intent and purchase by taking into account measurement error of intentions (the difference between true intent and stated intent), the impact of exogenous events (that true intent may change over time), and systematic bias (the systematic tendency for intentions to overestimate or underestimate purchase). Kalwani and Silk (1982) report further analyses and applications of Morrison's model. Bemmaor (1995) more recently extended the Morrison (1979) model to the case of heterogeneous switching probabilities to attempt to explain the sign and magnitude of the discrepancy between overall mean purchase intent and subsequent proportion of buyers.
In a different modeling effort, Infosino (1986) assumed that one can interpret a purchase intention response as a monotonic transformation of value (willingness-to-pay minus price), truncated to an integer on the intention scale. Infosino further assumed that, at the individual level, willingness-to-pay is stochastic. He also assumed that exogenous events, such as changes in the marketing mix, shift, but do not change the shape of, the distribution of willingness-to-pay. According to this model, the proportion of customers who purchase is equal to the proportion of the transformed distribution of value above some threshold value.

The Morrison (1979) and the Infosino (1986) models both capture systematic biases in intentions measurement. The goal of several other intentions studies has been to identify causes of systematic biases in intent measurement. One determinant of the magnitude and direction of bias in intent measurement is the type of product under consideration. Research by Kalwani and Silk (1982) and Jamieson and Bass (1989) demonstrates that the relationship between purchase intentions and purchase behavior is different for durable goods than for non-durable goods. For example, Kalwani and Silk find that for durable goods, a linear model provides a good fit between intentions and purchase behavior, but for nondurable goods, a piecewise linear model provides a better fit.

Morwitz and Schmittlein (1992) find that the relationship between purchase intentions and subsequent purchase also varies across demographic and product usage based segments. They find that some segments of intenders are more likely to fulfill their intentions than others, and that some segments of non-intenders are more likely to purchase despite their intentions than others. One additional cause of systematic bias in intent measurement is the effect of measuring intentions on actual purchasing. Morwitz, Johnson, and Schmittlein (1993) demonstrate that merely asking respondents whether or not they intend to purchase a durable good actually increases subsequent purchasing of the product.

In summary, based on past empirical research on the relationship between intentions and behavior, we know that although intentions are often used as a proxy variable for actual purchase in applied marketing research, not everyone who says they intend to buy will actually purchase, and conversely not everyone who says they do not intend to buy will refrain from purchasing.
The reasons that intentions are not perfect measures of behavior include measurement error in measuring intentions, changes that occur between the time intent is measured and the purchase occasion, systematic biases that might arise from the effect of product characteristics, the effect of respondent characteristics, and the effect of measuring intentions on behavior.

The imperfect correspondence between purchase intentions and behavior suggests that standard (regression) models relating demographic/psychographic covariates to intentions can provide inaccurate estimates of the relationship between these covariates and actual purchase behavior. In this paper, we develop a model and estimation procedure for analyzing purchase intention data that explicitly take into account the discrepancy between purchase intentions and purchase behavior. While the models previously described are concerned with identifying the specific mechanisms by which purchase intention deviates from purchase behavior, our methodology takes this discrepancy between intention and behavior as a given, and then uses this known discrepancy to help correctly identify correlates of behavior based on data from intentions only. An extension of the model is developed for the case in which intentions data are available for multiple brands within a product category; this extended model uses hierarchical Bayes methods (e.g., Berger 1985) to achieve optimal pooling of information across brands.

The estimation procedure introduced in this paper makes use of Markov chain Monte Carlo techniques (Gelfand and Smith 1990; Roberts and Smith 1993) to identify model parameters. We show via a simulation study that, when the model underlying the proposed methodology is valid, the proposed method can accurately estimate coefficients driving purchase activity, while methods that ignore the inequivalence between intentions and behavior lead to substantially biased parameter estimates. We also show that, given data on multiple brands, the optimally pooled estimates derived in this paper can provide significant gains in efficiency relative to the natural unpooled estimates. Finally, we consider a marketing dataset collected in Morwitz and Schmittlein (1992) in which personal computer consumers were surveyed regarding their purchase intentions, and then surveyed again, a year later, to discover their actual purchase histories.
We demonstrate that our proposed method for analyzing intentions data provides a more accurate estimate of the relationship between covariates and purchase behavior, suggesting that the proposed method may be valuable to researchers working in the usual setting in which only purchase intentions data are available, but in which inferences about purchase behavior are desired.

2 The Model

The model presented here is used for analyzing the simplest form of intentions data, namely binary response questions, but can be generalized to categorical, ordinal, or continuous responses. The model states that purchase behavior is related to covariates according to a binary regression model, and that the purchase intention data represent a randomly distorted version of the purchase behavior data. Let variables $y_i$ and $w_i$ denote purchase intention and purchase behavior, respectively:

$$y_i = \begin{cases} 1 & \text{if consumer } i \text{ indicates intention to purchase the product within the designated time horizon,} \\ 0 & \text{otherwise;} \end{cases}$$

$$w_i = \begin{cases} 1 & \text{if consumer } i \text{ actually purchases the product within a designated time horizon,} \\ 0 & \text{otherwise.} \end{cases}$$

The declaration of intention, or lack of intention, to purchase the product necessarily occurs prior to the actual purchase. The variable $w_i$ is postulated to be directly dependent on covariates $x_i$, according to the model:

$$P(w_i = 1) = \Phi(\beta' x_i), \tag{1}$$

where $\Phi(\cdot)$ is a cumulative probability function; if $\Phi(\cdot)$ is the cumulative normal or cumulative logistic, then (1) denotes a probit or logistic regression equation. The covariates $x_i$ in (1) typically represent demographic or psychographic characteristics of respondent $i$; e.g., age, gender, present job category, etc. The objective in many marketing research studies is to identify the relationship between the covariates $x_i$ and purchase behavior $w_i$ for market segmentation purposes; i.e., to estimate the parameters $\beta$.

Since the $w_i$'s are not observed in a typical marketing research survey, one cannot perform a probit or logistic regression of $w$ versus $x$ to infer the coefficients $\beta$. If the observed purchase intentions $y_i$ exactly correspond to the $w_i$, then one can clearly perform binary regression of $y$ versus $x$ to obtain consistent estimates of the coefficients in (1). Suppose, though, that the observed intentions $y_i$ are only imperfect proxies for the actual purchase actions $w_i$, related to one another by the probabilities:

$$P(y_i = 1 \mid w_i = 1) = p_{11}, \tag{2a}$$

$$P(y_i = 0 \mid w_i = 0) = p_{00}. \tag{2b}$$

If either $p_{11}$ or $p_{00}$ is different from 1, then the observed intentions will not perfectly match the purchase behavior, and so regression of $y$ versus $x$ will give biased estimates of the relationship between the covariates and actual purchase. We will assume, for the present, that the probabilities $p_{11}$ and $p_{00}$ are known, at least approximately. As seen in Table 1, such probabilities are well established in the marketing literature for many product categories.[1]

The model proposed, then, is a corrupted binary regression model; Copas (1988) presents a somewhat similar model in which the corruption probabilities $p_{11}$ and $p_{00}$ are necessarily equal. As in Albert and Chib (1993), the analysis of the model is facilitated by data augmentation; i.e., by the introduction of particular latent variables. Let $z_i$ denote a latent utility for each consumer that determines product purchase; in particular, consumer $i$ will purchase whenever the utility $z_i$ is positive. This utility is assumed to be determined by the covariates $x_i$ according to the linear model:

$$z_i = \beta' x_i + \epsilon_i, \qquad \epsilon_i \sim N(0, 1). \tag{3}$$

[1] The probabilities in Table 1 refer to $P(w \mid y)$, the probability of purchase given the stated intention, while the probabilities $p_{11}$ and $p_{00}$ refer to $P(y \mid w)$. However, as is discussed in section 2.2, the necessary quantities $P(y \mid w)$ can be inferred from $P(w \mid y)$ using Bayes theorem.
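As an illustration only, the data-generating process described so far (a probit model for behavior $w_i$ driven by a latent utility $z_i$, with the stated intention $y_i$ obtained by randomly distorting $w_i$) can be simulated in a few lines. The sketch below is not part of the original paper: the function name `simulate_intentions`, the sample size, the coefficient values, and the values chosen for $p_{11}$ and $p_{00}$ are all hypothetical choices made for demonstration.

```python
import numpy as np

def simulate_intentions(X, beta, p11, p00, rng):
    """Simulate behavior w and stated intention y (hypothetical helper).

    X    : (n, d) covariate matrix (first column of ones for an intercept)
    beta : (d,) coefficient vector
    p11  : P(y = 1 | w = 1);  p00 : P(y = 0 | w = 0)
    """
    n = X.shape[0]
    z = X @ beta + rng.standard_normal(n)   # latent utility, as in equation (3)
    w = (z > 0).astype(int)                 # purchase whenever utility is positive
    u = rng.random(n)
    # Purchasers state an intention with probability p11;
    # non-purchasers state an intention with probability 1 - p00.
    y = np.where(w == 1, u < p11, u < 1.0 - p00).astype(int)
    return w, y

rng = np.random.default_rng(0)
n = 200_000                                  # assumed sample size
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
beta = np.array([-0.5, 1.0])                 # assumed coefficients
w, y = simulate_intentions(X, beta, p11=0.7, p00=0.9, rng=rng)

print("empirical P(y=1 | w=1):", y[w == 1].mean())
print("empirical P(y=0 | w=0):", 1.0 - y[w == 0].mean())
print("share of intenders:", y.mean(), " share of purchasers:", w.mean())
```

In a simulation of this kind the share of stated intenders need not equal the share of actual purchasers, which is the discrepancy the model is designed to account for.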

The purchase behavior, $w_i$, then is determined by:

$$w_i = \begin{cases} 1 & \text{if } z_i > 0, \\ 0 & \text{otherwise.} \end{cases} \tag{4}$$

Equations (3) and (4) are in fact equivalent to equation (1), with $\Phi(\cdot)$ taken to be the cumulative normal. The stated purchase intention depends on $w_i$ as before:

$$P(y_i = j \mid w_i = k) = p_{jk}, \qquad j = 0, 1, \quad k = 0, 1. \tag{5}$$

The data for the problem consist of independent observations $(y_i, x_i)$, $i = 1, \ldots, n$. The objective of the study is to infer the parameters $\beta$ relating the $x_i$ to actual, but unobservable, purchases $w_i$ (which may occur in some future time period). It is this modelling objective, the identification of the demographic factors predictive of purchase behavior, which distinguishes the model presented here from other marketing models for purchase intention and behavior such as those of Morrison (1979) and Bemmaor (1995). As is shown in the following sections, the $\beta$ coefficients in the model, which measure the effect of covariates on purchase behavior, can be estimated via Bayes estimation implemented with Gibbs sampling (Gelfand and Smith 1990).

2.1 Bayes Estimation with Gibbs Sampling

There are advantages to adopting a Bayesian estimation approach for the problem of analyzing purchase intentions data. One potential benefit of applying a Bayesian analysis to purchase intention data is that the inferences are exact for finite samples (conditional on the assumption that the model (3)-(5) is correctly specified), while confidence intervals obtained through maximum likelihood estimation are valid only asymptotically. Another advantage of the Bayesian approach is that, as is discussed in section 2.2.1, it is possible to incorporate uncertainty about the probabilities $p_{00}$ and $p_{11}$ into the estimation procedure. A final advantage is that, if parameter vectors are to be estimated for a number of related product lines, a hierarchical Bayes approach can be used to achieve an optimal pooling of data across products; this technique can significantly improve estimation accuracy, as is shown in section 2.3.

The posterior for $\beta$ will be proportional to the likelihood function times the prior. The likelihood function for the parameters $\beta$ in terms of the observable data $(y_i, x_i)$, $i = 1, \ldots, n$ is given by:

$$\mathcal{L}(y \mid \beta, X, p_{00}, p_{11}) = \prod_{i=1}^{n} \left[ P(y_i = 1 \mid \beta, x_i, p_{00}, p_{11})^{y_i} \, P(y_i = 0 \mid \beta, x_i, p_{00}, p_{11})^{1 - y_i} \right]. \tag{6}$$

Here,

$$\begin{aligned} P(y_i = 1 \mid \beta, x_i, p_{00}, p_{11}) &= P(y_i = 1 \mid \beta, x_i, p_{00}, p_{11}, w_i = 1)\, P(w_i = 1 \mid \beta, x_i, p_{00}, p_{11}) \\ &\quad + P(y_i = 1 \mid \beta, x_i, p_{00}, p_{11}, w_i = 0)\, P(w_i = 0 \mid \beta, x_i, p_{00}, p_{11}) \\ &= p_{11} \Phi(\beta' x_i) + (1 - p_{00})\left(1 - \Phi(\beta' x_i)\right) \\ &= p_{10} + (p_{11} - p_{10})\, \Phi(\beta' x_i), \end{aligned} \tag{7}$$

and

$$\begin{aligned} P(y_i = 0 \mid \beta, x_i, p_{00}, p_{11}) &= 1 - P(y_i = 1 \mid \beta, x_i, p_{00}, p_{11}) \\ &= p_{00} - (p_{00} - p_{01})\, \Phi(\beta' x_i), \end{aligned} \tag{8}$$

where $p_{10} = 1 - p_{00}$, and $p_{01} = 1 - p_{11}$. Thus, the likelihood function is given by:

$$\mathcal{L}(y \mid \beta, X, p_{00}, p_{11}) = \prod_{i=1}^{n} \left[ \left(p_{10} + (p_{11} - p_{10})\, \Phi(\beta' x_i)\right)^{y_i} \left(p_{00} - (p_{00} - p_{01})\, \Phi(\beta' x_i)\right)^{1 - y_i} \right]. \tag{9}$$

Regardless of the form of the prior, the posterior distribution for $\beta$ will not be from a standard family, so quantities like the posterior mean and standard deviation cannot be computed analytically, but can only be obtained through some numerical integration approach. Here, we use a Markov chain Monte Carlo method (Roberts and Smith 1993) to perform this integration, by generating a Markov chain whose steady-state distribution is equal to the posterior distribution of $\beta$.

One particular Monte Carlo technique which is convenient for this problem is Gibbs sampling (Gelfand and Smith 1990; Roberts and Smith 1993; Albert and Chib 1993). Gibbs sampling is a multivariate random number generation technique that may be used in the frequent case in which a joint density to be sampled from is complex, but in which the conditional distributions are simple. As is described in Tanner (1993), the Gibbs sampler works by sampling alternately from the full conditional posterior distribution of each parameter in turn. Under appropriate conditions (Tierney 1994), the stationary distribution of the chain produced by this alternating technique is in fact the desired, complex, multivariate posterior density of the unknown model parameters. Since the chain is ergodic (Geman and Geman 1984), the posterior mean and standard deviation of the parameter of interest, $\beta$, can be obtained by simply computing the sample mean and standard deviation over the generated samples (in practice, the first several hundred samples generated are typically discarded, to avoid transient effects). Allenby and Lenk (1994) and McCulloch and Rossi (1994) present further applications of Gibbs sampling to marketing models.

In the context of the purchase intentions data, the entire set of unobservable parameters consists of $\beta$, as well as the $z_i$ and the $w_i$. The full conditional distributions for these unknown parameters are all easily obtained.
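The alternating scheme can be illustrated on a toy problem that is unrelated to the purchase model. The sketch below is an illustrative assumption and is not taken from the paper: it applies Gibbs sampling to a standard bivariate normal distribution with correlation `rho`, whose two full conditionals are univariate normals, and then summarizes the retained draws by their sample mean and standard deviation after discarding a burn-in period. The function name, chain length, and burn-in length are hypothetical choices.

```python
import numpy as np

def gibbs_bivariate_normal(rho, n_iter, burn_in, rng):
    """Toy Gibbs sampler for a standard bivariate normal with correlation rho."""
    t1, t2 = 0.0, 0.0                      # arbitrary starting point
    cond_sd = np.sqrt(1.0 - rho ** 2)      # sd of each full conditional
    draws = np.empty((n_iter, 2))
    for g in range(n_iter):
        # Full conditionals: t1 | t2 ~ N(rho * t2, 1 - rho^2), and symmetrically.
        t1 = rng.normal(rho * t2, cond_sd)
        t2 = rng.normal(rho * t1, cond_sd)
        draws[g] = (t1, t2)
    return draws[burn_in:]                 # discard the initial transient

rng = np.random.default_rng(1)
samples = gibbs_bivariate_normal(rho=0.8, n_iter=20_000, burn_in=500, rng=rng)

est_mean = samples.mean(axis=0)            # Monte Carlo estimate of the mean
est_sd = samples.std(axis=0, ddof=1)       # Monte Carlo estimate of the sd
est_corr = np.corrcoef(samples.T)[0, 1]
print(est_mean, est_sd, est_corr)
```

Because successive draws are correlated, the effective number of independent samples is smaller than the number of retained iterations, which is one reason long chains are typically used.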
If the $z_i$'s are known, and if the prior on $\beta$ is the non-informative uniform distribution, then the conditional distribution for the coefficient $\beta$ is obtained by standard results on Bayesian linear models (e.g., Zellner 1971):

$$\pi(\beta \mid z, y, x) \sim N\left((X'X)^{-1} X'z, \; (X'X)^{-1}\right), \tag{10}$$

where:

$$z = \begin{pmatrix} z_1 \\ \vdots \\ z_n \end{pmatrix}, \qquad X = \begin{pmatrix} 1 & x_1 \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix}.$$

The mean of the full posterior conditional distribution for $\beta$ in (10) is just the usual ordinary least-squares estimate obtained by regressing $z$ versus $x$, and the variance of the distribution is the usual sampling variance for the least squares estimate of $\beta$. If one has available an informative prior on $\beta$ which is multivariate normal, with mean $\mu$ and covariance $T^{-1}$, then the conditional posterior for $\beta$ is also of simple form:

$$\pi(\beta \mid z, y, x, \mu, T) \sim N\left((X'X + T)^{-1}(X'z + T\mu), \; (X'X + T)^{-1}\right). \tag{11}$$

Also, as is discussed in section 2.3, it is possible to estimate the informative prior empirically, given purchase intention data on multiple related brands.

If $\beta$ and the $w_i$ are known, then the full conditional of $z_i$ is obtained as:

$$\pi(z_i \mid \beta, w, y, x) \sim \begin{cases} N(\beta' x_i, 1)\, I(z_i \geq 0) & \text{if } w_i = 1, \\ N(\beta' x_i, 1)\, I(z_i < 0) & \text{if } w_i = 0. \end{cases} \tag{12}$$

That is, if $w_i = 1$, then $z_i$ is restricted to be positive, while if $w_i = 0$ then $z_i$ is restricted to be negative (Albert and Chib 1993).

Finally, the unobservable $w_i$ have to be generated as well. Their posterior distributions can be obtained via Bayes theorem:

$$\pi(w_i = k \mid \beta, y_i = j, x_i) = \frac{\pi(y_i = j \mid w_i = k, \beta, x_i)\, \pi(w_i = k \mid \beta, x_i)}{\sum_{\ell = 0}^{1} \pi(y_i = j \mid w_i = \ell, \beta, x_i)\, \pi(w_i = \ell \mid \beta, x_i)}. \tag{13}$$

The prior, $\pi(w_i = 1 \mid \beta, x_i)$, is given by $\Phi(\beta' x_i)$, by equations (3) and (4). The likelihood, $\pi(y_i = 1 \mid w_i = 1, \beta, x_i)$, is given by $p_{11}$ from equation (5). The necessary full conditional probabilities are given by:

$$\pi(w_i = 1 \mid \beta, y_i = 1, x_i) = \frac{p_{11} \Phi(\beta' x_i)}{p_{11} \Phi(\beta' x_i) + p_{10}\left(1 - \Phi(\beta' x_i)\right)}, \tag{14a}$$

$$\pi(w_i = 1 \mid \beta, y_i = 0, x_i) = \frac{p_{01} \Phi(\beta' x_i)}{p_{01} \Phi(\beta' x_i) + p_{00}\left(1 - \Phi(\beta' x_i)\right)}. \tag{14b}$$

The Gibbs sampler then alternates between the three steps of generating the coefficients $\beta$ from (10), generating the $z_i$ from (12), and generating the $w_i$ from (14a)-(14b). This sequence forms a Markov chain; at convergence, the values of $\beta$ come from the desired posterior distribution. These values reflect the imperfect match between $w_i$ and $y_i$; the correction is accomplished through appropriate random generation of the $w_i$, by equations (14a) and (14b). The chain can be started at some rational starting point, for example with $w_i = y_i$, and with $z_i = 1$ or $-1$ according to the value of $y_i$.

If the values of $\beta$ generated by the Markov chain scheme are denoted $\beta^{(1)}, \ldots, \beta^{(G)}$, then the usual Bayes estimate for $\beta$ is the sample mean of the $\beta^{(g)}$, ignoring the first $B$ samples; i.e.,

$$\hat{\beta} = (G - B)^{-1} \sum_{g = B + 1}^{G} \beta^{(g)}.$$

The posterior variance is similarly estimated as the sample variance for the last $G - B$ iterates. The first $B$ samples are discarded in order to allow the Markov chain to approach its steady-state distribution before samples are retained. Considerable research has been devoted to diagnosing convergence of the Gibbs sampler; that is, to identifying acceptable values of $G$ and $B$ above. Tanner (1993) provides guidance for implementation of Gibbs sampling techniques, and Roberts (1992), Robert (1995), and Zellner and Min (1995) describe diagnostic measures for assessing convergence of the Markov chain.

2.2 Determination of $p_{11}$ and $p_{00}$

The quantities $p_{11}$ and $p_{00}$ defined in (2) are the probabilities for intentions conditional on actual purchase behavior, while the probabilities in Table 1 are the probabilities of purchase behavior conditional on purchase intention. Since it is these latter probabilities for which published historical values (or product class level estimates vis-a-vis industry standards) exist, it would be desirable to be able to express the required inputs $p_{11}$ and $p_{00}$ in terms of these more familiar probability figures. Let $q_{jk} = P(w = j \mid y = k)$, so that, for example, $q_{11}$ denotes the proportion of survey respondents who indicate an intention to buy that actually do complete a purchase. The $p_{jk}$ can be written in terms of the $q_{jk}$ via Bayes theorem:

$$\begin{aligned} p_{11} = P(y = 1 \mid w = 1) &= \frac{P(w = 1 \mid y = 1)\, P(y = 1)}{P(w = 1 \mid y = 1)\, P(y = 1) + P(w = 1 \mid y = 0)\, P(y = 0)} \\ &= \frac{q_{11} P(y = 1)}{q_{11} P(y = 1) + q_{10} P(y = 0)}, \end{aligned} \tag{15}$$

$$\begin{aligned} p_{00} = P(y = 0 \mid w = 0) &= \frac{P(w = 0 \mid y = 0)\, P(y = 0)}{P(w = 0 \mid y = 0)\, P(y = 0) + P(w = 0 \mid y = 1)\, P(y = 1)} \\ &= \frac{q_{00} P(y = 0)}{q_{00} P(y = 0) + q_{01} P(y = 1)}. \end{aligned} \tag{16}$$

The quantities $P(y = 1)$ and $P(y = 0)$ can be calculated by the marketing researcher based on the survey of intentions at hand, since $P(y = 1)$ simply denotes the proportion of respondents indicating an intention to buy. Thus, using these values, and the industry standards for $q_{00}$ and $q_{11}$ as listed in Table 1, one can obtain approximate estimates of the required probabilities $p_{jk}$ via equations (15)-(16).

2.2.1 Uncertainty about $p_{11}$ and $p_{00}$

Since the appropriate probabilities $p_{11}$ and $p_{00}$ may not be known exactly, one may wish to incorporate uncertainty about these parameters into the estimation procedure. Instead of regarding these probabilities as fixed and known, one can instead assign prior distributions to these quantities, and can then draw random values of $p_{00}$ and $p_{11}$ from their full conditional posterior distributions at each iteration of the Markov chain. The beta probability distribution provides a flexible family of distributions on $[0, 1]$ that can be used to express knowledge about $p_{11}$ and $p_{00}$. Suppose that the prior for $p_{11}$ is beta($\alpha_1, \alpha_0$):

$$\pi(p_{11}) \propto p_{11}^{\alpha_1 - 1} (1 - p_{11})^{\alpha_0 - 1}.$$

Let $n_{jk}$ denote the number of observations in the dataset for which $y_i = j$ and $w_i = k$. Then the full conditional posterior density for $p_{11}$ is proportional to the prior times the likelihood, and the likelihood is proportional to $p_{11}^{n_{11}} (1 - p_{11})^{n_{01}}$; thus, the full conditional posterior is proportional to $p_{11}^{\alpha_1 + n_{11} - 1} (1 - p_{11})^{\alpha_0 + n_{01} - 1}$. That is, the density is beta($\alpha_1 + n_{11}$, $\alpha_0 + n_{01}$). Similarly, if the prior for $p_{00}$ is beta($\gamma_0, \gamma_1$), then the full conditional posterior for $p_{00}$ is beta($\gamma_0 + n_{00}$, $\gamma_1 + n_{10}$). Note that the data are informative about $p_{11}$ and $p_{00}$, since these parameters appear in the likelihood function for the observable data $(y_i, x_i)$ in equation (9).

2.3 Hierarchical Bayes Estimation

Frequently, marketing researchers will obtain purchase intention data on multiple brands and/or product lines within a common product class; a typical survey may ask the survey respondents: "Do you intend to purchase Brand 1 in the next year? Do you intend to purchase Brand 2 in the next year? ... Do you intend to purchase Brand M in the next year?" Given this data on multiple ($M$) brands, one may wish to identify the coefficient vector $\beta_k$ for each brand $k$, $k = 1, \ldots, M$.

One approach to estimating the $\beta_k$ parameter vectors is to simply estimate model (3)-(5) separately for each brand, in effect assuming that the $\beta_k$ are completely unrelated. Alternatively, one could make the strong assumption that the $\beta_k$ vectors for the different brands are in fact all equal; under this assumption, one could then pool the data from the $M$ brands, thus considerably multiplying the amount of data available. The advantage of the former approach, of estimating the $M$ models separately, is that it makes no assumptions about the relationship among the $\beta_k$, and thus cannot introduce bias from pooling.
The advantage of the latter approach, pooling the data, is that it increases the effective sample size, and hence reduces the sampling variance of the estimates. The former method will be superior in cases when the vectors /3k are very dissimilar for the different brands, while the latter will be superior when the 3Ok are homogeneous. To appreciate the potential gain from pooling, if the /3k 14

are truly all equal and data from M brands are pooled, the sampling standard error of the parameter estimates will decrease by a factor of vM/IA; thus, with 9 brands, the unpooled estimates will have 200% larger sampling error than the pooled estimates. Hierarchical Bayes and empirical Bayes methods (e.g., Berger 1985) are designed to obtain the optimal tradeoff between a completely unpooled and a completely pooled estimator. The methods are based on a supposition that the 3k vectors are not identical, but that they arise from a common distribution. Let r(/3) denote this common prior distribution, and assume that the distribution is multivariate normal, with mean pt and covariance S = T-1. In the Gibbs sampling framework, each vector 3k will be generated using equation (11), based on the informative prior 7r(/3) r- N(g, S). From (11), if T is zero, indicating the variance of the /3's, A, is very large and the brands are heterogeneous, then the posterior mean for /3Ok will be (X'X)'X'zk which is equivalent to the mean given in equation (10), the formula for the simple, unpooled estimate for brand k. Alternatively, if T is infinite, indicating that the brands are completely homogeneous, then the conditional posterior mean for Ok is equal to it for all k; i.e., the estimate behaves like a pooled estimate. Thus, if appropriate prior parameters p and S are available, the Bayes estimator behaves as desired, acting as a disaggregate estimator when the brands are heterogeneous, and as a pooled estimator when the brands are homogeneous. However, because response data is available on multiple brands, the prior parameters can in fact be estimated from the data themselves. Suppose at a given iteration in the Gibbs sampling scheme, one has samples (/31,..., 3M) for the MI brands. Then sensible point estimates of the prior parameters are: M ^ M'ZDk (17) k=1 M = M- M' (/k - )(/3k -4 )'; (18) k=1 i.e., the prior mean and covariance for the 3's is estimated by the sample mean and 15

covariance of the $\beta$'s. For the true hierarchical Bayes method, one would not use a point estimate of $(\mu, \Sigma)$ but would instead sample values from their respective conditional posteriors. To obtain these conditional posteriors, one must specify a prior on the hyperparameters. A hyper-prior on $(\mu, \Sigma)$ which is proportional to a power of $|\Sigma|$ that depends on $p$, the dimension of $\beta$, can be said to represent "maximal uncertainty" about these parameters (Wakefield et al. 1994). Under this non-informative prior, the conditional posteriors for the hyper-parameters are:

$$\pi(\mu \mid \beta_1, \ldots, \beta_M, \Sigma) \sim N\left(M^{-1} \sum_{k=1}^{M} \beta_k,\; M^{-1}\Sigma\right) \qquad (19)$$

$$\pi(\Sigma^{-1} \mid \beta_1, \ldots, \beta_M, \mu) \sim W\left(M + p,\; \sum_{k=1}^{M} (\beta_k - \mu)(\beta_k - \mu)'\right) \qquad (20)$$

where $W(d, A)$ denotes the Wishart distribution with $d$ degrees of freedom and expected value $dA^{-1}$ (Wakefield et al. 1994). Odell and Feiveson (1966) present a procedure for generating deviates from the Wishart matrix distribution.

To summarize, the Gibbs sampling estimation algorithm for the hierarchical Bayes model involves iterative repetition of the following steps:

1. For each brand $k$, generate the vector $\beta_k$ using the prior $\pi(\beta) \sim N(\mu, \Sigma)$, via equation (11).
2. For each brand $k$ and subject $i$, generate the values $w_{ik}$ using equation (14), conditional on the current value for $\beta_k$.
3. For each brand $k$ and subject $i$, generate the values $z_{ik}$ using equation (12), conditional on the current value for $\beta_k$.
4. Given the values $\beta_1, \ldots, \beta_M$, and $\Sigma$, generate the prior mean $\mu$ using equation (19).
5. Given the values $\beta_1, \ldots, \beta_M$, and $\mu$, generate the prior covariance $\Sigma$ using equation (20).

The procedure can be initialized by, for example, setting $\mu = 0$, $\Sigma = I$. See Lenk and Rao (1990), Allenby and Lenk (1994) and Allenby and Lenk (1995) for further applications of

hierarchical Bayes models in marketing science.

2.4 Potential Degeneracies

There is a potential problem that can arise with model (3)-(5), regardless of the estimation technique employed. The problem is that under certain conditions, the likelihood is maximized for infinite values of the $\beta$ coefficients. To illustrate, consider a simple case in which there is just one covariate, $x$, which is binary, taking values 0 or 1. Then, from equation (7),

$$P(y_i = 1 \mid \beta, x = 0, p_{00}, p_{11}) = p_{10} + (p_{11} - p_{10})\Phi(\beta_0);$$

similarly,

$$P(y_i = 1 \mid \beta, x = 1, p_{00}, p_{11}) = p_{10} + (p_{11} - p_{10})\Phi(\beta_0 + \beta_1).$$

If $\pi_0$ denotes the empirical proportion of observations for which $y_i = 1$ given that $x_i = 0$, and $\pi_1$ denotes the proportion of observations for which $y_i = 1$ given that $x_i = 1$, then the MLE's of $(\beta_0, \beta_1)$ can be found simply by setting the probabilities implied by the model to be equal to the observed proportions:

$$\pi_0 = p_{10} + (p_{11} - p_{10})\Phi(\beta_0) \qquad (21)$$

$$\pi_1 = p_{10} + (p_{11} - p_{10})\Phi(\beta_0 + \beta_1), \qquad (22)$$

and the solution is:

$$\hat{\beta}_0 = \Phi^{-1}\left(\frac{\pi_0 - p_{10}}{p_{11} - p_{10}}\right) \qquad (23)$$

$$\hat{\beta}_1 = \Phi^{-1}\left(\frac{\pi_1 - p_{10}}{p_{11} - p_{10}}\right) - \hat{\beta}_0. \qquad (24)$$

The degeneracy arises if the argument of $\Phi^{-1}(\cdot)$ in (23) or (24) falls outside the interval (0,1); this event occurs if either $\pi_0$ or $\pi_1$ falls outside the interval $(p_{10}, p_{11})$. In such a degenerate situation, the likelihood function is maximized for coefficient values that are infinite. The model given by equations (3)-(5) implies that for any value of $x$, the conditional probability $P(y_i = 1 \mid x)$ must be between $p_{10}$ and $p_{11}$; degeneracy arises when the data conflict with this assumption. Degenerate cases are diagnosed by observing the Gibbs samples of the $\beta$ coefficients

wandering toward infinity. Degeneracy may, but does not necessarily, indicate that the values assumed for $p_{00}$ and $p_{11}$ are inappropriate for the given dataset. The degeneracy can be prevented by putting constraints or bounds on the parameters. In the Gibbs sampling framework, this implies rejecting any generated set of coefficients $\beta$ that falls outside of some "reasonable" neighborhood. Alternatively, the use of a proper prior distribution on $\beta$, such that the prior probability of infinite $\beta$ is sufficiently small, will prevent the degeneracy.

3 A Simulation Study

A modest simulation study was performed to evaluate the ability of the Bayes estimators to recover the true parameters relating covariates to product purchase, based only on data from purchase intentions. Data were generated from the model described in (3)-(5), with $\beta = (\beta_0, \beta_1)$, $x_i = (1, x_i)$, and with the $x_i$ generated uniformly over [0,1]. Four different simulation blocks, with different sets of generating parameters, were evaluated. Table 2 lists the parameter values $(n, \beta_0, \beta_1, p_{00}, p_{11})$ used in the study. The values were chosen to examine the performance of the estimation method over a range of possible parameter settings. Table 2 also displays the results obtained by analyzing the data in two ways: with the Gibbs sampling technique described in this paper, and with the naive approach of simply performing maximum likelihood probit regression of the intention data $y_i$ versus the $x_i$. This latter approach is essentially equivalent to applying the method of this paper, but erroneously assuming that $p_{00} = p_{11} = 1$. The Gibbs sampling procedure was run for 2000 iterations, and the final Bayes estimate was chosen as the mean of the last 1500 samples generated.

[Insert Table 2 Here]

In the first simulation block, a modest amount of discrepancy between intentions and purchase was assumed: $p_{00} = p_{11} = 0.8$.
Note that even for this apparently small amount of error, the naive parameter estimates are grossly biased; the mean naive estimate for

$\beta_1$ was 0.63, though the true value was 3.0. The Bayes estimator proposed in this paper has much less bias, with a mean of 3.93. When the measurement error is increased to $p_{00} = 0.9$, $p_{11} = 0.6$, the Bayes estimator is still able to recover the parameters governing purchase behavior (the $w_i$) based on data from simulated intentions ($y_i$). The Bayes estimator appears to be unbiased in Block 3, with $\beta_0 = 2$, $\beta_1 = -3$, but the standard deviation is more than twice that in Block 2, with $\beta_0 = -2$, $\beta_1 = 3$. The reason for the different performances in Blocks 2 and 3 is that, given the different coefficients, more $w_i$'s have a value of 1 in Block 3 than in Block 2, and since $p_{00} > p_{11}$ for these blocks (a purchaser is misreported with probability $1 - p_{11} = 0.4$, a non-purchaser with probability only $1 - p_{00} = 0.1$), there will be more misclassified $y_i$'s in Block 3 than in Block 2.

The generally higher variance of the Bayes estimator relative to the naive estimator seems to be due to the fact that the probability function for the data $y_i$ assumed with the Bayes estimator is less sensitive to changes in $\beta$ than is the model used in the naive approach. For the Bayes model, the probability function is:

$$P(y_i = 1 \mid \beta, x_i, p_{00}, p_{11}) = p_{10} + (p_{11} - p_{10})\Phi(\beta'x_i), \qquad (25)$$

while for the naive estimator, the probability function is simply:

$$P(y_i = 1 \mid \beta, x_i) = \Phi(\beta'x_i). \qquad (26)$$

The sensitivities to changes in the coefficients are, for the Bayes estimator,

$$\frac{\partial^2}{\partial\beta_k\,\partial\beta_r} P(y_i = 1 \mid \beta, x_i, p_{00}, p_{11}) = (p_{11} - p_{10})^2 x_{ik} x_{ir}\, \phi(\beta'x_i), \qquad (27)$$

and for the naive estimator,

$$\frac{\partial^2}{\partial\beta_k\,\partial\beta_r} P(y_i = 1 \mid \beta, x_i) = x_{ik} x_{ir}\, \phi(\beta'x_i). \qquad (28)$$

The quantity $p_{11} - p_{10} = p_{11} + p_{00} - 1$ must be less than or equal to 1 in absolute value, equalling 1 only when $p_{11} = p_{00} = 1$, or when $p_{11} = p_{00} = 0$.^2 The relative insensitivity of the probability function when $p_{11}$ or $p_{00}$ is less than 1 corresponds to a relatively flatter likelihood surface, and hence smaller Fisher information.

^2 The latter case indicates a condition in which all respondents claiming an intention to purchase do not in fact purchase, while all respondents denying an intention to purchase do purchase. In this very untypical situation, a researcher could reverse the intentions data, and apply simple binary regression to the reversed data, to obtain an accurate model for purchase behavior.

3.1 Hierarchical Bayes Estimation

A further modest simulation study was performed to evaluate the potential gains from using the hierarchical Bayes (HB) approach described in section 2.3. In this study, data from $M$ brands were generated from the model:

$$z_{ik} = \beta_k'x_i + \epsilon_{ik}, \quad \epsilon_{ik} \sim N(0, 1) \qquad (29)$$

$$w_{ik} = 1 \text{ if } z_{ik} > 0, \; 0 \text{ otherwise} \qquad (30)$$

$$P(y_{ik} = 1 \mid w_{ik} = 1) = p_{11} \qquad (31)$$

$$P(y_{ik} = 0 \mid w_{ik} = 0) = p_{00}, \qquad (32)$$

with the $\beta_k$ distributed normally across brands: $\beta_k \sim N(\mu, \Sigma)$, $k = 1, \ldots, M$. In each simulation trial $\Sigma$ was equal to $\sigma^2 I$ for some scalar $\sigma$. Small values of $\sigma$ indicate homogeneity of $\beta_k$ across brands, and thus a favorable condition for pooling of data. The coefficients $\beta_k$ were estimated two ways: using the hierarchical Bayes procedure described in section 2.3, and by a disaggregate estimator in which the Bayes estimator with non-informative priors is applied to each brand separately. Both the hierarchical Bayes and the non-informative Bayes estimates were based on correct values of the $p_{ij}$ probabilities.

For all the simulation replications, $p_{11}$ and $p_{00}$ were set to 0.9, and the prior mean $\mu$ was set to $(-2, 3)$. The cross-sectional standard deviation $\sigma$, the number of subjects $n$, and the number of brands $M$ were varied to discover the benefit of hierarchical Bayes methods under different conditions. For each replication, the accuracy was measured by the root mean squared error for estimating the $M$ coefficient vectors $\beta_k$: i.e., the RMSE for estimating

$\beta_0$ in a given replication was given by $\left(M^{-1} \sum_{k=1}^{M} (\hat{\beta}_{0k} - \beta_{0k})^2\right)^{1/2}$, where $\beta_{0k}$ denotes the true value for the parameter and $\hat{\beta}_{0k}$ its estimate. Table 3 lists the estimation results. As can be seen, the gains in estimation accuracy from using the hierarchical Bayes approach are substantial. For example, in Block 1, the mean RMSE for the HB estimate of $\beta_1$ is 0.736, while the mean RMSE for the non-informative Bayes estimate is 1.051; thus, the pooling achieved by the HB approach reduces estimation error by about 30% (equivalently, the disaggregate RMSE is about 43% larger). As expected, the gains are greatest in cases in which the number of subjects $n$ is small, and when the cross-sectional variation $\sigma$ is small. Even if the cross-sectional variation in the $\beta_k$ is fairly large ($\sigma = 0.5$), the gains from using HB appear to be non-negligible. In the limit, if $\sigma$ is large, or $n$ is large, the HB method will essentially reduce to the simple Bayes estimate, which will be similar to the non-informative Bayes estimate. Thus, for small cross-sectional variation, the HB method will significantly outperform the non-informative Bayes estimate, whereas the two methods should have comparable performance under conditions less favorable to HB.

[Insert Table 3 Here]

3.2 Parameter Estimation: A Summary

In section 3, it was demonstrated that equating intentions with actual behavior can lead to substantially biased inferences, and that, given knowledge of the relationship between intentions and behavior, one can obtain relatively unbiased estimates of the model relating covariates to purchase behavior, by using the technique described in section 2.1. In section 3.1, it was shown that, given intentions data on multiple brands, one may be able to achieve substantial gains in estimation efficiency by pooling the data through hierarchical Bayes methods. It is worth noting here the assumptions and limitations underlying the methodology.
Model (3)-(5) assumes that correlations between the covariates x and stated intentions y are explained strictly through the relationships between the variables x and actual purchase behavior w, and through the relationship between purchase behavior w and

intention y. Thus, it is assumed that the parameters $p_{00}$ and $p_{11}$, relating behavior and intentions, are independent of x. Also, it is assumed that one has fairly informative prior knowledge concerning the values $p_{00}$ and $p_{11}$. While section 2.2.1 shows that the observable intentions data will be informative about these parameters, in practice it will be difficult to get accurate estimates of the probabilities, unless one has substantial numbers of observations with extremely large, and with extremely small, values of $\beta'x_i$, thus affording a view of the tails of the probability function $P(y_i = 1 \mid x_i, p_{00}, p_{11})$ given in equation (7).

4 An Application Involving Personal Computers

Morwitz and Schmittlein (1992) present a study in which members of a large consumer panel were surveyed multiple times concerning their intentions to buy and their actual purchasing of personal computers. In addition, extensive demographic and product usage variables about the panel households were obtained. At the time of the first survey, none of the sample households owned a personal computer. During the first survey, respondents were asked the following intention question:

Do you or does anyone in your household plan to acquire a personal computer in the future for use at home?
- Yes, in the next 6 months
- Yes, in 7 to 12 months
- Yes, in 13 to 24 months
- Yes, sometime, but not within 24 months
- No, will not acquire one

One year later, the same sample of respondents indicated whether or not they had purchased since the first survey. Note that the intention question asked of panel households is not a binary intentions question, but rather is a variation of a standard binary question that contains a purchase

timing element. In order to estimate the model, we classify those respondents who indicated that they intended to purchase within the one year time frame of interest as intenders (i.e., they either intend to buy in 6 months, or in 7 to 12 months). Respondents who indicated that they did not intend to purchase a personal computer and those who indicated that they intend to buy sometime after the first year are classified as non-intenders.

The covariates available to predict purchase probability included household income, profession, geographic region, and current level of PC usage. Due to the confidential nature of the dataset, we have rescaled the data and shuffled labels of the predictor variables; however, these operations should not affect the suitability of the model for the application. The particular set of variables included in the analyses reported here were:

PROFESS = 1 if a professional; 0 otherwise
STUDENT = 1 if a student; 0 otherwise
USE = 1 if a household member uses a PC at work or school; 0 otherwise
PACIFIC = 1 if house is in Pacific region of U.S.; 0 otherwise
INCOME = household income.

To achieve approximately equal scaling for the variables, income was coded in units of \$100,000; e.g., a respondent with income of \$60,000 was coded as 0.6 for the INCOME variable.

The binary intentions data $y_i$ were analyzed under several different specifications for the parameters $(p_{00}, p_{11})$:

Analysis 1, with $p_{00} = 0.691$, $p_{11} = 0.812$, corresponding to the actual conditional probabilities observed with the given dataset;
Analysis 2, with $p_{00} = 1.0$, $p_{11} = 1.0$, corresponding to the naive assumption that stated purchase intention corresponds exactly to purchase behavior, and thus equivalent to simple probit regression analysis of the intentions data;
Analysis 3, with $p_{00} = 0.658$, $p_{11} = 0.907$, obtained by using the values of $\theta_{00} = 0.962$, $\theta_{11} = 0.429$ reported in Jamieson and Bass (1989) for purchases of personal computers, and applying formulas (15) and (16).
In addition, an Analysis 4 was performed, in which the actual purchase responses $w_i$ were analyzed using the standard probit model. The Gibbs sampling procedure was run

for 10,000 iterations, and the coefficient estimates were taken as the means of the final 5,000 samples. Table 4 lists the estimated coefficients and standard errors for the different analyses.

[Insert Table 4 Here]

The coefficient estimates based on Analysis 1 of the purchase intentions, with $p_{00}$ and $p_{11}$ chosen based on the corresponding values in the dataset under analysis, are very close to the values obtained from simple probit regression of the purchase responses as presented in Analysis 4. One interesting finding is the significant impact of geographic region (PACIFIC) on purchase behavior, even when controlling for income and professional status. One possible explanation is that PACIFIC is acting as a proxy for employment in the computer industry, and that such employees would be more likely than other professionals to purchase a personal computer.

The coefficient estimates based on Analysis 2 of the intentions data, with $p_{00}$ and $p_{11}$ taken to be equal to 1.0, are quite far from the values obtained in Analysis 4, though the ordering of the magnitudes appears to be correct. Analysis 3, using values for $p_{00}$ and $p_{11}$ based on the intention conversion probabilities observed in Jamieson and Bass (1989), provides estimates that are somewhat close to the estimates of Analysis 4, and substantially less biased than when $p_{00}$ and $p_{11}$ are taken to be 1.

The coefficient estimates from Analysis 2 all appear to be substantially attenuated; i.e., the impact of the covariates on purchase probability is underestimated by the analysis of the purchase intentions. Ignoring the discrepancy between intentions and behavior would lead an analyst to conclude that the covariates have weak effects on purchase probability, when in fact the effects are fairly strong. The phenomenon of measurement error leading to underestimation of regression coefficients is common in econometrics (e.g., Fuller 1989).
In the case of ordinary regression, it is measurement error in the independent variables which leads to attenuated parameter estimates, whereas measurement error added to the dependent variable does not lead to bias in OLS estimates of regression coefficients. In the case of binary regression models, measurement error in the independent variables is

known to lead to biased parameter estimates (e.g., Stefanski and Carroll 1985), but here it is seen that error in the response variable leads to bias in the parameter estimates as well. Finally, as expected, the standard errors of the coefficient estimates for analyses 1 and 3 are considerably higher than those for analyses 2 and 4; since analyses 1 and 3 acknowledge the presence of uncertainty about true purchase behavior, the uncertainty about the parameter estimates is correspondingly greater.

As a simple test of the assumption of conditional independence of y and x given w, a probit regression analysis was run of the $y_i$ versus $(w_i, x_i)$. The results, which are presented in Table 5, show that the coefficients for the covariates x are not significant. Non-significance does not prove the model is correct; however, these results suggest that the assumption of conditional independence between y and x given w may be reasonable for these data. In usual practice, one will not be able to empirically verify the assumption that the $p_{jk}$ are independent of the covariates x, since data on actual behavior are typically unavailable.

[Insert Table 5 Here]

In summary, it appears that ignoring the discrepancy between purchase intentions and purchase behavior leads to grossly inaccurate estimates for binary regression coefficients, and that taking the discrepancy into account through use of model (3)-(5) can lead to improved accuracy in parameter estimation. Further, significant improvement can be obtained even if the correct parameters $p_{00}$ and $p_{11}$ are not known precisely. In practice, a researcher may wish to fit the model for a variety of different values of $p_{00}$ and $p_{11}$, in order to assess the sensitivity of the inferences with respect to assumptions about the conversion of intentions to actual purchases.
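The sensitivity check suggested above can be illustrated with a short Python sketch. It is a simplified stand-in rather than the procedure used in this paper: it maximizes the likelihood implied by the probability function (25) directly instead of running the Gibbs sampler, and the synthetic data, the grid of assumed $(p_{00}, p_{11})$ pairs, and the function names `neg_loglik` and `fit_for_assumed_rates` are all assumptions introduced for illustration.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def neg_loglik(beta, X, y, p00, p11):
    """Negative log-likelihood of intentions y when
    P(y=1|x) = p10 + (p11 - p10) * Phi(x'beta), with p10 = 1 - p00."""
    p10 = 1.0 - p00
    prob = p10 + (p11 - p10) * norm.cdf(X @ beta)
    prob = np.clip(prob, 1e-10, 1.0 - 1e-10)  # guard the logarithms
    return -np.sum(y * np.log(prob) + (1.0 - y) * np.log(1.0 - prob))

def fit_for_assumed_rates(X, y, p00, p11):
    """Maximum-likelihood coefficients for one assumed pair (p00, p11)."""
    result = minimize(neg_loglik, np.zeros(X.shape[1]),
                      args=(X, y, p00, p11),
                      method="Nelder-Mead", options={"maxiter": 2000})
    return result.x

# Synthetic intentions data (hypothetical settings: beta = (-2, 3), p00 = p11 = 0.8).
rng = np.random.default_rng(0)
n = 20000
x = rng.uniform(size=n)
X = np.column_stack([np.ones(n), x])
w = X @ np.array([-2.0, 3.0]) + rng.standard_normal(n) > 0  # true purchase
flip = rng.uniform(size=n) > 0.8     # each report is wrong with probability 0.2
y = np.where(flip, ~w, w).astype(float)

# Refit under several assumed conversion rates; (1.0, 1.0) is the naive probit.
fits = {rates: fit_for_assumed_rates(X, y, *rates)
        for rates in [(1.0, 1.0), (0.9, 0.9), (0.8, 0.8)]}
for rates, beta_hat in fits.items():
    print(rates, np.round(beta_hat, 2))
```

Comparing the fitted coefficients across the grid shows how strongly the conclusions depend on the assumed conversion rates. If an assumed pair conflicts with the data, the optimizer may drift toward very large coefficients, which is the degeneracy discussed in section 2.4.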
5 Alternative Models

Equations (3)-(5) describe one possible model for characterizing both the effect of covariates on purchase behavior, as well as the discrepancy between stated intentions and

actual behavior. As an alternative model, one could suppose that $z_i$ and $w_i$ are defined as in (3) and (4), but that $y_i$, the stated purchase intention, equals 1 if and only if $z_i + s_i$ is positive, where $s_i \sim N(\mu, \sigma^2)$.^3 The parameter $\sigma$ here measures the correlation between intentions and behavior, and the parameter $\mu$ measures the asymmetry of the switching process. The naive model, in which intentions are treated as equivalent to behavior, is characterized by $\mu = \sigma = 0$. Reasonable values for $\mu$ and $\sigma^2$ could be inferred from studies such as are listed in Table 1. This alternative specification implies that, for the observable data $y_i$ (taking $s_i$ to be independent of the error term in (3)),

$$P(y_i = 1 \mid x) = \Phi\left(\frac{\beta'x + \mu}{\sqrt{1 + \sigma^2}}\right), \qquad (33)$$

while for the model (3)-(5), the corresponding probability is:

$$P(y_i = 1 \mid x) = p_{10} + (p_{11} - p_{10})\Phi(\beta'x). \qquad (34)$$

A choice between the two models could be based on the shape of the empirical probability curve $P(y = 1 \mid x)$. Log-likelihood ratios and criteria such as AIC could provide a scheme for choosing between different model specifications.

^3 We thank an anonymous referee for suggesting this interesting model.

6 Conclusion

The models and analyses of Morrison (1979), Infosino (1986), Manski (1990), and Bemmaor (1995) provide interesting insights into the psychological mechanisms by which purchase intentions deviate from actual purchase behavior. These and other papers have documented the typical magnitude of the discrepancy between intentions and behavior. The present paper takes this discrepancy between intentions and behavior as a given, and develops an estimation procedure for identifying the demographic/psychographic correlates of behavior based on data on intentions. When data on multiple brands from a

common product class are available, the hierarchical Bayes method introduced in this paper can be used to obtain optimal pooling of data across brands. We show, with simulated data and with data from a recent marketing study, that the proposed estimation method can be useful in improving upon the accuracy of estimates of the relationship between the covariates and true purchase behavior. This ability to make better use of intentions data may lead to more accurate market segmentation, and thus improved marketing management decisions. Future research will attempt to extend the current model to include multi-level intentions scales, to explicitly take into account the elapsed time between intent measurement and purchase occasion, and to incorporate latent class and multivariate generalizations.

References

Adams, F. G. (1974). Commentary on McNeil, 'Federal Programs to Measure Consumer Purchase Expectations'. Journal of Consumer Research 1, 11-12.
Albert, J. and S. Chib (1993). Bayesian Analysis of Binary and Polychotomous Response Data. Journal of the American Statistical Association 88, 669-679.
Allenby, G. and P. Lenk (1994). Modelling Household Purchase Behavior with Logistic Normal Regression. Journal of the American Statistical Association 89, 1218-1231.
Allenby, G. and P. Lenk (1995). Reassessing Brand Loyalty, Price Sensitivity, and Merchandising Effects on Consumer Brand Choice. Journal of Business and Economic Statistics 13, 281-289.
Bemmaor, A. (1995). Predicting Behavior from Intention-to-Buy Measures: The Parametric Case. Journal of Marketing Research 32, 176-191.
Berger, J. (1985). Statistical Decision Theory and Bayesian Analysis. New York, NY: Springer Verlag.
Clawson, C. (1971). How Useful are 90-Day Purchase Probabilities? Journal of Marketing Research 35, 43-47.
Copas, J. (1988). Binary Regression Models for Contaminated Data. Journal of the Royal Statistical Society, Series B 50, 225-265.
Ferber, R. and R. Piskie (1965). Subjective Probabilities and Buying Intentions. Review of Economics and Statistics 47, 322-325.
Fishbein, M. and I. Ajzen (1975). Belief, Attitude, Intention, and Behavior. Reading, MA: Addison-Wesley.
Fuller, W. (1989). Measurement Error Models. New York, NY: John Wiley and Sons.
Gelfand, A. and A. Smith (1990). Sampling Based Approaches to Calculating Marginal Densities. Journal of the American Statistical Association 85, 398-409.
Geman, S. and D. Geman (1984). Stochastic Relaxation, Gibbs Distributions and the Bayesian Restoration of Images. IEEE Transactions on Pattern Analysis and Machine Intelligence 6, 721-741.
Gormley, R. (1974). A Note on Seven Brand Rating Scales and Subsequent Purchase. Journal of Market Research Society 16, 242-244.
Granbois, D. and J. Summers (1975). Primary and Secondary Validity of Consumer Purchase Probabilities. Journal of Consumer Research 1, 31-38.
Infosino, W. (1986). Forecasting New Product Sales from Likelihood of Purchase Ratings. Marketing Science 5, 372-384.
Jamieson, L. and F. Bass (1989). Adjusting Stated Intention Measures to Predict Trial Purchase of New Products: A Comparison of Models and Methods. Journal of Marketing Research 26, 336-345.
Juster, F. (1966). Consumer Buying Intentions and Purchase Probability: An Experiment in Survey Design. Journal of the American Statistical Association 61, 658-696.
Kalwani, M. U. and A. J. Silk (1982). On the Reliability and Predictive Validity of Purchase Intention Measures. Marketing Science 1, 243-286.
Lenk, P. J. and A. G. Rao (1990). New Models from Old: Forecasting Product Adoption by Hierarchical Bayes Procedures. Marketing Science 9, 42-53.
Manski, C. (1990). The Use of Intentions Data to Predict Behavior: A Best-Case Analysis. Journal of the American Statistical Association 85, 934-940.
McCulloch, R. and P. E. Rossi (1994). An Exact Likelihood Analysis of the Multinomial Probit Model. Journal of Econometrics 64, 207-240.
McNeil, J. M. (1974). Federal Programs to Measure Consumer Purchase Expectations, 1946-73: A Post-Mortem. Journal of Consumer Research 1, 1-10.
Morrison, D. G. (1979). Purchase Intentions and Purchase Behavior. Journal of Marketing 43, 65-74.
Morwitz, V. G., E. J. Johnson, and D. Schmittlein (1993). Does Measuring Intent Change Behavior? Journal of Consumer Research 20, 46-61.
Morwitz, V. G. and D. Schmittlein (1992). Using Segmentation to Improve Sales Forecasts Based on Purchase Intent: Which 'Intenders' Actually Buy. Journal of Marketing Research 29, 391-405.
Odell, P. and A. Feiveson (1966). A Numerical Procedure to Generate a Sample Covariance Matrix. Journal of the American Statistical Association 61, 198-203.
Pickering, J. and B. Isherwood (1974). Purchase Probabilities and Consumer Buying Behavior. Journal of Market Research Society 16, 203-226.
Robert, C. (1995). Convergence Control Methods for Markov Chain Monte Carlo Algorithms. Statistical Science 10, 231-253.
Roberts, G. O. (1992). Convergence Diagnostics of the Gibbs Sampler. In J. Bernardo, J. Berger, A. Dawid, and A. Smith (Eds.), Bayesian Statistics 4: Proceedings of the Fourth Valencia International Meeting, pp. 775-782. Oxford University Press.
Roberts, G. O. and A. F. M. Smith (1993). Bayesian Computation via the Gibbs Sampler and Related Markov chain Monte Carlo Methods. Journal of the Royal Statistical Society, Series B 55, 3-23.
Stefanski, L. A. and R. J. Carroll (1985). Covariate Measurement Error in Logistic Regression. Annals of Statistics 13, 1335-1351.
Tanner, M. (1993). Tools for Statistical Inference: Methods for the Exploration of Posterior Distributions and Likelihood Functions (2nd ed.). New York, NY: Springer Verlag.
Tauber, E. M. (1975). Predictive Validity in Consumer Research. Journal of Advertising Research 15, 59-64.
Taylor, J., J. Houlahan, and A. Gabriel (1975). The Purchase Intention Question in New Product Development: A Field Test. Journal of Marketing, 90-92.
Tierney, L. (1994). Markov Chains for Exploring Posterior Distributions (with Discussion). Annals of Statistics 4, 1701-1762.
Tobin, J. (1959). On the Predictive Value of Consumer Intentions and Attitudes. Review of Economics and Statistics 41, 1-11.
Wakefield, J., A. Smith, A. Racine-Poon, and A. Gelfand (1994). Bayesian Analysis of Linear and Non-Linear Population Models by Using the Gibbs Sampler. Applied Statistics 43, 201-221.
Warshaw, P. R. (1980). Predicting Purchase and Other Behaviors from General and Contextually Specific Intentions. Journal of Marketing Research 17, 26-33.
Zellner, A. (1971). An Introduction to Bayesian Inference in Econometrics. New York, NY: John Wiley and Sons.
Zellner, A. and C.-K. Min (1995). Gibbs Sampler Convergence Criteria. Journal of the American Statistical Association 90, 921-927.

Table 1: Correspondence between purchase intention and purchase behavior observed in different markets and different studies.

                                    Purchase Probability (%)
Study                               Intenders   Non-Intenders   Product
Juster 1966 (a)                     50          11              Automobile
Jamieson and Bass 1989 (b)          52.2        16.7            Pump toothpaste
                                    61.5        17.1            Diet drink mix
                                    43.5        15.6            Fruit sticks
                                    2.8         4.7             Stay fresh milk
                                    56.3        12.3            Salad dressing
                                    42.9        3.8             Home computer
                                    12.5        11.4            Cordless phone
                                    0.0         2.7             Touch lamp
                                    0.0         0.0             Cordless iron
                                    0.0         1.3             Shower radio
Tauber 1975 (c)                     26          12              Packaged goods
Infosino 1986 (d)                   34.5        9.7             Service option
Pickering and Isherwood 1974 (e)    37.5        7.6             Automobile

(a) Derived from Table 3, based on six month probability scale. Intenders defined as those in the "Definite, Probable" intentions class, non-intenders, those in the "No" intentions class.
(b) Derived from Table 1. Intenders defined as those who "Definitely/probably will buy", non-intenders, those who "Definitely/probably will not buy".
(c) Derived from Table 1. Intenders defined as those with "Positive purchase intent, solves need", non-intenders, those with "Negative purchase intent".
(d) Derived from Figure 1. Intenders defined as those with likelihood rating 7-10, non-intenders, those with likelihood rating 1-6.
(e) p. 205, derived from Theil and Kosobud, 1968.
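Table 1 reports purchase rates conditional on stated intention, whereas the model is parameterized by $p_{11} = P(y = 1 \mid w = 1)$ and $p_{00} = P(y = 0 \mid w = 0)$. The Python sketch below shows one way to pass from the first pair of quantities to the second by Bayes' rule. It is offered only as an illustration: the function name and the intender share `q` are assumptions introduced here, the value of `q` used in the example is arbitrary, and the sketch is not claimed to reproduce formulas (15) and (16).

```python
def misclassification_from_conversion(theta11, theta00, q):
    """Bayes-rule conversion (illustrative; all names are hypothetical).

    theta11 = P(w=1 | y=1): share of stated intenders who purchase.
    theta00 = P(w=0 | y=0): share of stated non-intenders who do not purchase.
    q       = P(y=1): assumed share of respondents stating an intention.
    Returns (p11, p00) = (P(y=1 | w=1), P(y=0 | w=0)).
    """
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    buyers = theta11 * q + (1.0 - theta00) * (1.0 - q)   # P(w = 1)
    nonbuyers = 1.0 - buyers                             # P(w = 0)
    if buyers <= 0.0 or nonbuyers <= 0.0:
        raise ValueError("conversion rates imply no buyers or no non-buyers")
    p11 = theta11 * q / buyers
    p00 = theta00 * (1.0 - q) / nonbuyers
    return p11, p00

# Home computer row of Table 1: 42.9% of intenders and 3.8% of
# non-intenders purchased. q = 0.5 is an arbitrary illustrative value.
p11, p00 = misclassification_from_conversion(0.429, 1.0 - 0.038, q=0.5)
```

Because the result depends on the intender share `q`, the same pair of conversion rates can imply quite different values of $p_{00}$ and $p_{11}$ in different samples.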

Table 2: Results of simulation study of Bayes estimator with 200 replications.

                                          Estimates of beta_0              Estimates of beta_1
                                          Bayes          Naive            Bayes          Naive
Block  n    beta_0  beta_1  p_00  p_11    Mean    SD     Mean    SD       Mean    SD     Mean    SD
1      200  -2      3       0.8   0.8     -2.51   0.81   -0.46   0.35     3.93    1.22   0.63    0.54
2      200  -2      3       0.9   0.6     -2.10   0.34   -0.85   0.12     3.19    0.52   0.57    0.14
3      200  2       -3      0.9   0.6     2.15    0.82   -0.09   0.10     -3.27   1.21   -0.41   0.09
4      200  -3      4       0.9   0.6     -2.88   0.86   -0.94   0.23     3.87    1.16   0.63    0.43
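The data-generating process behind one replication of the simulation design in Table 2 can be sketched in Python as follows. This is a minimal illustration of model (3)-(5) with a single covariate drawn uniformly on [0,1]; the function names `simulate_block` and `prob_intend`, the random seed, and the use of NumPy and SciPy are assumptions made for the example, not part of the original study.

```python
import numpy as np
from scipy.stats import norm

def simulate_block(n, beta0, beta1, p00, p11, rng):
    """Draw one data set from the misclassified-probit model (3)-(5)."""
    x = rng.uniform(0.0, 1.0, size=n)                # covariate on [0, 1]
    z = beta0 + beta1 * x + rng.standard_normal(n)   # latent variable z_i
    w = (z > 0).astype(int)                          # true purchase indicator
    u = rng.uniform(size=n)
    # y agrees with w with probability p11 when w = 1 and p00 when w = 0.
    y = np.where(w == 1, (u < p11).astype(int), (u >= p00).astype(int))
    return x, w, y

def prob_intend(x, beta0, beta1, p00, p11):
    """P(y = 1 | x) = p10 + (p11 - p10) * Phi(beta'x), with p10 = 1 - p00."""
    p10 = 1.0 - p00
    return p10 + (p11 - p10) * norm.cdf(beta0 + beta1 * x)

rng = np.random.default_rng(0)
x, w, y = simulate_block(200, -2.0, 3.0, 0.8, 0.8, rng)  # Block 1 settings
```

Setting `p00 = p11 = 1.0` makes `y` identical to `w`, which corresponds to the assumption implicit in the naive probit analysis.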

Table 3: Results of simulation study of hierarchical Bayes (HB) estimator and disaggregate (DIS) estimator with 100 replications.

                          RMSE, beta_0                      RMSE, beta_1
                          HB              DIS               HB              DIS
Block  n    M   sigma     Mean    SD      Mean    SD        Mean    SD      Mean    SD
1      200  12  0.50      0.308   0.082   0.445   0.116     0.736   0.302   1.051   0.348
2      100  12  0.50      1.176   0.448   1.478   0.272     0.368   0.121   0.541   0.159
3      200  12  0.25      0.618   0.196   0.911   0.272     0.224   0.057   0.313   0.081
4      200  6   0.50      0.704   0.327   1.030   0.407     0.306   0.091   0.484   0.151
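Two of the computations underlying Table 3 are simple enough to sketch in Python: the moment estimates of the prior parameters in equations (17) and (18), and the across-brand RMSE used as the accuracy measure in section 3.1. The function names and the small array of brand-level draws are hypothetical and serve only to illustrate the formulas; the full hierarchical procedure would instead sample $(\mu, \Sigma)$ from the conditional posteriors (19) and (20).

```python
import numpy as np

def prior_moment_estimates(betas):
    """Equations (17)-(18): sample mean and covariance (divisor M) of the
    current brand-level draws; betas has shape (M, p)."""
    betas = np.asarray(betas, dtype=float)
    mu_hat = betas.mean(axis=0)
    centered = betas - mu_hat
    sigma_hat = centered.T @ centered / betas.shape[0]
    return mu_hat, sigma_hat

def rmse_across_brands(estimates, truths):
    """Root mean squared error over the M brands for a single coefficient."""
    estimates = np.asarray(estimates, dtype=float)
    truths = np.asarray(truths, dtype=float)
    return float(np.sqrt(np.mean((estimates - truths) ** 2)))

# Hypothetical draws for M = 3 brands and p = 2 coefficients (illustration only).
betas = np.array([[-2.1, 3.2], [-1.8, 2.9], [-2.1, 2.9]])
mu_hat, sigma_hat = prior_moment_estimates(betas)
```

A small estimated covariance pulls the brand-level estimates toward the common mean, while a large one leaves them close to the disaggregate estimates, which is the pooling behavior described in section 2.3.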

Table 4: Parameter estimates for Morwitz and Schmittlein (1992) PC dataset. (a)

                            USE             INCOME          PROFESS         PACIFIC         STUDENT
Analysis  p_00    p_11      Mean    SD      Mean    SD      Mean    SD      Mean    SD      Mean    SD
1         0.691   0.812     1.163   0.164   0.773   0.154   1.853   0.204   0.539   0.150   0.112
2         1.000   1.000     0.513   0.069   0.320   0.067   0.813   0.071   0.213   0.063   0.060   0.070
3         0.658   0.907     1.128   0.140   0.753   0.148   1.652   0.145   0.472   0.163   0.102   0.185
4 (b)     1.000   1.000     1.235   0.076   0.792   0.072   1.546   0.078   0.627   0.069   0.052   0.078

(a) Numbers in the table represent posterior means and standard deviations of respective regression coefficients.
(b) Probit regression of actual purchase responses.

Table 5: Probit regression analysis of purchase intention versus purchase behavior and demographic covariates for the Morwitz and Schmittlein (1992) PC dataset.

Variable            Coefficient Estimate   Std Err   P-value
INTERCEPT           0.5212                 0.131     0.0001
PURCHASE   w        -1.3690                0.086     0.0001
USE        x_1      0.0181                 0.114     0.873
INCOME     x_2      -0.0314                0.113     0.781
PROFESS    x_3      -0.0491                0.112     0.661
PACIFIC    x_4      -0.0730                0.112     0.516
STUDENT    x_5      0.0234                 0.111     0.833