Division of Research
Graduate School of Business Administration
The University of Michigan
January 1985

A SECOND GENERATION OF MULTIVARIATE ANALYSIS: CLASSIFICATION OF METHODS AND IMPLICATIONS FOR MARKETING RESEARCH

Working Paper No. 414

Claes Fornell

FOR DISCUSSION PURPOSES ONLY

None of this material is to be quoted or reproduced without the expressed permission of the Division of Research.

The author thanks Jim Anderson, Paul Anderson, Richard Bagozzi, Bill Dillon, Oded Gur-Arie, J. Paul Peter, and Michael Ryan for their valuable comments on an earlier draft of this paper.


Abstract

A second generation of multivariate methods is now available to marketing research. Virtually all earlier multivariate techniques are special cases of these methods. The second generation methods are not only more powerful than earlier multivariate techniques; they also bring a new perspective on overall research methodology. The features of the new methods are discussed via comparisons to the familiar first-generation approaches and by explaining key issues in application and method choice. The new methods are classified according to these issues and according to their objectives and requirements. Examples that illustrate their value to theory as well as marketing decision-making are also presented. Finally, the impact of second-generation methods in terms of methodology for theory testing and measurement validity is examined. It is suggested that the application of the new methods will demonstrate that traditional approaches to measurement validity are not cogent.

INTRODUCTION

Beginning about two decades ago, the increasing availability of new computer technology led data analysis in marketing through a substantial change. The significance of this change and the growing popularity of multivariate analysis were manifested in what was called "the multivariate revolution in marketing research," the title of an oft-cited article written by Jagdish Sheth (1971). The new methods represented a shift from univariate and bivariate approaches to simultaneous analysis of multiple variables. Multiple regression, multiple discriminant analysis, factor analysis, principal components, multidimensional scaling, and cluster analysis rapidly became familiar to marketing researchers and were classified according to their assumptions and objectives (Sheth, 1971; Kinnear and Taylor, 1971; Horne, Morgan, and Page, 1973). What was a "multivariate revolution" in the early 1970s developed into a matter of course and was firmly established within academic (Myers, Massy, and Greyser, 1980) as well as commercial marketing research (Bateson and Greyser, 1982) only a few years later.

Today, another dramatic change is occurring (or about to occur) in marketing research. With contributions from psychometrics, econometrics, quantitative sociology, statistics, biometrics, education, philosophy of science, numerical analysis, and computer science, a new set of multivariate methods has become available. So far-reaching is the change from earlier multivariate analysis that it seems appropriate to speak of "a second generation of multivariate analysis" (Fornell, 1982a). Some of the new methods have already found widespread application in marketing; others have yet to be discovered by the field. One approach in particular, covariance structure analysis as implemented by the LISREL program (Joreskog and Sorbom, 1978, 1981, 1983), has become immensely popular. A recent special issue of the Journal of Marketing

Research (Nov. 1982) and the 1983 AMA Winter Educators' Conference Proceedings on Causal Modeling (Darden, Monroe, and Dillon, 1983) were primarily devoted to covariance structure analysis via LISREL.

The purpose of this paper is to highlight the fact that covariance structure analysis is but one method among other recently developed methods that, in many ways, require a new way of thinking about research methods and are bound to have a profound impact on all social sciences, including marketing. The similarities and differences among these methods will be discussed, basic considerations in their application will be presented, and the impact of the new methods on research methodology and practice in marketing will be assessed. To the extent that the methods have been applied in marketing, illustrations will be presented. For methods that have yet to be applied, illustrative suggestions for marketing usage will be discussed.

In order to understand the value as well as the difficulties associated with the new methods and why they can be said to represent a second generation of multivariate analysis, let us briefly look back and reexamine the nature and diffusion of the first generation methods.

THE FIRST GENERATION OF MULTIVARIATE METHODS

According to Sheth (1971), the rapid diffusion of multivariate analysis in marketing was due to a number of factors. Among them was the fact that the multivariate methods were generalizations and extensions of the then well-known simpler bivariate models. Another important reason, according to Sheth, was that the new methods were "largely empirical," in the sense that data preceded conceptualization. The implication was that the first generation of multivariate analysis entered marketing with a promise of freeing the researcher from many of the restrictive assumptions that were necessary in

econometrics and operations research and suggested an exchange of inferential statistics for fewer assumptions.

Similar to the transition from bivariate statistics to the first generation of multivariate analysis, the second generation methods represent generalizations and extensions of the first generation methods. In fact, all first generation methods are special cases of second generation methods. In contrast to a "data then conceptualization" approach, the second generation methods also allow a more theory-based approach to research. It is not that first generation methods operate in the absence of substantive theory (although they were often used in this manner), but that they are limited with respect to their ability to bring theory and data together. For example, such popular techniques as traditional multidimensional scaling, factor analysis, cluster analysis, principal components and, to a somewhat lesser extent, discriminant analysis, are not ideally suited for statistical testing of theoretical propositions because some or all of their estimators lack (known) properties necessary for statistical inference.

Multiple regression (including analysis of variance and analysis of covariance) results can be subjected to statistical tests, but they suffer from other limitations with respect to inclusion of theory. Specifically, standard multiple regression postulates a very simple model where all explanatory variables have a direct effect on a dependent variable. In other words, each explanatory variable is given the same status; the only difference between these variables is empirical: the estimation of regression weights. Another limitation of regression (including simultaneous equations) in this context, and especially when applied to behavioral data, is that it cannot incorporate what Blalock (1982) calls "auxiliary measurement theories." That is, the very process of measurement

involves theoretical assumptions that, if excluded from the empirical model, will confound results and bias estimates.

THE SECOND GENERATION OF MULTIVARIATE ANALYSIS

The essence of research methodology is to advance understanding by combining theoretical knowledge with empirical knowledge. Without theory, any statistical manipulation of data is of limited value. Without data, theory remains imaginary and abstract. Scientific progress depends on a continual dialogue between the two. In this dialogue, one cannot be totally separated from the other: data are interpreted in terms of the theoretical context and vice versa.

A fundamental feature of second generation multivariate analysis lies in the flexible interplay between theory and data. When theoretical knowledge is well developed, it is possible to let this knowledge have greater bearing on the analysis. When one has less confidence in theory, it is possible to let the data play a larger role. At the same time, second generation methods can also be used to perform "first-generation type analysis," because they are general models of the earlier methods. However, second generation methods require the analyst to be explicit about the theoretical knowledge, or the lack thereof, that he wishes to bring to bear on the analysis. Specifically, second generation methods combine theoretical and empirical knowledge by (1) modeling errors in observation (measurement or nonsampling error), (2) incorporating both theoretical (unobservable) and empirical (observable) variables into the analysis, (3) confronting theory with data (hypothesis testing), and (4) combining theory and data (theory building).

Most of the methods that can be said to belong to the second generation have been developed fairly recently. Their origins, however, can be traced back to the early works of Spearman (1904) and Thurstone (1931) in factor analysis, to Haavelmo (1943), Koopmans (1949) and others in econometrics, to Wright (1934) in path analysis, to Hotelling (1933, 1936) in principal components and canonical correlation, and to Richardson (1938) in multidimensional scaling.

With the exception of covariance structure analysis, second generation methods are just beginning to find application in marketing. Nevertheless, there have already been findings that would not have been discovered by first generation methods and findings that question the appropriateness of previous conclusions arrived at by first generation methods. For example, Aaker and Bagozzi (1979) found a significant negative relationship between salespeople's role ambiguity and self-fulfillment in a study of industrial selling. Correlation or regression analysis would have suggested a much weaker relationship. Similarly, in evaluation research, a field related to marketing, researchers have arrived at different conclusions when data have been reanalyzed using the new methods. Sorbom and Joreskog (1982), for example, question Keeves' (1972) finding that "home environment" is unrelated to "mathematics achievement" for school children. When measurement error was taken into account and the data were subjected to covariance structure analysis, it was found that there was a strong and significant relationship. A perhaps even more drastic example is the reevaluation of the effect of the Westinghouse Head Start program on school achievement. The negative findings from applying traditional analysis of covariance to the data were used to justify the decision to phase out the summer programs for school children (Magdison and Sorbom, 1980). When measurement considerations were incorporated in an

explicit auxiliary theory about the measurement process, Magdison and Sorbom's (1980) covariance structure analysis found a weak but positive relationship between participation in the summer program and school achievement. Thus, by explicitly modeling both substantive theory and measurement theory, insignificant findings have been shown to be significant and negative relations have been shown to be positive.

Similar types of results have been reported in marketing by Phillips, Chang, and Buzzell (1983). Again by using covariance structure analysis in conjunction with explicit measurement theory, these researchers found that, contrary to popular opinion, high product quality is not incompatible with low cost for most industries. In terms of correlation coefficients, Phillips et al. found positive and significant relationships between product quality and cost for consumer durables. In their covariance structure analysis, however, the relationship is insignificant (and negative).

Not surprisingly, the second generation methods are more powerful and complicated than the first generation methods. They certainly demand more of the analyst. The characteristics used to classify the first generation methods (functional-nonfunctional, metric-nonmetric, number of dependent variables) are not sufficient or relevant for second generation methods. Therefore, before discussing the specific methods, it may be helpful to explicate some major considerations: the nature of theoretical variables, the nature of the relationships between theoretical variables, and the nature of the relationships between empirical and theoretical variables (Fornell, 1982b).

Theoretical Variables

Theory is a form of abstraction. Theoretical variables are, by definition, abstract. Abstract variables are not directly observable and can only be applied on the basis of indirect observations, proxies, or imperfect

empirical indicators. Examples of abstract variables in marketing, as in other fields, abound: personality and attitude in consumer behavior, demand elasticity in pricing, retail power in distribution, the marketing mix, and product performance in strategy.

A theoretical variable (sometimes referred to as a hypothetical construct or theoretical concept) represents an abstract nonobservational phenomenon that obtains its conceptual meaning via its relationship to other theoretical variables in some network of which it is a part. The empirical meaning of a theoretical variable stems from its linkages to observations in the empirical world. Thus, empirical meaning is furnished by observation and conceptual meaning by a system of theoretical hypotheses. (For a discussion, see Bagozzi and Fornell, 1982.) Let us now examine how empirical and conceptual considerations are combined into the construction of theoretical variables.

Depending on the particular second generation method, two types of theoretical variables can be distinguished: defined and indeterminate. A defined variable can be expressed as a function of its observed variables (indicators). In this sense, it is similar to a principal component. However, it differs from principal components in that the relationship to its indicators is determined by the theoretical context (conceptual meaning) as well as by the indicators themselves (empirical meaning). An indeterminate variable, like a factor in traditional factor analysis, cannot be expressed as a function of its empirical indicators (without including an error term).1 Otherwise, it is constructed in the same manner as the defined theoretical variable; that is, its content (i.e., variance) is determined by both empirical and theoretical considerations.
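The contrast between the two types of theoretical variables can be written compactly. The following is only a sketch in notation assumed here for illustration (it is not taken from the paper): a theoretical variable $\eta$ with indicators $x_1, \dots, x_p$, weights $w_i$, loadings $\lambda_i$, and measurement errors $\delta_i$.

```latex
% Defined theoretical variable: an exact weighted composite of its indicators
\eta = \sum_{i=1}^{p} w_i \, x_i

% Indeterminate theoretical variable: each indicator depends on the variable
% plus an error term, so \eta cannot be written as an exact function of the x_i
x_i = \lambda_i \, \eta + \delta_i , \qquad i = 1, \dots, p
```

In the first form the theoretical variable contains no error term of its own; in the second, any value computed from the indicators is only an estimate of $\eta$ as long as the $\delta_i$ have nonzero variance.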

The characteristics of defined and indeterminate variables have long been debated in the psychometric literature (see Steiger, 1979) and involve a trade-off of desirable characteristics. Basically, the choice is dependent upon the analyst's confidence in data vs. confidence in theory. If the analyst has strong confidence in the theory but considers the data to be full of random noise, indeterminate variables with subsequent correction for attenuation (due to random measurement error) would be preferable. This would move the analysis "away" from the data and "closer" to the theory. If, on the other hand, more faith is placed in the accuracy of the data, the analyst would like to remain "closer" to the empirical level and defined variables would be more appropriate.2 Thus, the choice between the types of theoretical variables has implications for the weighting of empirical vs. theoretical knowledge in the analysis. How this is accomplished varies between specific methods and will be discussed later.

Relationships Between Theoretical Variables

Irrespective of how the theoretical variables were constructed, the relationship between them can be described in linear terms as orthogonal, symmetric, recursive, nonrecursive, or "causal."

Orthogonality, of course, implies independence. Orthogonality can be invoked (1) for purposes of interpretation by separating variance components into distinct parts (exploratory studies), (2) as suggested by theory (confirmatory studies), or (3) as necessary for identification.

Symmetric associations make no distinction in the directionality of a relationship. The most commonly used such associations in marketing are the Pearson product moment correlation coefficient and the Euclidean distance measure (e.g., in multidimensional scaling).
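The correction for attenuation mentioned above in connection with indeterminate variables can be illustrated with Spearman's classical formula, which divides an observed correlation by the square root of the product of the two reliabilities. The sketch below is illustrative only: the function name and the numerical values are hypothetical and do not come from any study cited in the paper.

```python
import math


def correct_for_attenuation(r_xy, rel_x, rel_y):
    """Spearman's correction: estimate the correlation between two
    error-free variables from the observed correlation r_xy and the
    reliabilities of the two measures."""
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(rel_x * rel_y)


# Hypothetical values: an observed correlation of .30 between two
# measures with reliabilities of .60 and .75.
observed_r = 0.30
disattenuated = correct_for_attenuation(observed_r, rel_x=0.60, rel_y=0.75)
print(round(disattenuated, 3))
```

The lower the reliabilities, the further the corrected value moves from the observed correlation, which is the sense in which the analysis moves "away" from the data and "closer" to the theory.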

An example of directional relationships is provided by regression coefficients, which indicate the expected change in a dependent variable for a given change in an independent variable. There are two types of directional relationships: recursive and nonrecursive. Recursive relationships are unidirectional whereas nonrecursive relationships are bidirectional or cyclical.

The strongest form of inference is provided by "causal" relationships. Second-generation analysis assumes that one is interested in patterns, processes, or systems of behavior. If patterns exist, we can think in terms of causal sequences (Zaltman, LeMasters, and Heffring, 1982). Thinking causally about a problem and constructing a model that reflects causal processes often facilitates a clearer statement of hypotheses (Van Meter and Asher, 1973). By thinking in causal terms, it is also possible to eliminate some alternative explanations and include others that fit better into the particular sequence under consideration (Zaltman, LeMasters, and Heffring, 1982).

Obviously no statistical technique by itself is powerful enough to furnish causal laws of science. Yet, some second-generation methods are able to take a step further than previous methods in the analysis of causal systems. As is well known from introductory statistics, one cannot infer causality from correlations or regression coefficients. However, if one knows (hypothesizes) the causal ordering (theory) in a system of variables, it is possible to reason in the other direction, i.e., the specified theoretical system (model) implies what the data (e.g., correlations, covariances) should look like. The theoretical structure imposed on the data makes it possible to decompose covariances (correlations) into direct, indirect, and spurious effects as well as into unanalyzed (exogenous) associations. When there is agreement between the empirical covariances (correlations) and the sum of

their components (as suggested by the theoretical structure), there is evidence in favor of the theory.

To avoid misunderstandings about the nature of theoretical relationships, it should be recognized that whether a relationship can be construed as "causal," directional, or symmetric is not something that is inherent in a particular technique. The interpretation of the nature of the relationship depends on both a priori theoretical knowledge and empirical findings. With sufficient confidence in theory, a single correlation coefficient might be interpreted as causal. However, the theory requirements are quite formidable. Consider, for example, the relationship between market share and return on investment (ROI). Cross-sectional data from the PIMS project (Buzzell, Gale and Sultan, 1975) suggest both a positive correlation between these variables and that market share is a key to profitability. Assuming that the measures are solid, under what circumstances can this correlation be interpreted in causal terms? According to Simon (1954), it is necessary that we have a theory that specifies (1) the direction of the relationship and (2) that the residuals of the two variables are orthogonal. The orthogonality requirement is a very stringent one, for it implies that there are no omitted variables that affect both market share and return on investment. In other words, it is necessary that we have a self-contained or closed system.

In most situations in marketing, we never know for certain if all relevant variables are included or where to close the system. With respect to the market share-ROI correlation, it is obvious that we are dealing with an open system. There are many additional variables that affect both market share and ROI. It is not even clear that we can specify the directionality of the relationship.
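The decomposition of a correlation into direct and spurious components, and the consequence of leaving out a variable that affects both market share and ROI, can be sketched with a small standardized path model. Everything in the sketch is hypothetical: the omitted common cause `Z`, the path coefficients `a`, `b`, and `c`, and their numerical values are assumptions chosen for illustration and are not estimates from the PIMS data.

```python
import numpy as np

# Hypothetical standardized path model (all variables have unit variance):
#   share = a * Z + e1
#   roi   = b * share + c * Z + e2
a, b, c = 0.6, 0.2, 0.5

# Correlations implied by the model.
direct = b              # direct effect of share on ROI
spurious = a * c        # association created by the common cause Z
r_share_roi = direct + spurious
r_z_share = a
r_z_roi = c + a * b     # direct effect of Z plus its indirect effect via share

# Ignoring Z, the bivariate correlation would be read as the effect of share.
print("implied share-ROI correlation:", round(r_share_roi, 3))

# Including Z: solve the normal equations R_xx * beta = r_xy for the
# standardized regression of ROI on (share, Z).
R_xx = np.array([[1.0, r_z_share],
                 [r_z_share, 1.0]])
r_xy = np.array([r_share_roi, r_z_roi])
beta = np.linalg.solve(R_xx, r_xy)
print("recovered paths (share, Z):", np.round(beta, 3))
```

With the common cause omitted, the whole correlation would be attributed to market share; once it is included, the implied correlations reproduce the hypothesized paths. This is the sense in which a specified model implies what the correlations should look like.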

If we do not have enough confidence in our theory to suggest a causal interpretation from a single correlation coefficient, how can we move towards causal analysis? The answer lies in theory specification. The more complete the specification of hypotheses in terms of systems of variable relationships, the greater the probability that our hypotheses will be rejected. But if we are successful and our hypotheses are not rejected, more credence can be given to a causal interpretation of the findings (cf. Popper, 1962). While our theoretical hypotheses may not be accurate, it is necessary that they are explicit. That is, we must bring our expectations and a priori notions out into the open, because they make up the context in which the results are interpreted. Otherwise, the interpretation becomes, as in exploratory analysis, ambiguous and tentative. This is not to say that exploratory analysis is not useful, as long as it is recognized that the findings are merely suggestive and that they are not subject to statistical inference.

Relationships Between Empirical and Theoretical Variables

We have discussed the nature of theoretical variables and the relationship between them. Let us now turn to the relationship between theoretical and empirical variables. The link between theoretical and empirical variables is often referred to as epistemic correlations, epistemic relationships, or correspondence rules. There are three types of epistemic relationships embodied by second generation methods: reflective indicators, formative indicators, and symmetric indicators.3 Figure 1 depicts the three different relationships.

Reflective indicators (Figure 1a) suggest that one or more underlying unobservables "cause" the observables. Classical true score measurement theory

is a special case of reflective indicators that assumes a single underlying unobservable for a distinct group of observables. Examples in marketing might be such constructs as consumer attitude and personality, which are unobservable and typically considered underlying causes of overt behavior or measured scores on attitude and personality scales.

If formative indicators are used, the unobservables are conceived of as effects rather than as causes. Consequently, the arrowheads in Figure 1b are directed toward the theoretical variable. Formative indicators are typical of experimental designs where the researcher manipulates one or more of the empirical variables and the theoretical unobservables are taken as dependent on these empirical variables (for an example, see Bagozzi, 1977). The other category of application of formative indicators is when the theoretical variable is indeed "formed" from one or more observables. For example, we cannot directly observe "the marketing mix," but we know that it is composed of observables such as price, number of retail outlets, advertising expenditures, and so forth. The same logic can be applied to theoretical variables such as consumers' social status (which might be formed by occupation, education, income, location of residence) or to any theoretical variable that can be viewed as a composite effect of observables.

The third type of empirical indicator variable is the symmetric indicator (Figure 1c). Here, there are no assumptions of causality or directionality. This type of epistemic relationship is particularly useful when it is difficult to distinguish between cause and effect.

Combinations of formative and reflective indicators are also possible. In any given model, some constructs may be reflective, others formative (Figure 1d, e). Even within a single construct, it is possible to use both types of indicators. For example, a manufacturer's power in the distribution

network might be formed by its expert power, coercive power, and referent power which, in turn, might be reflected in observables such as retailer and distribution margins, shelf space allotted to the manufacturer, the degree of cooperative advertising, and so forth.

THE METHODS

The requirements for a method to be a member of the second generation of multivariate analysis are that the method be able to analyze (1) multiple criterion and predictor variables; (2) unobservable theoretical variables; (3) errors in measurement (in one form or another); and (4) confirmatory applications. By confirmatory, it is merely implied that the analyst must make some explicit substantive (theoretical) and measurement assumptions or hypotheses that can be tested statistically. While some first generation methods can address one to three of the aspects above, none is well equipped to deal with all four. For example, traditional factor analysis handles unobservable variables but is not confirmatory; multiple regression can be applied in a (weak) confirmatory sense by testing the significance of estimated parameters and the regression equation, but it is limited to a single observable criterion variable.

Figure 2 presents the methods of the second generation and their relationships to the most common methods of the first generation. All first generation methods are special cases of one or more second generation methods. For example, multiple regression, multiple discriminant analysis, analysis of variance and covariance, and principal components are all special cases of canonical correlation (Knapp, 1978) which, in turn, is a special case of PLS (Wold, 1975) and covariance structure analysis (Bagozzi, Fornell, and Larcker, 1981).
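The statement that multiple regression is a special case of canonical correlation can be checked numerically: with a single criterion variable, the one canonical correlation equals the multiple correlation R. The sketch below rests on assumptions that are not part of the paper: it uses simulated data, a hypothetical helper `canonical_correlations`, and a QR-plus-SVD route to the canonical correlations that presumes the data matrices have full column rank.

```python
import numpy as np


def canonical_correlations(X, Y):
    """Canonical correlations between two sets of variables
    (rows are observations, columns are variables)."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for the column spaces of the centered data.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # The singular values of Qx'Qy are the canonical correlations.
    return np.linalg.svd(Qx.T @ Qy, compute_uv=False)


# Simulated data: three predictors and one criterion (hypothetical weights).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([0.5, -0.3, 0.2]) + rng.normal(size=200)
Y = y.reshape(-1, 1)

# Multiple correlation R from ordinary least squares with an intercept.
Xd = np.column_stack([np.ones(len(y)), X])
coef, *_ = np.linalg.lstsq(Xd, y, rcond=None)
multiple_R = np.corrcoef(y, Xd @ coef)[0, 1]

rho = canonical_correlations(X, Y)
print(np.isclose(rho[0], multiple_R))
```

When Y has several columns, the same function returns one canonical correlation per pair of canonical variates, in decreasing order.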

An example of each method is provided in Figure 3. Defined theoretical variables are drawn as solid circles; indeterminate theoretical variables are represented by broken circles. A criterion theoretical variable is denoted Y and the corresponding predictor variable is denoted X. Empirical variables are drawn as squares and denoted x or y, respectively.

Residual variables or disturbance terms in the structural portion of the model (the part of the model that represents the substantive theory) are, by convention, denoted ζ. Disturbance terms are causal effects on criterion variables due to factors outside the model. Residual variables represent variation in the criterion variables when the effects of the included variables have been partialled out. Thus, the residual variables are orthogonal to included variables; disturbance terms are not. In all models illustrated in Figure 3, ζ is not correlated with other variables. Hence, here it represents residual variance. This is a necessary condition for interpreting the effects of the X's on the Y's and is equivalent to what was discussed earlier as "closing the system" (Simon, 1954).

Residual variables or disturbance terms in the measurement portion of the model (the part of the model that represents the auxiliary measurement theory) are, by convention, denoted δ (for x-variables) and ε (for y-variables). As with the corresponding variables for the structural equation, it is necessary that these terms are orthogonal (i.e., that they are residuals, not disturbances) unless there are some very specific reasons to assume otherwise (for a discussion, see Fornell, 1983; Gerbing and Anderson, in press). Consequently, all ε's and δ's are drawn as residual measurement variables uncorrelated with other variables.
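The notation just described can be summarized in a generic set of equations. This is a sketch under assumptions rather than a formula from the paper: the coefficient matrices $\Gamma$, $\Lambda_x$, and $\Lambda_y$ are names introduced here for illustration, the structural residual is written $\zeta$ and the measurement residuals $\delta$ and $\varepsilon$, and the measurement equations are shown for reflective indicators only (with formative indicators the theoretical variable would instead be written as a weighted combination of its indicators).

```latex
% Structural portion (substantive theory)
Y = \Gamma X + \zeta

% Measurement portion (auxiliary measurement theory), reflective case
x = \Lambda_x X + \delta , \qquad y = \Lambda_y Y + \varepsilon

% Residual (rather than disturbance) interpretation
\operatorname{Cov}(\zeta, X) = 0 , \quad
\operatorname{Cov}(\delta, X) = 0 , \quad
\operatorname{Cov}(\varepsilon, Y) = 0
```

The zero covariances express the requirement that the terms be residuals, orthogonal to the included variables, which corresponds to treating the system as closed.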

Canonical correlation occupies an intermediate position between first and second generation methods. Let us therefore begin the discussion of methods with this method.

Canonical Correlation

Since its introduction to marketing by Green, Halbert, and Robinson (1966), canonical correlation has been used in numerous studies. Interpretational aids have been suggested by Stewart and Love (1968), Alpert and Peterson (1972), Lambert and Durand (1975), Fornell (1978), Cliff and Krus (1976), Perreault and Spiro (1978), and DeSarbo et al. (1982). In a more confirmatory sense, Bagozzi, Fornell, and Larcker (1981) and Wildt, Lambert, and Durand (1982) have shown how to subject canonical solutions (weights and loadings) to statistical tests.

A canonical correlation model is depicted in Figure 3a. In the example there are three x-variables and three y-variables (indicated by squares). These are the observed empirical variables. Two theoretical variables (X1, X2) are formed from the x-variables and two theoretical variables (Y1, Y2) are formed from the y-variables. ζ1 and ζ2 represent error in equations ("errors in theory"). The δ's and ε's represent residual variance in the measurement equation ("errors in variables") for the x-side and y-side, respectively. The objective of canonical correlation is to maximize the correlation between the theoretical variables X1 and Y1, which is equivalent to minimizing the variance of the residual variable ζ1. Subsequently, the correlation between X2 and Y2 is maximized (equivalent to minimizing the variance of ζ2).

As depicted in Figure 3a, there are two types of theoretical variable relationships: orthogonal and symmetric. As indicated by the dotted lines

the relationships X1-Y1 and X2-Y2 are symmetric. This implies that the directionality is not specified. The correlations X1-Y2 and X2-Y1 are zero (orthogonal) in canonical correlation.

As indicated by the figure, the canonical model employs defined (determinate) theoretical variables; each theoretical variable is a weighted aggregate of the corresponding empirical variables. The substantive theory affects the estimation of the weights by determining them such that the residual variance in the structural portion of the model is minimized. As indicated by the arrowheads pointing toward the theoretical variables, all empirical-theoretical variable relationships are formative.

Obviously, one does not need a particularly elaborate theory to use canonical correlation. The assumptions on data are also minimal. The method can be applied to metric data as well as nonmetric contingency table data (Goodman, 1981). However, in order to use canonical correlation in a confirmatory sense, more a priori knowledge must be incorporated into the analysis. This can be done via rotation (Cliff and Krus, 1976) using a Procrustes procedure (Fornell, 1982c).

An illustration of confirmatory canonical analysis is provided in a study of the relationship between role strain and the performance of salespeople (Fornell, John, Stern, and Triki, 1984). The theory, in part based on individual coping responses, suggests that two contrasting relationships may simultaneously exist between role strain and performance. In other words, role strain may be both positively and negatively related to performance. Thus, traditional correlation or regression coefficients would produce coefficient estimates from contrasting forces that may cancel out. Consequently, different samples with different absolute values on role strain and

performance would generate conflicting results. Such results have indeed been reported by studies using traditional first generation methods, where both positive and negative as well as insignificant relationships have been found (House and Rizzo, 1972; Schuler, 1975; Tosi, 1971).

To briefly illustrate the confirmatory canonical analysis procedure, consider Figure 3a. The only difference between this figure and the model used in the role strain-performance study is in the number of y-variables and x-variables. In the study, there were two y-variables (role strain and performance) and several x-variables.

First, a traditional canonical correlation analysis was performed. Second, a priori notions were incorporated by constructing a "target matrix." Thus, if one expects role strain and performance to have both a positive and a negative relationship (depending on certain other variables), one would set the target matrix such that role strain and performance both have loadings (i.e., epistemic correlations) of the same sign in the first dimension (i.e., on Y1 in Figure 3a) and of opposite signs in the next dimension (i.e., on Y2 in Figure 3a). Other loadings (in this case for the explanatory variables) would also be specified with respect to sign. The analysis then proceeds with a rotation of the loadings matrix from the original canonical correlation toward the specified target matrix (which typically contains no elements other than +1 and -1). The objective of the rotation, which can be orthogonal as well as oblique, is to minimize the differences (in a least-squares sense) between the target matrix and the rotated matrix.

Consequently, the degree to which the hypotheses are supported by the data is reflected in the fit between the target matrix and the rotated matrix. An often used measure of fit is the coefficient of congruence

(Wrigley and Neuhaus, 1955), which is sensitive to both the pattern and magnitude of difference in the two matrices. The sampling distribution of this measure is not known, and statistical significance is perhaps best evaluated via jackknifing (see Wildt, Lambert, and Durand, 1982). Substantive significance can be evaluated via the size of the congruence coefficient (which can vary between 0 and 1). In the role strain-performance study, the coefficient was .92, indicating a very good fit between the target and the rotated canonical solution.

Redundancy Analysis and ESSCA

Redundancy analysis (van den Wollenberg, 1977) and External Single-Set Components Analysis (ESSCA) (Fornell, 1979) are generalized versions of multiple regression and principal components. The first formulation of the general approach can be found in Wold (1975). The two methods are illustrated in Figures 3b and 3c. Both minimize the residual variance of the empirical y-variables. That is, the trace (the sum of the diagonal elements) of the residual covariance matrix of the y-variables is minimized. This is equivalent to maximizing redundancy, which is an index of the proportion of variance of the y-variables that is accounted for by a linear combination of x-variables.

The difference between redundancy analysis and ESSCA is evident from Figures 3b and 3c. ESSCA is a MIMIC (Multiple Indicator-Multiple Causes) version of the redundancy model. As such it does not extract theoretical variables from the y-variables. The sole purpose of a theoretical variable in ESSCA is to account for the variances of the empirical y-variables. As can be seen in Figures 3b and 3c, both redundancy analysis and ESSCA use a combination of formative and reflective indicators (note that the arrowhead is always in the direction of the variable(s) to be explained) and the

theoretical variables are defined. The relationship between the corresponding pairs of theoretical variables is directional in redundancy analysis; in ESSCA there is no such relationship. The assumptions on data are the same as in canonical correlation.

It is somewhat surprising that redundancy analysis and ESSCA have found little application in marketing. One reason might be that they have not been included in widely distributed computer programs.4 Nevertheless, DeSarbo (1981) presents an algorithm for cases where the analyst is interested in both canonical correlation and redundancy; research applying ESSCA is underway at The University of Michigan.

The ESSCA study examines the relationship between market share and monetary company performance. A problem in estimating this relationship is the fact that market share can be defined and measured in a number of ways, and the same is true of company performance. Therefore, it is probably not appropriate to draw conclusions about the relationship based on a single measure of market share and a single measure of performance (such as return on investment) from correlation and regression coefficients. The relationship is probably more complicated and affected by a number of factors involving strategy and the competitive situation. ESSCA is being applied to shed some light on this issue. In essence, the question is: what kinds of market share measures are related to what kinds of performance measures? Referring to Figure 3c, this analysis, using longitudinal data at the individual business unit level, employs several x-variables (broadly measured market share, narrowly measured market share, market share relative to the largest competitors, etc.) as well as several y-variables (return on investment, earnings, operating income, etc.). The objective is to explain the variation in the various measures of monetary performance by some combination of the various measures of market share. Because the performance measures

are correlated and because one would not expect a unidimensional relationship, traditional multiple regression is not a suitable technique.

Partial Least Squares (PLS)

As shown in Figure 2, PLS is a general model for canonical correlation, redundancy analysis, and ESSCA (among second generation methods) and multiple regression, multiple discriminant analysis, analysis of variance and covariance, and principal components (among first generation methods). Introduced by Herman Wold in 1966, PLS is a method for estimating predictive-causal relationships. In contrast to canonical correlation, redundancy analysis, and ESSCA, PLS is a method for systems analysis. The previously discussed methods can handle multiple variables on both sides of the equation but are restricted to analysis of direct effects between theoretical variables. In PLS it is possible to model indirect as well as direct effects. This is illustrated in Figure 3d where the theoretical variables X1 and X2 have a direct effect on Y1 and an indirect effect on Y2.

The primary purpose of PLS is to predict empirical and/or theoretical variables. In other words, the analyst specifies the residual variances that are to be minimized. Estimation relies upon an iterative procedure in which each step involves the minimization of some residual variance with respect to a subset of the parameters, given either a proxy (fix-point constraint) or final estimates for other parameters. In the limit of convergence of the iterations, all residual variances are minimized jointly. In Figure 3d, for example, the residuals of the following variables are minimized: C 1 ' 2e, ~2, and S3. A PLS model can have both reflective and formative indicators. In contrast to redundancy analysis and ESSCA, the x-variables can be specified as reflective. Had that been the case in Figure 3d, the residuals δ1, δ2, and δ3 would also have been minimized. Thus, if a model where all indicators are formative is specified, the errors in the structural equation will be minimized. If all indicators are reflective, the minimizing principle covers both structural and measurement residuals.

Regarding the nature of theoretical variable relationships, PLS is flexible. Some variables (and some estimates) can be constrained to be orthogonal. Both symmetric and unidirectional relationships can be estimated. Bidirectional (nonrecursive) systems can be handled via a combination of the fix-point method (Wold, 1965) and PLS estimation (Hui, 1982). The evaluation of causal relations is based on the predictive quality of the relationships. It is also possible to compare the theoretical (hypothesized) correlation matrix with the correlation matrix of the estimated theoretical variables. Overall, the emphasis of PLS is on prediction, given a causal structure. Because the theoretical variables are defined, case values for both (predicted) empirical and theoretical variables are directly available. Thus, prediction can be made for both types of variables.

Meaningful application of PLS requires that the substantive theory can be spelled out in an arrow scheme (such as in Figure 3d, for example). The assumptions on data are weak. There are no distributional assumptions, nor are there assumptions about independence of observations or the metric of the measured variables. In fact, the categorical scaling in the Alternating Least Squares Optimal Scaling (ALSOS) approach discussed in a marketing context by Perreault and Young (1980) is a form of PLS estimation. Wold and Bertholet (1981) and Lohmoller (1983) discussed the PLS approach to analysis of contingency tables, and Lohmoller (1983) considered PLS analysis of mixed-scaled variables. Similar to canonical correlation, redundancy analysis, and ESSCA, PLS does not provide unbiased estimates. However, with increasing sample size

and number of indicators, the estimates are consistent (i.e., they approach the true value).

Thus far, there have been few marketing applications. Advertising effects have been analyzed by Jagpal (1981) and by Fornell, Tellis, and Zinkhan (1982); consumer exit/voice by Fornell and Bookstein (1982); consumer dissatisfaction by Fornell and Robinson (1983); and industrial buying behavior by Ryan and Holbrook (1983). Other applications of PLS have been in the area of market segmentation. However, these studies have been done in commercial marketing research and are of a proprietary nature. The basic advantage of PLS in this context is that it simultaneously identifies segments and functionally relates the segments to explanatory variables. It thereby avoids the suboptimization problems that arise if segments first are identified via principal components, multidimensional scaling, or some other structuring method and subsequently related to predictor variables via, say, multiple regression or discriminant analysis.

The results of the PLS applications are promising. Fornell and Robinson (1983) were able to specify a parsimonious model consisting of no more than three exogenous and two endogenous theoretical variables that accounted for more than 50 percent of the variation in consumer dissatisfaction with respect to over $200 billion of consumer spending in 23 industries. This result is in contrast to most previous studies of consumer satisfaction/dissatisfaction, where explanatory power is typically low. An interesting exception is the study by Churchill and Surprenant (1982), who recorded high explanatory power from using covariance structure analysis.

An example of a PLS application where first generation techniques would have been deficient is provided by Fornell and Bookstein (1982). At issue

was the relationship between market (manufacturer) concentration and consumer response to quality decline (or dissatisfaction). A theory developed by Hirschman (1970) distinguishes between consumer response in terms of consumer exit (e.g., brand switching) and in terms of voice (i.e., complaining). Because exit represents a direct revenue loss and voice typically involves some degree of complaint processing cost, it is important for management to know the factors that affect voice and exit. Hirschman's theory suggests that market concentration is negatively associated with exit and positively associated with voice. From census data and nationwide survey data, the product moment correlation coefficients between various measures of voice and concentration ratios were indeed positive as predicted by theory. However, the correlations between concentration ratios and exit were conflicting. Some were positive, others negative. The PLS analysis suggested that when concentration ratios were treated as formative indicators of an unobserved theoretical variable, only a small portion of their variance was relevant to the theory; and this variance was, in fact, negatively associated with exit.

Covariance Structure Analysis

Sometimes referred to as causal modeling, structural equations, or LISREL, covariance structure analysis is, by far, the most widely used second generation method. Regarding terminology, it should be noted that "LISREL" is a computer program and that "causal modeling" is a broader term than "covariance structure analysis."5 Further, the term "structural equations" belongs to traditional econometrics. Covariance structure analysis appears to be an appropriate term because it does not encompass several different methods and it actually says something about what the method does: it analyzes covariance structures.
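
As a rough illustration of what it means to analyze a covariance structure, the sketch below builds the covariance matrix implied by a hypothetical one-factor model with reflective indicators and compares it with a sample covariance matrix. The loadings, error variances, sample size, and the function name `implied_covariance` are assumptions chosen for illustration; they do not come from any study cited here, and the comparison shown is not the LISREL estimation procedure.

```python
import numpy as np

def implied_covariance(loadings, factor_var, error_vars):
    """Covariance implied by a one-factor reflective model:
    Sigma = factor_var * lambda * lambda' + diag(error_vars)."""
    lam = np.asarray(loadings, dtype=float).reshape(-1, 1)
    return factor_var * (lam @ lam.T) + np.diag(error_vars)

# Hypothetical parameter values (assumptions for illustration only).
loadings = [0.8, 0.7, 0.6]
error_vars = [0.36, 0.51, 0.64]
sigma = implied_covariance(loadings, 1.0, error_vars)

# Simulate indicator data from the same model and compute the sample covariance.
rng = np.random.default_rng(0)
n = 5000
factor = rng.normal(size=n)
errors = rng.normal(size=(n, 3)) * np.sqrt(error_vars)
data = factor[:, None] * np.asarray(loadings) + errors
sample_cov = np.cov(data, rowvar=False)

# Small residuals mean the hypothesized structure reproduces the observed covariances.
residual = sample_cov - sigma
print(np.round(sigma, 2))
print(np.round(residual, 2))
```

In an actual application the parameters are unknown and are estimated so that the implied matrix comes as close as possible to the observed one; here they are fixed in advance only to show the two matrices being compared.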

In 1968 Karl Joreskog, the Swedish statistician, gave a presentation before the Psychometric Society. According to Norman Cliff (1983), the 1980-1984 editor of Psychometrika, those who heard his presentation "knew they were hearing something important, but certainly did not appreciate how far reaching the application would be" (p. 115). Yet, the basic approach had already been introduced several years earlier in sociology by Otis Dudley Duncan (1966) and Hubert M. Blalock (1961, 1962, 1969a), drawing upon the works by Herman Wold and Lars Jureen (1953), Herbert Simon (1952, 1954), and Sewall Wright (1934). A problem with the early developments in sociology was the ad hoc approach to estimation of overidentified models. Lawley (1940) had formulated factor analysis as a statistical model, but the problem was not solved until Joreskog developed an efficient iterative maximum likelihood estimation method (1967, 1973) along with the computer program LISREL, now in its sixth commercial version (incorporated in the SPSS-X package).

Although both can be said to represent structural equation models, covariance structure analysis differs from PLS in several ways (see Fornell and Bookstein, 1982). Two of the more important differences concern the loss-functions and the degree of a priori knowledge necessary for meaningful application. First, the objective of covariance structure analysis is to recover the structure (as measured by covariances) of the observed data in terms of parameter matrices. The objective of PLS is to minimize error variances. Second, because the general covariance model is not identified, a priori knowledge is needed to impose constraints on certain parameters.

Even though most test statistics have been developed under the assumption of multivariate normality, it is possible to use categorical as well as ordinal variables in covariance structure analysis.
For example, discrete variables can be handled via polychoric correlations; combinations of discrete

and continuous variables can be handled via polyserial correlations (Olsson, 1979; Olsson, Drasgow, and Dorans, 1981). Muthen (1983) has developed a model that allows for a mixture of dichotomous, ordered polytomous, and continuous variables. Thus, it is quite feasible to use nonmetric indicators in covariance structure analysis as long as one assumes that the unobservable theoretical variables are continuous. Typically, one must also assume that these variables have a multivariate normal distribution. Recently, however, significant advances have been made in the development of asymptotically distribution-free estimates and statistics (Browne, 1983; Bentler, 1983b).

An example of a covariance structure model is presented in Figure 3e. Note that the indicators are all reflective, which follows the factor analysis implementation of true-score theory.7 As indicated by the broken circles, the theoretical variables are indeterminate. That is, the practical advantage of obtaining the case value distributions for theoretical variables is sacrificed for the theoretical advantage of having truly unobserved variables or, as Bentler (1983a) puts it, by having the dimensions spanned by the unobserved variables be greater than the dimensions spanned by the measured variables. In terms of analyzing relationships between theoretical variables, covariance structure analysis is very powerful. The analyst can specify orthogonality constraints, symmetric, unidirectional, bidirectional, and "causal" relationships.8

Most of the pioneering applications in marketing are due to Bagozzi, beginning with his doctoral dissertation (1976) and followed by numerous papers and a monograph on causal analysis in marketing (1980a). In the marketing journals literature, covariance structure analysis has also been applied to studies in consumer behavior (Arora, 1982; Burnkrant and Page, 1982; Punj and Staelin, 1983; Reilly, 1982; Ryan, 1982), personal selling

(Churchill and Pecotich, 1982; Aaker and Bagozzi, 1979; Bagozzi, 1980b), product adoption (Bearden and Shimp, 1982; Bagozzi, 1983), consumer satisfaction/dissatisfaction (Churchill and Surprenant, 1982; Anderson, Engledow and Becker, 1979; Bearden and Teel, 1980; Fornell and Westbrook, in press), and marketing strategy and distribution channels (Phillips, Chang, and Buzzell, 1982; Phillips, 1981; John and Reve, 1982).

It is probably fair to say that covariance structure analysis has had much more impact on behavioral studies in marketing than on marketing models and decision making. However, the applications in evaluation studies (Sorbom and Joreskog, 1982; Magidson and Sorbom, 1980) imply that covariance structure analysis could be useful in answering analogous questions in marketing. Consider the problem of determining the factors that drive sales and the decomposition of the individual influences of these factors. In employing first generation methods to problems such as these, there is a high risk of drawing incorrect conclusions because of the ramifications of measurement error and the focus on empirical associations. Even with moderate error in measurement and the presence of multicollinearity, coefficients estimated under the assumption of zero measurement error have substantial and unpredictable biases (Fornell, 1983; Bagozzi and Fornell, 1982). For example, if one is interested in finding out to what extent controllable (e.g., marketing mix variables) versus uncontrollable (e.g., competitor strategy and other environmental variables) factors affect sales, it is likely that the measurement of the uncontrollables is more difficult and results in greater (random) error than the measurement of the controllables. It is also likely that there is some degree of collinearity between controllables and uncontrollables (reactions to competitor strategy, environmental changes, and so forth).
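
The situation just described can be mimicked with a small simulation. The sketch below is only an illustration under stated assumptions: one controllable and one uncontrollable factor with unit variance, a correlation of 0.5 between them, equal true effects of 1.0 on sales, and random measurement error in the uncontrollable only. All variable names and numerical values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20000
rho = 0.5         # assumed correlation between controllable and uncontrollable factors
error_var = 1.0   # assumed random error variance in the measured uncontrollable

# True (error-free) factors with unit variance and correlation rho.
controllable = rng.normal(size=n)
uncontrollable = rho * controllable + np.sqrt(1 - rho**2) * rng.normal(size=n)

# Sales depend equally on both factors (true coefficients of 1.0).
sales = 1.0 * controllable + 1.0 * uncontrollable + rng.normal(size=n)

# Only the uncontrollable factor is observed with random error.
uncontrollable_obs = uncontrollable + np.sqrt(error_var) * rng.normal(size=n)

# Ordinary least squares with an intercept, ignoring the measurement error.
X = np.column_stack([np.ones(n), controllable, uncontrollable_obs])
coef, *_ = np.linalg.lstsq(X, sales, rcond=None)
b_controllable, b_uncontrollable = coef[1], coef[2]
print(round(b_controllable, 2), round(b_uncontrollable, 2))
```

The two printed coefficients can be compared with the true value of 1.0 that was used to generate the data.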
The result of applying ordinary regression to a situation like this is that the

estimated effect of the uncontrollables will be biased downwards and the estimated effect of the controllables (even if perfectly measured) will be biased upwards. Thus the marketer is given a false impression about the control of sales. The nature and degree of bias for individual variables depend on the ratio of true to error variances in these variables and the multicollinearity involved.

Confirmatory Multidimensional Scaling

The development of confirmatory MDS stems from the works of Shepard (1962) and Kruskal (1964) with recent contributions by Carroll, Pruzansky, and Kruskal (1980), Bloxom (1978), Lee and Bentler (1980), Borg and Lingoes (1980), Lingoes and Borg (1978), and Heiser and Meulman (1983). Another related stream of research has focused on probabilistic models (Richardson, 1938; Bechtel, 1976; Ramsay, 1969, 1977; Zinnes and Mackay, 1983).

There are several approaches for imposing a priori knowledge in MDS. The most common are via restrictions on distances (Borg and Lingoes, 1980) or parameters (Carroll, Pruzansky, and Kruskal, 1980) and by constraints on the coordinates (Lee and Bentler, 1980). Figure 3f illustrates the distances restriction approach. The basic idea is that we may have a priori notions about the structure of relationships. For example, suppose we know that the distance between A and Z is shorter than the distance between A and Y. By specifying regions of connected points, we do not have to pay attention to the coordinate axes. In other words, the approach of restriction on distances is coordinate free: no functional form has to be specified. If, on the other hand, one wants to specify a functional form such as ellipses or curved manifolds, the coordinates themselves could be constrained (see Lee and Bentler, 1980).
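
The sense in which a restriction on distances is coordinate free can be sketched in a few lines. The example below checks the ordering "A is closer to Z than to Y" on a hypothetical two-dimensional configuration, then rotates and shifts the configuration and checks again. The coordinates and the helper name `satisfies_order` are assumptions for illustration; the sketch only tests a given configuration against the restriction and does not estimate a constrained MDS solution.

```python
import numpy as np

def satisfies_order(points, closer_pair, farther_pair):
    """Return True if the distance within closer_pair is smaller than within farther_pair."""
    def distance(a, b):
        return np.linalg.norm(points[a] - points[b])
    return bool(distance(*closer_pair) < distance(*farther_pair))

# Hypothetical configuration (coordinates are assumptions, not taken from Figure 3f).
config = {"A": np.array([0.0, 0.0]),
          "Z": np.array([1.0, 0.5]),
          "Y": np.array([2.0, 2.0])}

# Rotate by 40 degrees and translate: the axes change, the distances do not.
theta = np.deg2rad(40.0)
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
moved = {name: rotation @ xy + np.array([3.0, -1.0]) for name, xy in config.items()}

ok_original = satisfies_order(config, ("A", "Z"), ("A", "Y"))
ok_moved = satisfies_order(moved, ("A", "Z"), ("A", "Y"))
print(ok_original, ok_moved)
```

Because the restriction is stated on distances alone, it holds or fails identically for any rotation or translation of the same configuration.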

Fornell and Denison (1981) applied facet theory and used restrictions on distances to demonstrate how confirmatory MDS can be used to assess convergent, discriminant, and nomological validity. Subsequently, their approach was developed further by incorporating a form of measurement error and the estimation of theoretical variables (Fornell and Denison, 1982). The approach can be illustrated by referring to Figure 3f. The minimum number of theoretical variables for distance constrained analysis is three. If only two theoretical variables are involved, the constraints are limited to the measurement model.9 The model in Figure 3f suggests that A is closer to Z than it is to Y. Similar to covariance structure analysis, this configuration can be tested against empirical data.10 The theoretical variables are defined as the geometric centroid of the respective empirical indicator variable points. The epistemic relationships are found by measuring the distance between indicator points and the corresponding centroid. Thus, the treatment of random measurement error variance follows classical measurement theory in the sense that the value of the unobserved variable is approximated by the expected value of the corresponding observed indicators; the more repeated measures (indicators) one has, the closer to the true values one would expect to come. Since the theoretical variable is a centroid, it is defined; since a distance is symmetric, both theoretical and epistemic relationships are symmetric.

Despite the fact that traditional MDS has been used extensively in marketing, confirmatory multidimensional scaling is one of the least known methods of the second generation. Nevertheless, the recent advances in constrained MDS can be very useful to marketing, especially in light of the criticism charging that traditional MDS is an atheoretical exercise whose

results are largely ad hoc (Green, 1975). For example, when used for product mapping, it is clear that competitive relationships can be modeled in a number of ways and, without a theoretical framework on which to rely for interpreting the results, it is difficult to make a strong case for favoring any particular outcome (Shocker and Stewart, 1983). Confirmatory multidimensional scaling offers a solution to this problem. It would seem that management, certainly in most situations, is not totally without knowledge about the competitive structure of brands in a market. As with all second generation methods, this knowledge, be it certain or shaky, should not be ignored. For example, management may have hypotheses about what a consumer perceptual map may look like in terms of product proximity, at least for some of the brands. These hypotheses can operate as constraints on the solution. The constrained solution is then compared to an unconstrained solution. If the difference is insignificant, there is support for the hypotheses.

For example, let us assume that A, Z, and Y in Figure 3f represent positions of brands in a two-dimensional perceptual map. Each brand's position is the centroid of its respective indicators. The indicator measures could be obtained from, say, a rank order of paired similarities, semantic differentials, and response latency. Let I represent an ideal point. The interbrand distances might be interpreted as measures of substitutability and the brand-ideal distances might be viewed as measures of consumer utility. The task of management is now to improve the competitive position of its brand, say A. This can be done by increasing its perceived utility (moving closer to I), by differentiation (moving away from Y and/or Z), by reducing the perceived utility of competitive brands (moving Y and/or Z away from I), and/or by altering consumer tastes (moving I closer to A).
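
A minimal sketch of the quantities involved, assuming made-up indicator coordinates for three brands and an assumed ideal point, is given below. Brand positions are computed as centroids of their indicator points, brand-ideal distances stand in for utility, and interbrand distances stand in for substitutability. None of the numbers comes from an actual study, and the names `indicators`, `ideal`, and `dist` are hypothetical.

```python
import numpy as np

# Hypothetical indicator points per brand (three indicators each) in a 2-D map.
indicators = {
    "A": np.array([[1.0, 1.0], [1.2, 0.8], [0.8, 1.2]]),
    "Z": np.array([[2.0, 1.0], [2.2, 1.2], [1.8, 0.8]]),
    "Y": np.array([[3.0, 3.0], [3.2, 2.8], [2.8, 3.2]]),
}
ideal = np.array([2.0, 2.0])  # assumed ideal point I

# Each brand's position is the centroid of its indicator points.
brands = {name: pts.mean(axis=0) for name, pts in indicators.items()}

def dist(p, q):
    """Euclidean distance between two points."""
    return float(np.linalg.norm(p - q))

# Brand-ideal distances (smaller = higher perceived utility).
utility_gap = {name: dist(pos, ideal) for name, pos in brands.items()}

# Interbrand distances from A (smaller = closer substitutes).
substitutability = {other: dist(brands["A"], brands[other]) for other in ("Z", "Y")}

# Constraint implied by the "increase perceived utility" strategy for brand A.
a_closest_to_ideal = all(utility_gap["A"] < utility_gap[b] for b in ("Z", "Y"))
print(utility_gap, substitutability, a_closest_to_ideal)
```

Whether such a constraint already holds in the unconstrained map gives a first, informal indication of how far perceptions would have to move under a given strategy.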
The feasibility of each of these strategies, or some combination thereof, can be assessed via

confirmatory multidimensional scaling. Each strategy implies certain constraints. For example, the constraint involved in increasing the perceived utility of A may specify that A be closer to I than are Y and/or Z. In a similar manner, the other strategies suggest certain constraints. The disparity between each constrained solution and the unconstrained solution could now be examined. Presumably, where the disparity is small, consumer perception and preference do not need to be altered a great deal, and the strategy is more likely to succeed than where the disparity is large. Once the strategy has been implemented, the degree of success could be ascertained by constraining distances in accordance with the desired solution and comparing against data collected after implementation of the strategy.

Finally, constrained MDS would also be useful for systems analysis where the analyst is primarily interested in the structure of the system and has data unsuitable for covariance structure analysis (e.g., the number of variables exceeds the number of observations, there are observational dependencies, complex nonlinearity or no known functional form, and/or discrete variables without underlying continuums).

Latent Structure Analysis

The final method to be discussed here is the latent structure model, developed by Paul F. Lazarsfeld over thirty years ago (Lazarsfeld, 1950). Similar to the early development of covariance structure analysis in sociology, however, the statistical estimation methods were rather primitive in the beginning. With the exception of an early attempt by Anderson (1954), it was not until recently that significant advances were made in estimation. The theory and practical application of latent structure analysis were greatly advanced by the work of Leo Goodman (1972, 1973, 1974, 1979) who

fitted Lazarsfeld's model into the overall framework of log-linear models and developed maximum likelihood estimation techniques for parameter estimation. Clogg (1977) implemented these estimation techniques in the MLLSA (Maximum Likelihood Latent Structure Analysis) program. A corresponding least squares estimation program was developed by Mooijaart and Kapel (1981). Similar to covariance structure analysis, it is possible to fix or constrain certain parameters to a given value, or to constrain parameters to equal each other.

Whereas all the methods discussed hitherto are based on analysis of bivariate cross-products (correlation, covariance, similarities, dissimilarities), latent structure analysis typically considers higher-order cross-products as well. Thus, the normality assumption becomes irrelevant. Latent structure analysis uses both discrete observed and discrete unobserved variables. Similar to covariance structure analysis, the objective is to test whether a latent factor (theoretical variable) accounts for observed associations. Latent structure analysis also estimates the relationships between theoretical variables via the conditional probabilities that a case (respondent) will be at some observed level, given that the case belongs to some latent class of the theoretical variable. These conditional probabilities are the epistemic relationships and have an interpretation similar to factor loadings in covariance structure analysis. They can also be expressed in both formative and reflective formulations.11

While latent structure analysis has advantages in its acceptance of discrete theoretical variables and higher-order relationships, it has limitations as a method for systems analysis. In particular, it faces difficulties with its inconsistent treatment of intervening variables. For example, an endogenous intervening variable may appear both as a logit of a probability and as a dummy variable (Winship and Mare, 1983). Similarly, the

specification given in Figure 3g is not distinguishable from a model where x1 is "causing" (or symmetrically related to) X, and where X is "causing" Y.

Recently, new advances have been made in the analysis of discrete data. Winship and Mare (1983) propose a method in which the analyst must determine if a particular indicator measures an inherently discrete phenomenon or if it is an effect of an underlying continuous variable. It is then shown that traditional path analysis can be applied to systems where some endogenous variables are discrete.

Since the first marketing application of latent structure analysis by Myers and Nicosia (1967), the approach has only recently begun to gain in popularity, primarily through the works of Dillon and Madden (Madden and Dillon, 1982; Dillon, Madden, and Mulani, 1983; Dillon, 1980; Madden, Dillon, and Weinberger, 1982). In these applications, restrictions are typically imposed on parameters and the goodness-of-fit of the model is tested with the chi-square likelihood ratio statistic.

A CLASSIFICATION SCHEME

Table 1 summarizes our discussion in the form of a classification scheme for second generation methods. All methods incorporate multiple dependent variables. Thus, the number of dependent variables cannot be used for classification (as it was for first generation methods). Perhaps a case can be made for retaining the metric/nonmetric distinction from the classification of first generation methods (Sheth, 1971; Kinnear and Taylor, 1971), but it would rest on a rather weak argument. The reason is twofold: first, virtually all variables that we measure take on a finite number of values and are therefore discrete or categorical. Second, even if one is unwilling to approximate such data by continuous models, it is not always clear what the choice of analysis

method should be. Consider covariance structure analysis, which assumes a parametric model. If there are categorical exogenous indicator variables, the distribution of these variables is not necessarily critical because one may treat them as fixed, which means that it is the conditional distribution that matters. If there are categorical indicator variables among the endogenous variables as well, one might nevertheless make the assumption that these variables reflect one or more continuous underlying normal variables and proceed via estimation of tetrachoric correlations from contingency tables. However, among the methods discussed here, only latent structure analysis allows for categorical theoretical variables. Even the requirement of multivariate normality is about to be relaxed with the development of asymptotically distribution-free estimates (Browne, 1983; Bentler, 1983a, 1983b) and the increasing usage of nonparametric estimates of standard errors such as the jackknife and the bootstrap (see Efron, 1981). The latter are not only distribution-free, but also applicable to small sample studies.

It is suggested that a meaningful choice of second generation method be based upon considerations of study objectives, nature of theoretical variables, nature of epistemic relationships, the type of theoretical variable relationships desired, the a priori (substantive) theoretical knowledge available, and the method's data requirements. Table 1 represents an attempt to classify the second generation methods according to these considerations.

The first distinction refers to the objective of analysis. Basically, second generation methods' primary focus is on either (1) prediction or (2) explanation (of structure). Canonical correlation, redundancy analysis, and ESSCA are essentially predictive in the sense that a causal structure is neither specified nor tested. PLS is predictive-causal, but primarily evaluated

in terms of its predictive qualities. Covariance structure analysis is causal-predictive with the emphasis on recovering the observed data structure. If the observed data structure (covariances) is satisfactorily accounted for, the predictive quality of the results may become important as well (Fornell and Larcker, 1981a, 1981b, 1984). Confirmatory (nonmetric) MDS and latent structure analysis are essentially structure recovering methods (although the latter also makes predictions (classifications) at the empirical variable level). Confirmatory MDS attempts to recover the original data in terms of distances; latent structure analysis seeks to account for contingency table associations.

The second consideration, defined vs. indeterminate theoretical variables, has an impact upon the treatment of measurement error and analysis of theoretical variable distributions (e.g., examination of individual cases, outliers, scaling). Covariance structure analysis and latent structure analysis involve indeterminate theoretical variables. From a theoretical point of view, this may be preferable (see Bentler, 1982, 1983a); these variables are truly unobservable. By contrast, canonical correlation, redundancy analysis, ESSCA, PLS, and confirmatory MDS use defined theoretical variables. Thus, these variables are indirectly observed; as such they can be viewed as approximations or proxies of the true underlying and unobservable variables.

As for the linkages between theoretical and empirical variables, the various epistemic relationships were discussed earlier. Note that PLS and latent structure analysis are the most flexible methods in this regard: the analyst is free to specify both formative and reflective relationships, as desired. In redundancy analysis and ESSCA, both formative and reflective indicators must be used. Recall that the objective of these methods is to account for the variance of the observed endogenous variables. As a result,

these variables must be in a reflective mode. On the exogenous side, the observed exogenous variables must be formative because these variables are used to form optimal linear aggregates (for explaining the variance of the observed endogenous variables). Covariance structure analysis is designed for reflective indicators. If formative indicators are used, the theory on which the correction for measurement error (i.e., attenuation) is based is no longer applicable. Confirmatory MDS requires the least of all second generation methods in terms of epistemic specification; the relationship is symmetric with no directionality implied.

PLS and covariance structure analysis are the most comprehensive methods with regard to the number of different types of theoretical variable relationships that the analyst may be interested in. In Table 1, it is suggested that PLS, covariance structure analysis, and (to a more limited extent) confirmatory MDS provide a means for causal interpretation (note 12). The degree to which causal inferences can be drawn depends, to a large extent, on the a priori credibility and elaborateness of the substantive theory. The tenability of theory is evaluated in different ways: in covariance structure analysis, the covariance (parameter) matrix of the specified model is confronted with the covariance matrix of all observed variables included in the model; in confirmatory MDS, the distances of the specified model are confronted with the original proximity measures of all observed variables included in the model; in PLS, the primary test refers to the predictive quality of the estimated relationships (notes 13 and 14).

All second generation methods allow theory to play a larger role than do the first generation methods. Canonical correlation is typically the least demanding among the second generation methods with respect to theory
specification: no system of relationships needs to be specified and directionality between the theoretical variables is not assumed. For confirmatory canonical correlation, it is necessary to specify a target matrix of loadings. However, the amount of detail necessary for the specification of the target is not overwhelming. Typically, ones (plus and minus) and zeros are the only numbers used. Redundancy analysis and ESSCA require somewhat more in terms of prior specification: one set of variables must be treated as exogenous, the other set as endogenous. Overall, however, these methods do not require elaborate a priori theory.

PLS is a flexible method. It can be specified as a canonical correlation and as redundancy analysis or ESSCA. In other words, PLS provides a powerful means for theory-data interaction. The more elaborate and the better specified the theory is, the more dominant a role it plays. For example, a well-specified theory may be reflected in a large system of variable relationships; a weaker or less elaborate theory may be reflected in models that are similar to canonical, redundancy, or ESSCA specifications.

Covariance structure analysis, confirmatory MDS, and latent structure analysis are primarily designed for theory testing and thus benefit more from a well-specified theory. The indeterminate nature of the theoretical variables in covariance structure analysis and latent structure analysis makes it possible to fit numerous different theoretical structures to the same data; theory is necessary to help sort out competing empirical results. Both these models also require a priori constraints for purposes of identification. These constraints should, to the extent possible, be based on theoretical arguments. Confirmatory MDS does not have indeterminate theoretical variables, but is still subject to similar requirements. The fewer indicators a theoretical
variable has and the fewer restrictions on its spatial location relative to other variables, the less determinate the solution becomes.

The final consideration with respect to the choice of method is empirical. The data requirements depend on (1) whether or not the analyst wants to draw statistical inference and on (2) the nature of the tests applied for this kind of inference. If the objective of the analysis requires estimation but not statistical hypothesis-testing, only covariance structure analysis puts strong demands on data (especially with maximum likelihood estimation). That is, a large sample is required (Anderson and Gerbing, in press; Bearden, Sharma, and Teel, 1982; Boomsma, 1982), the observations must be independent, an underlying multivariate normal distribution is assumed, and categorical variables are either treated as fixed effects or as having an underlying continuous scale.

If traditional statistical theory is employed for testing, then canonical correlation, redundancy analysis, ESSCA, and PLS all require assumptions of multinormality, independence of observations, and interval scaling. Confirmatory MDS transforms the data and applies the test to a population of points. As a result, it is not subject to the same data requirements (note 15). Similarly, latent structure analysis is designed for categorical variables and does not make distributional assumptions.

If the analyst is unwilling to make the assumptions of multinormality, and so forth, but still wants to subject the estimated parameters to statistical tests, there are several distribution-free approaches available. Wildt, Lambert, and Durand (1982) show how jackknifing can be applied to canonical correlation. Redundancy analysis and ESSCA results can be tested in the same way. PLS uses the Stone-Geisser (Geisser, 1974; Stone, 1974)
nonparametric test for predictive relevance (for reflective endogenous variable indicators) and standard errors obtained via jackknifing (note 16).

IMPLICATIONS

Theory Testing and Validity Assessment: Philosophy of Science Considerations

The adoption of the second generation of multivariate analysis implies fundamental changes in current marketing research methodology. It shifts the focus from empirical associations to systems analysis of relationships between theoretical constructs; it necessitates explicit hypotheses about both measurement and theory; and it challenges traditional approaches to validity assessment.

According to logical empiricism, a philosophy of science considered by several (Bagozzi, 1984; Zaltman, LeMasters, and Heffring, 1982; Sauer, Nighswonger, and Zaltman, 1982; Olson, 1981; Peter and Olson, 1983; Hunt, 1983; Deshpande, 1983; Anderson, 1983) to be a major influence on marketing research methodology, theory must be subjected to empirical verification before it can be accepted. However, the principle of verification has many problems. One of its most discussed difficulties has to do with the limitations of induction as conclusive proof (Popper, 1962; Chalmers, 1976). The basic problem is that no finite number of empirical tests can ever guarantee the truth of universal statements (see Anderson, 1983). In applying second generation methods, the limitations of induction are readily demonstrated: for every covariance structure model or latent structure model that is successfully fitted to the data, the critical and probing analyst will find many other models that fit the data equally well. Thus, the finding that a proposed model is congruent with the data does not verify a theory.
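The point that several models can fit the same data equally well can be illustrated with the simplest possible case. The sketch below is not taken from the paper: it assumes two hypothetical one-equation models for two observed variables ("x causes y" and "y causes x"), a made-up covariance matrix S, and illustrative function names. Both models are just-identified, so each reproduces S exactly and fit alone cannot discriminate between the two causal orderings.

```python
import numpy as np

def implied_cov_x_causes_y(S):
    """Implied covariance of (x, y) under y = b*x + e, fitted exactly to S."""
    phi = S[0, 0]                 # variance of x
    b = S[0, 1] / phi             # slope of y on x
    psi = S[1, 1] - b**2 * phi    # residual variance of y
    return np.array([[phi, b * phi],
                     [b * phi, b**2 * phi + psi]])

def implied_cov_y_causes_x(S):
    """Implied covariance of (x, y) under x = c*y + d, fitted exactly to S."""
    phi = S[1, 1]                 # variance of y
    c = S[0, 1] / phi             # slope of x on y
    psi = S[0, 0] - c**2 * phi    # residual variance of x
    return np.array([[c**2 * phi + psi, c * phi],
                     [c * phi, phi]])

# Hypothetical observed covariance matrix (illustrative values only).
S = np.array([[2.0, 0.8],
              [0.8, 1.5]])

sigma_a = implied_cov_x_causes_y(S)
sigma_b = implied_cov_y_causes_x(S)

# Both causal orderings reproduce the observed covariances exactly.
fits_equally = np.allclose(sigma_a, S) and np.allclose(sigma_b, S)
print(fits_equally)
```

With more variables the same phenomenon appears among over-identified models as well, which is why congruence with the data cannot by itself verify the theory behind a model.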

The most popular alternative to verification appears to be the Popperian (Popper, 1962) program of falsification. Unfortunately, Popper's theory of science, which lays no claims for conclusive falsification, has, in application, been reduced to a naive belief that it is possible to refute theory by observation. Mockingly referred to as "naive falsificationism" (Lakatos, 1970), its logical impossibility seems to have escaped attention among practicing researchers despite its abasement in philosophy of science (Duhem, 1953; Kuhn, 1962; Laudan, 1965, 1977; Chalmers, 1976; Suppe, 1977) and the reiteration of the criticism against it in the marketing literature (O'Shaughnessy, 1972; Sauer, Nighswonger, and Zaltman, 1982; Ryan and O'Shaughnessy, 1982; Anderson, 1983; Peter and Olson, 1983; Zaltman, LeMasters, and Heffring, 1982).

A major reason why theories cannot be conclusively falsified by data is that theory and data interact (Hanson, 1958; Kuhn, 1962; Laudan, 1965; Feyerabend, 1975). That is, data are always interpreted in the context of some theoretical frame of reference. Depending on the frame of reference, a single phenomenon can have several interpretations. For example, the definition of a market and the identification of market segments differ depending upon whether a "top-down" or a "bottom-up" framework is applied (Day, 1980). Marketing, as perhaps no other discipline, has continually emphasized both the importance of frame of reference for interpreting phenomena, and the fact that frames of reference differ among individuals. For example, suppliers and consumers view products quite differently. Consumer segments view products differently from one another. Accordingly, it should be easy for the marketing analyst to realize why the existence of data-theory interdependencies is an issue no longer contested in the contemporary philosophy of science literature (note 17), nor by leading methodologists (e.g., Blalock, 1982; Cook and
Campbell, 1979). The implications for research conduct, however, appear less well understood.

First, conclusive falsification is not possible because a failure to fit a given model (representing the theory) to the data may be due to measurement (observations), theory, or assumptions governing the use of method. Whatever the reason might be, whenever we obtain a poor fit, we do not know if it is because of inaccurate measurement or faulty theory.

Second, traditional methods for assessing measurement validity can be misleading. Traditionally, it is held that validity of measurement should be demonstrated before the measures are used in a substantive context and that this is a necessary prerequisite for theory development and testing. This view is rooted in naive positivism and assumes that theory and data are independent. Since this is not a cogent assumption, measurement validity cannot be meaningfully addressed in isolation from the theoretical context in which it occurs. As a result, it is necessary that convergent and discriminant validity be examined within the context of theory (i.e., construct or nomological validity). That is, the measurement model and the substantive (structural) model in the second generation methods should be analyzed simultaneously (note 18). Only after such an examination is it possible to draw conclusions about the quality of measurement and theory, although one is always interpreted in the context of the other. If one is changed (e.g., theory), chances are that the way we interpret the other (e.g., data) changes too. Again, second generation analysis operates accordingly. For example, if the substantive (structural) relations are changed in, say, a PLS or covariance structure analysis, the epistemic relationships (i.e., the loadings) may change as well. Of course, this does not suggest that one's measurement model always changes as a result
of a respecification of theory. It seems entirely possible that certain variables are more or less indifferent to certain differences in theory. The point is that it would be better to test for the extent of data-theory dependence than to assume it away as would be necessary if one first subjects measurements to validity testing via, say, confirmatory factor analysis, and subsequently employs the measures found "valid" in a substantive context.

DISCUSSION

Recently, it has been argued that much of the criticism of marketing research stems from a failure to meld together theoretical knowledge with empirical knowledge (Bagozzi, 1984). Regardless of how serious this problem might be, the second generation of multivariate analysis offers a remedy as it forces the analyst to make the theoretical frame of reference more explicit and provides a better interplay between theory and data.

The second generation multivariate analysis allows the analyst, via judicious choice of method and careful model specification, to determine the bearing of a priori knowledge relative to data in the analysis. Of particular importance here is the specification of how the theoretical model relates to the measurement model. Consider, for example, the specification of formative vs. reflective indicators. In the latter case, the theory is assumed to imply certain observations. In the former case, the observations imply something about the theory. Some methods allow for both formative and reflective indicators within the same model. Accordingly, the nature of the interplay between theory and data may well vary within a single model.

While the importance of making one's theoretical framework explicit can hardly be overstated, this does not mean that it is always necessary to have well specified and highly accurate theory in order to apply second
generation methods meaningfully. Recall that virtually all first generation methods are special cases of second generation methods. Hence, any "first generation type of analysis" (e.g., regression, factor analysis, etc.) can be accomplished with a second generation method. But there is a price to be paid for failure to be explicit. The price is in the interpretation of findings. That is, the more data-driven the analysis is, the more uncertain and ambiguous the interpretation of results. Further, the absence of a theoretical specification renders inferential statistical theory inapplicable.

The reluctance to accept the consequences of imprecise or implicit a priori thinking is widespread and probably constitutes the most common abuse of the application of covariance structure analysis (see Cliff, 1983; Fornell, 1983). The literature is replete with cases in which the model did not fit the data until it was adjusted in light of the very same data. Support for the model is then claimed with reference to a statistical test of fit. Clearly, a search for "theory" by attempting to fit several models to the same data makes conventional statistical inference methods invalid. Certainly, one cannot employ the likelihood ratio chi-square goodness-of-fit statistic for statistical inference if one has also used the modification index of the LISREL program to "improve" the model. In view of the availability of the automatic model modification option in LISREL VI, it seems appropriate to reissue a warning originally given for econometric estimates (Leamer, 1983): "There are two things you are better off not watching in the making: sausages and LISREL estimates."

Even if a model survives the statistical test without post hoc tampering, it is not conclusively confirmed. If the model fails the test, the theory which it embodies is not conclusively disconfirmed. This is so regardless of what methods are used. On the other hand, the more specific and elaborate
one's theory is, the more restrictions can be placed on the model, and the more likely it is that the model will be rejected. Thus, whenever a model with many restrictions on the parameters is found to be consistent with the data, more confidence can be placed in the credibility of the theory it represents. Therefore, even though conclusive falsification and verification are futile, attempts to falsify models via severe testing are still important (cf. Hauser, 1983). For example, a highly over-identified covariance structure model (i.e., a model with many parameter restrictions) that is consistent with the data would have more credence than a less restricted model with the same data consistency simply because the former has passed a more stringent test.

Even though the new methods are generalizations of the now well-known first generation multivariate methods, their diffusion throughout the research community might not be quite as rapid. This is because they will, no doubt, dispute previous findings (obtained via traditional methods) and, perhaps more significantly, challenge firmly held beliefs about criteria for measurement and theory validation. Further, and as already indicated, second generation methods are not always judiciously implemented and may themselves lead to questionable and easily disputed conclusions, something that will not facilitate their diffusion. Nevertheless, once the various barriers have been removed and researchers become more familiar with the new methods, their role in advancing knowledge in the field should be substantial.
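The notion of a "highly over-identified" model can be made concrete by counting degrees of freedom. The sketch below is not from the paper: it relies on the standard count for covariance structure models (distinct variances and covariances minus free parameters), and both the function name and the two example models are hypothetical.

```python
def covariance_structure_df(n_observed: int, n_free_params: int) -> int:
    """Degrees of freedom = distinct elements of the covariance matrix
    minus the number of freely estimated parameters."""
    n_moments = n_observed * (n_observed + 1) // 2
    return n_moments - n_free_params

# Hypothetical example with six observed variables (21 distinct moments).
# A restricted two-factor model: 6 loadings + 6 error variances
# + 1 factor correlation = 13 free parameters.
tight = covariance_structure_df(n_observed=6, n_free_params=13)

# A loosely restricted model that frees six additional parameters.
loose = covariance_structure_df(n_observed=6, n_free_params=19)

print(tight, loose)
```

Under these assumptions the restricted model leaves more ways in which the data could contradict it, which is the sense in which agreement with the data is the more stringent test for that model.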

NOTES

1) Indeterminacy in the context of second generation analysis refers to factor scores. In first generation methods, there is also rotational indeterminacy. That is, factor analysis and principal components solutions are not considered unique because there is an infinite number of rotations (e.g., varimax, quartimax) that can be done from one set of coordinate axes to another.

2) Another concern in the choice of defined vs. indeterminate theoretical variables is the analysis objective. If the objective involves prediction of empirical variables, defined variables are obviously preferable.

3) For discussions of different types of indicators, see Bagozzi and Fornell, 1982; Fornell, 1982a; Fornell and Bookstein, 1982; Hauser, 1973.

4) A simple iterative algorithm incorporating Johansson's (1981) extension to redundancy analysis is presented by Fornell and Barclay (1984). The algorithm is very easy to implement within many standard programs for multiple regression.

5) For example, PLS is a causal modeling method but not a method for covariance structure analysis.

6) Besides maximum likelihood, several other estimation techniques (instrumental variables, two-stage least squares, ordinary least squares, and generalized least squares) are available in LISREL VI.

7) In some cases, it is possible to specify covariance structure analysis models with formative indicators as well, but the measurement error conceptualization requires reflective indicators in these models.

8) For inferences about causal order, covariance structure analysis attempts to decompose covariances into direct, indirect, unanalyzed, and spurious effects. Substantive theory suggests what the true covariances should be and dictates the restrictions of the model such that it becomes overidentified.

9) This is equivalent to the number of degrees of freedom in covariance structure analysis. With two theoretical variables, the structural portion of the model has no degrees of freedom.

10) For a variety of tests that can be used, see Lingoes and Borg (1983), Dillon, Frederick, and Tangpanichdee (1982).

11) Clogg (1981) shows the importance of correct specification of the direction of epistemic relationships in an example of the relationship between the theoretical variables "latent opinion" and "latent vote intention." In this particular example, the sign of the estimated theoretical variable relationship depends on the formative/reflective specification of the "latent opinion" variable.

12) The extent to which a causal interpretation of the results in confirmatory MDS is warranted is limited by the symmetric nature of the parameters. Thus, it seems difficult to claim causal inference for individual parameters: it is the overall structure of the model that is being tested.

13) It is not clear how meaningful it is to compare covariance matrices (in the covariance structure analysis sense) in PLS. PLS does not attempt to account for a covariance structure and should probably not be evaluated in terms of it (for a discussion, see Dijkstra, 1983). Nevertheless, it seems possible to compare nested PLS models in terms of the structural portion of the covariance matrix. (For an example, see Zinkhan and Fornell, 1983).

14) The sensitivity of PLS and covariance structure analysis is affected by the data in opposite ways. As the correlations between observed variables decrease, the covariance structure becomes easier to fit whereas the PLS predictive criteria become more difficult to satisfy (see Fornell and Larcker, 1984).

15) Ways to test distance structures are still fairly rudimentary from a statistical viewpoint. Recent developments are reported in Lingoes and Borg (1983).

16) The computer program by Lohmoller (1981) computes the Stone-Geisser test. Fornell and Barclay (1984) have incorporated an additional program in Lohmoller's package that also gives jackknifed estimates and corresponding t-statistics for all estimated parameters.

17) Even proponents of logical empiricism admit the possibility that meanings of concepts depend on the theories about them, but argue that this is not always the case. Brodbeck (1982), for example, provides several illustrations of terms whose referential meaning is not dependent upon theory and argues that some things sustain their interpretation across different theories. Along the same line of reasoning, Levin (1979) argues that even though every description is theory-laden, it does not follow that no description is apodictically certain or purely observational.

18) For a divergent viewpoint, see Calder, Phillips and Tybout (1983), footnote 3.
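Note 16 and the discussion of distribution-free testing refer to jackknifed estimates and their t-statistics. The sketch below shows the generic delete-one jackknife standard error for an arbitrary statistic. It is an illustration under stated assumptions, not the procedure implemented in the programs cited above: the function names, the simulated data, and the use of a simple correlation as the statistic are all hypothetical.

```python
import numpy as np

def jackknife_se(data, statistic):
    """Delete-one jackknife standard error of statistic(data).

    Rows of `data` are treated as independent observations."""
    data = np.asarray(data)
    n = data.shape[0]
    # Recompute the statistic n times, each time leaving out one observation.
    leave_one_out = np.array(
        [statistic(np.delete(data, i, axis=0)) for i in range(n)]
    )
    deviations = leave_one_out - leave_one_out.mean()
    return np.sqrt((n - 1) / n * np.sum(deviations ** 2))

def correlation(xy):
    """Pearson correlation between the two columns of xy."""
    return np.corrcoef(xy[:, 0], xy[:, 1])[0, 1]

# Simulated data (illustrative only): 30 observations on two variables.
rng = np.random.default_rng(0)
x = rng.normal(size=30)
y = 0.6 * x + rng.normal(scale=0.8, size=30)
xy = np.column_stack([x, y])

r = correlation(xy)
se_r = jackknife_se(xy, correlation)
t_stat = r / se_r   # jackknife-based t-ratio for the estimate
print(r, se_r, t_stat)
```

The same resampling loop could in principle wrap any estimator whose inputs are independent observations, for instance a canonical weight or a PLS path coefficient, at the cost of re-estimating the model once per omitted case.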

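TABLE 1
A CLASSIFICATION OF SECOND GENERATION MULTIVARIATE METHODS

Canonical Correlation
  Primary objective: Prediction of Theoretical Variables
  Theoretical variables: Defined
  Epistemic relationships: Formative
  Theoretical variable relationships: Orthogonal, Symmetric
  Theory requirements: Weak
  Empirical requirements: Weak

Redundancy Analysis
  Primary objective: Prediction of Endogenous Empirical Variables
  Theoretical variables: As Above
  Epistemic relationships: Formative and Reflective
  Theoretical variable relationships: Orthogonal, Symmetric, Directional
  Theory requirements: Somewhat Stronger Than Above
  Empirical requirements: As Above

ESSCA
  Primary objective: As Above
  Theoretical variables: As Above
  Epistemic relationships: As Above
  Theoretical variable relationships: Orthogonal
  Theory requirements: As Above
  Empirical requirements: As Above

PLS
  Primary objective: Prediction of Theoretical And/or Empirical Variables
  Theoretical variables: As Above
  Epistemic relationships: Formative and/or Reflective
  Theoretical variable relationships: Orthogonal, Symmetric, Directional, Bidirectional, Causal
  Theory requirements: Flexible
  Empirical requirements: Flexible

Covariance Structure Analysis
  Primary objective: Explanation of Empirical Data Structure (Covariances)
  Theoretical variables: Indeterminate
  Epistemic relationships: Reflective (Formative)
  Theoretical variable relationships: Orthogonal, Symmetric, Directional, Bidirectional, Causal
  Theory requirements: Strong
  Empirical requirements: Strong

Confirmatory Nonmetric MDS
  Primary objective: Explanation of Empirical Data Structure (Distances)
  Theoretical variables: Defined
  Epistemic relationships: Symmetric
  Theoretical variable relationships: Symmetric (Causal)
  Theory requirements: Fairly Strong
  Empirical requirements: Weak

Latent Structure Analysis
  Primary objective: Explanation of Empirical Data Structure (Contingency Table Association)
  Theoretical variables: Indeterminate
  Epistemic relationships: Formative and/or Reflective
  Theoretical variable relationships: Orthogonal, Symmetric, Directional, Causal
  Theory requirements: Strong
  Empirical requirements: As Above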

FIGURE 1 EPISTEMIC RELATIONSHIPS
[Path diagrams not reproduced. Panels: a. Reflective Indicators; b. Formative Indicators; c. Symmetric Indicators; d. Combination of Formative and Reflective Indicators; e. Combination of Formative and Reflective Indicators for Higher Order Analysis]

FIGURE 2 RELATIONSHIPS AMONG METHODS a > b means that a is a special case of b

FIGURE 3 EXAMPLES OF SECOND GENERATION METHODS
[Diagrams not reproduced. Panels: a. Canonical Correlation; b. Redundancy Analysis; c. ESSCA; d. PLS; e. Covariance Structure Analysis; f. Confirmatory MDS; g. Latent Structure Analysis]
1) This model would not be identified without several constraints on the illustrated paths and correlations.
2) For ease of illustration, the epistemic relations are not shown here.

REFERENCES

Aaker, D. A., and R. P. Bagozzi (1979) "Unobservable Variables in Structural Equation Models with an Application in Industrial Selling," Journal of Marketing Research, 16 (2), 147-58.
Alpert, M. I., and R. A. Peterson (1972) "On The Interpretation of Canonical Analysis," Journal of Marketing Research, 9, 187-92.
Anderson, D., L. Engledow, and H. Becker (1979) "Evaluating the Relationships Among Attitude Toward Business, Product Satisfaction, Experience, and Search Effort," Journal of Marketing Research, 16 (3), 394-400.
Anderson, J. C., and D. W. Gerbing (1984) "The Effect of Sampling Error on Convergence, Improper Solutions, and Goodness-of-Fit Indices for Maximum-Likelihood Factor Analysis," Psychometrika, 49, 155-174.
Anderson, P. F. (1983) "Marketing, Scientific Progress, and Scientific Method," Journal of Marketing, 47, 18-31.
Anderson, T. W. (1954) "On Estimation of Parameters in Latent Structure Analysis," Psychometrika, 19, 1-10.
Arora, R. (1982) "Validation of an S-O-R Model for Situation, Enduring, and Response Components of Involvement," Journal of Marketing Research, 19 (3), 505-16.
Bagozzi, R. P. (1976) "Toward A General Theory for the Explanation of the Performance of Salespeople." Unpublished doctoral dissertation, Northwestern University.
____ (1977) "Structural Equation Models in Experimental Research," Journal of Marketing Research, Vol. XIV, 209-26.
____ (1980a) Causal Models in Marketing. New York: Wiley.
____ (1980b) "Performance and Satisfaction in an Industrial Sales Force: An Examination of Their Antecedents and Simultaneity," Journal of Marketing, 44 (2), 65-77.
____ (1983) "A Holistic Methodology for Modeling Consumer Response to Innovation," Operations Research, 31 (January-February), 128-76.
____ (1984) "A Prospectus for Theory Construction in Marketing," Journal of Marketing, 48, Winter, 11-29.
____, C. Fornell, and D. F. Larcker (1981) "Canonical Correlation Analysis as a Special Case of a Structural Relations Model," Multivariate Behavioral Research, 16, 437-54.
____, and C. Fornell (1982) "Theoretical Concepts, Measurements, and Meaning" in A Second Generation of Multivariate Analysis: Measurement and Evaluation, C. Fornell (ed.). New York: Praeger, 24-38.

Bateson, John, and Stephen Greyser (1982) "The Effectiveness of the Knowledge Generation and Diffusion Process in Marketing - Some Considerations and Empirical Findings." The London Business School Paper No. 82/4.
Bearden, W. O., and J. E. Teel (1980) "An Investigation of Personal Influences on Consumer Complaining," Journal of Retailing, 56 (Fall), 3-20.
____, S. Sharma, and J. E. Teel (1982) "Sample Size Effects on Chi-Square and Other Statistics Used in Evaluating Causal Models," Journal of Marketing Research, 19, 425-30.
____, and Terence A. Shimp (1982) "The Use of Extrinsic Cues to Facilitate Product Adoption," Journal of Marketing Research, 19 (2), 229-39.
Bechtel, G. G. (1976) Multidimensional Preference Scaling. The Hague: Mouton.
Bentler, P. M. (1982) "Linear Systems with Multiple Levels and Types of Variables" in Systems Under Indirect Observation: Causality, Structure, Prediction, Part I, K. G. Jöreskog and H. Wold (eds.). Amsterdam: North-Holland, 101-30.
____ (1983a) "Simultaneous Equation Systems as Moment Structure Models," Journal of Econometrics, 22, 13-42.
____ (1983b) "Some Contributions to Efficient Statistics in Structural Models: Specification and Estimation of Moment Structures," Psychometrika, 48 (December), 493-517.
Blalock, H. M. (1961) Causal Inferences in Nonexperimental Research. Chapel Hill: University of North Carolina Press.
____ (1962) "Four-Variable Causal Models and Partial Correlations," American Journal of Sociology, 68, September, 182-94.
____ (1969a) Theory Construction: From Verbal to Mathematical Formulations. Englewood Cliffs, New Jersey: Prentice-Hall.
____ (1969b) "Multiple Indicators and the Causal Approach to Measurement Error," The American Journal of Sociology, 75, September, 264-72.
____ (1982) Conceptualization and Measurement in the Social Sciences. Beverly Hills, CA: Sage Publications.
Bloxom, B. (1978) "Constrained Multidimensional Scaling in N Spaces," Psychometrika, 43, 397-408.

Boomsma, A. (1982) "The Robustness of LISREL Against Small Sample Sizes in Factor Analysis Models" in Systems Under Indirect Observation: Causality, Structure, Prediction, Part I, K. G. Jöreskog and H. Wold (eds.). Amsterdam: North-Holland.
Borg, I., and J. C. Lingoes (1980) "A Model and Algorithm for Multidimensional Scaling with External Constraints on the Distances," Psychometrika, 45, 25-38.
Brodbeck, M. (1982) "Recent Developments in the Philosophy of Science" in Marketing Theory: Philosophy of Science Perspectives. Chicago: AMA, 1-6.
Browne, M. W. (1983) Asymptotically Distribution Free Methods for the Analysis of Covariance Structures. Research report 83/5. University of South Africa, Department of Statistics and Operations Research.
Burnkrant, R. E., and T. J. Page, Jr. (1982) "An Examination of Convergent, Discriminant, and Predictive Validity of Fishbein's Behavioral Intention Model," Journal of Marketing Research, 19 (3), 550-61.
Buzzell, R. D., B. Gale, and R. Sultan (1975) "Market Share - A Key to Profitability," Harvard Business Review, 53 (January-February), 97-106.
Calder, B. J., L. W. Phillips, and A. M. Tybout (1983) "Beyond External Validity," Journal of Consumer Research, 10, June, 112-14.
Carroll, J. D., S. Pruzansky, and J. B. Kruskal (1980) "CANDELINC: A General Approach to Multidimensional Analysis of Many-Way Arrays with Linear Constraints on Parameters," Psychometrika, 45, 1, 3-24.
Chalmers, A. F. (1976) What Is This Thing Called Science? St. Lucia, Australia: University of Queensland Press.
Churchill, G. A., Jr., and C. Surprenant (1982) "An Investigation Into the Determinants of Satisfaction," Journal of Marketing Research, 19 (3), 491-504.
____, and A. Pecotich (1982) "A Structural Equation Investigation of the Pay Satisfaction - Valence Relationship Among Salespeople," Journal of Marketing, 46 (4), 114-24.
Cliff, N., and D. J. Krus (1976) "Interpretation of Canonical Analysis: Rotated vs. Unrotated Solutions," Psychometrika, 41, 1, 35-42.
Cliff, N. (1983) "Some Cautions Concerning the Application of Causal Modeling Methods," Multivariate Behavioral Research, 18, January, 115-26.
Clogg, C. C. (1977) "Unrestricted and Restricted Maximum Likelihood Latent Structure Analysis: A Manual for Users." University Park, Pennsylvania: Population Issues Research Offices, working paper, 1977-09.

____ (1981) "New Developments in Latent Structure Analysis" in Factor Analysis and Measurement in Sociological Research: A Multi-Dimensional Perspective, D. J. Jackson and E. F. Borgatta (eds.). Beverly Hills, California: Sage Publications, 215-46.
Cook, T. D., and D. T. Campbell (1979) Quasi-Experimentation - Design and Issues for Field Settings. Boston, MA: Houghton Mifflin Company.
Darden, W. R., K. B. Monroe, and W. D. Dillon (1983) (eds.) Research Methods and Causal Modeling in Marketing, 1983 AMA Winter Educators' Conference Proceedings. Chicago: Illinois.
Day, G. (1980) Strategic Market Analysis: Top-Down and Bottom-Up Approaches. Report 80-105, Cambridge, MA: The Marketing Science Institute.
DeSarbo, W. S. (1981) "Canonical/Redundancy Factoring Analysis," Psychometrika, 46, 307-29.
____, R. E. Hausman, S. Lin, and W. Thompson (1982) "Constrained Canonical Correlation," Psychometrika, 47, 4, December, 489-516.
Deshpande, R. (1983) "Paradigms Lost: On Theory and Method in Research in Marketing," Journal of Marketing, 47 (4), 101-10.
Dijkstra, T. (1983) "Some Comments on Maximum Likelihood and Partial Least Squares Methods," Journal of Econometrics, 22, 67-90.
Dillon, W. R. (1980) "Investigating Causal Systems with Qualitative Variables: Goodman's Wonderful World of Logits" in Advances in Consumer Research Vol. 8, K. Monroe (ed.). Washington, D. C.: Association for Consumer Research, 209-19.
____, D. G. Frederick, and V. Tangpanichdee (1982) "A Note on Accounting for Sources of Variation in Perceptual Maps," Journal of Marketing Research (August), 302-11.
____, T. J. Madden, and N. Mulani (1983) "Scaling Models for Categorical Variables: An Application of Latent Structure Models," Journal of Consumer Research, 10, September, 2, 209-24.
Duhem, P. (1953) "Physical Theory and Experiment" in Readings in the Philosophy of Science, H. Feigl and M. Brodbeck (eds.). New York: Appleton-Century-Crofts, 235-52.
Duncan, O. D. (1966) "Path Analysis: Sociological Examples," American Journal of Sociology, 72, 1-16.
Efron, B. (1981) "Nonparametric Estimates of Standard Error: The Jackknife, The Bootstrap, and Other Methods," Biometrika, 68, 3, 589-99.
Feyerabend, P. (1975) Against Method. Thetford, England: Lowe and Brydone.
Fornell, C. (1978) "Three Approaches to Canonical Analysis," Journal of the Market Research Society, 20, 3, 166-81.

____ (1979) "External Single-Set Components Analysis of Multiple Criterion/Multiple Predictor Variables," Multivariate Behavioral Research, 14, 323-28.

____ (1982a) (ed.) A Second Generation of Multivariate Analysis: Methods. New York: Praeger.

____ (1982b) "A Second Generation of Multivariate Analysis - An Overview" in A Second Generation of Multivariate Analysis: Methods, C. Fornell (ed.). New York: Praeger, 1-21.

____ (1982c) (ed.) A Second Generation of Multivariate Analysis: Measurement and Evaluation. New York: Praeger.

____ (1983) "Issues in the Application of Covariance Structure Analysis: A Comment," Journal of Consumer Research, 9, March, 443-48.

____, and D. F. Larcker (1981a) "Evaluating Structural Equation Models with Unobservable Variables and Measurement Error," Journal of Marketing Research, 18, February, 39-50.

____, and ____ (1981b) "Structural Equation Models with Unobservable Variables and Measurement Error: Algebra and Statistics," Journal of Marketing Research, 18, August, 382-88.

____, and D. R. Denison (1981) "Validity Assessment via Confirmatory Multidimensional Scaling" in The Changing Marketing Environment: New Theories and Applications, K. Bernhardt, et al. (eds.). Chicago: AMA, 334-37.

____, and ____ (1982) "A New Approach to Nonlinear Structural Modeling by Use of Confirmatory Multidimensional Scaling" in A Second Generation of Multivariate Analysis: Methods, C. Fornell (ed.). New York: Praeger, 367-92.

____, G. J. Tellis, and G. M. Zinkhan (1982) "Validity Assessment: A Structural Equations Approach Using Partial Least Squares" in An Assessment of Marketing Thought and Practice, B. J. Walker, et al. (eds.). Chicago, Ill.: AMA, 405-9.

____, and F. L. Bookstein (1982) "Two Structural Equation Models: LISREL and PLS Applied to Consumer Exit-Voice Theory," Journal of Marketing Research, XIX, 440-52.

____, and W. T. Robinson (1983) "Industrial Organization and Consumer Satisfaction/Dissatisfaction," Journal of Consumer Research, 9, March, 403-12.

____, and D. W. Barclay (1983) Supplement to Lohmoller's LVPLS Program. Graduate School of Business Administration, The University of Michigan.

____, and D. F. Larcker (1984) "Misapplications of Simulations in Structural Equations: Reply to Acito and Anderson," Journal of Marketing Research (February), 113-17.

____, and D. W. Barclay (1984) "A General Model and Simple Algorithm For Redundancy Analysis," Working Paper, Graduate School of Business Administration, The University of Michigan.

____, G. John, L. W. Stern, and M. Triki (1984) "The Role Strain-Performance in Industrial Selling: A Process Analysis," Working Paper, Graduate School of Business Administration, The University of Michigan.

____, and R. A. Westbrook (in press) "The Vicious Circle of Consumer Complaints," Journal of Marketing.

Geisser, S. (1974) "A Predictive Approach to the Random Effect Model," Biometrika, 61, 101-7.

Gerbing, D. W., and J. C. Anderson (1984) "On the Meaning of Within-Factor Correlated Measurement Errors," Journal of Consumer Research, 11, 572-80.

Goodman, L. A. (1972) "A General Model for the Analysis of Surveys," American Journal of Sociology, 77 (May), 1035-86.

____ (1973) "The Analysis of Multidimensional Contingency Tables when Some Variables are Posterior to Others: A Modified Path Analysis Approach," Biometrika, 60:174.

____ (1974) "The Analysis of Systems of Qualitative Variables when Some of the Variables are Unobservable. Part I: A Modified Latent Structure Approach," American Journal of Sociology, 79, 1179-1259.

____ (1979) "A Brief Guide to the Causal Analysis from Surveys," American Journal of Sociology, 84 (March), 1078-95.

____ (1981) "Association Models and Canonical Correlations in the Analysis of Cross Classifications Having Ordered Categories," Journal of the American Statistical Association, 76, 320-34, 374.

Green, P. E. (1975) "Marketing Applications of MDS: Assessment and Outlook," Journal of Marketing, 39 (January), 24-31.

____, M. H. Halbert, and P. J. Robinson (1966) "Canonical Analysis: An Exposition and Illustrative Application," Journal of Marketing Research, 3, No. 1, 32-39.

Haavelmo, T. (1943) "The Statistical Implications of a System of Simultaneous Equations," Econometrica, 11, 1-12.

Hanson, N. R. (1958) Patterns of Discovery. Cambridge: Cambridge University Press.

Hauser, John R. (1983) "The Coming Revolution in Marketing Theory." Paper presented at The Harvard Business School 75th Anniversary Marketing Colloquium, July 26-29.

Hauser, R. M. (1973) "Disaggregating a Social-Psychological Model of Educational Attainment" in Structural Equation Models in the Social Sciences, A. S. Goldberger and O. D. Duncan (eds.). New York: Seminar Press, 255-84.

Heiser, W. J., and J. Meulman (1983) "Analyzing Rectangular Tables by Joint and Constrained Multidimensional Scaling," Journal of Econometrics, 22, 139-67.

Hirschman, A. O. (1970) Exit, Voice, and Loyalty - Responses to Decline in Firms, Organizations, and States. Cambridge: Harvard University Press.

Horne, A., J. Morgan, and J. Page (1973) "Where Do We Go From Here?" Journal of the Market Research Society, 16, No. 3, 157-82.

Hotelling, H. (1933) "Analysis of a Complex of Statistical Variables into Principal Components," Journal of Educational Psychology, 24, 417-41, 498-520.

____ (1936) "Relations Between Two Sets of Variates," Biometrika, 28, 321-77.

House, R. J., and R. J. Rizzo (1972) "Role Conflict and Ambiguity as Critical Variables in a Model of Organizational Behavior," Organizational Behavior and Human Performance, 7, June, 467-505.

Hui, B. S. (1982) "On Building Partial Least Squares Models with Interdependent Inner Relations" in Systems Under Indirect Observation: Causality, Structure, Prediction, vol. 2, K. G. Joreskog and H. Wold (eds.). Amsterdam: North Holland, 249-72.

Hunt, S. D. (1983) Marketing Theory: The Philosophy of Marketing Science. Homewood, Ill.: Richard D. Irwin, Inc.

Jagpal, H. C. (1981) "Measuring the Joint Advertising Effects in Multiproduct Firms," Journal of Advertising Research, 21, 1, 65-69.

Johansson, J. K. (1981) "An Extension of Wollenberg's Redundancy Analysis," Psychometrika, 46, 93-103.

John, G., and T. Reve (1982) "The Reliability and Validity of Key Informant Data from Dyadic Relationships in Marketing Channels," Journal of Marketing Research (November), 517-24.

Joreskog, K. G. (1967) "Some Contributions to Maximum Likelihood Factor Analysis," Psychometrika, 32, 443-82.

____ (1973) "A General Method for Estimating a Linear Structural Equation System" in Structural Equation Models in the Social Sciences, A. S. Goldberger and O. D. Duncan (eds.). New York: Seminar Press, 85-112.

____, and D. Sorbom (1978) LISREL IV: Analysis of Linear Structural Relationships by the Method of Maximum Likelihood. Chicago, Ill.: National Educational Resources.

____, and ____ (1981) LISREL V: Analysis of Linear Structural Relationships by Maximum Likelihood and Least Squares Methods. Chicago, Ill.: National Educational Resources.

____, and ____ (1983) LISREL VI: Supplement to the LISREL V Manual. University of Uppsala, Department of Statistics, Uppsala, Sweden.

Keeves, J. P. (1972) Educational Environment and Student Achievement. Melbourne: Australian Council for Educational Research.

Kinnear, T. C., and J. R. Taylor (1971) "Multivariate Methods in Marketing Research: A Further Attempt at Classification," Journal of Marketing, 35, October, 56-59.

Koopmans, T. C. (1949) "Identification Problems in Economic Model Construction," Econometrica, 17, 125-43.

Knapp, T. R. (1978) "Canonical Correlation Analysis: A General Parametric Significance-Testing System," Psychological Bulletin, 85 (2), 410-16.

Kuhn, T. S. (1962) The Structure of Scientific Revolutions. Chicago, Ill.: University of Chicago Press.

Lakatos, I. (1970) "Falsification and the Methodology of Scientific Research Programmes," in Criticism and The Growth of Knowledge, I. Lakatos and A. Musgrave (eds.). Cambridge: Cambridge University Press, 91-196.

Lambert, Z. V., and R. M. Durand (1975) "Some Precautions in Using Canonical Analysis," Journal of Marketing Research, 12, 4, 468-75.

Laudan, L. (1965) "On the Impossibility of the Crucial Falsifying Experiment: Grunbaum on the Duhemian Argument," Philosophy of Science, 32 (July), 295-99.

____ (1977) Progress and Its Problems. Berkeley, California: University of California Press.

Lawley, D. N. (1940) "The Estimation of Factor Loadings by the Method of Maximum Likelihood," Proceedings of the Royal Society of Edinburgh, 60, 64-82.

Lazarsfeld, P. F. (1950) "The Logical and Mathematical Foundation of Latent Structure Analysis" in Measurement and Prediction, S. A. Stouffer, et al. (eds.). Princeton: Princeton University Press, Ch. 10-11.

Leamer, E. E. (1983) "Let's Take the Con Out of Econometrics," The American Economic Review, March, 73, 1, 31-43.

Lee, S. Y., and P. M. Bentler (1980) "Functional Relations in Multidimensional Scaling," British Journal of Mathematical and Statistical Psychology, 33, 142-50.

Levin, M. E. (1979) "On Theory-Change and Meaning-Change," Philosophy of Science, 46, 407-24.

Lingoes, J. C., and I. Borg (1978) "CMDA-U: Confirmatory Monotone Distance Analysis - Unconditional," Journal of Marketing Research, 15, 610-11.

____, and ____ (1983) "A Quasi-Statistical Model for Choosing Between Alternative Configurations Derived from Ordinally Constrained Data," British Journal of Mathematical and Statistical Psychology, 36.

Lohmoller, J. B. (1981) LVPLS 1.6 Program Manual: Latent Variables Path Analysis with Partial Least-Squares Estimation. Munchen: Hochschule der Bundeswehr.

____ (1983) "Path Models with Latent Variables and Partial Least Squares (PLS) Estimation." Unpublished doctoral dissertation, Hochschule der Bundeswehr, Munchen.

Madden, T. J., and W. R. Dillon (1982) "Causal Analysis and Latent Class Models: An Application to a Hierarchy of Effects Model," Journal of Marketing Research, XIX, November, 472-90.

____, ____, and M. G. Weinberger (1982) "Causal Models in Marketing: A Latent Structure Approach" in Marketing Theory: Philosophy of Science Perspectives, R. F. Bush and S. D. Hunt (eds.). Chicago, Ill.: AMA, 289-93.

Magidson, J., and D. Sorbom (1980) "Adjusting for Confounding Factors in Quasi-Experimentation: Another Reanalysis of the Westinghouse Head Start Evaluation." Paper presented at the 1980 American Statistical Association Meetings, Houston, Texas (August), 11-14.

Mooijaart, A., and R. Kapel (1981) "LSA I: A Computer Program for Nonmetric Latent Structure Analysis." Subfaculteit der Psychologie van de Rijksuniversiteit te Leiden.

Muthen, B. (1983) "Latent Variable Structural Equation Modeling with Categorical Data," Journal of Econometrics, 22, 43-65.

Myers, J. G., and F. M. Nicosia (1967) "New Empirical Directions in Market Segmentation: Latent Structure Models" in Changing Marketing Systems: Consumer, Corporate, and Governmental Interfaces, R. Moyer (ed.). Chicago, Ill.: AMA, 247-52.

____, W. F. Massy, and Stephen A. Greyser (1980) Marketing Research and Knowledge Development - An Assessment For Marketing Management. Englewood Cliffs, N.J.: Prentice-Hall.

Olsson, U. (1979) "Maximum Likelihood Estimation of the Polychoric Correlation Coefficient," Psychometrika, 44, 443-60.

____, F. Drasgow, and J. J. Dorans (1981) "The Polyserial Correlation Coefficient." Working paper, University of Illinois, Urbana, Illinois.

Olson, J. C. (1981) "Presidential Address - 1981: Toward a Science of Consumer Behavior," in Advances in Consumer Research. Ann Arbor, MI: ACR.

O'Shaughnessy, J. (1972) Inquiry and Decision. London, England: George Allen and Unwin Ltd.

Perreault, W. D., Jr., and R. L. Spiro (1978) "An Approach for Improved Interpretation of Multivariate Analysis," Decision Sciences, 9, 402-13.

____, and F. W. Young (1980) "Alternating Least Squares Optimal Scaling: Analysis of Nonmetric Data in Marketing Research," Journal of Marketing Research, 17, February, 1-13.

Peter, J. P., and J. C. Olson (1983) "Is Science Marketing?" Journal of Marketing, 47 (4), 111-25.

Phillips, L. W. (1982) "Explaining Control Losses in Corporate Marketing Channels: An Organization Analysis," Journal of Marketing Research, November, 525-49.

____, D. R. Chang, and R. D. Buzzell (1983) "Product Quality, Cost Position and Business Performance: A Test of Some Key Hypotheses," Journal of Marketing, 47 (Spring), 26-43.

Popper, K. (1962) Conjectures and Refutations. New York: Harper and Row.

Punj, N., and R. Staelin (1983) "A Model of Consumer Information Search Behavior for New Automobiles," Journal of Consumer Research, 9 (4), 366-80.

Ramsay, J. O. (1969) "Some Statistical Considerations in Multidimensional Scaling," Psychometrika, 34, 167-82.

____ (1977) "Maximum Likelihood Estimation in Multidimensional Scaling," Psychometrika, 42, 241-66.

Reilly, M. D. (1982) "Working Wives and Convenience Consumption," Journal of Consumer Research, 8 (4), 407-18.

Richardson, M. W. (1938) "Multidimensional Psychophysics," Psychological Bulletin, 35, 659-60.

Ryan, M. J. (1982) "Behavioral Intention Formation: A Structural Equation Analysis of Attitudinal and Social Influence Interdependency," Journal of Consumer Research, 9 (3), 263-78.

____, and J. O'Shaughnessy (1982) "Scientific Explanation and Technological Prediction" in Marketing Theory: Philosophy of Science Perspectives, R. F. Bush and S. D. Hunt (eds.). Chicago, Ill.: AMA, 22-25.

____, and M. B. Holbrook (1983) "The Impact of Stress on Purchasing Agents' Evaluations of Their Own Decision Process." Working paper, Graduate School of Business Administration, The University of Michigan.

Sauer, W. J., N. Nighswonger, and G. Zaltman (1982) "Current Issues in Philosophy of Science: Implications for the Study of Marketing," in Marketing Theory: Philosophy of Science Perspectives, R. F. Bush and S. D. Hunt (eds.). Chicago, Ill.: AMA, 17-21.

Shocker, S. D., and D. W. Stewart (1983) "Mapping Competitive Relationships: Practices, Problems, and Promise." Working Paper 83-115, Owen Graduate School of Management, Vanderbilt University.

Schuler, R. S. (1975) "Role Perceptions, Satisfaction, and Performance: A Partial Reconciliation," Journal of Applied Psychology, 60 (December), 683-87.

Shepard, R. N. (1962) "The Analysis of Proximities: Multidimensional Scaling with an Unknown Distance Function: I, II," Psychometrika, 27, 125-40, 219-46.

Sheth, J. N. (1971) "The Multivariate Revolution in Marketing Research," Journal of Marketing, 35, January, 13-19.

Simon, H. A. (1952) "On the Definition of the Causal Relation," Journal of Philosophy, 49, 517-28.

____ (1954) "Spurious Correlation: A Causal Interpretation," Journal of the American Statistical Association, 49, 467-79.

Sorbom, D., and K. G. Joreskog (1982) "The Use of Structural Equation Models In Evaluation Research," in A Second Generation of Multivariate Analysis: Measurement and Evaluation, C. Fornell (ed.). New York: Praeger, 381-418.

Spearman, C. (1904) "General Intelligence, Objectively Determined and Measured," American Journal of Psychology, 15, 201-93.

Steiger, J. H. (1979) "Factor Indeterminacy in the 1930's and the 1970's - Some Interesting Parallels," Psychometrika, 44, 157-87.

Stewart, D., and W. Love (1968) "A General Canonical Correlation Index," Psychological Bulletin, 70, 3, 160-63.

Stone, M. (1974) "Cross-Validatory Choice and Assessment of Statistical Predictions," Journal of the Royal Statistical Society, B36, 111-33.

Suppe, F. (1977) The Structure of Scientific Theories, 2d ed. Urbana, Ill.: University of Illinois Press.

Thurstone, L. L. (1931) "Multiple Factor Analysis," Psychological Review, 38, 406-27.

Tosi, H. (1971) "Organizational Stress as a Moderator of the Relationship Between Influence and Role Response," Academy of Management Journal, 14 (March), 7-20.

van den Wollenberg (1977) "Redundancy Analysis: An Alternative for Canonical Correlation Analysis," Psychometrika, 42, 2, 207-19.

Van Meter, D. S., and H. D. Asher (1973) "Causal Analysis: Its Promise for Policy Studies," Policy Studies Journal, 2 (Winter), 103-9.

Wildt, A. R., Z. V. Lambert, and R. M. Durand (1982) "Applying the Jackknife Statistic in Testing and Interpreting Canonical Weights, Loadings, and Cross-Loadings," Journal of Marketing Research, XIX, February, 99-107.

Winship, Christopher, and Robert D. Mare (1983) "Structural Equations and Path Analysis for Discrete Data," American Journal of Sociology, 89, 54-110.

Wold, H. (1965) "A Fixed-Point Theorem with Econometric Background, I-II," Arkiv for Matematik, 6, 204-40.

____ (1975) "Path Models With Latent Variables: The NIPALS Approach" in Quantitative Sociology: International Perspectives on Mathematical and Statistical Modeling, H. M. Blalock, et al. (eds.). New York: Academic Press, 307-57.

____, and L. Jureen (1953) Demand Analysis. Stockholm: Almqvist and Wiksell.

____, and J. L. Bertholet (1981) "The PLS Approach to Multidimensional Contingency Tables" in Transactions of the International Meeting on Multivariate Contingency Tables, A. Rizzi (ed.). University of Rome.

Wright, S. (1934) "The Method of Path Coefficients," Annals of Mathematical Statistics, 5, 161-215.

Wrigley, C., and J. Neuhaus (1955) "The Matching of Two Sets of Factors," American Psychologist, 10, 418-19.

Zaltman, G., K. LeMasters, and M. Heffring (1982) Theory Construction in Marketing. New York: Wiley.

Zinkhan, G. M., and C. Fornell (1983) "Hierarchy of Effects Models under Conditions of High and Low Involvement." Working paper, Department of Marketing, University of Houston.

Zinnes, J. L., and D. B. Mackay (1983) "Probabilistic Multidimensional Scaling: Complete and Incomplete Data," Psychometrika, 48, 27-48.