Division of Research
Graduate School of Business Administration
The University of Michigan
January 1985

A GENERAL MODEL AND SIMPLE ALGORITHM FOR REDUNDANCY ANALYSIS

Working Paper No. 412

Claes Fornell and Donald W. Barclay

FOR DISCUSSION PURPOSES ONLY

None of this material is to be quoted or reproduced without the expressed permission of the Division of Research.


A GENERAL MODEL AND SIMPLE ALGORITHM FOR REDUNDANCY ANALYSIS

Abstract

Stewart and Love proposed redundancy as an index for measuring the amount of shared variance between two sets of variables. Van den Wollenberg presented a method for maximizing redundancy. Johansson extended the approach to include derivation of optimal Y-variates, given the X-variates. This paper shows that redundancy maximization with Johansson's extension can be accomplished via a simple iterative algorithm based on Wold's Partial Least Squares.


1. Introduction

Following the publication of van den Wollenberg's [1977] paper on redundancy maximization, discussions and extensions of his approach have been developed by Johansson [1981], DeSarbo [1981], Muller [1981], Dawson-Saunders [1983], and Tyler [1982]. A limitation of van den Wollenberg's redundancy solution was pointed out by Johansson and later by Tyler. Specifically, van den Wollenberg's approach implies that the Y-variates are derived independently of the X-variates, using the eigenvectors of R_yy^{-1} R_yx R_xy; i.e., the transformation of the Y-variates is not related to the transformation of the X-variates. As a result, in contrast to canonical correlation, the correlations between the X- and Y-variates are not optimal. Johansson [1981] extended van den Wollenberg's method to include the derivation of Y-variates which are maximally correlated with the X-variates constructed to maximize the redundancy of the y-variables. He also shows that these Y-variates have some desirable orthogonality properties. Thus, Johansson's approach appears very attractive.

The purpose of this paper is to demonstrate that Johansson's extended version of redundancy analysis can be accomplished via a very simple iterative algorithm involving nothing more than a series of simple and multiple regressions. A useful feature of the algorithm lies in its simplicity: there is no need for the analyst to construct his own computer program; all that is necessary is a program for standard multiple regression such as MIDAS, SAS, or TROLL. As a result, Johansson's extended redundancy analysis is easily available to most applied researchers.

2. Redundancy Analysis

Stewart and Love [1968] developed a nonsymmetric index of redundancy which represents the mean variance in one set of variables predicted by a

linear composite, or variate, of another set of variables. The redundancy index is

    RD_{y|η_l} = γ_l^2 (1/q_y) λ_{y_l}' λ_{y_l}

where RD_{y|η_l} is the redundancy of the criteria given the lth canonical variate of the predictors, γ_l is the canonical correlation coefficient, q_y is the number of y-variables, and λ_{y_l} is the vector of loadings of the y-variables on their lth canonical variate η_l. It has also been shown [e.g., Johansson 1981] that the redundancy index can be viewed as the mean of the squared loadings of the variables of the criterion set on the canonical variate of the predictor set, ξ_l, i.e.,

    RD_{y|ξ_l} = (1/q_y) λ_{yξ_l}' λ_{yξ_l}.

In canonical correlation analysis, only the γ^2 portion of the redundancy index is maximized. Van den Wollenberg [1977] suggested that redundancy per se could be maximized. Maximization of redundancy results in two general characteristic equations:

    (R_xy R_yx - λ_α R_xx) α = 0

    (R_yx R_xy - λ_β R_yy) β = 0

where λ_α and λ_β are eigenvalues while α and β are weight vectors. Van den Wollenberg goes on to develop the case for extracting successive variates such that these are orthogonal to variates already constructed within the same variable set. It is not, in general, possible to have biorthogonal variates in redundancy analysis.

Johansson [1981] presents two solutions for deriving Y-variates given X-variates derived to maximize the redundancy in the y-variables. The first

solution, a least squares approach, satisfies the orthogonality condition β_l' R_yy β_m = 0, l ≠ m, but not the condition α_l' R_xy β_m = 0, l ≠ m, where α_l defines the weights of the lth X-variate and β_m defines the weights of the corresponding Y-variate. The second approach, a restandardized solution, fulfills the opposite orthogonality condition; i.e., the solutions are complementary.

We first present the iterative algorithm for redundancy maximization. In the next section we show the equivalence of this approach to Johansson's solution. Following this, the numerical example used by both van den Wollenberg and Johansson is applied.

3. The Algorithm

The algorithm derives from Wold's [1966] extension of his fix-point method to nonlinear iterative Partial Least Squares (PLS), which has been shown to be a general model for principal components and canonical correlation analysis [e.g., Areskoug 1982]. As we will show here, Johansson's extension of redundancy analysis can also be obtained via this algorithm. Thus, even before the term "redundancy" was coined by Stewart and Love in 1968, and well before van den Wollenberg presented his analytical solution to redundancy maximization in 1977, a method existed that not only maximized redundancy but also incorporated the attractive features of Johansson's [1981] extension. And, as we will see, the approach involves nothing but traditional OLS regressions in an iterative fashion using a fixed-point constraint.

The following series of OLS regressions is executed:

Initialize

    Let η = y_1 + ... + y_q;  ξ = x_1 + ... + x_p.

Loop

    Normalize ξ, η to unit variance.

    Regress η on x_1, ..., x_p jointly:  η = α_1 x_1 + ... + α_p x_p + ε.

    Compute  ξ = Σ_{i=1}^{p} α_i x_i.

    Regress y_1, ..., y_q separately on ξ:

        y_1 = β_1 ξ + ε_1
        ...
        y_q = β_q ξ + ε_q.

    Compute  η = Σ_{k=1}^{q} β_k y_k.

Test

    If ξ is not equal to the ξ of the previous iteration, or η to the previous η (within some chosen convergence criterion), loop again. Otherwise:

Finish

    Regress η on ξ for the parameter γ (the correlation between the Y-variate and the X-variate).

4. Equivalence to Johansson's Least Squares Solution

Following analysis similar to that outlined by Areskoug [1982] in the case of canonical correlation, we can show that the above algorithm leads to eigenvalue equations which can be solved for α and β. It also permits us to compare these eigenvalue equations with van den Wollenberg's and Johansson's to assess the equivalence of the approaches.
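Written out for data matrices, the loop above involves nothing beyond the stated regressions. The following is a minimal sketch in Python with NumPy (standing in for a regression package such as MIDAS, SAS, or TROLL); the function name, convergence settings, and the assumption of standardized data matrices are our choices, not part of the algorithm itself:

```python
import numpy as np

def redundancy_pls(X, Y, tol=1e-10, max_iter=1000):
    """Iterative redundancy maximization via OLS regressions (Section 3).

    X (T x p) and Y (T x q) are assumed standardized: each column has
    mean zero and unit variance.  Returns the x-weights a, the y-weights
    b, and gamma, the correlation between the variates xi and eta.
    """
    T = X.shape[0]
    eta = Y.sum(axis=1)                       # Initialize: eta = y1 + ... + yq
    eta = eta / eta.std()                     # normalize to unit variance
    for _ in range(max_iter):
        # Regress eta on x1, ..., xp jointly (one multiple regression)
        a, *_ = np.linalg.lstsq(X, eta, rcond=None)
        xi = X @ a
        xi = xi / xi.std()                    # normalize xi to unit variance
        # Regress each yk on xi separately (q simple regressions);
        # with standardized data each slope is just a correlation
        b = Y.T @ xi / T
        eta_new = Y @ b                       # eta = sum_k b_k * y_k
        eta_new = eta_new / eta_new.std()
        if np.allclose(eta_new, eta, atol=tol):   # fixed point reached
            eta = eta_new
            break
        eta = eta_new
    gamma = float(xi @ eta / T)               # regress eta on xi: this is gamma
    return a, b, gamma
```

At convergence, α is proportional to the dominant eigenvector of R_xx^{-1} R_xy R_yx, as the derivation in this section shows.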

In PLS, the estimated latent variables (components) are defined as the linear forms

    ξ = f_1 x α                                              (1)

and

    η = f_2 y β                                              (2)

where x and y are matrices of observations from T cases, and f_1 and f_2 are scalar constants giving unit variance to ξ and η:

    f_1^{-2} = T^{-1} α'(x'x)α = α' R_xx α;

    f_2^{-2} = T^{-1} β'(y'y)β = β' R_yy β.

f is the scalar constant which transforms the eigenvector corresponding to the largest eigenvalue in classical eigenvalue equations to standardized weights. The parameters of the redundancy model were estimated by PLS in the following iterative fashion.

Start

    Choose arbitrary weights β^(0) and from (2) let

    η^(0) = f_2^(0) y β^(0).                                 (3)

Step 1

    Regress η^(0) on x jointly to get α^(0):

    α^(0) = (x'x)^{-1} x' η^(0)

and from (3)

    α^(0) = f_2^(0) R_xx^{-1} R_xy β^(0).                    (4)

Now from (1) let:

    ξ^(0) = f_1^(0) x α^(0)                                  (5)

and regress y_1, ..., y_q on ξ^(0) separately, so that β^(0) = r_{ξ^(0)y}, the vector of correlations of the y-variables with ξ^(0); from (5),

    β^(0) = f_1^(0) R_yx α^(0).                              (6)

Substitution of (6) into (4) gives α^(1) in terms of α^(0):

    α^(1) = f_1^(0) f_2^(0) R_xx^{-1} R_xy R_yx α^(0)        (7)

          = f^(0) M_α α^(0)                                  (8)

where f = f_1 f_2 and M_α = R_xx^{-1} R_xy R_yx. Substitution of (4) into (6) gives β^(1) in terms of β^(0):

    β^(1) = f_1^(0) f_2^(0) R_yx R_xx^{-1} R_xy β^(0)        (9)

          = f^(0) M_β β^(0)                                  (10)

where M_β = R_yx R_xx^{-1} R_xy.

Step n

    α^(n) = f^(n-1) M_α α^(n-1)                              (11)

and

    β^(n) = f^(n-1) M_β β^(n-1).                             (12)

Now if the iterative procedure converges we must have

    lim_{n→∞} α^(n) = lim_{n→∞} α^(n-1) = α                  (13)

and

    lim_{n→∞} β^(n) = lim_{n→∞} β^(n-1) = β.                 (14)

The iterative procedure v^(n) = A v^(n-1), with v rescaled in some fashion at each step, is known as the power method; it converges to the eigenvector of A that corresponds to its largest eigenvalue [Morrison 1976, pp. 279-82]. The solution must therefore satisfy the general eigenvalue equation

    (M_v - λ_v I) v = 0,   v = α, β.                         (15)

To obtain nonzero solutions, consider first α. From (15) and (7),

    (R_xx^{-1} R_xy R_yx - λ_α I) α = 0

or equivalently

    (R_xy R_yx - λ_α R_xx) α = 0.                            (16)

Solving (16) for the largest root λ_α, we can then proceed to solve for α. Similarly for β, from (15) and (9) we have the eigenvalue equation

    (R_yx R_xx^{-1} R_xy - λ_β I) β = 0                      (17)

which we solve for the largest eigenvalue λ_β and hence proceed to solve for β.

When comparing the above eigenvalue equations to those derived by van den Wollenberg, we find that (16) is consistent with his general characteristic equation

    (R_xy R_yx - λ R_xx) α = 0

and hence the PLS solution and van den Wollenberg's redundancy solution give identical solutions for α, i.e., the weights of the ξ variate, as expected.
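The eigenvalue equations (16) and (17) can, of course, also be solved directly with a standard eigenroutine rather than by iteration. The following sketch (our function name and return convention; not part of any program cited in this paper) extracts the first redundancy pair and the corresponding redundancy index:

```python
import numpy as np

def redundancy_weights_eig(Rxx, Rxy):
    """First redundancy pair via the eigenvalue equations (16)-(17).

    a: dominant eigenvector of M_a = Rxx^{-1} Rxy Ryx, rescaled (f1)
       so that the X-variate has unit variance;
    b: proportional to Ryx a, the y-weights given xi.
    Also returns the largest eigenvalue and the redundancy index.
    """
    Ryx = Rxy.T
    Ma = np.linalg.solve(Rxx, Rxy @ Ryx)      # Rxx^{-1} Rxy Ryx
    vals, vecs = np.linalg.eig(Ma)
    k = int(np.argmax(vals.real))             # largest root
    a = vecs[:, k].real
    a = a / np.sqrt(a @ Rxx @ a)              # f1 rescaling: unit-variance xi
    b = Ryx @ a                               # with this scaling, b holds the
                                              # loadings of the y's on xi
    rd = (b @ b) / Rxy.shape[1]               # mean squared loading = redundancy
    return a, b, float(vals[k].real), rd
```

The same b, up to scale, is the eigenvector of M_β in (17), since M_β (R_yx a) = R_yx (M_α a) = λ_α (R_yx a).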

Van den Wollenberg does not present a solution for β given the optimal ξ variate; however, Johansson does for his least squares and restandardized solutions. Examining the least squares solution for the first η variate, Johansson arrives at his equation (6):

    β' R_yx α = λ                                            (18)

where λ indicates the correlation between the first variate pair. Now, from (4) and (13),

    α = f_2 R_xx^{-1} R_xy β

and substituting into (18) we get

    f_2 β' R_yx R_xx^{-1} R_xy - λ β' = 0.                   (19)

Post-multiplying by β and dividing by f_2 β'β, where β'β is a scalar, we arrive at

    (R_yx R_xx^{-1} R_xy - (λ/(f_2 β'β)) I) β = 0   or   (R_yx R_xx^{-1} R_xy - λ* I) β = 0   (20)

which is the same eigenvalue equation as (17) for the first η variate. Hence the PLS solution is equivalent to Johansson's least squares solution for the weights of both the ξ and η variates in the first redundancy pair. The PLS results would be consistent with Johansson's for the second and higher-order redundancy variates, based on his argument regarding the general case for the jth variate [Johansson 1981, p. 96]. Similar equivalences can be found if the redundancy of the x-variables is examined given η variates.

5. Numerical Example

Van den Wollenberg's [1977] "artificial" data will be used to illustrate the equivalence of results from the PLS algorithm and from Johansson's least squares extension. The input correlation matrix is presented in Table 1. Using van den Wollenberg's data as analyzed by both van den Wollenberg and Johansson, we note that our results are completely consistent with

Johansson's (see Table 2). As expected, the correlation between ξ and η in the iterative algorithm and in Johansson's extension is larger than in van den Wollenberg's case, while the opposite is true for the sum of the squared loadings of the y-variables on η. The net result is that the redundancy of the y-variables is the same in all solutions.

6. Conclusion

This paper has shown that a simple, easily implementable algorithm based on Wold's PLS, when applied to redundancy analysis, not only derives optimal X-variates to maximize redundancy with respect to the y-variables but also produces Y-variates which correlate maximally with the derived X-variates. In addition, this algorithm produces Y-variates with the desirable orthogonality property of Johansson's least squares solution, i.e., β_l' R_yy β_m = 0, l ≠ m.

TABLE 1. Van den Wollenberg's Correlation Matrix

        x1      x2      x3      x4      y1      y2      y3
x2    .800
x3    .140    .060
x4    .060    .140    .800
y1   -.003    .062    .422    .710
y2    .265    .203    .714    .440    .400
y3    .404    .709   -.142    .089    .200    .000
y4    .723    .461   -.012   -.037    .000    .200    .400

Table 2. Redundancy Analysis

                                        Wold's    Van den Wollenberg's    Johansson's Least
                                        PLS       Redundancy Analysis     Squares Extension

Correlation between ξ, η (i.e., γ)       .732          .689                   .732
Redundancy of y-variables                .210          .210                   .210 a)

Weights of x-variables for ξ (i.e., α)
  x1                                     .508          .508                   a)
  x2                                     .413          .413
  x3                                    -.266         -.266
  x4                                     .606          .606

Loadings of x-variables on ξ
  x1                                     .837          .837                   a)
  x2                                     .888          .888
  x3                                     .315          .315
  x4                                     .482          .482

Weights of y-variables for η (i.e., β)
  y1                                     .298          .294 b)                .298
  y2                                     .257          .211                   .257
  y3                                     .513          .598                   .514
  y4                                     .468          .500                   .467

Loadings of y-variables on η
  y1                                     .503          .498 b)                c)
  y2                                     .470          .428
  y3                                     .760          .856
  y4                                     .725          .781

Loadings of y-variables on ξ
  y1                                     .343          .343 d)                a)
  y2                                     .295          .295
  y3                                     .589          .590
  y4                                     .538          .538

Table 2. (Cont'd)

Notes:

a. Results, although not given, must be the same as in PLS and van den Wollenberg's Redundancy Analysis, since Johansson's extensions do not change the redundancy or the ξ variate.

b. As van den Wollenberg does not present these results, nor does the REDANAL program [Thissen and van den Wollenberg 1975] produce them, these loadings and weights have been calculated from β_k = λ^{-1/2} r_{ξy_k}, k = 1, ..., q, and λ_y = R_yy β.

c. Since the weights of the y-variables for η in PLS and Johansson's extension are equivalent, so must be the loadings, since λ_y = R_yy β.

d. It appears that there has been a transposition error in van den Wollenberg's Table 5. Thus the loadings presented here are incorrectly labeled in his Table 5 as the x-variable loadings on the Y-variates.
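Notes b) and c) both use the identity λ_y = R_yy β, which holds when the weights are rescaled so that the variate η has unit variance. A small sketch of the check (the matrix and weights below are illustrative values only, not taken from the tables):

```python
import numpy as np

# With weights b rescaled so that eta = y b has unit variance
# (b' Ryy b = 1), the loadings of the y-variables on eta, i.e. their
# correlations with eta, are simply lambda_y = Ryy b.
Ryy = np.array([[1.0, 0.4],
                [0.4, 1.0]])       # illustrative correlation matrix
b = np.array([0.3, 0.5])           # illustrative raw weights
b = b / np.sqrt(b @ Ryy @ b)       # f2 rescaling: unit-variance eta
loadings = Ryy @ b                 # lambda_y = Ryy b
```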

REFERENCES

Areskoug, B., The First Canonical Correlation: Theoretical PLS Analysis and Simulation Experiments. In K. G. Jöreskog and H. Wold (Eds.), Systems Under Indirect Observation, Part II. Amsterdam: North-Holland, 1982, pp. 95-117.

Dawson-Saunders, B. K., The Effect of Affine Transformation on Redundancy Analysis, Psychometrika, 1983, 48, 299-302.

DeSarbo, W. S., Canonical/Redundancy Factoring Analysis, Psychometrika, 1981, 46, 307-29.

Johansson, J. K., An Extension of Wollenberg's Redundancy Analysis, Psychometrika, 1981, 46, 93-103.

Morrison, D. F., Multivariate Statistical Methods, 2nd Edition. New York: McGraw-Hill, 1976.

Muller, K. E., Relationships Between Redundancy Analysis, Canonical Correlation, and Multivariate Regression, Psychometrika, 1981, 46, 139-42.

Stewart, D., and W. A. Love, A General Canonical Correlation Index, Psychological Bulletin, 1968, 70, 160-63.

Thissen, M., and A. L. van den Wollenberg, REDANAL: A Fortran G/H Program for Redundancy Analysis, Research Bulletin 26. Nijmegen, The Netherlands: University of Nijmegen, Department of Mathematical Psychology, 1975.

Tyler, D. E., On the Optimality of the Simultaneous Redundancy Transformations, Psychometrika, 1982, 47, 77-86.

van den Wollenberg, A. L., Redundancy Analysis: An Alternative for Canonical Correlation Analysis, Psychometrika, 1977, 42, 207-19.

Wold, H., Estimation of Principal Components and Related Models by Iterative Least Squares. In P. R. Krishnaiah (Ed.), Multivariate Analysis. New York: Academic Press, 1966, pp. 391-420.