T H E U N I V E R S I T Y OF 1 M I CH I G A N COLLEGE OF LITERATURE, SCIENCE, AND THE ARTS Department of Mathematics Technical Report THE BEHRENS-FISHER DISTRIBUTIONS Venkutai H Patil ORA Project 04597 under contract with: DEPARTMENT OF THE NAVY OFFICE OF NAVAL RESEARCH CONTRACT NO. Nonr 1224(41.) TASK NR o42-227 WASHINGTON, D.C. administered through: OFFICE OF RESEARCH ADMINISTRATION ANN ARBOR February 1963

ACKNOWLEDGMENTS I wish to express my deep sense of gratitude and appreciation to Professor Leonard J. Savage for his guidance, encouragement, and infinite patience. I am grateful to Dr. B. M. Hill for his help and advice. I am also most thankful to Mr. Peter H. RoosenRunge for his help in programming. Work on this dissertation was financially supported by the Office of Naval Research under Contract Nonr 1224(41).

TABLE OF CONTENTS Page LIST OF TABLES v ABSTRACT vii INTRODUCTION 1 Scope 1 Conclusion Organization 6 CHAPTER 1. THE BEHRENS-FISHER PROBLEM AND ITS BAYESIAN SOLUTION 9 1.1. The Behrens-Fisher Problem 9 1.2. Bayesvs Theorem and Personal Probability 12 1.3. Bayesian Inference 15 1.4. Stable Estimation 16 1.5. Stable Estimation and Behrens-Fisher Distributions 17 1.6. Bayesian Counterparts of Hypothesis Testing 21 1.7. Application to the Behrens-Fisher Problem 24 2. BASIC FORMUJLAS 26 2.1. Exact Integral Formulas of the Behrens-Fisher Distributions ~ 26 2.2. Ruben s Integral Forms 29 3. CLOSED FORMS 32 3olo Generating Functions, Recurrence Relations 32 3.2. Closed Forms 38 4. EARLIER NUMERICAL WORK 43 5. NUMERICAL VALUES OF DENSITIES 46 5.1. Need for Numerical Values 46 5.2. Case when Both f and. f2 are Odd 47 5.o3 Case when Either fl or f2 is Even 49 5~4. Harmonic Interpolation in fi f2 50

TABLE OF CONTENTS (Concluded) CHAPTER Page 6. MOMENTS AND CUMUTLANTS OF d 56 6.1. Moments and Cumulants of Student's Distributions 56 6.2. Moments and Cumulants of d 58 7. APPROXIMATIONS 60 7o1. Aim 60 7.2. Comparison with the Nominal Exact Values 61 7.3. The Hermite Polynomial Method 63 7.4. Application of the Delta Method to Integral Forms Due to Ruben 69 8. APPROXIMATION BY ONE DILATED STUDENT'S DISTRIBUTION 73 8.1. Choosing a Dilated Student's t for Good Over-all Fit 73 8.2. Alternate Expressions for f and h 75 8.3. Densities 77 8.4. Percentage Points of d 78 8.5. Improvement over the Approximation for the Percentage Points 81 8.6. Cumulative Probabilities 83 8.7. Choosing a Dilated Student's t to Approximate any Linear Combination of Three Independently Distributed Student's t 83 8.8. Description of Tables 2-6 85 9. APPLICATIONS OF ONE DILATED STUDENT'S t APPROXIMATION 106 9.1 Use of Tabulated Percentage Points 106 9.2. Computation of Densities, Cumulative Probabilities, and Percentage Points 107 9.3. Generalized Behrens-Fisher Distributions 108 APPENDIXES 1. Closed Forms of Cu v(d;a,c) for u,v = 1(1)7 109 2. To Obtain a Generating Function when Either of the Degrees of Freedom is Even 124 3. Program in the "MAD" Language for the IBM 709 Computer to Calculate the Densities ~(difl, f2;G) for Odd Integral Values of fl and f2 129 LIST OF REFERENCES 133 iv

LIST OF TABLES TABLE Page 1. Comparison of the Densities B*(dlflf2;G), as in (5.4.1), Obtained by Harmonic Interpolation in fl,f2 with the Exact Densities ~(dlfl,f2;G) in Terms of Percentage Errors in (d fl,f2;G) 52 2. Comparison of the Approximate Densities with the Exact Densities /(d) in Terms of Percentage Errors in /(d) 87 3. Comparison of the Approximate Percentage Points Due to One Dilated Student's t with the Exact Percentage Points dp in Terms of the Differences Adp and Percentage Errors II(p) in p 92 4. An Empirical Table of Corrections Af to be Added to the Degrees of Freedom f of a Dilated Student's t 95 5. Comparison of the Approximate Percentage Points Due to One Dilated Student's t with f and (f+Af) Degrees of Freedom with the Exact Percentage Points dp in Terms of the Differences Adp, Ad2 and the Percentage Errors II(p), II(p) in p 96 6. Comparison of the Approximate Cumulative Probabilities with the Exact Cumulative Probabilities F(d) in Terms of Percentage Errors 103 7. Comparison of the Sixth and Eighth Cumulants of hd and Student's t with f Degrees of Freedom 104 8. Closed Forms of Cu,v for v = 1, u = 1(1)7 111 9. Closed Forms of Cuv for v = 2, u = 2(1)7 115 10. Closed Forms of Cu v for v = 2, v = 2(1)7 113 11. Closed Forms of Cu v for v = 4, u = 4(1)7 117 12. Closed Forms of Cu v for v = 5, u = 5(1)7 119 13. Closed Forms of Cu for v = 6, u = 6(1)7 121 14. Closed Form of Cuv for u = v = 7 123

ABSTRACT The Behrens-Fisher distributions are important in problems of inference about the difference between the means 41 and 42 of two normal populations of unknown means and variances, or any linear combinations of them. The purpose of this study is to facilitate the computations of the densities, cumulative probabilities, and percentage points of the Behrens-Fisher distributions with a view of making them availablefor everyday practical use, much more than they are today. The Behrens-Fisher variables are the linear combinations of.two independent random variables t$ and t2, where t1 and t2 are distributed according to Student's distribution with fl and f2 degrees of freedom. An extension of the two-t problem to any linear combination of an arbitrary number of t-like random variables leads to generalizations of the Behrens-Fisher distributions. A recognition of the importance of the many-t problem has influenced the direction of this thesis. For one respe'ct in which any method of computing the Behrens-Fisher distributions must be considered is its adaptibility tothe many-t problem as well as the two-t problem. The following approximations of the Behrens-Fisher distributions were undertaken with a view to putting the computation of the densities, cumulative probabilities, and percentage points of these distributions within the range of a user who has access to a desk-calculator and widely available tables such as those of the normal distributions and its derivatives, and of StudentIs distribution: 1. The Hermite polynomial methods. 2. Application of the delta method to integral forms due to Ruben. 3. Choosing a dilated Student's t with width h-L and degrees of freedom fo These approximations were checked against the numerical values published by others, and certain numerical densities which were computed specially for this purpose on the basis of recurrence formulae. Of the three methods, the third proved the most practical. f and h were so chosen that hd has the same second and fourth moments as that of Student's t with f degrees of freedom. This does a good over-all job that makes it practical to calculate the Behrens-Fisher densities, cumulative probabilities, and percentage points. The main conclusion is that the tables of ordinary Student's t distribution provide an excellent basis for most applications of the more general Behrens-Fisher distributions. vii

INTRODUCTION Scope In certain applications of statistics, it is important to calculate densities, cumulative probabilities, and percentage points of Behrens-Fisher distributions and of certain natural generalizations. The justifications that have been proposed for these applications are interesting and controversial, but they are secondary here; for the purpose of this thesis is simply to facilitate Behrens-Fisher calculations for those who wish to apply them. For the problem of inference about the difference between the means of two normal populations of unknown variances, Fisher (1935 proposed solutions based on the Behrens-Fisher distributions. His argument is based on his concept of fiducial probability. Some other frequentists have rejected these solutions because they do not lead to confidence intervals in the Neymsn-Pearson sense (Bartlett, 1936; Welch, 1938; James, 1959). Jeffreys (1940) arrives at the same result, and Bayesians borrow his derivation in the personalistic Bayesian theory of statistics.

The Behrens-Fisher distributions, in a slightly generalized sense, are the distributions of linear combinations of pairs of independent random variables t1 and t2; that is, the distributions of random variables of the form bt2 - atl, where a and b are real and not both zero, and t1 and t2 are independently distributed according to Student's distribution with fl and f2 degrees of freedom, respectively, Writing, /~. \b- Lc~,t, - = C*-, ~-~ (, i CCO- ts,~~ where it suffices, of course, to study the distribution of d = t2cos - t1sin8. In view of the symmetry of the t-distribution, there is no loss in supposing 0 < 0 < fr/2. The Behrens-Fisher distributions will, therefore, often be understood in a narrower sense as the distributions of variables of the form t2cos8 - t1sin9, and this is the usual sense. Heretofore, relatively little has been done to make the BehrensFisher distributions available for practical applications. Their direct computation from first principles is prohibitively difficult, especially

for daily use, except possibly when both degrees of freedom are very small. Some percentage points and cumulative probabilities have been published. (Refer to Chapter 4 for details.) But these are inadequate even for applications that involve only the percentage points and cumulative probabilities, and do not help at all with applications that involve the densities of the Behrens-Fisher distributions. The latter are important in connection with Bayesian statistics. A straightforward table of the Behrens-Fisher densities, cumulative probabilities, and percentage points is not a promising solution for an applied statistician. A table of densities or cumulative probabilities yielding even moderate accuracy with the effort of considerable interpolation would seem to require more than one hundred thousand entries. Nonetheless, this solution ought not be abandoned, because with ingenuity, the scheme might be brought within practical bounds. In fact, as more ingenuity is applied to it, the straightforward table would become less and less straightforward and more and more like other devices to be mentioned next. It might be possible to construct some special tables of practical size which would not be tables of the Behrens-Fisher distributions themselves, but would facilitate the computation of these distributions. Finally, it may be possible to reduce the calculation of Behrens-Fisher distributions to a practical application of tables

that are already widely available. I have pretty much confined myself to this last line of investigation. My main conclusion is that the tables of the ordinary Student's t distribution provide an excellent basis for most applications of the more general BehrensFisher distributions. I report here on my exploration of several possible approximations of the BehrensFisher distributions undertaken with a view to putting the computation of the densities, cumulative probabilities, and percentage points of these distributions within the range of a user who has access to a desk-calculator and widely available tables such as those of the normal distribution and its derivatives, and io Student's distribution. What values of d, a, fl, and f2 must a practical method be able to deal with? Values of Id] > 5 seem unimportant. The entire natural range of B, 0 to r/2, is necessary. It would be desirable to cover all values of fl and f2, but a method confined to values of fl and f2 > 5 would meet almost all practical needs. An extension cf the two-t problem to any linear combination of an arbitrary number of t-ltke random variables leads to generalizations of Behrens-Fisher distributions. Not all methods of calculating the Behrens-Fisher distributions themselves are well adapted to calculating these generalizations. A recognition of the importance

of the many-t problem has influenced the direction of this thesis. For one respect in which every particular method of computing the Behrens-Fisher distributions must be considered is its adaptability to the many-t problem as well as the two-t problem. Conclusion The following approximations were explored: 1. The Hermite polynomial method. 2. Application of the delta method to integral forms due to Ruben. 3. Choosing a dilated Student's t with width h and degrees of freedom f. The comparison of densities in Table 2 shows that the approximations due to the first two methods are not good for large values of d. Of the three methods, the third is the most practical. For the third approximation, f and h were so chosen that hd has the same second and fourth moments as a Student's t with f degrees of freedom. This does a good over-all job that makes it practical to calculate densities, cumulative probabilities, and percentage points of d. An empirical table of corrections Af to be added to f is given n Table 4, so that certain percentage points can be calculated with highler accuracy.

An attempt was made to justify the use of the third approximation for the three-t problem by calculing the higher cumulants of hd and Student's t with f degrees of freedom, where h and f were calculated on the same principle as for the two-t problem. The comparison is given in Table 7. Organization The thesis is so divided into nine chapters that separate chapters are accessible to the reader, depending upon his interest. To begin with, the Behrens-Fisher problem and the solution based on Behrens-Fisher distributions are stated. The criticisms and justifications of this solution by holders of different views of probability are explained briefly (Chapter 1). Next, the basic mathematical formulas (in the integral form) of the Behrens-Fisher densities and cumulative probabilities are covered (Chapter 2). One of the several generating functions that were tried to obtain closed forms of the Behrens-Fisher densities is given. This generating function enabled the writer to calculate certain closed forms and nominally exact numerical values (except for rounding errors in computation) by means of recurrence relations (Chapter 3).

Reference to earlier numerical work is given and that part which is published in widely available tables is reviewed (Chapter 4). Certain relatively exact numerical values of the BehrensFisher densities were computed to serve as the standards against which to check the approximations in which I am ultimately interested. The most time-consuming work for this thesis has been obtaining the recurrence relations and certain numerical values of the Behrens-Fisher densities. For the most part, the study was confined to odd degrees of freedom, since this was the easiest. Just a few numerical values were calculated for even degrees of freedom by numerical integration. This enabled the writer to try harmonic interpolation in f1 and f2, suggested by Fisher, which is the same as direct interpolation in f and f2. The comparison of the interpolated values for the even degrees of freedom from those of odd degrees of freedom with the exact values is given in Table 1 (Chapter 5. The moments and cumulants of Student's t and d are used in some of the approximations that were explored in this thesis. The expressions for the moments and cumulants of d are calculable from those of Student's t, because of the independence of tI and t2 (Chapter 6).

The different approximations that were explored are subdivided into two categories: 1. Asymptotic approximations. 2. Approximation by one dilated Student's distribution. The comparison with the exact values is given in Tables 2-7 (Chapters 7, 8). Finally, the implications of the thesis are exploited to provide a set of practical instructions for computing values of density, cumulative probability, and percentage points of Behrens-Fisher distributions for everyday applications (Chapter 9).

CHAPTER 1 THE BEHRENS-FISHER PROBLEM AND ITS BAYESIAN SOLUTION 1.1. The Behrens-Fisher Problem The Behrens-Fisher problem, or more accurately, a group of problems, of making inferences about the difference between the means of two normal populations, r1 and v2' of unknown and not necessarily equal variances on the basis of independently drawn samples of sizes nl and n2 from v1 and r2' is of some practical importance. Let (Xal,..., Xan) be a sample of size na from a normal population ra with a and variance 2a (a = 1,2). Then, hesCea0.,ndJ )%c) are the sample means and variances. Fisher (1935) proposed a solution to the problems of estimating and testing the difference between the means 1 and 2 based on the Behrens-Fisher distributions, sometimes referred to as

10 distributions of d-variables in the following discussion. A dvariable is a linear combination of two independent random variables, each distributed according to Student's distribution with not necessarily equal numbers of degrees of freedom. A test equivalent to Fisher's was given earlier for a special case by Behrens (1929). Many frequentists have criticized and rejected Fisher's solutions because these do not lead to confidence intervals or significance levels (in the Neyman-Pearson sense) (Welch, 1938; Bartlett, 1936; James, 1959). Scheffe (1943, 1944) and Welch (1947) have offered solutions to the Behrens-Fisher problem, and these, too, have been subjected to severe criticism, as will now be explained. Scheffe generalized an unpublished solution of Bartlett. The solution is explained here for the simplest case n1 = n2. Let x. = xlj - x2j. A confidence interval for g1 - I 2 is obtained by the usual Student technique in terms of xj. Some statisticians object to this solution, because it depends upon the ordering of the observations in the two samples, which amounts to making a random analysis of the data and ignores aspects of the data that might be relevant. The Neyman-Pearson school has also tried to offer solutions free of the objection to Scheff6's. Their nearest approach, it seems,

is that of Welch (1947) in terms of infinite series which define asymptotic confidence intervals. Fisher's argument is based on his concept of fiducial probability and depends on the assumption that nothing whatsoever is known in advance about 1,2 al, and 2. If 1 a are not mutually irrelevant, as they would not seem to be if there is a serious question of their equality, then the required condition, which is at best somewhat mysterious, is not satisfied. For estimation, Bayesians believe that there is a cogent argument for the "fiducial intervals" of Fisher based on the Behrens-Fisher distribution. For testing, they distinguish a variety of cases. In one of these, the two-sided version of the test proposed by Fisher is appropriate; in another, a test based on density, not on cumulative distribution, of the Behrens-Fisher distribution is appropriate. For the problem of estimating the difference of i,1 and AL2, Jeffreys (1940) has arrived at the same conclusion as Fisher, but along a more comprehensible path. In fact, Jeffreys' argument is easily modified into the approximate conclusion of the personalistic Bayesian theory of statistics. A generalization of Jeffreys' argument, which is based on uniform prior distributions of the means and the logarithm of

variances, is inherent in the theory of conjugate families of prior distributions of Raiffa and Schlaifer (1961, pages 54-56). This generalization consists of replacing the uniform prior distributions by other distributions that are analytically convenient and that promise to be somewhat more realistic for some applications. Another important direction of generalization is this: One might be interested not only in the difference between /i1 and /u2 but also in their sum or in any linear combination of them. The approach to this generalized Behrens-Fisher estimation problem is easy, not only for any inear combination of two unlknown means, but of any number of interest. This generalization is easy in principle, but it leads to generalizations of the Behrens-Fisher distributions, which pose additional computational problems. 1.2. Bayes's Theorem and Personal Probability The personalistic Bayesian theory of statistics is called "Bayesian" for the rather secondary reason that it finds more occasions to apply Bayes's Theorem than the frequentistic theory of statistics does. The deeper characterization of a personalistic Bayesian is systematic use of a concept of probability called "personal" or "subjective" probability, on which Bayesians believe that statistics can be founded.

The personal probability of an event to a particular person is a certain kind of numerical measure of the confidence of that person in that event. Thus, the personalistic concept of probability is related in an opertonal sense to the prior opinion about an event. The concept of personal probability is a frankly subjective one. Among the criticisms that have been brought against it, one is, of course, its Lack of objectivity. Another is that personal probability can often be determined only crudely, which leads some critics to say that it must be regarded as a noMnumerical concept. For one introduction to personal probability and for some dlscussion of other views of probability, see Savage (1954). For a discussion of the applicat ian of personal probability to statistics, see Savage et al. (1962). The personal probability of an event for an individual can be thought of roughly in terms of the odds one is prepared to offer in favor of the event and is calculated by the formula, probability = odds/(1 + odds). Personal probability can also be expressed in terms of contingent payments. For example, an individual's personal probability for the event that it will rain today is the price he is prepared to pay for a unit payment to him in case it really does rain today.

The frequency concept of probability defines probability in terms of repetitions of a certain kind of event under certain conditions. This frequency concept is held to be a correct foundation for the theory of statistics by a large number of statisticians. In this concept, the probability of an event is to be determined according to the frequency definition and in no other way. This cuts frequentists off from the application of Bayes's Theorem to problems where the prior probability of the event in question cannot be determined according to their definition. Therefore, frequentists can not speak of the probability of an uncertain hypothesis or the probability distribution of an unknown parameter. They must, on that account, seek to express statistical inference in some other terms. The third main view of probability is, in the nomenclature of Savage (1954), the necessary view. The holders of this view regard probability as a generalization of implication. For them, probability is a logical relationship between one proposition and another; that is, one proposition partially necessitates the truth of the other proposition. Probability here is much like personal probability except for the assumption that one and only one opinion is justified by any body of evidence. Again, though any such description is at best approximate, Harold Jeffreys (1948) can fairly be described as a holder of this view. Some personalistic Bayesians

find his works a source of invaluable material, after making some easy changes. For example, the derivation of the Behrens-Fisher distribution can be modified from Jeffreys (1940) by a Bayesian into a personalistic Bayesian argument for adaption of this solution as an approximation. 1.3. Bayesian Inference The prior (or initial) probabilities of events are available in the personalistic theory of probability, and Bayes's Theorem is applied to generate the posterior probabilities (new opinions). Bayes's Theorem is the truism where H and D are any events, but these letters are chosen to suggest hypothesis and datum, corresponding to the most important application of the theorem. The concept of inference as the change in opinion induced by the evidence according to Bayes's Theorem, together with the principle of stable estimation which will be explained in the next section, illuminates many questions that have been raised about interval estimation, of which the Behrens-Fisher problem is one.

1.4. Stable Estimation This section reviews an approximation of great practical value in the personalistic Bayesian theory of statistics and does much to account for the agreement induced in diverse opinions by a common evidence. This approximation, called "the principle of stable estimation" (Savage et aL, 1962), leads to conclusions that are often in harmony with the classical, or frequentistic, theories of statistics. The principle of stable estimation concerns inference about a continuous parameter, say Mi, which is not necessarily one-dimensional, on the basis of the datum D. According to Bayes's Theorem, [ (I | D) = KPt (- |) 0 (A) $ where K is a normalizing constant, and Pr(DIM) may be a density function, The idea of stable estimation is that if p(p.) is sufficiently diffuse or gentle relative to Pr(D jg), then for many practical purposes p(AID) will be well approximated by Pr(DII). In an important extension of the principle, it is assumed not that p(g) is gentle but that p(A) = f(g)g(A), where g(;j) is a specified function, not necessarily gentle, and f(M) is gentle. This

extension plays some part in the derivation of the Behrens-Fisher distribution in the personalistic Bayesian theory of statistics. 1.5. Stable Estimation and BehrensFisher Distributions Let x = (x1, x2,.., xn) be a sample of size n from a normal population with mean gi and variance a2. Let p(, a2) be the prior density of (A a2) and let p(x,, a2) be the density of the datum x given gu and a2. According to Bayes's Theorem, P(9, ~J }:C) = / P Ct}) # 6 ) ( t C where K is a normalizing constant, We have

where x x/ s - ~ (x; - ) -). Adapting the calculations of Jeffreys (1940) to the principle of stable estimation, Savage has pointed out in an unpublished manuscript that, under suitable circumstances of diffuse prior opinion, the posterior marginal distribution of /z is approximately like that of x + s't, where x, s' are constants, s' = n s, and t is a random variable distributed according to Student's distribution with (n - 1) degrees of freedom. The result agrees with tht obtained by formal operation with the improper prior density p(l, 2) = 2, as defined earlier by Jeffreys. Instead of taking the improper density p(/ =, a= one -2 -2k could take a generalized improper density as a power of a, a, which would lead to the slightly different marginal posterior distribution that A is distributed like x + [(n - 1)/(n - 2 + k)]s't, where t is distributed according to Student's distribution with (n - 2 + k degrees of freedom. This yields the above result when k = 1. Ralffa and Schlalfer (1961, section 3.2.5, pages 54-56) carry this generalization further. The case k = 1 is in harmony with the result in the classical theory of statistics that t = i -,)/s' is distributed like Student's t

with (n - 1) degrees of freedom. The result stated above, though harmonious with the classical result, says something very different. In the classical theory, the exact distribution of (x - g)/x' is a Student's distribution for fixed A. According to the above result in the case k = 1, the posterior distribution of (At - x)/x' is approximately a Student's distribution for specified x and s'. This result is also harmonious with the theories of Fisher and Jeffreys. The result can be extended to two normal distributions with unknown means and variances. Under the assumption that the joint prior distribution of the four parameters., log al, p2 log a2 is gentle, and that the two samples are drawn independently, (giL, al) will be practically independent of (2, 2';) given the datum (i1, s; x2' s', Each Mua is distributed approximately like x + s' ta (a = 1, 2). Thus, g2 - Al is distributed approximately like a constant plus a certain linear combination of a pair of independent Student's t like variables; more specifically, At2 - /u1 is distributed like x2 -x 1 + t2s2'tlS' For some purposes it is useful to reexpress this by saying that (I2 - L1 + 1 - x2)/s* is distributed like the d-variable, d = t2cos 0 - t1 sin 0, where S = ( LS'lI I Sa )L it@- h./51 ~ol8 — SvS v)

20 and t1 and t2 are random variables distributed according to Student's distribution with fl = (nl - 1) and f2 = (n2 - 1) degrees of freedom, respectively. The posterior density p(g2 - 11, s; x2' s) f (2 1 given the datum 1, s, x2, s2), evaluated at / 2 - I1 = 0, is approximately equal to the density (s*)-lp(djfl, f2; ) evaluated at d = (X1 - x2)/s*, where p(djfl, f2; 6) is the density of d given fl' f2' i The preceding paragraph illustrates the convention of using the same letter p to symbolize various densities, so that the function that is meant is determined in part by the letter which is used as its argument. This system is known to have disadvantages, but in this context, all other systems seem to have greater ones. For the problem of inference about the difference in the means,u2 - /1' it is, therefore, important to be able to calculate the densities, cumulative probabilities, and percentage points of d. It is useful, for some purposes, to specify an interval of g2 - Il that has a definite high posterior probability. Such an interval is called a credible interval. Credible intervals are a counterpart of the confidence intervals of the Neyman-Pearson theory, but it is easier and some think more useful to construct credible intervals than to construct confidence intervals. A credible

interval of,u2 - A 1 is any interval of 12 - 1 that has the required posterior probability according to P(j2- I Xl' S'l; x2' s'2) In stable estimation, we approximate posterior distributions, and thus, only approximate credible intervals are used. All known methods of derivation of the Behrens-Fisher distributions lead to the generalized Behrens-Fisher distributions for the problem of inference about any linear combination of an arbitrary number of It's; Fisher's as well as Jefferys' derivations do; the argument of this section can be extended to the generalized problem; 2 caua given the datum X s' is distributed like a linear coma a aa bination of ta, where ca are constants and ta are random variables distributed according to Student's distribution with fa degrees of freedom. 1.6. Bayesian Counterparts of Hypothesis Testing A fairly general theory is presented here first, and then its application to the Behrens-Fisher testing problem is considered. The paper of Lindley (1961) presents many of the ideas that are sketched below. Let x be the datum whose density involves two unknown parameters a and B (not necessarily one-dimensional), and suppose that we want to test the null hypothesis Ho against the alternative

hypothesis H1, where Ho is a hypothesis that implies a = 0 and yields the conditional (upon H0 being true) prior density p(I H0) for ( and H1 is an alternative hypothesis under which one's prior opinion about a, ( is given by p(a, I IH1). Let f2 be the prior odds in favor of H0, Pr ( 4 ) The posterior odds Q(x), after observing x, are According to Bayes's Theorem, r(x I H, klP(H\ where L(x) is the likelihood-ratio, A,~ s, a,

23 De -n Hi ) =JS (K j i W #5)<VA. d a We assume that under H1 the conditional distribution of f given that = O is the same as the distribution of B given Ho, \(IH=) = Q(a \o1 ) OH), and we also assume (x pi,: =4 (x\B,& o,?)~ In many problems this will be exactly or approximately correct. Under the assumption that p(BJH1, a = 0) and p(a, I3IH1) are gentle, the principle of stable estimation is applied to Num. and Den. separately to obtain L(S) -N~a~~=- ______J __________ A A'ark Ilk), Il(kfid=o A t)(d so X, H ) where p(a = OIx, H1) is the approximate marginal posterior density of a given x and H1; a, /3 may be taken at the posterior

24 expectations of a, 3 under H I and %g may be taken at the posterior expectation of, under a = 0 and H1; _ is "approximately equal to." Under the assumption that p(a, [ H1) is gentle, a and j given H1 are approximately independent, and we could write and we get LCr (x.eFCI I o -o, 1,) (o0j H I ) Further we assume that I,% are very close and Then, 1.7. Application to the Behrens-Fisher Problem In the Behrens-Fisher problem, p(a = OIx, H1) is the approximate posterior density of (/t2 - IL1) = D evaluated at D = 0, which is equal to (s*)- p(dlfl, f2; 0) evaluated at d = (Xl -x2)/s*, as defined in section 1.5, (,vI.. (S) W d =(XI-X cK(3Hft8

25 If D is very close to zero, as it will be in many applications, we have another approximation (Q (x() _ cs-' <(a cX X' In both the approximations, the densities of the d-variable are needed. The only difference in the two approximations is that in one the density is evaluated at D = D, and in the other, at D = 0, If D is taken as the approximate posterior expectation (xi - 1 2) and if this is close to zero, as it will be in typical applications, both the approximations are almost the same.

CHAPTER 2 BASIC FORMULAS 2.1. Exact Integral Formulas of the Behrens-FIsher Distributions Let where t. and t2 are random variables distributed according to Student's distribution with fl and f2 degrees of freedom. Let: F(dfl,, f e0) = F(d) = be the cumulative probability of d,,(dlfl, f2; 0) = +(d) = be the density of d, T(tlf) be the cumulative probability of Student's t with f degrees of freedom, r(tlf) be the density of Student's t with f degrees of freedom. Then, (2.11) + Ko kaA ( d)cos 26 26

The density of a Student's t with f degrees of freedom is well known to be where K(O) = P (f+ i) *(d) can be written' (2.1.2) (d)= tK)( t *foO MO\G)U CWcok] where c) 2 = S;n t/)

28 In still other terms, (2.1.3) 4,(c\ = t((QK&,(S C\ V (A d, ) ), where + 00v c,~,,V (; S i, c)- t (- eta si' +Wa h. lcot 9>,2]A W where (The factor-, though artificial here, will save us from writing r repeatedly later.) Thus, the problem of computing 0(d) is basically the problem of computing Cu V(d; a, () with respect to d. The computation of C is technically an evaluation of eleUv mentary integrals or elliptic integrals, depending upon the parity of fl and f2. Specifically, when both fl and f2 are odd, the integrand is rational in w and the integral is an elementary one. When just one of the indices fl and f2 is odd, the integrand is rational in w except for the square root of a quadratic function in w, so it can

29 be evaluated in terms of inverse trigonometric functions. When both indices are even, the integrand is a rational function in w except for the square root of a quartic function in w, so the integration reduces to the evaluation of the complete elliptic integrals. The integration is comparatively easy when both fl and f2 are odd, but even then it is actually quite complicated, unless fl and f2 are small. Devices to bring order into the calculations are much to be desired. In particular, various generating functions were tried, and are reported in section 3.1, although only one of them seemed to be helpful. The extension of the Behrens-Fisher problem to several means leads to a linear combination of an arbitrary number of Student's t variables, d = Z cta, where ca are constants and ta are random variables distributed according to Student's distribution with f degrees of freedom. The study of the exact distribution of this a generalized d involves the evaluation of multiple integrals; and except for the odd degrees of freedom, there seems little hope of evaluating them in closed forms suitable for our computing purposes. 2.2. Ruben's Integral Forms Ruben (1960) has obtained integral forms expressing a dvariable as the ratio of two independent random variables, the

30 numerator of which is a Student's variable and the denominator a function of a beta variable, & (\,V1; a) =d= t3/,%,, where t3 is a random variable distributed according to Student's distribution with (fl + f2) degrees of freedom, and x is an independent beta variable with parameters f1/2, f2/2, and O(X) =C x'. CosG + C\- x)- -'L ] Thus, (2.2.1) C( a /,,- K; () ____ ('). - c)' _'- 2. 2 where,f P, P ) = F() r()/ r(p C \I We have (2.2.2) F (d' =]'-F(c~(,) ~ (~c,,\X X -Iz )i^- & 2,.

The form (2.2.1) is a relatively simple tautological reformulation of (2.1.1). The main reason for studying the formula (2.2.1) in addition to the formula (2.1.1) was the hope that the former would be adapted to the approximation method, sometimes called the "delta method," as will be reviewed in section 7.4, and the former might be better adapted to numerical integration than the latter. Neither of these hopes was actually particularly justified. An apparent advantage of (2.2.1) is that the range of integraion is only from 0 to 1, but this may be more than offset by rather violent behavior of the integrand in that interval.

CHAPTER 3 CLOSED FORMS 3.1. Generating Functions, Recurrence Relations Both fi and f2 are odd; u,v are integers The idea of obtaining the coefficients C v in a closed and compact form by means of a generating function was tried. Several generating functions were considered, and the one given below was found to be the simplest and the only one that seemed to be of use. x(3.1.1)'Y)== C4,x Y where o ~ xy \ c oo u v Since, for each u, v, and w, the integrand is at most x y, we can interchange the order of summation and integration to obtain 32

33 This integral is quite elementary and has been thoroughly checked by deriving it by a number of methods. From a statistical point of view, it is interesting to look at it as the convolution of two Cauchy distributions, which is again a Cauchy distribution. The result is (3.1.2)'r(x,y) + )_ j a^R20al O -x)1/+B (_Sz2) e)2 where e _ d(tt 9 c ct 9) - d/S+ 9 CoSt Therefore, o oC v~~~l r ~~~~t sl C~ ~ ~

34 u V By equating the coefficients of x y, the following recursion formulas have been obtained in terms of M, a r, br, where 2( (1-6S (zY-5)t2Y-3) K - -2-4 68 ST =- - zC- l Case (1), v = 1. =.-z~~__ ~, _.. (3.1.4) CuL 3K~ Ccee>)Cl | riIi COA < Case (2), u= 1 (3.1.5) Cas \),,,V Case (3), u,v > 1 L_'S~~~~V\

35 - 2.9'=C rVQ-rCA - ~ C5 CAkv j These recursion formulas are straightforward and they have been thoroughly tested by their numerical implication. Some thought was given to the development of +(x,y) as a power series in x and y. 4(x,y) can be re-expressed as: and y, where Ac\(~-x) +S 1\-y~ where The basic problem is to express (x,Y) as a power series in x and y, where R c \- X)

36 Some ideas that occurred to me are these: where Im(z) is the imaginary part of z. This largely reduces the problem to developing in a power series in x and y. A different direction is this: ~, (X y) - A(<-x)'> + (,- y)/ I + AM I B2- y- _ B-y+ 2 AB (- X )Y (s- ~/2='- A~x - x)/ + B+ - 4\' LD'_- A x - B", + 2_~,1 (t- xZv~ (A _ A w/

37 where )Z =, +A2 It is very easy to express Num. as a power series in x and y, but it is not so easy to expand D as a power series in x and y. Den. Some thought was given to the function X! a C x'.-~)1 (V- ) This leads to the evaluation of the integral )0 [+8(, -dton U ((w-dt+ C A cot 9)'f Another slightly different generating function is ~'- ~= XI ClV X. This leads to the evaluation of the integral

38 \ 5 VUX( \+Q,\A;X I1Gp' Ac c,Both look hopeless. 3.2. Closed Forms Closed forms of C v(d; a, B) have been obtained for u,v = 1(1)7 by starting with C11 = (a + 3)M and using the recurrence relations (3.1.4), (3.1.5), (3.1.6). These expressions were checked by calculating the zeroth and second moments of d by using 0(d) = K(fl)K(f)r CuCv as the density of d, and comparing them with the known zeroth and second moments of d: 70 ~ CosL +(V X ) (1A- L7 These moments are easily obtained since t1 and t2 are independent and the second moment of a Student's t with f degrees of freedom is f/(f - 2) for f > 2. The closed forms are also strongly guaranteed because numerical results derived from them agree with results obtained on a very different basis.

39 One interesting way to express Cupv obtained from the recurrence relations is for u > v, (3.2.1) C W(d; L where L. is a polynomial in a and f3. C is a linear comj;u,v u,v bination of u dilated Student's densities with different widths and degrees of freedom ranging from (2v - 1) to 2(u + v - 1) - 1. For simplicity, writing Lj;,v = L, -V CjZ where 7(tjf) is, as before, the Student's density with f degrees of freedom, and e. = (2j - 1)1/2/(a + +)flf2 From this, and (2.1.3), and

40 (3.2.3) E(4j(d Aca 4(( ~)o where T(tjf) is, as before, the cumulative probability of Student's t with f degrees of freedom. The coefficients Lj;uv' polynomials in a and i, do not seem sufficiently regular to be written In a closed and compact form covering all values of the indices; at least any such form was not discovered while studying the closed forms for u,v = 1(1)7, given in the Appendix. The density and cumulative probability of Behrens-Fisher distributions are linear combinations of dilated Student's densities and cumulative probabilities. If the polynomials Lj were not j;u,v complicated and the number of terms (= max(u,v)) not potentially numerous, then this would be a very satisfactory way to compute Behrens-Fisher densities and cumulative probabilities when fl and f2 are both odd, though not percentage points. The idea is usable for special purposes, especially for smal values of u and v, though it is not promising for everyday practical use. Since, if fl and f2 are odd, the Behrens-Fisher density is a linear combination of dilated Student's densities with odd degrees

of freedom, the same will be true of the generalized Behrens-Fisher density, the density of any linear combination of an arbitrary number of independently distributed t-like random variables with odd degrees of freedom. Thus, the task of computing the density of such a generalized Behrens-Fisher distribution is elementary in the technical sense and, in fact, it can be done with sufficient time, energy, and patience. The generating function (3.1.1) does not touch the case of even values of fl and f2. Analogous generating functions were derived, but they are much more cumbersome because they involve inverse trigonometric functions and in the worst case, elliptic integrals, and seemed of very little help for our purpose. Their derivation is sketched in the Appendix. Finally, interest in several terms of in a linear combination of several independent Student's t variables is almost as great as in two. The formula for odd degrees of freedom does generalize in principle, but threatens to become even more cumbersome, and the limitation to odd values becomes even more severe. So, formula (3.2.1), and its possible generalizations, is far from providing the man in the laboratory easy access to densities or cumulative probabilities of the Behrens-Fisher distributions and their generalizations.

42 Computation of percentage points would still remain a problem even where the formula (3.2.1) is usable (whether because the indices are small or because the computing facilities are large). So, for all these reasons, it is important to find simple, easily implemented approximations.

CHAPTER 4 EARLIER NUMERICAL WORK Abstracts of the tables of the Behrens-Fisher variable d are given by Greenwood and Hartley (1962, pages 232-36). Only those that are published in widely available tables are reported here in detail; others are just mentioned. Let: F(d), as defined before, be the cumulative probability of d, d be the loop-percent point of d (two tailed). Behrens (1929): Behrens gives F(d) for some values of d and f, where fl f2 = f, and s1/S2, where tan 6= s=/s2. Sukhatme (1938): The values of the 5-percent and 1-percent points of the distribution of d for values of 0 differing by 15 degrees and for all twenty-five combinations of f1,f2 in the harmonic series 6, 8, 12, 24, a) were tabulated by Sukhatme, at Fisher's suggestion. Sukhatme determined these percentage points to three decimal places, by direct numerical integration. d are tabulated for 43

44 flf2 = 6 8, 12, 24, Ao; 0 = 0~ (150) 90~; p = 0.05, 0.01. Errors in the third place were discovered by Fisher (1941), and these values have been revised by Sukhatme et al. (1951) to correct this error. These revised values have been printed in Fisher and Yates tables (1948, Table V1; 1957, Table VP). Fisher (1941): Fisher gave asymptotic expansions in terms -1 -1 of the powers of f and f2 for calculating the cumulative probability and the percentage point in any particular case, and a further range of tables when either fl or f2 is large. He calculated dp to three decimal places for fl = OD, f2= 10, 12, 15, 20, 30, 60, co; o = 0~ (10~) 90~; p = 0.1, 0.05, 0.02, 0.01, 0.005, 0.002. These are reprinted in Fisher and Yates (1943, 1948, 1953) as Table V2 and in 1957 as Table VIZ 2

45 Chapman (1950): Chapman gives F(d) for certain values d, f, where fl f2 = i, and 8 3 45'; he also gives certain percentage points dp, and for fi = f2 " 11, a normal approximation. Fisher and Yates (1957, Table VI1): Based on the formulas of Fisher and Healy (1956), Fisher and Yates calculated dp to 5 decimal places for fI = 1(2)7, f2 = f1(2)7; 8 = 0~ (150) 90~; p = 0.1, 0.05, 0.02, 0.0o.

CHAPTER 5 NUMERICAL VALUES OF DENSITIES 5.1. Need for Numerical Values This chapter is interesting not for the user of the BehrensFisher distributions as such, but only for someone who wants to compute them accurately on high-speed computing equipment, especi lly with a view to doing further research. It seemed impractical to evaluate the errors of various methods of approximations which came up for consideration analytically, so a supply of nominally exact values was found necessary; that is, values exact up to certain significant figures, to check proposed approximations against. For the most part, the study was confined to odd values of fl and f2, since these were easier to compute and seemed adequately representative except, possibly, for very small numbers of degrees of freedom. 46

47 5.2. Case when Both fI and f2 Are Odd Several methods of obtaining numerical values presented themselves; the ordinary numerical integration, the closed forms, are available, as well as recurrence relations. It was thought possible to proceed reliably and with less computing time by the recursion formulas than by numerical integration. Closed forms may be convenient sometimes for computation for small odd values of fl, f2; but they become very inconvenient for machine computation with substantial values of fl and f2, because it does not seem easy to have the machine compute the polynomial coefficients L. nor are they, by any means, easy to compute by hand. Certain numerical values were computed by writing a program (in the MAD language) for the IBM 709 computer on the basis of the recurrence relations (3.1.4), (3.1.5), (3.1.6), and (2.1.3). The densities *(dfl1, f2; 0) were calculated to seven significant figures, not to be taken seriously after six significant figures, for f2 = 1(2)9, fl = f2(2)19; 0 = 00 (7.5~) 90~; d = 0(0.2)7.

48 It would not be practical or useful to present a table of 18720 values here. The numerical values are punched on cards, and copies of these cards could be provided at cost. Any potential user would find the program on the basis of which they were computed far more useful and economical; this is given in the Appendix. The following checks were made for the machine (IBM 709) calculated values. Certain numerical values were computed on a desk calculator from the closed forms given in Tables 8-14, and formula (3.2.2), and also for the special case A, = At Z4 = 450 d = O (5.2.1) (dO F, F 450) = ( (__)) ) _F l K(2PF,+ 1) 2,+ 2f1 1/2 K(fl) are available up to six decimal places. 2f 1) (2f1 + was computed up to nine decimal places, and o(d = Olfl, fl; 45~) were computed for fl = 1(2)9. For the closed forms, each operation was carried to nine decimal places, and *(dIfl, f2; 0) were computed for f2 = 1(2)5, fl = f2(2)5; 0 = 300, 600; d = 0(1)5. In both the checks, the machine values and the desk-computed values agreed to six significant figures.

49 The number of operations in the program for calculating *(dJfl' f2; 9) do not change as d and 0 change. According to the three recurrence formulas (3.1.4), (3.1.5), (3.1.6), and (2.1.3), *(dlfl' f2; 0) is the sum of uv - 1 terms, where u = (f1 + 1)/2, v = (f2 + 1)/2, and not both u,v = 1, in which case *(dJl, 1, 8) is just one term. The agreement of the decimal places of the machine-calculated and desk-calculated values did not change when the number of terms increased from 1 to 24 as fl and f2 increased from fl = f2 = 1 to f = f2 = 9. This is quite encouraging and it appears that the same might be true when the number of terms increased from 24 to 49 as fl and f2 increase from fl = f2 = 9 to fI = 19, f2 = 9. This is all that can be said about the accuracy of the numerical values that were computed on the IBM 709 machine on the basis of the recurrence relations. 5.3. Case when Either fl or f2 Is Even Certain numerical values of the densities were calculated for this case by using Ruben's integral form (2.2.1) and the numerical integration method in the book by Hildebrand (1956, formula (3.6.2), pages 71-78). These were obtained by writing a program (in the MAD language) for the IBM 709 computer.

The program was so written that the value of the integral was recomputed with twice as many points in the net until two successive values were the same to four significant figures. The formula and the program were checked in two ways. A few values were calculated by this method for fl = f2 = 3, and checked against the hand-computed values. Secondly, a few values were calculated by using the formula (5.2.1) for the special case f = f2, 0 = 450, and d = 0. Both these checks indicate that the values on the high-speed computer appear to be correct to four significant figures. These numerical values *(dlfl, f2; 0) were computed for the following cases: (fl' f2) = (2,2); 0 = 15~; d = 0(1)3; (fl' f2) = (4,4); 0 = 150 (150) 45~; d = 0(2)6; (fl, f2) = (6,6), (8,8); 0 = 15~, 45~; d = 0(2)6; (fl' f2) = (7,6), (8,6); 0 = 150 (30~) 75~; d = 0(2)6. These values are given in Table 1. 5.4. Harmonic Interpolation in fl, f2 The numerical values obtained in section (5.3) were compared with the values calculated by harmonic interpolation in fl and f2, as

suggested by Fisher. The harmonic interpolation was found better than the direct interpolation. Harmonic interpolation in fl, f2 is the same as direct interpolation in f, f2 This was studied to find out whether the numerical densities for even degrees of freedom can be calculated from those of odd degrees of freedom by harmonic interpolation in fl' f2. The interpolated value, f2 is calculated according to. the formula (5.4.1),\ t4 +\,t\ 4t\ where The comparison of the interpolated values ~*(dIfl, f2; 0) with *(dlfl, f2; 6) is given in Table 1, in terms of the percentage error, which is calculated as ce~ V (\ L -. (...\$\.4X.. - ---....

52 TABLE 1 COMPARISON OF THE DENSITIES #*(dlfl,f2;0), AS IN (5.4.1), OBTAINED BY HORMONIC INTERPOLATION IN fl,f2 WITH THE EXACT DENSITIES #(dIfl,f2;8) IN TERMS OF PERCENTAGE ERRORS IN #(dIfl,f2;0) (d).Pet. Pct. Error 6 d *(d) Error fl= =2 00 0 0.3536 + 0.48 150 0 0.3251 + 0.31 1 0.1925 + 1.25 1 0.1952 + 0.51 2 0.06804 - 2.19 2 0.07423 - 2.96 3 0.02741 - 8.11 3 0.03004 - 7.33 fl f2 2 4 0~ 0 0.3750 + 0.023 150 0 0.3627 - 0.030 2 0.06629 - 0.44 2 0.06936 - 0.28 4 0.006708 - 1.02 4 0.006625 + 0.60 6 0.001186 + 5.06 6 0.001112 + 6.31 300 0 0.3474 + 0.031 450 0 0.3417 + 0.032 2 0.07436 - 0.54 2 0.07631 - 0.66 4 0.006627 + 0.75 4 0.006675 + 0.90 6 0.0009874 +11.81 6 0.0009334 +13.61 f' f2 = 6 0~ 0 0.3827 + 0.0039 15~ 0 0.3752 + 0.0021 2 0.06404 - 0.13 2 0.06593 - 0.10 4 0.004055 + 0.17 4 0.003879 + 0.41 6 0.0004220 + 5.21 6 0.0003769 +15.12

53 TABLE 1 (Continued) Pct. Pct. 0 d O(d) 0 d (d) Error Error fl =f2 =6 450 0 0.3596 + 0.0011 450 4 0.003460 + 1.45 2 0.07118 - 0.16 6 0.0002357 +11.58 f =7, f =6 00 0 0.3827 + 0.0039 150 0 0.3761 + 0.0050 2 0.06404 - 0.13 2 0.06559 - 0.12 4 0.004055 + 0.17 4 0.003846 + 0.26 6 0.0004220 + 5.21 6 0.0003747 + 6.11 450 0.03623 + 0.0031 750 0 0.3777 - 0.0011 2 0.07025 - 0.071 2 0.06505 + 0.025 4 0.003064 + 0.85 4 0.003165 + 0.19 6 0.0001826 + 7.33 6 0.000239 + 0.29 i! =8, f2 =6 f1'20 00 0 0.3827 + 0.0039 150 0 0.3768 + 0.0011 2 0.06404 - 0.13 2 0.06536 - 0.12 4 0.004055 + 0.17 4 0.003826 + 0.29 6 0.0004220 + 5.21 6 0.0003735 + 6.10 450 0 0.3643 - 0.0051 750 0 0.3796 - 0.0080 2 0.06951 - 0.12 2 0.06431 - 0.025 4 0.002796 + 1.29 4 0.002648 + 0.57 6 0.0001541 +10.77 6 0.0001585 + 5.05

54 TABLE 1 (Continued) 0 d +(d) Pt. d +(d) Pct. Error Error f =f = 8 0~ 0 0.3867 + 0.0010 150 0 0.3813 + 0.0012 2 0.06237 - 0.053 2 0.06371 - 0.016 4 0.002756 + 0.33 4 0.002597 + 0.50 6 0.0001800 + 4.44 6 0.0001561 + 5.06 450 0 0.3690 + 0.0013 450 4 0.002133 + 0.80 2 0.06771 - 0.015 6 0.00007675 + 8.46

55 Conclusion The comparison in Table 1 shows that harmonic interpolation works very well for 0 < IdI < 4. The maximum percentage error in this range, for the cases considered, is 1.45 percent for fl, f2 > 4. the interpolation gets worse as the densities get smaller and the maximum percentage error for d = 6, for the cases considered, is 15.12 percent.

CHAPTER 6 MOMENTS AND CUMULANTS OF d 6.1. Moments and Cumulants of Student's Distributions Owing to the symmetry of Student's distribution, its odd th moments are zero. The 2v moment of a Student's distribution with f degrees of freedom is _. 4 ) (s - r (Cram6r, 1945, page 239.). In particular, the second, fourth, and sixth moments are (a (.. -- ) _3 = 3 (F-2) (F-4) (- I) 56

57 Y, = 1543 15 Using the relation between moments and cumulants given in Cramer (1945, page 187), the second, fourth, and sixth cumulants are (4 _ 2) (I z - s ) ) 240, e240 and (6.1.2) t = $ 6.2. Moments and Cumulants of d As defined before, C - tXc- ts )

58 where C- CoS 8 D -," ~ Since tl and t2 are independently distributed according to Student's th distribution with fl and f2 degrees of freedom, the 2v cumulant of d is (6.2.1) (K f C ) ath 1/2 c and the 2v cumulant of d/K2 (fl f2 C) S) is (6.2.2) 4 (,, Sc5 =' - The I2v are calculated by substituting the expressions for (fl)' 2f2)' Expand d, 6 in ascending powers of fl f 2 and (flf2). By retaining the terms only up to the order of -2 -2 fl f2,- (flf2) 1 the I4 and 16 are (6.2.3).6e ( e-+ 6. ~. - (&2.4)'4- / o 6 s

59 Once again, using the relation between cumulants and moments, or proceeding directly, the second, fourth, and sixth moments of d are ML. (, T.. _C)S) s) ___ _ - 5Ss, q. cS A-. C 3'S) 3 k C-)( C C -~ —-2.4

CHAPTER 7 APPROXIMATIONS 7.1. Aim Several approximations were explored in a search for methods that would enable a person with a modest kit of equipment, such as standard statistical tables (like Fisher and Yates), a slide rule or a desk-calculator, and perhaps some special table, to compute Behrens-Fisher densities, cumulative probabilities, and percentage points. Different applications require different levels of accuracy and, of course, higher levels of accuracy might be expected to require a more elaborate kit of equipment as well as more computing time. The ideal is to cover all degrees of freedom beginning with (1,1), when the distribution of d is a Cauchy distribution. In practical applications, very rough accuracy might be adequate. Someone who has an occasion to say that all but 1% of his posterior probability is in a certain interval will not ordinarily be much harmed if only all but 1-1/2% or 2% is in it, nor would he, perhaps, be much harmed if the ostensible 95% interval is a 95-1/2% 60

interval. Similarly, where densities are to be applied in testing hypotheses, a factor of even 2 or 3 in either direction might be quite unimportant, and errors of as little as 10% or 15% would be negligible. A major part of the practical problem will be solved if a good approximation is obtained for the cases in which the two degrees of freedom fl, f2 are greater than 10. Then, one could look for an approximation which is, perhaps, slightly more complex for the cases in which fl and f2 are smaller. Several ideas were tried to achieve this aim, and are divided into two categories: 1. Asymptotic approximations. 2. Approximation by one dilated Student's distribution. 7.2. Comparison with the Nominal Exact Values These approximations were explored by checking them against some of the nominal exact values of the densities which were computed for this purpose. Only approximation (2) was explored further by checking it against some of the nominal exact values of the cumulative probabilities and percentage points that are available, for the simple reason that this was found to be most satisfactory and

62 practical for the densities and therefore, might work for the cumulative probabilities and the percentage points as well. The merit of any approximation depends upon how well it works for the densities, cumulative probabilities, and percentage points. If it works satisfactorily for the densities and percentage points, we can expect it to work for the cumulative probabilities, too. In our exploration, the comparison of the densities (cumulative probabilities) is given in terms of the percentage error in density (cumulative probability) calculated according to the formula, (7.2.1): (YC.)Y - exoec. &sA>oo The comparison of an approximate loop-percent point, d'p, with the exact loop-percent point, dp, is given in terms of the percentage error in the level p. This is calculated by considering the probability Ap contained in the interval [d'p, dp] and according to the formula (7.2.2) 4. e, o' = _L _)_

It is not possible to obtain the exact nominal Δp, and hence we can compute only an approximate percentage error in p. Asymptotic approximations are considered in this chapter, and the approximation by one dilated Student's distribution is considered in Chapter 8. The following asymptotic approximations are explored in this chapter:

1. The Hermite polynomial method.
2. Application of the delta method to integral forms due to Ruben.

7.3. The Hermite Polynomial Method

Fisher (1925) gave an asymptotic expansion in powers of f^{-1} for the Student's density with f degrees of freedom. Then, Fisher (1935) considered the convolution of the two component densities to give an asymptotic expansion in powers of f1^{-1}, f2^{-1} for the cumulative probability and percentage points of d in any particular case. His method can very easily be extended to calculate the density of d, and also to the generalized Behrens-Fisher distributions. His method yields the same result as is obtained by applying the Hermite polynomial method directly to d to approximate the Behrens-Fisher distribution.

The density of a random variable x can formally be expressed in terms of the normal density, Hermite polynomials, and cumulants of x. The density of a random variable x with odd cumulants zero is represented formally by

(7.3.1)    \phi(x) = \exp\Big\{\frac{\kappa_2 - \sigma^2}{2!}\,D^2 + \sum_{r=2}^{\infty}\frac{\kappa_{2r}}{(2r)!}\,D^{2r}\Big\}\;\frac{1}{\sigma}\,v\Big(\frac{x}{\sigma}\Big),

where

v(y) = \frac{1}{\sqrt{2\pi}}\,e^{-y^2/2}

and D denotes differentiation with respect to x. This development is called "the Edgeworth form of the type A expansion." Refer to Kendall (1943, Vol. 1, section 6.24, pages 156-57). The use of the series (7.3.1) has been severely criticized by several authors, who have objected to it on mathematical as well as statistical grounds. Refer to Kendall (1943, pages 152-53). In practical applications, however, the important question is not whether an infinite series represents a density, but whether a finite number of terms does so to a satisfactory approximation. Even when the infinite series diverges, its first few terms may,

nonetheless, give a satisfactory approximation. This subject has not been fully explored for the present problem (Wallace, 1958).

Suppose κ_{2r}, for r > 1, is of order n^{-(r-1)} and κ_2 - σ^2 is of order n^{-1}, where n is some parameter. Put

\gamma_2 = \kappa_2 - \sigma^2, \qquad \gamma_{2r} = \kappa_{2r} \quad (r \ge 2).

Then (7.3.1) is

\phi(x) = \exp\Big\{\sum_{r=1}^{\infty}\frac{\gamma_{2r}}{(2r)!}\,D^{2r}\Big\}\;\frac{1}{\sigma}\,v\Big(\frac{x}{\sigma}\Big).

Expand the operator and retain terms up to and including O(n^{-2}) in the γ's. The result of this operation is obtained by replacing the operator D^r by (-1)^r \sigma^{-r} H_r(x/\sigma) and multiplying by \sigma^{-1} v(x/\sigma). Then the density of x is

(7.3.2)    \phi(x) \doteq \frac{1}{\sigma}\,v\Big(\frac{x}{\sigma}\Big)\Big[1 + \frac{\gamma_2}{2\sigma^2}\,H_2\Big(\frac{x}{\sigma}\Big) + \Big(\frac{\gamma_4}{24\sigma^4} + \frac{\gamma_2^2}{8\sigma^4}\Big) H_4\Big(\frac{x}{\sigma}\Big) + \Big(\frac{\gamma_6}{720\sigma^6} + \frac{\gamma_2\gamma_4}{48\sigma^6}\Big) H_6\Big(\frac{x}{\sigma}\Big) + \frac{\gamma_4^2}{1152\sigma^8}\,H_8\Big(\frac{x}{\sigma}\Big)\Big],

where the H_r are the Hermite polynomials; specifically,

H_2(x) = x^2 - 1,
H_4(x) = x^4 - 6x^2 + 3,
H_6(x) = x^6 - 15x^4 + 45x^2 - 15,
H_8(x) = x^8 - 28x^6 + 210x^4 - 420x^2 + 105.

This is an approximation of Edgeworth's series (7.3.1) in the sense that only a few terms of the series are considered. The terms v(z)H_r(z) have been tabulated in Aiken et al. (1952).

We obtain two different approximations H^(1) and H^(2) by application of (7.3.2) to the Behrens-Fisher density, due to two choices of σ: σ = 1 and σ = κ_2^{1/2}.

Approximation H^(1)

We take x = d, σ = 1. Then γ_2 = κ_2 - 1 and γ_{2r} = κ_{2r} for r > 1, and these are calculated from the formulas (6.2.1). From (7.3.2) we obtain,

(7.3.3)    \phi(d) \doteq v(d)\Big[1 + \frac{\kappa_2 - 1}{2}\,H_2(d) + \Big(\frac{\kappa_4}{24} + \frac{(\kappa_2-1)^2}{8}\Big) H_4(d) + \Big(\frac{\kappa_6}{720} + \frac{(\kappa_2-1)\,\kappa_4}{48}\Big) H_6(d) + \frac{\kappa_4^2}{1152}\,H_8(d)\Big].

This approximation H^(1) is the same as that of Fisher's method extended to densities.

Approximation H^(2)

We take x = d, σ = κ_2^{1/2}. Then γ_2 = 0 and γ_{2r} = κ_{2r} for r > 1, so that the coefficients needed in (7.3.2) are the relative cumulants λ_{2r}, calculated according to the formulas (6.2.2). From (7.3.2), we have

(7.3.4)    \phi(d) \doteq \kappa_2^{-1/2}\, v(y)\Big[1 + \frac{\lambda_4}{24}\,H_4(y) + \frac{\lambda_6}{720}\,H_6(y) + \frac{\lambda_4^2}{1152}\,H_8(y)\Big],

where y = d\,\kappa_2^{-1/2}.

Conclusion

Certain numerical values of φ(d) were computed according to the approximations H^(1) and H^(2), and the comparison with the nominally exact values is given in Table 2, in terms of the percentage errors

Π_H(1) and Π_H(2). The comparison in Table 2 shows that both approximations can be safely used for the range [0, 3] of |d| for sufficiently many degrees of freedom, say f1, f2 ≥ 9. Both approximations are about equally good or bad. From 3 onward, they are not good. For this reason, more numerical values were not computed.

The Gram-Charlier series of Type A

The density of a random variable x can be expressed formally as

\phi(x) = \sum_{r} c_r\, H_r(x)\, v(x).

(Refer to Kendall, 1943, Vol. 1, section 6.23, pages 147-50.) Multiplying by H_r(x) and integrating from -∞ to +∞, we have, in view of the orthogonality relation,

c_r = \frac{1}{r!}\int_{-\infty}^{\infty} \phi(x)\, H_r(x)\, dx.

Application of this formula to y = d κ_2^{-1/2} gives, considering only the first three terms, c_0 = 1, c_4 = λ_4/24, c_6 = λ_6/720,

and

(7.3.5)    \phi(d) \doteq \kappa_2^{-1/2}\, v(y)\Big[1 + \frac{\lambda_4}{24}\,H_4(y) + \frac{\lambda_6}{720}\,H_6(y)\Big].

The two formulas (7.3.4) and (7.3.5) are the same except for the term λ_4^2 H_8(y)/1152. No numerical values were computed using the formula (7.3.5), because it was thought of quite late.

7.4. Application of the Delta Method to Integral Forms Due to Ruben

The delta method is usually applied, and works satisfactorily, when the integral of the product of two functions, one of which is sharp and the other gentle, is to be evaluated. In Ruben's integral form (2.2.1) for the density of d, neither of the two functions is gentle. Hence, we do not altogether expect the delta method to work. However, some numerical values of φ(d) were calculated by the delta method, and empirically the approximation by the delta method seems to be quite good in the interval [0, 3] of |d| if f1 and f2 are fairly large.

Ruben's integral form (2.2.1) for the density of d can be rewritten as

(7.4.1)    \phi(d) = \int_0^1 g(x)\,\beta(x)\,dx,

where

\beta(x) = \frac{\Gamma\big(\tfrac{f_1+f_2}{2}\big)}{\Gamma\big(\tfrac{f_1}{2}\big)\,\Gamma\big(\tfrac{f_2}{2}\big)}\; x^{f_1/2-1}(1-x)^{f_2/2-1}

is the density of a beta variable with parameters f1/2, f2/2. We expand g(x) in a power series around the mean x_0 of the beta variable with parameters f1/2, f2/2, where

x_0 = \frac{f_1}{f_1 + f_2}.

Consider the first few terms (usually two or three) of this series. The delta method is to integrate these few terms times the beta density and to expect this to approximate the exact integral. Then,

approximately,

(7.4.2)    \phi(d) \doteq g(x_0) + \tfrac{1}{2}\,g''(x_0)\,\mu_2 + \cdots,

where \mu_2 = \frac{2 f_1 f_2}{(f_1+f_2)^2 (f_1+f_2+2)} is the variance of the beta variable with parameters f1/2, f2/2, and the derivatives g'(x_0), g''(x_0), ... required in (7.4.2) are given explicitly by (7.4.3).

Conclusion

Some numerical values of φ(d | f1, f2; θ) were calculated by using the formula (7.4.2) and compared with the nominal exact densities. Table 2 shows the comparison of the approximation with the exact densities of d in terms of the percentage errors Π_D in density. It is seen that the approximation works quite well in the range [0, 3] of |d| for f1, f2 ≥ 7. The approximation is of little use for |d| > 3.
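For readers who want to experiment with the H^(2)-type correction of section 7.3, the following Python lines are a minimal sketch: the Hermite polynomials are those listed above, the coefficients follow the standard type A series for a symmetric variable (so they may differ in detail from the thesis's own grouping), and the relative cumulants lam4, lam6 are assumed to have been computed from (6.2.2); the function names are mine.

import math

def hermite(r, x):
    # Hermite polynomials H_r of section 7.3 (even orders only).
    H = {2: x**2 - 1,
         4: x**4 - 6*x**2 + 3,
         6: x**6 - 15*x**4 + 45*x**2 - 15,
         8: x**8 - 28*x**6 + 210*x**4 - 420*x**2 + 105}
    return H[r]

def edgeworth_density(y, lam4, lam6):
    # Corrected density of the standardized variable y = d / sqrt(kappa_2(d)):
    # v(y) * [1 + lam4*H4/24 + lam6*H6/720 + lam4^2*H8/1152].
    v = math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)
    return v * (1.0 + lam4 * hermite(4, y) / 24.0
                    + lam6 * hermite(6, y) / 720.0
                    + lam4 ** 2 * hermite(8, y) / 1152.0)

The density of d itself is then edgeworth_density(d / math.sqrt(k2), lam4, lam6) / math.sqrt(k2), with k2 the second cumulant of d.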

CHAPTER 8

APPROXIMATION BY ONE DILATED STUDENT'S DISTRIBUTION

8.1. Choosing a Dilated Student's t for Good Over-all Fit

The idea of picking the degrees of freedom f so that a Student's t with f degrees of freedom would do the job of approximating densities, cumulative probabilities, and percentage points of d was tried first. This idea did not work. Then an extension of the idea was explored; a dilated Student's t was tried. This calls for finding two numbers f and h so that the second and fourth cumulants of Student's t with f degrees of freedom are the same as those of hd. Solve the equation

(8.1.1)    \lambda_4(hd) = \lambda_4(t \mid f)

for f, and then solve the equation

(8.1.2)    \kappa_2(hd) = \kappa_2(t \mid f)

for h, where κ_{2ν}(x) and λ_{2ν}(x) are the 2ν-th cumulant and relative cumulant of the random variable x. These are calculated according to the formulas (6.1.1), (6.1.2), and (6.2.1), (6.2.2). We get

(8.1.3)    f = 4 + R(f_1, f_2, c, s), \qquad R(f_1, f_2, c, s) = \frac{6\,[\kappa_2(d)]^2}{\kappa_4(d)},

and then

(8.1.4)    h = \Big\{\frac{f}{(f-2)\,\kappa_2(d)}\Big\}^{1/2}.

This idea of a dilated Student's t is very simple, and since it is not easily extended to a sequence of better and better approximations, might justifiably be regarded as crude. But, as we shall see, it works rather well for the present purpose. A similar idea, but less flexible, has been used by Satterthwaite (1946) for approximating the distribution of a linear combination of random variables,

75 each distributed according to chi-square distribution, by a chisquare distribution. The densities, cumulative probabilities, and percentage points of a Student's distribution with nonintegral degrees of freedom are obtained by harmonic interpolation in degrees of freedom. Welch (1947) has derived that, in the classical theory, d = (x1 - x2)/s* is distributed approximately like a Student's t with fl degrees of freedom, where $ L 7, 5 + - z - Approximation by one dilated Student's t is better than that of Student's t with fl degrees of freedom. 8.2. Alternate Expressions for f and h Put Then, Acalcute Then, calculate

76 9 (4,4, cs) =.;+ s*AR c f and h are calculated from the expressions (8.2.1) = 4 * 7\Ck'2,cls) (8.2.2) \ The forms for Q and R(fl, f2' c, s) can be rewritten in a slightly different way. From the relation C' = Cts I' =C\ cos\)/ s1 - s~rn8 = t\-C\- ts ) we get (8.2.3) q =, (8.2-4) 4 (t,Q VL 5 - J Sz = 0\ C + In the Behrens-Fisher problem, the values c and s are calculated before 8, in which case the form given in (8.2.1), (8.2.2) is more convenient to use. Whereas, if we are studying

the Behrens-Fisher distributions theoretically, then we may start with the angle θ, in which case the forms (8.2.3), (8.2.4) of Q and R are, perhaps, more convenient to use.

8.3. Densities

Some numerical values of the densities φ(d) were calculated according to the approximation that hd is distributed according to a Student's t with f degrees of freedom,

(8.3.1)    \phi(d) \doteq h\, t(hd \mid f),

where t(· | f) denotes the Student density with f degrees of freedom. Table 2 gives the comparison of the approximation with the nominal exact values for four combinations of f1 and f2. The approximation works quite well for the range [0, 5] of |d| even for f1 and f2 as small as 7. In this range [0, 5] of |d|, for the values of d considered, the maximum percentage error in density is 7.42%. For 5 < |d| ≤ 7, the percentage error is sometimes as high as 47.5%, which is not good. At the same time, the densities φ(d) for |d| > 5 get smaller and smaller, and in many practical applications values of |d| > 5 might be unimportant. This approximation is exact for θ = 0°, 90°.
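A minimal computational sketch of this approximation may help. The following Python lines implement the moment matching of section 8.1 as I read it (relative fourth cumulants equated, then second cumulants), assuming the convention d = t1 sin θ + t2 cos θ and the standard cumulants of Student's t; the function names are mine, scipy is assumed available, and f1, f2 > 4 is required.

import math
from scipy import stats

def dilated_t_parameters(f1, f2, theta_deg):
    # Section 8.1 matching: choose f from the relative fourth cumulant,
    # then h from the second cumulant, for d = t1*sin(theta) + t2*cos(theta)
    # with independent Student's t1, t2 (requires f1, f2 > 4).
    th = math.radians(theta_deg)
    s2, c2 = math.sin(th) ** 2, math.cos(th) ** 2
    k2 = s2 * f1 / (f1 - 2) + c2 * f2 / (f2 - 2)
    k4 = 6.0 * (s2 ** 2 * f1 ** 2 / ((f1 - 2) ** 2 * (f1 - 4))
                + c2 ** 2 * f2 ** 2 / ((f2 - 2) ** 2 * (f2 - 4)))
    f = 4.0 + 6.0 * k2 ** 2 / k4          # (8.1.3) as reconstructed here
    h = math.sqrt(f / ((f - 2.0) * k2))   # (8.1.4) as reconstructed here
    return f, h

def approx_density(d, f1, f2, theta_deg):
    # (8.3.1): the density of d is approximately h times the Student
    # density with f degrees of freedom evaluated at h*d.
    f, h = dilated_t_parameters(f1, f2, theta_deg)
    return h * stats.t.pdf(h * d, f)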

8.4. Percentage Points of d

Some percentage points were calculated by using the technique of section (8.1). Two numbers f and h were so chosen that hd is approximately distributed like Student's t with f degrees of freedom. The percentage points were explored for the cases for which the exact values have been tabulated, so that the comparison was possible. The approximate 100p-percent points were calculated according to the formula

(8.4.1)    d'_p = t_p / h,

where t_p is the corresponding percentage point of Student's t with f degrees of freedom, for

θ = 15° (15°) 75°; f2 = 12, f1 = 12, 24, ∞; f2 = 24, f1 = 24, ∞; p = 0.05, 0.01;

and for

f1 = ∞, f2 = 10;

θ = 10°, 40°, 50°, 80°; p = 0.1, 0.05, 0.02, 0.01, 0.005, 0.002.

For θ = 0°, 90°, the approximation is exact. The comparison of the approximate percentage points d'_p with the exact percentage points d_p is given in Tables 3 and 5 in terms of the difference Δd_p = d'_p - d_p and the percentage error in the level p. We are interested not so much in Δd_p as in how the level p is approximated. This is measured in terms of the percentage error in p, which is calculated according to the formula

(8.4.2)    \Pi(p) = \frac{100\,\Delta p}{p},

where Δp is the probability contained in the interval [d'_p, d_p]. It is not possible to calculate the exact nominal value of Δp from the available tables, including those which have been computed. Hence, Δp is computed approximately by considering the probability contained in the interval [hd'_p, hd_p] according to a Student's distribution with f degrees of freedom.
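A sketch of (8.4.1) and of this approximate Δp in the same Python setting as before: the helper dilated_t_parameters is the one sketched after section 8.3, and the percentage points are taken two-sided, which appears to be the convention of the tabulated d_p; both of these are my assumptions.

from scipy import stats

def approx_percentage_point(p, f1, f2, theta_deg):
    # (8.4.1): d'_p = t_p / h, with t_p the two-sided 100p percent point
    # of Student's t with f degrees of freedom.
    f, h = dilated_t_parameters(f1, f2, theta_deg)
    return stats.t.ppf(1.0 - p / 2.0, f) / h

def approx_error_in_level(p, d_exact, f1, f2, theta_deg):
    # Percentage error in p, with Delta_p taken as the Student-t
    # probability between h*d'_p and h*d_p, as described above.
    f, h = dilated_t_parameters(f1, f2, theta_deg)
    d_approx = approx_percentage_point(p, f1, f2, theta_deg)
    delta_p = abs(stats.t.cdf(h * d_exact, f) - stats.t.cdf(h * d_approx, f))
    return 100.0 * delta_p / p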

Since the exact percentage points d_p are available only up to three decimal places, a certain allowance has to be made in the percentage error in p, Π(p). The real difference could be anywhere between Δd_p ± 0.001, and Π(p) would then lie anywhere in a corresponding range.

Conclusion

The comparison in Tables 3 and 5 shows that the maximum percentage error in p, for the cases considered, is

-2.20% for f1, f2 ≥ 12; p = 0.05, 0.01;
-3.80% for f1 = ∞, f2 = 10; p = 0.1, 0.05, 0.02, 0.01, 0.005, 0.002;
-7.00% for f1, f2 ≥ 8; p = 0.05, 0.01;
-16.66% for f1, f2 ≥ 6; p = 0.05, 0.01.

The result that the maximum percentage error in p is -3.80% for p = 0.1, 0.05, 0.02, 0.01, 0.005, 0.002 when f1 = ∞, f2 = 10 is extended to the other cases of f1, f2 ≥ 10 on the basis of the following remark. The maximum percentage error in p (p = 0.05, 0.01) for f2 = 12, f1 ≥ 12 is greater than the maximum percentage error in p for f2 = 24, f1 ≥ 24. From this, it may be

conjectured that the maximum percentage error in p for f2 = 10, f1 ≥ 10 is greater than the maximum percentage error when f2 = 12, 24; f1 ≥ f2, for p = 0.1, 0.05, 0.02, 0.01, 0.005, 0.002. The maximum percentage error in p for f2 = 12, 24; f1 ≥ f2 occurs when f1 = ∞. Therefore, it is likely to be largest for f2 = 10 when f1 = ∞, where it is -3.80%. From this, it is concluded that the maximum percentage error in p for p = 0.1, 0.05, 0.02, 0.01, 0.005, 0.002 is -3.80% when f1, f2 ≥ 10.

8.5. Improvement over the Approximation for the Percentage Points

In order to improve the approximation of section (8.1) as applied to calculating percentage points, an empirical table of corrections Δf to be added to f, given in Table 4, was explored. The percentage points d''_p were calculated by considering hd to be distributed like a Student's variable with f' = (f + Δf) degrees of freedom; Δd'_p and Π'(p) are the corresponding differences and percentage errors in p, as defined in section (8.4). Table 4 gives the corrections Δf for f2 = 6, 8; f1 = 6, 8, 12, 24, ∞; θ = 0° (15°) 90°. The correction for f1, f2 > 8 is zero, and for f1, f2 > 6 the corrections are obtained by harmonic interpolation.
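Harmonic interpolation here, as in sections 5.4 and 8.1, means ordinary linear interpolation in the reciprocal of the degrees of freedom. A small Python illustration follows; the numerical values in the example are the familiar two-sided 5% points of Student's t at 15 and 16 degrees of freedom and are used only for illustration.

def harmonic_interpolate(f, f_lo, val_lo, f_hi, val_hi):
    # Linear interpolation in 1/f between two tabled arguments; pass
    # f_hi = float("inf") to use a row tabled at infinite degrees of freedom.
    x = 1.0 / f
    x_lo = 1.0 / f_lo
    x_hi = 0.0 if f_hi == float("inf") else 1.0 / f_hi
    w = (x - x_lo) / (x_hi - x_lo)
    return val_lo + w * (val_hi - val_lo)

# Example: Student's two-sided 5% point at f = 15.3 from the tabled
# values at f = 15 (2.131) and f = 16 (2.120).
t_05 = harmonic_interpolate(15.3, 15, 2.131, 16, 2.120)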

Table 5 gives the comparison of the approximate 100p-percent points d'_p, d''_p with the exact 100p-percent points d_p in terms of the differences Δd_p, Δd'_p and the percentage errors Π(p) and Π'(p). The values are explored for the following cases:

(1) f2 = 6, 8; f1 = 6, 8, 12, 24, ∞; θ = 15° (15°) 75°; p = 0.05, 0.01;
(2) f1 = f2 = 7; θ = 15° (15°) 45°; p = 0.1, 0.05, 0.02, 0.01.

Conclusion

The maximum percentage error in p, for the cases considered, has been reduced from 16.66% to 4.42% for f1, f2 ≥ 6, p = 0.05, 0.01, θ = 15° (15°) 75°. Table 5 shows that for f1 = f2 = 7 the approximation gets worse when p = 0.1, due to the correction Δf. From the above remark, Table 4 of corrections Δf improves the approximation for f1, f2 ≥ 6 and for 0.01 ≤ p ≤ 0.05.

8.6. Cumulative Probabilities

The approximation of the distribution of d by a dilated Student's distribution works very well for densities at d = 0, 1, 2, 3 and for the percentage points d_p. This justifies the use of the approximation to compute the cumulative probabilities as well; each percentage point amounts to such a calculation. A few numerical values of cumulative probabilities have been calculated according to the dilated Student's approximation, and the comparison with the exact values is given in Table 6. The maximum percentage error is 0.32% for the cases considered, and this leads to the conclusion that the approximation can be used to compute cumulative probabilities of d in practical applications.

8.7. Choosing a Dilated Student's t to Approximate Any Linear Combination of Three Independently Distributed Student's t

An attempt to justify the use of a dilated Student's t to approximate any linear combination of an arbitrary number of independently distributed Student's t's was made by considering a linear combination of three Student's t-like variables, d = Σ_{a=1}^{3} c_a t_a, where not all c_a are zero. When one c_a is zero, this reduces to the study of the Behrens-Fisher distributions, which has already been covered. When

all c_a ≠ 0, nominally exact numerical values of these distributions are not available. These will have to be computed for a thorough investigation of this approximation for a good over-all fit. This has not been done in this thesis. However, the dilated Student's t approximation is studied by comparing the higher cumulants of hd and Student's t with f degrees of freedom, where f and h are calculated on the same principle as that of section (8.1), by equating the second and fourth cumulants of h Σ c_a t_a and Student's t with f degrees of freedom. The comparison of the sixth and eighth cumulants, for certain cases, is given in Table 7. The agreement is not good. The sixth and eighth cumulants were also calculated for certain cases of the Behrens-Fisher distributions, and the agreement is not good in the latter case either. Still, a thorough investigation of the numerical values of densities, percentage points, and cumulative probabilities of the Behrens-Fisher distributions has shown that the dilated Student's t approximation is satisfactory for many practical purposes. The same might be true for the generalized Behrens-Fisher distributions, and this hope is strengthened by the central limit theorem.
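The moment matching of section 8.1 extends mechanically to such a combination. The following Python sketch, under the same assumptions and caveats as the sketch after section 8.3, computes f and h for d = Σ c_a t_a.

import math

def dilated_t_parameters_general(coeffs, dfs):
    # f and h matching the second and relative fourth cumulants of h*d,
    # where d = sum(c_a * t_a) with independent Student's t_a having
    # dfs[a] > 4 degrees of freedom.
    k2 = sum(c * c * f / (f - 2) for c, f in zip(coeffs, dfs))
    k4 = sum(6.0 * c ** 4 * f ** 2 / ((f - 2) ** 2 * (f - 4))
             for c, f in zip(coeffs, dfs))
    f = 4.0 + 6.0 * k2 ** 2 / k4
    h = math.sqrt(f / ((f - 2.0) * k2))
    return f, h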

8.8. Description of Tables 2-6

Table 2

In this table, Π_H(1), Π_H(2), Π_D, and Π_fh are the percentage errors in the density φ(d) due to the approximations by the Hermite polynomial methods H^(1) and H^(2), the delta method, and one dilated Student's t, according to the formulas (7.3.3), (7.3.4), (7.4.2), and (8.3.1); the percentage errors are calculated according to the formula (7.2.1). The densities φ(d | f1, f2; θ) are explored for the following cases:

f1 = f2 = 7, θ = 15° (15°) 45°, d = 0(1)7;
f1 = f2 = 9, θ = 15° (15°) 45°, d = 0(1)7;
f1 = 19, f2 = 9, θ = 15° (15°) 75°, d = 0(1)7;
f1 = 9, f2 = 11, θ = 15° (15°) 75°, d = 0(1)7.

Table 3

In this table, d_p is the 100p-percent point of d in Fisher and Yates (1957, Tables VI, VI1, VI2), and d'_p is the approximate 100p-percent point calculated according to the formula (8.4.1). The percentage error in p is calculated approximately according to the formula (8.4.2).

Tables 4-5

Table 4 gives the corrections Δf to be added to the f calculated from the formula (8.1.3). d''_p, Δd'_p, and Π'(p) are calculated in the same way as in Table 3, by considering (f + Δf) degrees of freedom instead of f. Table 5 shows the result of adding Δf to f in terms of Δd_p, Δd'_p, Π(p), Π'(p). The values are explored for the following cases:

f2 = 6, f1 = 6, 8, 12, 24, ∞; θ = 15° (15°) 75°; p = 0.05, 0.01;
f2 = 8, f1 = 8, 12, 24, ∞; θ = 15° (15°) 75°; p = 0.05, 0.01;
f1 = f2 = 7; θ = 15° (15°) 45°; p = 0.1, 0.05, 0.02, 0.01.

Table 6

F(d) is the cumulative probability of d. The comparison of the approximate cumulative probabilities due to one dilated Student's t with F(d) is given in terms of the percentage error in F(d), calculated according to the formula (7.2.1). The values are explored for the following cases:

f1 = f2 = 7 and f1 = f2 = 9; θ = 45°; d = 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 4, 5.

87 TABLE 2 COMPARISON OF THE APPROXIMATE DENSITIES WITH THE EXACT DENSITIES +(d) IN TERMS OF PERCENTAGE ERRORS IN,(d)a 0 d f(d) Ifh IH(2) IH(1) IID fl = f2 = 7 150 0 0.3787 + 0.53 1 0.2276 - 0.44 2 0.06471 - 0.62 3 0.01403 + 1.43 4 0.003133 + 2.88 5 0.0008008 + 3.63 6 0.0002376 - 9.17 7 0.00008061 + 2.11 300 0 0.3690 + 1.08 1 0.2299 - 0.43 2 0.06783 - 1.18 3 0.01408 + 1.42 4 0.002813 + 6.05 5 0.0006326 + 7.42 6 0.0001676 + 1.79 7 0.00005204 - 0.77 45~ 0 0.3650 + 0.82 - 0.54 - 0.052 + 0.66 1 0.2306 - 1.30 + 0.0014 - 0.012 + 1.30 2 0.06931 - 2.45 + 0.58 + 1.41 - 1.59 3 0.01418 - 1.41 + 2.82 + 6.34 -13.38 4 0.002669 + 1.12 + 6.74 +15.35 -28.84 5 0.0005471 + 1.83 -16.27 -98.68 -48.08 6 0.0001308 - 1.53 -71.45 -98.47 -61.54 7 0.00003692 - 1.90 -96.36 -100.00 -75.61 aIH(l), IH(2), IID, and Ifh are the percentage errors in +(d) due to the approximations by Hermite polynomial method H(1), H(2), the delta method, and one dilated student's t with width h-l and degrees of freedom f.

88 TABLE 2 (Continued) 9 d (d)fh IH(2) I) II fl f2 -9 150 0 03833 + 0.26 - 0.26 + 0.52 1 0.2307 - 0.43 + 0.44 + 0.0012 2 0.06288 - 0,16 - 3.89 + 0.48 3 0.01198 + 0,83 - 1.65 - 5.83 4 0.002200 + 1.82 +14.89 -25.45 5 0.0004494 + 1.34 + 5.16 -51.67 6 0.0001062 - 0.0011 -67.66 -71.03 7 0.0000289 - 3.11 -97.13 -59.86 300 0 0.3755 + 0.27 - 0.53 + 0.53 1 0,2328 - 0.43 + 0.20 + 0.43 2 0.06530 - 0.46 - 1.38 + 1.53 3 0*01181 + 1.69 + 0.85 - 8.47 4 0.001903 + 3.68 +10.53 -25.26 5 0.0003316 + 1.81 -15.15 -47,89 6 0.00006776 - 4.86 -72.57 -66.22 7 0.00001644 -15,85 -97.56 -49.75 450 0 0.3722 + 0.27 - 0.27 + 0.051 + 0.40 1 0.2335 - 0.43 + 0.09 - 0.012 + 0.73 2 0.06650 + 0.75 - 0.60 + 0.75 - 1.05 3 0.01178 + 2.54 + 1.69 + 2.54 -10.17 4 0.001756 + 2.27 + 7.95 +20.45 -27.78 5 0.0002694 + 2.23 -19.23 -82.97 -40.74 6 0.00004729 - 3.81 -74.63 -97.46 -60.00 7 0.000009892 -15.57 -97.98 -100.00 -66.67 fl 19, f2 9 150 0 0.3853 + 0.52 - 0.26 + 0.26 1 0,.2307 - 0.43 + 0.13 + 0.0021 2 0.06219 - 0.16 + 0.0012 + 0.19 3 0.01177 - 0.85 - 0.85 - 5.08 4 0.002159 + 1,85 +16.210 -34.72

89 TABLE 2 (Continued) 0 d (d) fh IH(2) IH(1) IID fl = 19, f2= 9 150 5 0.0004422 + 1.36 - 2.27 -68.55 6 0.0001048 - 0.95 -70O.00 -86.48 7 0.00002862 - 3.15 -97.67 -95.94 300 0 0.3805 + 0.78 - 0.26 + 0.21 1 0.2336 - 0.85 + 0.0011 + 0.39 2 0.06323 - 0.16 + 0.16 + 0.95 3 0.01085 + 4.63 + 1.85 - 5.56 4 0.001684 + 7.14 +10.12 -33.93 5 0.0002923 + 2.39 -21.92 -65.75 6 0.00006076 - 8.55 -80.42 -86.41 7 0.00001507 -21.85 -98.47 -95.76 450 0 0.3790 + 0.53 - 0.26 + 0.26 1 0.2359 - 0.42 + 0.08 + 0.42 2 0.06352 + 0.47 + 0.47 - 0.16 3 0.009716 + 4.12 + 2.06 - 7.84 4 0.001165 + 6.90 + 0.0011 -27.16 5 0.0001447 - 2.07 -32.14 -54.41 6 0.00002217 -23.42 -83.64 -74.46 7 0.000004399 -47.50 -98.50 - -90.61 600 0 0.3826 + 0.0010 1 0.2369 - 0.0011 2 0.06204 + 0.97 3 0.008733 + 2.86 4 0.0008686 + 3.34 5 0.00007706 + 0.00011 6 0.000007286 -10.56 7 0.0000008405 -29.60 750 0 0.3897 + 0.051 1 0.2365 - 0.042 2 0.05953 - 0.021

90 TABLE 2 (Continued) 0 d *(d) II IH(2) i(1) ~ _ _.. ii L, i fl- 19, f2 9 750 3 0,008231 + 0.33 4 0.0008416 + 0.18 5 0.,00007946 - 1.71 6 0.000007901 - 4.87 7 0,0000008818 - 9.72 f: -9 f,' 11 150 0 0,3855 + 0,23 1 0.2327 - 0,44 2 0.,06185 + 0.49 3 0,1072 + 1,8&7 4 0.001676 + 1,79 5 0.0002814 + 0,0011 6 0,00005403 - 4.26 7 0.000001199 - 8.40 30 0 0.3780 + 0.26 1 0.2342 - 0.0022 2 0.06427 + 0.78 3 0.01075 + 3.75 4 0.001495 + 3.33 5 0.0002133 + 0.47 6 0.00003470 - 7,20 7 0.000006679 -17.51 450 0 0.3746 + 0.27 1 0.2344 0.00 2 0.06552 + 0.76 3 0.01104 + 2.73 4 0.001514 + 3.31 5 0.0002101 + 0.95 6 0.00003341 - 6.29 7 0,000006431 -20.53

TABLE 2 (Continued) 0 d I(d) if IH(2) IH(1) fl = 9, f2 11 60~ 0 0.3772 + 0.53 1 0.2331 - 0.43 2 0.06458 + 0.15 3 0.01149 + 3.48 4 0.001812 + 6.08 5 0.0003137 + 2.87 6 0.00006431 - 6.84 7 0.00001573 -18.47 750 0 0.3840 + 0.42 1 0.2307 - 1.08 2 0.06263 - 0.37 3 0.01190 + 1.01 4 0.002184 + 1.87 5 0.0004465 + 1.25 6 0.0001056 - 1.14 7 0.00002881 - 2.78

92 TABLE 3 COMPARISON OF THE APPROXIMATE PERCENTAGE POINTS DUE TO ONE DILATED STUDENT'S t WITH THE EXACT PERCENTAGE POINTS IN TERMS OF THE DIFFERENCES A4s AND PERCENTAGE ERRORS II(p) IN p 8 f d d IIf d.l d.0 IIO. 1 0.05 Ad0.05 (0.05).01 0.01 (0.01) f2=12 150 12 2.175 +0.001 -0.32 3.029 +0.004 -0.80 24 2.168 +0.002 -0.40 3.020 +0.005 -1.00 O, 2.163 +0.001 -0.28 3.014 +0.004 -0.80 300 12 2.169 +0.001 -0.36 2.978 +0.005 -1.20 24 2.146 +0.004 -0.76 2.938 +0.009 -2.00 OD 2.120 +0.005 -1.00 2.909 +0.010 -2.20 45~ 12 2.167 0.000 2.954 +0.002 -0.60 24 2.112 +0.002 -0.36 2.853 +0.005 -1.20 Oa 2.064 +0.005 -1.04 2.775 +0.009 -2.20 600 12 2.169 +0.001 -0.36 2.978 +0.005 -1.20 24 2.085 +0.0004 -0.08 2.803 +0.001 -0.20 Oa 2.011 +0.002 -0.32 2.661 +0.002 -0.60 750 12 2.175 +0.001 -0.32 3.029 +0.004 -0.80 24 2.069 0.000 2.793 +0.002 -0.40 OD 1.973 0.000 2.595 0.000 f2 24 15~ 24 2.062 0.000 2.785 0.000 D 1 2.056 +0.001 -0.12 2.777 +0.001 -0.20 300 24 2.058 0.000 2.759 +0.001 -0.40 o 2.035 +0.001 -0.20 2.726 +0.001 -0.20

93 TABLE 3 (Continued) 0 f d Ad H d A II 1 d0.05 Ad0.05 (0.05) d0.01 Ad.01 II(0.01) f2 = 24 t,=24 450 24 2.056 0.000 2.747 0.000 co 2.009 0.000 2.664 +0.001 -0.20 600 24 2.058 0.000 2.759 +0.001 -0.40 co 1.983 +0.001 -0.12 2.613 0.000 750 24 2.062 0.000 2.785 0.000 O 1.966 0.000 2.585 0.000 0 p dp Adp II(p) f= o, f2 10 100 0.10 1.808 0.000 0.05 2.219 +0.002 -0.24 0.02 2.748 +0.003 -0.48 0.01 3.148 +0.002 -0.52 0.005 3.553 +0.003 -0.64 0. 002 4.106 +0.001 -0.50 400 0.10 1.749 +0.002 -0.38 0.05 2.112 +2.112 +0.008 0.02 2.559 +0.014 -2.67 0.01 2.883 +0.017 -3.80 0.005 3.203 +0.016 -3.52 0.002 3.630 +0.002 -0.50 500 0.10 1.721 +0.002 -0.31 0.05 2.066 +0.006 -1.24 0.02 2.481 +0.010 -2.37 0.01 2.775 +0.012 -2.88 0.005 3.058 +0.010 -2.72 0.002 3.425 +0.002 -0.30

94 TABLE 3 (Continued) 9 p dp (P) 80 ~ 0.10 1.651 0.000 0.05 1.967 0.000 0.02 2.335 0.000 0.01 2.586 0.000 0.005 2.818 0.000 0.002 3.103 0.000

95 TABLE 4 AN EMPIRICAL TABLE OF CORRECTIONS Af TO BE ADDED TO THE DEGREES OF FREEDOM f OF A DILATED STUDENT'S t f2= 6, 8 fl 00 150 300 450 600 750 900 6 0.0 0.1 0.2 0.4 0.2 0.1 0.0 8 0.0 0.1 0.2 0.4 0.2 0.1 0.0 12 0.0 0.1 0.3 0.6 0.3 0.1 0.0 24 0.0 0.1 0.4 0.8 1.2 0.1 0.0 OD 0.0 0.1 0.5 1.0 4.8 0.1 0.0

96 TABLE 5 COMPARISON OF THE APPROXIMATE PERCENTAGE POINTS DUE TO ONE DILATED STUDENT'S t WITH f AND (f+Af) DEGREES OF FREEDOM WITH THE EXACT PERCENTAGE POINTS dp IN TERMS OF THE DIFFERENCES Adp, Ad'p AND THE PERCENTAGE ERRORS II(p), II'(p) IN p Differences & Pet. Errct. 6 8 12 24 co Errors 2 6 15 d +2.440 +2.430 +2.423 +2.418 +2.413 0.05 Ad +0.004 +0.009 +0.008 +0.008 +0.007 0.05 Ad' 0.000 0.000 0.000 -0.001 -0.001 0.05 II(0.05) -1.16 -1.08 -1.12 -1.00 -0.98 II' +0.12 +0.11 (0.05) do 01 +3.654 +3.643 +3.636 +3.631 +3.626 Ad0 01 +0.020 +0.027 +0.026 +0.024 +0.021 Ad'01 +0.007 +0.006 +0.004 +0.002 0.000 O (0.01) -3.80 -3.40 -3.40 -3.00 -2.84'(0.01) -1.33 -0.76 -0.52 -0.25 H(0.01)

97 TABLE 5 (Continued) Differ- fi ences _ _ & Pct. 6 8 12 24 co Errors f2 X6 30~ do05 +2.435 +2.398 +2.367 +2.342 +2.322 Ad0.05 +0.012 +0.015 +0.019 +0.021 +0.022 Ad.05 0.000 +0.002 0.000 -0.004 -0.009 (0. 05) -1.72 -1.90 -2.74 -3.00 -2.93 (0.05) 11(0.05) -0.25 +0.57 +1.20 d0,011 +3.557 +3.495 +3.453 +3.424 +3.402 Ad001 +0.051 +0.061 +0.067 +0.068 +0.066 d,01 Ad' +0.020 +0.030 +0.022 +0.007 -0.009 0.01 (001) -6.80 -8.80 -9.42 -9.74 -9.26 (0.01) (0.01) -2.84 -4.42 -3.12 -1.00 +1.26 (0.01) 45- d 05 +2.435 +2.364 +2.301 +2.247 +2.201 Ad0.05 +0.010 +0.010 +.02216 +0.022 +0.028 Ad' -0.009 -0.005 -0.004 -0.002 -0.001 0.05 (0. 05) -1.56 -1.60 -1.88 -3.16 -4.64 II' 05) +1.40 +0.80 +0.47 +0.29 +0.17 H'(0.05) d 01 +3.514 +3.363 +3.246 +3.158 +3.093 Ado001 +0.044 +0.044 +0.062 +0.079 +0.091 Ad' -0.004 +0.008 +0.017 +0.023 +0.021 0.01 H0 01) -6.40 -6.60 -10.56 -13.18 -16.66 (0.01)

98 TABLE 5 (Continued) Differ- fl ences & Pct. &6 8 12 24 ca Errors f2 600 d0.05 +2.435 +2.331 +2.239 +2.156 +2.082 Ad0.05 +0.012 +0.006 +0.003 +0.008 +0.015 Ad'. 0.000 -0.001 -0.001 -0.002 -0.006 0.05 I(0.05) -1.72 -0.99 -0.48 -1.44 -3.05 (0.05)'(0.05) +0.17 +0.16 +0.36 +1.22 +3.557 +3.307 +3.104 +2.938 +2.804 -.01 Ado 01 +0.02051 +0.016 +0.028 +0.046 Ad' 01 +0.021 +0.005 +0.005 +0.006 0.000 0.01 (0.01) -6.80 -2.86 -2.80 -6.14 -10.96 (0.01) -2.84 -0.72 -0.88 +1.32 (0.01) 750 d0.05 +2.440 +2.310 +2.193 +2.088 +1.993 Ad005 +0.004 +0.004 +0.001 +0.001 +0.001 Ad'o 0.000 0.000 0.000 0.000 +0.001 0.05 I(0.05) -1.16 -0.68 -0.2 - -0.12 -0.24 II' -0.24 (0.05) d0.01 +3.654 +3.328 +3.053 +2.822 +2.627 Ado.01 +0.020 +0.012 +0.004 +0.002 +0.002 Ad' +0.007 +0.002 0.000 0.000 +0.002 0.01 II(001) -3.80 -1.98 -0.84 -0.32 -0.20 (001) -1.33 -0.33 -0.20 (0.01)''

99 TABLE 5 (Continued) Differ- fi ences & Pct. 8 12 24 o Errors f = 8 15~ d +2.300 +2.292 +2.286 +2.281 %.05 Ad0 05 +0.005 +0.005 +0.004 +0.005 Ad'0 05 +0.001 +0.001 0.000 0.000 II(005) -0.75 -0.29 -0.56 -0.68 (0 05) -0.15 -0.05 d 01 +3.316 +3.307 +3.301 +3.295 do.01 Ad0 01 +0.009 +0.012 +0.010 +0.011 Ad' +0.002 +0.002 0.000 +0.001 0.01 H (0.01) -2.00 -2.04 -1.70 -1.74 (001) -0.44 -0.33 -0.16 300 d +2.294 +2.262 +2.236 +2.215 0.05 Ad 05 +0.006 +0.008 +0.011 +0.013 Ad' +0.001 -0.001 +0.001 -0.001 0.05 II(0.05) -1.05 -1.15 -1.94 -2.20 II' -0.16 -0.14 -0.19 -0.17 (0.05) do 0.1 +3.239 +3.192 +3.158 +3.132 Ad0.01 +0.022 +0.013 +0.070 +0.030 Ad' +0.009 -0.006 +0.003 -0.002 0.01 II( -ni-3.80 -2.26 -5.00 -5.30 (0.01)

100 TABLE 5 (Continued) Differ- fi ences & Pct. 8 12 24 oo Errors f2= 8 2 450 d +2.292 +2.229 +2.175 +2.128 d.05 d0.05 +0.002 +0.005 +0.009 +0.014 Ad' 05 -0.003 -0.003 0.000 +0.004 -0.68 -0.80 -1.60 -2.58 (0.05) II'(0.05) +0.45 +0.48 -0.74 do 0-1 +3.206 +3.083 +2.988 +2.916 Ad0 01 +0.009 +0.014 +0.024 +0.032 Ad', 01 -0.005 -0.004 +0.004 +0.010 (0.01) -2.40 -3.00 -4.00 -7.00 F(0.01) -1.33 -0.86 -0.67 -2.22 60 d.05 +2.294 +2.201 +2.118 +2.044 Ad0.05 +0.006 +0.002 +0.002 +0.005 ad.05 +0.001 -0.002 -0.002 0.000 I(0.o05) -1.05 -0.29 -0.30 -1.08 (0.05) H' (-.05) 0.18 +0.29 +0.30 d0.01 +3.239 +3.032 +2.862 +2.723 Ad001 +0.022 +0.005 +0.004 +0.011 0.01 Ad' +0.009 -0.003 -0.004 +0.001 OII A -3.80 -0.98 -0.98 -3.12 I(001) -1.55 +0.59 +0.98 -0.28

TABLE 5 (Continued) Differences & Pet, &Pct. 8 12 24 aw Errors, f2'8 75- d +2.300 +2.183 +2.077 +1.982 Ad0o05 +0.005 +0.001 +0.001 +0.001 Ad' +0.001 0.000 0.000 +0.001 0.05 II~0 05) -0,75 -0.30 -0.08 -0.12 II'05) -0.15 -0.12 (0.05) do,01 +3.316 +3.039 +2.805 +2.608 Ado.01 +0.009 +0.004 +0.001 +0.001 A d' +0.002 0.000 0.000 +0.001 0.01 II(001) -2.00 -0.7800 -0.24 -0.26 III.-0.044 -0.26 (0.01) Differences P & Pct, Error010 0.05 0.02 0.01 Errors fl =f = 7 1 2 15~ d +1.89902 +2.35807 +2.97119 +3.45397 Adp +0.00104 +0.00635 +0.01385 +0.01839 Adtp -0.00251 +0.00042 +0.00352 +0.00381 P

102 TABLE 5 (Continued) Differences _ & Pct. 0.10 0.05 0.02 0.01 Errors 1 2 150 II~p) -0.15 -0.98 -2.20 -2.57 II' +0.39 -0.06 -0.50 -0.54 (p) 300 dp +1.91113 +.35215 +2.92662 +3.36875 Ad +0.00058 +0.00916 +0.02235 +0.03158 p Ad'p -0.00443 +0.00088 +0.00816 +0.01160 II p) -0.090 -1,47 -3.52 -4.78 (,p) II'p) +0.69 -0.14 -1.30 -1.79 (p) 450 dp +1.91788 +2.35161 +2.90869 +3.33071 Adp +0.00031 +0.00642 +0.01628 +0.02340 Ad'p -0.00712 -0.00572 -0.00419 -0.00504 II~p) -0.049 -1.05 -2.69 -3.78 (p) II'H +1.12 +0.94 +0.70 +0.84 (p)

103 TABLE 6 COMPARISON OF THE APPROXIMATE CUMULATIVE PROBABILITIES WITH THE EXACT CUMULATIVE PROBABILITIES F(d) IN TERMS OF PERCENTAGE ERRORSa $22 d F(d) Pt. d F(d) Pct. Error Error fl = f2= 9 fl= f2 7 0.5 0.6290 +0.061 0.5 0.6265 +0.14 1 0.7438 +0.32 1 0.7392 +0.16 1.5 0.8349 +0.03 1.5 0.8290 +0.072 2 0.9002 -0.10 2 0.8939 +0.061 2.5 0.9415 +0.10 2.5 0.9364 +0.063 3 0.9687 -0.021 3 0.9637 -0.041 4 0.9914 -0.022 4 0.9884 -0.012 5 0.9977 -0.000010 5 0.9964 -0.011 apct. error = (approx. value - F(d))100/F(d).

104 TABLE 7 COMPARISON OF THE SIXTHI AND EIGHTH CUMULANTS OF hd AND STUDENT'S t WITH f DEGREES OF FREEDOMa 2 2 2 cl c2 C3 mu- (hd) (tlf) Diff. lant Case fl' f3 = 12 0.93302 0.06699 0.00000 c6 6.6655 6.0352 + 0.6303 YK8 182.8561 139.0572 +43.7989 0.75000 0.25000 0.00000 ~6 3.1997 2.5393 + 0.6604 le.8 66.3120 41.9825 +24.3295 0.5000 0.5000 0.00000 k: 1.7152 1.4701 + 0.2451 20.3670 16.2587 + 4.1083 0.81818 0.09091 0.09091 k 4.1375 3.2104 + 0.9271 \f<8 96.5436 48.7193 +47.8243 0.47369 0.47369 0.05263 K6 1.4212 1.1511 + 0.2701 tK8 18.4787 9.4831 + 8.9956 0.33333 0.33333 0.33333 k6 0.6939 0.5677 + 0.1262 K8 6.2116 3.1561 + 3.0555 aLet d = Clt1 +2t + c3t3, where 2 = 1, and when c3 = 0, d is the Behrens-Flsher variable, l2w(x) is the 2vth cumulant of a random variable x.

105 TABLE 7 (Continued) 2 2 2 Cuce c2 c3 mu- (hd) (t4f) Dff. lant Case fl = f2 = f3 - 24 0.93302 0.06699 0.00000 kz 0.6833 0.6348 + 0.0485 6 4.6019 3.7517 + 0.8502 0.75000 0.25000 0.00000 0.3463 0.2967 + 0.0496 1.7263 1.1635 + 0.5628 0.50000 0.50000 0.0000 ~6 0.1917 0.1816 + 0.0101 r-8.0.6458 0.5493 + 0.0965 0.81818 0.09091 0.09091 6 0.4413 0.3652 + 0.0761 K<8 2.4649 1.5704 + 0.8945 0.47369 0.47369 0.05263 ~6 0.1610 0.1455 + 0.0155 ~8 0.5114 0.3920 + 1.1194 0.33333 0.33333 0.33333 6 0.0815 0.0758 + 0.0067 0.1803 0.1458 + 0.0345

CHAPTER 9

APPLICATIONS OF ONE DILATED STUDENT'S t APPROXIMATION

This chapter is for the person who wishes to compute Behrens-Fisher distributions in practical applications.

9.1. Use of Tabulated Percentage Points

Percentage points have been tabulated in Fisher and Yates (1957, Tables VI, VI1, VI2) for the following cases:

(1) f1, f2 = 6, 8, 12, 24, ∞; θ = 0° (15°) 90°; p = 0.05, 0.01;
(2) f1 = ∞, f2 = 10, 12, 15, 20, 30, 60, ∞; θ = 0° (10°) 90°; p = 0.1, 0.05, 0.02, 0.01, 0.005, 0.002;
(3) f1 = 1(2)7, f2 = f1(2)7; θ = 0° (15°) 90°; p = 0.1, 0.05, 0.02, 0.01.

Percentage points for any case falling in this range of f1, f2, p can be most conveniently computed from these tables by interpolation, as indicated in Fisher and Yates (1957, page 3).

9.2. Computation of Densities, Cumulative Probabilities, and Percentage Points

For f1, f2 ≥ 6, the approximation by one dilated Student's t of Chapter 8 can be used in practical applications of the Behrens-Fisher distributions. Two numbers f and h are computed by the formulas (8.1.3), (8.1.4). Densities, cumulative probabilities, and percentage points are calculated by considering hd to be distributed approximately according to Student's distribution with f degrees of freedom. Tables of Student's distribution are available in Smirnov (1961). Table 4 gives corrections Δf to be added to f so that percentage points can be calculated with higher accuracy; percentage points are then calculated with f' = (f + Δf) degrees of freedom instead of f.

Case f1, f2 < 6

When (f1, f2) = (1, 1), (sin θ + cos θ)^{-1} d is a Cauchy variable, that is, a Student's t with one degree of freedom, for which densities, cumulative probabilities, and percentage points have been tabulated. The percentage points for odd values of f1, f2 = 1(2)5 are available in Fisher and Yates (1957, Table VI1).

From the closed forms given in Tables 8-10 and the formulas (3.2.2), (3.2.3), the densities and cumulative probabilities for odd degrees of freedom are calculated; each is the sum of three terms at most. The densities, cumulative probabilities, and percentage points for even degrees of freedom can be obtained from the corresponding values for odd degrees of freedom by harmonic interpolation in f1 and f2, which has been studied for densities in section (5.4). Thus, we have bridged the gap between (f1, f2) = (6, 6) and (1, 1).

9.3. Generalized Behrens-Fisher Distributions

Generalized Behrens-Fisher distributions are important, and the results of section (8.7) indicate that the one dilated Student's t approximation might work; this has to be confirmed by further research, by computing a few nominally exact numerical values.
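Returning to the practical recipe of section 9.2, a brief worked illustration in Python may be useful; the numbers are purely hypothetical and the helpers are the ones sketched after sections 8.3 and 8.4.

# Hypothetical case f1 = 12, f2 = 8, theta = 30 degrees.
f, h = dilated_t_parameters(12, 8, 30)            # section 8.1 matching
density_at_2 = approx_density(2.0, 12, 8, 30)     # (8.3.1)
d_05 = approx_percentage_point(0.05, 12, 8, 30)   # (8.4.1), before any Delta-f correction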

APPENDIX

1. Closed Forms of C_{u,v}(d; α, β) for u, v = 1(1)7

The C_{u,v}(d; α, β) have been obtained by starting with C_{1,1} and using the recurrence relations (3.1.4), (3.1.5), and (3.1.6). Since interchanging u, v amounts to interchanging α, β, there is no loss in supposing u ≥ v. The general pattern of C_{u,v} is that of (3.2.1),

C_{u,v}(d; α, β) = L_v M^v + L_{v+1} M^{v+1} + ... + L_{u+v-1} M^{u+v-1}.

Looking at this form, it is enough to give the coefficients L_j^{u,v} = L_j in order to know C_{u,v}. For each (u, v), the coefficients L_j (j = v, v+1, ..., u+v-1) can be written as

L_j = Q_1 Q_2 Q_3,

where

Q_1 = 2^{-r}, Q_2 = (α + β)^s, and Q_3 is the remaining part of L_j. Thus, it suffices to know r, s, and Q_3, and these have been tabulated in Tables 8-14 in the following manner. Tables 8-12 give the exact expressions for v = 1(1)5, each table covering one value of v and u = v(1)7. Tables 13-14 cover the cases (u, v) = (6,6), (7,6), (7,7). By looking at the expressions, it is seen that the behavior of s is regular. It starts with the value (u + v - 1) for L_{u+v-1} and decreases in steps of 2 until it reaches zero, remaining zero thereafter; the s corresponding to L_j is (u + v - 1) - 2(u + v - 1 - j) = 2j - (u + v - 1) whenever this is greater than zero, and is equal to zero whenever 2j - (u + v - 1) ≤ 0.
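Stated as a one-line rule, the pattern just described is, in a trivial Python restatement:

def s_exponent(j, u, v):
    # Exponent s of (alpha + beta) attached to the coefficient L_j of C_{u,v}.
    return max(0, 2 * j - (u + v - 1))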

TABLE 8 CLOSED FORMS OF C FOR v = 1, u = 1(1)7a UlV (u,v) L r s Q (1,1) 1 0 0 Y1 (2,1) 1 0 0 bl3 2 0 2 a (3,1) 1 0 0 b23 2 2 1 3aB 3 0 3 a (4,1) 1 0 0 b33 2 3 0 ac1(4ac+51) 3 0 2 2a 4 0 4 a3 (5,1) 1 0 0 b40 2 6 0 5aj(5a+73) ac (d; a,3) = L LM + L M+l + lMu+V-l uv v v+l UvL 1 =QQ3; Q1 = Q2= (a+3)s, Q3 is the remaining part of L]. Yi =(ai+ji)' bi= (- 1/2)(-l)i =1 2.4.6... 21)

TABLE 8 (Continred) (u,v) L r s Q3 (5,1) 3 4 1 5a2B(2a+3B) 4 2 3 5a33i 5 0 5 a4 (6,1) 1 0 o b50 2 7 0 21aS(2a+3fi) 3 5 0 a2.B(15a2 +42 a3+289) 4 4 2 3a3 j(4a+7p) 5 1 4 3a4 6 0 6 a (7,1) 1 0 0 b63 2 9 0 21ap3(7a+113) 3 7 0 7a2/3(7a2+21a3+15i2) 4 6 1 7a3 i(5a2 +16aS+122) 5 3 3 7a 4f(a+20i) 6 2 5 7a5fB 7 0 7 a6

TABLE 9 CLOSED FORMS OF Cuyv FOR v = 2, u = 2(1)7 (u,v) L r s Q3 (2,2) 2 0 0 blY3 3 0 3 2aI3 (3,2) 2 0 0 b2p3 3 1 2 (a (a-2aj+3j3) 4 0 4 3a21 (4,2) 20 0 b30 3 2 1 5ap3 4 1 3 a2 (a-32aB+62) 5 0 5 4a"3, (5,2) 2 0 0 b4A3 3 5 0 5a3 (6a+7)8) 4 4 2 45af2 3 Ca v(d; a) LMV L Mv + + L M+l..u L u,v vQ, v+l2Im Lj = Q1Q2Q3; Q1 = 2-' = = (a+)s Q3 is the remaining part of i), =(1/2)()i = 1-3.5... (21-1) Lj. ~,, = (a:p), b. = (- 2.4.6...2

114 TABLE 9 (Continued) (u,v) L r s Q3 (5,2) 5 1 4 a3(a2- 4a3+1092) 6 0 6 5a (6,2) 2 0 0 b563 3 6 0 7ajp3(7a+9j) 4 5 1 21a 23 (3a+4P) 5 2 3 21a~3~3 6 1 5 a 4(a2 2_5a( +15f2) 7 0 7 6a35 (7,2) 2 0 0 befi3 3 8 0 21ap (8a+11p) 4 7 0 7a 23 (28a2+72 ap+4532) 5 2 2 7a j3 (2a+3j3) 6 3 4 70a4fi3 5 22 7 1 6 a (a -6a/+2132) 8 0 8 7a6

TABLE 10 CLOSED FORMS OF C FOR v = 3, u = 3(1)7a Uv (u,v) L r s Q3 (3,3) 3 0 0 b2Y5 2 2 4 2 3 3ap(3a2-4ap+302) 5 0 5 6a212 (4,3) 3 0 0 b3 5 4 3 2 33a (aA-2a 3+3a22 -4a 3+534) 2 2 2 5 0 4 3a xp(c -2 a.+2 ) 6 0 6 10Oa3 2 (5,3) 3 0 0 b4A5 4 6 1 105a35 2 4 3 22 3 4 5 3 3 3a (a4-3a 3+6a2 -10aj3+1534) 53 2 2 ~~~~~~6 2 5 5a3'(3a -8ai+10o2) 7 0 7 15a 4 2 uv v v+1 u+v-u a~u~v(d;a,13S)= LVMV+ +L+ iM +**Lu+v-Lui Lj = Q1Q2Q3; Q1 = 2-, Q2 (a+)S, Q is the remaining part of L;. Y = (ai +1), b. = (-/2)()i s 1... (2i-1) j 1 12 23.4.6.2i

116 TABLE 10 (Continued) (u,v) L r 8 Q3 3 (6,3) 3 0 0 bs5f5 4 7 0 21a3 5(8a+9B) 5 2 2 21a 25 6 3 4 3a 3(a4-4a3P+10a232 -20a3+35i4) 7 1 6 3a4 (3a2-10a+15j32) 8 0 8 21a 5 2 (7,3) 3 0 0 bp5 4 9 0 63a 5 (9a+113) 5 6 1 63a f5 (4a+583) 6 3 3 105a3p5 7 3 5 3a (a4 -5a3+15a 2-35ap33+70P14 8 2 7 21a 5 (a2 -4a1+7p2 6 286 9 0 9 28af3

117 TABLE 11 CLOSED FORMS OF C v FOR v = 4, u = 4(1)7a (u,v) L r s Q3 (4,4) 4 0 0 b3y7 5 1 3 ap((5a 4-8a3 +9a 2 2_8ac3 +51 4) 22 2 2 6 0 5 5a2 2(2a -3a3+22) 7 0 7 20a3A83 (5,4) 4 0 0 b0 A7 5 4 2 5a(a6-2a508+3a4j2-4a3 3+5a2j4 4- 6a5 +7 6) 6 3 4 5a 2(5a 4-122a3+18a2i2 -2Oax3 +1534) 7 0 6 5a3 2(3a2_-6a3+52) 8 0 8 35a 4p3 (6,4) 4 0 0 b5,37 5 5 1 63aft7 6 4 3 5a (a6-3a5 +6a4 42-10aP33+15a2,4-21a5 +28 6 7 2 5 3a3 j3(Sa4-16ca 3+300a 2 2-40car3+354 ) ua v v v+l 1 aCuv(d; cr,I)= LV +LV1 + -+L v M Lj = Q1Q2Q3; Q1 = 2- Q2 (c+3)S, Q3 is the remaining part of L. Yi (ai+/3i) bi (-1/2)()l = 135... (2i-1) ( +'3)=bi (14.6.2.4. 2i

TABLE 11 (Continued) (u,v) L r s Q3 (6,4) 8 1 7 21a 42 (2a 25ap+5132) 9 0 9 56a503 (7,4) 4 0 0 b0 67 5 7 0 21ap7(lOa+ll) 6 6 2 525a2 7 7 4 4 5a (a -4 +10a4+lOa -20a3 3+35a A2 -56a13 +84t3) 8 3 6 35a4B(a4 4a3 +9a 1-14a33+1414) 9 0 8 14a52 (2 a2_6a1S+79) 10 0 10 84a6f3

TABLE 12 CLOSED FORMS OF CUv FOR v = 5, u = 5(1)7a (u,) L r s Q3 (5,5) 5 0 0 b4y9 6 5 42 33 6 6 3 25a1i(7a6-12a 5,+15a -16a3/ 3 + 15 a2 4-12a35+7136) 584) 513) 3 3 2 2 8 2 7 35a383(5 -8a/3+50a) 9 0 9 70a44 (6,5) 5 0 0 b5 9 6 7 2 35a(a-82a73+3a6132- 4a1 5a3+54343-5 26 7 8 6a 35+7a 1 6_8ap7+918 7 5 4 5a 2P(21a -54a51+90a 32 -120a 13 + 135a24_ -126a15 +8413% 8 4 6 105a3A2(3a4 -8a313+12232 -12a 12a3+ 704) aC (d; a, ) = L + 1 +. Lu.+v- M L; = Q1i2Q3; Q1 =4 2 6'2 = (a+1)S, Q3 is the remaining part of Li Yi = (ai+3i), b. = (- )(_1)i = 1246.. (2i'~~ =' )=24.6... 2i'

120 TABLE 12 (Continued) (u,v) L r s Q3.3 43 2 2 (6,5) 9 0 8 35a 43(2a 2_4a3+332) 54 10 0 10 126a5 4 (7,5) 5 0 0 b 9 6 9 1 1155ap39 7 7 3 35a2 (a8-3a 7P+6a632 -10a5 15 44 21a3 5+28a 2p6-36a +7+4538) 8 6 5 35a 30(7a6_24a 5,+50a40287a3/ 53+ 2 4 5 6 105a 2 -112a +84 6) 9 2 7 35a 432 (3a4 -10a 3+18a 22_-21a3 + 143 4) 10 1 9 21a 53 (1oa 224ca+21 2) 11 0 11 210a6A4

TABLE 13 CLOSED FORMS OF C FOR v = 6, u = 6(1)7a U,V (u,v) L r s Q3 (6,6) 6 0 0 b0 51 62 5 44 35 7 6 3 21aj3(9a8-16aj7+21aj 62-24a 5j+25a 4 4-24am33 + 21a2i 6-16ai7+908) 8 5 5 21a2{2 (28a6-63a 5+90a2-_100a 3 3+90a245 6 60a53 +28p6) 9 1 7 21a3 3 (7a -16a33+20a 2 -16a 3 +7 4) 10 0 9 63a 44(3a-5a2 +30 ) 55 11 0 11 252a5j5 (7,6) 6 0 0 b0, 1 9 8f2 73 64 55 46 7 8 2 63 a(a10 -2a 93+3a 2-4a73+5a -6a +7a 483 87 +92 28_ 10 a9+ l 110) 8 7 4 21a2, (21a8 56a 7+98a S -140a 5 +175a4 - 196a3 35+196a2 6_ 168aj7 +105 8) ac v v+l u+v-1 uv(d; a,)= LVMV + LVM +...+LUV 1M L.= = 212; Q1 = (a+i)sy Q3 is the remaining part of Lj. y = (ai+i), bi = (-i(-1) = 246... (2i )

122 TABLE 13 (Continued) (u,v) L r s Q3 3 2 6 5 4 2 33 a24 (7,6) 9 2 6 7a38 (14a -42a 58+75a 4O-100Oa3B3+10 5a - 5 6 84ap +42j6) 10 2 8 63a4 3 (7a 420a30+30a282 -28a +144 ) 54 2 2 11 0 10 63a p4(5a -10aa+712) 12 0 12 462a6p5

123 TABLE 14 CLOSED FORM OF C FOR u 7 v = a u,~ (u,v) L r s Q3 (7,7) 7 0 0 b6Y13 8 9 3 147a3(lltal -20a 93+27a8 2 -32a 73 +35a6 4 - 36a 55 +35a 4 36_32a /37+27a -20a/ +11o ) 9 5 5 49a 22 (15a8 -36a7, +56a 632-70a 53 +75a 4,4 70a 3 5+56a 23- 36a/3 +15/38) 10 3 7 21a 333(42a 6-112a 5p+175 a42200a3/33+175 a - 112ap5 +42:6) 11 2 9 105a4b4(14a4-35a3/3+45a /32_35a/ +144 ) 12 1 11 231 a 5p5(7a2 12a/3+7/2) 13 0 13 924a 6/ ac (d; a,p) LMv + L Mv+l + + L Mu +vu,y v +' L = =Q1Q2Q; Q1 = 2 rI Q2 + (a+f)S, Q3 is the remaining part of L. Y = (a 1/2)(1 = 1.3.5.....(2i-1) 1t. +=j, i( -1i = 2.4.6... 2i

124 2. To Obtain a Generating Function when Either of the Degrees of Freedom Is Even Case f = 2 -, f2= 2q, 1 2 u =p, vq + 1/2. where C aC and Then, — o,6' ( o+ 1 If i/Z LX A

125 Since, for each p, q, and w, the integrand is at most xPyq, we ob1 tain, after interchanging summation and integral signs,(except for -), which can be transformed into an elementary integral, by suitable change -of variable. First express where )) - V

126 (2) - - Put ~f 06~~~~~~~~~~~~~. r 3.c

where Then, put It,' = - 1 (t\, I' )it This reduces I. to 0sC\-ft _____ ____'_ (3)_ X\ = (.) e uet From (1) and (2), we get (4) J-,XY ~~;~4

128 Substitute for Ij according to (3), and thus, -l1(x, y) is obtained. 41(X, y), even after simplification, was found very cumbersome and is, therefore, not reported here. Case 2 fl = 2p, f2 =2q 2J - |m -04 where ~_ /'tte3+Ce~oti', x'~'/~. CD 3c tS'L /.x // L ( X )/VL \ AL Y L xlX x, 1

129 --: -- " - - %4 <(K) = ( KKK) - K (L(I) ( Sx SK\J S'j K(I4, E (c) are the complete elliptic integrals of the first and second kind, and are tabulated. 3. Program in the "MAD" Language for the IBM 709 Computer To Calculate the Densities -(dlflf2; 0) for Odd Integral Values of f! and f2 In the program, fl, f2, and # are changed to N1, N2 and p.

130 R CONVOLUTION OF 2 INDEPENDENT T-DISTRIBUR TIONS FOR VENKUTAI PATIL, 1-62 R INITIALIZATION. START READ FORMAT DATA,NSTART,NSTOP,TSTEP, DSTART,DSTOP,DSTEP VECTOR VALUES DATA=$2I3,4F6.0*$ R MAKE GOMMA TABLE. SQRPI=1./SQRT.(3.14159265) GOMMA (NSTART)=SQRPI THROUGH LOOP1A, FOR N=3,2,N.G.N START LOOPlA GOMMA(N START) =GOMMA(NSTART) *(N- 1.0)/ 1. (N-2.0) THROUGH LOOP1, FOR N=NSTART+2,2,N.G.NSTOP LOOP1 GOMMA(N)=GOMMMA(N-2)* (N- 1.0)/(N-2.0) R MAKE SQUARE ROOT TABLE. THROUGH LOOP2, FOR F=NSTART,2.0,F.G.NSTOP LOOP2 SQR(F)=SQRT.(F) R MAKE K AND K1 TABLES. K(1)=-0.5 K1(1)=1.0 THROUGH LOOP3, FOR N=2,1,N.G.(NSTOP+1)/2 X=2.*N Y=X-3. K(N)=K(N- 1)*Y/X LOOP3 K1 (N)=K1 (N- 1)*Y/(X-2.) R MAKE SEC, SEC2, AND TAN TABLES. CONV=0.174532925E-1 P=O THROUGH LOOP4, FOR ARG=TSTEP,TSTEP, 1 ARG.G.89.9 P=P+1 THETA(P)=ARG X = ARG*CONV Y=COS.(X) SEC(P)=1./Y SEC2 (P)=1./ (Y*Y) LOOP4 TAN (P)=SIN.(X)/Y TSTOP=P+1 THETA(0)=0.0 THETA(TSTOP)=90.0 R CALCULATE RHO THROUGH ITER,FOR N2= NSTART,2,N2.G.NSTOP

Q=(N2+1)/2 THROUGH ITER,FOR N1=N2,2, N1.G. NSTOP P=(Nl+l)/2 S=SQR(N1)/SQR(N2) GO = GOMMA (N2)* GOMMA(N1)/SQR(N2) DN=1 THROUGH IT, FOR D=DSTART,DSTEP,D.G.DSTOP GD2=D*D/N2 GD1=D*D/Nl THROUGH ITT, FOR T=1,1,T.E.TSOP ALPHA=S*TAN(T) ALP H=2 *ALPHA ALPH1=ALPHA+1.0 ALPH2=ALPHA*ALPH1 PHI=i./(ALP H1.P.2+GD2*SEC2 (T)) C( 1,1)=ALPHI*PHI THROUGH ITA,FOR I=2,1,I.G.P X=O.0 THROUGH IT AA, FOR IA=, 1,IA.G.I-2 IT AA X=C( IA,1)*K(I-IA)+X IT A C( I,1)=PHI*(Kl(I)+ALPH2*C( I-1,1)-ALPH *X) THROUGH IT B, FOR J=2,1,J.G.Q X=0.0 THROUGH ITO,FOR JA=1,1,JA.G.J-2 X=C(1,JA)*K(J- JA)+X C(1,J)=PEI* (ALPHA*K1 (J)+ALPH1* C(1,J- 1) -ALPH*X) THROUGH IT B, FOR I=2,1,I.G.P. X=0.0 THROUGH IT B1, FOR IA=1,,IA.G.I-1 THROUGH ITBI, FOR JA=1,1,JA.G.J-1 IT B1 X=C( IA,JA)*K(I-IA)*K(J-JA)+X Y=0.0 THROUGH IT B2, FOR IA=1,,IA.G.I-2 IT B2 Y=C( IA,J)*K(I-IA)+Y Z=0.0 THROUGH IT B3, FOR JA=1,1,JA.G.J-2 IT B3 Z=C( I,JA)*K(J-JA)+Z IT B C( I,J)=(ALPH1*C( I,J-1)+ALPH2*C( I-1,J) 1 -ALPH*(X+Y+Z))*PHI ITT RHO(DN,T)=C( P,Q)*GO*SEC(T) R **************

132 RHO(DN,0)=(GOMMA(N2) *SQRPI/SQR(N2))/ (1.+GD2).P.Q RHO(DN,TSOP)= GOMMA(N1)*SQRPI/SQR(N1)/ (1.+GD1).P.P IT DN=DN+1 R R PRINT ALL PAGES FOR N1,N2 THROUGH ITER, FOR COL=0,6,COL.G.TSOP DN=1 WHENEVER COL+5.G.TSTOP L=TSTOP OTHERWISE L=COL+5 END OF CO NDITIONAL PRINT FORMAT HEAD,N1,N2,THETA(COL)... 1 THETA(L) THROUGH ITER, FOR D=DSTART,DSTEP, D.G.DSTOP PRINT FORMAT LINE, D, RHO(DN,COL)... RHO(DN,L) ITER DN=DN+1 TRANSFER TO START R DECLARATIONS VECTOR VALUES HEAD=$26HlCONVOLUTED T1 DISTRIBUTION..,S10,5HN1 = 1 I3,7H, N2 = I3/3H- D S50,5HTHETA/S17,F4.1,5(S13,! F4.1)///*$ VECTOR VALUES LINE=$1H F4.1,S1,6(1PE17.8)*$ DIMENSION K(50),K1 (50),GOMMA(100), SQR(100), 1 THETA(90),SEC(90),SEC2 (90),TAN(90) 2 C( 400,DMC),RHO(9000,DMR) VECTOR VALUES DMC=2,1,20 VECTOR VALUES DMR=2,2,91 EQUIVALENCE(GOMMA(2),SQR(1)) INTEGER NSTART,NSTOPN,P,Q,N1,N2,DN,T,I, 1 IA,J,JA,L,FIXD, 1 TSTOP,COL END OF PROGRAM

LIST OF REFERENCES

Aiken, H. H., et al. (1952), Tables of the Error Function and of Its First Twenty Derivatives. Harvard University Press.

Bartlett, M. S. (1936), "The information available in small samples," Proc. Camb. Phil. Soc., 32, 560-66.

Behrens, W. U. (1929), "Ein Beitrag zur Fehlerberechnung bei wenigen Beobachtungen," Landw. Jb., 68, 807-37.

Byrd, P. F., and Friedman, M. D. (1954), Handbook of Elliptic Integrals for Engineers and Physicists. Springer Verlag.

Chapman, D. G. (1950), "Some two sample tests," Ann. Math. Statist., 21, 469-87.

Cramér, H. (1946), Mathematical Methods of Statistics. Princeton University Press.

Fisher, R. A. (1925), "Expansion of Student's integral in powers of n^{-1}," Metron, 5, 109-20.

Fisher, R. A. (1935), "The fiducial argument in statistical inference," Ann. Eugen., 6, 391-98.

Fisher, R. A. (1941), "The asymptotic approach to Behrens's integral with further tables for the d test of significance," Ann. Eugen., 11, 141-72.

Fisher, R. A., and Healy, M. J. R. (1956), "New tables of Behrens' test of significance," J. R. Statist. Soc. B, 18, 212-16.

Fisher, R. A., and Yates, F. (1957), Statistical Tables for Biological, Agricultural and Medical Research. London: Oliver and Boyd (fifth edition).

Greenwood, J. A., and Hartley, H. O. (1962), Guide to Tables in Mathematical Statistics. Princeton, New Jersey: Princeton University Press.

Hildebrand, F. B. (1956), Introduction to Numerical Analysis. McGraw-Hill.

James, G. S. (1959), "The Behrens-Fisher distribution and weighted means," J. R. Statist. Soc. B, 21, 73-90.

Jeffreys, H. (1940), "Note on the Behrens-Fisher formula," Ann. Eugen., 10, 48-51.

Jeffreys, H. (1948), Theory of Probability. Oxford: University Press (2nd edition).

Kendall, M. G. (1943), The Advanced Theory of Statistics, Vol. 1. Griffin and Co.

Lindley, D. V. (1961), "The use of prior probability distributions in statistical inferences and decisions," Proc. Fourth Berkeley Symposium, 1, 453-68.

Raiffa, H., and Schlaifer, R. (1961), Applied Statistical Decision Theory. Boston: Division of Research, Harvard Business School.

Ruben, H. (1960), "On the distribution of the weighted difference of two independent Student variates," J. R. Statist. Soc. B, 22, 188-94.

Satterthwaite, F. E. (1946), "An approximate distribution of estimates of variance components," Biom. Bull., 2, 110-14.

Savage, L. J. (1954), The Foundations of Statistics. New York: Wiley.

Savage, L. J., et al. (1962), The Foundations of Statistical Inference. New York: John Wiley.

Scheffé, H. (1943), "On solutions of the Behrens-Fisher problem, based on the t-distribution," Ann. Math. Statist., 14, 35-44.

Scheffé, H. (1944), "A note on the Behrens-Fisher problem," Ann. Math. Statist., 15, 430-32.

Smirnov, N. V. (1961), Tables for the Distribution and Density Functions of t-Distribution. New York, Oxford, London, Paris: Pergamon Press.

Sukhatme, P. V. (1938), "On Fisher and Behrens' test of significance for the difference in means of two normal samples," Sankhyā, 4, 39-48.

Sukhatme, P. V., et al. (1951), "Revision of end figures of Behrens-Fisher criterion d," Ind. Soc. Agr. Statist. Jn., 3, 9.

Wallace, D. L. (1958), "Asymptotic approximations to distributions," Ann. Math. Statist., 29, 635-51.

Welch, B. L. (1938), "The significance of the difference between two means when the population variances are unequal," Biometrika, 29, 350-62.

Welch, B. L. (1947), "The generalization of 'Student's' problem when several different population variances are involved," Biometrika, 34, 28-35.
