NON-LINEARITIES IN GENETIC ADAPTIVE SEARCH

by Daniel Raymond Frantz

September 1972

Technical Report No. 138

With assistance from:
National Science Foundation, Grant No. GJ-29989X, Washington, D.C.
Department of Health, Education, and Welfare and National Institutes of Health, Grant No. GM-12236, Bethesda, Maryland


ABSTRACT NON-LINEARITIES IN GENETIC ADAPTIVE SEARCH by Daniel Raymond Frantz Chairman: John H. Holland Methods of adaptive search must contain, at least implicitly, the ability to detect and act upon non-linearities in their environments (i.e., in the functions to be optimized). If such knowledge can be made explicit, this information may be of value in constructing models of the environment and may lead to faster and more successful adaptation. One method of adaptive search, based on genetics, is called the Reproductive Plan, due to Holland. Requirements for its use are: 1) description of the environment must be in terms of a set of parameters; 2) the quantity to be optimized (the payoff) must be a function of these parameters; 3) there must be an initial population of parameter sets. The parameters are listed as a string of values, called a chromosome. The environments used in this research had twenty-five parameters, each of which could take two values. Dependent groups of up to nine parameters were used. In this variation of the Reproductive Plan new points to be tested are generated by a two step process: 1) Two parents are randomly selected from the current population with the selection biased according

to payoff. 2) The two parents are combined by operators to form an offspring, the new point; these operators are similar to the genetic versions of crossover, inversion, and mutation. The operation of the Reproductive Plan suggests two ways by which functional dependencies in the environment may be detected. Groups of parameters which interact non-linearly may have frequencies of combination different from those predicted by the individual parameter frequencies. This is testable by a multi-dimensional chi-squared contingency table. Another characteristic suggests that the position of parameters on the chromosome may be important to the rate of evolution or to the equilibrium point of the population payoff. If there is an adaptive advantage in position, a genetic operator such as inversion should be able to produce chromosomes with better permutations of parameters. Computer experiments were performed to test these hypotheses. Multi-dimensional chi-squared contingency table tests showed that most groups of parameters were statistically dependent, a by-product of the Reproductive Plan, but non-linear payoff groups were definitely distinguishable from linear payoff groups. Analyses were then performed to detect the dependent groups when no prior information existed: contingency table values for all pairs of parameters were calculated. The pairs from non-linear groups showed higher association indices. Thus, dependent groups are detectable since all their pairs have much higher associations. Studies of the equilibrium level of the population payoff average did not show any consistent position effect, although there were some positive results. However, investigation of the rate of evolution showed that populations in which groups of dependent parameters are close

together significantly and consistently outperform populations in which the groups are spread apart on the chromosome. Populations with spread groups often did not achieve optimum or near-optimum points in the most complex environments. Experiments using the inversion operator were unable to demonstrate its ability to capitalize on the position effect. The advantage due to the position effect lasts for too short a time in the environments used. The role of inversion is open for further study since it may be shown effective in more complex environments with longer adaptation times.

ACKNOWLEDGEMENTS

I would like to thank those who made this research not only possible but also worthwhile and enjoyable: foremost among all, my thesis chairman, John H. Holland, who introduced me to Reproductive Plans and kept me on the paths of righteousness while investigating them; my committee members Bernard P. Zeigler, Larry K. Flanigan, and Julian P. Adams, as well as W.J. Schull, a member during the early stages; the Logic of Computers Group (and its director Arthur W. Burks) which supported me financially, provided the computing equipment upon which all the experimentation took place, and whose members provided intellectual stimulation of all sorts; Monna Whipp and Karen Vora who helped type many of the tables; especially our secretary, Janet McDougall, who saw this document through several drafts with outstanding competence and cheerfulness; and finally my wife, Mary, who, often more than I, had faith that I would finish. This research was supported by the National Science Foundation, Grant No. GJ-29989X and the National Institutes of Health, Grant No. GM-12236.

ii

TABLE OF CONTENTS
                                                                  Page
CHAPTER ONE: A Different Use for Adaptive Search                     1
    Device Composition                                               1
    Non-linearity (Dependence of Components)                         2
    Hierarchical Models                                              4
    The Representation Problem                                       6
    Summary                                                          7
CHAPTER TWO: The Reproductive Plan                                   8
    Definition of the Genetic Algorithm                              8
    Implications of Reproductive Plan for Dependencies              11
    Frequency of Gene Combinations                                  12
    Position Effect                                                 15
CHAPTER THREE: The Experimental Basis                               18
    The Environment                                                 18
    Non-linearity                                                   18
    Discreteness                                                    21
    Functional Form                                                 23
    The Payoff Function                                             26
    Environments Used                                               26
    The Adaptive Program                                            42
    Genes, Alleles, and Chromosomes                                 42
    Population Size                                                 44
    Mating and the Genetic Operators                                45
    Inversion                                                       48
    Crossover                                                       51
    Mutation                                                        52
    Migration                                                       52
    Selection                                                       53
    Monte Carlo Methods                                             54
    Program Description                                             55
CHAPTER FOUR: Position Effects                                      59
    Equilibrium Populations                                         60
    Experimental Procedure                                          60
    A Simple Adaptive System                                        62
    The Problem of Homogeneity                                      65
    Variance by Mutation                                            67
    Variance by Decreased Selection                                 68
    Variance by Migration                                           70
    Best/Worst Tests                                                74
    Summary                                                         77

iii

TABLE OF CONTENTS (Cont'd)

    Evolving Populations                                            79
    Best/Worst Evolution                                            80
    Approach to the Optimum                                         98
    Inversion Experiments                                          101
    Probabilities of Permutations                                  103
    Experimental Protocol                                          105
    Equilibrium Test                                               113
    Evolving Tests                                                 115
    Summary                                                        119
CHAPTER FIVE: Frequency Effects                                    122
    Existence of a Frequency Effect                                122
    Chi-Squared Analysis                                           122
    Multiple-Gene Tests                                            124
    Summary                                                        143
    Detecting Dependent Groups                                     145
    Pair Analysis                                                  146
    Analysis at the End of Runs                                    146
    Cumulative Analysis                                            148
CHAPTER SIX: Conclusions                                           163
APPENDIX: Ten Random Permutations of Twenty-five Genes             168
BIBLIOGRAPHY                                                       169

iv

LIST OF TABLES

Table                                                             Page
4.1: Experiment A1. Gene Positions and Results                      63
4.2: Experiment C1. Gene Positions and Results                      65
4.3: Experiments in Mutation Rates                                  68
4.4: Experiments in Reduced Selection                               70
4.5: Experiments in Migration                                       72
4.6: Best/Worst Equilibrium Results                                 76
4.7: Summary of Best/Worst Evolution Experiments                    97
4.8: Best Individuals vs. Population Payoff                        100
4.9: Probability of Dispersion of Genes on a Chromosome            104
4.10: Inversion/Equilibrium Experiments                            114
4.11: Inversion/Evolution Experiments (cont'd)                     117
4.11: Inversion/Evolution Experiments                              118
5.1(a): Multi-gene X² Tests: Experimental Parameters               125
5.1(b): Multi-gene X² Tests: Results of First Runs (cont'd)        126
5.1(b): Multi-gene X² Tests: Results of First Runs                 127
5.1(c): Multi-gene X² Tests: Results of Second Runs                128
5.2: Selected Points of the X² Distribution                        129

v

LIST OF FIGURES

Figure                                                            Page
2.1: Research Schema                                                17
3.1: Environment 1                                                  27
3.2(a): Payoff Group A                                              28
3.2(b): Environment 2                                               28
3.3(a): Payoff Group B                                              29
3.3(b): Environment 3                                               29
3.4: Environment 4                                                  30
3.5(a): Payoff Group C                                              31
3.5(b): Environment 5                                               31
3.6(a): Payoff Group D                                              32
3.6(b): Environment 6                                               32
3.7(a): Payoff Group E                                              33
3.7(b): Payoff Group F                                              34
3.7(c): Environment 7                                               34
3.8: Environment 8                                                  35
3.9(a): Payoff Group G (cont'd)                                     36
3.9(a): Payoff Group G                                              37
3.9(b): Environment 9                                               37
3.10(a): Payoff Group H (cont'd)                                    38
3.10(a): Payoff Group H (cont'd)                                    39
3.10(a): Payoff Group H (cont'd)                                    40
3.10(a): Payoff Group H                                             41
3.10(b): Environment 10                                             41
3.11: Generation of a New Population (cont'd)                       57
3.11: Generation of a New Population                                58
4.1: Output from Experiment D1                                      81
4.2: Output from Experiment D3 (cont'd)                             82
4.2: Output from Experiment D3                                      83
4.3: Output from Experiment D5 (cont'd)                             84
4.3: Output from Experiment D5 (cont'd)                             85
4.3: Output from Experiment D5                                      86
4.4: Output from Experiment D6 (cont'd)                             87
4.4: Output from Experiment D6 (cont'd)                             88
4.4: Output from Experiment D6                                      89
4.5: Output from Experiment D7 (cont'd)                             90
4.5: Output from Experiment D7 (cont'd)                             91
4.5: Output from Experiment D7                                      92
4.6: Output from Experiment D8 (cont'd)                             93
4.6: Output from Experiment D8 (cont'd)                             94
4.6: Output from Experiment D8                                      95
4.7: Output from Experiment MH3/4                                  111

vi

LIST OF FIGURES (Cont'd)

5.1(a): Sample X² Multi-gene Test-Experiment J1                    130
5.1(b): Sample X² Multi-gene Test-Experiment J4                    131
5.2(a): Sample X² Multi-gene Test-Experiment L5                    132
5.2(b): Sample X² Multi-gene Test-Experiment L5                    133
5.3: Pair Chi-squared Values for Experiment H1                     150
5.4: Pair Chi-squared Values for Experiment H2                     151
5.5: Pair Chi-squared Values for Experiment H6                     152
5.6: Pair Chi-squared Values for Experiment H8                     153
5.7(a): Pair Chi-squared Values for Experiment I3                  154
5.7(b): Pair Chi-squared Values for Experiment I3 (8 Generations)  155
5.8: Pair Chi-squared Values for Experiment J5                     156
5.9(a): Pair Chi-squared Values for Experiment K7                  157
5.9(b): Pair Chi-squared Values for Experiment K8                  158
5.10(a): Pair Chi-squared Values for Experiment L5                 159
5.10(b): Pair Chi-squared Values for Experiment L6                 160
5.11: Cumulative PCVs for ENV 7                                    161
5.12: Cumulative PCVs for ENV 8                                    162

vii

CHAPTER ONE
A DIFFERENT USE FOR ADAPTIVE SEARCH

Device Composition

Many problems in adaptive search can be viewed as the creation of an optimal device from components, each of which has several possible instances. If either the number of components or the number of instances for each component is large, then the size of the search space is usually quite large, requiring the use of adaptive algorithms rather than enumeration to perform the job in a reasonable amount of time. Adaptive search is made feasible by the decomposition or coordinatization of the device: if the device were not made up of subunits but was instead a monolithic whole, no regularity in the makeup of the solution could be exploited and the only method of search would be to enumerate each of the possible devices. Examples of this kind of structure in an adaptive framework are numerous. In control theory the components are continuous control parameters (e.g., voltage levels, flow rates) and the optimum device is one which specifies values for the parameters yielding a minimum error in a selected subset of the dependent variables. In a pattern recognition device the components may be the weights attached to feature detectors, specifying the importance of that feature in picking out a particular pattern class. The optimum is that set of weights which discriminates perfectly. Similarly in Samuel's checker player (15), the components are the weights attached to parameters which measure some feature of the board. The optimum is that set of weights which predicts the value of a board in the same manner as a minimax strategy

1

2

for the game. Other examples can be found in genetics, economic planning, etc. The representation of the device in terms of the particular components chosen to describe it may be more or less adequate with respect to the overall problem of which adaptive search is only a small part. In economic theory, the choice of which activities to include in a mix may be a matter of opinion, but once those activities have been chosen, there is no doubt that the optimum obtained with respect to that base set is a solution to the problem posed. That is, the representation or coordinatization is adequate by design. The choice of activities is a value decision which cannot be decided or explored by adaptive search. On the other hand, if the set of detectors for the checker player is inadequate, the optimum with respect to those parameters will still not approach minimax. Thus, adequacy of the base set of components for the overall task (the so-called "representation problem") is in large part separable from the problem of adaptive search, which merely optimizes with respect to a given set of components. Once a representation is determined the search space is well-defined. Different representations may lead to different "shaped" spaces for essentially the same problem, but any one representation yields a definite space.

Non-linearity (Dependence of Components)

Difficult optimization problems are those which are highly non-linear with respect to the parameters or components. In its most difficult sense, "non-linear" means that the optimal setting for a

3

particular parameter (i.e., that setting yielding a global maximum) depends on the setting of one or more other parameters. The resulting dependence makes it impossible to optimize the solution component by component. Combinations of parameter values are more important than individual values. When there are a finite number of parameter values there are many more combinations than there are individual values, and this is where adaptive search enters. Somehow combinations must be tried, the results remembered, and new combinations tried on the basis of previous information. The opposite of non-linearity (or dependence) is linearity or independence. Two parameters are independent of each other (are relatively linear) if the effect a value for one has on the optimum is not influenced by the value of the other. It is entirely possible for every component to depend on every other component. In this case, unless there is some regularity in the interaction, the search algorithm may have as hard a time as if it were dealing with a non-coordinatized system: it may break down and only be able to do an enumerative search. Most non-linear groups of components have regularities with respect to the component values; in functional optimization this is similar to stating that the function is at least piece-wise continuous. So even non-linear groups provide "handles" for an adaptive plan. If there are groups of parameters which are internally dependent but independent of other groups, the search algorithm has an easier time since it can optimize the groups independently (which at worst is an enumeration over much smaller spaces) and sum the results of the individual optima.
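The advantage of internally dependent but mutually independent groups can be made concrete with a small sketch (not from the thesis; the payoff function, its two three-gene groups, and their rewarded combinations are invented for illustration). A payoff over six binary parameters that decomposes into two independent groups can be maximized by enumerating each group's 2³ = 8 settings rather than all 2⁶ = 64 combinations:

```python
import itertools

# Hypothetical payoff over six binary parameters that decomposes into two
# independent three-gene groups, each scored by its own subfunction.
def group_payoff_a(bits):
    # Non-linear within the group: one specific combination is rewarded.
    return 5 if bits == (1, 0, 1) else sum(bits)

def group_payoff_b(bits):
    return 4 if bits == (0, 1, 1) else 0

def payoff(x):
    # Total payoff is the sum over the independent groups.
    return group_payoff_a(x[0:3]) + group_payoff_b(x[3:6])

# Because the groups are independent, each can be optimized by enumerating
# its own 2**3 = 8 settings instead of all 2**6 = 64 combinations.
best = []
for group in (group_payoff_a, group_payoff_b):
    best += max(itertools.product((0, 1), repeat=3), key=group)

print(tuple(best), payoff(tuple(best)))
```

The same idea scales: k independent groups of m genes cost k·2^m evaluations instead of 2^(k·m), which is why detecting such groups is worth the trouble.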

4

For any particular adaptive search algorithm to operate well it must take cognizance of the non-linearities of the system. A generally applicable algorithm must be able to handle non-linearities of any type or order. Thus any adaptive algorithm must be able to build up within itself, at least implicitly (as part of its state or data), information about the non-linearities of the search space. Even search algorithms which contain little explicit memory (such as some of the gradient methods which keep only one point in the space) can be considered to contain or generate this implicit information. No matter how it is achieved, selection of the next point to be tested must depend on knowledge of the space. Ultimately, the trajectory of points through the space may be used as clues to the structure. As the search progresses over more samples from the space, more complete and certain knowledge must be gained. (Since plans operate with finite memory, it is reasonable that knowledge about bad sections of the space may be lost; only information affecting good points will be kept.) Despite having this information about the non-linearity of the space implicit within themselves, as normally formulated, optimization or search procedures have only one result: the optimum point (or at least a good point). It is the intention of this thesis to show how knowledge of dependencies resides in at least one class of search algorithms, the Holland Reproductive Plans (9,10,11). In addition, we intend to show that this implicit information can be made explicit.

Hierarchical Models

Concern over the form of a procedure's method of exploiting functional

5

dependency arises for two reasons. The first comes directly from the definition of a procedure and the nature of science. Each adaptive algorithm has a justification of its efficacy; part of this justification is an explanation of how the algorithm treats non-linearities. Proof of the value of the plan must include verification of such treatment (to eliminate chance or other factors). While previous research on Holland Reproductive Plans has verified their efficiency with respect to other plans, this paper is the first full-scale explicit investigation into the internal workings of the algorithm, an intrinsically interesting topic. The second, more significant, reason for investigation is the importance of learning something about the spaces one is searching. (The class of spaces we shall investigate is described in Chapter Three.) This is directly related to modelling theory with respect to the usefulness of a level of description of a system. In terms of both human preference and mathematical tractability it is advantageous to explain complex systems in terms of a small number of factors. Interaction of components beyond a certain number or level of non-linearity becomes incomprehensible to the human mind; the mathematics of such systems is often incapable of closed solution. If at all possible, scientists will aggregate variables or describe the system at a level which can be handled easily. If necessary, several levels of abstraction or aggregation may be employed to achieve a description appropriate to a particular purpose. This process is called model building via hierarchy. Dependent groups of coordinates or variables are identified and then treated as a single, well-understood quantity on a higher level. In the ideal

6 case, aggregation may eventually lead to a linear system, which both humans and mathematics can treat most easily. In other cases, assuming the reduction preserves the structure of the system, a hierarchical model still has great intuitive explanatory powers. Thus, determining the dependencies among components in a particular environment is an aid to model building. Extracting information from an adaptive algorithm regarding non-linearities may be no more of an aid than merely pointing out that some components are dependent without giving specific information as to what the dependence is. However, if that knowledge reduces the number of parameters from a hundred to six, the reduction of the modelling problem is one of several orders of magnitude. The Representation Problem The modelling procedure is one aspect of the representation problem previously mentioned. The "problem" resides in the uncertainty of which information in the environment is relevant to the task at hand. Ordinarily in artificial intelligence work there is such an abundance of data available that some reduction (or selection) must be made (by a human intermediary) before the programmed learning can occur. Although we do not wish to stress this issue, being able to detect dependencies may eventually be of use in the solution of this problem. By feeding all of the potentially useful information into an adaptive program we may obtain information not only about good points in the search space but also about the dependencies. Most likely the oversupply of parameters, some of which may be irrelevant or redundant, will make for

7 slow adaptation —i.e., the representation is not the most useful. But by means of the dependency knowledge gained on this first attempt it may be possible to obtain a better representation, leading to faster adaptation and so forth. Thus, not only is the creation of hierarchical models useful for human understanding but, practically, it may also lead to the increased adaptive efficiency of the man-machine system. Summary When faced with a complex search space two kinds of information are important: the optimum achievable in the space and the "shape" or dependencies in the space. Knowledge of dependencies is important for the building of hierarchical models of the space. Rather than do a completely independent analysis of the space for the model building it would be advantageous to be able to use the adaptive search procedure to determine dependencies also. Since the adaptive program must build up within itself some knowledge of the space in order to find the optimum it is reasonable to try to make this knowledge explicit. Even if it cannot be made explicit in every case, it is valuable to verify that the adaptive search procedure contains the information in the manner expected so that it builds models implicitly.

CHAPTER TWO
THE REPRODUCTIVE PLAN

Reproductive Plans are a class of adaptive algorithms based on genetics. Holland has given theoretical justifications for the use of such plans in terms of their efficiency (11). Previous simulation work (Bagley, Cavicchio, Hollstien, Bosworth (1,3,12,2)) has shown the actual superiority of such algorithms over other search procedures. Since much has been written on the basis of Reproductive Plans (see the above references), we will include here only a summary of their salient points.

Definition of the Genetic Algorithm

Problem solving with the Reproductive Plans requires an environment (i.e., the problem area to be solved) with two properties: 1. There must be a device decomposition as discussed in Chapter One. The solution must be expressed in terms of a list of parameters, with admissible substitutions for each parameter completely defining the search space. We shall thus refer to a device alternately as a string or an individual. (No order to the parameters is implied by use of the word "string"; some order must be chosen for convenience.) 2. There must be a "payoff" (or goodness, fitness, utility) associated with each device described by the parameters, and this payoff is the quantity to be maximized.

8

9

Briefly, then, the main features of a reproductive plan are the following: 1. A "population" of devices forms the memory of the system at any point in time. 2. A new population is formed from the old population by the following two-step process: a) An intermediate population is formed by reproducing each member of the old population a number of times proportional to its payoff. For example, if two devices (strings of parameters) S1 and S2 have payoffs 3 and 7, respectively, then 3 copies of S1 and 7 copies of S2 will enter the intermediate population. b) The new population is then formed by randomly choosing members of the intermediate population and modifying them by "operators". These operators are strictly string operators and are largely independent of the environment. The most important feature of a Reproductive Plan is 2a: emphasizing strings in proportion to their payoff. Thus, the proportion of better strings increases exponentially with respect to the average of the population. The choice of operators (2b) is critical. Previous work by Friedberg (8) and Fogel (6) failed precisely because they did not appreciate the importance of analyzing what the operators did, so that they destroyed the advantages of proportional reproduction. Choice of operators determines the exact form of the adaptive algorithm within the class of Reproductive Plans. Since many of the ideas embodied in the rest of this report are taken from genetics we will summarize the applicable genetic vocabulary
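The two-step process above can be sketched in a few lines (a minimal illustration, not the thesis's program; the population contents are the hypothetical S1 and S2 from the text, and only step 2a is carried out, with step 2b indicated by a random draw):

```python
import random

random.seed(1)

# Toy population: the two strings from the text, S1 with payoff 3 and
# S2 with payoff 7 (integer payoffs keep the copy counts exact).
population = {"S1": 3, "S2": 7}

# Step 2a: each member enters the intermediate population a number of
# times proportional to its payoff.
intermediate = [s for s, pay in population.items() for _ in range(pay)]
print(intermediate.count("S1"), intermediate.count("S2"))  # 3 7

# Step 2b would draw members of the intermediate population at random
# and modify them by string operators to form the new population.
parent = random.choice(intermediate)
```

Because random draws in step 2b sample the intermediate population uniformly, a string's expected representation among parents is exactly proportional to its payoff, which is the emphasis property discussed above.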

in terms of the ideas already presented: A gene is a functional unit; it corresponds to our notion of parameter. An allele is a particular instance of a gene: a value for the parameter. A chromosome is a string of genes (a one-dimensional list). A locus is a position on a chromosome (e.g., the first position from the left hand end). All the genetic operators are probabilistic in their effects; that is, there is a certain probability associated with their application in forming members of the new population. Although there are many string operators possible and although there are many operators recognized by biochemical geneticists, we shall investigate only the operators mutation, crossover, and inversion. Mutation changes an allele into another allele with some (small) probability. Its main function is to supply variability to the population by maintaining at least a small proportion of each allele. Crossover takes two chromosomes from the old population (parents) and interchanges parts of them, producing two new individuals. For example, for chromosomes of length seven, crossover might operate by exchanging exactly one portion in the following manner:

    AbcDEFg  ->  AbcDefg
    abCDefg  ->  abCDEFg

It thus serves two functions. First, it preserves a large measure of association between the alleles in the parent chromosomes. In the example, the combinations in the first four and last three positions were preserved. If there were any interactions among these genes (i.e., any advantage in being together in those combinations), that

interaction or advantage is preserved. Since application of the operator is probabilistic it may cross over at other points also, allowing other interacting gene combinations to be preserved. Secondly, by interchanging portions of the chromosomes, crossover generates tests of entirely new combinations of genes. In fact, crossover is recognized by geneticists as the most important search factor in natural adaptation. The inversion operator randomly inverts a section of a chromosome, thereby changing the distance between genes. For example,

    AbcdeFG  ->  AedcbFG

where the section bcde has been inverted. Since inversion does not change alleles but rather associations between genes, it cannot directly affect the payoff of an individual as can crossover and mutation. The role of inversion is explored below under "Position Effect". The exact form of the Reproductive Plan used and variations on the operators are given in Chapter Three.

Implications of Reproductive Plan for Dependencies

The above brief analysis is a justification for the use of the crossover and inversion operators. As such it is a claim as to how the Reproductive Plan (with these operators) takes advantage of non-linearities. Let us investigate these two statements: 1) Combinations of genes which contribute to high payoff are increased in the population; 2) When inversion is used dependent genes tend to congregate together on the chromosome.
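As a rough sketch (not the thesis's program, whose exact operator forms appear in Chapter Three), the three operators can be written as string operators on a chromosome represented as a list of (gene, allele) pairs; carrying the gene name explicitly lets inversion reorder genes without touching any allele. The probabilities and the assumption that both parents carry their genes in the same order are simplifications:

```python
import random

# A chromosome is a list of (gene, allele) pairs so that inversion can
# reorder genes without changing which allele each gene carries.

def mutate(chrom, p=0.01, rng=random):
    # Change an allele into the other allele with small probability p.
    return [(g, 1 - a) if rng.random() < p else (g, a) for g, a in chrom]

def crossover(mom, dad, rng=random):
    # Exchange the segments beyond a random cut point; the combinations
    # on each side of the cut are passed on intact to the offspring.
    cut = rng.randrange(1, len(mom))
    return mom[:cut] + dad[cut:], dad[:cut] + mom[cut:]

def invert(chrom, rng=random):
    # Reverse a random internal segment, changing the distances between
    # genes but no gene's allele (and hence not the payoff).
    i, j = sorted(rng.sample(range(len(chrom) + 1), 2))
    return chrom[:i] + chrom[i:j][::-1] + chrom[j:]

rng = random.Random(0)
mom = [(g, 1) for g in "abcdefg"]   # plays the role of the capital-letter parent
dad = [(g, 0) for g in "abcdefg"]
kid1, kid2 = crossover(mom, dad, rng)
```

Note the design point this representation makes visible: after an inversion, the two parents of a later crossover may carry genes in different orders, so a real plan (as discussed in Chapter Three) must align homologous genes before crossing over; this sketch sidesteps that by keeping both parents in the same order.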

Frequency of Gene Combinations

In the basic reproductive step (2a) increasing the frequency of an above average string in the population according to its relative payoff increases the frequency of all the substrings contained in that string. If combinations are not too greatly disturbed by the operators in changing from the intermediate population to a new population (2b), then the frequency of above average combinations tends to increase in the population. Holland suggests that the change is more rapid than might be predicted on an individual gene basis. Let us consider a set of dependent genes, S, and a set of instances, S0, S1, ..., Sn, in which S0 has the above-average payoff. The combination S0 is increased at the expense of the other combinations, but the others are still present. Since S0 is increased, it may also be that the frequencies of the alleles of the individual genes in the combination are also increased. We can then ask whether the combination increases faster than the individual alleles. Consider the following simple model: A chromosome is made up of two genes, a and b, each with alleles 0 and 1. Let there be N strings in the population divided evenly among the four combinations 00, 01, 10, and 11. Let the payoffs be according to the following table:

    string   payoff
      00        1
      01        1
      10        1
      11       1+S

where S is the selection coefficient, a quantity greater than zero.

Let

    f_a  = frequency of 1 alleles for gene a = 1/2
    f_b  = frequency of 1 alleles for gene b = 1/2
    f_ij = frequency of the combination ij (i = 0,1; j = 0,1) = 1/4.

Thus, in the initial population f_00 = (1-f_a)(1-f_b), f_11 = f_a f_b, etc.; that is, the individual gene frequencies perfectly predict the combinations. Now, carrying out only the reproductive step of the genetic algorithm we obtain the following population:

    type    number
     00     (N f_00)(1)   = N/4
     01     (N f_01)(1)   = N/4
     10     (N f_10)(1)   = N/4
     11     (N f_11)(1+S) = (N/4)(1+S)

with the total number of individuals in the intermediate population being

    N' = (N/4)(4+S).

Recalculating the frequencies of combination and of individual genes in the new population we obtain:

    f'_00 = f'_01 = f'_10 = (N/4) / [(N/4)(4+S)] = 1/(4+S)

    f'_11 = (N/4)(1+S) / [(N/4)(4+S)] = (1+S)/(4+S)

    f'_a = f'_10 + f'_11 = (2+S)/(4+S)

    f'_b = f'_01 + f'_11 = (2+S)/(4+S)

But now attempting to predict f'_11 from f'_a f'_b we obtain

    f'_a f'_b = (2+S)²/(4+S)².

To find the difference we calculate that

    f'_11 - f'_a f'_b = (1+S)/(4+S) - (2+S)²/(4+S)² = S/(4+S)² > 0

i.e., f'_11 > f'_a f'_b, which says that the frequency of combination cannot be accurately predicted by the individual gene frequencies: they underpredict. Other factors enter into this calculation in more general cases. The non-linearity we have given is not the only type possible; however, we are hampered by not having a well-developed theory of non-linearity for reference. In addition, different initial conditions and more genes lead us into a morass of analytic difficulties. Population geneticists treat this problem under the heading of "linkage disequilibrium". Finally, operators (mutation and crossover) depress the effect and further complicate the analysis. In any case, the same simple calculation shows that a linear (additive) payoff leads to overprediction of the combination, i.e., f'_11 < f'_a f'_b. The statistical concept of contingency tables uses the chi-square criterion for determining dependencies of factors. As such it may be useful for determining that a population has evolved under the influence of non-linearities. It is well known as a post hoc analytic aid in genetics; Crow and Kimura include the chi-square analysis as an appendix to their introductory textbook An Introduction to Population Genetics Theory (4). But even such basic knowledge has not been heretofore used for the purposes we intend.
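The calculation above is easy to verify numerically. The following sketch carries out the single reproductive step with exact rational arithmetic and checks that f'_11 exceeds f'_a f'_b by S/(4+S)², the underprediction derived in the text (the choice S = 1/2 is arbitrary; any S > 0 gives the same sign):

```python
from fractions import Fraction

S = Fraction(1, 2)  # selection coefficient; any S > 0 gives the same sign

# One reproductive step: the four equally frequent combinations are
# copied in proportion to their payoffs 1, 1, 1, 1+S.
weight = {"00": 1, "01": 1, "10": 1, "11": 1 + S}
total = sum(weight.values())          # proportional to N' = (N/4)(4+S)
freq = {k: Fraction(w) / total for k, w in weight.items()}

f11 = freq["11"]                      # (1+S)/(4+S)
fa = freq["10"] + freq["11"]          # (2+S)/(4+S)
fb = freq["01"] + freq["11"]          # (2+S)/(4+S)

# The individual gene frequencies underpredict the combination:
assert f11 - fa * fb == S / (4 + S) ** 2
print(f11 > fa * fb)  # True
```

Replacing the payoffs by an additive table (e.g. 1, 1+s, 1+s, 1+2s) and rerunning the same computation exhibits the opposite sign, the overprediction noted above for linear payoff.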

15 The important point to note is that the reproductive plan inherently produces this effect as a means of dealing with the environment. Chapter Five contains descriptions of experiments designed to verify the effect and to make the knowledge contained in the population explicit to the experimenter. Position Effect Fisher (5) first argued that chromosomes on which dependent genes are close together have an adaptive advantage at equilibrium over chromosomes on which the genes are farther apart. Again under the banner of linkage disequilibrium, the population geneticists have proposed models to deal with this conjecture. It seems fairly well proved for most two-gene models, but there is room for'doubt in three (or more) gene models. Turner (16) provides a general discussion of diploid models of this sort. At any rate, most models discuss only the equilibrium conditions and not conditions of changing populations. For example, an equilibrium level might be low due to a particular distance, but that gene distance may have caused faster evolution. The intuitive explanation to support the hypothesis is that proximate genes are less likely to be separated by crossover. If the crossover separates genes which are independent, no harm results since the payoff associated with a gene is independent of the value of the other genes. On the other hand, if the genes involved are dependent on each other, a good combination broken up results in more of a loss. The important point to note is that position does not contribute directly to the payoff of a single chromosome —the payoff depends

16 strictly on the alleles present. However, position does enter into the ability of an individual to pass its good combinations to its descendants, and thus into the ability to produce good descendants. Chromosomes with good permutations of genes are more likely to have descendants in future generations than chromosomes with bad permutations. This constitutes an adaptive advantage. Chapter Four describes computer experiments performed to determine whether the position effect actually exists in some class of environments and whether it can be detected. Briefly, there are two ways of approaching this problem: with populations in equilibrium and with evolving populations. Populations in equilibrium (under the normal artificial conditions used) have the rather difficult property that they tend towards homogeneity (i.e., all members tend to be alike). This presents a problem (as described in Chapter Four) so that we introduce methods of increasing population variance at equilibrium to help bring out the position effect. Using evolving populations we study the rate at which populations evolve as a function of position. Figure 2.1 summarizes the types of experiments run to test the effects of dependencies in the environments and to try to discover these dependencies.
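The intuition that proximate genes are rarely separated can be checked with a small simulation (not from the thesis): under a single random cut point, the chance that the cut falls between two dependent genes grows in proportion to their distance. The chromosome length of 25 matches the thesis's experiments; the trial counts and placement scheme are illustrative:

```python
import random

LENGTH = 25    # chromosome length used in the experiments of this thesis
TRIALS = 20000

def separated(distance, rng):
    # Place two dependent genes `distance` loci apart at a random spot;
    # a single random cut point separates them iff it falls between them.
    left = rng.randrange(LENGTH - distance)
    cut = rng.randrange(1, LENGTH)    # cut between loci cut-1 and cut
    return left < cut <= left + distance

rng = random.Random(42)
for d in (1, 6, 12, 24):
    p = sum(separated(d, rng) for _ in range(TRIALS)) / TRIALS
    print(d, round(p, 3))             # empirically close to d / (LENGTH - 1)
```

Adjacent genes (distance 1) are split in roughly 1 of 24 crossovers, while genes at opposite ends are split every time, which is precisely why a good combination held by proximate genes is more reliably passed on to descendants.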

Detection of non-linearities
  Position effects (Ch. 4)
    Equilibrium populations: terminal variance by migration, selection, mutation; permutations
    Evolving populations: rates
  Frequency analysis (Ch. 5)
    Multi-dimensional contingency tables
    Pair analysis

Figure 2.1: Research Schema.

CHAPTER THREE

THE EXPERIMENTAL BASIS

The Environment

The problem of deciding which environments to use in proving our point is both difficult and simple. We wish to test the reproductive plan on environments which have both linear and non-linear components, trying to separate the two effects. The difficulty is that "linear" is well-defined while "non-linear" is defined by the statement "everything else". In all of computability theory there is little attempt to formalize or classify types of non-linearity. In functional optimization work, again, linearity is considered trivial and non-linearity is so complicated that only a few examples of non-linear environments are mentioned, mainly as benchmarks for the testing of new optimization techniques. These environments are either complicated analytic functions of a few arguments, or simple quadratic functions of many arguments. Bagley's (1) work on the "Meta-Environment" is one of the few developments directed to this area, and it was done for much the same reason we need it. Thus, the simple part of choosing environments to search is that we have almost a free hand to select the definition of non-linearity we like, a freedom we shall definitely exercise.

Non-linearity

Although it is certainly not our intention to create a theory of

non-linearity, a few prefatory remarks are in order. (Much of the following development is similar to Bagley's.) A linear function of several arguments is one in which the value of the function is obtained by summing functions each of which is of, at most, one argument; for example: f(x,y,z) = g(x) + h(y) + e(z). Other ways of stating linearity are to say that the arguments are orthogonal or independent (relative to the function f). One interesting property of linear functions is that they are unimodal if the component functions are unimodal; that is, they have only one maximum and no false maxima. This suggests the following procedure for maximizing linear functions, as described in Bagley (p. 43):
1) Choose an argument and select arbitrary values for all the other arguments.
2) Vary the values of the chosen argument to obtain the relative maximum.
3) Hold the chosen argument at the value which achieved that maximum and choose another argument which has not been previously chosen.
4) Go to Step 2 and repeat the process until all arguments have been assigned values.
The resulting argument values will then maximize the function. Even if component functions are multi-modal, this procedure will still find a (non-unique) optimum. Non-linear functions are all those functions which are not linear, i.e., not able to be expressed as sums of functions of only one argument. Such functions may be

unimodal, as are some linear functions; for example, f(x,y) = (x+y)^2, 0 ≤ x ≤ 10, 0 ≤ y ≤ 10, has its only maximum at (x = 10, y = 10). On the other hand, g(x,y) = (x-y)^2, 0 ≤ x ≤ 10, 0 ≤ y ≤ 10, has maxima at (x = 10, y = 0) and (x = 0, y = 10). In summary, unimodal functions may be either linear or non-linear; but a multi-modal function must be non-linear or linear with multi-modal components. In this research arguments may take on only two values (see below). Consequently, when the total function is multi-modal it must then be non-linear. In the case of unimodal, non-linear functions, it is possible to use the same search algorithm as for linear functions; unimodality is all that is required for its success. These kinds of functions are very similar to linear functions because of unimodality. Consider the two functions:

f(x,y) = (x+y)^2   x = 1,2; y = 1,2
g(x,y) = x+y       x = 1,8; y = 1,8

x y  f(x,y)     x y  g(x,y)
1 1     4       1 1     2
1 2     9       1 8     9
2 1     9       8 1     9
2 2    16       8 8    16

Both functions have only one maximum, are very close in their range of values, and yet one is linear and the other non-linear. Although we are certainly not claiming that a linear function can be found to match any unimodal, non-linear function as closely as in this case, unimodal, non-linear functions may be difficult in general to separate in their effects from linear functions. For this reason, most of the non-linear environments we will study

will be multimodal; if we cannot detect unimodal non-linearity we will not consider it too great a loss, since such environments do have simple optimization procedures (e.g., gradient methods or the procedure above). We consider it sufficient to specify environments of interest merely as non-linear and multimodal for the purpose of this research.

Discreteness

The environments we shall use are discrete; that is, arguments to the function will have a finite number of substitution instances (alleles); in our case this finite number is two, and the values taken by the arguments may be 0 or 1. This is done strictly for convenience' sake in this research (which is designed to prove a point, not show all possible implementations of reproductive plans). Reporting gene frequencies is much easier with just two alleles (only one number specifies both frequencies), table look-up and compilation of tables is easier, and solution times are shorter than for cases in which more alleles are used. This is no restriction on the kinds of environments which can be used in real problems. Hollstien (12) implemented the concept of the polygene to use binary alleles in the representation of many-valued arguments. Eight genes, each having two alleles, were used to specify a Real (as compared to Integer) argument in the range 0-100, with an accuracy of ±0.5. These genes took part in the genetics of his system independently of each other, but at payoff calculation time they were interpreted jointly as a single number to plug into the function evaluation. The alternative to this coding is to use a single gene with 200 alleles,

all of which are adaptively significant. With this large number of alleles, adaptation is driven largely by mutation rather than by recombination, so that the advantages of search by recombination are lost. Since we are using discrete functions we must define what is meant by "local maximum", since we intend it slightly differently than in the continuous case. If f is a function of n (discrete) arguments, then m = f(a1,a2,...,an) is a local maximum if

m > f(a1',a2,...,an) and
m > f(a1,a2',...,an) and
...
m > f(a1,a2,...,an')

where a1' ≠ a1, ..., an' ≠ an, with each (ai'-ai) sufficiently small. In coding theory this might be stated as: if m is calculated from f by the argument list A, then m is a local maximum if m > f(A') for every A' which is separated from A by a Hamming distance of one. It is a reasonable definition since it states that m is a local maximum if it is greater than every value that can be reached by making the smallest possible change in the argument list. A global maximum (or just plain maximum) is a local maximum which is at least as great as every other local maximum. Local maxima are also called peaks, and a false peak is a local maximum which is not a global maximum. Thus far we have specified that the functions we shall search will

have binary arguments. We shall study functions with up to 25 such arguments; therefore the size of the space we must search is about 33,000,000 points. While not overly large, the space is large enough to provide a variety of possible functions and to show the power of the plan.

Functional Form

In specifying the payoff function, a table look-up procedure will be used. This has one very strong advantage: any function of the arguments may be specified without trying to find an expression for it. An analytic form is not particularly useful since all we are interested in is non-linearity. In addition, table look-up is fast. Obviously, we do not plan to use a thirty-three million place table. We define two types of functional forms. In the first, we will divide the arguments into four groups of six, six, six, and seven arguments apiece, providing table look-ups for each group; the result of the function will be the sum of the four subfunctions. That is, for the argument list a1,...,a25,

f(a1,...,a25) = g1(a1,a2,a3,a4,a5,a6) + g2(a7,a8,a9,a10,a11,a12) + g3(a13,a14,a15,a16,a17,a18) + g4(a19,a20,a21,a22,a23,a24,a25).

Obviously, this restricts the non-linearity of the system to at most a function of seven arguments, a (sub-)space of 128 points. Below, we describe functional forms of eight and nine arguments, spaces of 512 points. Once again, this should be sufficient for our needs; it seems to be a reasonable compromise between experimental and design contingencies.
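The additive table look-up structure described above can be sketched as follows. The two small tables here are hypothetical stand-ins for the thesis's 6- and 7-gene group tables; all names are ours.

```python
# Hypothetical 2-gene group tables standing in for the 6- and
# 7-gene tables used in the thesis.
g1 = {(0, 0): 3, (0, 1): 2, (1, 0): 2, (1, 1): 3}
g2 = {(0, 0): 1, (0, 1): 5, (1, 0): 0, (1, 1): 2}

def payoff(args):
    """Total payoff: one table look-up per argument group, summed."""
    groups = [(g1, args[0:2]), (g2, args[2:4])]
    return sum(table[tuple(a)] for table, a in groups)

print(payoff([0, 0, 0, 1]))  # 3 + 5 = 8
```

Because the total is a sum over disjoint groups, any non-linearity is confined within a group, exactly as the text observes.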

Note that the table for a group need not be non-linear in its effects; it may tabulate a linear function, a function which is non-linear in four genes and linear in two, etc. In addition, there is the element of linearity built into the final composition of f, the sum of four (possibly) non-linear functions. In the experimental work later on, we will wish to describe the space being searched. Since all environments will be sums of groups, we need only specify the groups. A group which is strictly linear in its effects is specified, for example, by

g2:L(v0,v1;v0,v1;v0,v1;v0,v1;v0,v1;v0,v1)

with one (v0,v1) pair for each of genes 7 through 12, where v0 is the contribution due to a zero value for an argument and v1 is that due to a one value. For example, g2:L(3,4;3,4;5,6;3,4;5,6;3,4) specifies that group two is linear and that zero-valued arguments for genes 7, 8, 10, and 12 contribute three and one-valued arguments contribute four; for genes 9 and 11 the values are 5 and 6. Group four, of course, requires seven argument-value pairs. Non-linear groups will be exactly specified by a table as in Figure 3.2; in shorthand we will specify g1:NL(A), where A is a table of argument combinations and associated function values. We will summarize the table for convenience by A:(comb:value,comb:value,...), where the "comb:value" pairs describe the argument combinations which form local maxima for the function and the values of the function at those points. For example, B:(110110:24,001001:13) summarizes a table with two local maxima, one at 110110 with a function

value of 24 and the other at 001001 with a function value of 13. Since these two points are local maxima, there is a trough between them. The depth and slope of the trough are of lesser interest than the number of peaks, although some experimentation may be done on those variables also. Thus a complete specification of the environment may appear as

ENV 3: g1:NL(A) g2:L(2,4;6,6;0,0;2,7;7,1;9,0) g3:L(0,1;1,0;1,0;0,0;1,0;0,1) g4:NL(B).

Group one is specified by table A and group four by table B. Groups two and three are linear with the contribution due to argument values as given. Note that arguments 8, 9, and 16 (the second and third arguments in group two and the fourth in group three) make no distinction between their possible values. In genetic terms, these arguments are selectively neutral; it makes no difference which value they take on. The second functional form for specifying payoff is similar to the first with the exception that there are only three additive groups, containing 8, 8, and 9 genes each. This form allows for more complex interactions between genes and makes it more unlikely that the optimum for a group will appear in an initial population by chance. We shall call these three groups g5, g6, and g7. Figure 3.9a is an example of this functional form.
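Evaluating a linear group from its L(...) specification amounts to summing the (v0,v1) entries selected by the argument values. A minimal sketch, using the g2 example from the text (function names are ours):

```python
def linear_group(spec, args):
    """Contribution of a linear group given as a list of (v0, v1)
    pairs, one per gene: v0 if the gene's value is 0, v1 if 1."""
    return sum(pair[a] for pair, a in zip(spec, args))

# g2:L(3,4;3,4;5,6;3,4;5,6;3,4) from the text:
g2 = [(3, 4), (3, 4), (5, 6), (3, 4), (5, 6), (3, 4)]
print(linear_group(g2, [0, 0, 0, 0, 0, 0]))  # 3+3+5+3+5+3 = 22
print(linear_group(g2, [1, 1, 1, 1, 1, 1]))  # 4+4+6+4+6+4 = 28
```

A pair such as (6,6) contributes the same value for either allele, which is exactly the selective neutrality noted above.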

The Payoff Function

Given the specification of the environment, then, it is easy to state the goal of a reproductive plan facing it: members of the population will attempt to specify argument values corresponding to the peaks of the environment. The payoff will be the actual environmental function value resulting from the arguments contained in an individual chromosome.

Environments Used

Following is a list of the environments used in the rest of this research. As stated above, we are limited by not having a theory of non-linearity for reference, so that we cannot claim universality for the environments chosen. Those used range from completely linear, to one with a simple five-gene non-linear group, to one with a very complex nine-gene group. The differences among the environments are discussed in the chapters on the experimental work.

ENV 1: g1:L(0,1;0,1;0,1;0,1;0,1;0,0)
       g2:L(0,0;0,0;0,0;0,0;0,0;0,0)
       g3:L(0,0;0,0;0,0;0,0;0,0;0,0)
       g4:L(0,0;0,0;0,0;0,0;0,0;0,0;0,0)

Figure 3.1: Environment 1.

Payoff group A:

000000 3  010000 2  100000 2  110000 1
000001 3  010001 2  100001 2  110001 1
000010 2  010010 1  100010 1  110010 1
000011 2  010011 1  100011 1  110011 1
000100 2  010100 1  100100 1  110100 1
000101 2  010101 1  100101 1  110101 1
000110 1  010110 1  100110 1  110110 2
000111 1  010111 1  100111 1  110111 2
001000 2  011000 1  101000 1  111000 1
001001 2  011001 1  101001 1  111001 1
001010 1  011010 1  101010 1  111010 2
001011 1  011011 1  101011 1  111011 2
001100 1  011100 1  101100 1  111100 2
001101 1  011101 1  101101 1  111101 2
001110 1  011110 2  101110 2  111110 3
001111 1  011111 2  101111 2  111111 3

Summary: A:(000000:3, 111111:3) (Note that the sixth gene is neutral.)

Figure 3.2(a): Payoff Group A.

ENV 2: g1:NL(A)
       g2:L(0,0;0,0;0,0;0,0;0,0;0,0)
       g3:L(0,0;0,0;0,0;0,0;0,0;0,0)
       g4:L(0,0;0,0;0,0;0,0;0,0;0,0;0,0)

Figure 3.2(b): Environment 2.
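The Hamming-distance-one definition of a local maximum given earlier can be checked mechanically against any payoff table. A sketch with a hypothetical three-gene table (the table and names are ours, chosen only to show a global peak, a false peak, and a non-peak):

```python
def is_local_maximum(table, point):
    """True if table[point] > table[p'] for every point p' at
    Hamming distance one from point (binary alleles, as in the text)."""
    m = table[point]
    for i in range(len(point)):
        flipped = point[:i] + ("1" if point[i] == "0" else "0") + point[i+1:]
        if m <= table[flipped]:       # a one-bit change does at least as well
            return False
    return True

# Hypothetical 3-gene table with a global peak at 000 and a false peak at 111.
T = {"000": 6, "001": 4, "010": 4, "100": 4,
     "011": 4, "101": 4, "110": 4, "111": 5}
print(is_local_maximum(T, "000"))  # True
print(is_local_maximum(T, "111"))  # True  (a false peak: 5 < 6)
print(is_local_maximum(T, "001"))  # False
```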

Payoff group B:

000000 3  010000 5  100000 5  110000 7
000001 6  010001 3  100001 3  110001 5
000010 6  010010 3  100010 3  110010 5
000011 8  010011 6  100011 6  110011 3
000100 5  010100 7  100100 7  110100 9
000101 3  010101 5  100101 5  110101 7
000110 3  010110 5  100110 5  110110 7
000111 6  010111 3  100111 3  110111 5
001000 6  011000 3  101000 3  111000 5
001001 8  011001 6  101001 6  111001 3
001010 8  011010 6  101010 6  111010 3
001011 10 011011 8  101011 8  111011 6
001100 3  011100 5  101100 5  111100 7
001101 6  011101 3  101101 3  111101 5
001110 6  011110 3  101110 3  111110 5
001111 8  011111 6  101111 6  111111 3

Summary: B:(001011:10, 110100:9)

Figure 3.3(a): Payoff Group B.

ENV 3: g1:NL(B)
       g2:L(0,1;0,1;0,1;0,1;0,1;0,1)
       g3:L(0,1;0,1;0,1;0,1;0,1;0,1)
       g4:L(0,1;0,1;0,1;0,1;0,1;0,1;0,1)

Figure 3.3(b): Environment 3.

ENV 4: g1:L(9,10;9,10;9,10;9,10;9,10;0,0)
       g2:L(0,0;0,0;0,0;0,0;0,0;0,0)
       g3:L(0,0;0,0;0,0;0,0;0,0;0,0)
       g4:L(0,0;0,0;0,0;0,0;0,0;0,0;0,0)

Figure 3.4: Environment 4.

Payoff group C:

000000 8  010000 7  100000 7  110000 6
000001 8  010001 7  100001 7  110001 6
000010 7  010010 6  100010 6  110010 6
000011 7  010011 6  100011 6  110011 6
000100 7  010100 6  100100 6  110100 6
000101 7  010101 6  100101 6  110101 6
000110 6  010110 6  100110 6  110110 7
000111 6  010111 6  100111 6  110111 7
001000 7  011000 6  101000 6  111000 6
001001 7  011001 6  101001 6  111001 6
001010 6  011010 6  101010 6  111010 7
001011 6  011011 6  101011 6  111011 7
001100 6  011100 6  101100 6  111100 7
001101 6  011101 6  101101 6  111101 7
001110 6  011110 7  101110 7  111110 8
001111 6  011111 7  101111 7  111111 8

Summary: C:(000000:8, 111111:8)

Figure 3.5(a): Payoff Group C.

ENV 5: g1:NL(C)
       g2:L(0,0;0,0;0,0;0,0;0,0;0,0)
       g3:L(0,0;0,0;0,0;0,0;0,0;0,0)
       g4:L(0,0;0,0;0,0;0,0;0,0;0,0;0,0)

Figure 3.5(b): Environment 5.

Payoff group D:

000000 13  010000 15  100000 15  110000 17
000001 16  010001 13  100001 13  110001 15
000010 16  010010 13  100010 13  110010 15
000011 18  010011 16  100011 16  110011 13
000100 15  010100 17  100100 17  110100 19
000101 13  010101 15  100101 15  110101 17
000110 13  010110 15  100110 15  110110 17
000111 16  010111 13  100111 13  110111 15
001000 16  011000 13  101000 13  111000 15
001001 18  011001 16  101001 16  111001 13
001010 18  011010 16  101010 16  111010 13
001011 20  011011 18  101011 18  111011 16
001100 13  011100 15  101100 15  111100 17
001101 16  011101 13  101101 13  111101 15
001110 16  011110 13  101110 13  111110 15
001111 18  011111 16  101111 16  111111 13

Summary: D:(001011:20, 110100:19)

Figure 3.6(a): Payoff Group D.

ENV 6: g1:NL(D)
       g2:L(0,1;0,1;0,1;0,1;0,1;0,1)
       g3:L(0,1;0,1;0,1;0,1;0,1;0,1)
       g4:L(0,1;0,1;0,1;0,1;0,1;0,1;0,1)

Figure 3.6(b): Environment 6.

Payoff group E:

000000 9   010000 10  100000 11  110000 10
000001 12  010001 7   100001 8   110001 7
000010 7   010010 8   100010 9   110010 8
000011 6   010011 7   100011 8   110011 7
000100 7   010100 8   100100 9   110100 8
000101 6   010101 7   100101 8   110101 7
000110 6   010110 7   100110 8   110110 7
000111 8   010111 9   100111 10  110111 9
001000 10  011000 11  101000 10  111000 9
001001 7   011001 8   101001 7   111001 6
001010 8   011010 9   101010 8   111010 7
001011 7   011011 8   101011 7   111011 13
001100 8   011100 9   101100 8   111100 7
001101 7   011101 8   101101 7   111101 6
001110 7   011110 13  101110 7   111110 6
001111 9   011111 10  101111 9   111111 8

Summary: E:(000001:12, 011110:13, 111011:13, 011000:11, 100000:11)

Group E is made up of the two non-linear subgroups E1 and E2:

E1:        E2:
000 3      000 6
001 4      001 3
010 4      010 4
011 5      011 3
100 5      100 4
101 4      101 3
110 4      110 3
111 3      111 5

E1:(011:5, 100:5)   E2:(000:6, 111:5)

which add linearly except at the points

000001 12
011110 13
111011 13

Figure 3.7(a): Payoff Group E.
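The composition of group E can be generated rather than tabulated by hand: sum the two subgroup tables and override the three listed exception points. A sketch using the E1, E2, and exception values from the figure (function name is ours):

```python
# Subgroup tables E1 and E2 and the exception points, as listed in
# Figure 3.7(a); the group payoff is E1 + E2 except at those points.
E1 = {"000": 3, "001": 4, "010": 4, "011": 5,
      "100": 5, "101": 4, "110": 4, "111": 3}
E2 = {"000": 6, "001": 3, "010": 4, "011": 3,
      "100": 4, "101": 3, "110": 3, "111": 5}
special = {"000001": 12, "011110": 13, "111011": 13}

def E(args):                       # args: string of six 0/1 characters
    if args in special:
        return special[args]
    return E1[args[:3]] + E2[args[3:]]

print(E("000000"))   # 3 + 6 = 9
print(E("011110"))   # exception point: 13
```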

Payoff Group F: Made up of the two non-linear subgroups F1 and F2, which add linearly.

F1:        F2:
000 6      0000 5   1000 4
001 5      0001 4   1001 5
010 5      0010 4   1010 5
011 4      0011 5   1011 6
100 5      0100 6   1100 5
101 4      0101 5   1101 4
110 4      0110 5   1110 4
111 5      0111 4   1111 5

F1:(000:6, 111:5)   F2:(0100:6, 1011:6)

Figure 3.7(b): Payoff Group F.

ENV 7: g1:NL(B) g2:NL(A) g3:NL(E) g4:NL(F)

Figure 3.7(c): Environment 7.

ENV 8: g1:NL(B) g2:NL(B) g3:NL(B) g4:L(0,1;0,1;0,1;0,1;0,1;0,1;0,1)

Figure 3.8: Environment 8.

36 Payoff group G. 00000000 4 01000000 4 10000000 6 11000000 4 00000001 4 01000001 5 1000001 5 11000001 5 00000010 4 01000010 4 1000010 5 11000010 5 00000011 4 01000011 4 10000011 4 1100011 4 00000100 5 01000100 6 10000100 5 11000100 4 00000101 4 01000101 4 10000101 5 11000101 4 00000110 5 01000110 4 10000110 4 11000110 5 00000111 4 01000111 4 10000111 4 11000111 5 00001000 4 01001000 6 10001000 4 11001000 5 00001001 4 01001001 4 10001001 4 11001001 5 00001010 5 01001010 5 10001010 5 11001010 5 00001011 5 01001011 4 10001011 5 11001011 5 00001100 6 01001100 8 10001100 4 11001100 6 00001101 5 01001101 6 10001101 4 11001101 4 00001110 5 01001110 6 10001110 4 11001110 4 00001111 5 01001111 4 10001111 4 1100111 5 00010000 6 01010000 6 ICO10000 7 11010000 6 00010001 4 01010001 5 10010001 6 11010001 5 00010010 5 01010010 4 i0010010 6 11010010 5 00010011 5 01010011 4 10010011 4 11010011 4 00010100 5 01010100 4 10010100 6 11010100 5 010101 4 010101010 4 10010101 4 11 5 11010101 5 00010110 4 01010110 4 10010110 5 11010110 4 00010111 4' 01010111 5 10010111 4 11010111 5 00011000 4 01011000 4 10011000 6 11011000 5 00011001 4 01011001 4 10011001 5 11011001 5 00011010 4 01011010 4 10011010 5 11011010 5 00011011 4 01011011 5 10011011 5 11011011 5 00011100 4 01011100 6 10011100 4 11011100 4 00011101 5 01011101 5 10011101 5 11011101 5 00011110 4 01011110 4 10011110 4 11011110 4 00011111 4 01011111 5 10011111 5 11011111 8 00100000 6 01100000 6 10100000 7 11100000 6 00100001 5 01100001 5 10100001 6 11100001 5 00100010 4 01100010 5 10100010 6 11100010 4 00100011 5 01100011 4 10100011 5 11100011 4 00100100 5 01100100 5 10100100 6 11100100 4 00100101 4 01100101 4 10100101 5 11100101 5 00100110 5 011001510 4 10100110 5 11100110 5 00100111 5 01100111 5 10100111 4 11100111 4 00101000 5 01101000 4 10101000 6 11101000 4 00101001 5 01101001 5 10101001 5 11101001 5 00101010 4 01101010 5 10101010 4 1101010 4 00101011 4 01101011 5 10101011 5 1110101 5 00101100 4 01101100 6 10101100 4 11101100 4 
00101101 5 01101101 5 10101101 5 11101101 4
Figure 3.9(a): Payoff Group G. (Cont'd)

00101110 5  01101110 5  10101110 4  11101110 4
00101111 4  01101111 4  10101111 5  11101111 8
00110000 7  01110000 6  10110000 8  11110000 7
00110001 6  01110001 6  10110001 7  11110001 6
00110010 6  01110010 6  10110010 7  11110010 6
00110011 5  01110011 4  10110011 6  11110011 4
00110100 6  01110100 6  10110100 7  11110100 6
00110101 4  01110101 4  10110101 6  11110101 4
00110110 4  01110110 5  10110110 6  11110110 4
00110111 4  01110111 5  10110111 5  11110111 8
00111000 6  01111000 6  10111000 7  11111000 6
00111001 5  01111001 5  10111001 6  11111001 4
00111010 5  01111010 5  10111010 6  11111010 5
00111011 5  01111011 4  10111011 5  11111011 8
00111100 5  01111100 5  10111100 6  11111100 4
00111101 5  01111101 5  10111101 4  11111101 8
00111110 5  01111110 4  10111110 5  11111110 8
00111111 4  01111111 8  10111111 8  11111111 9

Summary: G:(11111111:9, 10110000:8, 01001100:8) Neighbors of the first peak pay 8. Neighbors of the second peak pay 7 and their neighbors pay 6. Neighbors of the third peak pay 6. All other points pay 4 or 5 randomly.

Figure 3.9(a): Payoff Group G.

ENV 9: g5:NL(G) g6:L(0,1;0,1;0,1;0,1;0,1;0,1;0,1;0,1) g7:L(0,1;0,1;0,1;0,1;0,1;0,1;0,1;0,1;0,1)

Figure 3.9(b): Environment 9.

38 Payoff group H: 000000000 a 010000000 6 100000000 6 110000000 3 000000001 6 010000001 3 100000001 5 11000001 5 000000010 6 010000010 5 100000010 4 110000010 3 000000011 4 010000011 5 100000011 3 110000011 3 000000100 6 010000100 4 100000100 5 110000100 3 000000101 4 010000101 4 100000101 4 110000101 3 000000110 5 010000110 3 100000110 4 110000110 3 000000111 4 010000111 4 100000111 5 110000111 4 000001000 6 010001000 4 100001000 5 110001000 3 000001001 4 010001001 5 100001001 4 110001001 3 000001010 5 010001010 4 100001010 5 110001010 4 000001011 4 010001011 3 100001011 5 110001011 4 000001100 3 010001100 5 100001100 4 110001100 4 000001101 5 010001101 3 100001101 5 110001101 5 000001110 4010001110 5 100001110 5 110001110 4 000001111 5 010001111 4 100001111 3 110001111 4 000010000 6 010010000 3 100010000 4 110010000 4 000010001 3 010010001 5 100010001 5 110010001 3 000010010 5 010010010 3 100010010 5 110010010 5 000010011 4 010010011 5 100010011 3 110010011 5 000010100 4 010010100 5 100010100 5 110010100 4 000010101 3 010010101 4 100010101 5 110010101 5 000010110 4 010010110 5 100010110 3 110010110 4 000010111 3 010010111 3 100010111 3 110010111 5 000011000 4 010011000 4 100011000 5 110011000 5 000011001 3 010011001 5 100011001 5 110011001 5 000011010 4 010011010 3 100011010 4 110011010 5 000011011 5 010011011 5 100011011 5 110011011 4 000011100 5 010011100 3 100011100 4 110011100 4 000011101 3 010011101 5 100011101 4 110011101 5 000011110 5 010011110. 3 100011110 3 110011110 4 000011111 5 010011111 4 100011111 3 110011111 4 Figure 3.10(a): Payoff Group H. (Cont'd)

39 000100000 6 010100000 3 100100000 5 110100000 3 000100001 5 010100001 4 100100001 5 110100001 4 000100010 3 010100010 5 100100010 5 110100010 3 00000011 5 10100011 3 100100011 5 110100011 3 000100100 5 010.100100 3 100100100 5 110100100 3 000100101 5 010100101 3 100100101 5 110100101 5 000100110 4 010100110 5 100100110 5 110100110 4 000100111 5 010100111 4 100100111 4 110100111 5 000101000 3 010101000 5 100101000 3 110101000 5 000101001 5 010101001 3 100101001 5 110101001 4 000101010 5 1010101010 3 100101010 4 110101010 3 000101011 3 010101011 5 100101011 4 110101011 5 000101100 6 010101100 5 100101100 4 110101100 4 000101101 4 010101101 3 100101101 3 110101101 5 000101110 5 010101110 5 100101110 5 110101110 5 000101111 4 010101111 5 100101111 4 110101111 5 000110000 5 010110000 5 100110000 5 110110000 5 000110001 5 010110001 4 100110001 5 110110001 3 000110010 4 010110010 4 100110010 3 110110010 3 000110011 3 010110011 4 100110011 5 110110011 4 000110100 5 010110100 5 100110100 5 110110100 4 000110101 3 010110101 5 100110101 3 110110101 5 000110110 4 010110110 3 100110110 3 110110110 4 000110111 3 010110111 3 100110111 3 110110111 5 000111000 5 010111000 4 100111000 3 110111000 4 000111001 3 010111001 5 100111001 5 110111001 3 000111010 4 010111010 3 100111010 3 110111010 3 000111011 4 010111011 5 100111011 5 110111011 3 000111100 4 010111100 3 100111100 5 110111100 3 000111101 5 010111101 5 100111101 5 110111101 3 000111110 3 010111110 5 100111110 5 110111110 4 000111111 3 010111111 5 100111111 5 110111111 8 001000000 6 011000000 4 101000000 5 111000000 3 001000001 4 011000001 3 101000001 3 11100001 4 001000010 5 011000010 5 101000010 3 111000010 3 001000011 3 011000011 4 101000011 3 111000011 5 001000100 3 011000100 5 101000100 5 111000100 4 001000101 3 011000101 5 101000101 4 111000101 5 001000110 4 011000110 5 101000110 4 111000110 3 001000111 3 011000111 4 101000111 5 111000111 4 001001000 3 011001000 3 101001000 5 111001000 3 001001001 4 011001001 4 
101001001 3 111001001 4 001001010 5 011000101 5 101001010 3 111001010 3 001001011 4 011001011 5 101001011 5 111001011 4 001001100 6 011001100 3 101001100 4 111001100 4 001001101 4 011001101 4 101001101 4 111001101 3 001001110 3' 011001110 3 101001110 3 111001110 4 001001111 5 011001111 4 101001111 5 111001111 4 Figure 3.10(a): Payoff Group H. (Cont'd)

40 001010000 5 011010000 4 101010000 3 111010000 5 001010001 3 011010001 4 101010001 5 111010001 4 001010010 5 011010010 4 101010010 3 111010010 4 001010011 4 011010011 3 101010011 5 111010011 5 001010100 4 011010100 4 101010100 3 111010100 3 001010101 4 011010101 3 101010101 4 111010101 4 001010110 3 011010110 3 101010110 4 111010110 5 001010111 3 011010111 3 1101010111 3 111010111 5 001011000 4 011011000 4 101011000 3 111011000 5 001011001 3 011011001 4 101011001 5 111011001 3 001011010 5 0110110 3 101011010 4 111011010 4 001011011 4 1011011011 4 11011011 5 111101 5 001011100 5 011011100 3 101011100 5 111011100 4 001011101 4 011011101 5 101011101 5 111011101 3 001011110 3 011011110 5 10101110 4 11101110 4 001011111 5 011011111 5 101011111 5 -11011111 8 001100000 5 011100000 3 101100000 4 111100000 3 001100001 4 011100001 4 101100001 5 111100001 4 001100010 4 011100010 4 101100010 5 111100010 4 001100011 3 011100011 5 101100011 4 111100011 4 001100100 6 011100100 4 101100100 4 111100100 3 001100101 5 011100101 3 101100101 3 111100101 5 001100110 5. 
011100110 5 101100110 4 111100110 4 001100111 3 011100111 3 101100111 4 11 111 3 001101000 6 011101000 4 101101000 5 111101000 3 001101001 5 011101001 5 101101001 3 111101001 4 001101010 4 011101010 4 101101010 5' 111101010 5 001101011 5 011101011 4 101101011 6 111101011 3 001101100 8 011101100 6 101101100 3 111101100 5 001101101 6 Oll101101 5 101101101 3 111101101 5 001101110 6 011101110 3 101101110 3 111101110 4 001101111 3 011101111 4 101101111 3 111101111 5 001110000 3 01111000 5 101110000 5 111110000 3 001110001 3 011110001 5 101110001 5 111110001 3 001110010 3 011110010 4 101110010 5 111110010 4 001110011 3 011110011 3 101110011 3 111110011 3 001110100 a 011110100 3 101110100 5 111110100 4 001110101 5 011110101 4 101110101 5 111110101 5 001110110 4 011110110 3 101110110 4 111110110 4 001110111 5 011110111 4 101110111 5 111110111 8 001111000 5 011111000 4 101111000 3 111111000 5 001111001 4 011111001 4 101111001 3 111111001 5 001111010 3 011111010 4 101111010 5 111111010 5 001111011 3 011111011 3 101111011 5 111111011 8 001111100 6 011111100 3 101111100 3 111111100 4 001111101 3 011111101 5 101111101 3 111111101 5 001111110 5 011111110 4 101111110 5 111111110 5 001111111 3 011111111 4 101111111 4 111111111 9 Figure 3.10(a): Payoff Group H. (Cont' d)

Summary: H:(111111111:9, 000000000:8, 001101100:8) Neighbors of the second and third peaks pay 6. Only four neighbors of the first peak pay 8: 110111111, 111011111, 111110111, and 111111011. All other points pay 3, 4, or 5 randomly.

Figure 3.10(a): Payoff Group H.

ENV 10: g5:NL(G) g6:NL(G) g7:NL(H)

Figure 3.10(b): Environment 10.

The Adaptive Program

Every researcher who uses the reproductive algorithm has many choices to make, among them:
1) What are the "genes"? How many are there in a chromosome? How many alleles should there be?
2) What operators should be used? With what parameter settings?
3) What size should the population be?
One choice he does not have is that of selection, at least if he is going to use Holland's (11) theory in its exact form to attain the efficiency it guarantees: selection of parents for the next generation must be according to payoff.

Genes, Alleles, and Chromosomes

A researcher's decision on genes and alleles depends on his knowledge of the environment and on the level at which he wishes to model it. In general this may be a very difficult question, as discussed in Chapter Two, but for our purposes it is much easier. We are facing an artificial environment in which there may be as many as 25 parameters which affect the payoff; each parameter may take on one of two values. (We do not wish to minimize the difficulty involved in complex gene encodings such as those used by others (3 and 12). It's just that once an encoding is determined it induces a function, and it is the induced function that we wish to investigate.) Thus, each of 25 genes will correspond to one argument in the environment and will have two possible alleles, 0 and 1. Our goal is

then to match every gene to the argument value in the environment which maximizes the environmental function value. Because a gene may occupy any position on a chromosome, it is "tagged"; every position on the chromosome has two numbers associated with it: the number of the gene occupying that position and the allele of that gene represented. A chromosome might be represented as the following: (13,1) (2,1) (22,0), ... At the leftmost of 25 positions is gene 13, represented by allele 1. At the second position is gene 2 with allele 1; the fact that gene 2 occupies position 2 is purely accidental. Every gene is represented exactly once in the chromosome. The individuals in our population are haploid chromosomes; that is, an individual is specified by a single string of 25 genes. (We will use the terms individual, member of the population, string, and chromosome interchangeably.) Facing other environments we might have chosen chromosomes to be diploid (composed of two strings of genes), as are the chromosomes of most of the "higher" animals and plants. Diploidy and the associated concept of dominance seem to be optimal for living organisms because of their requirements in adapting to very complex, changing environments. In artificial systems of the sort we are dealing with, the environment is stationary, so that the population need not change after it has reached the stable optimum. Hollstien (12) has achieved some success with diploidy and modifiable dominance facing a cyclically changing environment. (In populations made up of diploid individuals, a single string of genes is called a gamete. In our case, then, we have one-gamete chromosomes.)
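The tagged representation described above can be sketched as a list of (gene number, allele) pairs, with list position playing the role of chromosome position; decoding recovers the argument list in gene-number order regardless of the order of genes on the chromosome. The function name and the three-gene example are ours.

```python
def decode(chromosome, n_genes):
    """Recover the argument list from a tagged chromosome: a list of
    (gene number, allele) pairs in chromosome-position order."""
    args = [None] * n_genes
    for gene, allele in chromosome:
        args[gene - 1] = allele            # genes are numbered from 1
    assert None not in args, "every gene must appear exactly once"
    return args

# Hypothetical 3-gene chromosome with gene 3 in the leftmost position:
chrom = [(3, 1), (1, 0), (2, 1)]
print(decode(chrom, 3))                    # [0, 1, 1]
```

Two chromosomes with different gene orderings but the same alleles decode to the same argument list, which is why position affects only the genetics (crossover), never the payoff.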

Population Size

Cavicchio (3) and Hollstien (12) did not use the strict form of selection required by Holland's theory. Their small populations made it mandatory to use breeding plans to keep a high level of performance, once achieved, and to make special efforts to maintain the variability of the population. The disadvantages of their modifications are large: loss of the efficiency guaranteed by the theory and excessive programming of tricks and special cases, with a consequent loss of the easy attribution of credit for success. The advantages lie in the small number of payoff evaluations per generation, reduced computer running time, and the smaller storage required to run their programs. We are not interested in minimizing payoff evaluations, since we are taking as a premise that the Holland Reproductive Plan is efficient and we do not need to compare it to anything else to justify its use. Cavicchio and Hollstien have done that. Besides, it is not clear that small populations actually reduce the evaluations needed in very complex environments. Small populations are much more likely to be subject to genetic drift and get hung up on a false peak. While the reproductive plan's stochastic operators provide a method for getting off such peaks, it is likely to take quite a while to do so. The inherently larger variability of large populations makes it less likely that this will happen. Finally, in this sort of artificial system, there is much force in the argument that it is not a good population that we want, but a single, good individual from the population; and the fastest way to search the space is to have a large number of individuals with a lot of mixing.
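Selection according to payoff, which Holland's theory requires, can be sketched as a roulette-wheel draw: each individual's chance of being chosen as a parent is proportional to its payoff. This is a minimal sketch under our own assumptions (the thesis does not specify its exact sampling mechanism, and all names here are ours).

```python
import random

def select_parent(population, payoffs):
    """Choose one individual with probability proportional to payoff."""
    total = sum(payoffs)
    r = random.uniform(0, total)
    acc = 0.0
    for individual, p in zip(population, payoffs):
        acc += p
        if r <= acc:
            return individual
    return population[-1]        # guard against floating-point round-off

pop = ["A", "B", "C"]
pay = [1.0, 1.0, 8.0]            # "C" should be chosen about 80% of the time
random.seed(0)
picks = [select_parent(pop, pay) for _ in range(1000)]
print(picks.count("C") / 1000)   # roughly 0.8
```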

One of Cavicchio's biggest problems was maintaining population variance. Because of his small populations, good alleles were often lost before their effects could make themselves known. He ultimately tried three or four kinds of mutation operators and special selection techniques strictly to maintain variance. In population genetics this problem goes by the name of genetic drift: the variation in frequency of occurrence of alleles due to stochastic effects. If the mutation rate is small enough, an allele might be fixed randomly. In a simple one-gene, two-allele model of a haploid population in which the rates of mutation from one allele to the other are identical, and in which there is no advantage in either allele, "if the mutation rates are half the reciprocal of the population size the gene frequency is equally likely to have any value between 0 and 1" (P.A.P. Moran, p. 125). Thus, we can keep the mutation rate down to a reasonable size, say .005 (which means that mutation is doing its proper job of just providing variability, not searching the space), and still get by with a population of 100. Actually we do not require as much variability as Moran suggests we will get with this population size and mutation rate. All that is required is that some alleles of every sort will normally be available in the population, not that we will have a lot. However, since we will be performing selection, and since our model includes more than one gene, we will use this approximation (100 members and a mutation rate of .005) as a starting point in our investigation.

Mating and the Genetic Operators

A new population is generated from the old, one individual at a time, by means of "genetic" operators. First, two parents are selected

from the old population. (Selection is discussed in the next section.) The operators are then applied to these two chromosomes to produce a new individual. During the course of this research several versions of the mating schemes and operators are used; the description of an experiment will indicate which methods were used. Some of the methods were suggested by experience after experimentation, but all are described here for completeness' sake. Many are artificial, not genetic. The exact order of operator application depends on the methods being used to handle the homology/inversion problem. The problem is that when two chromosomes are non-homologous (the order of genes on the chromosome is not identical), crossover between them may produce individuals with a surplus of some genes and a lack of others. Four mating methods may be used to avoid such aberrant offspring: strict-homology, viability, any-pattern, and best-pattern. The strict-homology mating rule requires the two parents to be strictly homologous for them even to be considered for crossover. If they are not, no offspring is produced and two more parents are chosen. The viability mating rule allows crossover to take place between any two parents, but only those offspring containing exactly one of each gene will be allowed to join the new population. To illustrate the difference between these two methods, consider the two eight-gene chromosomes C1 and C2, under consideration as possible parents:

Position  1 2 3 4 5 6 7 8
C1        A B C D E F G H
C2        A E D C B F G H

where the capital letters A-H indicate the gene function, not alleles. These chromosomes are not homologous (the genes in positions 2-5 do not

match exactly) and so would be rejected as parents by the strict-homology mating rule. Under the viability rule, crossover would first take place and the resulting chromosome examined. If crossover takes place between positions 2-3, 3-4, or 4-5, the resultant chromosomes will be inviable; that is, they will be deficient in one or more genes. For example, crossover between positions 2 and 3 results in the chromosomes

A B D C B F G H
A E C D E F G H

the first lacking gene E and the second lacking gene B. If, however, crossover takes place between positions 6 and 7 (among other possibilities), the two resulting chromosomes are viable; that is, they each have one of each of the eight genes. (The term inviable is a reflection of the genetic situation in which a gene usually specifies enzymes needed to carry out a particular cellular function. If the enzymes are missing, the cell does not have the capability to maintain its own existence long enough to reproduce. In our case, a missing gene means that not all arguments of the environment are being specified, a situation in which the payoff function is not well defined.) The best-pattern mating rule (2) handles non-homologies by choosing the pattern of the resultant chromosome to be the pattern of the parent with the higher payoff. (If the payoffs are equal, one is chosen at random.) The rationale for this method is that if the pattern of genes on the chromosome contributes at all to the value of the chromosome, then on the average, better chromosomes are better because of the pattern, and that pattern should be rewarded. In the above example, if the particular alleles present in C2 yielded a higher payoff than the

alleles of C1, then the pattern of the result would be that of C2. (This point will be discussed further in Chapter Four.) This method does not correspond to anything in genetics but is an attempt to reward patterns directly. Every pair of parents can yield offspring under this method. As an example, suppose that we have the two chromosomes

C3  (1,0)(3,0)(2,1)(4,0)(5,1)(6,1)
C4  (1,1)(2,0)(6,0)(5,1)(4,1)(3,0),

that C3 has a higher payoff than C4, and that crossover is to take place between positions 3 and 4. Then the pattern of genes to be used is the pattern of C3 (i.e., 1,3,2,4,5,6). One possible crossover result would then take the alleles for genes 1, 3, and 2 from C3 and the alleles for 4, 5, and 6 from C4, yielding:

(1,0) (3,0) (2,1) (4,1) (5,1) (6,0)

and the other possible result is

(1,1) (3,0) (2,0) (4,0) (5,1) (6,1).

Any-pattern mating is similar to best-pattern with the exception that the parent from which to take the pattern is chosen randomly.

Inversion

We shall use two ways of choosing inversion points: linear and linear+end. In the linear inversion method, two positions are chosen randomly (i.e., each position has an equal probability of being chosen) and all genes between and including these positions are inverted. For example, if the chromosome

A B C D E F G H

is to be inverted between positions 2 and 5, the resulting individual

is:

A E D C B F G H.

Since inversions are attempts to change the adjacencies of genes, we would hope that an inversion rule would produce all adjacencies with equal probability, to facilitate testing. One way to approach this goal (which is desirable, not necessary) is to ask that every position in the string be equally likely to be included in an inversion. However, using the linear algorithm described above, this is not the case. Positions close to the center of the string have a much greater chance to be moved by an inversion (i.e., to be in an inverted segment) than positions near the ends. The exact form is a quadratic in m, the position in the string. If N is the number of positions on the string, then the probability of any position being included in an inversion is

(2/N^2)(m(N+1) - m^2 - 1),

or, for N = 25,

(2/625)(26m - m^2 - 1).

In this case the central position is almost seven times more likely to be included in an inversion than either of the two ends, making it difficult for any gene starting out in an end position to be tested close to a gene on the other end. The linear+end inversion method was designed to alleviate this problem. In choosing positions, it makes the same choice as the linear method three-quarters of the time; with probability one-eighth it does an inversion between position one and a position randomly chosen between two and thirteen; also with probability one-eighth it does an inversion between position twenty-five and a position randomly chosen between thirteen and twenty-four. The probabilities then for a position to be

included in an inverted segment are much more nearly equal, with all but positions 1, 2, 24, and 25 having nearly identical probabilities, and those positions differing by a factor of less than one-half. Since our goal was only to attain reasonable mixing, no further refinement will be attempted. Inversions may be introduced into the population in the following two manners. In the continuous inversion method, as every individual is created it has a probability of undergoing inversion. Two positions are selected (as above) and the genes between the two points are inverted. Every chromosome undergoing inversion has a probability of having different sections inverted. For example, if a population has only one pattern, and if six new individuals undergo inversion, it is highly likely that there are six new, different patterns in the new population. This certainly presents a problem in homology, and we would expect this method to work well only with the viability, any-pattern, or best-pattern mating rules. Combining it with the strict-homology rule yields the same kind of system as Bagley used, and the same kinds of problems: any chromosome which has undergone inversion has a very small probability of finding another chromosome with the same pattern. The mass inversion method was designed to overcome the problems of the strict-homology mating rule, but may easily be used with any of the other three rules. No inversion is performed until an entire new population has been generated. At that time, with a given probability, exactly half of the population undergoes exactly the same inversion. (The half chosen is the odd-numbered individuals. No bias is introduced because all individuals are generated independently.) For example, if an inversion is to be performed on an entirely homologous population of

100 individuals, then there would result two patterns in the population, each represented by 50 individuals. There is then a fairly high probability (1/2) that any two members chosen to be parents will be homologous.

Crossover

In the course of this research two different methods of selecting recombination points are used. The first, one-crossover, chooses exactly one point on the chromosome at which to perform crossover. This was used for most of the experiments, except those in which we decided that slightly more mixing was desired. It operated as follows. One of the 24 points of connection was chosen randomly, each point being equiprobable. All the genes to the left of this point on one of the parent chromosomes became the left-hand part of the new chromosome, and all the genes to the right of this point on the other parent became the right-hand side of the new chromosome. (It does not matter which of the parents contributes to the right or left since they are selected independently of each other.) Problems involving non-homology are handled by the mating rules. One alteration to this scheme is necessary under best-pattern mating, as described above. Multiple-crossover is used where slightly more mixing of the two parents is desired. The leftmost gene of a new individual is supplied by one of the parents, say C1. There is then a probability, Pcross, that the second position on the new chromosome will be filled from the other parent, C2. Correspondingly, with probability 1 - Pcross the next gene will come from the second position of C1. Once a switch has been made to supplying genes from C2, there is then the probability Pcross

that the next position will come from C1, and so on. This corresponds to a random walk down the two parents with probability Pcross of shifting to the other parent. There is a finite probability, (1 - Pcross)^24, that no crossover will take place as a result of this method. If this is the case, the individual is regenerated via the one-crossover method.

Mutation

Mutation is a simple operator. After a new individual is generated, every allele is changed (from 0 to 1 or from 1 to 0) with probability Pmut. Ordinarily Pmut will be very small.

Migration

In genetics, migration is the actual physical movement of organisms from outside an otherwise closed population (a deme) into that deme. In the usual sense of the word it is applied to situations in which the deme in question is adapted to its environment and the migrating individuals, while members of the same species and containing many of the same genes, are adapted to another environment. The migrating individuals thus contain different alleles or, at least, their frequencies of allelic occurrence are different. The effect of migration on a deme (if continuous) is to increase the variability of the population and to keep it from becoming too specialized. The use of migration is further explained and justified in Chapter Four, "Variance by Migration". The migration rate per generation is determined as a variable of the experiment: Nmig individuals enter the population as immigrants.
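The random walk of the multiple-crossover scheme can be sketched as follows (a minimal illustration for homologous parents; the function name is hypothetical, and the handling of non-homologous patterns by the mating rules is omitted):

```python
import random

def multiple_crossover(c1, c2, p_cross, rng=random):
    """Walk down two parent chromosomes, starting on c1 and shifting to
    the other parent at each subsequent position with probability p_cross."""
    parents = (c1, c2)
    source = 0                          # 0 -> take the gene from c1, 1 -> from c2
    child = []
    for pos in range(len(c1)):
        if pos > 0 and rng.random() < p_cross:
            source = 1 - source         # shift to the other parent
        child.append(parents[source][pos])
    return child

random.seed(1)
print("".join(multiple_crossover(list("ABCDEFGH"), list("abcdefgh"), 0.3)))
```

With p_cross = 0 the walk never leaves C1, which is the "no crossover" outcome that the text says is handled by regenerating the individual via one-crossover.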

The genetic constitution of an immigrant must be similar to that of the deme to ensure that it is sufficiently adapted to contribute to the next generation. To this end, an immigrant is created as follows: a chromosome is selected in the same manner as parents are selected (i.e., according to relative payoff). The same gene pattern is used for the immigrant, but one-third of the alleles in the "parent" are mutated. If the inversion time is continuous, the new individual undergoes inversion with the given probability. The immigrant then enters the population, differing from some member of the parent population by an average of eight alleles and possibly an inversion.

Selection

Selection of parents randomly, with sampling probability in proportion to relative payoff, is the method by which Holland's theory guarantees efficiency. This research will follow the theory exactly. Since the previous work in this area has not done so, it is worthwhile considering the Monte Carlo algorithm actually used. The members of the parent population are denoted by Ci, i = 1,...,100. PAYi is the payoff for individual i; all payoffs are integer (i.e., whole valued) numbers. We form the array CUMi (i = 1,...,100) as follows:

CUMi = PAY1 + PAY2 + ... + PAYi.

CUM is similar to the concept of a cumulative probability distribution, except that it is not normalized to 1.0. To select a parent we first generate a uniformly distributed random integer in the range (0, CUM100); call it RN. The index of the parent

selected is then the least i such that CUMi > RN. In ALGOL terms this might appear as:

for i := 1 step 1 until 100 do
    if CUM[i] > RN then goto done;

The probability that individual i is selected is then

PAYi / (PAY1 + PAY2 + ... + PAY100),

as required by the theory.

Monte Carlo Methods

The random numbers mentioned here are, of course, members of a pseudo-random number sequence, generated algorithmically. The particular generator used in this work is

P := P(2^14 - 3) mod 2^28,

a multiplicative congruential method as described in CACM, January 1967, p. 40, Algorithm #294. The subroutine RAND, called with an integer parameter N, uses this method to return a number in the range (0, N-1), such that each number in the range has probability 1/N of being selected. Without discussing the well-known problems introduced by subtle biases in such pseudo-random sequences, we shall accept this generator as sufficiently random for our purposes. There is a definite advantage in the technique of being able to start the random number sequence at exactly the same place at the beginning of an experiment, for debugging purposes. Suitable randomization of initial conditions must be used to avoid biases.
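A sketch of this cumulative-sum selection in Python (the function name is illustrative; the original was programmed in CESSL):

```python
import itertools
import random

def select_parent(payoffs, rng=random):
    """Return an index with probability proportional to payoff, using the
    cumulative array CUM described above.  Payoffs are positive integers."""
    cum = list(itertools.accumulate(payoffs))   # CUM_i = PAY_1 + ... + PAY_i
    rn = rng.randrange(cum[-1])                 # uniform integer in [0, CUM_N - 1]
    for i, c in enumerate(cum):                 # least i such that CUM_i > RN
        if c > rn:
            return i

# Individual 2 (payoff 6 out of a total of 10) should be picked about 60% of the time:
random.seed(0)
counts = [0, 0, 0]
for _ in range(10_000):
    counts[select_parent([1, 3, 6])] += 1
print(counts)
```

The linear scan mirrors the ALGOL loop above; since RN is drawn from the 0 to CUM_N - 1 integers, exactly PAY_i of those values select individual i.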

Many of the experimental variables used are probabilities. These are usually expressed as positive integers in the program, the interpretation being that the fraction represented is the integer divided by 1000. Testing whether an event occurs (for example, mutation) is merely a matter of asking whether RAND(1000) is less than Pmut. The reason for using integer representations for probabilities is that REAL (i.e., fractional) arithmetic on the computer employed in these experiments takes significantly longer than integer arithmetic. No loss in "randomness" is experienced; the only loss is a restriction of resolving power to one part in a thousand. Each complete specification of the adaptive program parameters is dignified with a separate number (called the case number) for identification purposes. Each case runs for a given number of generations. To try to separate random fluctuations from the effects of the program parameters, we can reinitialize the adaptive system with the same population but with the random number generator starting at a different point. Each start with the same population and parameters is called a run of the case. Typically we will use from 5 to 10 runs for every experiment and use some statistical measure based on these runs. A series of cases will sometimes be identified by a letter and number (such as A1) for convenience.

Program Description

Now that all the pieces have been presented, we can give a general description of the whole process. The language used in the program is CESSL (7), a procedure-oriented language similar to ALGOL and FORTRAN; it was developed locally for other purposes but lends itself admirably

to our needs. Since it is available only at this one research installation, we shall not include any of the actual programs used, but only algorithms, given in flow charts or in a sort of ALGOL notation. At the start of an experiment the entire population receives the same pattern of genes on the chromosome; each allele is set randomly to 0 or 1. The payoff PAYi is then calculated for each member and the CUM array is formed from PAY. Creation of a new population proceeds as follows. Two parents are selected from the old population, as described above. A new individual is formed by crossover using one of the mating methods. (If there is a homology or viability problem, two new parents are first selected.) If the inversion time is "continuous", the new individual undergoes inversion with a certain probability. Every gene is then mutated according to the given probability. The individual thus formed enters the new population, and the next individual is started. If there is migration, this continues until 100 less the number of migrating individuals have been formed; the remaining slots in the population are then filled by migration. If the inversion time is "mass", then, with a given probability, exactly half the population undergoes an inversion. The calculation for one generation is given in somewhat more detail in Figure 3.11.

[Flow chart: for each new individual Ci, two parents are selected; the mating rule (strict-homology, viability, best-pattern, or any-pattern) governs crossover; if the inversion time is continuous, inversion points are chosen and Ci is inverted with probability Pinv; every gene in Ci is then mutated with probability Pmut.]

[Flow chart, continued: if Nmig > 0, an old member is selected and stored in Ci, every gene of Ci is mutated with probability 1/3, and, with continuous inversion, Ci is inverted with probability Pinv; this repeats until i > 100. If the inversion time is mass, inversion points are chosen and half the population is inverted.]

Figure 3.11: Generation of a New Population.
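The per-generation process of Figure 3.11 can be sketched as follows, a simplified version with no inversion and no migration (function names and the chromosome representation are illustrative; parents here share one gene pattern, so viability is automatic):

```python
import random

P_MUT = 0.005        # per-allele mutation probability (the starting value in the text)

def one_crossover(c1, c2, rng=random):
    """One-crossover on homologous parents: genes to the left of a random
    connection point from one parent, the rest from the other."""
    point = rng.randrange(1, len(c1))      # one of the 24 connection points for 25 genes
    return c1[:point] + c2[point:]

def mutate(chrom, rng=random):
    """Flip each allele with probability P_MUT."""
    return [(g, 1 - a) if rng.random() < P_MUT else (g, a) for g, a in chrom]

def next_generation(pop, payoffs, rng=random):
    """One generation: repeated payoff-proportional selection,
    one-crossover, and mutation."""
    cum, total = [], 0
    for p in payoffs:                      # CUM array as in the Selection section
        total += p
        cum.append(total)

    def select():
        rn = rng.randrange(total)          # uniform integer in [0, CUM_N - 1]
        return next(i for i, c in enumerate(cum) if c > rn)

    new_pop = []
    for _ in range(len(pop)):
        p1, p2 = pop[select()], pop[select()]
        new_pop.append(mutate(one_crossover(p1, p2, rng), rng))
    return new_pop
```

The mating rules of the previous section matter only once inversion introduces non-homologous patterns, which is why this simplified loop can omit them.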

CHAPTER FOUR

POSITION EFFECTS

This chapter discusses a series of experiments which investigate the effect of position on populations evolving under the reproductive plan. Recalling the argument in Chapter Two concerning the nearness hypothesis, we note that clumping does not contribute to the payoff of an individual chromosome; the payoff depends strictly on the alleles present. We expect to find, however, that clumping does contribute to the ability of a chromosome to pass good combinations of alleles to its descendants. There are two ways in which we (and the reproductive plan, through use of the inversion operator) may reasonably gain information based on this expectation: 1) from populations which are already adapted to their environment (i.e., there is no significant evolution or improvement in performance going on); 2) from populations which are still evolving. Experiments based on these possibilities require quite different designs and yield quite different classes of results. Working with populations in equilibrium has the advantage that the time or speed of evolution does not need to be considered; any effects are presumed to be a property of the equilibrium state no matter how rapidly or slowly it was reached. On the other hand, it may well be argued that the importance of position is in aiding the speed of evolution, so that studies of evolving populations may be of more interest, albeit possibly more difficult. The remainder of this chapter is divided into three parts, corresponding to these two sources of information and to experiments using this information.

Equilibrium Populations

In this chapter we shall call a population in equilibrium (or in steady state) when it has reached a level of adaptation such that no significant further evolution (of the population as a whole) will take place. This does not necessarily describe a population in which all the genes are fixed; mutation or other mechanisms may contribute to variability. Equilibrium can only be defined in terms of averages: not only individuals but also populations will fluctuate about the equilibrium point due to stochastic effects. Individuals may be produced which are much better than the average, but the population is subject to pressures (mutation, etc.) which keep the average down. It is in this sense of equilibrium that we hope to detect differences in the "equilibrium population average" due to position effects. Specifically, we expect that populations in which dependent genes are clumped (are close together) will have higher steady state points (higher population averages) than those in which the genes are far apart. If this is indeed the case, then it is possible for the reproductive plan itself to detect these differences and capitalize on them, favoring small distances between the affected genes. But first we must convince ourselves that there really is a position effect.

Experimental Procedure

In order to detect differences in the equilibrium point we will perform experiments along the following general lines. There is a set of genes which we wish to investigate with respect to position. We shall collect data from several cases (specifications of experimental

parameters) differing only in the positions of the distinguished genes on the chromosomes. For each case we run the experiment many times from an initial random population. A run of many generations without inversion allows any position effects to be expressed in the fitness of the descendants. These differences in the equilibrium point will ordinarily be small, and the variance over runs large, so we will be forced to employ statistical techniques. We restate the hypothesis as follows: the average payoff achieved by a population depends on the distance between the distinguished genes. Distance is a controlled factor and payoff a random factor. We shall undertake two lines of analysis. In the first we will attempt to fit a straight line to the data produced (by linear regression), expecting to find a negative slope (smaller payoff for larger distances). Using standard techniques, we can test for significance by an upper one-sided Student's t-test with n-2 degrees of freedom (where n is the number of points), under the null hypothesis that the slope is zero, hoping to reject the hypothesis in favor of a negative slope. The t-test requires a normal distribution and equal variance over the range of the distance, although it is quite robust to departures from these requirements, especially when the number of points is fairly large. We will usually have on the order of 50-70 points (5-7 different distances with 10 samples at each distance), so we feel confident in any results. Difficulties will be discussed in the appropriate experiments. Our normal acceptance level is 95%; we will usually state a P-value if it is much larger than 95%. The second analysis measures only the best and worst cases: when the genes in a dependent group are clumped and when they are completely

spread out. For this type of experiment we will obtain twenty different sample points for each of the two gene permutations. Each run starts with a different initial population. The statistical analysis is the Student's t-test for a difference between means; we shall perform two-sided tests. Analysis for a negative slope is performed in the hope that distance effects are fairly smooth. The distance is well defined for environments which have only one set of non-linear genes; more than one set makes it difficult to determine what shall be called distance for this purpose. Best/worst testing is well defined in all environments (genes from a group are either adjacent or they are not). In addition, the increased number of sample points taken in this test sharpens the power of our statistical tools. (We might mention that these experiments take a considerable amount of computer time, so that it is infeasible to obtain large numbers of sample points for all possible experimental parameters.)

A Simple Adaptive System

We start with an experiment involving a simple environment and a simple adaptive system. The environment is ENV 1 (Figure 3.1). The only functional genes in ENV 1 are 1-5, linearly combined such that zero alleles are worth nothing and one alleles are worth one. The remaining genes contribute nothing to payoff. If the concept of clumping is correct for dependent (non-linear) genes among other independent (linear) genes, then we may find it to be true also of functional genes (linear or non-linear) among non-functional genes. In any case, this environment should serve as a control on later ones.
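As a minimal sketch (the function name and the list-of-pairs representation are illustrative), the ENV 1 payoff just described depends only on the alleles of genes 1-5, wherever they sit on the string:

```python
def env1_payoff(chrom):
    """ENV 1 as described: genes 1-5 are linear, each 1-allele worth one
    point; genes 6-25 contribute nothing.  A chromosome is a list of
    (gene, allele) pairs in arbitrary positional order."""
    return sum(allele for gene, allele in chrom if 1 <= gene <= 5)

# An all-ones chromosome in an arbitrary gene pattern scores the maximum, 5:
genes = [13, 2, 22, 1, 5, 3, 4] + list(range(6, 13)) + list(range(14, 22)) + list(range(23, 26))
chrom = [(g, 1) for g in genes]
print(env1_payoff(chrom))   # -> 5
```

Because the payoff ignores position, any effect of gene placement in Experiment A1 could only come through the dynamics of crossover, not through the payoff function itself.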

We try this kind of environment first because it reduces (to zero) the "noise" from the remaining 20 genes, an advantage in a first step. The adaptive system used has no inversion and no migration. Since there is no inversion and since the population starts out completely homologous (i.e., all individuals have the same gene pattern), no special mating rule is needed. Single crossover is used (i.e., exactly one crossover point generates a new individual from two parents). The probability of mutation and the pattern of genes on the chromosomes are the only variables of interest. Experiment A1 consists of five cases (1-5), in which the initial population and all parameters are identical (Pmut = .005), the only difference being in the position of the five functioning genes on the chromosomes in the five cases. These positions are given in Table 4.1. The distance figure in the table is the distance from the leftmost distinguished gene to the rightmost. This is the controlled value in the linear regression.

        Position of gene:
Case    1   2   3   4   5    Distance   Ave pay for 10 runs
 1      1   2   3   4   5        4          4.89
 2      1   3   6   8  10        9          4.88
 3      1   4   8  12  15       14          4.87
 4      1   6  11  15  20       19          4.91
 5      1   7  13  19  25       24          4.88

Table 4.1: Experiment A1. Gene Positions and Results.

Each case had ten runs of eighty generations apiece. It took about sixty generations for the population to reach a steady state at about 4.89. The actual runs had a variation in average population payoff from 4.74 to 4.97; the numbers in Table 4.1 were the average

of the ten runs for each case; they are not used in the analysis and are included only to give the reader an intuitive feel for the range. The raw data consisted of 50 points, 48 degrees of freedom, requiring a negative slope and t = 1.68 for 95% confidence. The slope obtained from the regression was +.0003 with a t-value of t0 = .30, not enabling us to reject the hypothesis that there is no dependence of payoff on distance. Experiment B1 used ENV 2, which also involved 20 non-functional and 5 functional genes, but the five functional genes were non-linear in their effects according to Figure 3.2(a). Only genes 1-5 out of group 1 were functional, the payoff being highest (3) at all ones or all zeros, dropping to 2 with one allele different (four 1's and one 0, or four 0's and one 1) and to 1 for combinations containing three of one allele and two of the other. This experiment was run under exactly the same set of conditions as experiment A1: 5 cases, 80 generations, 10 runs; the patterns for each case were as given in Table 4.1. Any problems due to the strict linearity of ENV 1 should not affect this experiment. Again, no significant slope was found; the population average over all 50 runs was 2.93, the slope was +.0002, and t0 = .35. Experiment C1 used ENV 3, in which every gene is functional and group one is non-linear according to Figure 3.3. Non-linear group B (6 genes) has a peak of 10 at combination 001011 and 9 at the competing peak of 110100, with a trough extending down to 3 between them. The maximal points over the whole environment are at 28 and 29. Table 4.2 contains a summary of the experimental parameters for the 5 cases, including the position of the distinguished genes in each

case. The mutation rate was again .005.

        Position of gene:                       Ave pay, 10 runs
Case    1   2   3   4   5   6    Distance       at gen. 120
11      1   2   3   4   5   6        5            25.30
12      1   3   5   7   9  11       10            25.46
13      1   4   7  10  13  16       15            25.35
14      1   5   9  13  17  21       20            25.89
15      1   6  11  16  21  25       24            25.38

Table 4.2: Experiment C1. Gene Positions and Results.

The overall average was 25.47, the slope was +0.013, and t0 = .94. Again, no significance.

The Problem of Homogeneity

The results of these three experiments force us to conclude that there is no advantage in clumping, at least for these particular sets of experimental parameters and environments. A close look at the equilibrium populations in the three experiments indicates a probable reason for this result: most of the individuals were "perfect" with respect to the genes of interest. In experiments A and B about 90% of the individuals in the population had all the alleles right, and in experiment C anywhere from 50% to 90% of the population had all the correct alleles. Actual allelic frequencies were approximately 90-100%. Thus, the populations are very homogeneous, at least with respect to the genes of interest. When crossover occurs between two parents the distinguished genes may be split up, but because of this homogeneity it is unlikely that the resulting individual will have different alleles substituted for the good combinations in the parents. In retrospect this makes sense, for the position-induced behavior is predicated on the basis of keeping good combinations together so that

they will not be lost when split up. But if the good alleles are almost universal in the population, it makes no difference whether the genes are split or not. The reproductive plan tends to lead the whole population to the optimal point once it has been reached, and the only reason for the variance observed is mutation. If we expect to determine dependencies using position effects in stable populations, then, we must ensure that the steady state populations we produce are not too homogeneous. In population genetics theory there are several reasons offered for the large variety observed in natural populations:

1) Non-linear effects due to diploidy, e.g., overdominance and epistasis. Obviously, this does not help haploid populations; in addition, the extent of this effect is being questioned by population geneticists.

2) The population is not in equilibrium; i.e., some gene or group of genes is in the process of moving from one steady state point to another due to a change in the environment or the discovery of a new local maximum. This reason is contrary to our assumption of equilibrium.

3) Neutralism. This theory states that most of the observed differences are selectively neutral; the characteristics have nothing to do with fitness or payoff. Again, this reason does not help us, for we wish to investigate the effects of selectively important genes.

4) "High" mutation rates.

5) Low selection with respect to mutation rates.

6) Migration from one deme (closed population) into another when the two are facing different environments.

These last three reasons are the only possible sources of variation we can use to revise the equilibrium point of a population downward to a limited amount of homogeneity. They will be taken up one at a time. Note that we are not using these methods to aid the reproductive plan. On the contrary, we expect them to fight the natural tendency of the reproductive plan to find the optimum. Our interest in this research is not the optimum but rather the dependencies in the environment in the vicinity of the optimum. For this we need some variety in the population when it is "in equilibrium". We might hope that the increased difficulty of natural or real environments would automatically lead to the higher degree of variance which we are here providing artificially. We will first describe the experiments and results and then make our conclusions.

Variance by Mutation

We expect increasing the mutation rate to shift the equilibrium point of a population undergoing selection. Population geneticists have worked out formulas to express the relation for only very simple cases (e.g., one or two genes, two alleles, diploid chromosomes). Our environments are so much more complex that we can expect little from this previous work except direction: higher mutation, more variance. Our main problem may be that in order to get enough variance we may have to increase mutation so much that it destroys the work of selection.
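The trade-off just described, mutation high enough to maintain variance but not so high that it destroys selection, can be illustrated with a minimal haploid simulation. This is a sketch under assumed simplifications (payoff-proportional parent selection, an ENV-1-style linear payoff equal to the number of 1-alleles, no crossover), not a reconstruction of the actual reproductive plan used in these experiments:

```python
import random

def simulate(pop_size=50, genes=5, mutation=0.005, generations=100, seed=1):
    """Minimal haploid sketch: payoff is the number of 1-alleles (an
    ENV-1-style linear environment).  Parents are drawn with probability
    proportional to payoff; each gene of the offspring then mutates
    (flips) independently with the given rate.  No crossover is used."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(genes)] for _ in range(pop_size)]
    for _ in range(generations):
        weights = [sum(c) + 1 for c in pop]          # +1 keeps weights nonzero
        parents = rng.choices(pop, weights=weights, k=pop_size)
        pop = [[g ^ 1 if rng.random() < mutation else g for g in p]
               for p in parents]
    return sum(sum(c) for c in pop) / pop_size       # average payoff, 0..genes

low = simulate(mutation=0.005)   # settles near the optimum of 5
high = simulate(mutation=0.100)  # more variance, lower average payoff
```

Even in this toy setting the direction reported in Table 4.3 appears: raising the per-gene mutation rate from .005 to .1 depresses the equilibrium population average.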

In experiments A1, B1, and C1, the mutation levels were set at .005. Two higher levels were tried, .025 and .100, in experiments identified as A2, A3, B2, B3, C2, and C3. All that was observed was a lowering of the average population payoff, as shown in Table 4.3.

                                      Range     Population
                          Mutation      in     average over             Confidence
    Experiment  Environment   rate    payoff     50 runs      Slope     of t-test
        A1           1        .005     0-5         4.89       +.0003       .62
        A2           1        .025     0-5         4.45       +.0011       .68
        A3           1        .100     0-5         3.54       +.0022       .75
        B1           2        .005     1-3         2.93       +.0002       .64
        B2           2        .025     1-3         2.65       +.0017       .89
        B3           2        .100     1-3         1.94       +.0022       .86
        C1           3        .005     3-29       25.47       +.0133       .82
        C2           3        .025     3-29       20.04       -.0052       .61
        C3           3        .100     3-29       16.14       -.0043       .66

                    Table 4.3: Experiments in Mutation Rates.

Since there is no indication that the added variance gives rise to the expected position effect, experiments along these lines were discontinued. The highest mutation level used (.1) caused a large degradation in performance: in experiment C3, the population average produced was not much greater than that expected by chance (16.14 vs. 14.72). These mutation levels are too high to permit effective adaptation. Intermediate levels also did not produce the desired effect.

Variance by Decreased Selection

Expressions obtained by theorists indicate that the equilibrium point for one gene, two allele systems depends not only on the mutation rates but also on the selection coefficient, that is, on the relative

number of offspring expected due to an allele; the lower the selection, the higher the variance expected. In our terms this corresponds to the ratio of the direct "payoffs" or function values of the chromosomes involved. For example, in ENV 3 the best chromosome had a value of 29 and the worst a value of 3. Thus, the best was approximately ten times as likely to serve as a parent as the worst. Again, generalizing from the single gene model to a multiple gene model leads to currently unsolvable mathematics, so that direction is the only inference we can draw.

Our method of testing this as a possible source of variance is merely to translate the payoffs of every chromosome upward by some fixed amount. The selection coefficient (i.e., the ratio of best to worst) is thus reduced, leading to a downward shift in the performance level of the population due to the increased ability of "non-optimal" genes and combinations to enter into determination of the next generation. In terms of the specification of the environment we have been using, this is actually realized by the definition of a new environment which has the appropriate characteristics. ENV 4 is merely ENV 1 with each zero allele paying 9 and each one allele paying 10. Similarly, ENV 5 is ENV 2 with each functional group of alleles shifted by 5 (so that the payoffs are 8,7,6,7,8) and ENV 6 is ENV 3 with all of group 1 shifted by 10 (so that the payoffs for group 1 range from 13 to 20, and the payoffs for a single chromosome can range from 13 to 39). If this technique is useful it would be a simple matter to adjust the selection rate to a "standard" range as part of the adaptive system. For the present, we will adjust the payoffs manually by changing the environments.
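The effect of the upward translation on the selection coefficient can be checked directly. The sketch below assumes, as in the text, that selection pressure is summarized by the ratio of the best to the worst payoff:

```python
def selection_ratio(best, worst, shift=0.0):
    """Ratio of best to worst payoff after translating every payoff
    upward by `shift` (the text's measure of selection pressure)."""
    return (best + shift) / (worst + shift)

env3 = selection_ratio(29, 3)       # ENV 3: about 9.7, best ~10x as likely a parent
env6 = selection_ratio(29, 3, 10)   # ENV 6: 39/13 = 3.0, much weaker selection
```

The shift by 10 in ENV 6 cuts the best-to-worst ratio from roughly 10 to 3, which is the intended reduction in selection pressure.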

Using the altered environments as defined above, experiments A4, B4, and C4 were performed with a mutation rate of .005, yielding the results of Table 4.4.

                                          Population
                              Range in   average over                       Confidence
    Experiment  Environment    payoff      50 runs     Adjusted   Slope     of t-test
       A4       4 ("1"+45)     45-50        48.45        3.45     +.0044       .64
       B4       5 ("2"+5)       6-8          7.80        2.80     +.003        .89
       C4       6 ("3"+10)     13-39        34.14       24.14     +.022        .83

                    Table 4.4: Experiments in Reduced Selection.

The population averages are indeed depressed in comparison to the averages of experiments A1, B1, and C1, which used the same mutation rates. The results seem to be comparable to, or slightly worse than, experiments A3, B3, and C3, which used a .1 mutation rate. But as before, the results are not significant, nor even indicative (the slope is in the wrong direction).

Variance by Migration

Migration is the last method by which we shall try to increase the variance of a population at its equilibrium point. Migration is the movement of an organism from outside a closed, interbreeding population (deme) into that deme. If two demes are stable and face different environments, their genotypic constitutions will be different, so that mixing them via migration will tend to force the population away from the optimum for either environment.

In single gene analysis theoretical population geneticists treat migration exactly the same as mutation: introduction of alternate

alleles according to a particular probability distribution. But in the multiple gene case we expect differences in the effects, for the following reason. Although immigrants must be of the same species (i.e., they are basically similar), they differ in many of their genes. Thus the immigrant may contain many non-adapted alleles in one chromosome. When it mates with an individual of the deme, many non-adapted alleles are transferred at once. Members of the deme which do not breed with immigrants do not receive this high dose of new alleles, and so their offspring can maintain their adaptation. This is different in effect from a comparably high mutation rate, where the adaptation of every individual in the deme is threatened by the mutation.

As described in Chapter Three, migration is simulated by creating immigrants which differ from some individual already in the population by about a third of the genes in its chromosome. The actual proportion of changed genes could have been made an experimental parameter if the experiments described below had been more positive in their results. Table 4.5 describes the experiments which were run with migration and the analyses performed.

Experiment A5 with ENV 1 showed mixed results. Analysis of the population average at generation 70 gave rise to our first significant regression line. However, analysis of the population average at generation 80 was not significant, definitely a disappointment. Looking a bit more closely at the population at generation 80 to perform a different analysis of the data, we counted the number of "perfect" chromosomes (chromosomes whose five functional genes contained the right alleles for maximum

                                                       Regression
    Exp   ENV    Mut    Migr   Variable**    Mean     Slope    Confidence   VEBR***
    A5     1    .005    25%      PA70        3.81    -.006        .99        11%
                                 PA80        3.87    +.00002
                                 NP80       33.51    -.27         .98         9%
    A6*    1    .005    25%      PA80        3.86    -.005        .98        10%
    A7     1    .005    30%      PA80        3.66    -.0002
                                 NP80       24.40    -.016
    B5     2    .005    25%      PA80        2.24    -.004        .99        11%
                                 NP80       46.12    -.30         .98         9%
    C5     3    .005     5%      PA120      22.53    +.026
    C6     3    .002    10%      PA120      20.76    -.005
                                 TF120      26.26    -.011
    C7     3    .005    25%      PA120      16.60    -.025        .995       10%
                                 TF120      22.45    -.016

    *Gene positions: four together, one spread apart.
    **PA = Population Average at generation...  NP = Number of Perfect chromosomes
      at generation...  TF = Top Five average at generation...
    ***Percent Variance Explained by Regression which is 95% significant.

                       Table 4.5: Experiments in Migration

payoff), and performed a regression on distance with respect to this random variable. About one-third of the populations were made up of perfect chromosomes. The regression showed significance, with the average number of perfect chromosomes for the least distance (4) being 36.22 and for the greatest distance (24) being 30.82. The reason that the population average did not reflect this difference lies in the distribution of the non-perfect chromosomes. For some reason, populations with many perfect individuals also had many very bad individuals.

Experiment A6 was run with a slightly different arrangement of genes on the chromosomes. Instead of spreading the genes evenly as for the other experiments up to this point (described in Table 4.1), four of the genes were clumped at one end of the string (positions 1, 2, 3, and 4) and the remaining gene was placed on the string at five different distances (at positions 5, 10, 15, 20, and 25). This constitutes a different test of the position effect. The results were significant.

Experiment A7 was run at a migration rate of 30% but showed no significance of the population average at generation 80. The number of perfect chromosomes likewise failed to achieve significance.

Experiment B5 was run at a 25% migration level on environment 2 and produced significant position effects both in the population average and in the number of perfect chromosomes.

Experiments C5, C6, and C7 were attempts to define the effective range of the migration operator on environment 3. Rates of 5% and 10% were ineffective at producing the position effect, but a rate of 25% produced a significant regression line on the population average. An alternate analysis of experiments C6 and C7 was tried: the average of the top five individuals in the population versus distance. This is

a measure of how the best part of the population is doing, and may be a more reasonable choice for analysis than the number of perfect chromosomes, since perfect chromosomes are usually rare (or non-existent) in complex environments. (In addition, knowledge of the optimum is often lacking.) In the case of ENV 3, a perfect chromosome pays 29, and the top five averages for the two experiments were 26.26 (C6) and 22.45 (C7), indicating that few really good chromosomes were produced. The results of this top five analysis were not significant, even in experiment C7, in which the whole population average showed the position effect. The interpretation is that while the increased migration rate creates enough variance in the whole population to show the effect, it does not create much variance in the best part of the population (which by definition is that part closest to the optimum and so much more nearly identical).

Best/Worst Tests

The results obtained thus far are not too encouraging. Only by artificial means are we able to show a position effect. It may be that seeking a linear relationship between distance and average payoff is expecting too much. The relationship may not be very smooth. Testing just the best and worst cases (i.e., genes adjacent and genes most spread) should overcome some of this difficulty (if indeed there is a distance effect). In addition, the experiments described in this section obtain more sample points for each permutation than did the previous experiments, allowing the statistical tests to be more certain. Finally, we will test a wider range of, and much more complex, environments via this method.

The adaptive system used involved no migration, no inversion, no special mating rule (since there was no inversion), single crossover, and a mutation rate of .005. The results are stated in Table 4.6. For each experiment two different permutations of genes were used: the best possible, where dependent genes are adjacent, and the worst possible, where all genes from dependent groups were as far from each other as possible. For each permutation twenty different initial populations were run to equilibrium at some number of generations. The population average at that point was used in a Student's t-test for difference of means with 38 degrees of freedom. The table indicates which permutation produced the best populations and whether the difference between the populations was significant at the 95% or 99% level for a two-sided t-test.

Environments 7, 8, 9, and 10 are newcomers. They are all somewhat more complex than the previous ones encountered. We can briefly summarize them as follows. ENV 7 contains five non-linear groups, the first of order 6, the second of order 5, the third of order 6, the fourth of order 3, and the fifth of order 4. ENV 8 contains three non-linear groups of order 6, each group similar to the single non-linear group of ENV 3; the remaining 7 genes are linear. ENV 9 contains a non-linear group of order 8, containing two false peaks; good points in the payoff function are much sparser than in the previous environments. The remaining genes pay linearly. ENV 10 repeats the 8-group of ENV 9 twice and also has a very sparse group of 9 genes with two false peaks in addition to the true peak. It is an important point that optimum combinations (within groups) appear infrequently in random initial populations of ENV 9 and ENV 10.
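The pooled two-sample Student's t statistic used for these comparisons (twenty runs per permutation giving 20 + 20 - 2 = 38 degrees of freedom) can be sketched as follows; this is the standard textbook formula, not the author's original program, and the sample values are hypothetical:

```python
import math

def two_sample_t(xs, ys):
    """Pooled two-sample Student's t statistic for a difference of means;
    degrees of freedom = len(xs) + len(ys) - 2 (38 for twenty runs each)."""
    nx, ny = len(xs), len(ys)
    mx, my = sum(xs) / nx, sum(ys) / ny
    ssx = sum((x - mx) ** 2 for x in xs)        # sum of squared deviations
    ssy = sum((y - my) ** 2 for y in ys)
    sp2 = (ssx + ssy) / (nx + ny - 2)           # pooled variance estimate
    t = (mx - my) / math.sqrt(sp2 * (1 / nx + 1 / ny))
    return t, nx + ny - 2

# Hypothetical equilibrium averages, four runs per permutation for brevity:
t, df = two_sample_t([1, 2, 3, 4], [2, 3, 4, 5])
```

For 38 degrees of freedom the two-sided critical values are approximately 2.02 at the 95% level and 2.71 at the 99% level.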

                      Best                Worst
                     (near)               (far)       Significance
    Exp   ENV   # Gens    Mean   Direction   Mean      (if any)
    D1     2      75       2.93      <        2.94
    D2     5     150       7.77      =        7.77
    D3     3     150      25.23      <       25.35
    D4     6     250      34.15      <       34.52
    D5     7     250      32.99      >       32.07        99%
    D6     8     250      31.91      <       32.36
    D7     9     250      22.87      >       22.51        95%
    D8    10     250      22.50      >       21.46        99%

                 Table 4.6: Best/Worst Equilibrium Results.

We note that the only significant results in Table 4.6 are in the direction we desired, i.e., the permutation in which genes were adjacent performed better than the permutation in which they were far apart. Of the more complex environments, only ENV 8 failed to show this direction. We point out that experiments D2 and D4 are lowered-selection versions of D1 and D3, and are comparable to B4 and C4. None of these four showed significance or even any movement in the right direction. We observe that the means in experiments D5 and D8 differed by more than one, corresponding to a full allele's difference in the payoff or possibly to a false peak paying one less than the true peak.

Summary

In general, the results above are mixed. In the earliest, linear regression experiments, of the three methods used to introduce population variance at the stable point, only migration showed any effectiveness at unmasking the position effect with respect to the population average. Analyses of other population data, the number of perfect chromosomes and the top five average, were inconsistent. (These latter two analyses were also tried on some previous experiments but did not show significance.)

Perhaps of most interest in the migration experiments which showed significant position effects are the steepness of the slope and the variance explained by the regression line (formally defined as the square of the coefficient of linear correlation). The slopes of the regressions on population average were small, on the order of 4% of the average payoff from one end of the range to the other. The variance explained by the regression is only about 10% of the total

variance observed. While these differences are small, the nature of the reproductive plan is such that it could indeed take advantage of them. Population geneticists are very comfortable working with selection factors on the order of 1.04. Stochastic effects on small populations (genetic drift, etc.), however, may confound the ability of a single short run to show these effects. Any investigation in which genetic operators (e.g., inversion) are used to exploit the effect will require a great many runs and a lot of statistical analysis to show any significance.

Another problem lies in the level of migration needed to observe the effect. (It may be that the reproductive plan could do the detection at lower levels, but this is not verified by the experiments above.) The environments used to this point have been fairly simple. Nonetheless, the best chromosomes observed in experiment C7 usually differed by 3 to 5 genes from the optimum. In a very complex environment with many false peaks, it may not be possible for even the best individual to reach the optimum point because of the migration pressure. This could destroy our confidence in the meaningfulness of the result; we want information about the environment near the optimum. However, we will perform some investigation of the effects of the inversion operator on populations using migration, both at the 25% level observed to have a distance effect and at lower levels. Such populations maintain the variability needed over the long spans of time required for the inversion operators to have effect. In much more complex classes of environments than those in this research this desired variability exists automatically: it just plain takes longer to home in on the maximum.

The best/worst analysis yielded a bit more information. It appears that the expected position effect shows up in the more complex environments, although not perfectly, as witness the failure of experiment D6. Although these results are significant, there is the distinct possibility that they are due to effects earlier in evolution rather than to strict equilibrium considerations. This question is considered in the next section.

Evolving Populations

The usefulness of a "correct" permutation of genes in evolving populations is almost impossible to show analytically. Turner's paper on a three gene model is the most complex analysis available on linkage, and that involves an "equilibrium", diploid population. There are few analyses of populations on the move. The intuitive analysis of the position effect in evolving populations is much the same as for populations in equilibrium, with the following difference: "good" combinations of genes are likely to be rare in an evolving population. If an instance of a good combination is created via recombination and then subsequently broken up by recombination, it is very unlikely to be reconstructed soon. Thus, there is a sort of "discovery" problem. When a combination appears for the first time it is more likely to make its presence known if it is not broken up easily, i.e., if it is in a good permutation. Clearly this implies that good permutations lead to faster evolution than do poor permutations. This is exactly the basis on which we will examine the position hypothesis in this section.
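The "discovery" argument can be made quantitative for the single-crossover operator used in these experiments: a group of dependent genes survives intact in an offspring unless the crossover point falls inside its span. The sketch below assumes a crossover point chosen uniformly among the 24 cut points of a 25-gene chromosome; the example positions are taken from the permutations of Table 4.2:

```python
def split_probability(positions, chrom_len=25):
    """Chance that a single crossover point, chosen uniformly from the
    chrom_len - 1 cut points, falls inside the span of a gene group
    (separating at least two of its members between the offspring)."""
    span = max(positions) - min(positions)
    return span / (chrom_len - 1)

adjacent = split_probability([1, 2, 3, 4, 5, 6])      # 5/24: rarely split
spread = split_probability([1, 6, 11, 16, 21, 25])    # 24/24: always split
```

A newly discovered combination in the adjacent permutation thus survives a crossover about four times in five, while in the fully spread permutation it is broken up by every crossover.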

By faster evolution we do not mean instantaneous rates of change, with their attendant difficulties of definition. We merely intend to measure the average population payoff at each generation and make comparisons on this basis. There is a distinct non-independence of sampling between generations: if population A is greater than population B at time t, then it is also likely to be greater than population B at time t+1. But if there are no selective differences between the populations then, on the average, no population should predominate all the time, and, more importantly, the difference should not be statistically significant.

Best/Worst Evolution

The experimental conditions for testing the faster evolution hypothesis are exactly the same as for the best/worst equilibrium experiments. (As a matter of fact, the data for the best/worst experiments come from the terminal conditions of the computer runs used in this section.) Briefly, the experiments involved no migration, no inversion, no special mating rule (since there was no inversion), single crossover, and a mutation rate of .005. For each experiment two different permutations of genes were used: the best possible, where all genes from dependent groups were adjacent, and the worst possible, where all genes from dependent groups were as far from each other as possible. For each permutation twenty different initial populations were run to equilibrium for some number of generations. We report on experiments D1-D8 as given in Table 4.6.

The analysis performed was simple. At each generation the means of the twenty samples for the two permutations were calculated and a

[Figure 4.1 lists, for generations 0-75 of ENV 2, the mean population payoff over twenty runs for the NEAR and FAR permutations, together with the direction of the difference and significance marks. The NEAR mean leads for roughly the first dozen generations; thereafter both settle near the equilibrium value of about 2.9, with the FAR mean slightly ahead.]

Figure 4.1: Output from Experiment D1.
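The per-generation comparison tabulated in Figure 4.1 can be sketched as follows; `compare_by_generation` is a hypothetical helper, not the original analysis code, and the two-run, two-generation data are invented for illustration:

```python
def compare_by_generation(near_runs, far_runs):
    """near_runs / far_runs: one list per run of per-generation population
    averages.  Returns, for each generation, the two cross-run means and
    the direction of their difference (ties reported as '<')."""
    gens = len(near_runs[0])
    rows = []
    for g in range(gens):
        nm = sum(run[g] for run in near_runs) / len(near_runs)
        fm = sum(run[g] for run in far_runs) / len(far_runs)
        rows.append((nm, fm, '>' if nm > fm else '<'))
    return rows

# Hypothetical example: two runs per permutation, two generations each.
rows = compare_by_generation([[1, 2], [3, 4]], [[0, 5], [2, 3]])
```

Each row corresponds to one line of the NEAR/FAR listings; a significance column, as in the figures, would require the per-generation t-test in addition to the two means.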

[Figure 4.2 lists, for generations 0-150 of ENV 3, the mean population payoff over twenty runs for the NEAR and FAR permutations, with the direction of the difference and significance marks. The NEAR mean leads, often significantly, through roughly generation 90; thereafter the FAR mean is slightly ahead as both approach the equilibrium near 25.3.]

Figure 4.2: Output from Experiment D3.

[Figure 4.3 lists, for generations 0-250 of ENV 7, the mean population payoff over twenty runs for the NEAR and FAR permutations, with the direction of the difference and significance marks. The NEAR mean leads for almost the entire run, frequently with significance marks, and the gap widens to about one payoff unit by generation 250 (32.99 vs. 32.07).]

Figure 4.3: Output from Experiment D5.

[Figure 4.4 lists, for generations 0-250 of ENV 8, the mean population payoff over twenty runs for the NEAR and FAR permutations, with the direction of the difference and significance marks. The NEAR mean leads, often significantly, through roughly generation 100; the FAR mean then overtakes it and remains ahead through generation 250 (31.91 vs. 32.36).]

Figure 4.4: Output from Experiment D6.

[Figure 4.5: Output from Experiment D7 (ENV 9). Computer printout listing, for generations 0-250, the NEAR and FAR population mean payoffs with direction (DIR) and significance (SIG). The direction of the difference alternates through the middle of the run, with NEAR predominantly ahead early and again over roughly the last hundred generations.]

[Figure 4.6: Output from Experiment D8 (ENV 10). Computer printout listing, for generations 0-250, the NEAR and FAR population mean payoffs with direction (DIR) and significance (SIG). NEAR exceeds FAR at every generation after the initial one, frequently significantly.]

t-test performed to detect a significant difference. The results of these experiments are best shown by the computer output. In these figures (4.1-4.6) the column labeled "NEAR MEAN" gives the average population payoff for the permutation with all dependent genes adjacent, and the column labeled "FAR MEAN" gives the average for the worst-case permutation. The column labeled "DIR" shows the direction of the difference (< means NEAR<FAR, and > means NEAR>FAR). The column labeled "SIG" shows whether the difference between the two values is significant: one asterisk (*) means the difference is significant at the 95% level and two asterisks (**) indicate significance at the 99% level. Figure 4.1 shows the results for experiment D1, ENV 2. At no time is the difference between the two populations significant, and the difference is not consistently in either direction. With some confidence we can say there is no position effect in ENV 2, our simplest non-linear environment. The lowered-selection experiment, D2, showed significance in the right direction at some points. Figure 4.2 shows the results for experiment D3, ENV 3. It is of considerable interest that the direction of the difference between means is quite consistently in the desired direction through 74 generations, and that the difference is significant at the 99% level for generations 8 through 42. After generation 74 the sense of the difference wanders, but settles down after generation 95 to "<" up to generation 150, with the difference only occasionally significant. There is a definite difference in the rate of evolution early in time, with the adjacent combinations performing best. Afterwards the separated permutations catch up and exceed the adjacent ones, although not usually significantly. Experiment D4, the lowered-selection version of D3, showed somewhat

Exp   ENV   Early               Equilibrium
D1    2     > Not Sig           < Not Sig
D2    5     > Sometimes Sig     =
D3    3     > Sig               < Not Sig
D4    6     > Sometimes Sig     < Not Sig
D5    7     > Sig               > Sig
D6    8     > Sig               < Sometimes Sig
D7    9     > Sometimes Sig     > Sig
D8    10    > Sig               > Sig

Table 4.7: Summary of Best/Worst Evolution Experiments.

different behavior in that while the difference was ">" much of the time in the early stages, it was only sometimes significant. Experiments D5 (ENV 7), D6 (ENV 8), and D8 (ENV 10) all showed early significant differences favoring the adjacent permutations. In D5 and D8 there was also a significant difference at the equilibrium state. In D7 the earliest trend (while not always significant) again favored the adjacent permutations. Table 4.7 summarizes the differences in means for experiments D1-D8 in their early and late stages of evolution. Out of the eight different environments, four show significant early differences favoring the adjacent permutations, three show consistent, sometimes significant differences in the right direction, and only one never shows significance, although its earliest direction is still correct. It is important to note that of the three environments containing more than one dependent group (7, 8, and 10), all three showed significant early differences. This strongly implies that the more complex the environment, the more important is the permutation.

Approach to the Optimum

We shall undertake another analysis to show the importance of position in early evolution. We noted that experiments D5, D7, and D8 all had significantly greater population averages at the end of twenty runs for the best permutation. However, the population average doesn't tell us everything there is to know. We can, in addition, look at the maximum payoff attained in each run. Not all populations attained the maximum possible payoff value; many got hung up on false peaks. It may be that permutations have something to do with this. Although

we do not have enough data to make authoritative conclusions in this regard, the analysis below is strongly supportive of the belief that good permutations in early evolution are important. In each of the three above-mentioned experiments we count the number of populations (out of the 20 obtained) in which a particular peak was attained at the end of the run. For example, in ENV 7, the best individuals in the populations ended up on peaks paying 34, 35, 36, and 37, as well as on the true maximum of 38. We then take the average payoff of all the populations which have achieved a particular peak. This is done for populations using both the best and the worst permutations. The results are tabulated in Table 4.8 for the three experiments. It is clear that the population average associated with a particular peak reflects the magnitude of the peak. (But it may be that at the particular point in time that data was taken, a peak had just appeared in or disappeared from the population, so that the population average does not reflect the peak shown. As a matter of fact, we often note in the source data that there are only one or two adherents to a peak. However, these are standard problems in sampling and we must take the data as is. Taking more sample points is the only way of solving this problem.) A two-sided t-test for difference between means was performed for those peaks which showed up in two or more samples in both the best and worst permutation runs. The direction of the difference most often favored the adjacent permutations, but never significantly (i.e., at the 95% level or more). This leads us to believe that once a population has homed in on a peak, the permutation makes no difference, as many of our previous experiments indicate. In simple environments

                 Best Permutation        Worst Permutation
      Best       Number of       Ave.    Number of       Ave.
Exp   Indiv.     Populations     Pay     Populations     Pay     DIR  SIG
D5    34         1               31.63   2               30.67
      35         3               31.48   7               31.02    >
      36         4               32.58   2               31.66    >
      37         7               33.11   3               32.14    >
      38         5               34.33   6               33.85    >
D7    24         0                       3               21.47
      25         9               22.45   11              22.34    >
      26         11              23.22   6               23.35    <
D8    23         1               20.21   2               20.42
      24         1               20.10   8               21.38
      25         6               22.07   8               21.60    >
      26         8               22.83   2               22.92
      27         4               23.68   0

Table 4.8: Best Individuals vs. Population Payoff.

nearly all populations find the true optimum, even given the worst permutation of genes on their chromosomes. More complex environments, however, are more likely to trap a population on a false peak, so that sampling the equilibrium point obtains more false peaks, possibly yielding a significant difference in sampled overall means, as in this case. We note especially that experiment D8, which used our most complex and sparsest payoff function, ENV 10, failed to attain the optimum point at all in twenty runs with the worst gene permutation, while the optimum was attained four times with the most favorable permutation. Other comparisons are similar. The important lesson to be gained from all this is that sparse payoff functions require close permutations if a good point is to be maintained once it is found; otherwise, random effects are enough to overcome a true peak's advantage in selection and drive the population to any of the numerous false peaks in an environment.

Inversion Experiments

To determine whether inversion has any effect as a genetic operator we can do two things: look at the population average when inversion is and is not in use; and investigate the actual gene permutations in the populations. The first method was that chosen by Cavicchio (3) and Bosworth (2). Its greatest failing is that it does not completely verify the hypothesis; that is, it may be that the inversion operator affects the payoff average by means other than bringing together interacting genes. In fact, Cavicchio's inversion experiments are subject to an entirely different analysis. Briefly, Cavicchio's chromosomes consisted

of many copies of the same gene, a gene which had a very large number of alleles. If there were two copies of the same allele on one chromosome (an a priori unlikely event) the effect of that allele would be increased, unlike our system in which only one allele is allowed to appear for a gene. His system did not need to check for homology since all genes were the same; any two individuals could always mate. The alternate analysis of the effectiveness of inversion then states the following. For exactly the same reason that we need mating rules when inversion is used in our system (i.e., to avoid missing or multiple copies of a gene), it is likely that an inversion followed by a mating with the parent or a sibling will produce individuals with multiple copies of some alleles. Multiple copies of a good allele would then increase payoff. Since Cavicchio did not study the micro-structure of his populations we do not have any evidence on which to accept one explanation over the other. If we were to make Cavicchio's claim (i.e., comparison of population averages) on our kinds of experiments, the claim would be easier to accept, since our protocol does not allow for this alternate hypothesis. It does, however, allow for the possibility that dependent genes work best farther apart and that inversion helps attain a large distance. Further verification would be needed. Bosworth makes this sort of claim (that inversion is effective in bringing epistatic genes together) but presents no proof in (2). However, in discussions with him, it appears that he has unanalyzed data that might prove the point for a two-gene case. The experiments in the first two parts of this chapter are first attempts to show that inversion can work as advertised: indeed, under

certain circumstances the position of genes on a chromosome is important. Formally, it is easy to show that an operator such as inversion can produce any possible permutation of genes. Thus, as in a classic crime case, we can show motive (adaptive advantage), means (the ability to produce any permutation), and opportunity (inversion is a part of our adaptive system). However, in this case we need an eyewitness: hence the experiments.

Probabilities of Permutations

In order to state that genes are actually being pushed closer on a chromosome, we need some idea of how distant a set of genes is likely to be under random assignment. Let the length of the chromosome be m and let the number of genes in the set, N, be n (2 ≤ n ≤ m). Assume that every permutation of the m genes (of which there are m!) is equally likely. For any permutation define L to be the position of the leftmost gene from the set N and R to be the position of the rightmost gene from the set N. For convenience we define the distance, D, to be R-L+1. Thus the maximum distance is m and the minimum distance for a set of two genes is two (when they are adjacent). Now, the total number of different placements (position sets) of the n genes on a chromosome of length m is C(m,n). Thus, for any distance, we have P(D=d) = (number of placements with D=d)/C(m,n). The numerator of this expression is just (m-d+1)C(d-2,n-2):

P(D=d) = (m-d+1) C(d-2,n-2) / C(m,n),   2 ≤ n ≤ d ≤ m

where m = length of chromosome, n = size of group, and d = distance. Numbers below are for m = 25.

           n=2             n=5             n=6             n=8
 d    P(D=d) P(D≤d)   P(D=d) P(D≤d)   P(D=d) P(D≤d)   P(D=d) P(D≤d)
 2     .080   .080
 3     .077   .157
 4     .073   .230
 5     .070   .300     .000   .000
 6     .067   .367     .002   .002     .000   .000
 7     .063   .430     .004   .005     .001   .001
 8     .060   .490     .007   .012     .002   .002     .000   .000
 9     .057   .547     .011   .023     .003   .006     .000   .000
10     .053   .600     .017   .040     .006   .012     .001   .001
11     .050   .650     .024   .064     .011   .023     .001   .002
12     .047   .697     .032   .096     .017   .039     .003   .004
13     .043   .740     .040   .136     .024   .063     .006   .010
14     .040   .780     .050   .186     .034   .097     .010   .020
15     .037   .817     .059   .245     .044   .141     .017   .038
16     .033   .850     .069   .313     .057   .198     .028   .065
17     .030   .880     .077   .391     .069   .267     .042   .107
18     .027   .907     .084   .475     .082   .349     .059   .166
19     .023   .930     .090   .564     .094   .443     .080   .246
20     .020   .950     .092   .657     .104   .547     .103   .349
21     .017   .967     .091   .748     .109   .657     .125   .475
22     .013   .980     .086   .834     .109   .766     .143   .618
23     .010   .990     .075   .909     .101   .867     .151   .769
24     .007   .997     .058   .967     .083   .950     .138   .907
25     .003  1.000     .033  1.000     .050  1.000     .093  1.000

Mean = 9.67       Mean = 18.34     Mean = 19.56     Mean = 21.22

Table 4.9: Probability of Dispersion of Genes on a Chromosome.
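The distribution tabulated above follows directly from the counting formula; a minimal Python sketch (not part of the thesis, which predates such tools) that reproduces the tabulated probabilities and means:

```python
from math import comb

def p_eq(d, m, n):
    # P(D = d): (m-d+1) placements of the two outermost genes of the
    # group, times C(d-2, n-2) ways to fill the interior positions
    return (m - d + 1) * comb(d - 2, n - 2) / comb(m, n)

m = 25
for n in (2, 5, 6, 8):
    dist = {d: p_eq(d, m, n) for d in range(n, m + 1)}
    mean = sum(d * p for d, p in dist.items())
    cum19 = sum(p for d, p in dist.items() if d <= 19)
    print(f"n={n}: P(D<=19) = {cum19:.3f}, mean = {mean:.2f}")
```

For n = 2 this reproduces the entries of Table 4.9, e.g. P(D ≤ 19) = .930 and a mean of 9.67.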

That is, there are (m-d+1) ways of positioning the outermost two genes at distance d, leaving the interior (d-2) positions to be filled by the (n-2) remaining genes. The cumulative distribution is

P(D ≤ d) = C(d,n)(nm - nd + d) / (d C(m,n))

which is obtained after a page of not very interesting manipulation. Table 4.9 contains the probabilities for distances on a chromosome of length 25 of groups of size 2, 5, 6, and 8. We note particularly that for 2 genes the median and mean distances are between 8 and 9; for 5 genes they are between 18 and 19; for 6 genes they are between 19 and 20; and for 8 genes they are between 21 and 22. Any reference to "good" permutations will have to be measured against these figures.

Experimental Protocol

All the experiments in the rest of this chapter have the following protocol. Each experiment consists of 5 or 10 runs from an initial, random population. Each population begins with all its individuals having the same gene permutation, but that permutation is different for each run. The ten random permutations used are listed in the Appendix. (These random permutations were obtained by generating 25 random numbers and sorting them. The i'th random number then ends up in some position j, defining a random permutation of the numbers 1-25.) Each experiment involves a particular combination of an inversion operator and a mating scheme. At the end of each run three sets of data are collected. The first concerns the distance D, as defined above, for a given set of genes: the distance between the left- and right-most positions of genes from

that set. D is calculated for every chromosome in the population, 100 samples in all. At the end of 5 or 10 runs there are 500 or 1000 samples of D, falling into at most 25 different cells. The data collected is then the number of chromosomes which have distance D (2 ≤ D ≤ 25). The other two sets of data concern the distances between all possible pairs of the 25 genes. There are C(25,2) = 300 such pairs. For each of the 300 pairs two items are collected over the 500 or 1000 samples: the average distance, and the number of chromosomes in which the distance is 9 or less. The analysis to be performed is a comparison of these values against the values predicted by the probability distribution calculated above, to determine whether the inversion/mating combination has produced permutations which are different from chance, specifically, whether particular sets of genes have moved closer together on the chromosomes. (Details are given below.) Immediately we are faced with an ancient problem in statistics. Any statistical test requires that all sample points be obtained independently. But in each of the 10 runs all 100 chromosomes start out with the same permutation, not 100 different random permutations. Presumably, the random inversion will tend to alter the initial permutations, but the danger exists that the chromosomes will still look like the initial permutation after some number of generations. In addition, even if the initial permutation has been changed by inversion so that it bears no more than a chance resemblance to some terminal chromosome, it is quite likely that all the chromosomes in a terminal population resemble each other. These problems are of great importance and would cause serious difficulties if we wished to perform strict statistical analyses. It is our hope, however, that

the results will be sufficiently outstanding that such a procedure is not required. We will take the data as it is to see if statistical analysis is warranted and then worry about whether it is feasible. There are many possible combinations of the genetic operators and mating rules concerned with inversion described in Chapter Three. We list them below for convenience.

Inversion type    Inversion time    Mating rule
Linear            Continuous        Strict-homology
Linear+end        Mass              Viability
                                    Any-pattern
                                    Best-pattern

We do not report in detail any of the experiments attempted using the linear inversion type (i.e., just picking two inversion points at random). As mentioned in Chapter Three, this method of selecting pivot points yields a small probability for the end points to be moved in an inverted segment. Early experiments indicated that there was a very strong tendency for the end genes to stay in place over many generations. We stopped using "linear" and all the remaining experiments were run using the linear+end operator. Of the eight possible inversion time/mating rule combinations which might be tested for usefulness we immediately knock out continuous/strict-homology. Bagley (1) reported that this combination caused great difficulty because the individual inversions were almost all different, so that few were able to discover homologous mates. As a result the system was slowed down by the necessity of making several selections of parents for one member of the new population; and, we

presume, most of the offspring thus tended to be of the initial permutation, since there were more of these in the parent generation. We are thus left with seven possibilities, all of which we will try. For each of these combinations, the level of the inversion probability is another variable. We will try these combinations at two or three levels only. For convenience we will identify experiments by two letters specifying the inversion time and mating scheme, followed by a number specifying the inversion operator probability level, followed by a number specifying the experiment of this type. For example, MH3/2 means the second experiment using the combination Mass/Homology/0.3. The experiment designations are as follows.

Inversion time   Mating scheme   Operator level   Designation
Mass             Viability       0.2              MV2
Mass             Viability       0.3              MV3
Mass             Homology        0.2              MH2
Mass             Homology        0.3              MH3
Mass             Any             0.2              MA2
Mass             Any             0.3              MA3
Mass             Best            0.2              MB2
Mass             Best            0.3              MB3
Continuous       Viability       0.1              CV1
Continuous       Viability       0.2              CV2
Continuous       Any             0.05             CA05
Continuous       Any             0.1              CA1
Continuous       Any             0.2              CA2
Continuous       Best            0.1              CB1
Continuous       Best            0.2              CB2

Not all combinations will be tried with all environments or at different generation times.
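The per-chromosome statistic D collected in these experiments (the span covered by a group of genes) is simple to compute; a sketch, with the permutation and group chosen purely for illustration:

```python
def group_distance(chromosome, genes):
    """D = R - L + 1, where L and R are the leftmost and rightmost
    positions (1-based) occupied by members of the gene group."""
    positions = [i + 1 for i, g in enumerate(chromosome) if g in genes]
    return max(positions) - min(positions) + 1

# Hypothetical permutation of genes 1-25 (not one of the thesis's ten)
head = [3, 9, 1, 14, 22, 5, 7, 2, 18, 4]
perm = head + [g for g in range(1, 26) if g not in head]
print(group_distance(perm, {1, 2, 3, 4, 5}))  # genes sit at positions 1,3,6,8,10 -> D = 10
```

Applying this to every chromosome in a terminal population yields the 500 or 1000 samples of D described in the protocol above.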

Experimental results are given in tables such as Table 4.10. While most of the table headings are obvious, the last three require explanation, being the basis for judging the results of the experiments. Each of the environments listed (see Figures 3.2-3.10) has its first group of genes non-linear. For example, in environments 3, 7, and 8, genes 1-6 are non-linear, and in environment 2, genes 1-5 are dependent. For each experiment facing these environments, the D-value for these groups of genes is calculated, yielding the number of chromosomes observed with the various D-values. The column labeled "Prop≤" is the proportion of chromosomes observed whose D-value was M or less, where M is 19 for environments 2-8 and 21 for environments 9 and 10. Consulting Table 4.9 we note that for 5 genes the proportion of chromosomes expected randomly to assume D-values of 19 or less is .564; for 6 genes the proportion is .443; for 8 genes the proportion of D-values of 21 or less is .475. If there is any movement of genes together we expect these proportions to be exceeded in the data collected. As an example, see Figure 4.7(b), which shows the results of experiment MH3/4: the number of chromosomes taking on various D-values. If we were so inclined we could do a chi-squared goodness-of-fit test to see if the distribution of D-values observed were close to that predicted. Failing that, we might perform a binomial probability test to see whether the proportion observed at 19 or less was significantly more than the proportions (i.e., binomial probabilities) mentioned above. We did not perform either of these tests. We merely point out in the table (by an asterisk next to the value) those experiments in which the proportion observed was greater than the proportion expected. We follow this course rather than that of the more sophisticated statistics

for two reasons: non-independence of sampling makes the statistics suspect, and, as we note below, the results do not seem to be very positive, obviating the need for subtle analysis. The column labeled "#>.547" refers to the pair data collected. By Table 4.9 we observe that the probability that a pair is randomly at a distance of 9 or less is .547. For convenience, we shall call a distance of 9 or less "close". One of the pieces of data collected for each pair of genes is the number of chromosomes in which the D-value for the pair is observed to be close. On the average, then, we expect 54.7% of the pair D-values observed to be close. Further, there is a 0.5 probability that we will observe a pair whose close count is greater than 54.7%. Now, in a dependent group of size n there are C(n,2) pairs of genes from that group; e.g., for groups of size 6 there are 15 pairs and for groups of size 5 there are 10 pairs. If genes are positioned randomly we expect half the pairs to have large counts. If there is any tendency for genes to move together, there will be a tendency for there to be more chromosomes with D ≤ 9 and a greater than .5 chance for a pair to have a close count larger than 54.7%, so that we would expect more than 50% of the pairs from a congregating group to have large close counts. The column "#>.547" refers to the number of pairs which have large (i.e., a proportion larger than .547) close counts. It sounds a bit complicated, but an example will probably (P=.8) help. Figure 4.7(a) shows a portion of the printout from the computer program which yields this data for experiment MH3/4. The upper portion of the matrix contains the close counts (i.e., the number of chromosomes observed with D-values less than or equal to 9). This experiment had 10 runs, obtaining 1000 sample

[Figure 4.7(a): a 25 x 25 matrix of pair statistics for experiment MH3/4. The upper triangle holds the closeness counts (the number of chromosomes observed with pair D-values less than or equal to 9); the lower triangle holds the average pair distance times 10. Only the first 13 columns are legible in this copy; entries cited in the text include (1,2) = 770, (1,5) = 637, (2,5) = 738, and (4,5) = 748.]

(b) Distribution of group distance D:

    D      count        D      count
    <10      0          18       60
    11     142          19      169
    12       0          20      252
    13       0          21       74
    14       0          22        0
    15      49          23       31
    16      33          24        6
    17     126          25       58

Figure 4.7: Output from Experiment MH3/4.

chromosomes. Of the 10 pairs from the set (1,2,3,4,5), four are greater than 547: (1,2) is 770, (1,5) is 637, (2,5) is 738, and (4,5) is 748; the other six pairs have lower than average values. The entry in the "#>.547" column is thus 4/10.

A sophisticated analysis of this information would use the binomial distribution to test the significance of the close count's excess over .547 and the excess of pairs (over 50%) having large close counts. (We realize that the pairs are not independent samples, but this approach might still have value.) However, such manipulation would only be useful for our purposes if a clear-cut trend were to be observed. Instead, we only indicate when a value is in the right direction (greater than a half) by putting an asterisk next to it in the table.

The column "#<8.67" again refers to pair data, this time to the average distance between pairs over all samples in the experiment. In this case, we are using D' as the distance, where D' = D - 1 = R - L, for convenience. The lower half of the matrix in Figure 4.7(a) gives this average distance. Again, we count the number of pairs (from a group) whose average distance is less than the expected distance of 8.67. The average distance between pairs in a group should go down if the group is congregating, so that the number of pairs with distance less than 8.67 should be large if inversion is working. As in the previous column, we indicate that a value is in the right direction by an asterisk. Clearly, these last two columns are related: the more chromosomes that have a D-value less than the average, the lower the average distance. We include both measures to see if one is any better predictor.
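Under the assumption (consistent with the definitions above) that a pair's D-value is the inclusive span from its left-most to its right-most position on the 25-gene chromosome, the two random-permutation baselines quoted here (.547 and 8.67) can be reproduced by direct enumeration. This is an illustrative sketch, not the thesis's own program:

```python
from itertools import combinations

CHROM_LEN = 25  # chromosome length used throughout these experiments

# All unordered pairs of positions a gene pair could occupy at random.
pairs = list(combinations(range(1, CHROM_LEN + 1), 2))

# D-value of a pair: inclusive span from left to right position (assumption).
spans = [right - left + 1 for left, right in pairs]

# Baseline 1: probability a random pair is "close" (D-value of 9 or less).
p_close = sum(1 for d in spans if d <= 9) / len(pairs)

# Baseline 2: expected pair distance D' = D - 1 under a random permutation.
mean_dprime = sum(d - 1 for d in spans) / len(pairs)

print(round(p_close, 3))      # 0.547
print(round(mean_dprime, 2))  # 8.67
```

The enumeration recovers both thresholds: 164 of the 300 possible position pairs have spans of 9 or less (0.547), and the mean D' is 26/3 = 8.67.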

(As the reader will note, not all experiments collected pair data. Those experiments were run before this data collection was programmed. In general, the results do not seem interesting enough to warrant a rerun.)

The reason we bother collecting data on pairs is the following. If we determine that genes really do move closer together, we will still be faced with finding a method of determining which genes constitute a group when we have no a priori knowledge of the environment. If, at the same time, all pairs of genes in a congregating group show distances much smaller than predicted by chance, it may be possible to construct a group merely by examining the pair matrix for pairs with abnormal values.

Equilibrium Tests

We recall that the first part of this chapter showed migration to be an effective means of prolonging variance in the population, such that small differences in distance became a significant determinant of average population payoff. Table 4.10 summarizes the experiments performed to test the effect of the inversion/mating/operator-level combinations for populations in equilibrium under migration pressure. Several environments and run lengths were tried, along with two different migration levels. Since most of the runs were 500 generations or more, we anticipate that any long-term effect should have time to make itself felt.

Recalling that asterisks next to a value indicate that the value is in the direction of showing clumping, we observe that the asterisks

INV/MATE/
LEVEL    ID  MIGR %  ENV  # RUNS  # GENS  Prop<M  #>.547  #<8.67

MV2       1    25     3     10     500    .516*
          2    25     3     10    2000    .412
          3    25     2     10     250    .628*   5/10    3/10
          4    25     2     10     500    .552    5/10    4/10
MV3       1    15     8      5     500    .562*   9/15*   8/15
          2    15     7      5     500    .408    6/15    7/15
MH2       1    25     3     10     500    .298
MH3       1    25     3     10     500    .771*
          2    25     3     10    2000    .325
          3    25     2     10     250    .460    2/10    3/10
          4    25     2     10     500    .579*   4/10    5/10
          5    15     8      5     500    .242    4/15    5/15
          6    15     7      5     500    .402    9/15*   9/15
MA2       1    25     3     10     500    .522*
          2    25     3     10    2000    .436
          3    25     2     10    2000    .552    5/10    6/10
          4    25     2     10    2000    .591*   5/10    5/10
MA3       1    25     3     10     500    .354
MB2       1    25     3      5     500    .322    6/15    6/15
          2    25     2      5     500    .638*   7/10*   7/10
MB3       1    25     3      5     500    .722*  12/15*  11/15
          2    25     2      5     500    .730*   5/10    5/10
          3    15     7      5     500    .578*   9/15*   9/15
          4    15     8      5     500    .404    7/15    6/15
CV1       1    25     3     10     500    .260
          2    15     7      5     500    .410    9/15*   7/15
          3    15     8      5     500    .388    8/15*   8/15
CA05      1    25     3     10     500    .332
CA1       1    25     3     10     500    .391
CA2       1    25     3     10     500    .463*
CB1       1    25     3      5     500    .508*   9/15*   7/15
          2    25     2      5     500    .664*  10/10*   9/10
          3    15     7      5     500    .280    6/15    8/15
          4    15     8      5     500    .456*   6/15    8/15

See text for explanation of headings.

Table 4.10: Inversion/Equilibrium Experiments.

appear next to 34 out of 78 entries in the table. (We might expect slightly fewer than 39 asterisks at random.) This is the bad news which makes it unnecessary for us to perform any more sophisticated analysis. We conclude that, over all the combinations of inversion tried, there is no distinct movement of genes together on the chromosome. The combinations mass/best/0.3 and continuous/best/0.1 are much more positive in their effects than the others, but they are not consistent. If we were to explore further it would certainly be with these parameters. However, the pair data for those combinations is weak, destroying whatever hope we might cherish of being able to detect movement when we did not know what we were looking for. In particular, although the counts and average distances exceed the thresholds set for them, they do not exceed them by much, certainly not by as much as many other samples do. Since the table speaks for itself, and since the results are not too positive, we will mercifully cut short any further analysis and end this section, saving discussion for the end of the chapter.

Evolving Tests

In the section of this chapter entitled "Evolving Populations" we showed that although differences in payoff due to gene distances did not always show up in the equilibrium state, they did show up consistently, very often significantly, in the early stages of evolution. In addition, the goodness of a permutation had an important effect on how close the population could get to the true maximum.

With this in mind we report here on a set of experiments which test combinations of the mating/inversion operators to see if any of them have an effect on permutations early in evolution. Picking a time at which to measure distances is a difficult task. Clearly, if there is a distance effect early but not later, running too long will destroy any adaptation of permutations due to the early effect. The values chosen (50, 150, and 250 generations for the various environments) seem reasonable. If the effect is there, but only to be observed at particularly special instants of time, it does not do us much good, since it is unlikely that we would discover this time accidentally. We do not wish to do an exhaustive search: we need a strong effect.

Table 4.11 reports on these efforts. Most of the trials were allocated to the combinations involving the "best-pattern" mating rule, since these had shown the most promise in the previous trials. Again, mass/best/0.3 showed well; but continuous/best/0.1 did not fare well and continuous/best/0.2 did only average (4 of 8 trials showed movement in the right direction). For MB3, 7 of 8 trials showed movement of the total group D-value in the right direction, but experiments 14, 15, and 16 exceeded the expected proportion of .433 by only .008, .013, and .011, hardly significant differences. In addition, the result for ENV 2 must be viewed with caution, since our previous experiments give little reason to expect success with this, the simplest of all our non-linear environments. Even counting ENV 2, the four positive experiments (MB3/10, 11, 12, and 17) may well be significant, but they represent success on only four of the six environments. A final source of discouragement is that the pair data do not reveal any reason to believe that detection of a group would be easy under these circumstances.
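Why excesses of .008-.013 are "hardly significant" can be made concrete with a rough binomial calculation. The sketch below (our illustration; it treats the 1000 sample chromosomes as independent, which, as noted repeatedly, they are not, so these z-scores are if anything optimistic) compares each observed excess over the expected proportion .433 with the null-hypothesis standard error:

```python
import math

# Expected proportion of chromosomes with D-value <= M under random
# positioning (the .433 figure for 6-gene groups), and the sample size.
p0 = 0.433
n = 1000  # 10 runs of 100 chromosomes each

# Standard error of a sample proportion under the null hypothesis.
se = math.sqrt(p0 * (1 - p0) / n)

# z-scores for the observed excesses in experiments MB3/14, 15, and 16.
for excess in (0.008, 0.013, 0.011):
    print(f"excess {excess:.3f}: z = {excess / se:.2f}")
```

All three z-scores fall well below the 1.645 needed for one-sided significance at the 95% level, even under the too-generous independence assumption.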

INV/MATE/
LEVEL    ID  ENV  # GENS  Prop<M  #>.547  #<8.67

MV2      10    9    150    .642*  11/28   13/28
         11   10    250    .429   14/28   12/28
MV3      10    9    150    .439   12/28   10/28
         11   10    250    .411   15/28*  13/28
MH2      10    9    150    .600*  11/28   13/28
         11   10    250    .457    8/28    8/28
MH3      10    9    150    .450   12/28    7/28
         11   10    250    .748*  19/28*  19/28*
MA2      10    9    150    .455   11/28   13/28
         11   10    250    .482*  11/28   10/28
MA3      10    9    150    .539*  19/28*  18/28*
         11   10    250    .454   15/28*  11/28
MB2      10    9    150    .240   12/28   10/28
         11   10    250    .418   13/28   13/28
MB3      10    9    150    .535*  10/28   11/28
         11   10    250    .644*  18/28*  16/28*
         12    2     50    .692*   7/10*   7/10*
         13    2    150    .488    5/10    5/10
         14    3     50    .451*   7/15    7/15
         15    3    150    .456*   9/15*  10/15*
         16    7    150    .454*   9/15*   7/15
         17    8    150    .561*  10/15*   9/15*

Table 4.11: Inversion/Evolution Experiments. (Cont'd)

INV/MATE/
LEVEL    ID  ENV  # GENS  Prop<M  #>.547  #<8.67

CV1      10    9    150    .322   12/28   16/28*
         11   10    250    .387   11/28   14/28
CV2      10    9    150    .297   13/28   14/28
         11   10    250    .328   11/28   12/28
CA1      10    9    150    .554*  10/28    8/28
         11   10    250    .454   19/28*  17/28*
CA2      10    9    150    .459   14/28   11/28
         11   10    250    .549*  16/28   18/28*
CB1      10    9    150    .531*  13/28   15/28*
         11   10    250    .349   13/28   10/28
         12    2     50    .561    5/10    5/10
         13    2    150    .472    3/10    3/10
         14    3     50    .441    8/15*   9/15*
         15    3    150    .418    8/15*   8/15*
         16    7    150    .392    6/15    7/15
         17    8    150    .505*   7/15    6/15
CB2      10    9    150    .446    8/28    7/28
         11   10    250    .423   14/28   11/28
         12    2     50    .602*   6/10*   6/10*
         13    2    150    .625*   8/10*   8/10*
         14    3     50    .451*  11/15*   9/15*
         15    3    150    .467*   6/15    8/15*
         16    7    150    .358    7/15    6/15
         17    8    150    .400    6/15    5/15

Table 4.11: Inversion/Evolution Experiments.

Summary

Much as we would like to claim confirmation of our hypothesis that dependent genes move together under the influence of inversion, we do not feel confident in doing so on the basis of the evidence garnered in this chapter. Of all the experiments performed involving combinations of the inversion operators and the mating rules, it appears that the combination mass/best/0.3 comes by far the closest to the behavior we sought. It produced possibly significant movement in three out of the four environments used in the migration equilibrium experiments and four out of six environments used in the early evolution experiments. But the failure of inversion in general to be stronger where it did move correctly, its failure to be universally successful, and the outstanding failure of the pair data to be useful all contributed enough negative output to discourage us from trying to overcome the difficulties of producing meaningful statistics from these highly non-independent samples. The MB3 data may show a trend or even be significant, but if so the significance is of low order and the results are not likely to be of great interest. Our efforts are best devoted elsewhere.

However, the results of the early evolution experiments must still be accounted successful. We can state with some confidence that position is an important factor in the rate at which populations advance from starting conditions far from the optimum (which is certainly likely to be the type of starting condition encountered in artificial work). More importantly, position can affect the system's likelihood of achieving the maximum value in a space.

That we cannot demonstrate the ability of inversion to take advantage of these conditions is regrettable. Two possible explanations for this failure come to mind, both of which suggest conditions under which inversion may work better.

The first is that we are working with chromosomes which are too short. Consider a non-linear group of order six. Given a single-crossover operator and a random positioning of genes on a chromosome of length twenty-five, the average distance between the left- and right-most genes of the group is about nineteen, so that crossover will occur within the group with probability .8 or so. If inversion manages to reduce the spread by a third to a distance of 12 or less (which has an a priori probability of .04), the probability of crossover splitting the group is still one-half. It may be that our experiments did not work because this reduction does not provide sufficient selective advantage to overcome the stochastic effects of the adaptive system. If so, problems with much longer chromosomes might find inversion more useful. This suggests an experiment in which many superfluous genes are added to the chromosome to provide enough length that groups have a very wide range of probabilities of being split in different permutations.

The second possible reason for inversion's non-effectiveness is the shortness of the time in which inversion has to act while position is important in early evolution. In our experiments this was anywhere from 25 to at most 150 generations. Considering the mass inversion operator at a level of 0.3, only 45 different inversions occur, on the average, in 150 generations. If even a third are advantageous, and if three-quarters of those are able to overcome the stochastic effects of the system, there are only 11 advantageous permutations in that time. From our experience this is not enough to do the job. Inversion may be effective only in systems in which significant adaptation continues for a very long time. Such systems may be very much more complicated in terms of numbers of genes and alleles, they may be diploid with dominance adding to the adaptive problem, or the environment may be non-stationary. All these possibilities suggest a whole new series of experiments. At any rate, the subject is not closed; the early evolution experiments are too positive for that.
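The spread and split probabilities quoted for a six-gene group on a 25-gene chromosome can be checked by exact enumeration over all placements of the group. This is an illustrative sketch, not the thesis's program; it assumes a single crossover cut point chosen uniformly from the 24 inter-gene boundaries, which splits the group whenever it falls strictly between the group's left- and right-most genes:

```python
from itertools import combinations
from math import comb

CHROM_LEN = 25
GROUP = 6
CUTS = CHROM_LEN - 1  # 24 possible single-crossover cut points

total = comb(CHROM_LEN, GROUP)
span_sum = 0.0
split_sum = 0.0
tight = 0  # placements whose span (left-most to right-most) is 12 or less

for genes in combinations(range(1, CHROM_LEN + 1), GROUP):
    span = genes[-1] - genes[0] + 1        # inclusive spread of the group
    span_sum += span
    split_sum += (span - 1) / CUTS         # P(cut lands inside the span)
    if span <= 12:
        tight += 1

print(round(span_sum / total, 1))   # average spread, about nineteen
print(round(split_sum / total, 2))  # split probability, about .8
print(round(tight / total, 2))      # a priori P(span <= 12), about .04
```

The enumeration reproduces all three figures cited in the text: an average spread just under 20, a split probability near .8, and an a priori probability of about .04 that a random placement has spread 12 or less (at which spread a single crossover still splits the group with probability 11/24, roughly one-half).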

CHAPTER FIVE

FREQUENCY EFFECTS

Existence of a Frequency Effect

The complete analysis of frequency effects for a simple two-gene model (given in Chapter Two) is difficult to extend to a complex environment and a complex adaptive system involving stochastic genetic operators. Most work to this point merely assumes that effects demonstrated for two genes do indeed generalize to the more complicated case. ("It is intuitively obvious that...") However, if we hope to detect evidence of non-linearities using this idea, it would be well to establish the fact, which we shall call the IFC (Increased Frequency of Combination) effect.

Chi-Squared Analysis

The chi-squared goodness-of-fit statistic is a simple test of how well a set of observations of a random variable agrees with a probability distribution. The distribution defines classes (or ranges) of the random variable called cells, and the observations are assigned to those cells. When the classification is the cross product of two or more independent criteria, the cells are arranged in a table and the test is then called the chi-squared contingency test. The degrees of freedom in any chi-squared test is equal to the number of cells, minus the number of parameters estimated from the data, minus one. Thus, in a 2 x 2 table in which the probability of occurrence of each class is estimated from the data there are

2 x 2 - 2 - 1 = 1 degree of freedom. Similarly, in a 2 x 2 x 2 x 2 x 2 table (which we will call a five-way table) there are 2^5 - 5 - 1 = 26 degrees of freedom.

The nature of the contingency table test is to detect any departure from linearity; the hypothesis is that the frequency of the cross-product classification can be predicted by the estimated frequencies of the individual classes. The statistic

    X^2 = SUM over all cells i of (Expected_i - Observed_i)^2 / Expected_i

is then distributed according to the chi-squared distribution with the appropriate degrees of freedom. If the statistic exceeds the critical value at the given confidence level we may reject the hypothesis of linearity: there is some interaction between the classes. In addition to being a statistical test, the observed chi-squared value may be used as a measure of association among the variables. That is, the magnitude of the statistic calculated above may serve as a ranking of the degree of association.

A key point in the chi-squared test, as in most statistical work, is that the observations are required to be independent samples of the random variable. We will immediately run into difficulty on this account. Another important point which will be a source of difficulty is that, for the test to have validity, authorities recommend that the expected number of items per cell be at least five for 80% of the cells and never less than one (14).
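The statistic just defined, with the linearity hypothesis supplying expected cell counts from the product of marginal allele frequencies, can be sketched for an n-way table over two-allele genes as follows. This is an illustrative implementation under our own naming, not the thesis's program:

```python
from itertools import product

def group_chi_squared(samples):
    """Chi-squared association index for a group of binary genes.

    samples: list of equal-length 0/1 tuples, one per chromosome.
    Expected cell counts come from the product of the marginal allele
    frequencies, so the statistic measures departure from linearity.
    Returns (chi_squared, degrees_of_freedom).
    """
    n_genes = len(samples[0])
    n = len(samples)

    # Marginal frequency of allele 1 at each gene, estimated from the data.
    freq1 = [sum(s[g] for s in samples) / n for g in range(n_genes)]

    # Observed count for each of the 2^n cells.
    observed = {}
    for s in samples:
        observed[s] = observed.get(s, 0) + 1

    chi2 = 0.0
    for cell in product((0, 1), repeat=n_genes):
        p = 1.0
        for g, allele in enumerate(cell):
            p *= freq1[g] if allele == 1 else 1.0 - freq1[g]
        expected = n * p
        if expected > 0:
            chi2 += (expected - observed.get(cell, 0)) ** 2 / expected

    # cells, minus estimated parameters (one frequency per gene), minus one
    dof = 2 ** n_genes - n_genes - 1
    return chi2, dof

# A two-gene example: alleles co-occur far more often than independence
# predicts, so the statistic is large relative to its 1 degree of freedom.
data = [(0, 0)] * 40 + [(1, 1)] * 40 + [(0, 1)] * 10 + [(1, 0)] * 10
chi2, dof = group_chi_squared(data)
print(chi2, dof)  # 36.0 1
```

In the example both marginal frequencies are 0.5, so every cell has an expected count of 25; the observed counts of 40, 10, 10, 40 give X^2 = 4 x (15^2/25) = 36, far beyond the 99% point of 6.64 for 1 degree of freedom.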

Multiple-Gene Tests

In applying contingency table analysis to our situation we find that a gene is a component class and that alleles are the possible values for the class. In determining whether there is interaction among n genes, we use an n-way contingency table, i.e., a table with 2^n cells and 2^n - n - 1 degrees of freedom. We shall call the resulting statistic a Group Chi-squared Value (GCV) or an association index.

Now, if each of the alleles were to have a frequency of 0.5, then in order to satisfy the requirement that each cell have an expected value of 5, we would need (on the average) 5 x 2^n samples. In the case of n = 5 this is 160. If some of the allelic frequencies are lower we need a correspondingly higher number of samples. In any event, in order to calculate the chi-squared statistic for more than 3 or 4 genes at a time we require more samples than are available from our basic 100-individual population. To handle this requirement, and to satisfy the requirements of randomness as much as possible, we will adopt the following experimental protocol:

1. Under a given set of parameters, 10 populations will be run to a terminal state, producing 1000 samples.

2. To minimize the effects of the starting condition, a new population (i.e., a new set of alleles for each individual) will be generated at the beginning of each run.

3. In some experiments, to minimize the effects of the original gene permutation, a new initial permutation of genes will be generated at the beginning of each run.

Table 5.1(a) displays the parameters used in the H, I, J, K, and L

EXP   ENV   # Generations   Crossover*   Gene Permutations across runs**

H1     2         8             One          All same
H2     2         8             One          Random
H3     2         8            .100          Random
H4     2         8            .250          Random
H5     2         8            .500          Random
H6     5         8            .100          Random
H7     5        25            .100          Random
H8     5        50            .100          Random
I1     1         4             One          All same
I2     1         4             One          Random
I3     1         4            .100          Random
I4     1         4            .250          Random
I5     1         4            .500          Random
J1     3        16             One          All same
J2     3        16             One          Random
J3     3        16            .100          Random
J4     3        16            .250          Random
J5     3        16            .500          Random
J6     6        16            .100          Random
J7     6        25            .100          Random
K1     7        25             One          Random
K2     7        25            .100          Random
K3     7        25            .250          Random
K4     7        25            .500          Random
K5     7        50            .167          Random
K6     7        16            .167          Random
K7     7       100             One          Random
K8     7       150             One          Random
L1     8         8            .167          Random
L2     8        16            .167          Random
L3     8        25            .167          Random
L4     8        16             One          All same
L5     8        50             One          Random
L6     8       100             One          Random

Mutation = .005    Inversion = None    Migration = None

*One => one-crossover. A number => probabilistic crossover with the given probability.
**All same => permutation (1,2,...,24,25) used in all 10 runs. Random => a different random permutation used in each of the 10 runs (see Appendix B).

Table 5.1(a): Multi-gene X^2 Tests: Experimental Parameters.

Genes Tested         Degrees of Freedom     Experimental X^2 Observed**

                               H1    H2    H3    H4    H5    H6    H7     H8
*1,2,3,4,5          26       7225  5899  3336  3681  2209    86  2760  15786
21,22,23,24,25      26         91    84    49    32    32    62    73     86
10,11,12,13,14      26        104   158    62    36    32    27    68    151
6,10,15,20,25       26         41    93    44    62    55    42    51     61
1,7,13,19,25        26         86    44   122    86    32    36    81     94

                               I1    I2    I3    I4    I5
1,2,3,4,5           26         98    55    23    39    21
21,22,23,24,25      26         60    21    34    42    20
10,11,12,13,14      26        105    56    39    26    33
6,10,15,20,25       26         48    53    53    38    42
1,7,13,19,25        26         33    51    60    30    28

                               J1    J2    J3    J4    J5    J6    J7
*1,2,3,4,5,6        57       1026   833   324   395  1013   210  1338
20,21,22,23,24,25   57       190+   117   65+    77    82   108    61
10,11,12,13,14,15   57        214   136    97    87    62    94    97
1,6,10,15,20,25     57        116   179   110   129   125   107   228

Table 5.1(b): Multi-gene X^2 Tests: Results of First Runs. (Cont'd)

Genes Tested         Degrees of Freedom     Experimental X^2 Observed**

                               K1    K2    K3    K4     K5    K6      K7     K8
*1,2,3,4,5,6        57       1313   380   320   130   7581    79  212759  63103
*7,8,9,10,11        26         69    70   112   177     58    51    1212   1032
*13,14,15,16,17,18  57        384   104   160    82    504    89    2455   3166
*19,20,21            4          1    62    29     2     90    12      78     17
*22,23,24,25        11         65    44   112    83     62    63     628    677
1,7,13,19,25        26         84    79    86    70    231    58     283    404
2,8,14,20,24        26        107    50   136    45   128+    62     157    433
1,2,7,8,13,14       57        296   100   252   140    354   135    1867   2196

                               L1    L2     L3    L4      L5      L6
*1,2,3,4,5,6        57        158   176    621   775   2426+  23036+
*7,8,9,10,11,12     57         98    89    491   811   4808+  15894+
*13,14,15,16,17,18  57        220  379+   616+  1137  21396+   16401
19,20,21,22,23,24   57         57   120   160+   214      86    174+
1,7,13,19,22,25     57         86    75    141   152    369+    231+
2,9,18,20,21,23     57         81   125    188   110     210    480+
3,12,15,19,21,24    57         95   101    154   114     188     151

*Dependent group of genes
**1000 sample points (10 runs) in each experiment
+Failed X^2 cell-size requirements

Table 5.1(b): Multi-gene X^2 Tests: Results of First Runs.

Genes Tested         Degrees of Freedom     Experimental X^2 Observed**

                              H1'   H2'   H3'   H4'   H5'   H6'
*1,2,3,4,5          26       5804  2908  3120  3268  4771    87
21,22,23,24,25      26         80    68    53    27    41    61
10,11,12,13,14      26         84    62    58    41    27    27
6,10,15,20,25       26         41    69    68    53    43    39
1,7,13,19,25        26         62    47    94    64    45    50

                              I1'   I2'   I3'   I4'   I5'
1,2,3,4,5           26        91+   47+   22+   50+   37+
21,22,23,24,25      26         78    28    35    43    42
10,11,12,13,14      26         59    34    27    21    28
6,10,15,20,25       26         22    45    53    37    24
1,7,13,19,25        26         44    59    72    26    30

                              J1'   J2'   J3'   J4'   J5'   J6'
*1,2,3,4,5,6        57       1708  158+   328   384   140   263
20,21,22,23,24,25   57       176+  167+   98+    74   75+   135
10,11,12,13,14,15   57       234+  101+   98+  107+   60+    99
1,6,10,15,20,25     57        216  140+   106   147    67    99

                              K1'   K2'   K3'   K4'   K5'   K6'
*1,2,3,4,5,6        57        336   486   240   385  8214   106
*7,8,9,10,11        26        112    84   114    48   114    53
*13,14,15,16,17,18  57        253   120   214   164  424+    91
*19,20,21            4         11    56    54    26    92    16
*22,23,24,25        11         33    78   101    13    84    41
1,7,13,19,25        26         94    83   130    74   188    55
2,8,14,20,24        26         52    58   128    44   122    55
1,2,7,8,13,14       57        261   108   214   146  300+   133

                              L1'   L2'   L3'   L4'
*1,2,3,4,5,6        57        161   214  750+   949
*7,8,9,10,11,12     57        112   92+   463   678
*13,14,15,16,17,18  57        184   266   420  1414
19,20,21,22,23,24   57         69  100+  153+  267+
1,7,13,19,22,25     57         90    84  159+   190
2,9,18,20,21,23     57         85   115  239+   234
3,12,15,19,21,24    57         96  121+  130+   147

Table 5.1(c): Multi-gene X^2 Tests: Results of Second Runs.

Degrees of Freedom   X^2_.50   X^2_.95   X^2_.99   X^2_.9995

        1              0.46      3.84      6.64     12.12
        4              3.36      9.49     13.28     20.00
       11             10.34     19.68     24.73     33.14
       26             25.34     38.89     45.64     56.41
       57             56.34     75.61     84.71     98.75

Table 5.2: Selected Points of the X^2 Distribution.
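The tabulated critical points can be checked numerically. The sketch below is our illustration (using only the standard library, not any program from the thesis): it computes the chi-squared CDF from the standard series for the regularized lower incomplete gamma function and evaluates it at a few of the entries in Table 5.2:

```python
import math

def chi2_cdf(x, k):
    """CDF of the chi-squared distribution with k degrees of freedom,
    via the series expansion of the regularized lower incomplete gamma
    function P(k/2, x/2)."""
    a, t = k / 2.0, x / 2.0
    if t <= 0:
        return 0.0
    # First series term: t^a e^(-t) / Gamma(a + 1), computed in log space.
    term = math.exp(a * math.log(t) - t - math.lgamma(a + 1.0))
    total = term
    n = 0
    while term > 1e-15 * total:
        n += 1
        term *= t / (a + n)  # each term multiplies the last by t/(a+n)
        total += term
    return min(total, 1.0)

# Critical values from Table 5.2 should sit at their nominal quantiles.
for x, k in [(3.84, 1), (38.89, 26), (45.64, 26)]:
    print(k, x, round(chi2_cdf(x, k), 2))  # 0.95, 0.95, 0.99
```

For example, 45.64 at 26 degrees of freedom evaluates to a CDF of 0.99, matching the 99% column used throughout the significance arguments in this chapter.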

[Figure 5.1(a) reproduces a program printout (Case 393, 01:17, 06/23/72): analysis at the end of 16 generations, same permutation each run, 10 copies. Genes 1-6, with allele frequencies .42, .44, .51, .42, .56, .48. For each of the 64 allele combinations the printout lists the expected and observed counts; the underlined points are the peaks of the environment. CHISQUARE = 1026.23, DEGREES OF FREEDOM = 57.]

Figure 5.1(a): Sample X^2 Multi-gene Test: Experiment J1.

[Figure 5.1(b) reproduces a program printout (Case 404, 01:43, 06/23/72): analysis at the end of 16 generations, new permutation each run, 10 copies. Genes 1-6, with allele frequencies .48, .50, .57, .39, .55, .55. For each of the 64 allele combinations the printout lists the expected and observed counts; the underlined points are the peaks of the environment. CHISQUARE = 394.87, DEGREES OF FREEDOM = 57.]

Figure 5.1(b): Sample X^2 Multi-gene Test: Experiment J4.

[Figure 5.2(a) reproduces a program printout (Case 418, 06:16, 07/10/72): analysis at the end of 50 generations, new permutation each run, 10 runs. Genes 7-12, with allele frequencies .14, .13, .80, .18, .70, .76. For each of the 64 allele combinations the printout lists the expected and observed counts; the underlined points are the peaks of the environment. CHISQUARE = 4807.74, DEGREES OF FREEDOM = 57. This set of data does not satisfy the X^2 test requirement.]

Figure 5.2(a): Sample X^2 Multi-gene Test: Experiment L5.

[Figure 5.2(b) reproduces a program printout (Case 418, 06:19, 07/10/72): analysis at the end of 50 generations, new permutation each run, 10 runs. Genes 3, 12, 15, 19, 21, 24, with allele frequencies .73, .76, .67, .81, .81, .70. For each of the 64 allele combinations the printout lists the expected and observed counts. This group of genes is linear with respect to payoff. CHISQUARE = 187.94, DEGREES OF FREEDOM = 57. The data do not satisfy the X^2 test requirement.]

Figure 5.2(b): Sample X^2 Multi-gene Test: Experiment L5.

series of experiments analyzed below. In all experiments the mutation rate was .005, there was no migration, and no mating rule was needed since there was no inversion. Single crossover is indicated by "One" and probabilistic crossover is indicated by the probability used. An initial gene permutation of (1,2,3,...,24,25) for each run is indicated by "All same", and "Random" indicates that a different, random initial permutation was used for each run. The 10 random permutations used may be found in the Appendix. In each series, several groups of genes were tested for their association indices after a given number of generations. (Figures 5.1 and 5.2 are sample outputs from the program which calculates the chi-squared value.) Most experiments were run twice (with different random number starters); the GCV results are given in Tables 5.1(b) and 5.1(c). The analysis below refers specifically to the first run (5.1(b)), but the second-run results are qualitatively the same. Table 5.2 contains selected values of the chi-squared distribution for reference.

The first set of experiments involved ENV 2 (Figure 3.2). H1 used one-crossover and the initial permutation of genes in numeric order (1,2,...,24,25). Analyses were made of five combinations involving five genes each: the one combination of dependent genes and four control combinations of independent genes. The degrees of freedom for a five-way table are 26 and X^2_.99(26) = 45.64. Clearly the observed chi-squared value of 7225 for the combination (1,2,3,4,5) is significant, as expected. However, the gene combinations (21,22,23,24,25), (10,11,12,13,14), (6,10,15,20,25), and (1,7,13,19,25), in spite of having much lower chi-squared values, are also significant at the 99% level, although the second last was not significant at the 99.95% level.
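One mechanism behind the inflated control-group statistics, discussed next, is chromosome adjacency under the one-crossover operator: a contiguous block is rarely cut, while a combination spread over the whole chromosome is cut on essentially every mating. Under our simple model (a cut point uniform over the 24 boundaries of the 25-gene chromosome, splitting a combination whenever it falls inside the combination's span), the contrast is stark:

```python
# Probability that a single crossover separates at least two members of a
# gene combination, assuming the cut point is uniform over the 24
# inter-gene boundaries of a 25-gene chromosome (our illustrative model).
CHROM_LEN = 25
CUTS = CHROM_LEN - 1

def split_probability(positions):
    span = max(positions) - min(positions) + 1  # inclusive left-to-right span
    return (span - 1) / CUTS

adjacent = [10, 11, 12, 13, 14]   # contiguous control block: rarely split
scattered = [1, 7, 13, 19, 25]    # spans the chromosome: always split

print(round(split_probability(adjacent), 3))   # 0.167
print(round(split_probability(scattered), 3))  # 1.0
```

Under this model the adjacent block survives about five of every six crossovers intact, so its genes hitchhike together and accumulate spurious association, whereas the scattered combination is separated on every crossover.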

The high degree of significance even among non-interacting genes points out our violation of the requirement of independent sampling. Each member of generation n is formed by a (limited) mixing of the genes from two parents of generation n-1. Since selection of parents is biased (by the reproductive scheme) toward those individuals with high payoff, many individuals in the next generation automatically bear a resemblance to the parents with highest payoff. This effect is desired and expected as the means of exploiting non-linearities. What is surprising is its magnitude. Genes other than 1, 2, 3, 4, and 5 are non-adaptive (neutral) and may be expected to assort randomly. However, other effects of the experimental parameters (one-crossover and every individual having the same gene permutation) are such that large portions of the chromosomes vary together. We note that genes 10-14 and 21-25, which are adjacent on the chromosome (and thus are split infrequently by the one-crossover operator), have a larger chi-squared value (association index) than do the other two control combinations, parts of which are separated by practically every crossover.

Two steps were taken to reduce the association of non-interacting genes while at the same time (hopefully) maintaining that of the truly interacting genes. In experiment H2, each of the ten runs from the initial random population was made with a different permutation of genes. (Ten random permutations were used. See Appendix.) The intention is to reduce the adjacency effect. The result was to lower somewhat the association of genes 1-5; but the association of the other gene combinations was not changed greatly: three of the four remained significant at the 99.95% level. The second step, in experiments H3, H4, and H5, was to use the

probabilistic crossover operator at levels 0.100, 0.250, and 0.500, again to reduce the adjacency effect by reducing the length of the portions of the chromosome which were taken from one parent at any time. As Table 5.1(b) shows, the effect of probabilistic crossover is apparent in most of the groups. The interacting group's chi-squared value is lowered to about 30% of its previous (H2) value at P = 0.5 in experiment H5. Crossover values of 0.1 and 0.25 seemed to have some effect on the non-interacting combinations, but a value of 0.5 finally reduced the GCV for three of them to below the 95% point, and the fourth was below 99.95%. As a further experiment in this series, H6 used ENV 5, which is the same as ENV 2 but with reduced selection (the payoff range is 6-8 instead of 1-3). H6 also used probabilistic crossover, P = 0.1. The reduction in the GCV for the interacting genes is dramatic. The value is still well above the values for non-interacting genes, but the association is a factor of 25-90 below previous values, which may be cause for difficulty in more complex environments. Inspecting the data from H6 more carefully, we observe that because of the decreased selection there has not been much evolution: the average tendency toward fixation (difference from 50%) is 3% for the 5 dependent genes. Experiment H3, which had comparable parameters but used the higher selection rate, averaged over 5% fixation tendency. Comparing the runs directly (adjusting for the different selection factors), populations in experiment H6 had an average payoff of 1.62 at the end of 8 generations while those in experiment H3 averaged 2.25. H6 did not approach H3's payoff average until it was allowed to continue to about 30 generations. Accordingly, experiments H7 and H8 were run at lengths of 25 and 50

generations, otherwise keeping the same parameters as H6. Table 5.1(b) again shows positive results (increased GCVs for the dependent group). Because selection alters the rate of evolution, we must take care in choosing the times at which to make our analysis. Gene frequencies may guide us in this regard (tendency toward fixation may mean that significant evolution has occurred), although population averages may be better. In summary, the H series shows that chi-squared values are always higher for the non-linear group than for others, and usually much higher. Before going on to more complex environments, we shall go to one simpler yet, ENV 1, which has only 5 linearly acting genes (1-5). We use this as a basis of comparison to see whether the observed chi-squared values for the H-series were merely the result of selection on the only adaptive genes, or whether the non-linear interaction is what caused the effect. Experiments I1-I5 follow exactly the same course as H1-H5, and the same gene combinations were analyzed. (Note that these experiments continued for only four generations rather than the eight generations of the H-series. The linearity of ENV 1 allowed such rapid adaptation that gene frequencies approached 1.0 much more quickly, making the chi-squared analysis less accurate due to the reduced size of the expected values in each cell. Doing the analysis earlier catches the frequencies before they get too low. Although we have already noted that some associations do not become apparent until a longer period of evolution has taken place, we can feel reasonably confident that we are not losing information in this case. When the gene frequencies tend to fixation so fast, combinations

involving the other alleles will not have much effect, even later. We automatically suspect linearity when we observe this behavior.) The results are positive: all the associations observed are of the same order of magnitude, and the values for the adaptive genes (1-5) are often lower than the values for the non-adaptive combinations. In addition, the lowering of the association index across the experiments designed to do just that seems to follow the same course as in the H-series. In general the associations are below the values in the H-series even though the selection is greater, perhaps indicating that some of the non-adaptive combinations in H were affected by something we might call a "tag-along" effect: selection of better chromosomes emphasizes the non-adaptive gene combinations on the chromosome as well as the adaptive gene combinations. While much of this should be smoothed by random effects over time, it is probably present at every instant. Another possibility is that the smaller number of generations did not allow as much build-up of associations. Most of the association values ended up less than statistically significant. In any case, a linear payoff does not present the same behavior as a non-linear payoff. The J-series of experiments used ENV 3 (ENV 6 in the reduced selection experiment J6), which consists of 6 dependent genes and 19 genes producing a linear payoff. As such it differs from ENV 2 by having a larger dependent group and by having the remaining genes all adaptive; it also has reduced selection on the dependent group because of the larger number of genes contributing to payoff. J1-J7 followed exactly the same course as H1-H7, producing somewhat similar results. J1 produced a huge association index for the dependent genes, which were adjacent on the chromosome, and much smaller associations for adjacent, independent

genes. Use of random initial permutations (J2) and lower selection (J6) reduced the associations, as did use of probabilistic crossover; longer runs increased the association (J7). The J-series did not have the same nice progression as H3-H5 in the reduction of the association with larger recombination values. This deserves some comment. We recall that the intention in using probabilistic crossover was to decrease the direct association between adjacent genes by splitting them up more often than in the one-crossover case. However, it still remains the case that when a split (crossover) occurs, the next gene is taken from the other parent. When this happens twice, the source of genes for the new chromosome is again the original parent. In the extreme case of crossover probability 1.0, every other gene is taken from the same parent. Given an initial gene permutation of (1,2,...,25), the result is the same as if the permutation were (1,3,5,7,...,23,25,2,4,...,22,24) and there were again exactly one crossover, but always between positions 13 and 14 (genes 25 and 2). Thus, higher values for the probability of crossover actually reduce the disassociation desired. In addition, to allow the inversion operator to produce permutations which are advantageous, crossover must be fairly low. No matter what the means of mixing genes, there will always be a residue of association caused by using only two parents as sources. Since we are not interested in finding optimum settings for parameters, we will not attempt to determine which value of crossover might be "best" for this particular purpose; use of probabilistic rather than one-crossover might well be indicated for longer chromosomes. At any rate, the optimum setting might depend on the number of genes in a

dependent group, on the selection coefficients, or any number of other factors. By inspection of the data for the H, J, and K experiments, it appears that levels around 0.1 seem to be effective, and are probably low enough to allow inversion to work, so that we would recommend levels on this order in future work. In the next set of experiments, K1-K4 follow the pattern of H2-H5 and J2-J5, producing somewhat similar results again, using ENV 7, our most complex environment. The five dependent groups mostly show large association indices, but the two groups of five mutually independent genes, (1,7,13,19,25) and (2,8,14,20,24), are not consistently larger or smaller than the comparable dependent group, (7,8,9,10,11). These groups contain one gene from each of the dependent groups. While we expect the groups to vary independently since payoff is additive among them, we should not be too surprised if one group dominates the others (in terms of selective pressure), leading to a strong tag-along effect at specific times of evolution. The group (1,2,7,8,13,14) contains 2 genes each from 3 dependent groups. It shows higher indices than the independent groups and often exceeds the dependent groups in strength of association. These somewhat clouded results give some idea of the difficulty involved in very complex environments. Actually, even after 25 generations, the degree of improvement from the initial random population was not great. (We do not reproduce the data here.) Because of the large amount of interaction, the population average did not advance very rapidly, and the best strings produced were much below the optimum. As a measure of the amount of adaptation, the tendency toward fixation of all the individual gene frequencies (i.e., difference from 50%) at the end of 25 generations, averaged

across the 10 runs for experiment K2, was only about 8%. To get some feel for the possible effects of greater adaptation, K5 continued for 50 generations. A higher degree of adaptation was seen (gene frequency fixation averaged 11%) and the associations were much larger (compared to K2 or K3, which had similar crossover probabilities). One factor making the associations larger was the additional non-independence of sampling produced by the longer evolution times, as evidenced by the increase even among the independent groups. K6 measured the associations at only 16 generations, where gene frequency fixation averaged 6%. As expected the associations were mostly lower, with the two completely independent groups falling in the same range as the comparable dependent group. K7 and K8 continued for 100 and 150 generations. Here the dependent group clearly dominated the two control groups (in K7, 1212 versus 283 and 157). The groups of six dependent genes also dominated the control group of six, and the group of four dependent genes dominated even the control groups of five, which could be expected to be higher. It seems that increasing the lengths of runs (at least in this range) definitely alters the relative association indices. Short runs may be better for detecting differences in smaller groups, although the evidence is inconclusive, given the failure of the small group (19,20,21). In longer runs the small groups may be almost completely adapted and thus contribute less to the payoff. They then have higher gene frequencies and fewer members searching false peaks, so that the chi-squared calculation appears to yield smaller association indices. The K-series used an environment in which every gene but one was included in a dependent group, so that it was difficult to get an idea of the relative magnitude of the associations in dependent and independent

groups. In the last series of experiments on this topic, L1, L2, and L3 used ENV 8, which consists of three groups of 6 dependent genes, each group identical to the dependent group of the J-series. The remaining 7 genes are linear, allowing plenty of independent gene groupings. Random permutations and probabilistic crossover were used. The three experiments show the results of the analysis at the end of 8, 16, and 25 generations. By the end of 8 generations there has not been enough time for all associations to build up, but by generation 16 they are more plainly there, and at the end of 25 generations they are unmistakable. The five control groups are well below the dependent groups, and associations in these five groups seem to increase with a larger number of generations, as indicated by the K-series. L4 again returned to one-crossover and numeric order for the genes: all dependent groups had their genes adjacent for the 10 runs. The ratio of association between the dependent and independent groups was very large, as expected. Experiments L5 and L6, with single crossover and random permutations, continued for 50 and 100 generations, producing even larger ratios than L4. As noted in Table 5.1(b), every GCV calculated violated the strict rules for the statistical test. A large degree of adaptation took place even by generation 50, producing a large tendency to gene fixation and resulting in expected counts of combinations below 1.0 for some cells in the contingency table. Such cells contribute inordinately to the GCV, ruling out its use as a pure statistic. However, we note that the GCV ratio is still large in exactly the manner we had anticipated, indicating that use of the GCV is still valuable in this situation. Examination of the particulars of such a table is

often valuable for determining the importance of such deviance from strict form. See Figures 5.2(a) and 5.2(b) for two such tables, one for a dependent group and one for an independent group.

Summary

The results of the experiments described in this section are extremely satisfying. First, they verify the generalization of the simple two-gene model prediction of frequencies to a complex multi-gene model with genetic operators. Secondly, the K and L experiments indicate that when there is more than one group of dependent genes, the groups tend to vary independently, emphasizing their own best combinations simultaneously. This bodes well for the ability of this analysis to handle even more complex, real-world problems. Another pleasant result is that the Increased Frequency of Combination (IFC) effect manifests itself to a significant degree very rapidly. Twenty-five generations were enough to show it even in some of the complex cases tested. This means analysis need not take an unreasonable length of time; certainly no more than adaptation itself takes. One somewhat disappointing result is that we are not able to use the chi-squared contingency test as a pure statistical method of detecting non-linearities. The experimental data obtained violate the requirements of independence in spite of all our efforts in manipulating parameters and initial conditions. This is a direct consequence (and intended operation) of the Reproductive Plan. Even independent groups will always show strong associations with respect to statistical significance. On the other hand, using the chi-squared statistic as an index of

association appears to be a completely adequate discriminator. One extremely important result is that the association ratios observed between dependent and independent groups were highest when the genes in the groups were closest together. Figures 5.1(a), 5.1(b), and 5.2(a) show that the reason for a high association index is that there are many individuals on or near both of the peaks of the environment. When a (linear) prediction of combinations from individual gene frequencies yields a high value for one combination, it must necessarily give a lower value for other combinations. When there are two peaks and individuals cluster on both, the chi-squared value grows large from the contributions of at least one of the peaks. In the cases shown here (which are completely typical), the observed count on both peaks considerably outdistanced the predicted count, and the disparity was larger for the case where the genes were adjacent. The only possible conclusion is that very good combinations are separated and destroyed much less often when the genes are closest. As a result the search concentrates around local maxima in the environment. This is important supporting evidence of the potential value of inversion in promoting favorable permutations. Since efforts to create conditions satisfying the chi-squared independence requirement were a failure, and since such efforts reduced the association ratios of dependent to independent groups, it appears that the best discrimination (and probably the best adaptation) can be obtained using the single crossover operator (or at least low probabilities of the probabilistic crossover operator for larger chromosomes) and permutations in which suspected genes are adjacent. Given a particular hypothesis about which genes are dependent, we can then suggest the following experimental procedure. Form several

control groups with the same number of genes as in the suspected group; choose these groups from genes known (or thought) to be independent. Use the reproductive plan with low mutation, single crossover, and multiple runs, each with the same permutation of genes such that all genes from a single group are adjacent on the chromosome. Perform runs of various lengths. Comparing the chi-squared values from the "independent" groups with the value for the suspected group, we can accept the hypothesis of dependence if there is a large (say, order of magnitude) difference.

Detecting Dependent Groups

From the results of the last section we are now certain that the association indices of dependent groups of genes are much larger than indices for comparable, independent groups of genes. However, we are still faced with a large problem: how to form hypotheses about which groups of genes to test in environments about which we know little. Even in the relatively simple (25-gene) environments used in this research, if we are looking for groups of size 6, there are C(25,6) = 177,100 possible groups. Obviously some screening method must be used. One such method might be the experimenter's knowledge or hunches about the environment. Other screening methods are discussed below. But first a note of warning. The process of hypothesis formation (induction) on scanty data is an arcane art at best. Even 500 generations of 100 individuals sample at most 0.15% of the total payoff points in our environments of size 2^25 = 33,554,432. It is unlikely we shall be able to discover every interaction. But then, any information we can discover is more than is otherwise available.
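The counts quoted above are easy to verify; a quick check, added here for illustration:

```python
from math import comb

# number of 6-gene groups among 25 genes
assert comb(25, 6) == 177_100

# total payoff points in a 25-gene binary environment
assert 2 ** 25 == 33_554_432

# 500 generations of 100 individuals sample at most this fraction
fraction = 500 * 100 / 2 ** 25
assert round(100 * fraction, 2) == 0.15   # about 0.15%
```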

Pair Analysis

Although testing every combination of 5, 6, or 7 genes requires an unreasonable amount of analysis, calculating contingency tables for every pair does not; there are only C(25,2) = 300 pairs of genes on a 25-gene chromosome. This immediately suggests an interesting possibility. Consider a group of 5 interacting genes. There are C(5,2) = 10 pairs of genes from this group. If the group has a relatively high chi-squared value, it might be that every one of the pairs does also. If indeed these pairs have unusually high chi-squared values when compared to control pairs, we have a valuable screening tool. The intended usage is as follows: from the 300 pair values it may be possible to select a group of n genes, all of whose C(n,2) Pair Chi-squared Values (PCVs) are comparatively large. Then, using the techniques suggested in the previous section of this chapter, we may test the hypothesis that these genes interact as a group by means of an n-dimensional contingency table, obtaining the GCV, and making the final decision based on this value.

Analysis at the End of Runs

The most obvious method for detecting interacting pairs follows the experimental procedure of the last section exactly. Ten runs of each experiment were performed under the experimental parameters of Table 5.1(a). A chi-squared contingency table analysis was performed on each of the 300 pairs of genes over the 1000 sample points. The result of the analysis for experiment H1 is shown in Figure 5.3. Genes 1-5 are known to be interacting and the average of the 10 PCVs

in this group (in the triangle) is 481 while the average for the remaining 290 pairs is 7. Clearly this is positive confirmation of our hopes. There is no doubt about which genes should be tested for 5-way interaction. Such a test is probably not even required. Of course, H1 was run with genes 1-5 adjacent, a piece of prior knowledge known to increase the IFC effect. But analyses of experiments H2-H5 yielded very similar results; although the numbers were slightly smaller, there was still no difficulty in determining the correct set to try (Figure 5.4 gives the results of experiment H2). These results match well with the GCVs obtained in Tables 5.1(b) and 5.1(c). Figure 5.5, stating the results of experiment H6, has a different tale to tell. The average chi-squared value for the 10 pairs known to be interacting is 6.9 and the average value for all other pairs is 2.1, so that again our hypothesis is confirmed: high GCVs are reflected in high PCVs. But alas, mere inspection of the data does not yield the best information on which genes to try. It would be easy to choose a set of five genes whose pair associations averaged higher than 6.9; for example the group (8,10,14,17,23) has an average PCV of 7.6. Thus, the decreased GCV ratio shown in Table 5.1(b) definitely reflects itself in the PCVs. Again, more evolution is called for. Figure 5.6 shows the pair analysis at generation 50. The results, under these conditions, are just as positive as for the previous experiments. Figure 5.7(a) contains the pair analysis for experiment I3, ENV 1, a typical representative of the completely linear environment. As expected, no set of genes stands out. Figure 5.7(b) contains the analysis for 8 generations with similar results. We include Figure 5.8

analyzing J5 (ENV 3) as a typical output of the six-gene dependent group series. The effect is clear, and longer runs make it stand out even more. Figures 5.9(a), 5.9(b), 5.10(a), and 5.10(b) contain the PCV analysis for experiments K7, K8, L5, and L6. Pair analysis of ENV 7 (K-series) on shorter runs was not very successful: only the group (1,2,3,4,5,6) stood out much, probably because it had the widest selection range. Genes (7,8,9,10,11) showed up slightly by generation 50, but as can be seen in Figure 5.9, they are not completely consistent even at generations 100 and 150. Similarly, the three other groups were standouts at some instants but not at others. ENV 8 (L-series) showed a like inclination to emphasize one or another group at runs of various lengths.

Cumulative Analysis

We are thus faced with the problem of different groups having more or less influence at particular instants of time and thus being more or less detectable in a pair analysis matrix. To overcome this effect we must perform pair analyses at different instants in time and try to pick groups from each analysis for testing via GCV. An alternate method which proves to be quite good is based on the observation that when PCVs stand out, they are quite a bit larger than the linear, essentially random PCVs. Thus, if a PCV is abnormally high at some instant and only in the random range at another, the average of the two will still be quite high and may be large enough to stand out and deserve our attention. We establish the following experimental protocol: perform PCV analyses at a number of different generations and sum the

values obtained at each instant. Figure 5.11 shows the results for ENV 7 (K-series) and Figure 5.12 shows results for ENV 8 (L-series). The experimental parameters used were the same as for experiments K8 and L6, except that the analyses were done at the instants of time stated in the figures. As expected, the results are unmistakable in both instances, with the exception of the small group (19,20,21) in ENV 7. Small dependent groups may be difficult to pick out, especially if their selection factors are low as in the current case.
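The cumulative protocol amounts to summing the pair matrices taken at each analysis instant; a minimal sketch, assuming each snapshot is stored as a square matrix of PCVs (the function name is an illustration, not the original program's):

```python
import numpy as np

def cumulative_pcv(snapshots):
    """Sum pair chi-squared matrices taken at several generations.

    snapshots: sequence of (n_genes x n_genes) PCV matrices, one per
    analysis instant.  A pair that is abnormally high at any one instant
    keeps a high cumulative value even if it falls back into the
    essentially random range at the others.
    """
    return np.sum(np.stack(snapshots), axis=0)
```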

[Figure 5.3: Pair Chi-squared Values for Experiment H1. Program output (analysis at end of 8 generations, same permutation each run, 10 runs): percent of '1' alleles for each gene and chi-squared values (times 0.1) for all 300 gene pairs; overall average PCV = 22.88.]

[Figure 5.4: Pair Chi-squared Values for Experiment H2. Program output (analysis at end of 8 generations, new permutation each run, 10 runs): percent of '1' alleles for each gene and chi-squared values (times 0.1) for all 300 gene pairs; overall average PCV = 18.20.]

[Figure 5.5: Pair Chi-squared Values for Experiment H6. Program output (analysis at end of 8 generations, new permutation each run, 10 runs): percent of '1' alleles for each gene and chi-squared values (times 0.1) for all 300 gene pairs; overall average PCV = 2.27.]

[Figure 5.6: Pair Chi-squared Values for Experiment H8. Program output (analysis at end of 50 generations, new permutation each run, 10 runs): percent of '1' alleles for each gene and chi-squared values (times 0.1) for all 300 gene pairs; overall average PCV = 28.13.]

[Figure 5.7(a): Pair Chi-squared Values for Experiment I3. Program output (analysis at end of 4 generations, new permutation each run, 10 runs): percent of '1' alleles for each gene and chi-squared values (times 0.1) for all 300 gene pairs; overall average PCV = 2.03.]

[Figure 5.7(b): Pair Chi-squared Values for Experiment I3 (8 generations). Program output (analysis at end of 8 generations, new permutation each run, 10 runs): percent of '1' alleles for each gene and chi-squared values (times 0.1) for all 300 gene pairs; overall average PCV = 2.52.]

[Figure 5.8: Pair Chi-squared Values for Experiment J5. Program output (analysis at end of 16 generations, new permutation each run, 10 runs): percent of '1' alleles for each gene and chi-squared values (times 0.1) for all 300 gene pairs; overall average PCV = 4.55.]

[Figure 5.9(a): Pair Chi-squared Values for Experiment K7. Program output (analysis at end of 100 generations, new permutation each run, 10 runs): percent of '1' alleles for each gene and chi-squared values (times 0.1) for all 300 gene pairs; overall average PCV = 40.66.]

[Figure 5.9(b): Pair Chi-squared Values for Experiment K8. Program output (analysis at end of 150 generations, new permutation each run, 10 runs): percent of '1' alleles for each gene and chi-squared values (times 0.1) for all 300 gene pairs; overall average PCV = 49.85.]

CASE 418, 12:40, 07/11/72. Analysis at end of 50 generations, new permutation each run, 10 runs.
Percent of '1' alleles: 25 33 73 26 81 60 14 13 80 18 70 76 26 23 67 31 71 66 81 76 81 76 89 70 83
[Triangular table of chi-squared values (times 0.1) for gene pairs 1-25; the matrix entries are garbled beyond recovery in this copy.] AVE. = 29.12
Figure 5.10(a): Pair Chi-squared Values for Experiment L5.

CASE 418, 13:07, 07/11/72. Analysis at end of 100 generations, new permutation each run, 10 runs.
Percent of '1' alleles: 19 35 78 25 75 76 28 18 73 26 74 70 16 15 87 13 78 87 89 83 62 76 70 73 79
[Triangular table of chi-squared values (times 0.1) for gene pairs 1-25; the matrix entries are garbled beyond recovery in this copy.] AVE. = 37.92
Figure 5.10(b): Pair Chi-squared Values for Experiment L6.

CASE 413, 02:26, 07/11/72. Analysis at end of 150 generations, new permutation each run, 10 runs.
[Triangular table of cumulative chi-squared values (times 0.1) for gene pairs 1-25; the matrix entries are garbled beyond recovery in this copy.] AVE. = 190.32
Values calculated at generations 25, 50, 75, 100, 125, 150.
Figure 5.11: Cumulative PCVs for ENV7.

CASE 418, 22:48, 07/11/72. Analysis at end of 100 generations, new permutation each run, 10 runs.
[Triangular table of cumulative chi-squared values (times 0.1) for gene pairs 1-25; the matrix entries are garbled beyond recovery in this copy.] AVE. = 130.02
Values calculated at generations 25, 50, 75, 100.
Figure 5.12: Cumulative PCVs for ENV8.

CHAPTER SIX

CONCLUSIONS

Methods of adaptive search must contain, at least implicitly, the ability to detect and act upon non-linearities in their environments (i.e., in the functions to be optimized). If such knowledge can be made explicit, this information may be of value in constructing models of the environment and may lead to faster and more successful adaptation.

We define a function to be linear (or independent) with respect to its arguments (or parameters) if it may be expressed as a sum of sub-functions of at most one argument each. If the function cannot be expressed as such a sum, it is called non-linear. Those parameters which appear as the only arguments of sub-functions are called independent parameters. A set of parameters which appear together as arguments of a non-linear sub-function are called dependent or non-linear parameters. For example, consider the total function f: f(x,y,z) = g(x) + h(y,z), where h is non-linear (cannot be expressed as a sum of functions containing only y and z). The set {y,z} is a dependent group of parameters and x is an independent parameter. Our goal is to detect dependent groups of parameters.

The environments studied had twenty-five arguments, each of which could take on two values. For the case of only two parameter values, non-linearity is ensured if the payoff for a group of parameters is multimodal (has more than one maximum). These environments contained from one to five dependent groups of parameters, with the groups ranging in size from three to nine.
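The definition above can be illustrated with a small sketch. Only the form f(x,y,z) = g(x) + h(y,z) comes from the text; the particular function values below are invented for illustration:

```python
def g(x):
    # sub-function of the single independent parameter x
    return 3 * x

def h(y, z):
    # payoff over the dependent group {y, z}: a table with two maxima,
    # at (0,0) and (1,1) -- multimodal, hence non-linear
    table = {(0, 0): 5, (0, 1): 0, (1, 0): 0, (1, 1): 5}
    return table[(y, z)]

def f(x, y, z):
    # total payoff: linear in x, non-linear in the pair {y, z}
    return g(x) + h(y, z)

# If h could be written as h1(y) + h2(z), the four corner values would
# have to satisfy h(0,0) + h(1,1) == h(0,1) + h(1,0); here 10 != 0,
# so {y, z} is genuinely a dependent group.
```

Note that neither y nor z alone carries useful information about the location of a peak of h; only the combination does, which is why such groups resist detection by per-parameter statistics.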

The adaptive plan used was a Holland Reproductive Plan (10,11), a method inspired by genetics. Briefly, the state of the adaptive system resides in a "population" of many "individuals" or "chromosomes", each of which specifies a point in the function space; i.e., each chromosome contains a parameter value for each of the twenty-five arguments. The next state of the system is obtained by a two-step process: 1) two individuals are randomly selected from the current population, with the selection biased according to the payoffs (function values) of the old population; 2) these two "parents" are combined by operators to form an offspring, a new point in the function space. The operators used are similar to (but not necessarily identical to) the genetic versions of crossover, inversion, and mutation. Crossover selects portions of the chromosome from different parents, mutation randomly changes parameter values, and inversion changes the order in which parameters are listed in the string defining an individual.

The experimental work set out to prove two things: that the reproductive plan adapts to non-linear sets of environmental parameters in different ways than it adapts to linear sets, and that the difference allows the detection of such non-linear effects without prior knowledge, thus supplying additional information to the experimenter. Two differences tested in the behavior of non-linear sets concerned the effects of distance along a string and the frequency of combination of parameter values. In both cases we have shown that the differences do exist and are important. But only in the frequency-of-combination effect were we able to use this information in detecting non-linear sets of parameters.
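The two-step process described above can be sketched as follows. The population size, the use of one-point crossover, and the mutation rate are illustrative assumptions, not the settings of the actual experiments, and inversion is omitted for brevity:

```python
import random

N_GENES = 25  # twenty-five binary parameters per chromosome

def select_parent(population, payoffs):
    # Step 1: random selection biased according to payoff
    # (fitness-proportionate "roulette wheel" selection).
    total = sum(payoffs)
    r = random.uniform(0, total)
    acc = 0.0
    for chrom, p in zip(population, payoffs):
        acc += p
        if acc >= r:
            return chrom
    return population[-1]

def crossover(a, b):
    # One-point crossover: prefix from one parent, suffix from the other.
    cut = random.randrange(1, N_GENES)
    return a[:cut] + b[cut:]

def mutate(chrom, rate=0.01):
    # Flip each allele independently with small probability.
    return [1 - v if random.random() < rate else v for v in chrom]

def next_generation(population, payoff_fn):
    # Step 2: combine two selected parents into one offspring,
    # repeated until a full new population is formed.
    payoffs = [payoff_fn(c) for c in population]
    return [mutate(crossover(select_parent(population, payoffs),
                             select_parent(population, payoffs)))
            for _ in range(len(population))]
```

Iterating `next_generation` on a random initial population drives the population toward regions of higher payoff, which is the sense in which the plan "searches" the function space.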

Of course, all these conclusions are limited to the specific environments tested. However, these environments are definitely non-linear and, for lack of any theory of non-linearity, we must assume that the results will generalize to other environments with similar properties: a small number of parameter values and multiple peaks in the environment. There are many problems in artificial intelligence with these characteristics, so we can foresee some usefulness for this method. More specifically, we have shown:

Position Effect:

1) The position effect (i.e., the effect of distance between genes in a dependent group) is very important during the period of time a population is undergoing its most rapid evolution. Smaller distances between genes lead to faster convergence in payoff average. This seems to be directly related to the ability of the population to keep good combinations of alleles together once they have been found. Bad permutations of genes may make it unlikely that a population will ever find the true optimum.

2) The position effect has not been demonstrated at equilibrium under normal operation of the reproductive plan. The only instances in which a population with a good permutation performed better were those in which the early evolution effect enabled populations to find better peaks before reproduction fixed the gene frequencies. This seems to be consistent over a wide range of experimental parameters.

3) Introducing migration as an artificial means of increasing variance

into the population leads to a fairly smooth position effect, albeit at a population performance level much lower than is acceptable.

4) Experiments using inversion failed to take advantage of the position effect with any consistency, and certainly not in a manner which could be used in detecting dependent groups of genes.

5) Although inversion failed in our experiments, we were able to propose conditions under which it might perform better.

Frequency Effect:

6) Non-linear groups of genes behave differently from linear groups in the combinations in which they are observed. A linear prediction of gene combinations from individual gene frequencies fails in most cases (in the sense of a chi-squared contingency table test for independence), but the association index calculated for non-linear groups is often an order of magnitude higher, a consistently observable difference.

7) It is possible to pick out dependent groups of genes via chi-squared tests even when there is no a priori knowledge of the environment.

8) Investigation of several multi-dimensional contingency tables indicates that the reason for the greatly increased association indices of non-linear groups lies in the large excess (over predictions) of sample points which inhabit peaks of the environment. This excess increases as the genes in the group are brought closer together on the chromosomes, indicating that close permutations more effectively sample the payoff function near the peaks.
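The pair analysis underlying points 6 and 7 can be sketched for two binary genes as a standard 2x2 chi-squared test for independence. This is an assumed reconstruction for illustration; the exact statistic and scaling used in the experiments may differ:

```python
def pair_chi_squared(population, i, j):
    # Compare observed allele-combination counts at positions i and j
    # against the counts predicted from the individual allele frequencies
    # (the "linear prediction"). Large values indicate association.
    n = len(population)
    obs = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for chrom in population:
        obs[(chrom[i], chrom[j])] += 1
    pi = sum(c[i] for c in population) / n  # frequency of allele 1 at i
    pj = sum(c[j] for c in population) / n  # frequency of allele 1 at j
    chi2 = 0.0
    for a in (0, 1):
        for b in (0, 1):
            expected = n * (pi if a else 1 - pi) * (pj if b else 1 - pj)
            if expected > 0:
                chi2 += (obs[(a, b)] - expected) ** 2 / expected
    return chi2
```

Computing this statistic for all pairs of positions and looking for clusters of uniformly high values is the pattern of analysis reported in the figures above: pairs drawn from a dependent group stand out against the background of independent pairs.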

This is further confirmation of the hypothesized position effect.

Although we have not been able to demonstrate that an inversion operator works to produce beneficial permutations, we have shown that position can be extremely important in searching the space. One experiment using a bad permutation, for example, failed to achieve the true optimum of the space even after twenty runs. We expect similar behavior with yet more complex environments. Ideally we would prefer some reordering operator, such as inversion, to rearrange the genes automatically so as to minimize this difficulty. In some cases this might occur, but even with our current environments, in which inversion failed to operate as desired, there is an alternative. Using the results from a chi-squared pair analysis of the same twenty runs, we, as experimenters, can hypothesize (with some degree of confidence) which genes are dependent and thus which permutations are best. We can then intervene in the initial conditions of the experiment to arrange such a permutation. The result is an increased probability of finding the true optimum of the space. Thus not only can we gather additional information about the form of the environment, but by so doing we can increase the probability that the adaptive system will achieve the true optimum, an unanticipated bonus of the analysis suggested in this research.

Appendix

Ten Random Permutations of Twenty-five Genes

10, 8, 4, 9, 7, 3,15, 1,14,18, 2, 6,25,12,16, 5,23,11,22,21,13,20,17,19,24
 7,15,21,13,22, 6,11,23, 5,14,12,20,24, 9,10,18,19,25, 1, 8, 2, 4, 3,16,17
20, 9,10,13, 1,18,21,25,23,19,15,16, 7, 6,22, 8, 4, 3,24,11,12, 2,14,17, 5
21, 9, 1,24, 8,19, 6,20,25,18,13,10,16,14,23,15, 7, 5, 3,11,12, 4, 2,17,22
 9,22, 1,24,11,20,23,13,25,12, 3, 5,10,16, 4, 7, 2,15,19,21, 8,14,17,18, 6
25, 8,15,10, 3,11, 4,14, 2, 9,21,17,16, 6,23, 5,18,19,13,20,22,12, 7, 1,24
20,10,15, 5, 3, 9,18, 2,13,17,16,22, 8,21,12,14,23, 4, 1,25,24, 6,11,19, 7
 3, 4,12,20,22, 5, 2,25, 8,16,23,21,14,18,11, 6,15, 9,10, 1,17, 7,19,13,24
 1,17,12,24,18, 7, 9, 8, 4,22,13, 5,20,14,25,19, 2, 3,16,23,11,21, 6,10,15
23, 7, 1,11,25,15,12, 4, 8, 9,16,14,22, 2,13,21,20, 6,24, 5,10,18, 3,19,17

BIBLIOGRAPHY

1. Bagley, John D., The Behavior of Adaptive Systems Which Employ Genetic and Correlation Algorithms, Doctoral Thesis, Department of Computer and Communication Sciences, The University of Michigan, 1967.

2. Bosworth, Jack; Norman Foo; and Bernard P. Zeigler, Comparison of Genetic Algorithms with Conjugate Gradient Methods, Technical Report 003120-1-T, Department of Computer and Communication Sciences, The University of Michigan, 1972.

3. Cavicchio, Daniel J., Jr., Adaptive Search Using Simulated Evolution, Doctoral Thesis, Department of Computer and Communication Sciences, The University of Michigan, 1970.

4. Crow, James F., and Motoo Kimura, An Introduction to Population Genetics Theory, Harper and Row, New York, 1970.

5. Fisher, Ronald A., The Genetical Theory of Natural Selection, Dover, New York, 1958.

6. Fogel, L.J.; A.J. Owens; and M.J. Walsh, Artificial Intelligence Through Simulated Evolution, Wiley, New York, 1966.

7. Frantz, Daniel R., and Ronald F. Brender, The CESSL Programming Language, Technical Report 012520-5-T, Department of Computer and Communication Sciences, The University of Michigan, 1971.

8. Friedberg, R.M., "A Learning Machine, Part I", IBM Journal, January 1958.

9. Holland, John H., "Nonlinear Environments Permitting Efficient Adaptation", in Computer and Information Sciences-II, pp. 147-164, Academic Press, New York, 1967.

10. Holland, John H., "Processing and Processors for Schemata", in Associative Information Techniques (E.L. Jacks, ed.), Elsevier, New York, 1971, pp. 127-146.

11. Holland, John H., "Genetic Algorithms and the Optimal Allocation of Trials" (to be published).

12. Hollstien, Roy B., Artificial Genetic Adaptation in Computer Control Systems, Doctoral Thesis, Department of Computer Information and Control Engineering, The University of Michigan, 1971.

13. Moran, P.A.P., The Statistical Processes of Evolutionary Theory, Clarendon Press, Oxford, 1962.

14. Remington, Richard D., and M.A. Schork, Statistics with Applications to the Biological and Health Sciences, Prentice-Hall, Englewood Cliffs, New Jersey, 1970.

15. Samuel, A.L., "Some Studies in Machine Learning Using the Game of Checkers", IBM Journal of Research and Development, 3, No. 3, 1959, pp. 210-229.

16. Turner, John R.G., "Why Does the Genotype not Congeal?", Evolution, 21, No. 4, pp. 645-656, December 1967.