THE UNIVERSITY OF MICHIGAN
INDUSTRY PROGRAM OF THE COLLEGE OF ENGINEERING

A STUDY OF AUTOMATIC SYSTEM SIMULATION PROGRAMMING AND THE ANALYSIS OF THE BEHAVIOR OF PHYSICAL SYSTEMS USING AN INTERNALLY STORED PROGRAM COMPUTER

Franklin H. Westervelt

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in The University of Michigan, 1960

October, 1960
IP-470

PREFACE

In the relatively short time during which the high speed digital and analogue computers have been in use, many remarkable applications have been made. Particularly in the area of simulated behavior of physical systems the interest has been high and the utility great. While a great many systems have been programmed and investigated, the general application of the digital computer to system simulation has not yet been as widespread as the applications of analog computers. This has been true for many reasons despite the inherently greater flexibility and more positive error control offered by the digital machine. Perhaps the chief reason for this lies in the more difficult encoding of the analysis. The need to reduce each analysis to machine code has already resulted in several levels of machine languages that are designed to assist the user in bringing his problem to the machine. Since many engineering problems, of which the system simulation is a good example, require extensive analysis prior to the time at which advanced languages can be of assistance, it may be expected that the need to study a variety of simulations and large systems would result in the development of methods to assist the analysis as well as the later computation of results.

The approaches to the assistance in analysis have been varied. The range of methods extends from the production of a generalized system program, from which a specific system of the same type may be approximated by an interpretive selection, to more truly analytical programs capable of producing other programs to simulate specific systems of rather specific types.

This paper treats the development of two techniques to handle a very generalized system simulation. The first technique, referred to here as The Simulator Program, is a procedure, referred to as an algorithm, for producing programs automatically on the digital computer that are simulation programs for very general systems. The second technique, referred to here as The Stepwise Regression Program with Simple Learning, is an algorithm for producing analytical expressions and subroutines for use in representing the performance of the components of systems. The subroutines may be used by the programs produced by the Simulator Program or by other programs as desired. Together these techniques provide a heretofore unavailable method for undertaking the study of large systems.

The cooperation and direction given me by the members of the doctoral committee, Associate Dean G. V. Edmonson as chairman, Professors B. A. Galler, J. J. Martin, N. R. Scott, C. A. Siebert, G. J. Van Wylen and Mr. R. D. Allen of Consumers Power Company in Jackson, Michigan, has been greatly appreciated. Special thanks are extended to both Consumers Power Company and Commonwealth Associates, Incorporated for the enabling grant that made the work possible. Thanks are also extended to the many people at the Computing Center and the College of Engineering whose assistance was most valuable. In particular, thanks are extended to Mr. J. S. Squire and Mr. M. P. Anderson for their valuable assistance.

BACKGROUND OF THE STUDY

In order to establish an orientation of the developments presented in this paper, it may prove desirable to review briefly the developments of earlier workers in pertinent areas.

The simulation problem has attracted the attention of many workers. The conventional approach taken by most earlier efforts was the construction of a special purpose program designed to encompass as general a description of the system to be simulated as might be feasible. The solution of the simulation of a specific physical system then depended upon that system being representable by some subsection of the more general program. The specific system was usually selected by means of control parameters from the more general program. Experience with such techniques made obvious the inherent difficulty of representing accurately the many variations of the systems that may be suggested for study. This deficiency led to the consideration of programs () that produced other programs which in turn accomplished the desired simulation. In this way, the computer began to be utilized as an analytical aid and was enabled to generate programs from a description of the system and a specified set of physical laws and relationships with which the system may be described. DYANA, for example, is able to generate programs for any system representable by a general network and by the principles contained in Kirchhoff's laws and/or D'Alembert's principle.

The task of generalizing the simulation problem remained and the Simulator Program presented in this paper is a workable solution of that problem. Specifically, the generalization allowed by this technique

extends the use of the computer for the analysis of any system describable as a network and whose component parts may be characterized by the application of relationships involving the parameters of the system as determined at the node, or interconnection, points of the network. The nature of the physical laws and relationships required for the analysis does not influence the logical structure of the Simulator. Thus, the programs generated by DYANA are, in effect, members of the set of programs that may be generated by the Simulator. The relationships supplied by the user to characterize the components of his system may then be time dependent or time independent as desired.

The second development presented in this paper concerns the production of expressions for the prediction of the behavior of physical phenomena. The methods of "least squares" and multiple regression have received much attention. (9,10,11,12) The stepwise regression technique of Efroymson as extended by Dallemand offers a powerful technique to assist the Simple Learning developed in this paper to extend the treatment of data representation problems to include all orders of interaction between multiple functions of several independent variables. The stepwise regression analysis provides the independent evaluation required by the heuristic selection mechanism employed by the Simple Learning program to generate the terms to be used in the prediction equation for the data. In this extension of the techniques of "artificial intelligence" many of the objections to the earlier methods of regression analysis have been answered. Previous methods were often forced to make rather drastic simplifications of the statistical models in order to allow the problems to be solved in a reasonable time, even on the largest and fastest computers.

The incorporation of Simple Learning has allowed the regression analysis to gain access to every possible term within the scope of the functions allowed by the user for all orders of interaction in multiple independent variable problems and still obtain the desired equations in practically feasible time. Since the equations generated are often required by the components of the systems treated by the Simulator, extensions of earlier methods were made to cause the production of the resulting equations in subroutine form ready to be used. It is important to understand that the stepwise regression portion of the analysis may be replaced by other techniques as improved methods are developed and still retain the benefits offered by the Simple Learning methods presented here. At present, however, the stepwise regression analysis as extended here is regarded as a very suitable technique. The extensions include a treatment of the truncation and roundoff errors generated during the analysis and an improved treatment of the constant term when the analysis is conducted with respect to the normal coordinate axes.

The Simple Learning techniques presented are also extended from earlier efforts. (2,)4,5) The basic principle may be regarded as analogous to learning through reinforcement. By arranging to increment the probability of an action after encountering success and to decrement the probability after failure, earlier workers had indicated the potential ability of a mechanism to simulate learning. Some interesting observations of random mechanisms had also pointed up possible advantages of an initially random mechanism that would gradually become more nearly stepwise in its

action as the reinforcement process took place. Earlier workers also pointed out some of the pitfalls awaiting learning mechanisms. By incorporating the experiences of earlier workers and introducing a "half-life" concept of reinforcement, the mechanism presented in this paper displays rather promising properties while retaining simplicity. The learning mechanism employed by the program is termed "simple" because the modification of each portion of the selecting mechanism is controlled by a single parameter and each modification occurs individually when the success or failure of each portion of the mechanism is determined. The more complex problem of the interrelation of success and failure patterns is not treated by the present mechanism.

Thus it may be observed that the contributions of many workers in many rather diverse areas have served as the foundation upon which the present work was built. In turn, the development and applications of the methods and concepts presented here will serve to further extend this area in the future.
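The reinforcement principle described above can be illustrated with a minimal sketch. It is not the mechanism of the dissertation: the class name `SimpleSelector`, the additive update, the `step` and `floor` parameters, and the example term names are all assumptions made for illustration, and the "half-life" concept is not modeled because its details are not given in this part of the text. The sketch shows only a probabilistic choice whose likelihood is raised after success and lowered after failure, with each option modified individually under a single controlling parameter.

```python
import random


class SimpleSelector:
    """Hypothetical selector: picks options at random, biased by past success."""

    def __init__(self, options, step=0.1, floor=0.01, seed=None):
        # Every option starts with the same weight, so the first choices
        # are purely random (an "initially random mechanism").
        self.weights = {option: 1.0 for option in options}
        self.step = step      # single parameter controlling each modification
        self.floor = floor    # assumed lower bound so no option is ever excluded
        self.rng = random.Random(seed)

    def choose(self):
        """Pick one option with probability proportional to its weight."""
        options = list(self.weights)
        weights = [self.weights[option] for option in options]
        return self.rng.choices(options, weights=weights)[0]

    def reinforce(self, option, success):
        """Raise the weight of an option after success, lower it after failure."""
        delta = self.step if success else -self.step
        self.weights[option] = max(self.floor, self.weights[option] + delta)

    def probability(self, option):
        """Current probability that `option` is chosen."""
        return self.weights[option] / sum(self.weights.values())


# Usage: two candidate terms for a prediction equation (names are illustrative).
selector = SimpleSelector(["x", "x*y"], seed=0)
term = selector.choose()
selector.reinforce("x", success=True)     # "x" becomes more likely
selector.reinforce("x*y", success=False)  # "x*y" becomes less likely
```

As successes accumulate on a few options, the choice becomes less random and more nearly deterministic, which is the qualitative behavior the passage attributes to such mechanisms.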

TABLE OF CONTENTS

PREFACE
BACKGROUND OF THE STUDY
LIST OF FIGURES
INTRODUCTION
SUMMARY OF RESULTS
I. THE SIMULATOR PROGRAM
   THE STRUCTURE OF A PROCEDURE TO GENERATE ALGORITHMS TO SIMULATE PHYSICAL SYSTEMS
   IMPLEMENTING THE SIMULATOR
      Communication of the System Information to the Program
   THE STRUCTURE OF THE SIMULATOR TRANSLATOR
II. STEPWISE REGRESSION PROGRAM WITH SIMPLE LEARNING
   IMPLEMENTING THE STEPWISE REGRESSION PROGRAM WITH SIMPLE LEARNING
      Communication of the Problem to the Program
      The Structure of the Program
CONCLUSIONS
SYSTEM SIMULATION FLOW DIAGRAMS AND CORE LAYOUTS
STEPWISE REGRESSION FLOW DIAGRAM AND CORE LAYOUTS
COMMENTS ON THE SYSTEM SIMULATOR FLOW DIAGRAMS
COMMENTS ON THE STEPWISE REGRESSION FLOW DIAGRAMS
ILLUSTRATIVE EXAMPLE
BIBLIOGRAPHY

LIST OF FIGURES

Figure
1 Generator Electrical Losses as a Function of Load, Hydrogen Pressure, and Power Factor
2 The Gaussian Distribution
3 Surface Showing Values of F which May be Exceeded by Chance with Stated Probability

INTRODUCTION

The simulation of the behavior of physical systems is one of the most economically promising uses of the digital computer because the simulation process affords the user an opportunity to examine system performance without requiring a capital investment in the actual hardware of the system. The simulation of a system also is attractive theoretically because of the opportunity extended to conduct controlled investigations and to avoid the "noise" problems associated with actual systems. The simulation process is also attractive in those cases in which some degree of hazard is present since an "explosion" occurs only on paper.

In spite of the obvious advantages of simulation on the digital computer only a relatively small number of physical systems have thus far been studied in this way. The reason for this state of affairs hinges primarily on the difficulty of communicating general system problems to the computer in the form of a procedure capable of producing the desired results.

The work described in this paper attacks this problem by producing two procedures of immediate use in helping to generate system simulation programs for digital computers. The procedures have been coded and verified for the IBM 704 computer but the techniques are more generally applicable. The two procedures are the following:

1) The Simulator Program. This procedure produces simulation programs as algorithms in compiler language ready for translation and execution by the machine.

2) The Stepwise Regression Program with Simple Learning. This procedure produces predicting equations in subroutine form for the description of the behavior of the components of the system being simulated.

Both procedures produce machine translatable programs automatically ready for immediate processing by the machine. Taken together the procedures offer for the first time a method of direct communication for general system problems to the machine for analysis and production of algorithms for the simulation of the system.

The use of the machine for analysis as opposed to calculation is not yet widespread. The implementation of the techniques discussed in this paper may be of much more general interest in this area. Specifically, the machine must be presented with methods of proceeding with a problem when the best available information is only capable of indicating the relative possibility of success for the alternative paths. Methods of probabilistic choice and mechanisms for altering the likelihood of choice from "experience" are also implemented in the development of the procedures. The programs discussed contain workable implementations of these methods. These methods are a first step toward more sophisticated "artificial intelligence" techniques.

In order to understand the problem encountered in the generation of algorithms by a machine, consider the simulation process itself. If a system is to be simulated, the behavior of each component of the system must be related to the other components in such a way that the physical laws and relationships pertinent to the problem are preserved. In the past, the preservation of this consistency was the responsibility of the human programmer. The result of his effort of analysis was a procedure or algorithm

by which he, or the machine, could proceed step by step from the data supplied to the results desired. In general, the analysis of the system to produce this algorithm depends on three kinds of information:

1) The System Definition,
2) The Given or Known Data,
3) The Desired or Unknown Results.

The algorithm may be specialized to the defined system and designed to accept the given data and from this information produce the desired results. The algorithm must be so constructed that the solution can proceed from step to step toward the result.

The Simulator Program is designed to produce these algorithms. The same three basic classes of information are supplied to the Simulator. The Simulator Program then carries out an analysis of this information to produce an algorithm capable of the required performance (or, if it should prove to be incapable of producing the algorithm due to insufficient or otherwise inadequate information, error diagnostics are produced). Because of the possibility that several methods may exist that will yield the desired results, some interesting heuristic methods must be employed by the Simulator. When the program for the particular system has been produced by the Simulator, it is both printed and punched on cards ready for compilation and execution.

The produced program will, in general, require many special characteristics of the components of the system to be available to the program. These characteristics are usually functions of one or more parameters of the system. The Stepwise Regression Program with Simple Learning is an extension of earlier stepwise regression techniques to allow consideration of all orders of interaction in multiple independent variable problems.

The classical approach to such a problem may be shown to be quite unmanageable on present (and even projected) computers without drastic simplification. The Simple Learning mechanism employed in this program avoids such a pitfall and allows an accumulation of "experience" to be directed toward the acceleration of the generation of the predicting equation for the desired characteristic. The technique has been employed in many varied problems and has been carefully verified in many known cases. The procedure produces the predicting equation for a given problem together with a complete statistical analysis and a punched card subroutine ready to be used with the simulation program (or by any other program, as desired).

The Simulator Program together with the Stepwise Regression Program with Simple Learning allow the direct simulation of the performance of any system that may be characterized by a general network. The use of these programs to produce specialized programs for each system avoids the loss of accuracy of representation sometimes suffered in attempting to use a general representation in an interpretive mode. Since the information presented to both programs is designed to be fairly easily obtained, the man-hour cost of programming system simulation problems can be greatly reduced. At the same time, revisions of simulation programs to keep pace with design and operating changes in the actual system can be made economically feasible. Finally, the use of these programs during the design phases can allow study of a greater number of possible configurations than was previously practical.

The feature of, perhaps, greatest importance is the opportunity extended to the user to obtain increasingly accurate representations of the actual behavior of a system while studying a simulation of the system. The most recent information available on any part of the system may be incorporated immediately in the simulation by the techniques described here. Thus the problem of revising and correcting a former simulation program becomes quite secondary.

SUMMARY OF RESULTS

Two programs have been produced for assisting the representation of the performance of physical systems on the digital computer. The Simulator Program is designed to produce the analysis of physical systems in the form of an algorithm in machine translatable language. The user of the Simulator Program must supply:

1) The System Definition,
2) The Given or Known Parameters,
3) The Desired or Unknown Parameters.

The Simulator Program then attempts to construct the required algorithm for the problem. In doing so, use is made of a Library of methods pertinent to the system. The Library is accessible to the user, so that new methods may be easily inserted. The methods are grouped under the heading of Element Descriptions. That is, each possible element that may occur in the system is described in the Library. New elements may be easily added and older descriptions may be revised within the structure of the Simulator Program.

The second program, The Stepwise Regression Program with Simple Learning, allows the generation of predicting equations for the behavior of the components of the system in a form required by the simulation program produced by the Simulator. This program allows the consideration of multiple independent variable problems with interactions of all orders allowed. Since the classical approach to this problem is of such magnitude that present and projected computers cannot adequately cope with the solution in many cases, the Simple Learning technique was developed to allow a solution to be made.

The two programs thus allow the simulation of very general physical systems from a rather basic set of information available on the systems. The resulting reduction of man-hours of programming should allow the extension of systems simulation techniques in many areas not previously practical. Furthermore, the availability of these techniques should allow the study, and thus lead to understanding, of more complicated systems and components than those previously treated.

I. THE SIMULATOR PROGRAM

THE STRUCTURE OF A PROCEDURE TO GENERATE ALGORITHMS TO SIMULATE PHYSICAL SYSTEMS

In order to understand a procedure that can generate algorithms to simulate physical systems, first consider the nature of the problem. In general, a physical system consists of a collection of components or elements that are inter-connected to each other in various ways. For example, a typical vibrating mechanical system consists of masses, springs, dampers, levers and so on. These components or elements of the system are inter-connected to each other to form the desired system. One such inter-connection might be the attachment between a mass element and a spring element as an illustration. The behavior of each component is determined by various physical laws and relationships that are determined by the nature of the component. In particular the behavior may be expressed in terms of the values of parameters at the points of inter-connection of the component to the other members of the system.

For the general system there may be a very large number of different components and associated with each component a large number of methods and procedures that allow the performance to be calculated. In order to select those procedures needed to produce a simulation of the system additional constraints must be imposed. These constraints consist of those parameters for which values will be supplied as initial and/or boundary conditions.

For purposes of discussion, suppose that the simulation of a system is regarded as a method or procedure that will allow the calculation of the values of the various parameters belonging to the system. The task of the Simulator Program is directed at the problem of determining the method of calculation just mentioned. The actual calculations of the values of the parameters will be produced by the program produced by the Simulator Program. The Simulator Program enters the problem as an analytical rather than as a calculational method. It is this use of the computer on the level of producing programs that in turn are used to produce calculations that allows the most powerful applications of the technique. In so doing, the program has assumed a very large burden in the solution of system simulation problems. This in turn will allow the user to study more complicated and more accurately represented systems.

Specifically, the task of the Simulator is not to be construed as that of determining all possible values of all possible parameters but rather that of determining the values of specific parameters subject to specific initial and/or boundary conditions. If the procedure for defining the algorithm can be made quite general then the parameters selected for display and the conditions imposed can be made quite general. The problem is always that of determining when a sufficient set of information has been supplied and of producing the procedure when a sufficient set of information is present.

The information requirements are easily set down. The determination of the sufficiency is most difficult. The requirements are:

1) The Definition of the System.
2) The Definition of the Components of the System.
3) The Specification of the Constraints to be imposed on the System.

The Definition of the System consists of the specification of the elements or components of the system and the way in which these components are inter-connected. For the purpose of this discussion let the Definition of the System be complete when all of the components of the system are defined and there are no possible inter-connection points of any component that are not connected to some other point. To secure the completeness it may be necessary to define some components to act as sources, sinks or boundaries.

The Definition of the Components of the System consists of the specification of the methods or procedures by which values of parameters at the various inter-connection points of the components may be found in terms of the values of parameters at the same and other inter-connection points of the same component. Strictly speaking, completeness of the Definition of the Components requires an exhaustive collection of all possible methods of parameter calculation. In other words, every feasible method or technique that can be applied to a given component must be made part of the collection. Otherwise, it is always conceivable that a program will not be generated by the Simulator because a technique was omitted from the collection. This will seldom, if ever, be achieved in practice. Usually the most the Definition of Components can be expected to do is to embrace the most generally productive methods.
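The completeness condition stated above for the Definition of the System can be sketched as a short check. This is an illustration under stated assumptions, not the Simulator's own representation: the function name `completeness_problems`, the dictionary layout, and the element names (`m1`, `k1`, `wall`) are hypothetical. The check reports any element whose component type has no definition in the library and any inter-connection point that is not connected to some other point.

```python
def completeness_problems(elements, points, connections, library):
    """Return the reasons a System Definition is incomplete (empty list if complete).

    elements:    element name -> component type           (hypothetical layout)
    points:      element name -> list of point labels
    connections: list of ((element, point), (element, point)) pairs
    library:     set of component types that have a Component Definition
    """
    problems = []

    # Every element must have a Component Definition available.
    for name, kind in elements.items():
        if kind not in library:
            problems.append(f"no Component Definition for {name} ({kind})")

    # Every inter-connection point must be connected to some other point.
    connected = {end for pair in connections for end in pair}
    for name, labels in points.items():
        for label in labels:
            if (name, label) not in connected:
                problems.append(f"unconnected point {name}.{label}")

    return problems


# A mass attached to a spring whose far end is held by a boundary element.
elements = {"m1": "mass", "k1": "spring", "wall": "boundary"}
points = {"m1": ["a"], "k1": ["a", "b"], "wall": ["a"]}
connections = [(("m1", "a"), ("k1", "a")), (("k1", "b"), ("wall", "a"))]
library = {"mass", "spring", "boundary"}

report = completeness_problems(elements, points, connections, library)
```

Removing the `wall` element from this example leaves the point `k1.b` unconnected, which is the situation the text resolves by defining components that act as sources, sinks or boundaries.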

Since the decision of what constitutes general productivity is at best highly subjective, a procedure charged with constructing a calculational procedure from these methods must not be involved directly with this decision or else its utility is almost certain to be limited by the decision. If possible, the simulator procedure should allow easy extension and/or modification of the Component Definitions independently of the simulator procedure itself. That is, the method of analysis and use of Component Definitions should be independent of the contents of the Component Definitions.

The Specification of Constraints imposed on the System consists of the values of the parameters to be construed as initial and/or boundary (operating) conditions for the system. Completeness of these Specifications is generally dependent upon the completeness of the Component Definitions. If very complete Component Definitions are available for the system then it is possible that several different sets of values of the parameters can be made to produce a given value of another parameter. For example, in the superheat region for steam the enthalpy of the steam may be found if the values of any two independent properties, such as pressure, temperature, entropy, specific volume and so on, are known. If many methods are available in the Component Definition for the determination of enthalpy then almost any combination of two parameters will allow the calculation. If only a few methods have been included in the Component Definition, obviously the parameters given as constraints must be so chosen that these methods apply. In other words, if the Component Definitions are very nearly complete, then calculation procedures can be found for almost any set of constraints that may be given. If this is not the case, then the set of constraints must be large enough to include those that are needed for the calculation using the

available methods. In almost every case, however, there will be some minimum set of constraints required for any given desired parameter.

The interaction between the Constraints and Component Definitions for a given system is extremely complex. The determination of a minimum set of constraints for a given set of Component Definitions and a given System Definition may be of interest in some cases, but in the general problem it cannot usually be recognized whether a failure to yield an algorithm to produce a particular result is due to lack of constraints or deficiency in Component Definitions. Furthermore, when several alternative calculational methods exist at one or more points in a system it is possible that a valid procedure can be found before all possible procedures have been examined.

In addition to the three information requirements previously mentioned, the generation of a specific program must be viewed as a selection of a reduced collection of methods and their arrangement to yield specific values for certain parameters. Otherwise the simulation of a system would be required to be exhaustive and again this is, in general, not possible. To be generally useful a procedure for generating a simulation program should allow for unrestricted specification of desired information and be charged with the task of establishing a method of calculating this information within the framework of the previous three information requirements whenever possible.

The development of an algorithm to accomplish this generation must be concerned with the recognition of conditions in which it can be established that a method of calculation cannot be found. This may seem strange until

it is considered that when a method cannot be found and the condition recognized then it becomes possible to either terminate the attempt or restart the attempt from another direction. These things could not be done otherwise.

If the situation can be recognized when further progress cannot be made in generating a program it is then possible to formulate a simple procedure for generating a program. A program may be said to be "non-extendable" when there exists at least one required parameter that satisfies the following conditions: 1) the parameter is not given as a constraint and 2) there is no method in either Component Definition of the two components directly involved with the parameter that will yield the value of the parameter. For example, suppose that the current flow is needed through a thermistor and that no method is available that can produce the current flow value in this case. Then, if further direct progress is to be made the value must either be given or an appropriate method must be added to the Component Definition collection. If neither of these things can be done then the progress depends upon finding an alternative chain of methods that avoids the need for the current flow through the thermistor. If there have been no points earlier in the work at which alternative paths could have been chosen then there is no possibility of extending the method beyond this point. The program is said to be "non-extendable".

If a program is non-extendable and each previously established required parameter could be found in no more than one way, then the problem may be said to be not well posed. A problem is not well posed, in general, when there does not exist any collection of methods to yield all

of the requested information subject to the imposed constraints and the definitions. It should be noted that a problem may be not well posed even when several methods exist at previous stages. The determination of the well posed condition when this occurs may require an exhaustive investigation of all possibilities.

An algorithm that uses the definition of a non-extendable program to generate a simulation program for a system is the following:

1) Check to be certain that the System Definition is complete. That is, determine if there is a Component Definition available for every element of the System Definition and if every attachment point is connected to some other point. (A complete System Definition may not be correct but it is capable of analysis.)
2) Check each attachment point in an ordered search to locate any point at which there is requested information. A "request" for a parameter may have occurred in one of two ways: a) the value of the parameter may have been desired by the user of the program, b) the value of the parameter may have been required as an input for a method selected previously. A "request" cannot occur in any other way.
   2A) If no such point can be found in the entire system an algorithm for the simulation of the system has been found. (If no such point were found in the first search, the problem is trivial but the preceding statement is still valid.)
   2B) When such a point is found, remove the request for information by one of the following methods:
      2B1) Matching the request with a specified constraint. That is, if a requested result has been given as a

-15 constraint value then the value of the requested result is known without any further calculation. 2B2) Finding a calculational procedure that applies at this point that will yield the requested result. If more than one method applies, pick one and indicate that a choice has been made. 2C) Whenever the request is removed by matching with a constraint no new requests are created. Whenever the request is removed by finding a method, the method may introduce new requests for information. When this is true, cause the entire step 2 to be repeated again after completing the current search. 2D) Whenever a request cannot be removed, the program may be nonextendable. Test the choice indicator and a) if no previous choices have been made, the program is not well posed, b) if previous choices have been made, cause the program generation to return to an earlier state, make another choice and try again. 3) Repeat the search indicated by step 2 until 2A is satisfied or until an upper limit of numbers of trials have been exceeded or until the problem is shown to be not well posed. 4) Whenever step 2A is satisfied, the required algorithm consists of the methods found by step 2B2 executed in the reverse order of their determination. That is, the first method determined yields the last request required of the program. The second yields the next to the last and so on. If any method introduces new requests, these results must be found before

-16 the method can be used. This is precisely the situation that will be obtained since the methods to determine these results will be found later, the execution of the methods in reverse order will produce the results before they are needed by the method requesting them. An essential part of the previous procedure lies in the technique of picking a method whenever more than one method is available. Obviously, if an inflexible selection is made, that is, always choosing the "best" method (no matter how "best" may be defined), repeating the generation when a program has been found to be non-extendable would always lead back to the same point. Therefore, the selection should be made flexible and, in particular, allow equal chance of selection for equally promising methods and occasionally the selection should allow choice of methods not locally "best," Thus the technique must set some scale by which the characteristic "equally promising" and, in fact, the degree of "promise" can be measured. Many such scales could be specified. One scale that is easily determined and contains some measure of the "promise" characteristic is the ratio of the number of useful results produced by a method to the number of new information requests the method will makeo Since the objective is to eliminate the requests by finding methods that require only constraint information, such a scale would place greater weight on methods that make the smallest number of new requests, but would also consider the number of useful results yielded, For example, a method that requires one new result while yielding one requested result is equally promising when compared to a method requiring four new results to yield four requested resultso However, a method

-17 yielding four requested results and requiring only two new results would be scaled twice as promising as either previous method. On this scale, a method that produces any results without requiring any new results would automatically be selected. Also a method that produces no results is automatically rejected. The important point to be understood is that the selection cannot be fixed so that the method of greatest finite weight is always selected since it is possible that one of the parameters that this method would require may not be capable of calculation due to incompleteness of Component Definitions or Constraint Specifications while a method of less weight may avoid this difficulty. However, the methods of greatest weight should tend to be selected if the program is ever to be finished. The result of these considerations is to produce a selection method that operates probabilistically in the choice. The implementation of this procedure in the form of a program for the digital computer requires two tasks to be performed: 1) The creation of an artificial language to allow communication of the information concerning the system between the human user and the machine program, 2) the preparation of the foregoing simulator procedure as a program capable of accomplishing the translation of the artifical language into a simulation program. Since the first of these tasks is strongly associated with the human and his simulation problem, while the second task may, for the casual user, be regarded as a problem removed from his immediate consideration, the discussion of the implementation of the simulator is divided into two parts along this division. Of course, the second part is vital to the use of the first but its operation is of concern to a relatively small number of people

-18by comparison, It must be understood, however, that the generation of the simulation programs is accomplished by the translator, The translator is thus the procedure of greatest importance in the solution of the simulation problem on the machine,
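The request-removal procedure and the "promise" scale can be illustrated with a minimal Python sketch. It is not the simulator's own code. The names `Method`, `promise`, and `generate` are hypothetical, parameters are reduced to plain strings, and two simplifications are assumed: requests are removed recursively, so that methods are appended in execution order (which corresponds to the reverse order of determination for a simple chain), and a non-extendable attempt is restarted from the beginning with fresh random choices rather than returned to an earlier state.

```python
import random
from typing import FrozenSet, List, NamedTuple, Optional, Set


class Method(NamedTuple):
    """Hypothetical stand-in for one method of a Component Definition."""
    name: str
    needs: FrozenSet[str]    # parameters the method requests as inputs
    yields: FrozenSet[str]   # parameters the method produces


class NonExtendable(Exception):
    """Raised when a request cannot be removed in the current attempt."""


def promise(m: Method, requested: Set[str], known: Set[str]) -> float:
    """Ratio of useful results to new requests; infinite if nothing new is needed."""
    useful = len(m.yields & requested)
    new = len(m.needs - known)
    return float("inf") if new == 0 else useful / new


def generate(methods: List[Method], constraints: Set[str], desired: Set[str],
             max_trials: int = 50, seed: int = 0) -> Optional[List[Method]]:
    """Return methods in execution order, or None if no program was found."""
    rng = random.Random(seed)
    for _ in range(max_trials):
        known = set(constraints)      # values available without calculation
        requested = set(desired)      # every request seen so far
        program: List[Method] = []
        active: Set[str] = set()      # requests currently being removed
        choice_made = False

        def remove_request(param: str) -> None:
            nonlocal choice_made
            if param in known:        # step 2B1: matched by a constraint or result
                return
            if param in active:       # circular chain of methods
                raise NonExtendable(param)
            candidates = [m for m in methods if param in m.yields]
            if not candidates:        # step 2D: request cannot be removed
                raise NonExtendable(param)
            if len(candidates) > 1:
                choice_made = True
            weights = [promise(m, requested, known) for m in candidates]
            free = [m for m, w in zip(candidates, weights) if w == float("inf")]
            # Methods needing nothing new are selected automatically;
            # otherwise the choice is random, weighted by promise.
            chosen = rng.choice(free) if free else rng.choices(candidates, weights)[0]
            active.add(param)
            requested.update(chosen.needs)
            for need in sorted(chosen.needs):
                remove_request(need)
            active.discard(param)
            program.append(chosen)
            known.update(chosen.yields)

        try:
            for param in sorted(desired):
                remove_request(param)
            return program
        except NonExtendable:
            if not choice_made:       # no alternative existed: not well posed
                return None
    return None                       # trial limit exceeded
```

For a two-method library in which `power` needs a current that `ohm` can supply from constraint values, `generate` returns `ohm` followed by `power`, so each result exists before the method that requests it is executed.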

IMPLEMENTING THE SIMULATOR

Communication of the System Information to the Program

Communication of the information concerning the system to be simulated to the simulator program is the first and, for the simulator user, the most important step in the generation of a program to simulate the system. This transfer of information is accomplished through the medium of an artificial language that is designed to be reasonably like the user's own and, at the same time, to contain a structure that is recognizable by the program so that the information content may be extracted. Thus the user may expect to use familiar alphanumeric characters and standard punctuation symbols in all but a few cases.

Since the user is often not acquainted with the detailed operation of computing machines, some effort has been made to remove restrictions in the formats for source program preparation. (The source program is the collection of punched cards containing the user's system information. The simulator program produces an object program from the source program. The object program is a machine translatable program from which the machine can produce the desired simulation when supplied with data.) In general, the source program may be punched anywhere in columns 1 through 72 on IBM cards. Statements may run over from card to card without requiring special continuation symbols. While most users will tend to place a single statement per card for convenience in checking and correcting source programs, more than one statement may be placed on a card if the user so desires. Statements are terminated with a period (decimal point symbol) as in conventional writing. In a few cases, notably in Element Descriptions, where the user wishes to convey a specific object program language to the simulator, format restrictions will be imposed and emphasized at that place.

Structure of the Simulator Source Language

Information must be transmitted to the simulator from three basic areas and, if desired, further implemented by a fourth "utility" area. The basic areas are 1) the System Definition, 2) the Input-Output Requirements, and 3) the Characterization of Component Performance. In every problem, the user will be involved with the first two of these areas directly. If the problem requires modification or extension of the libraries on Component Performance, the user will also be involved in the third area. The language requirements for each area are inter-related so that the user may carry over most of the structure from one area to the next.

I. The System Definition

In order to simulate a system, the specific system to be considered must first be separated from the set of all possible systems allowed by the simulator. This requires the communication of 1) the names of the actual components that are found in the system, and 2) the way in which these components are attached (connected) to each other. This information is transmitted to the simulator program by statements occurring within the range of a CONNECTIONS declaration. A declaration does not perform any calculation but instead prepares the program to receive the information that follows. The range of any declaration begins immediately after the declaration and continues until terminated by any other declaration or by the end of the input data cards. The form of the system definition declaration is:

    CONNECTIONS.

Connection Statements

Connection Statements are chains of symbolic names transmitting the precise components and attachments and their interconnection to the simulator. The symbolic names are of four types:

1. Element Name

Any six or fewer alphanumeric characters may be an Element Name. The Element Name used in a Connection Statement must either agree exactly with the corresponding Element Description Name for that component or be made to agree by use of a synonym. There are no restrictions as to the order of appearance of alphabetic or numeric characters.

Examples of Acceptable Element Names

    PUMP; TRBIN1
    6L6; 12AT7
    MASS; SPRING; DAMPER

2. Element Identification

Since more than one element of a given kind may be found within a system, means must be provided to identify each different element. Since the desired effect is to convey uniqueness within the system, the user may use any six or fewer alphanumeric symbols to identify the elements of the system. If an element is identified, all occurrences of the same element must agree exactly in identification. If an element occurs only once in a system, the element identifier may be omitted.

Examples of Acceptable Identifiers

    1; 2; 3
    A; B; C
    A1; 2B; 5C6
    MAIN; SCNDRY

3. Attachment Name

Every element enters the system definition by the way in which it is connected to the rest of the system. That is, an element name cannot, except for unary and binary elements, occur without an associated attachment name. The attachment name consists of six or fewer alphanumeric characters and must agree with the attachment names given in the Element Description for the element associated with the attachment name. If the user desires, the agreement can be obtained through the use of synonyms.

Examples of Acceptable Attachment Names

    INLET; EXIT
    1; 2
    ENTRY3; OUTLET
    GRID; PLATE; CTHODE; SCREEN

4. Attachment Identification

If more than one occurrence of an attachment name with its identified element exists in a set of CONNECTIONS statements, an ambiguous situation arises. The attachment, in effect, has been made to several different places with but one physical contact point. The user has the option of defining an element with branching or junction properties to resolve this problem or, if it is more convenient or desirable, the option of writing the element description of the component to allow for attachment identification. The attachment identifier is any alphanumeric name of six or fewer characters. A unique identified attachment of an identified element may occur only once in each program. The acceptable forms are like those of element identifiers. As with element identifiers, the user is free to create whatever symbolic attachment identifiers may be needed.

The Connective, TO, and Connections Punctuation

The foregoing forms of symbolic names are sufficient to define a unique connection point in a system. Let the following generalized symbolic names be defined:

    EL1   any allowable element name.
    EID1  any allowable element identifier.
    AT1   any allowable attachment name.
    AID1  any allowable attachment identifier.

The following forms of connection points are then allowed:

    EL1(AT1)               if there is only one element EL1 and only one AT1 on EL1 in the system.
    EL1, EID1(AT1)         if there is more than one EL1 but only one AT1 on EL1, EID1.
    EL1(AT1, AID1)         if there is only one EL1 but more than one AT1 on EL1.
    EL1, EID1(AT1, AID1)   if there is more than one EL1 and more than one AT1 on EL1, EID1.

Let CONN be any of the four connection point forms above. Then a CONNECTIONS Statement is of the form:

    CONN, TO, CONN.

The connective, TO, establishes the joining together of the connections. The punctuation should appear as written in the definition.

Examples of Acceptable CONNECTIONS Statements

    PUMP1, A23(OUTLET, B), TO, HEATER, 5(INLET1).
    6L6, 1(PLATE), TO, XFRM, OUTPUT(TAP1, 3).
    SPRING, 15B(END1), TO, LEVER, 6C(AT, 3).
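The four connection point forms and the `CONN, TO, CONN.` statement can be recognized mechanically. The following Python sketch shows one way to do so with a regular expression. The names `ConnectionPoint`, `parse_point`, and `parse_statement` are hypothetical, upper-case input is assumed, and only the full notation with an explicit attachment name is handled.

```python
import re
from typing import NamedTuple, Optional, Tuple

NAME = r"[A-Z0-9]{1,6}"   # six or fewer alphanumeric characters

# EL1(AT1) | EL1, EID1(AT1) | EL1(AT1, AID1) | EL1, EID1(AT1, AID1)
POINT_RE = re.compile(
    rf"^\s*(?P<el>{NAME})\s*(?:,\s*(?P<eid>{NAME})\s*)?"
    rf"\(\s*(?P<at>{NAME})\s*(?:,\s*(?P<aid>{NAME})\s*)?\)\s*$"
)


class ConnectionPoint(NamedTuple):
    element: str
    element_id: Optional[str]
    attachment: str
    attachment_id: Optional[str]


def parse_point(text: str) -> ConnectionPoint:
    """Parse one connection point written in any of the four allowed forms."""
    m = POINT_RE.match(text)
    if m is None:
        raise ValueError(f"not a connection point: {text!r}")
    return ConnectionPoint(m["el"], m["eid"], m["at"], m["aid"])


def parse_statement(statement: str) -> Tuple[ConnectionPoint, ConnectionPoint]:
    """Parse 'CONN, TO, CONN.' into its two connection points."""
    body = statement.strip()
    if not body.endswith("."):
        raise ValueError("a statement must be terminated with a period")
    parts = re.split(r"\s*,\s*TO\s*,\s*", body[:-1])
    if len(parts) != 2:
        raise ValueError("expected exactly one TO connective")
    return parse_point(parts[0]), parse_point(parts[1])
```

An omitted element or attachment identifier is reported as `None`, which keeps the four forms distinguishable after parsing.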

The Special Cases of Unary and Binary Elements

The unary elements (one attachment) and binary elements (two attachments) allow special treatment in writing connection statements. These elements may be written without specifying attachment names, since the attachment is immediately obtained from the context. If the full connection statement notation is used, no error will result, but some saving in programming will be lost. The simulator program will assign the attachment names 1 and 2 to binary elements and the attachment name 1 to unary elements. The user must take care to use these names if reference is made to these elements using the complete notation.

Examples of Connection Statements with Unary and Binary Elements

    PUMP1, 16(OUTLET), TO, PIPE, 23, TO, HOTWEL.
    2N133, 1(CLLCTR, 1), TO, CAP, 3, TO, 2N132, 1(BASE, 1).

II. The Input-Output Requirements

In addition to the System Definition, the user must state the information requirements to be imposed on the system. That is, the parameters for which values will be supplied as data for initial and/or boundary conditions, and the parameters for which the user expects the program to produce values, must be stated to the simulator. These parameters are listed under one or the other of the input-output declarations:

    1) INPUT PARAMETERS.
    2) DESIRED RESULTS.

The user must give the source program the appropriate declaration followed by a list of the pertinent parameters. Each entry in the list of parameters consists of 1) the name of the parameter, and 2) the symbolic location of the parameter in the system.

Parameter Names

Parameter names are six or fewer alphanumeric characters, of which the first character must be alphabetic. These names must agree exactly with the parameter names used by the element descriptions. Synonym modification is not allowed.

Examples of Acceptable Parameters (with Associated Connections)

    PRESS (SUPHTR, 1(EXIT)).
    VOLTS (6SN7, 3(GRID)).

Specification of More Than One Parameter at a Point

More than one parameter may be specified at a point by giving a list separated by commas and followed by the point designation. For example:

    FLOW, PRESS, TEMP (TRBIN1, 1(INLET)).
    FORCE, VELCTY, ACCLRN (LINK, 3(END1)).

Parameter Range

The range of the input-output requirements of a parameter is established by the point designation. If the point designation is EL1, EID1(AT1, AID1), the parameter requirement will have the range of exactly one point. If the designation is EL1, EID1(AT1) and there is more than one AT1 on EL1, EID1 in the system definition, the range will be all AT1 on EL1, EID1. If the designation is EL1(AT1), the range is all AT1 on all EL1. In general, the range is defined to cover every connection bearing the designation. This concept is very useful for writing parameter input-output requirements but may trap the unwary user. (If, for example, the user gave the designation EL1, the effect is to place the parameter requirement at every attachment on every occurrence of EL1.)
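The range rule amounts to treating every omitted part of a point designation as a wildcard. A short Python sketch of that reading follows. `Point`, `Designation`, and `parameter_range` are hypothetical names, and the system is assumed to be given as a flat list of fully resolved connection points.

```python
from typing import List, NamedTuple, Optional


class Point(NamedTuple):
    """One connection point as it appears in the System Definition."""
    element: str
    element_id: Optional[str]
    attachment: str
    attachment_id: Optional[str]


class Designation(NamedTuple):
    """A point designation; any part left as None is unspecified."""
    element: str
    element_id: Optional[str] = None
    attachment: Optional[str] = None
    attachment_id: Optional[str] = None


def parameter_range(designation: Designation, system: List[Point]) -> List[Point]:
    """Return every connection point that bears the given designation."""
    def covers(point: Point) -> bool:
        # An unspecified part matches anything; a specified part must agree exactly.
        return all(want is None or want == have
                   for want, have in zip(designation, point))
    return [p for p in system if covers(p)]


system = [
    Point("EL1", "A", "AT1", "1"),
    Point("EL1", "A", "AT1", "2"),
    Point("EL1", "B", "AT1", None),
    Point("EL1", "A", "AT2", None),
    Point("EL2", None, "AT1", None),
]
# The bare designation EL1 covers every attachment on every occurrence of EL1.
everywhere_on_el1 = parameter_range(Designation("EL1"), system)
```

The bare designation in the last line is the case the text warns about: it silently selects four of the five points in this small system.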

III. Characterization of Components

The simulator must have precise information concerning the way in which the physical laws or relationships are to be treated for each component in the system. Fundamentally, the problem is one of allowing the program analytical access to a large collection of possible methods. The methods are catalogued as to the input information required and the output results produced. The simulator program then searches the catalogue for the most appropriate methods to use in the generated program. If the library of component descriptions is complete for a given system, the user will obtain a program to simulate the performance of the given system after supplying only the System Definition and the Input-Output Requirements. If the component library is inadequate for any reason, the user must then supply additional information concerning the component. This is done by using the declaration "ELEMENT DESCRIPTION." followed by a collection of assertions and statements conveying the information.

Assertions Within Element Descriptions

An assertion, like a declaration, conveys special information to the simulator program but, unlike the declaration, does not terminate the scope of the declaration. The "ELEMENT DESCRIPTION." language has four assertions and two forms of statements.

Element Name Assertion

The element name assertion defines the symbolic name of six or fewer alphanumeric characters by which the description will be recognized. This is the true name of the element description. Let ELNAME stand for any symbolic name; then the element name assertion is:

    NAME OF ELEMENT ELNAME.

Examples of Allowable Element Name Assertions

    NAME OF ELEMENT PUMP1.
    NAME OF ELEMENT 2N133.
    NAME OF ELEMENT SPRING.

Parameter Scope Assertion

So that an attachment may be identified, the concept of parameter scope must be implemented. The parameter scope concept classifies parameters into Broad Scope parameters and Narrow Scope parameters by applying the following rules:

A parameter is a Broad Scope parameter if, when considering this parameter at an identified attachment, it is true that when the value of the parameter has been established at any one of the identified attachment points it has automatically been established for every other identified point of that attachment.

A parameter is a Narrow Scope parameter otherwise. In particular, a parameter is a Narrow Scope parameter when a requirement for a value of this parameter at an attachment point automatically requires individual determinations of the value of the parameter at each identified point of the attachment.

The user may declare a parameter to be of Broad Scope with the assertion:

    BROAD SCOPE PARAM1, PARAM2, PARAM3.

where PARAM1, PARAM2, PARAM3 stand for any symbolic parameter names. The simulator program will assume a parameter to be of Narrow Scope unless otherwise specified in every component description using the parameter contained in either the Permanent or the Temporary Library. A parameter not used by an Element Description is thus excluded from this assumption. Failure to assert the scope of parameters properly may result in failure to generate programs that should lie within the scope of the simulator, but programs that are generated will yield correct results even though redundant calculations were programmed.

Examples of Parameter Scope Assertions

    BROAD SCOPE PRESS, TEMP.
    BROAD SCOPE VOLTS.

Typical broad scope parameters are pressure, temperature, and voltage, since if these parameters are established at any identification of any attachment, all other identifications of the same attachment must have the same value for the parameter. Typical narrow scope parameters are flow, current, and similar parameters, since the value for the attachment requires the values at every identification of the attachment.

Library Status Assertion

The elevation of an Element Description to Permanent Library status should be made only after the Element Description is thoroughly checked and very generally useful. When the user feels that these requirements are met, the status may be made permanent by giving the assertion:

    PERMANENT.

After an Element Description is entered in the Permanent Library, it may be removed only by rewriting the entire Permanent Library. Once entered in the Permanent Library, the Element Description does not have to appear with the source program deck to generate programs using the element.

The preceding three assertions may be made in any order but, if given, must follow the Element Description declaration and precede the first Statement Collection assertion for a given element description.

Statement Collection Assertion

The Statement Collection assertion prepares the simulator so that the statements following the assertion will be processed to form the element description capability. The statement collection assertion has the form:

    STATEMENT COLLECTION.

Collection Capability Statement

Immediately following the statement collection assertion, the collection capability is stated. The capability language conveys the input parameter requirements and the result capability of the collection. The parameters are given in exactly the parameter language form of the Input-Output Requirements. The Collection Capability requirements add three special words to the simulator language: 1) WITHOUT, 2) THEN, 3) ESTIMATE.

The Connection Implication, THEN

The word THEN, set off by commas, separates and identifies for the simulator the output results of a statement collection. This may be best illustrated by example. Suppose the user wishes to convey to the simulator the capability of an element description that would allow the determination of enthalpy at a point if pressure and entropy were known. This capability might be expressed:

    PRESS, ENTRPY(OUTLET), THEN, ENHLPY(OUTLET).
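A capability statement of this kind can be split into the inputs required and the results produced. The Python sketch below is one possible reading, not the simulator's own procedure. `Capability` and `parse_capability` are hypothetical names, upper-case parameter names are assumed, and a WITHOUT clause on the input side is collected separately as a list of excluded attachments.

```python
import re
from typing import List, NamedTuple, Tuple

# One or more comma-separated parameter names followed by "(point designation)".
GROUP_RE = re.compile(r"((?:[A-Z][A-Z0-9]*\s*,\s*)*[A-Z][A-Z0-9]*)\s*\(([^()]*)\)")


class Capability(NamedTuple):
    inputs: List[Tuple[str, str]]    # (parameter, point) pairs required
    outputs: List[Tuple[str, str]]   # (parameter, point) pairs produced
    without: List[str]               # attachments that must be absent


def _pairs(side: str) -> List[Tuple[str, str]]:
    """Expand 'P1, P2(POINT)' groups into one (parameter, point) pair per name."""
    pairs = []
    for names, point in GROUP_RE.findall(side):
        point = re.sub(r"\s+", "", point)   # drop blanks inside the designation
        for name in re.split(r"\s*,\s*", names):
            pairs.append((name, point))
    return pairs


def parse_capability(statement: str) -> Capability:
    """Split a capability statement at the implication THEN."""
    body = statement.strip().rstrip(".")
    sides = re.split(r"\s*,\s*THEN\s*,\s*", body)
    if len(sides) != 2:
        raise ValueError("expected exactly one THEN set off by commas")
    left = _pairs(sides[0])
    inputs = [(n, p) for n, p in left if n != "WITHOUT"]
    without = [p for n, p in left if n == "WITHOUT"]
    return Capability(inputs, _pairs(sides[1]), without)
```

Catalogued in this form, each statement collection exposes exactly the two sets, inputs and outputs, that a search for methods would need to compare against outstanding requests.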

A capability statement may involve any number of input parameters and any number of output results, but only one capability statement may be given by the user for each statement collection.

The Restrictive WITHOUT

To avoid the problems associated with having many similar element descriptions for components that are basically alike but have different attachments, the user is permitted to restrict a Statement Collection to apply only to those elements of the type that are without certain attachment points. For example, this allows one element description to be written for a turbine stage that treats both stages with an extraction point and stages without an extraction point. The collection handling a turbine stage without an extraction might use the capability statement:

    FLOW (INLET), WITHOUT (EXPT), THEN, FLOW (OUTLET).

Clearly the word WITHOUT conveys the information required to prevent the use of the statement collection that would follow the capability statement in the case of the turbine stage with an extraction point.

ESTIMATE, The Iterative Solution Collection Indicator

Often physical systems are simulated most conveniently through iterative algorithms. That is, the program is so structured that an initial estimate is improved by repeated calculation. If the user wishes to present a collection of MAD statements to the simulator library to allow the use of such a technique, the form is such that useful results are apparently produced without requiring any input information. Such a collection should be used only if the parameter produced by the collection has been calculated independently by some other method. In order to inform the simulator that a statement collection is iterative in form, the word ESTIMATE is given, followed by the list of parameters to be estimated (and later calculated). When this is done, the simulator will allow the use of the iterative collection only when the parameter has been calculated by some other means later in the program. An acceptable capability statement for the ESTIMATE control word is:

    ESTIMATE, FLOW (EXTRCT).

Restricted Collections

Certain types of statement collections that are extremely useful in writing element descriptions are such that, if they occur in a program, they may not be used more than once at any given attachment point. As an example of such a collection, consider the continuity equation for mass flow applied at a branching junction. Suppose, for illustration, that there is a single inlet stream designated at the attachment INLET but five outlet streams at the attachments EXIT, 1; EXIT, 2; EXIT, 3; EXIT, 4; and EXIT, 5. If the flow were found at the inlet and at exits 1, 2, 4, and 5, then continuity allows the flow at 3 to be found by the relation:

$$\mathrm{FLOW}_{3} = \mathrm{FLOW}_{\mathrm{INLET}} - \sum_{\substack{i=1 \\ i \neq 3}}^{5} \mathrm{FLOW}_{\mathrm{EXIT},\, i}$$

Clearly this is a most useful collection form, but its use must be restricted to one occurrence at any given attachment. Of course, the collection may be used once and only once at any pertinent attachment point in the system and, therefore, the collection may appear several times in a program, each time at a different attachment.
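The continuity relation for the branching junction is simple enough to state as a few lines of Python. The function name `missing_exit_flow` and the numerical values are illustrative assumptions only.

```python
from typing import Dict


def missing_exit_flow(inlet_flow: float, exit_flows: Dict[int, float], missing: int) -> float:
    """Flow at the one undetermined exit: inlet flow minus the sum of all other exit flows."""
    return inlet_flow - sum(flow for i, flow in exit_flows.items() if i != missing)


# Hypothetical values: flows known at the inlet and at exits 1, 2, 4 and 5.
known_exits = {1: 10, 2: 20, 4: 30, 5: 15}
flow_3 = missing_exit_flow(100, known_exits, missing=3)
```

With these values the known exits account for 75 units of the 100 entering, so `flow_3` is 25. Using the same relation a second time at the same junction would only restate a flow already determined, which is why the collection is restricted to one use per attachment.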

The collection capability statement for this type of collection uses an identified attachment name for at least one of the attachment names that apply. The capability statement for the preceding illustration is:

    FLOW (INLET), FLOW (EXIT, i), THEN, FLOW (EXIT).

or the equally correct forms

    FLOW (INLET), FLOW (EXIT), THEN, FLOW (EXIT, j).

or

    FLOW (INLET), FLOW (EXIT, i), THEN, FLOW (EXIT, j).

Note that the symbols i and j are completely arbitrary and therefore open to the user's choice. Only one form is not recognized as meaningful by the simulator. That form omits the identified attachment on both sides of the implication THEN. The previous illustration written incorrectly is:

    FLOW (INLET), FLOW (EXIT), THEN, FLOW (EXIT).

This form effectively says that the exit flow is known if the exit flow is known. This is not meaningful to the simulator because of the apparent redundancy.

The Collection of Statements

Immediately following the capability statement, the user must supply the program statements that will produce the results claimed. This simulator program adopts the Michigan Algorithm Decoder (M.A.D.) language as the medium for the program statements. The user must write the Statement Collection conforming exactly to the format and coding restrictions and conventions of that language. Complete details and manuals for the M.A.D. language are available from the Computing Center of The University of Michigan. It is presumed here that the user is familiar with the M.A.D. language.

The simulator program allows the complete structure of the M.A.D. language to be employed in implementing statement collections. Two special symbols are the multiple punch symbols plus zero (⊕) and minus zero (⊖). The plus zero is punched by depressing and holding the multiple punch key on the IBM 026 keypunch and then striking the plus sign and the numeric zero. The resulting combination of holes in the IBM card is 12-0. (The equivalent 12-2-8 combination will not be read properly by the peripheral 714 Card Reader.) The minus zero is produced by depressing and holding the multiple punch key and then striking the minus (not the dash) and the zero. The resulting combination of holes is 11-0. (The equivalent 11-2-8 will not be read properly by the 714 Card Reader.) These symbols function as special brackets or parentheses for the simulator. Since the user should have no need for these symbols in his M.A.D. statements, their use is specifically restricted to conveying information from the M.A.D. statements to the simulator.

Meaning and Use of the Symbol ⊖

The symbol ⊖ is used to delimit two types of statement segments: 1) function substitutions, and 2) floating statement labels.

The function substitution use allows modifications to be made in the actual program generated by the simulator at the time of the execution of the generation. For example, suppose that the element description for the element TRBIN1 has need for the stage efficiency of the turbine stage and that the different turbine stages require different functions to describe the efficiency. The statement collection might be written:

    EFF=⊖ETA⊖.(⊕FLOW(INLET)⊕)

and at execution of the program, if this collection were required, a check would be made to determine whether a substitution of names was desired for ETA. Thus the program might be made to produce ETA1 for ETA when stage one is used, ETA2 for ETA when stage two is used, and so on. If no substitution is given at execution time, the program will use ETA. The ⊖ symbols are deleted from the object program.

The floating statement label allows the use of statement labels in statement collections. Since an element may appear any number of times in a system, and the same statement collection might therefore occur any number of times in a program simulating the system, the user must not use fixed statement labels within a statement collection. If this were done, the program generated might be ambiguous. Floating statement labels allow the simulator to generate unique statement labels and thus eliminate any ambiguity in labels. The user wishing to use a floating statement label writes:

    ⊖**XXXXX⊖

where the X's represent any statement label of six or fewer alphanumeric characters that the user may care to use in his statement collection. The floating statement label is initiated by the ⊖ followed by two asterisks (*) and is closed by the ⊖. The simulator will generate a unique statement label, replace the entire floating statement label by the unique label, and use this unique label whenever the same floating statement label is found in this local statement collection. Should the collection appear again in the object program, another unique statement label will be generated.

Example of Floating Statement Label

               THROUGH ⊖**A⊖, FOR I = 1, 1, I.G.10
               WHENEVER T.G.TMAX, TRANSFER TO ⊖**A⊖
    ⊖**A⊖      CONTINUE

The Meaning and Use of the Symbol ⊕

The symbol ⊕ is used to delimit the parameter and attachment names for which the simulator must generate unique variable names of six or fewer alphanumeric characters. If it is possible to extract the first three characters of the parameter name and produce a unique symbol, this will be done. In this way the mnemonic significance will be preserved so far as possible. If conflicts occur, the simulator will generate a unique symbol for the parameter and use this symbol throughout the object program. In generating the unique symbol, the procedure is first to build up a symbol of three non-blank alphabetic characters and, if conflict still exists, to modify this symbol with probability 0.10 of changing the first character to a random alphabetic, probability 0.30 of changing the second character to a random alphanumeric, and probability 0.60 of changing the third character to a random alphanumeric. The process continues until a unique symbol is generated. A maximum of 70 different parameters may occur at each attachment in any system to be simulated.
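The generation of a unique three-character parameter code can be sketched in Python as follows. This is one interpretation of the rule, not the simulator's routine: each modification step is assumed to pick a single position with probabilities 0.10, 0.30 and 0.60 and replace it. The name `unique_code`, the injectable random generator, and the padding of names shorter than three characters with `X` are all assumptions.

```python
import random
import string
from typing import Optional, Set

ALPHA = string.ascii_uppercase
ALNUM = string.ascii_uppercase + string.digits


def unique_code(name: str, used: Set[str], rng: Optional[random.Random] = None) -> str:
    """Return a three-character code for `name` that is not already in `used`."""
    rng = rng or random.Random()
    # Start from the first three characters to preserve mnemonic significance.
    # Padding short names with "X" is an assumption of this sketch.
    code = list((name[:3] + "XXX")[:3])
    while "".join(code) in used:
        # Choose which character to change: first 0.10, second 0.30, third 0.60.
        pos = rng.choices([0, 1, 2], weights=[0.10, 0.30, 0.60])[0]
        # The first character stays alphabetic; the others may be alphanumeric.
        code[pos] = rng.choice(ALPHA if pos == 0 else ALNUM)
    result = "".join(code)
    used.add(result)
    return result
```

Weighting the changes toward the third character keeps the leading characters, and hence most of the mnemonic value, intact in the common case where only a few names conflict.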

Let PARAM represent any true parameter name, ATTCH represent any true attachment name, and AID represent any attachment identifier. Then the following forms are allowed within the scope of a pair of plus zero symbols (⊕):

    ⊕PARAM(ATTCH, AID)⊕
    ⊕PARAM(ATTCH)⊕
    ⊕(ATTCH, AID)⊕
    ⊕ATTCH, AID⊕
    ⊕(ATTCH)⊕
    ⊕ATTCH⊕

No other forms are permitted within the scope of ⊕. The attachment within the scope of the ⊕ will cause a unique three digit attachment point number to be determined from the System Definition. The unique parameter name code and the three digit attachment code are combined to form a six character variable name for use by the M.A.D. translator. In case no parameter occurs in the ⊕ scope, as in the latter four allowable forms, the parameter code is generated as three blank columns. The user may use this feature in whatever way may be logically useful in his statement collection.

Because of possible ambiguity, the user may not use more than one identified attachment within a given statement. Any number of occurrences of the same identified attachment may appear with the same or different parameter names within the same statement. When an identified attachment is encountered, the simulator will produce a copy of the statement for every identified attachment occurring in the System Definition for the identified element concerned with this statement collection.

The only exception to this rule occurs when a statement collection involves an identified attachment name agreeing exactly with the point of attachment in the system that caused the collection to be chosen. For example, suppose that the point OUT is identified on some element and that a parameter, say PRESSR, occurring at OUT has caused a collection to be selected in which the flow at OUT is treated for identified attachments named OUT. In this case, a copy of the statement will be produced for every identified attachment OUT except the current one for which the parameter PRESSR was required. The "all except the current point" rule thus becomes "all points" if the current point is not of the same name. Furthermore, if the attachment name within the scope of the plus zero symbol (⊕) is not identified, the rule is to use the current point designation if the name agrees with the attachment name, and otherwise to use the first occurrence of the attachment name on the current element encountered in the System Definition.

These rules allow the user of the Element Description Library to write simple but very powerful statements involving identified attachments. Summarizing, the user may:

1. Write a collection referring to all identified attachments of an element.

2. Write a collection referring to all identified attachments except the current one, if the current point name agrees with the identified attachment.

3. Write a collection referring only to the first occurrence of an attachment name, or to the current point if it has the same name, with priority awarded to the current point.
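The three rules summarized above can be expressed as one small selection function. The following Python sketch is an illustration under assumed data shapes. `points_for_statement` is a hypothetical name, and each attachment point of the current element is represented as an (attachment name, attachment identifier) pair listed in System Definition order.

```python
from typing import List, Tuple

AttachPoint = Tuple[str, str]   # (attachment name, attachment identifier)


def points_for_statement(points: List[AttachPoint], name: str,
                         identified: bool, current: AttachPoint) -> List[AttachPoint]:
    """Select the attachment points for which a statement is generated."""
    same_name = [p for p in points if p[0] == name]
    if identified:
        # One copy per identified attachment of this name, except the current
        # point. If the current point has a different name, nothing is removed
        # and the rule becomes "all points".
        return [p for p in same_name if p != current]
    if current[0] == name:
        # Unidentified name agreeing with the current point: use the current point.
        return [current]
    # Otherwise use the first occurrence of the name on the current element.
    return same_name[:1]


element_points = [("IN", "1"), ("OUT", "1"), ("OUT", "2"), ("OUT", "3")]
copies = points_for_statement(element_points, "OUT", True, current=("OUT", "2"))
```

Here `copies` holds the points OUT, 1 and OUT, 3, which corresponds to the PRESSR example: the statement is copied for every identified OUT except the one that caused the collection to be chosen.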

Examples of the Use of 0

           EXECUTE TRBIN1 (0PRESS(INLET)0,0ENHLPY(INLET)
          1,0PRESS(OUTLET)0,0ENHLPY(OUTLET)0,0ETA0
          2FLOW(OUTLET)0

The numerals 1 and 2 in column 11 are continuation marks exactly as in the usual M.A.D. language.

The following group of statements might be used to sum the current flow at an attachment that may be identified.

           CURRNT=0.
           CURRNT=0CURRNT(BASE,1)0+CURRNT

If K other components were attached to BASE by using attachment identification in the System Definition, the result of the two previous ELEMENT DESCRIPTION statements will be K+1 object program statements that will produce the total current flow at the BASE attachment.

All other characters occurring outside the scope of 0 and/or + 0 symbols, including blanks, are automatically passed directly to the object program. Correct M.A.D. card formats are automatically produced, as are continuation cards if needed. Remark cards are automatically produced, as in M.A.D., by placing a character R in column 11.

End of Element Description Declaration

As many statement collections can be given as may be required to completely state the performance characteristics of the component. When the description has been completed, the process is terminated by giving the declaration:

    DESCRIPTION FINISHED.

This declaration must be given. Failure to do so will cause a simulator error that will finally throw the job off the computer.

Example of an Element Description

The following element description is intended to illustrate the Element Description Language and would undoubtedly require more capability to be of general use. The increased capability may be obtained by adding as many more statement collections as needed.

ELEMENT DESCRIPTION. NAME OF ELEMENT TRBIN. BROAD SCOPE PRESS, TEMP. PERMANENT. STATEMENT COLLECTION. FLOW (INLET), FLOW(EXPT), THEN, FLOW (EXIT). FLOWl=0 FLOW1=FLOWl+~ FLOW (INLET, 1) FLOWl=FLOWL- FLOW (EXPT,l)t 6FLOW(EXI =FLOW1 STATEMENT COLLECTION. PRESS(INLET) FLOW(EXIT), THEN, PRESa(EXIT),PRESS(EXPT)o 5PRESS(EXIT )tPRESS( INLET)Y 5PRATIOO. (SLOWlIEXIT)$) WHENEVER FIRST, READ FORMAT DATA(l), PDPOEXPTO PPRESS(EXPT )=( 1.-PDPtEXPTt).*PRESS (EXIT ) STATEMENT COLLECTION. PRESS, ENHTPY, FLOW (INLET), PRESS(EXIT), THEN, ENHLPY (EXIT), KWH(SHAFT). FLOW=O. FLOW=FLOW+$FLOW( INLET, 1)$ EXECUTE TR BIN1 (&PRESS(INLET)~, EENHLPY(INLET) 1 O,OPRESS(EXIT)O, OETAO. (FLOWl), FLOW OENHLPY 2 (EXIT),,6KWH(SHAFT)) ) EQUIVALENCE (tPRESS(INLET 6 6PRESS( (INNLET 1)S) EQUIVALENCE (6PRESS (EXIT 6), SPRESS (EXIT, 1 ) STATEMENT COLLECTION o FLOW(INLET), WITHOUT(EXPT), THEN, FLOW(EXIT)o OFLOW(EXIT )=O. OFLOW(EXIT )=OFLOW(EXIT )&+FLOW(INLET, 1) DESCRIPTION FINISHED

IV. Utility Declarations

The foregoing three areas of information constitute a necessary and sufficient amount of information for the simulator program to accomplish the generation of programs. There are some additional features of the simulator language which are not strictly essential but which allow the user to produce better programs in some cases and to produce programs more easily in others. Still other declarations are for use in "housekeeping", that is, for making the job of Library Maintenance easier. These declarations and their associated statements are called the "utility" declarations. The declarations are:

1) FUNCTION SUBSTITUTIONS.
2) SYNONYMS.
3) NEW ELEMENT TAPE.
4) NEXT SET OF DATA.

FUNCTION SUBSTITUTIONS Declaration

When the author of the library Element Descriptions so desires, the descriptions may be written so that minor changes can be made at the time of the program execution. The proper method of writing this capability into Element Descriptions was treated in that section. The remaining task is that of allowing the user to exploit this capability in his programs. This is done by giving a FUNCTION SUBSTITUTIONS declaration followed by function substitution statements. The declaration form is:

    FUNCTION SUBSTITUTIONS.

Function Substitution Statements

The function substitution statements convey to the simulator the substitution to be made and the scope of the substitution. Any six or

fewer character alphanumeric word may be substituted for any other word enclosed in 0 in a statement collection. The form of the statement is defined as follows:

Let NUWORD be any new word of six or fewer characters;
    OLDWRD be any old word of six or fewer characters;
    EL be any symbolic element name;
    EID be any symbolic element identification.

Then the allowable function substitution forms are:

    NUWORD, OLDWRD.
    NUWORD, OLDWRD(EL).
    NUWORD, OLDWRD(EL,EID).

The effect of this statement is to replace OLDWRD, when it occurs within 0 and at the element specified, by NUWORD. The scope of the substitution is controlled as follows:

1) For the statement form NUWORD, OLDWRD(EL,EID). the substitution will take place only for statements generated for EL, EID.
2) For the statement form NUWORD, OLDWRD(EL). the substitution will take place for statements generated for any occurrence of EL.
3) For the statement form NUWORD, OLDWRD. the substitution will take place for any object program statements generated.

If the function substitution declaration is not used, the original contents of the 0 in the statement collection will be used. If the user misspells the OLDWRD, the substitution will not occur. If the user misspells the NUWORD, the object program will contain the error.

Examples of Function Substitutions

Suppose that the element description TRBIN contains the statement:

    DELTAH=DELTAH*0ETA0.(0FLOW(INLET)0)

and several components TRBIN occur in the system. The characteristics of

each turbine may be different, and the user may obtain different functions to represent these characteristics by using the following statements:

    FUNCTION SUBSTITUTIONS.
    ETA1, ETA(TRBIN,1).
    ETA2, ETA(TRBIN,2).
    ETA3, ETA(TRBIN,3).

When this is done the object program statements will use the functions ETA1, ETA2, and ETA3 in place of ETA.

SYNONYMS Declaration

The synonyms declaration allows the use of different symbolic names in writing connection points. This declaration also may be used to condense the connection point symbol strings to a single word or a few words. Both of these uses are introduced for the convenience of the user. It is not necessary to use synonyms to write programs, but the use of synonyms may be of considerable assistance. The form of the declaration is:

    SYNONYMS.

The synonyms declaration is followed by any number of synonym statements.

Synonym Statements

Synonym statements convey to the simulator the true symbolic name or string and the symbols that are synonymous with the true name or string. The only restriction is that the true name or string must be the first part of the statement. Equal sign symbols are used to separate the true part of the statement and any synonymous parts. If any portion of a connection symbol string is written as a single zero (0), that portion will be left untouched. Let the following symbols be defined:

Let ELT be any true element name;
    EIDT be any true element identification;
    ATT be any true attachment name;
    AIDT be any true attachment identification;
    ELS be any element name synonym;
    EIDS be any element identification synonym;
    ATS be any attachment name synonym;
    AIDS be any attachment identification synonym.

Then the allowable forms of synonyms are:

Type I
    ELT=ELS1=ELS2=...=ELSn.
    0,EIDT=0,EIDS1=0,EIDS2=...=0,EIDSn.
    (ATT)=(ATS)1=(ATS)2=...=(ATS)n.
    (0,AIDT)=(0,AIDS)1=(0,AIDS)2=...=(0,AIDS)n.

The single zero must be punched as indicated. The substitution of the true name will occur unconditionally for any synonym on the right.

Type II
    ELT,EIDT=ELS1=...=ELSn.
    ELT,EIDT(ATT)=ELS1=...=ELSn.
    ELT,EIDT(ATT,AIDT)=ELS1=...=ELSn.
    ELT(ATT)=ELS1=...=ELSn.
    ELT(0,AIDT)=ELS1=...=ELSn.
    (ATT,AIDT)=(ATS)1=...=(ATS)n.

In the preceding group, the occurrence of the single synonym symbol will cause the use of the entire true symbol string.

Type III
    ELT(ATT)=(ATS)1=...=(ATS)n.
    ELT(0,AIDT)=(0,AIDS)1=...=(0,AIDS)n.
    ELT(ATT,AIDT)=(ATS)1=...=(ATS)n.

In the preceding group, the synonyms are restricted to apply only to the true attachment names and/or attachment identifications associated with the true element name, independent of element identification.

Type IV
    0,EIDT(ATT)=(ATS)1=...=(ATS)n.
    0,EIDT(ATT,AIDT)=(ATS)1=...=(ATS)n.
    0,EIDT(0,AIDT)=(0,AIDS)1=...=(0,AIDS)n.

In the preceding group, the synonym substitution occurs for all elements identified EIDT regardless of the element.

Type V
    ELT,EIDT(ATT,AIDT)=ELS,EIDS(ATS,AIDS)1=...=ELS,EIDS(ATS,AIDS)n.

In this type of synonym, the synonymous groups are replaced by the true names in a one-to-one substitution. No other synonym forms are allowed. The utility of these groups is best illustrated by example.

1) Suppose the synonym statement is given:

    PUMP1=PMP1=PMP=P.

Whenever the user writes PMP1, PMP, or P as an element name in a connections or input-output statement, the name PUMP1 will be used as the true name.

2) Suppose the synonym statement is given:

    PUMP1, MAIN=PMP1.

Whenever PMP1 is used as an element name in a connections or input-output statement, the element name PUMP1 and the element identification MAIN will be used as the true names.

3) Suppose the synonym statement is given:

    PMP1(OUTLET,PRIMRY)=(X1).

Whenever (X1) is used as an attachment with PMP1, regardless of element identification, the symbols OUTLET and PRIMRY will be used as the true names.

As the user gains familiarity with the synonym capability, occasionally large reductions in the amount of punching required for connections and input-output requirements may be obtained. But it must be emphasized that this is completely a matter of convenience and does not increase the capability of the simulator.
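The first example above (an element-name synonym of Type I) can be illustrated with a short sketch. It is written in Python rather than in the simulator's own language, and the function names parse_element_synonyms and resolve are hypothetical; it handles only the simplest form, TRUE=SYN1=SYN2=...=SYNn., and is not a description of how the simulator itself stores synonyms.

```python
def parse_element_synonyms(statement):
    """Parse 'TRUE=SYN1=SYN2.' into a table {synonym: true name}.

    Assumption: only the plain element-name form is handled; the
    parenthesized and zero-punched forms are outside this sketch.
    """
    parts = [p.strip() for p in statement.strip().rstrip(".").split("=")]
    true_name, synonyms = parts[0], parts[1:]
    return {syn: true_name for syn in synonyms}


def resolve(name, table):
    """Return the true name for a synonym; other names pass through."""
    return table.get(name, name)


table = parse_element_synonyms("PUMP1=PMP1=PMP=P.")
print(resolve("PMP", table), resolve("TRBIN", table))
```

The true name must come first in the statement, as the text requires, so the table always maps from the right-hand parts to the leftmost part.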

NEW ELEMENT TAPE Declaration

The declaration NEW ELEMENT TAPE, if given, must precede the entire collection of Element Descriptions that are to be used in the program, and in particular precedes the group of statements known as the Prologue and Epilogue. In this way, new collections of elements can be made and new prologue and epilogue statements can be produced. The form of the declaration is:

    NEW ELEMENT TAPE.

Immediately following this declaration the user must give a set of prologue and epilogue statements. This collection of statements will be common to every program generated using this element tape. The prologue is charged with bringing in the input parameters, making certain initializations, testing for the completion of the calculation and printing the desired results. The epilogue is charged with transferring the program back to the prologue for testing. The epilogue section is entered automatically at the end of the statements generated to simulate the system. The prologue automatically precedes the statements generated to simulate the system.

Rules for Writing Prologue and Epilogue Collections

The rules for writing the Prologue and Epilogue collections are the same as for Element Descriptions with three exceptions:

1. The symbol 0, when encountered for the first and second time, triggers the repetitive generation of statements containing the input parameters. Multiple copies of the statement will be generated; the copies will differ only in the parameters, and the parameters will be grouped by attachment point. After the completion of this task, a complete dictionary of the input parameters will be produced on Remark Cards.

The symbol 0, when encountered for the third time, triggers the repetitive generation of statements containing the desired results parameters. Multiple copies of the statement are again generated, one statement for each desired attachment point as before. On the completion of this generation, a second dictionary of Remark Cards is produced for the desired results. This is the only permitted use of the symbol 0, and it must occur in the Prologue. The symbol 0 is not permitted in the Epilogue. The use of three 0 symbols thus allows 1) reading in the input parameters, 2) printing of the input parameters for verification, and 3) printing the desired result parameters as solutions. The symbol 0 as used in the Element Description is not affected in any way by the Prologue and Epilogue rules.

2. The minus sign occurring in column 1 must appear twice in the Prologue. The first occurrence marks the point after which the program has completed testing and is ready to produce the desired results printing. The second occurrence marks the end of the Prologue. Both minus signs in column 1 must appear in the Prologue. The minus signs in column 1 must not appear on Remark Cards.

3. The plus sign, occurring in column 1, must appear once at the end of the Epilogue. The occurrence terminates the processing of the Prologue and Epilogue. The plus sign is located on the last actual executable statement card, and must not appear on a Remark Card.

Typical Prologue and Epilogue Collection

The following prologue and epilogue collection is offered as an example of a generally useful collection. Many of the basic features of this collection would be common to any collection of prologue and epilogue statements. The statements themselves must conform to M.A.D. formats and restrictions.

NEW ELEMENT TAPE.

Column
1         11
          R SYSTEM SIMULATION PROGRAM
          R PROLOGUE BEGINS
START      READ FORMAT ICARD, NOTRYS
           VECTOR VALUES ICARD=$7I10$
           TRYCNT=1
           FIRST=1B
           REPEAT=0B
           INTEGER NOTRYS, TRYCNT
           BOOLEAN FIRST, REPEAT
           READ FORMAT WORDS, DATA(1)...DATA(12)
           VECTOR VALUES WORDS=$12C6*$
           READ FORMAT DATA(1),0
           PRINT FORMAT DATA, 0
           DIMENSION DATA(12)
           VECTOR VALUES DATA=$1H0$
           TRANSFER TO BEGIN
BACK       WHENEVER TRYCNT.L.NOTRYS .AND. REPEAT
           REPEAT=0B
           FIRST=0B
           TRYCNT=TRYCNT+1
           TRANSFER TO BEGIN
           END OF CONDITIONAL
           WHENEVER REPEAT, PRINT FORMAT REMARK,
          1NOTRYS
           VECTOR VALUES REMARK=$18H0NO CONVERGENCE IN I5,
          18H TRIALS.*$
           READ FORMAT WORDS, DATA(1)...DATA(12)
           PRINT FORMAT DATA(1),
           TRANSFER TO START
          R END OF PROLOGUE
-BEGIN     CONTINUE
          R END OF GENERATED SIMULATION PROGRAM
          R EPILOGUE BEGINS
           TRANSFER TO BACK
          R END OF EPILOGUE
+          END OF PROGRAM

Program Continuation Declaration

If there is more than one system to be simulated in one approach to the computer, a statement is needed to signal the end of one problem and set signals to return for further problems upon completion of the current one. The declaration accomplishing this signalling is:

    NEXT SET OF DATA.

This statement must be contained on the card preceding the first card of the next problem. Otherwise, the first card of the next problem will be skipped in processing. If there is no next problem, there is no NEXT SET OF DATA declaration and return is made to terminate the simulator program.

General Simulator Problem Considerations

The statements, assertions and declarations of the Simulator Language may usually be presented without particular concern for ordering and grouping of the statements. That is, CONNECTIONS declarations and statements may be intermixed with INPUT PARAMETERS, DESIRED RESULTS, FUNCTION SUBSTITUTIONS and SYNONYMS. Only those statements pertaining to the Libraries are somewhat restricted in order. NEW ELEMENT TAPE must precede the Prologue and Epilogue and all ELEMENT DESCRIPTION declarations, assertions and statements. The statements in ELEMENT DESCRIPTION are restricted somewhat. Reference should be made to that section of this paper for the exact restrictions. Beyond this restriction it should be noted that, while no error will result to prevent execution of the simulator, considerable savings in time of processing can be made by placing all Element Descriptions containing the assertion PERMANENT before those without this

assertion. If this is not done, the Temporary Library must be saved, the new Permanent Library entry made, and the Temporary Library restored after each Permanent entry. This is not a fast procedure at best but will be done if required for processing.

The current simulator is restricted in the size of system that can be simulated. The version for 704 Electronic Data Processing Machines with 8192 word core storage, 8192 word drum storage and 6 tapes will accommodate 200 connection statements. Simple revisions for 32768 word core storage machines would accommodate 1000 connection statements. Each connective TO produces one connection statement. Provisions are made for 125 Element Descriptions. Each element is allowed a maximum of 20 different kinds of attachments. If attachment identifiers are used, no limit is placed on the number of identifiers for each attachment. Each attachment and the system may treat up to a total of 70 different parameter types. That is, each attachment is involved with the same set of parameter types and the total number of different types in the system may not exceed 70. The number of INPUT PARAMETERS may not exceed 200. The same limit applies to DESIRED RESULTS. The number of SYNONYMS may not exceed 400. The number of FUNCTION SUBSTITUTIONS may not exceed 400. The number of card images allowed in one statement generated for the object program is limited by M.A.D. to the first card and up to nine continuation cards. The number of statements in a statement collection, as well as the number of statement collections in an element description, is limited only by the length of a magnetic tape reel. No practical limitation is expected in this area for some time.

There is no limit on the number of times an element may appear in a system. The only restriction is that each unique element attachment may be used only once. This is a restriction to prevent ambiguity in the system definition and not a size limitation.

The scope of a parameter is assumed to be narrow unless the parameter is specifically defined to be of broad scope in every element description in both libraries that refers to the parameter.

With the exception of the declaration NEXT SET OF DATA and the statements for Element Descriptions, the simulator statements may be prepared anywhere within columns 1 through 72 of IBM cards and may run from card to card or contain more than one statement per card. No continuation marks are used, but every simulator statement (but not the M.A.D. statements in element descriptions), assertion and declaration terminates with a period (decimal point, punched 12-3-8).

Remark Cards for M.A.D. that will be produced in every object program carry an R in column 11. Remark Cards for the current simulator job carry a division slash / in column 1. Any card with / in column 1 is ignored, but printed, by the simulator.

THE STRUCTURE OF THE SIMULATOR TRANSLATOR

After a language for communication of information concerning a system to be simulated is established, the job of the simulator program remains. That job is the translation of the various statements allowed by the language into an algorithm or solution procedure for the system simulation requested. This is accomplished by several sections of program. The sections and their functions are:

1) Preprocessing
2) Desired Result Reduction
3) Program Generation

The preprocessing phase consists of decomposing, analyzing and regenerating the information from the source program statements in a form more easily handled by the machine. Input Parameters and Desired Results are saved in a very condensed form. Since each attachment point may have up to 70 parameters and these may fall into two groups (input parameters and desired results), each point must retain information on 140 items. Thus 200 attachments require 28000 items to be stored. These items are, fortunately, Boolean constants. In particular, the Boolean constant for an Input Parameter is 1 (True) if the parameter has been stated to be given. The constant is 0 (False) otherwise. For desired results, the constant is 1 if the parameter is required as an output and 0 otherwise. Since the 704 computer is a binary machine, it is possible to identify each of the 36 binary digits in

the 704 word with a specific parameter and thus save the status of 36 parameters in a single storage location. The entire parameter status is compressed into four words for each attachment by the Preprocessing section.

A second task of the Preprocessing section is to generate an image of the system to be simulated within the machine. This is done by forming a connection matrix. This array retains the nature of each attachment point. Each attachment is entered as the joint between two elements. Four items are required to specify uniquely an attachment on an element, and thus eight locations specify a connection. The result is an 8 × n matrix, where n is the number of connections in the system. The matrix entries are the true names of the elements, attachments and identifiers, either as supplied directly by the connection statements or as replaced by synonyms.

The connection matrix thus generated may be quite disordered so far as efficient processing is concerned. After the matrix is completely entered in the machine, a sorting is done using an indirect list address array to arrange the matrix in the order of occurrence of the Element Descriptions on magnetic tape. The indirect address lists allow the matrix to remain stationary in memory while the effective order is completely changed. Since the finding of information on magnetic tape is the most time consuming of all the operations, every effort is made to save tape movement. The ordering also provides an effective method for locating all occurrences of an element in the connection array without searching the entire array. This is done as follows.

Let the connection array be denoted:

    EL1L  EID1L  AT1L  AID1L  EL1R  EID1R  AT1R  AID1R
    EL2L  EID2L  AT2L  AID2L  EL2R  EID2R  AT2R  AID2R
    ...
    ELnL  EIDnL  ATnL  AIDnL  ELnR  EIDnR  ATnR  AIDnR

where
    EL is any true element name,
    EID is any true element identifier,
    AT is any true element attachment name,
    AID is any true attachment identification,

and the subscripts iL and iR denote the i-th attachment point occurring to the "left" and to the "right", respectively. The ordering procedure is then:

1) Order the matrix by the ELiL (and group each EL and EID) according to the arrangement of descriptions on the magnetic tape. Call this ordering vector "L".

2) Order the matrix by the ELiR in the same way and call this ordering vector "R". Let i be incremented from 1 through the number of connections, say n. Let Li (Ri) be the value of the i-th location in the L (R) vector. Then the i-th member of the ordered ELL (ELR) is found in row Li (Ri) of the matrix. This type of list addressing is known as "indirect" addressing.

3) Construct a vector for the vector L whose values are the locations of the first occurrence of each element addressed through L in the vector R. If no such occurrence exists in R, then insert the negative of the first occurrence of the element in L itself. Call this vector "L TO R".

4) Construct a similar vector for the vector R relating the occurrences in R to L. Call this vector "R TO L".
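The four ordering steps above can be illustrated with a small sketch. It is written in Python for convenience and is only one reading of the procedure: the names order_vector and cross_reference, the dictionary layout of a connection row, and the use of 1-based positions (so that a negative entry can mean "absent from the other vector") are all assumptions made for the illustration, not details taken from the 704 program.

```python
def order_vector(rows, side, tape_order):
    """Row indices ordered by the tape position of the element on one side.

    rows[r][side] is an assumed (element name, element identifier) pair.
    The rows themselves never move; only this index vector is ordered,
    which is the "indirect" addressing described in the text.
    """
    return sorted(
        range(len(rows)),
        key=lambda r: (tape_order[rows[r][side][0]], rows[r][side][1]),
    )


def cross_reference(rows, a, a_side, b, b_side):
    """Build "A TO B": for each position in ordering a, the 1-based position
    of the element's first occurrence in ordering b, or the negative of its
    first position in a when the element does not occur in b."""
    first_in_b, first_in_a = {}, {}
    for pos, r in enumerate(b, start=1):
        first_in_b.setdefault(rows[r][b_side], pos)
    for pos, r in enumerate(a, start=1):
        first_in_a.setdefault(rows[r][a_side], pos)
    return [
        first_in_b.get(rows[r][a_side], -first_in_a[rows[r][a_side]])
        for r in a
    ]


# Hypothetical four-connection system; element names are illustrative.
rows = [
    {"L": ("TRBIN", "A"), "R": ("PUMP", "A")},
    {"L": ("PUMP", "A"), "R": ("BOILER", "A")},
    {"L": ("BOILER", "A"), "R": ("TRBIN", "A")},
    {"L": ("PUMP", "A"), "R": ("VALVE", "A")},
]
tape_order = {"BOILER": 0, "PUMP": 1, "TRBIN": 2, "VALVE": 3}

L = order_vector(rows, "L", tape_order)
R = order_vector(rows, "R", tape_order)
L_TO_R = cross_reference(rows, L, "L", R, "R")
R_TO_L = cross_reference(rows, R, "R", L, "L")
print(L, R, L_TO_R, R_TO_L)
```

In this example VALVE occurs only on the right, so its R TO L entry is negative and gives its own first position in R.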

With these vectors the task is simplified for finding the entire set of occurrences of any identified element. The location of all occurrences is accomplished in the following way. (The method is given for L but is equally valid, with appropriate changes, for R.)

1) If L TO R at a point is negative, the element does not occur in R. The first occurrence in L is found by taking the absolute value of L TO R.

2) Each entry in L agrees with the first as long as the corresponding L TO R corresponds to the first L TO R.

3) If L TO R is positive, the first L occurrence is found by going first to the first occurrence of the element in R and then back to L by using the associated R TO L value. The same test as in 2 applies to equivalent identified elements.

4) If L TO R is positive, the value gives the first R occurrence. The succeeding values in R are for the same identified element so long as R TO L for each succeeding value agrees with the first R TO L value.

The remaining task of Preprocessing is to save all input-output requirements and function substitutions for the Program Generation. In addition, should the library complement be incomplete, the Preprocessing must construct the library entries. Each Element Description is saved as two files on the tape known as ELTAPE. The first saves the contents of each collection capability in 80 word blocks. This allows two words for input parameters and two words for desired results at each of the 20 allowable attachments. The words are in the Boolean form previously described. Since two words allow for 72 parameters and only 70 parameters are allowed, the remaining bits are available for special use. In particular, the last bit in the desired result word is used to signal that this capability must apply to elements without this attachment.
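The word counts quoted above follow from the 36-bit word: 70 parameters need two words, two groups (input parameters and desired results) make four words per attachment, and 20 attachments give the 80 word block. A minimal sketch of such packing is given below in Python. The bit ordering, the 0-based parameter numbering and the names pack and is_set are assumptions for illustration; the source does not say which bit of the 704 word belongs to which parameter.

```python
BITS_PER_WORD = 36      # 704 word length, from the text
MAX_PARAMS = 70         # parameter types allowed, from the text
WORDS_PER_GROUP = 2     # ceil(70 / 36)
WORDS_PER_ATTACHMENT = 2 * WORDS_PER_GROUP   # inputs + desired results
WORDS_PER_BLOCK = 20 * WORDS_PER_ATTACHMENT  # 20 attachments -> 80 words
SPARE_BITS = WORDS_PER_GROUP * BITS_PER_WORD - MAX_PARAMS  # 2 bits left over


def pack(param_indices):
    """Pack a set of parameter numbers (0..69) into two 36-bit words.

    Assumed layout: parameter i lives in word i // 36 at bit i % 36.
    """
    words = [0] * WORDS_PER_GROUP
    for i in param_indices:
        if not 0 <= i < MAX_PARAMS:
            raise ValueError("parameter number out of range")
        word, bit = divmod(i, BITS_PER_WORD)
        words[word] |= 1 << bit
    return words


def is_set(words, i):
    """True if parameter i is marked in the packed words."""
    word, bit = divmod(i, BITS_PER_WORD)
    return bool((words[word] >> bit) & 1)


inputs = pack([0, 35, 36, 69])
print(inputs, is_set(inputs, 36), is_set(inputs, 1))
```

Removing a satisfied desired result then amounts to clearing one bit, and testing whether any results remain amounts to comparing the words with zero.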

The second file contains the M.A.D. statements to generate the capability contained in the first file. The first capability group in the first file corresponds to the first collection of M.A.D. statements in the second file, and so on. Upon completion of the Preprocessing, the status of the storage is as follows:

1) The connection matrix is in and contains only true names. The matrix is ordered and the input-output requirements have been packed in Boolean parameter words.

2) The Element Descriptions are processed and saved in groups of two files per description on tape ELTAPE, with all permanent descriptions first.

3) The function substitutions and input-output parameters in complete notation are saved for the program generation phase on an erasable tape.

At this point, control is passed to the Desired Result Reduction section. This section is charged with the actual generation of the algorithm for simulating the system. The procedure for accomplishing this task is almost the reverse of the usual procedure used by humans in attempting the same task. The human approach, largely because of the extremely large storage capacity of the human brain, is a search that proceeds from the known parameters and is directed toward the desired results. This approach could be implemented in the machine but, because of storage limitations, may become quite unworkable. The difficulty is that the machine program cannot reject a method until it can be shown to be unnecessary in the program to obtain the desired results. Thus the program would be forced to enumerate all the possible methods available from the input parameters plus the results

of the first set, and so on. The number of methods available grows rapidly, and if the problem is well posed the desired results will eventually be encompassed. However, this constitutes an exhaustive search with only a small fraction of the methods actually of use.

Therefore a different approach is used. Essentially, the algorithm is produced in reverse by working from the desired results toward the input parameters. In this way every step generated is necessarily of use in the program. The program is, of course, backward, in that the first statement collection specified is the last one needed and so on, but this is easily taken care of by the Program Generation section. The method of production of the algorithm is the following:

1) Inspect the "Desired Result" Boolean words for each connection point in the matrix. Whenever no Desired Result bits can be found in the entire matrix, the algorithm is completed.

2) Whenever a connection is found for which results are desired, steps must be taken to satisfy the request for results.

2A) The requested results may be input parameters. If this is so, remove the corresponding desired result bits.

2B) The requested results may occur at any identified attachment and be of broad scope. If this is so and the result (as an input parameter) can be found at any of the identified attachments, remove the corresponding desired result bits.

2C) If requested results still remain after steps "2A" and "2B", then some additional program must be added to obtain the results.

2C1) Find all of the statement collections for both elements that occur at this attachment point that are useful in obtaining the requested results. That is, ignore any collections that are "without" attachments specified for this element or collections that do not happen to produce any of the desired results.

2C2) Check each useful collection to determine its effectiveness. The effectiveness is the ratio of the number of requested results the collection produces to the number of new requests for results the collection will produce. A new request for results will occur if any of the parameters required by the collection is not an input parameter or already requested by previous statements. If the number of new requests for results is zero, the collection is always inserted in the algorithm. This collection produces results without requiring any new information. (The only exception to this rule occurs with the iterative ESTIMATE collection. In this case, no new information request is apparent; however, the ESTIMATE collection is restrained from inclusion in the program until the parameter in question is found by at least one independent calculation method.) Otherwise, retain the effectiveness ratio as the weight of the collection.
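The effectiveness ratio of step 2C2 can be written out as a short calculation. The sketch below is in Python and uses sets of parameter names in place of the packed Boolean words; the function name effectiveness, its arguments, and the use of infinity to stand for "always inserted" are assumptions for the illustration. The ESTIMATE exception is not modelled.

```python
import math


def effectiveness(produces, requires, requested, inputs, already_requested):
    """Weight of one statement collection at an attachment point.

    Returns None if the collection produces none of the requested results,
    math.inf if it needs no new information (always inserted), and
    otherwise (requested results produced) / (new requests created).
    """
    useful = produces & requested
    if not useful:
        return None
    # A required parameter is a new request unless it is an input
    # parameter or has already been requested by previous statements.
    new_requests = requires - inputs - already_requested
    if not new_requests:
        return math.inf
    return len(useful) / len(new_requests)


weight = effectiveness(
    produces={"ENHLPY", "KWH"},
    requires={"PRESS", "FLOW", "TEMP"},
    requested={"ENHLPY", "KWH"},
    inputs={"PRESS"},
    already_requested={"FLOW"},
)
print(weight)
```

Here two requested results are produced at the cost of one new request (TEMP), so the collection competes with a weight of 2.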

2C3) When all of the collections have been examined for effectiveness, and if desired results still remain, select a set of the collections that will produce the requested results. At this point, the methods in which a parameter could be found using a method only once at a given attachment point are checked and discarded if already used. Otherwise, these methods are simply placed in competition with any other techniques available.

2C3A) First check to be certain that every requested result can be found in at least one way. If any result cannot be so found, the problem may not be well posed. The problem is not well posed if no "branches" have occurred previously in the generation. A "branch" occurs when a choice is made between more than one method of determining a requested result.

2C3B) Whenever there is exactly one method for producing a result, this method must be included at this point in the algorithm. The method is inserted, the results produced by the method have the corresponding bits removed, and any new requests occurring anywhere in the matrix have the corresponding bits inserted.

2C3C) After all single method results are taken care of, there remain only results for which there are several methods of calculation. Since only one method will be

used for each result, the selection will constitute a "branch" in the algorithm generation if the method selected is not always forced to be the same one. If one considers the available methods, each with its associated weight, the simulator should tend to choose the method of greatest weight. However, the simulator should be allowed to select the method on the basis of the probability of selection being proportional to the weight. If this is not done, one may anticipate that in some case the method of greatest weight may contain a parameter that is incapable of calculation (considering the input parameters), and therefore the program could not be generated. If, however, the simulator makes the selection probabilistically, the method of greatest weight is most likely to be selected but other methods may be selected in its place. The probabilistic selection is automatically made and the "branch" Boolean constant is set to one. In this way, if later there should arise a case in which no method is available, the simulator may make another trial and possibly work out a satisfactory algorithm by having the chance to choose another method at this point. This is a situation in which the locally "best" method is not always the globally "best" but tends to be so.
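A selection whose probability is proportional to weight, as called for in step 2C3C, can be sketched as follows. The sketch is in Python and uses the standard random module in place of whatever random number source the 704 program employed; the function name pick_weighted and the optional draw argument (which allows a fixed value to be supplied for checking) are assumptions for the illustration.

```python
import random


def pick_weighted(weights, draw=None, rng=random):
    """Return the 0-based index of the item chosen with probability
    proportional to its (positive) weight.

    A number is drawn uniformly between 0 and the total weight, and the
    first item whose running total reaches the draw is selected.
    """
    total = sum(weights)
    if draw is None:
        draw = rng.uniform(0.0, total)
    running = 0.0
    for k, w in enumerate(weights):
        running += w
        if running >= draw:
            return k
    return len(weights) - 1  # guard against floating-point round-off


# The heaviest method is chosen most often, but not always.
rng = random.Random(1)
counts = [0, 0, 0]
for _ in range(3000):
    counts[pick_weighted([1.0, 1.0, 8.0], rng=rng)] += 1
print(counts)
```

Because every item with positive weight owns part of the interval, a lightly weighted method still has some chance of selection on a later trial, which is what allows a failed generation to recover.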

3) The steps 1 and 2 are repeated over and over. Each time the requested results are satisfied a new set of requests is generated, except when the request matches an input parameter. If the problem is well posed, then a sequence of methods may be found such that all desired results are satisfied, through the sequence, by input parameters. When this has been done, an algorithm for the simulation of the system has been produced.

The algorithm produced tends to be optimal, since at each state the method of greatest weight was most likely to be employed, but the simulator cannot, with limited storage, view the generation of the algorithm beyond a single step. Thus occasionally the simulator may produce several steps that might be condensed if more information were available. In particular, it may happen that identical sets of statements are produced in the algorithm at different stages of the generation. This redundancy is easily detected, and the final algorithm will contain only the first occurrence of the set.

The method of probabilistic selection is also used to discard the least likely method should there be found too many methods to apply at a given point. The method of probabilistic selection for picking an item from a group of n weighted items is the following:

Let $W_i > 0$ be the weight of the i-th item from a group of n total items;
    $W = \sum_{i=1}^{n} W_i$ be the total weight of the group;

N be a random number selected from a uniformly distributed set of 0 random numbers on the interval 09 Wo Then the K-th member of the group of n items will be selected for the smallest K such that K zWi= NW i=l N0 is most likely to fall in the subinterval such that Wi is maximum but may fall anywhere in the intervalo An modification of this method to pick the least likely item (for discards etco) consists of defining a new set of weights p = 1/Wi and make the selection using p in place of Wo In particulard it should be noted that equally probable alternatives receive equal. chances and every alternatives no matter how small its weight may be, receives some consideration and may be chosen at any timeo This method should find many applications in future programso Since there is no way to predict either the number of parameters that may be needed at a point or the number of methods available for any parameter it is necessary to allow an extremely flexible storage assignment so that the storage may be completely usedo This is done by means of an "associative memory" list for the storage regiono This list functions as follows 1) Associated with each parameter at the attachment point is a storage location whose value is~ 1A) Zero if the parameter is not requiredo 2B) Minus one if the parameter is not required and no method has yet been found to yield the parametero

C) Otherwise, the value is a positive integer giving the location of the beginning of the list of methods for this parameter in the "associative memory".
2) Each entry in the associative memory list gives the location of the next (associated) entry. The final entry is denoted by a minus sign.
3) New entries are made by consulting the associative memory list beginning at the zeroth location. The value of this location is the next available location in the memory. An addition to the end of any list is made in the available location, the value that was in this location is stored in the zeroth location, and the former list end is changed to refer to the new list end.
4) Whenever a result collection is selected, the storage space is reassigned as available storage by placing the starting location for the list in the zeroth location, and the value formerly in the zeroth location at the end of the list being removed. In this way the entire list is made available with only two storage reassignments.
5) If the capacity of the storage is exceeded before all the parameters have been treated, space can be created by selecting the parameter with the greatest number of methods and picking the least likely method using the probabilistic selection technique. The location thus chosen is made available by giving its address to the zeroth location and reassigning the preceding list location to skip this location and refer to the next item in the list.
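As a rough illustration of the two mechanisms described above, the following Python sketch shows the weighted probabilistic selection (the smallest K whose running total of weights reaches the random draw), its reciprocal-weight variant for discards, and a chain of free cells of the kind the "associative memory" list relies on. The names `pick_weighted`, `pick_least_likely` and `ListStore`, the 0-based indices, and the use of -1 as the end-of-list mark (the text speaks only of a minus sign) are assumptions made for the example; the original program was not written in this form.

```python
import random


def pick_weighted(weights, draw=None):
    """Return the 0-based index K of the first item whose running
    weight total reaches the draw N0, taken uniformly on (0, W)."""
    total = sum(weights)
    if draw is None:
        draw = random.uniform(0.0, total)
    running = 0.0
    for k, w in enumerate(weights):
        running += w
        if running >= draw:
            return k
    return len(weights) - 1  # guard against floating-point round-off


def pick_least_likely(weights, draw=None):
    """Discard variant: select using the reciprocal weights 1/W_i."""
    return pick_weighted([1.0 / w for w in weights], draw)


class ListStore:
    """Linked cells; cell 0 always points at the next available cell."""

    END = -1  # assumed end-of-list mark

    def __init__(self, size):
        # Initially every cell is free and chained 0 -> 1 -> 2 -> ... -> END.
        self.link = list(range(1, size)) + [self.END]
        self.item = [None] * size

    def append(self, tail, item):
        """Add an item after cell `tail` (pass END to start a new list)."""
        new = self.link[0]
        if new == self.END:
            raise MemoryError("storage exhausted")
        self.link[0] = self.link[new]   # cell 0 now names the next free cell
        self.link[new] = self.END       # the new cell becomes the list end
        self.item[new] = item
        if tail != self.END:
            self.link[tail] = new       # former list end refers to new end
        return new

    def release(self, head, tail):
        """Return a whole list to free storage with two reassignments."""
        self.link[tail] = self.link[0]
        self.link[0] = head
```

Passing an explicit `draw` makes the selection repeatable for checking; leaving it out draws a fresh random number on each call, which is what allows a later trial to take a different branch.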

Upon completion of the removal of all the desired results and those created during the removal of others, the algorithm is completed and written in reverse order on magnetic tape. This type of storage is used because it is not possible to predict the storage required for a program in advance. This is somewhat unfortunate, since the program is generated in the form of a "push down" list. A push down list is a list such that each entry occurs at the beginning (rather than the end) of the list and thus moves the former first item to second place, the former second to third, and so on. Thus the items are "pushed down" on the list. Removal of items from the list occurs from the beginning of the list, with the last item entered. Thus the effective order of the list is reversed. This is precisely the action that must occur in the program generated, since the last statement collection found must be the first used in the program and the first collection found is the last one used in the program. The difficulty is resolved by moving the tape backward two records and forward one, working from the last record written toward the first.

As this process is begun after the completion of the algorithm, the simulator is at this point generating the simulation program using the Program Generator. The first output of the Program Generator is the Prologue and its associated input-output statements. The Prologue is followed by the program. The algorithm is stored in a short code giving the connection matrix row number and index in the L vector, so that the unique connection can be located by the Program Generator; the element description name, given by position in the element name vector and tagged with a plus (+) sign if the element involved was the leftmost (and a minus (-) sign if the

rightmost); and finally the statement collection number. The Program Generator section moves to the second file in the desired element description and next to the appropriate statement collection. Finally the statement collection is processed and produced both on cards and in print to form the desired simulation program.

The rules by which the processing takes place have been stated in the section describing Element Descriptions. Briefly, the M.A.D. statements are written using floating statement labels, and special codes for function substitutions and for parameter and attachment codes. The Program Generator assigns unique fixed statement labels for the floating statement labels. Any function substitution is checked for possible modification. If a substitution has been requested, the substitution is made; otherwise the original text is retained. The parameter and attachment codes are reduced to a six character variable name code for each parameter-attachment combination occurring. Some of the six characters may be blanks. Non-identified attachment parameters are immediately coded and inserted in the output statement. Identified attachments cause multiple copies of the statement to be generated. One copy is made for each different identifier. To avoid possible ambiguity, only one attachment may be identified in each statement. However, this attachment may occur any number of times with any number of parameters within one statement. In this way the effect of a special junction element is produced without specifically requiring such an element. As noted in the Collection Capability section, there is one exception to this rule. Namely, when an attachment is identified and the current point in the connection matrix agrees exactly with this attachment name, a copy of

the statement is produced for all occurrences except the current one. Finally, if an attachment is not identified, the current attachment point will be selected if the name agrees with the name occurring in the collection statement. Otherwise, the first occurrence of the name is used in the code.

Upon completion of the program generation, control is returned to the Preprocessing Section to process any other system simulation problems that may be waiting. Since the output of this program is a program in M.A.D. code on punched cards, the simulation program may be used as it is produced or modified easily before using it to simulate the system. To use the program as it is generated, the user need only supply the data and special subroutines needed and the special cards needed by the executive system for the data processing system. The program will be translated into machine code and executed using the data supplied.

II. STEPWISE REGRESSION PROGRAM WITH SIMPLE LEARNING

The representation of the characteristic performance of the various components of a system is vital to the simulation problem. It is not sufficient to obtain a relation which merely fits the available data, if the relation is to be used for predictive purposes, because such a relation may bear only superficial resemblance to the actual performance at other points. A much more desirable relation would consist of terms suggested by the nature of the physical laws governing the component performance, but using only those terms which may be shown to be substantiated by the available data. The Stepwise Regression Program was written to establish this relation and produce, in addition to the analysis of the data as just described, the actual M.A.D. statements needed for the predicting equation subroutine.

The use of simple learning by the program allows the program to deal with a much more general solution of the predicting equation problem than has been previously possible. The usual engineering problem consists of many independent variables (pressure, temperature, load, etc.) which affect the behavior of the dependent variable (e.g., efficiency, loss, etc.) that is to be predicted. In addition, these variables are usually found to enter in nonlinear manners (e.g., raised to powers or roots or even more complicated forms). Also, in the usual problem the dependent variable performance is often affected by interaction between the independent variables and functions of the independent variables. The simplest sort of example of such an interaction is the Perfect Gas relation $PV = MRT$.

In this case, the variables P and V interact so that T may not be determined by a relation consisting of terms using P alone plus terms using V alone, but may be found by using a term involving the interaction between P and V. Thus the size of an engineering problem of several independent variables, each of which may be represented by several functions, grows very rapidly when all possible interactions are allowed. An illustration will indicate the magnitude of the problem. A common selection of twenty functions for a single independent-variable problem, that is, a problem which may be expressed

$Y = \sum_{i=1}^{20} b_i F_i(X)$

will require about 30 seconds to solve on the IBM 704 using conventional stepwise regression techniques. If an apparently only slightly more complicated problem involving three independent variables, each of which has twenty functions, were to be attempted, considering all interactions, the number of terms to be considered increases from 20 to 9260, the size of the matrix involved grows from 21 by 21 to 9261 by 9261, and the IBM 704 time becomes approximately 2500 machine hours. It is clear that, without a technique capable of reducing this problem by several orders of magnitude, the general problems encountered in engineering will have to be treated in strongly simplified terms. Indeed, this has been precisely the motivation for earlier linearized system models.
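The jump from 20 to 9260 terms follows from simple counting: each of the three variables contributes to a product term either one of its twenty functions or nothing at all, and the empty product is excluded, giving (20 + 1)^3 - 1. The short Python sketch below checks this count both by the closed form and by brute-force enumeration; the function names are illustrative only.

```python
from itertools import product


def interaction_term_count(n_variables, n_functions):
    """Closed form: every variable offers n_functions choices plus 'absent';
    the all-absent combination is not a term."""
    return (n_functions + 1) ** n_variables - 1


def enumerate_terms(n_variables, n_functions):
    """Brute-force list of terms; 0 marks a variable absent from the product."""
    choices = range(n_functions + 1)
    return [t for t in product(choices, repeat=n_variables) if any(t)]


print(interaction_term_count(1, 20))      # 20
print(interaction_term_count(3, 20))      # 9260
print(interaction_term_count(3, 20) + 1)  # 9261, the matrix order quoted above
```

The matrix order is one greater than the number of terms because the dependent variable occupies one additional row and column.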

The simple learning mechanism developed for use with the Stepwise Regression Program has been used successfully to produce predicting relations in much less time than required for more conventional methods. The following discussion will describe, first, the Stepwise Regression Program and, second, the Simple Learning mechanism employed by the program. This program represents one of the first applications of "artificial intelligence" in an area of immediate practical interest.

Discussion of Stepwise Regression

The following describes a computer program useful in determining the relationships existing among a group of up to 60 variables or functions of variables at each program pass. Taking one of the variables to be a dependent variable, the program results in a linear predicting equation using the current set of predictor variables or terms and selecting from this set a "minimal" set. The program allows simple learning to occur concerning the most satisfactory terms, thereby extending the usefulness in determining equations that take account of possible variable interactions of all orders. The program further allows the generation of equations using either stepwise buildup or stepwise purification at the discretion of the user.

This discussion concerns some extensions carried out by the author of the work originated by Mr. M. A. Efroymson of Esso Standard Research and Engineering Co., and carried forward by Mr. J. E. Dallemand of General Motors Research Staff. The problem considered is that of determining a predicting equation from a collection of data. The method of analysis deals with the situation which arises when data have been

collected on many variables, of which one is regarded as a dependent or response variable and the remainder of the set is regarded as a set of independent or predictor variables. It may be anticipated that the method will be useful in experimental situations involving unknown complicated interactions between many variables and complicated relationships (functions) of the variables. In particular, when the data are already available, or where it is difficult to control variables systematically, or where the conduct of a systematic experiment would disrupt the normal operation of a system too severely, this method will be useful. Specifically, this method is useful in obtaining answers to questions like the following:

(1) What linear combination of the independent variables, or functions of the independent variables, or interactions (cross products) of independent variables and functions of independent variables best explains the data on the dependent variable?
(2) How good is this relationship (obtained in (1))?
(3) What is the linear relationship between the "best" single independent variable (function of an independent variable, or interaction) and the dependent variable? Also, what is the relation for the "best" two, three, or other subset of the possible predictor terms?
(4) For each subset of (3), how good is the relationship?
(5) What is the smallest set of predictor terms that will make statistically significant contributions toward explaining the statistical variation in the dependent variable? (The user may set the level of significance.)

(6) How good is the relationship in (5) and how good is the prediction?
(7) How much of the behavior of the dependent variable is still unexplained by the equation? (This is the Standard Error of Estimate.)
(8) If there is theoretical justification for suggesting certain terms to explain the behavior of the dependent variable, what is the "best" relationship for this set?
(9) How good is this relationship (8) and what can be done to explain the behavior not explained by the present theory?

1. The Stepwise Regression Method

The Stepwise Regression analysis deals with a set of $p$ independent variables denoted $X_1, X_2, \ldots, X_p$ and a single dependent variable $Y$. Let $N$ be the number of observations made on each of these $p+1$ variables, yielding $N(p+1)$ data in all. The objective of the analysis is to generate a relation of the form

$Y = b_0 + b_1 X_1 + b_2 X_2 + \cdots + b_p X_p. \qquad (1)$

The $b_i$, $i = 0, 1, 2, \ldots, p$, are the coefficients or multipliers of the various $X_i$. It should be clear that one could not distinguish between the previous case of $p$ independent variables and the case of $p$ linearly independent functions of a single independent variable, or any other combination of numbers of independent variables and function choices for these variables totalling $p$ terms in all. Therefore, the discussion here treats the problem as if there were $p$ independent variables, without loss of generality.

To focus these statements on a physical problem, consider the following: Suppose that measurements have been made of the electrical losses of a hydrogen cooled generator. Figure 1 shows the general behavior of the variables and indicates that at least three factors must be considered. It is assumed that measurements or observations are available of (1) gross electrical load on the generator, (2) hydrogen pressure, and (3) power factor, as well as the corresponding electrical loss. The formal relation (1) might be interpreted as the linear relation:

GENLOS = $b_0$ + $b_1$ * GKW + $b_2$ * HPRESS + $b_3$ * PFCTOR

where
GENLOS = generator electrical loss
GKW = gross generator load
HPRESS = hydrogen pressure
PFCTOR = power factor

However, Figure 1 indicates that such a linear relation may not represent the actual behavior. More complicated analytical models may be suggested to the Stepwise Regression Program by making appropriate definitions of some pseudo-variables. Suppose that the pseudo-variables $X_i$ are defined:

$X_1$ = GKW, $X_2$ = GKW$^2$, $X_3$ = GKW$^3$
$X_4$ = HPRESS, $X_5$ = HPRESS$^2$, $X_6$ = HPRESS$^3$
$X_7$ = PFCTOR, $X_8$ = PFCTOR$^2$, $X_9$ = PFCTOR$^3$

and, of course, the list may be longer and as complicated as needed to describe the physical problem. The "standard" types of terms automatically

Figure 1. Generator electrical losses as a function of load, hydrogen pressure, and power factor. (Horizontal axis: GENERATOR LOAD (GKW). Legend of symbolic names: Generator Electrical Loss, GENLOS; Generator Load, GKW; Hydrogen Pressure, HPRESS; Power Factor, PFCTOR.)

available to every problem include integer powers, integer roots, and the reciprocals of these terms. Provision is made to insert any other special terms desired as well (such as logarithms, exponentials, etc.). Then a relation of the form (1) is:

LOSS = $b_0$ + $b_1$ * $X_1$ + $b_2$ * $X_2$ + ... + $b_9$ * $X_9$

or its equivalent

LOSS = $b_0$ + $b_1$ * GKW + $b_2$ * GKW$^2$ + ... + $b_9$ * PFCTOR$^3$

Again, it often happens that interaction may occur between the variables and the functions of variables. Once again a relation of form (1) may result by defining:

$Z_1$ = $X_1$ = GKW
$Z_2$ = $X_2$ = GKW$^2$
...
$Z_9$ = $X_9$ = PFCTOR$^3$
$Z_{10}$ = $X_1$ * $X_4$ = GKW * HPRESS
$Z_{11}$ = $X_1$ * $X_5$ = GKW * HPRESS$^2$
...
$Z_{36}$ = $X_6$ * $X_9$ = HPRESS$^3$ * PFCTOR$^3$
$Z_{37}$ = $X_1$ * $X_4$ * $X_7$ = GKW * HPRESS * PFCTOR
...
$Z_{63}$ = $X_3$ * $X_6$ * $X_9$ = GKW$^3$ * HPRESS$^3$ * PFCTOR$^3$

The formal relation (1) is now

LOSS = $b_0$ + $b_1$ * $Z_1$ + $b_2$ * $Z_2$ + ... + $b_{63}$ * $Z_{63}$

or its equivalent

LOSS = $b_0$ + $b_1$ * $X_1$ + $b_2$ * $X_2$ + ... + $b_{63}$ * (GKW$^3$ * HPRESS$^3$ * PFCTOR$^3$).

The problem consists of finding those Z's which contribute to the explanation of the dependent variable (LOSS) with sufficient importance, as indicated by the measured data, to allow their retention in a predicting

equation. And, having found the set of Z's meeting the importance criterion, the problem continues to the determination of the best possible estimates for the b's. In this way, a minimal relation is generated which may be used to predict LOSS for given values of GKW, HPRESS, PFCTOR. This relation is automatically generated by the Stepwise Regression Program. In addition, the Stepwise Regression Program produces on punched cards the M.A.D. function corresponding to the generated relation and having any arbitrary function name desired. In this case, suppose that the desired function name is GENLOS. The program would produce the M.A.D. External Function GENLOS.(X1, X2, X3), where X1, X2, and X3 are now symbolic names for the arguments GKW, HPRESS, PFCTOR, in machine translatable form ready for inclusion as part of a simulation program (or any other application). Thus one may later write the relation

NETKW = GKW - GENLOS.(GKW, HPRESS, PFCTOR) - MECLOS

as a M.A.D. statement to be used in a simulation program, and the result will be the net power generated (NETKW).

It is clear that no loss of generality has resulted by considering the formal relation (1). It should also be clear that the X terms in (1) may represent either the actual measurements of the independent variables or functions of these measurements, without requiring any change in technique. In the remainder of this discussion, the symbol X will be used and the meaning may be understood in its most general sense.

The $b_i$ in (1) are determined in such a way that, if one forms the sum of the squares of the differences between the observed values of Y and the predicted values of Y arising from the use of (1), then that sum will

be minimum. Notice that the process of squaring the differences ensures that all errors, positive and negative, contribute toward increasing the sum. This is commonly referred to as the method of "least squares."

The importance of the Stepwise Regression method lies in the process of "building" the expression (1) a term at a time, always insisting that the terms be inserted in order of their relative importance to the explanation of the behavior of Y. Furthermore, checks are made continually regarding the continued importance of terms in the equation, and only those terms will be inserted into (or removed from) the equation (1) that meet certain significance tests which can be controlled by the user. Thus the final equation will comprise a "minimal" set of terms. Since terms may be removed from the equation as well as inserted into the equation, the method of Stepwise Regression also allows the generation of a relationship by "purifying" an initially large set of terms with very little added burden to the user. Experience indicates that the purification process occasionally produces valuable additional information in certain problems.

A number of statistics are computed before the task of building the predicting equation begins. These statistics may be printed out to help give further insight into inter-relationships in the data and are used by the program for executing the task. Included among these statistics are the mean (average value) for each variable, the standard deviation (a measure of variability) for each variable, and the correlation coefficient for each pair of variables. The correlation coefficient measures the linear relationship existing between the pair of variables, and ranges from +1.00 (perfect direct relationship) to 0.00 (no relationship) to -1.00 (perfect inverse relationship).
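The preliminary statistics named above can be sketched in a few lines of Python. This is an unweighted illustration with an assumed n - 1 divisor for the standard deviation; the function names and the sample observations are invented for the example and are not taken from the program.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def standard_deviation(values):
    """Sample standard deviation (n - 1 divisor, an assumption here)."""
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def correlation(xs, ys):
    """Product-moment correlation coefficient, between -1 and +1."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


load = [50.0, 70.0, 90.0, 110.0]          # made-up observations
loss = [800.0, 1000.0, 1200.0, 1400.0]    # exactly linear in load
print(mean(load))                          # 80.0
print(round(correlation(load, loss), 6))   # 1.0
```

Reversing the sign of one series gives a coefficient of -1, the perfect inverse relationship mentioned in the text.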

Figure 2 shows the interpretation of the standard deviation and the mean. If the scatter of data is due to random uncontrollable error, then the Gaussian distribution will model the variability with respect to the predicting equation. Taking the mean or average value to be that indicated most likely by the data, the width of plus and minus one standard deviation will embrace an interval about the mean within which the expectation of the true value is 68%. As indicated by the figure, if the interval is doubled, the expectation grows to more than 95%, and if the interval triples, the expectation is 99.8%. In other words, based on the data measured on the physical component in question, one may expect to encounter a true value of the dependent variable lying more than three standard deviations away from the predicting relation value with a long term frequency of 1 in 500.

Figure 3 illustrates this discussion with respect to a predicting equation. If the predicting equation produces the estimate of the true value of the predicted variable indicated by the central heavier curve, then the bands to either side may be understood to indicate the range within which the true value may be expected to lie with the stated frequencies. Thus a predicting equation with very small standard error of estimate will more accurately represent the true behavior of the variable than will a predicting equation with large standard error of estimate.

Generation of a Predicting Equation

Consider a simple example; suppose that an experiment has been made consisting of a set of observations of six variables. Regarding one of the six as a dependent variable and the remaining five as predictor

Figure 2. The Gaussian distribution. (Horizontal axis: deviations in standard deviation units.)

Figure 3. Predicting equation Y = F(X) as an approximation to true values of Y. (Curves Y = F(X) and Y ± Sigma = F(X) ± Sigma; cross-section A-A shows the probability that the true value of Y lies within the interval.)

or independent variables, the analysis determines the "minimal" set of variables which may be used in a relation of form (1), where, in this case, $p = 5$. The first step is to find that variable $X_i$ which best predicts Y. This is done by correlating each of the $X_i$ to Y and then selecting that $X_i$ which has the greatest "correlation coefficient"* in absolute value. If more than one $X_i$ shares the largest value, take the $X_i$ with the lowest subscript $i$; that is, take the first such $X_i$ encountered. Suppose that in this instance the best $i$ is 4. The first predicting equation is then

$Y = b_0 + b_4 X_4 \qquad (2)$

The $b_0$ and $b_4$ satisfy the least-squares criterion. Succeeding steps are of slightly different form. First, the $X_i$ are sorted into two subsets $X_{i,1}$ and $X_{i,2}$. The set $X_{i,1}$ consists of all those variables that are in the predicting equation at the time of sorting. The set $X_{i,2}$ consists of all those variables that are not yet in the predicting equation.

* The correlation coefficient is defined as the product-moment coefficient of correlation. Let

$x_i x_j = \sum_t W_t X_{it} X_{jt} - \dfrac{\left(\sum_t W_t X_{it}\right)\left(\sum_t W_t X_{jt}\right)}{\sum_t W_t}$

where
$t$ = 1, 2, ..., number of observations
$n$ = number of independent variables
$j = i, i+1, i+2, \ldots, n+1$
$i = 1, 2, \ldots, n+1$

Then let

$\sigma_i = \sqrt{x_i x_i}, \quad i = 1, 2, \ldots, n+1$

and the correlation coefficient $r_{ij}$ is then

$r_{ij} = \dfrac{x_i x_j}{\sigma_i \sigma_j}$

with the properties $r_{ji} = r_{ij}$ for $i, j = 1, 2, \ldots, n+1$; $r_{ii} = 1.000$; and $-1 \le r_{ij} \le 1$.
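The first step, including the tie rule, amounts to taking the first index of largest absolute correlation. A minimal Python sketch follows, using invented correlation values and 1-based subscripts to match the text; the function name is hypothetical.

```python
def best_first_predictor(correlations_with_y):
    """Return the 1-based subscript i of the X_i whose correlation with Y is
    largest in absolute value; ties go to the lowest subscript because
    max() keeps the first maximal element it meets."""
    indices = range(len(correlations_with_y))
    best = max(indices, key=lambda i: abs(correlations_with_y[i]))
    return best + 1


r_xy = [0.31, -0.12, 0.48, -0.77, 0.77]   # hypothetical r between X1..X5 and Y
print(best_first_predictor(r_xy))          # 4
```

Here X4 and X5 tie in absolute value, and the lower subscript is taken, as the rule above requires.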

For each of the members of $X_{i,1}$ the analysis computes an "importance factor"** which is a measure of the relative contribution of the variable to the predicting equation. The smallest of these importance factors is isolated. If the variable associated with this factor is less important than the user requires for the variable to be retained in the equation, then that variable is removed from the equation before continuing. The scale used to determine whether a variable meets the "importance" criterion is simply the probability or chance that the user is willing to take that a variable may be left in the predicting equation that should have been removed.

Figure 4 illustrates the nature of this "importance" scale. The F-test measures the extent to which a variable will contribute toward explaining the dependent variable behavior, and tests this contribution against a purely chance correlation by comparing the variance with and without the term. The hypothesis tested is that the variance is equal in both cases and that any difference is due only to chance. Thus, the term will be used only when the difference in variance cannot be explained by chance alone. Thus, in Figure 4, if one selects a probability of committing an insertion error (that is, inserting a term into the predicting equation that really does not belong in the equation) and finds the number of degrees of freedom (roughly the number of weighted observations), the value of F indicated by the surface is such that if the value of F displayed by the "best" term exceeds the value on the surface, then the risk is less than the probability chosen. As might be expected, the value of F on the "threshold" surface goes to zero as the probability goes to 1 (certainty of committing an error). In that case, any nonnegative value of F equals or exceeds the "threshold" and the result would be to insert every term whether correlated

or not. Conversely, if one goes toward zero probability (certainty of not committing an error), the threshold value grows, approaching infinity in the case of zero probability. Thus no value of F can exceed this threshold and so no terms can be inserted. For any reasonable probability, the effect of the number of data can be assessed. As the number of data grows large, the "threshold" value approaches a constant dependent only on the probability. As the number of data approaches zero, the risk of error is held constant by requiring larger and larger F values, with infinity as the value corresponding to "no data." With such a test, the Stepwise Regression Program can control the generation of a predicting equation so that each term possesses a maximum risk of appearing incorrectly. Of course, many, perhaps most, terms actually appearing in the final equation exceed the threshold by substantial amounts and thus represent greatly reduced risks. The test ensures that every term is at least as good as the risk specified.

** The "importance factor" is found by using the variance contribution for each variable. Initially, the correlation matrix $r_{ij}$ defined earlier is equal to the regression matrix $a_{ij}$:

$a_{ij} = r_{ij}; \quad i, j = 1, 2, \ldots, n+1.$

Then the variance contribution for the $i$-th variable is

$V_i = \dfrac{a_{iy}\, a_{yi}}{a_{ii}}, \quad i = 1, 2, \ldots, n$

where the subscript $y$ is understood to be the dependent variable subscript $(n+1)$. As each term is added to or deleted from the regression equation, the regression matrix $a_{ij}$ is modified to contain the corresponding effect.

NOTE: If $V_i > 0$, then $X_i$ is not yet in the regression equation, and the $V_i > 0$ may be regarded as the relative contribution by the respective $X_i$ in explaining the as yet unexplained variance in the dependent variable Y. If $V_i < 0$, then $X_i$ is currently in the regression equation. The $|V_i|$ for all $V_i < 0$ may be regarded as the relative contribution by the respective $X_i$ to the regression prediction of Y.
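The variance contributions in the footnote can be computed directly from the regression matrix. The Python sketch below also includes a Gauss-Jordan style pivot of the kind commonly given for stepwise regression to bring a variable into the equation; the pivot formulas are not spelled out in this text and are included here as an assumption, as are the function names and the two-predictor correlation values.

```python
def variance_contributions(a):
    """V_i = a[i][y] * a[y][i] / a[i][i], with y the last row and column."""
    y = len(a) - 1
    return [a[i][y] * a[y][i] / a[i][i] for i in range(y)]


def pivot(a, k):
    """Return a new matrix after pivoting on variable k (assumed update rule)."""
    n = len(a)
    p = a[k][k]
    new = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == k and j == k:
                new[i][j] = 1.0 / p
            elif i == k:
                new[i][j] = a[k][j] / p
            elif j == k:
                new[i][j] = -a[i][k] / p
            else:
                new[i][j] = a[i][j] - a[i][k] * a[k][j] / p
    return new


# Hypothetical correlations: X1 and X2 uncorrelated, r(X1,Y)=0.6, r(X2,Y)=0.8.
a = [[1.0, 0.0, 0.6],
     [0.0, 1.0, 0.8],
     [0.6, 0.8, 1.0]]
v_before = variance_contributions(a)            # both positive: neither is in
v_after = variance_contributions(pivot(a, 1))   # X2 entered: its V turns negative
```

With this update rule the sign behavior matches the NOTE above: a variable outside the equation has a positive contribution, and one inside has a negative contribution. Pivoting a second time on the same index restores the original matrix, so under this assumption the same operation serves for both insertion and removal.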

If the user takes a probability of error for removing variables of 0.05, then the odds are 1 in 20 that a term may be left in the predicting equation incorrectly. Obviously, if the user wants to make this error very rarely, he may set the probability of that error very low, say 0.01 or 0.001. This situation requires one additional remark. When one asks that the chance of committing an error be made small, the chance of committing the converse error must become large. In this case, one increases the risk of removing variables that really belong in the equation by decreasing the risk of leaving variables that do not belong in the equation. If the chance of leaving a variable incorrectly were set by the user at 1 in 10000, it is also possible that insufficient data may have been accumulated to allow the retention of any variables in the equation, and the analysis can do no more than predict the average value of the dependent variable Y by the appropriate $b_0$. The remedy is clear: if one wishes to set high standards, the price is additional experimentation to produce additional evidence to support the case.

If and when all the importance factors exceed the minimum value required for retention of the set $X_{i,1}$, the analysis then examines the set $X_{i,2}$. For each element of this set a "potential importance factor" (as defined earlier) is determined and the largest of these isolated. These factors measure the relative contribution which each variable not presently in the predicting equation might make to the equation if it were put in. The largest of these is associated with the "best" variable at this stage. Once again a comparison is made to ensure that the risk of inserting a variable incorrectly is in agreement with the significance of the "best" variable before the insertion is allowed to occur. Again, the user specifies the risk he is willing to take of a variable being incorrectly inserted into the

Figure 4. Surface showing values of F which may be exceeded by chance with stated probability.

predicting equation, understanding 0.05 to mean 1 chance in 20 of the error occurring, and recognizing that reducing the chances of incorrectly inserting variables increases the chances of omitting correct variables from lack of evidence.

Suppose that, in the example considered, $X_4$ has been retained, and that of $X_1, X_2, X_3, X_5$, the variable $X_1$ best explains the behavior of Y not explained by $X_4$. If there is sufficient evidence to support the insertion of $X_1$, then a new predicting equation is formed by least squares:

$Y = b_0^{(1)} + b_1 X_1 + b_4^{(1)} X_4 \qquad (3)$

The superscripts on $b_0$ and $b_4$ indicate that these coefficients have undergone one modification in the process and are new values. At this point the variables are again sorted and checked for importance and the procedure repeated. The analysis ceases when either all the X variables have been inserted into the predicting equation or none of the X variables that remain as possible candidates for the equation is sufficiently important to allow insertion.

Continuing the example, suppose that on the third step $X_5$ is introduced, yielding:

$Y = b_0^{(2)} + b_1^{(1)} X_1 + b_4^{(2)} X_4 + b_5 X_5. \qquad (4)$

Further suppose that $X_1$ and $X_5$ behave together in such a way that the result is like having $X_4$ in the equation twice. In such a case, the importance of $X_4$ might be considerably reduced. Suppose that this is the case and that the importance of $X_4$ falls below the limit set by the user. Then $X_4$ is removed and the equation becomes:

$Y = b_0^{(3)} + b_1^{(2)} X_1 + b_5^{(1)} X_5 \qquad (5)$

In step (5) suppose that X2 is added, giving

$$Y = b_0^{(4)} + b_1^{(3)} X_1 + b_2 X_2 + b_5^{(2)} X_5 \tag{6}$$

Now suppose that neither X3 nor X4 is sufficiently significant to allow its insertion. The final prediction equation produced is (6). The analysis makes available a number of statistics at each step which may be interpreted as a measure of goodness of fit or prediction, as well as the b values and the importance level for the term considered at that step.

2. The Statistical Model

Suppose that the physical system giving rise to the preceding example was such that it could be hypothesized that the system could be characterized or described by the mathematical model:

$$Y = B_0 + B_1 X_1 + B_2 X_2 + B_3 X_3 + B_4 X_4 + B_5 X_5 + E \tag{7}$$

where the $B_i$ (i = 0, 1, 2, ..., 5) are unknown and possibly some of them may be zero. E is a random error variable term which accounts for the inability to obtain strictly reproducible data when observing the physical system. Setting aside the consideration of E for the moment, the problem is that of obtaining the best estimates of the $B_i$. It may be observed immediately that this is the problem just considered, resulting in Equation (6), and that the $B_i$ are estimated by the $b_i$, respectively. The best estimates of $B_3$ and $B_4$ are zero.

Turning attention once again to E in (7), it is clear that Equation (6) is not quite complete. The randomness of E makes the prediction of E impossible. What is possible is the determination of the likelihood of E being inside a range of values. In other words, because of E the measurements obtained are not exactly repeatable even if all the X's could be set

at exactly their former values. Therefore, the estimates are possibly, but not necessarily, in error due to the influence of E. A more nearly complete treatment of (7) would (1) estimate the $B_i$ as before; (2) estimate the possible errors in the $B_i$; and (3) estimate the variability of E.

The Stepwise Regression Program automatically estimates each of the three items desired. The estimate of the $B_i$ has already been discussed. The possible errors in the $B_i$ are indicated by quantities $S_{B_i}$ called the "Standard Error of the Coefficient" for each i. These values are such that if one forms the interval

$$b_i - S_{B_i} \le B_i \le b_i + S_{B_i} \tag{8}$$

then the "true" value of $B_i$ may be expected to be included by this interval in about 68% of all cases (see Figure 2). If one extends the interval to form

$$b_i - 2S_{B_i} \le B_i \le b_i + 2S_{B_i} \tag{9}$$

then this interval should include the true value in about 95% of all cases.

The variability of E is measured by a statistic called "The Standard Error of Estimate." This is roughly the standard deviation of E. Adding additional terms to the predicting equation usually results in reducing the standard error of estimate. The amount of reduction is a measure of the contribution made by that variable toward the explanation of Y. When the analysis is completed, this statistic measures the behavior of Y not explained by the predicting equation and reflects the remaining observational errors and, of course, possible errors in the hypothetical model. The precision of the predicting equation is reflected by the

magnitude of the Standard Error of Estimate ($S_y$) such that if one uses the predicting equation (6) to estimate Y and then forms a band about the curve predicted by (6) of plus and minus $S_y$ (i.e., the band is $2S_y$ in width and centered on the curve from (6)), then the "true" value of Y may be expected to be included by this band in about 68% of all cases (see Figure 3). Again, doubling the band width to $\pm 2S_y$ raises the expectation to about 95% of all cases. In other words, when enough experimental observations of a physical system are made accurately on good instruments so as to minimize observational errors, and when the hypothetical model correctly describes the physical system, then $S_y$ will be small and the predicting equation may be used to estimate Y with a measure of the precision of this estimate interpreted as indicated.*

The analysis produces two other valuable statistics at each step of the estimation process. The "Coefficient of Determination"** is interpreted as the proportion of the total variation in Y that is explained by the predicting equation. The possible values lie in the range from +1.00 (perfect prediction) to 0.0 (no prediction). Statisticians familiar with the "Multiple Correlation Coefficient," which is the positive square root of the Coefficient of Determination, will find it displayed also.

* It should be mentioned in passing that the interpretation of $S_y$ and $S_B$ should be as stated here and that it is not correct to say that about 68% of all observed values will lie within the intervals indicated for $\pm S$, and so on.

** The Coefficient of Determination ($R^2$) is found by subtracting the regression matrix element $a_{yy}$ (which measures the dependent variable variance) from unity. That is, $R^2 = 1 - a_{yy}$.

II. Artificial Intelligence Applied to the Stepwise Regression Method

Section I of this discussion treated the use of the Stepwise Regression Method as it applied to those cases in which the entire set of

variables and functions of variables can be represented by a single collection of small enough size to allow complete retention within the memory of the machine. In the case of the IBM 704 with 8192 word core storage, the size of the problem is limited to 60 variables, which require, in addition to several linear arrays of 60 elements, a matrix 61 by 61 or 3721 locations. While some expansion might be realized by adroit programming, a little study of the nature of the problem indicates that an expansion in capacity of several orders of magnitude together with a new concept of programming will be required to handle problems of the types commonly encountered in research.

To understand the nature of the problem encountered, consider the following example. Suppose that, as in the example of Section I, an experiment has been made consisting of a set of observations of six variables. Once again we regard one of the variables as a dependent variable and the remainder as predictor or independent variables. Assuming for the moment that only linear behavior is to be expected from any variable (a drastic simplification), it is apparent that the formal relation

$$Y = b_0 + b_1 X_1 + \dots + b_p X_p \tag{1}$$

is not completely descriptive of even this simplified case. This is because of the possible existence of interactions between variables. Extending the example proposed to include interaction requires the inclusion of sets of terms of the forms 1) $X_i$, 2) $X_i X_j$, 3) $X_i X_j X_k$, 4) $X_i X_j X_k X_l$, and in this case the single term $X_1 X_2 X_3 X_4 X_5$, as possible candidates for the predicting equation.

The number of such terms is found in the following way. Let there be K groups of $n_i$ distinct objects (i = 1, 2, ..., K) and let there be selected j objects $\left(j \le \sum_{i=1}^{K} n_i\right)$

at a time to form combinations. The number of such combinations is readily obtained for the case $n_i = 1$ for all i (the present example case). The number is (for $n_i = 1$)

$$N_j = \frac{K!}{(K-j)!\,j!}$$

and the case of K = 5 produces the table

TABLE OF N_j FOR n_i = 1 AND K = 5

  j     N_j     Cumulative N_j
  1      5            5
  2     10           15
  3     10           25
  4      5           30
  5      1           31

The table shows that even the simple example chosen has expanded the required storage capacity from a 6 + 1 square matrix of 49 locations to a 32 + 1 square matrix of 1089 locations. Furthermore, the usual problem does not permit the assumption of $n_i = 1$. The more general case may be determined if

(1) $n_i$ is constant for all i,
(2) selection occurs always between groups and not within groups.

Then, writing n for the common value of the $n_i$,

$$N_j = \frac{K!}{(K-j)!\,j!}\, n^j \tag{13}$$

Condition (1) is not unreasonable, and condition (2) simply requires that $n_i$ be large enough to include whatever terms might be desired to be generated within the smaller group. That is, if one considers X2 and X3 and wishes also to consider X5 = X2 * X3, then condition (2) requires that X5 be made a member of the $n_i$ (and not generated from X2 and X3).
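The counting argument above can be checked numerically. The following is a minimal sketch in Python (not part of the original 704 program); the function names `term_counts` and `matrix_locations` are hypothetical, and the matrix size follows the text's own convention of a square matrix with side equal to the number of candidate terms plus two.

```python
from math import comb


def term_counts(k, n):
    """Return rows (j, N_j, cumulative N_j) for j = 1..k.

    Uses N_j = C(k, j) * n**j, i.e. relation (13) with n function
    choices per variable; n = 1 gives the plain binomial case.
    """
    rows = []
    total = 0
    for j in range(1, k + 1):
        n_j = comb(k, j) * n ** j
        total += n_j
        rows.append((j, n_j, total))
    return rows


def matrix_locations(n_terms):
    """Storage for the square regression matrix, assuming (as the
    text's figures imply) a side of n_terms + 2."""
    return (n_terms + 2) ** 2


if __name__ == "__main__":
    for n in (1, 10):
        print(f"n = {n}, K = 5")
        for j, n_j, cum in term_counts(5, n):
            print(f"  j={j}  N_j={n_j:>7}  cumulative={cum:>7}")
        print("  matrix locations:", matrix_locations(term_counts(5, n)[-1][2]))
```

With K = 5 the sketch gives 31 candidate terms for n = 1 and 161,050 for n = 10, so the growth comes almost entirely from the factor $n^j$ on the higher-order interactions.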

Suppose that in the example 10 function choices are suggested for each of the five variables (the use of 20 or more is not uncommon in problems concerning a single independent variable). Neglecting interactions, the problem requires 52 * 52 locations. Considering interactions and using (13), one obtains the table:

TABLE OF N_j FOR n_i = 10 AND K = 5

  j       N_j       Cumulative N_j
  1        50              50
  2      1000            1050
  3     10000           11050
  4     50000           61050
  5    100000          161050

Obviously this is outside the range of even projected computers, since the matrix now requires $(161052)^2$ locations. The cost of solution by conventional methods is also prohibitive, since the solution of a 3-variable problem with 20 function choices per variable (which requires $(9260)^2$ locations) has been estimated to require 2500 hours on the 704.

Conventionally, work has progressed in this field by the expedient of setting the coefficients of all but a very few of these terms identically equal to zero. The formal relation (1) is such a reduction. This method, while enabling some attack to be made on otherwise nearly hopeless problems, suffers greatly for several reasons. First of all, the choice of omitted terms is a process of discarding thousands of terms to retain one. Secondly, the usual practice of relying on apparent fit to select terms before the

regression process begins may result in the omission of exactly the terms needed.*

* Experience with the regression program on single independent variable problems indicates that the terms added successively to the predicting equation bear little relation after the first step to their partial correlation coefficients with respect to the dependent variable. This is because the added terms are always charged with explaining the as yet unexplained variation in the dependent variable. Consequently, if the first term entered explains the dependent variable behavior quite well, the next term may be of quite different character in order to explain what is left by the first term.

A procedure is needed to conduct a search through thousands of possible terms, engaging only a few dozen at a time, to produce the predicting equation. To be as effective as possible, it would be very desirable to use each experience with the problem, whether successful or not, to learn more about the nature of the terms that are generally useful and thereby accelerate the search. Such techniques as "learning" and the "acquiring of experience" are generally associated with nonmechanistic organisms. Since it is proposed that these techniques be simulated by the 704 computer, this is the application of artificial intelligence to the problem.

The program has been written so that the machine is not presented with the condensed subset, as usually happens, but instead is given access to all possible terms and interactions within the bounds of the number of variables considered and the number of function choices per variable allowed. As usual in problems of this type, no straightforward procedure can be given to proceed to the solution that does not also appear economically prohibitive. It is not a matter of instructing the machine how to solve the problem, but instead of instructing the machine how to "learn" to solve the problem. Specifically, the machine must "learn" how to select terms so

that the set of terms chosen contains those terms needed to produce predicting equations of high precision. Much remains to be done in this new and vital area. The present effort contains only the most rudimentary learning but is written in such a way that more sophisticated learning models can be inserted fairly easily. Experience with the simple learning mechanism has been extremely encouraging.

Turning to the example of 5 variables and 10 functions per variable, the following discussion describes the nature of the learning scheme used by the program. Suppose that neither the nature of the more likely terms nor the relative importance of the various term classifiers is known a priori. A "term classifier" is one of the set of (1) interaction order identifier, (2) variable identifier, or (3) function identifier, and is used to classify terms as to the degree of interaction, variables involved, and functions of variables involved in the term. If such knowledge is presumed known before commencing the solution, means are provided to suggest either the initial set of terms to try or the initial distribution of weight among the term classifiers, or both, or neither. In the present case, neither is assumed to be supplied, so that the discussion may be understood for any other case where more information is given initially.

Since term classifiers are not assumed to be supplied, the program assumes no previous experience with the problem and accordingly sets the relative likelihood of all terms equal. This is accomplished by considering each of the classifiers as an array, the elements of which are the lengths of the components of a vector. Each component is initially set to a unit length.

Next the initial set of terms must be generated by the program. Each of the 161,050 possible terms in this example is equally likely at this stage. The program uses a pseudo-random number generator to select (1) an interaction classifier, (2) variables to satisfy the interaction selected, and (3) a function for each variable chosen for the interaction. As each term is selected, a check is made to be sure that it is not a duplicate of an earlier term chosen for the current pass. When the number of terms (less than 60) requested by the user for each pass has been chosen and entered in a term matrix, the program calls upon an editor program to examine the data and the term matrix and thereby generate the set of edited data required by the regression analysis program.

The editing process consists of operating on the raw data by referring to the term matrix for the definitions of the terms and to subroutines to carry out the generation of the terms. Each raw observation is converted into the edited data and a magnetic tape recording of the result is made. When all the data have been edited, the program turns to the Stepwise Regression Program to carry out the analysis exactly as before with respect to the set of terms chosen by the program.

Upon completion of the Regression Program for this selection of terms, a check of the generated predicting equation is made to see if:

(1) the Coefficient of Determination is as large as the user specified,
(2) the Standard Error of Estimate is as small as the user specified,
(3) the number of passes executed has not exceeded the limit set by the user.
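The term-selection step described above can be sketched as follows. This is a minimal illustration in Python, not the 704 code: the names `make_classifiers`, `pick_term`, and `pick_terms` are hypothetical, a term is represented as a tuple of (variable, function) index pairs, and Python's `random` module stands in for the program's pseudo-random number generator.

```python
import random


def make_classifiers(n_vars, n_funcs):
    """Classifier arrays with every component at unit length
    (no previous experience with the problem)."""
    return {
        "order": [1.0] * n_vars,      # interaction orders 1..n_vars
        "variable": [1.0] * n_vars,   # which variables enter the term
        "function": [1.0] * n_funcs,  # which function of each variable
    }


def pick_term(rng, cls):
    """Draw one term: an interaction order, then distinct variables,
    then one function per chosen variable."""
    n_vars = len(cls["variable"])
    order = rng.choices(range(1, n_vars + 1), weights=cls["order"])[0]
    pool = list(range(n_vars))
    weights = list(cls["variable"])
    chosen = []
    for _ in range(order):  # weighted draw without replacement
        i = rng.choices(range(len(pool)), weights=weights)[0]
        chosen.append(pool.pop(i))
        weights.pop(i)
    funcs = rng.choices(range(len(cls["function"])),
                        weights=cls["function"], k=order)
    return tuple(sorted(zip(chosen, funcs)))


def pick_terms(rng, cls, n_terms):
    """Fill the term matrix for one pass, rejecting duplicates."""
    terms, seen = [], set()
    while len(terms) < n_terms:
        term = pick_term(rng, cls)
        if term not in seen:
            seen.add(term)
            terms.append(term)
    return terms


if __name__ == "__main__":
    rng = random.Random(1960)
    for term in pick_terms(rng, make_classifiers(5, 10), 8):
        print(term)
```

Sorting the (variable, function) pairs gives each term a canonical form, so that the duplicate check does not depend on the order in which the variables happened to be drawn.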

If further work is allowed as the result of these checks, the program proceeds to examine the results of the pass just completed and in so doing acquires "experience" concerning the types of terms most suitable for future use. This "experience" is acquired by the student program as follows. Each term is checked against the list of terms included in the predicting equation. If a term has been successfully used in the relation, the student (1) retains the term to be used again, and (2) increases the probability of trying similar terms by incrementing the lengths of the vector components of the classifier arrays that chose the term. If the term was not successful, the student decrements the lengths of the vector components that selected the term. By modifying the vectors by an amount proportional to their current size, no term will ever be reduced to zero probability, but a term may have its probability made arbitrarily small but positive. In this way the arbitrary setting of huge blocks of coefficients to zero is avoided, and any term may at any time be used successfully and thereby become a member of the predicting equation until supplanted by a still better term.

After the student program completes the study of the previous run, the "experience" gained is utilized to select a new set of trial terms for the next pass. That is to say, the previously successful terms are retained from the former pass and the term matrix is filled out with terms chosen by using the modified classifier arrays and the random selection process. Since the classifier arrays have been modified, the selection of new terms no longer occurs with equal probability for all interactions, variables, and functions. Thus the search is less random and becomes more

nearly stepwise as success and failure direct the modification of the classifier arrays. So long as it is possible to retain terms used successfully on the previous pass and still select some additional term or terms, the program retains the previously successful terms. In this way, the new pass will always be at least as "successful" as the last pass. If, however, a new pass is called for and there is no room for additional terms, the program has encountered a "traffic jam," since a new pass would not be requested if the old selection had been good enough. In this situation, a fresh start is needed, but old "experience" may still provide valuable assistance in the selection of terms. The student program discards the old selection of terms (printout of the discarded set is automatic so that human study can be made of it) and selects a complete new set while retaining the "experience" imbedded in the classifier arrays. In this case the machine is completely able to handle the "traffic jams" without outside help.

Another pitfall which might be encountered by the program concerns the case in which the solution has progressed to a locally maximally successful predicting equation. In this case, any change appears to make the predicting equation less useful, and yet the present predicting equation is not good enough. An interesting property of the Stepwise Regression method for choosing the most desirable terms results in the ability of the program to work itself out of such a situation. In fact, several instances have been observed in which the program accepted somewhat poorer overall fits for one or two trials in order to retain particularly good terms, and on a succeeding trial found the fitted predicting equation to be several times better than the best previous equation.
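The retention rule and the "traffic jam" fallback described above can be summarized in a short sketch. It is written in Python for illustration only; `next_pass_terms` and its arguments are hypothetical names, and `draw` stands in for whatever weighted random selector supplies one candidate term at a time.

```python
def next_pass_terms(previous_terms, successful, n_terms, draw):
    """Choose the trial terms for the next pass.

    previous_terms: terms tried on the pass just completed
    successful:     set of terms retained in the predicting equation
    n_terms:        number of terms allowed per pass
    draw:           zero-argument callable returning one candidate term
                    (assumed to consult the current classifier arrays)

    Returns (terms, discarded). `discarded` is non-empty only in a
    traffic jam, where it holds the set to be printed for human study.
    """
    kept = [t for t in previous_terms if t in successful]
    discarded = []
    if len(kept) >= n_terms:
        # Traffic jam: no room for any new term, so start afresh.
        # The classifier arrays are untouched, so experience is kept.
        discarded, kept = kept, []
    terms = list(kept)
    seen = set(terms)
    while len(terms) < n_terms:
        candidate = draw()
        if candidate not in seen:  # reject duplicates within a pass
            seen.add(candidate)
            terms.append(candidate)
    return terms, discarded


if __name__ == "__main__":
    supply = iter(range(100))
    terms, dropped = next_pass_terms([0, 1, 2, 3], {1, 3}, 4,
                                     lambda: next(supply))
    print(terms, dropped)
```

Because the classifier arrays are held outside this function, a fresh start changes only the trial set and not the accumulated selection weights, which is the behavior the text attributes to the student program.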

In any case the process repeats itself, studying, grading, selecting, editing, fitting, until the conditions on the goodness of fit are met or until the desired number of passes has been used, whichever comes first. While one cannot be certain that the very best predicting equation possible has been found after any predetermined number of passes (a characteristic of iterative processes generally), the procedure insures that the best solution to date is preserved and that all trials contribute to the improvement of the selection process.

The learning scheme employed by the student program embodies many of the principles discussed by Friedberg, Dunham, and North in their articles on "Learning Machines" in the I.B.M. Journal. The student program extends these ideas and incorporates the advantages of both random search and stepwise search. Initial passes search rather randomly, looking for promising leads. As evidence accumulates, the mode of search becomes increasingly stepwise as the number of "good" terms retained grows. Thus the search narrows itself into promising areas and progress is made toward solution until either a solution is found or the allowable number of passes is exceeded, or until a "traffic jam" forces the random search to begin again. Random searching in the early stages is most promising since a poor start does not inhibit progress. Later stages have experienced some success, and therefore the modifications are less drastic, to allow the previous leads to be followed as far as they may prove to be profitable.

During the solution of any particular problem, it may happen that, when the data are operated upon by the editor program to produce the edited data, the size of the numbers generated may overflow or underflow the size of the IBM 704 word. In floating point arithmetic this may

occur whenever the editor produces a non-zero number with absolute value outside the range $10^{-18}$ to $10^{+18}$, because of a later production of the sums of the squares and cross products of the terms by the Stepwise Regression Program. In these circumstances, the student program cannot experience "learning" for those terms of correct size, since they have not yet been tried for the actual curve fitting, but the student program must "learn" about the selection of terms acceptable to the 704. Occasionally, it has been observed that the terms suggested by the curve fitting process and the terms acceptable in size to the 704 may not agree. The present learning mechanism is capable of correcting itself in this case without requiring human intervention.

Some final remarks may be of assistance in understanding the analysis. First of all, a given set of data may result in more than one predicting equation of a specified goodness of fit. This corresponds to the existence of several mathematical models of the system. Classically, this situation leads to the development of experiments capable of distinguishing between the models and the retention of those models which best describe the greatest variety of consistent circumstances accurately. By randomly restarting the problem this possibility may be investigated. If the program produces different predicting equations upon random restarting, more evidence is needed. Failure to produce different equations, however, does not guarantee freedom from such difficulty but decreases the probability of this difficulty.

Secondly, if previous experience with a problem is available, prudence usually dictates that the initial pass make full use of it. The program provides ready means for saving previous results and for

restarting with any or all of the previous classifier arrays and term selections intact. This same philosophy may be extended to initial runs in which the user's training and experience or previous encounters with similar problems may serve to generate an initial selection and/or weighting. The penalty for a poor guess is an increased number of passes, but a good guess results in considerable saving.

The Stepwise Regression Program with Simple Learning has been used successfully on many test problems and actual physical component modeling problems. In addition, interest in this technique has been generated in many diverse areas of the physical and social sciences. The ability to know precisely the worth of each and every term in a predicting equation, as well as the worth of the equation as a whole, as it is supported by actual evidence, should enable extensions of knowledge in many fields.

IMPLEMENTING THE STEPWISE REGRESSION PROGRAM WITH SIMPLE LEARNING

Communication of the Problem to the Program

As it was in the case of the Simulator Program, the immediate concern of the user of the Stepwise Regression Program is to communicate the problem to be solved to the program. Since the problem is essentially computational, the link is established through the use of a set of control cards. The program is designed to be very flexible in the analysis of the problem. Thus, the user must select the specific operations to be performed and the constraints to be imposed. The user must supply, in addition to the observed data, the following control cards:

1. Title card
2. Problem control parameter card
3. Solution control parameter card
4. Output control card
5. Simple Learning control card
6. Core and Tape Layout card
7. Initial Random Number card

Depending on the contents of the Problem control parameter card, one, several, or all of the following groups of cards may be required:

8. Ordered Term Insertion cards
9. Data deck
   9A. Format specification card
   9B. Observed data deck
   9C. End of data card
10. Accumulated Learning deck
11. Initial pass terms deck
12. Output Function Name card

In the foregoing list, items 8, 10, and 11 may be present or absent from the input deck depending on the contents of the Problem control parameter card. The remainder must be present in every input deck. The order of the deck follows exactly the order of the list.

Title Card

The title card allows the user to present any title that may be desired to be printed at the beginning of a new problem. Only one card may be used, and the title may appear anywhere within columns 1 thru 72. Ordinarily, the user will place the number one in column one so that the printing for the problem will begin on a new page. Whatever appears in column one is the printer carriage control character. If a blank card is used, the printer will simply single space the paper.

Problem Control Parameter Card

The function of the parameters on this card is to allow the specific problem being treated to be handled in accordance with the user's wishes. The format of the card is (I5, 3F10.5, 7I5, I2). The parameters, in order, are:

1) Problem Number, an integer modulo 32768. (cols. 1 thru 5)

2) Tolerance for division and round off error. A floating point bound such that if the magnitude of any divisor is less than this value no division will occur. This value is also used to limit round off error in the matrix manipulation. Typical values are 0.0001 to 0.0005. (cols. 6 thru 15)

3) Probability of insertion error. A floating point number in the open interval from 0. to 1. The value is the probability allowed by the user that the least significant term inserted into the predicting equation is erroneous. A value of 0.05 represents a risk of 1 chance in 20, a value of 0.01 represents a risk of 1 in 100, and so on. (cols. 16 thru 25)

4) Probability of a deletion error. A floating point number in the open interval from 0. to 1. The value is the probability allowed by the user that a term removed from the predicting equation for lack of support should have been allowed to remain in the equation. The probability of a deletion error must not exceed the probability of an insertion error. If it does, the program may reject every term offered. (cols. 26 thru 35)

5) Number of independent (predictor) variables. An integer less than or equal to 59. (cols. 36 thru 40) This value plus one is the total number of variables in the problem.

6) Number of functions to be considered for each independent variable. An integer less than or equal to 60. (cols. 41 thru 45) The choice of functions to be used by the program is determined by the subroutine PFNCT in the Editor Program Core (No. 4); the output section of the Program Generator Core (No. 8) should obviously be made to agree with these functions. The user is free to replace PFNCT if the need arises in any particular problem. The "standard" version provides automatically integer powers, integer roots, and their reciprocals. The extent of the set so generated depends on the number of functions. The order of these functions is (in MAD notation):

   Function No.
   1 ................ X(I).P.1
   2 ................ X(I).P.-1
   3 ................ X(I).P.2
   4 ................ X(I).P.-2
   5 ................ X(I).P.1/2
   6 ................ X(I).P.-1/2

and so on, repeating the pattern of functions 3, 4, 5 and 6 above for each increasing integer.

Typical values for this parameter are: 10 (yielding functions thru X(I).P.-1/3), 22 (yielding functions thru X(I).P.-1/6), 38 (yielding functions thru X(I).P.-1/10).

7) Number of terms to be tried at each solution pass. An integer less than or equal to 59. (cols. 46 thru 50)

8) Number of terms whose order of insertion is specified in the input data. An integer less than or equal to item (7). Usually this variable is set to zero, but it may be set positive and thus force complete control by the user over the order of insertion of terms. (cols. 51 thru 55)

9) Number of terms initially defined by the user. An integer less than or equal to item (7). Defined terms under this control will be used subject to the statistical analysis of the program unless overridden by item (8). If fewer than the total terms in (7) are defined by the user, the program will attempt to generate enough new terms to satisfy (7). (cols. 56 thru 60)

10) Parameter controlling the type of regression analysis executed by the program. An integer (cols. 61 thru 65) operating as follows:

   10A) If greater than zero, the data is treated with respect to the coordinate axes and the constant term is always suppressed to zero.

   10B) If equal to zero, the data is treated with respect to a set of axes translated to the means of the variables. The constant term is not suppressed.

   10C) If less than zero, the data is treated as in (A) but the constant term is not suppressed. The constant term is treated

just like every other term, except that the constant is always inserted as the first term of the relation and held until the next term is tried. After this point, all constraints are removed.

   (Type C is most useful in dealing with physical data. Type A is most useful when other information dictates a zero constant. Type B is most useful when dealing with data that tends to group itself about the means, as in biological and sociological problems.)

11) Parameter indicating whether the data is all of unit weight (parameter value not equal to zero), or weighted individually (parameter value equal to zero). (cols. 66 thru 70) If the parameter is zero, each set of data must carry a value of its weight.

12) Parameter indicating whether the program has previous "experience" with the problem. If not equal to zero (or blank), the program assumes that the accumulated learning deck is present. Integer variable in columns 71-72.

Solution Control Card

The solution control card communicates to the program the conditions under which the program is to cease calculation. The format is (2F10.5, I5). The parameters, in order, are:

1) Estimated Coefficient of Determination. Floating point variable in the range of 0. to 1.0. This parameter is the user's estimate of the expected goodness of fit between the predicting equation surface and the data. Perfect agreement is represented by 1.0; no agreement is represented by 0.0. Typical values for physical problems run from .95 to .999. (cols. 1 thru 10)

2) Standard Error of Dependent Variable. Floating point variable whose value is the user's estimate of the standard error of the dependent variable represented in this data. The value reflects the probable errors present in the data in units of the same kind as the data. (cols. 11 thru 20)

3) Number of Passes allowed for this problem. Integer variable whose value is the allowed number of complete passes to be made on the problem. Typical values run from 1 to 10. (cols. 21 thru 25)

The program will run until both conditions (1) and (2) are met, or until the passes are used up, whichever comes first. Condition (1) is met when the program has found an equation whose Coefficient of Determination exceeds or equals the specified value. Condition (2) is met when the program has found an equation whose Standard Error of the Dependent Variable is less than or equal to the specified value. Both conditions must be met in order to terminate the program before the allowed number of passes is used.

Output Control Card

The program must perform a variety of subsidiary calculations during the equation generation process. The output control card allows the user to suppress those extra calculations and printing for which he has no need. If a blank card (no suppression) is used, all calculations will be printed. Since, for most problems, this represents a very sizeable volume of printing, the user is cautioned to select only those items of real interest. Punching a numeric 1 in the column corresponding to the item number given below will suppress printing of the calculation. Either a numeric 0 or a blank allows the printing to occur.

The suppressible output parameters, in order by column number, are:

   Column No.
   1) Raw Sums of Squares and Cross Products
   2) Average (Mean) values
   3) Residual Sums of Squares and Cross Products
   4) Standard Deviations
   5) Partial Correlation Coefficients
   6) Intermediate Steps in Regression process
   7) Predictions using the intermediate step equations
   8) Predictions using the final equation
   9) Values of terms for each set of observations

If all of the above are suppressed, the output will consist of:

   1) Listing for verification of all raw data.
   2) Definitions of terms used for each pass.
   3) Final equation found for each pass, with pertinent statistics. That is, the F level of the last term treated, the standard error of the dependent variable, the coefficient of determination, the multiple correlation coefficient, the constant term, if any, and the coefficients and their standard errors for all terms finally retained in the equation.
   4) The diagonal elements of the regression matrix.
   5) The equation produced by the last pass in M.A.D. subroutine form, both printed and punched on cards ready for processing.
   6) The final status of the "learning" mechanism punched on cards for use in restarting future problems.
   7) The terms to be used for the next presentation of the problem to the program (on cards).
   8) A pseudo-random number card to allow the random number generator to continue the sequence.
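The fixed-column card layouts described in this section lend themselves to a simple column-slicing reader. The following Python sketch is illustrative only and is not the original input routine: the function and field names are hypothetical, blank fields are assumed to read as zero, and floating point fields are assumed to be punched with an explicit decimal point.

```python
def _to_int(field):
    field = field.strip()
    return int(field) if field else 0      # blank assumed to mean zero


def _to_float(field):
    field = field.strip()
    return float(field) if field else 0.0  # decimal point assumed punched


def parse_problem_card(card):
    """Split a Problem Control Parameter card laid out as
    (I5, 3F10.5, 7I5, I2), i.e. columns 1 thru 72."""
    card = card.ljust(72)
    float_names = ["tolerance", "p_insert", "p_delete"]
    int_names = ["n_variables", "n_functions", "n_terms", "n_ordered",
                 "n_defined", "regression_type", "unit_weight"]
    result = {"problem_number": _to_int(card[0:5])}
    col = 5
    for name in float_names:               # cols. 6 thru 35
        result[name] = _to_float(card[col:col + 10])
        col += 10
    for name in int_names:                 # cols. 36 thru 70
        result[name] = _to_int(card[col:col + 5])
        col += 5
    result["has_experience"] = _to_int(card[col:col + 2])  # cols. 71-72
    return result


def suppressed_outputs(card):
    """Item numbers (1 thru 9) suppressed on an Output Control card:
    a numeric 1 in column k suppresses item k; 0 or blank prints it."""
    return [k + 1 for k, ch in enumerate(card[:9].ljust(9)) if ch == "1"]


if __name__ == "__main__":
    card = ("{:5d}{:10.5f}{:10.5f}{:10.5f}".format(1, 0.0001, 0.05, 0.01)
            + "".join("{:5d}".format(v) for v in (5, 10, 20, 0, 0, -1, 1))
            + "{:2d}".format(0))
    print(parse_problem_card(card))
    print(suppressed_outputs("100101"))
```

In the usage example the regression type of -1 corresponds to case 10C (constant term not suppressed), and the output card suppresses items 1, 4, and 6.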

For most applications, the automatically produced output is sufficient. The next most generally interesting results are items (6) and (8) in the first group. If (4), (7) and (8) are suppressed, some calculation is also suppressed, thus reducing execution time. The user is cautioned again that a request for all of this printing will, in general, produce a very sizeable output.

Simple Learning Control Card

The user may, at this stage in the development of artificial intelligence programs, control the characteristics of the learning mechanism. Use of the external function structure for the program allows fairly easy modification of the various parts of the program. With the "standard" learning mechanism as it is now used, data is accumulated concerning three kinds of selections:
1) Order of Interaction
2) Variables Entering Interaction
3) Functions of the Variables.

A term is generated by selecting an interaction order, next the variables to be concerned in the interaction, and finally the functions of the variables selected. The term is the cross product of the functions of the variables selected. The program must "learn" which interactions are most useful in explaining the data, which variables are most useful, and which functions of these variables are most useful.

The program "learns" by trying to use terms selected by the program to explain the data. If the mechanism has selected a term which is supported by the data and retained by the regression analysis, the mechanism that selected that term is modified so as to be more likely to select terms of a similar character. On the other hand, if a term is not supported by the data and is, thus, of no apparent utility in the equation, it is cast out and the mechanism is adjusted to be less likely to select terms of similar character. Since the probability of selection of any component of any allowed term should be bounded positive, the program uses a "half-life" constant to modify the probabilities. In this way, the relative probability of any term may be made arbitrarily small, but remains positive.

The usual card is of format (I5, 4E16.8). If the mechanism is modified to require more constants, succeeding cards (up to 2) are of format (E21.8, 3E16.8). The parameters are:
1) Number of constants used by the "learning" mechanism. Integer in cols. 1 thru 5. (The standard mechanism uses 3 constants. The numeric 3 is punched in col. 5.)
2) The constants used by the "learning" mechanism. The standard mechanism uses:
2A) The "half life" of the Interaction selector. Typical value is 3.0E00. This means that three consecutive successes will double the present probability (or conversely, three consecutive failures will halve the present probability).
2B) The "half life" of the variable selector. Typical value is 3.0E00.
2C) The "half life" of the function selector. Typical value is 1.5E0. Since any function may be used relatively infrequently, it is somewhat desirable to take more powerful action on each encounter.
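The effect of a "half life" constant can be illustrated numerically. The text does not give the exact update formula, so the sketch below assumes one form that is consistent with the description: each success multiplies the relative probability by 2^(1/h) and each failure by 2^(-1/h), where h is the half life. The function name `adjust` is hypothetical, and Python is used only for illustration.

```python
def adjust(p: float, half_life: float, success: bool) -> float:
    """Scale a relative probability by 2**(+1/h) on success, 2**(-1/h) on failure.

    Assumed update rule: it doubles p after h consecutive successes and halves
    it after h consecutive failures, and p always stays positive.
    """
    step = 1.0 / half_life
    return p * 2.0 ** (step if success else -step)

p = 1.0
for _ in range(3):           # three consecutive successes with half life 3.0
    p = adjust(p, 3.0, True)
print(round(p, 6))           # the relative probability has doubled
```

Under this assumed rule a larger half life gives a multiplier closer to one, which agrees with the observation that large constants make the mechanism "learn" more slowly.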

A great deal of work remains to be done in exploring "learning" mechanisms. It should be observed here that as the constants are made larger, the mechanism "learns" more slowly. In fact, for very large values of the constants the mechanism is essentially deactivated. Very small values of the constants, on the other hand, may cause wildly erratic behavior of the mechanism since each encounter so strongly distorts the relative probabilities. The values given have received much use and appear to give quite stable operation although not necessarily optimum convergence. The user will ordinarily duplicate this card and the next one from run to run.

Core and Tape Layout Card

In order to allow easy extension of this program in the future, the multiple core program arrangement and the tape layout can be changed by using this card without disrupting the entire program. The present core arrangement and tape layout are the following. (Users wishing to modify the layout are advised to study the program flow charts carefully.)
1) The Starting Program is in two consecutive core loads. The first core of the Starting Program is now core 1. Punch 1 in column 5.
2) The Student Program is one core load and is now core 3. Punch 3 in column 10.
3) The Editor Program is one core load and is now core 4. Punch 4 in column 15.
4) The Regression Program and Program Generator Program are three core loads, the first two of which are the Regression Program. These must be consecutive core loads. Punch 5 in column 20.

5) The Processed Data erasable tape is now tape 3. Punch 3 in column 22.
5A) The Selector mechanism is now stored as the first record on tape 3. Punch 1 in column 24.
5B) The Raw Data after processing into term values is now stored beginning as the second record on tape 3. Punch 2 in column 26.
5C) The Terms selected for each pass are now stored as the second record on tape 3. Punch 2 in column 28.
6) The Raw Data erasable tape is now tape 4. Punch 4 in column 34.
6A) The raw data is now stored on tape 4 in binary beginning as the first record on the tape. Punch 1 in column 56.

Space is allocated in storage for five tape record assignments for each tape. At this time, the only assignments are the above.

Initial Random Number Card

The random number subroutine used by the Simple Learning Mechanism may be reset arbitrarily at the beginning of each problem. Since the program will produce one of these cards at the end of the problem, the sequence may be continued easily. The format is 5I10. The first number is any odd integer modulo 32768 (decimal). The second number is any integer modulo 32768 (decimal). The third number is any integer modulo 32 (decimal). The subroutine combines these integers to form one 35 binary digit odd integer. This integer serves as the first member of the pseudo-random number sequence generated by the library subroutine RAM2.
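The three fields of the Initial Random Number Card supply 15 + 15 + 5 = 35 binary digits. The text does not state how the subroutine arranges them, so the following sketch assumes a simple concatenation in which the first (odd) number occupies the low-order 15 bits, which guarantees an odd result. The function name `combine_seed` and the bit order are assumptions made for illustration only, and Python is used only as an illustrative notation.

```python
def combine_seed(first: int, second: int, third: int) -> int:
    """Pack three card fields into one odd 35-bit integer (assumed bit order)."""
    if first % 2 == 0:
        raise ValueError("the first number must be odd")
    first %= 32768    # 15 bits, low-order, keeps the result odd
    second %= 32768   # 15 bits
    third %= 32       # 5 bits, high-order
    return (third << 30) | (second << 15) | first

seed = combine_seed(12345, 678, 9)
print(seed % 2 == 1, seed < 2 ** 35)
```

Whatever arrangement the library subroutine actually uses, the requirement that the first field be odd is what ensures an odd starting integer in a scheme of this kind.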

Ordered Term Insertion Card(s)

If parameter (8) of the Problem Control Parameter Card is nonzero, the user must supply a set of cards to define the order in which terms are to be inserted. This allows an arbitrary equation to be generated without regard to the statistical analysis, after which the statistical analysis may be used to discard those terms that are not sufficiently important to meet the deletion error criterion. If a theoretical relation is available for which a study is being made to determine how the relation may be improved, this feature may be useful. Otherwise, one must assume the risk that some of the theoretical terms will be displaced in the search. The user must be aware that the use of these term order cards is a severe constraint on the analysis and treat the results accordingly.

The format is 14I5. Each five columns contains an integer whose value is the number of a term to be inserted. The first number is the first term to be inserted, the second number is the second term, and so on. For example, if parameter (8) on the Problem Control Card were three and column five on the Ordered Term Insertion Card were six, column ten were three and column fifteen were one, the effect would be to insert term six, then term three and then term one, after which the Stepwise Regression Program would examine the equation to be sure that these terms meet the deletion error criterion. If any of these terms fail the test they will be discarded. When all of the terms in the equation meet the deletion error test, the remaining terms not yet in the equation will be tested for insertion. From this point on, the standard analysis is followed.
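The 14I5 layout of the Ordered Term Insertion Card can be illustrated by a small reader, shown here in Python for illustration only. The function name `parse_term_order` is hypothetical, and the treatment of a blank field as zero follows the usual FORTRAN convention for integer fields, which is assumed rather than stated in the text.

```python
def parse_term_order(card: str, count: int) -> list[int]:
    """Read the first `count` five-column integer fields of a 14I5 card."""
    if not 0 <= count <= 14:
        raise ValueError("a 14I5 card holds at most fourteen fields")
    fields = [card[i:i + 5] for i in range(0, 70, 5)]
    # Blank fields are read as zero (assumed FORTRAN-style behavior).
    return [int(f) if f.strip() else 0 for f in fields[:count]]

# Six in column five, three in column ten, one in column fifteen.
print(parse_term_order("    6    3    1", 3))
```

Here `count` plays the role of parameter (8) on the Problem Control Card, so the example reproduces the insertion order of term six, then term three, then term one.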

Data Deck Preparation

1. Format Specification Card

Since the data may come from various sources, the data deck allows the data format to be specified at execution time. This is done by using a standard FORTRAN format statement beginning with the word FORMAT (beginning in column seven and ending in column thirteen), followed by any allowable format specification that can be placed on one card and terminating with a right parenthesis in or before column 72. For example, the following format statements, each beginning in column 7, would be acceptable:

FORMAT (5F10.2,E16.8)
or
FORMAT (4E16.7,F10.1/4E15.8)

This card must immediately precede the data deck, and is known as the Format Specification Card.

2. Data Cards

Following this card are the data cards. Arbitrary formats are allowed as described above. The data must, however, be listed for each observation in the following order. (All values are floating point numbers.)
1) Observation Number. There must be a positive observation number for every observation. Run number one must appear once and only once.
2) Independent Variables. These values are listed in order following the observation number. The values must correspond to one observation group.
3) Dependent Variable. The variable whose value is to be predicted must follow the predictor or independent variables.

4) Weight of this Observation group. The weight of the group may be specified or can be assumed to be unity, depending on parameter (11) of the Problem Control Card.

3. End of Data Card

The actual data is then followed by a completely blank data set which acts as a termination for the data. If any Observation Number is blank or less than or equal to zero, the data input is terminated at that point. Therefore, the user must take care in preparing the data deck so that the entire set of data will be read into machine storage. The program automatically counts the weighted data sets to establish the degrees of freedom for the analysis. In this way, new data can be added to the data deck and/or old data can be deleted very easily.

The Accumulated Learning Deck

Whenever a multiple independent variable problem occurs in which a large number of functions are allowed for each independent variable and interactions of all orders are admitted, the result is the generation of a very large set of possible terms that may appear in a predicting equation. Since the most desirable equation consists of a "minimal" set of these terms, comprising those terms most significant in explaining the dependent variable behavior, it becomes apparent that the analysis must usually perform a selection process while dealing with a segment of the entire set of terms at each encounter.

If it is possible to verify the validity of terms independently from their method of initial selection, then it becomes feasible to allow the machine to select the terms using some heuristic method. The terms so selected will not always be the correct ones or even the "best" ones, although the method of selection should certainly tend to operate in this way. The important point to observe is that the validity of the term is tested by the regression analysis independently of the selection, and the regression analysis is, therefore, not affected in any way by the heuristic method of selection. Because of this, the heuristic term selection method is free to select terms using any convenient scale for choosing the terms. If the terms so selected are shown to have validity by the regression analysis, then the heuristic method that selected the valid term is modified so as to be more likely to select similar terms. A converse action occurs whenever the term is shown to be invalid.

At the completion of each solution pass the current status of the selector mechanism is represented by a set of values which give:
1) The relative probability of each Interaction Order
2) The relative probability of each Independent Variable
3) The relative probability of each function of each variable.

Whenever no accumulated learning deck is available, parameter (12) of the problem control parameter card is set equal to zero. The result is that the program will then assign equal unit relative probability to all interactions, variables and functions of variables.
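A selector driven by the three sets of relative probabilities listed above can be sketched as follows, in Python for illustration only. The function name `select_term`, the zero-based variable and function indices, and the choice to draw distinct variables for one interaction are all assumptions; the text specifies only the order of selection (interaction order, then variables, then functions).

```python
import random

def select_term(order_w, var_w, func_w, rng):
    """Draw one term as a mapping {variable index: function index}.

    order_w[k-1] is the relative probability of interaction order k,
    var_w[i] that of variable i, and func_w[i][j] that of function j
    of variable i. Weights need not sum to one.
    """
    max_order = min(len(order_w), len(var_w))
    order = rng.choices(range(1, max_order + 1), weights=order_w[:max_order])[0]
    remaining = list(range(len(var_w)))
    term = {}
    for _ in range(order):
        # Draw a variable not yet in the term (assumed: no repeats).
        v = rng.choices(remaining, weights=[var_w[i] for i in remaining])[0]
        remaining.remove(v)
        term[v] = rng.choices(range(len(func_w[v])), weights=func_w[v])[0]
    return term

# Equal unit relative probabilities, as when no learning deck is supplied.
rng = random.Random(0)
term = select_term([1.0, 1.0], [1.0, 1.0, 1.0], [[1.0] * 4] * 3, rng)
print(term)
```

Because only relative weights are used, the values need not be normalized before each draw.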

If previous encounters with similar problems have occurred, however, the program has already had "experience" with a similar problem and can be allowed to take advantage of these encounters by providing the accumulated learning deck that was automatically produced at the conclusion of the former problem together with the new problem data. If the user desires to transmit this information to the program, parameter (12) of the problem control parameter card is set equal to one and the accumulated learning deck is placed after the data deck.

The user can also suggest his own experience to the program by preparing an accumulated learning deck. The format is 5E14.7. The relative probabilities are inserted in the following order:
1) Relative probabilities for Interactions from first order to the maximum order for the problem.
2) The sum of the preceding probabilities (1).
3) Relative probabilities for Independent Variables from the first to the last in the same order as they appear in the data deck.
4) The sum of the preceding probabilities (3).
5) Relative probabilities for each function of the first variable, followed by the sum of these probabilities.
6) Relative probabilities as in (5) for the second, third, etc. variables.

The preceding items are punched successively in the available fields as specified by the format. No blank fields are permitted between groups since every field is interpreted consecutively.
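The ordering of the accumulated learning deck can be made concrete with a short sketch that lays the values out five to a card. It is written in Python for illustration only; the function name `learning_deck` is hypothetical, and the exponent notation produced by Python's `E` format is an approximation of a FORTRAN E14.7 field rather than an exact reproduction of the machine's punched output.

```python
def learning_deck(orders, variables, functions):
    """Return card images holding each group of weights followed by its sum."""
    values = list(orders) + [sum(orders)]
    values += list(variables) + [sum(variables)]
    for weights in functions:            # one group per independent variable
        values += list(weights) + [sum(weights)]
    cards = []
    for i in range(0, len(values), 5):   # five 14-column fields per card
        cards.append("".join(f"{v:14.7E}" for v in values[i:i + 5]))
    return cards

# Two interaction orders, two variables, three functions per variable.
cards = learning_deck([1.0, 1.0], [1.0, 1.0], [[1.0, 1.0, 1.0]] * 2)
for card in cards:
    print(card)
```

The values run on from card to card with no blank fields between groups, so the third field of the first card holds the sum of the interaction probabilities.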

The accumulated learning produced by the machine program has each relative probability normalized so that the mean relative probability is unity. In this way, a problem may be easily expanded to more variables, functions, etc. and still retain previous "experience" by making all new entries of unit value. Because of the independent regression analysis of the terms chosen from the accumulated learning, it must be emphasized that this mechanism cannot force the adoption of incorrect terms. Rather, such an incorrect set of "experience" would be modified progressively by the program. If the "experience" supplied is valid for the current problem, the result is to speed the generation of the desired equation, but invalid "experience" can only temporarily delay this generation. In general, if good experience is available from previous similar problems or from the user's background, the user is strongly advised to make use of it.

Initial Pass Terms Deck

The user may suggest any initial terms that may be desired for the first pass. If the suggested terms stem from theoretical consideration and the theory is in agreement with the data, such a suggestion will speed the generation of the predicting equation by ensuring an early treatment of likely terms. Any number of terms may be given initially up to the total number of terms allowed for each pass. The number of terms to be so defined by the user is given by parameter (9) on the problem control parameter card. If fewer than the total number of terms to be tried at each pass (parameter (7)) are given initially by the user, the machine program will generate the remainder of the set by using the accumulated learning and the selector mechanism.

At the end of each problem and immediately following the production of the accumulated learning deck, the program produces a set of terms to begin the next encounter with the problem. If the problem is continued later, these terms may be supplied by simply including these cards following the accumulated learning deck. If the user wishes to suggest terms, the procedure is the following. (The card format is 14F5.0.)
1) Produce a term card (or cards if sufficient variables are present) for each term desired.
2) Treat each consecutive field in the above format specification as in one to one correspondence with the independent variables in the problem.
2A) Insert the appropriate function number in the field corresponding to the desired variable.
2B) Leave blank (or zero) every variable field not associated with the term.
3) Insert the interaction order in the field immediately following the last variable field.

For example, suppose the user is dealing with two independent variables and the "standard" PFNCT subroutine and desires to form the term: X(1).P.3*X(2).P.-1/2.

The term is specified by punching seven in column five, six in column ten and two in column fifteen. See "standard" functions in PFNCT as defined by number. The seven selects the function integer power three and the appearance in column five assigns this function to variable one in this term. The two in column fifteen declares the term to be a second order interaction and thus produces the desired multiplication of the previous functions of the variables. In this way any desired term allowed by the subroutine PFNCT (and hence allowed by the user) can be specified as an initial term. This is true even when the function numbers in the initial term specification exceed the value of parameter (6) on the problem control card. Of course these terms are, in this instance, excluded from automatic generation but will, nevertheless, be used correctly and preserved from pass to pass correctly so long as they are in agreement with the data. Once a term is discarded from the set of terms, only those terms allowed by parameter (6) can be regenerated.

Output Function Name Card

At the conclusion of each problem the program produces, in printed form and on punched cards, the external function subroutine for the last regression equation found by the analysis. This function is ready for immediate translation by the MAD translator and may be used in any program as the user may desire. The Output Function Name Card assigns a name to this subroutine. If a blank card is supplied at this place in the input deck, the function name will be left blank. Otherwise the desired name is entered somewhere in columns 1 thru 72 on the Output Function Name Card.

The program will use the last six non-blank alphanumeric characters as the function name. If a total of fewer than six non-blank alphanumeric characters appear in columns 1 thru 72, these characters are taken to be the function name. The rules for allowable function names are those of the MAD translator. If the desired function name consists of exactly six alphanumeric characters, the user is then free to insert any desired comment before the desired name. The comment will, in this case only, be ignored. Examples of allowable function names: ETA17; PRATIO; TORQUE; EFF23
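The naming rule can be illustrated with a few lines of Python, used here for illustration only. The function name `function_name` is hypothetical; the rule itself (the last six non-blank alphanumeric characters in columns 1 thru 72) is taken from the description above.

```python
def function_name(card: str) -> str:
    """Return the last six alphanumeric characters found in columns 1 thru 72."""
    chars = [c for c in card[:72] if c.isascii() and c.isalnum()]
    return "".join(chars[-6:])

print(function_name("PRESSURE RATIO FUNCTION PRATIO"))  # six-character name, comment ignored
print(function_name("ETA17"))                           # fewer than six characters
print(function_name("EFFICIENCY ETA17"))                # comment leaks into a short name
```

The third call yields YETA17 rather than ETA17, which shows why a leading comment is safe only when the desired name has exactly six characters.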

The Structure of the Program

The Stepwise Regression Program, like the Simulator, is structured in several sections. Each section performs certain tasks which cause, as a result of the performance, a selection of a new section of the program to be performed. There are seven basic sections:
1. Input Section
2. Initial Term Section
3. Student Section
4. Editor Section
5. Regression Statistics Section
6. Stepwise Regression Analysis Section
7. Program Generation Section

The Input Section brings into the program all of the data associated with a given problem. The control parameters and data are all entered at one time so that if any data set encounters trouble the input tape will be properly positioned for the next problem. As discussed in the section on communicating the problem to the program, the initial terms may be supplied by the user or chosen by the machine as desired. If supplied, the Input Section will then bypass the Initial Term Section and the Student Section.

If the initial terms were not supplied, the program calls the Initial Term Section to choose the terms for the first pass. The selection is based on the Accumulated Learning supplied by the user. If no Accumulated Learning is given, the program assumes initially that all possible terms are equally probable and proceeds with this assumption. The terms are selected by choosing:
1. Interaction Order
2. Variables for the interaction
3. Functions of the Variables.

After selecting the initial terms, the control passes to the Editor Section. In this section, the terms defined earlier are evaluated for every data point. If an imminent overflow or underflow of the machine register capacity is found, the faulty term is rejected and the problem is passed to the Student Section so that the Accumulated Learning may be adjusted to tend to avoid such a recurrence. If no such machine limitation is found, the control passes to the Regression Statistics Section.

The Regression Statistics Section computes raw sums of squares and cross-products, means, standard deviations, sums of squares and cross-products adjusted to means, and simple (product-moment) correlation coefficients. Since these statistics are generated in a conventional manner, reference is made here to suitable texts for elaboration (11, 12, 18, 1[?]). The important item of interest is that all of these statistics are available for the study of the data. Upon completion of these tasks, the results in the form of the regression matrix are passed along with the control to the Stepwise Regression Analysis Section.

In the Stepwise Regression Analysis Section the techniques of Efroymson and Dallemand are employed but modified slightly to allow more flexible manipulation of the analysis by the user. Four basic analyses may be performed:
1. Analysis for fit of data about means.
2. Analysis for fit of data with respect to the coordinate planes with constant term suppressed.
3. Analysis for fit of data with respect to the coordinate planes using constant term.
4. Controlled term insertion order analysis.

Of these analyses, the fourth is most risky since the user overrides the statistical analysis. If the user finally removes the imposed constraints on the term insertion process, however, the Stepwise Regression Analysis Program will automatically discard any terms that are not sufficiently correlated with the data. In this way, occasionally, a special problem may be studied to advantage. The other three analyses are basically similar except as they are related to the coordinate systems and the constant term. The analysis proceeds as follows:
1) Select the term with the greatest contribution to the explanation of the as yet unexplained variance of the data.
2) Compare the variance contribution for this term with a random variable to determine whether the contribution could be due to chance.
3) Insert the term if and only if the variance contribution exceeds that of a random variable by whatever amount the user wishes to specify. Commonly the term is inserted if the risk is less than one in twenty to one in one hundred.
4) Review all terms in the equation and reject any that may have been reduced in importance below the user's standard (by combinations of other terms, etc.).
5) Continue this process until none of the terms not in the equation can meet the standard set in (3).

Upon completion of the Stepwise Regression Analysis, the coefficients for each term found to be valid by the analysis are produced together with the corresponding standard errors for the coefficients. Other statistics produced at this point are the multiple correlation coefficient, the coefficient of determination, the standard error of the dependent variable with respect to the regression equation, and the regression constant, if any. The diagonal elements of the inverse regression matrix are also printed for study.

If desired, the user may request the calculation and printing of a point by point comparison of the data and the predicting equation results. This calculation displays, in addition to the actual value produced by the regression equation, the predicted values plus and minus one standard deviation. This band of values may be expected to include the true value of the dependent variable 68 percent of the time. Finally, the deviations and the percentage deviations of the predicted points and data points are produced. During the process the largest absolute deviation and the largest absolute percentage deviation are sorted out and printed.

Until the regression equation produced by the Stepwise Regression Analysis satisfies the criteria set by the user, the analysis next returns to the Student Section to reevaluate the Accumulated Learning stored in the selecting mechanism. The criteria set by the user are:
1. Standard error of the dependent variable
2. Coefficient of determination
3. Maximum number of solution passes allowed.

The analysis continues until the generated equation properties equal or better the first two criteria, or until the allowed number of passes is consumed. This action takes place by recognizing the separation of terms by the regression analysis into those terms sufficiently correlated with the data to be included in the regression equation and those terms not this well substantiated.

All of the terms inserted in the regression equation are retained for the next program unless this would not allow the selection of any new terms. In addition, the values of all portions of the selecting mechanism involved in choosing the successful terms are increased so as to make the selection of similar terms more likely. Finally, the values of all portions of the selecting mechanism involved in selecting the unsuccessful terms are reduced to make the selection of similar terms less likely. The modification of the selecting mechanism occurs on an exponential decay basis. This technique allows the user to specify the number of failures required to reduce a particular element of the term selection mechanism by one half. Conversely, this value is the number of successes required to double the relative likelihood of the element of the selecting mechanism. The elements of the selecting mechanism are:
1. The interaction selector
2. The variable selector
3. The variable function selector.

After the selecting mechanism is modified by this process, the terms for a new attempt by the regression analysis are selected using the modified mechanism. In this way extremely large sets of possible terms may be searched in a very effective manner. An interesting property of the technique is the ability of the method to work out of "local optima" in the production of the regression equation. That is, the location of very highly correlated terms may result in an equation which may fit less well than an earlier, more complicated relation. This discovery may often be used on later trials to modify the selecting mechanism and thus find an equation better than any previous relation.

When the criteria set by the user are satisfied, either by producing an equation that meets the specified statistical criteria or by using up the allowed number of trials, the control is passed to the Program Generation Section. This section produces the final equation as a subroutine both in print and on punched cards ready to be included in any program as may be desired. Versions are available to produce the subroutine in either the M.A.D. language or the FORTRAN language.

Upon completion of the output of the subroutine form of the regression equation, the program takes one additional step. The control is passed to the Student Section, the Accumulated Learning is once again modified, and the set of terms is chosen for a re-entry of the problem at any later time. This information is preserved on punched cards in the exact format expected by the Input Section. It is important to recognize that this information is pertinent not only to the problem at hand but also may be used to expedite the solution of any similar problem.

This carrying forward of artificial "experience" is an important and unusual feature of the Stepwise Regression Program with Simple Learning. By this means, the program is enabled to accelerate the generation of predicting equations by recognizing the information latent in previous encounters with similar problems. It is also important to observe that if such artificial "experience" is incorrect for the attempt being made, no effect upon the generated equation will be observed except for the use of one or more passes to correct the Accumulated Learning and proceed to the generation of the predicting equation. This action occurs because the data itself is the only finally determining source of information upon which the predicting equation can be based. The Accumulated Learning, if correct, can accelerate the determination, but if incorrect cannot prevent the determination.
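The pass-by-pass control flow described in this section can be summarized in a short sketch, written in Python for illustration only. The function `run_passes` and the callable `fit_pass` are hypothetical stand-ins: `fit_pass(n)` is assumed to perform one complete pass (term selection, editing, regression) and to return the coefficient of determination and the standard error achieved on pass `n`. It is also assumed that at least one pass is allowed.

```python
def run_passes(fit_pass, r2_target, se_target, max_passes):
    """Repeat passes until both statistical criteria hold or passes run out.

    Returns (pass number, coefficient of determination, standard error)
    for the last pass performed.
    """
    result = None
    for n in range(1, max_passes + 1):
        r2, se = fit_pass(n)
        result = (n, r2, se)
        # Both criteria must be met to stop before the passes are used up.
        if r2 >= r2_target and se <= se_target:
            break
        # Otherwise control would return to the Student Section here.
    return result

# Illustrative (made-up) statistics for four successive passes.
history = [(0.50, 4.0), (0.80, 2.0), (0.95, 0.9), (0.99, 0.5)]
fit = lambda n: history[n - 1]
print(run_passes(fit, 0.90, 1.0, 4))
```

In either outcome the program would then proceed to the Program Generation Section and make the final adjustment of the Accumulated Learning.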

CONCLUSIONS

The programs which have been discussed in this paper constitute two new and advanced tools for the study and analysis of the behavior of physical systems. The Simulator Program provides a tool for undertaking the simulation of complicated systems. The flexibility inherent in this technique of analysis is made possible by providing for extension and modification of the library within the structure of the Simulator. It is important to understand that the Simulator Program provides a means for bringing to the analysis of systems the very best methods and most applicable techniques. In this way, the Simulator Program proceeds from the information supplied by the user (the System Definition and the Constraints on Input Parameters to be supplied to and the Results desired from the generated simulation program) to produce a procedure, or algorithm, to simulate the operation of the defined system when translated and executed on the digital computing machine.

The Stepwise Regression Program with Simple Learning provides the technique which can supply the programs produced by the Simulator with subroutines to implement the methods extracted from the Library of Element Descriptions. This technique is more generally applicable and has already been utilized in the process of its verification to supply useful predicting equations for data taken from many diverse sources. These sources have included calibration data for thermocouples, fatigue life data for plastic gears, electron tube characteristics, electric transmission line loss characteristics, thermodynamic properties of steam, tool life characteristics, psychological test data and data on the effects of various drugs on human subjects. In addition to these diverse areas, listed to illustrate the versatility of the technique, the technique has been applied successfully to the determination of the characteristics of the steam power plants simulated. These characteristics include the expansion line characteristics of the turbine, the extraction pressure characteristics, the exhaust loss characteristics, the generator loss characteristics and special flow leakage characteristics.

In all of these areas, the work thus far has been most encouraging. There is, however, a great deal of work remaining in perfecting, extending and increasing the generality of the techniques. The methods developed thus far hold considerable promise in other areas. Perhaps some of the most significant extensions will come from the study of the results produced by these techniques, by allowing a powerful analysis of the data sets.

SYSTEM SIMULATOR FLOW DIAGRAMS AND CORE LAYOUTS

(Comments to assist the interpretation of the flow diagrams in this section may be found beginning on page 227.)

SAP SUBROUTINES NOT FLOW DIAGRAMMED

ONLINE, INPUT, OUTPUT, OCTDEC, INSRTC, EXTRC, INBIT, IFBIT, NOT, OR, SKPFIL, TAPSEL, OCT, IFABIT, ANDBIT, OUTPT 1, PNCH, PNCH12, RNDM1B, AM1BLD, AM1CNT, AMIEIM, AM1SET, AM1SUB, SIGN, AND

CORE LAYOUT

(Column headings: MAIN, EXTERNAL FUNCTION, INTERNAL FUNCTION, SAP. The column alignment of the original table is not recoverable; entries are listed in their printed order.)

CORE #1 INPUT: NUCOP, TAPEIN, CONCK, INPUT, OUTPUT, OCTDEC, ONLINE, SAVTPH, INSRTC, EXTRC

CORE #2 ELEMENT DESCRIPTION and CORE #3 MATRICES SETUP (entries intermixed in the original): TAPMV, TAPEIN, ELTPPR, TSTDMP, SYNELM, IPRELM, QCORE, TAPEIN, SSORDR, NUCOR, INSRTC, EXTRC, INBIT, IFBIT, TAPSEL, OCTDEC, INPUT, OUTPUT, ONLINE, OR, NOT, SKPFIL, IFABIT, TAPSEL, OUTPUT, INSRTC, EXTRC, OCT, OCTDEC, TAPSEL, OR, NOT, ONLINE, SKPFIL, INBIT

(Column headings: MAIN, EXTERNAL FUNCTION, INTERNAL FUNCTION, SAP)

CORE #4 DESIRED RESULT REDUCTION: EXTCHK, ELCHK, INSERT, REMOVE, LISTSC, RDRUM 1, RDRUM 2, CHECK, WRDREM, WRDRUM 1, SKPFIL, IFABIT, ANDBIT, NOT, AND, AM1SET, IFBIT, AM1CNT, RNDM1B, AM1SUB, AM1RIM, OR

CORE #5 PROLOGUE: PCODE, PRLOG, PSCAN, SSCAN, OUTPT 1, STORER, PSUB 1, PSUB 2, PNAME, SKIP, PLIST, FSUB, FSUB 1, FSUB 2, LABELS, INSRT, SSCAN 1, SKPFIL, EXTRC, RNDM1B, INSRTC, OUTPUT, PNCH, PNCH12
(CALCS IS A DUMMY INTERNAL FUNCTION)

-130 - MAIN EXTERNAL FUNCTION INTERNAL FUNCTION SAP CORE # 6 PROGRAM GENERATOR PSCAN SSCAN SSCAN 1 PSUB 1 PSUB 2 OUTPT 1 SKIP QYFQ LABELS FSUB 2 FSUB 1 INSRT QFFQ QPQ SKPFIL NOT TAPSEL EXTRC AND INSRTC SIGN IFBIT OR OCTDEC OUTPUT PNCH12 CORE # 7 EPILOGUE EPILOG PSCAN SSCAN SSCAN 1 SKIP FSUB FSUB 1 FSUB 2 LABELS INSRT QPQ SKPFIL EXTRC OUTPUT OUTPT 1 PNCH12 INSRTC OCTDEC OR QPQ AND QFQ ARE DUMMY INTERNAL FUNCTIONS CORE # 8 SELPGM DIAGNOSTIC

-131 INTERPRETATION OF BOXES
COMPUTATION
EXECUTION: REFERS TO A SUBROUTINE OR FUNCTION
WHENEVER STATEMENTS
FORM: THROUGH (STATEMENT NAME), FOR (VARIABLE) = (INITIAL VALUE), BY STEPS OF (NUMBER), UNTIL (CONDITION IS).
CONNECTORS FOR REMOTE CONNECTIONS
FOR MARKING ENDS, OR SCOPES, OF ITERATIONS.
TAPE SYMBOL FOR OPERATIONS INVOLVING TAPES.
FOR OPERATIONS INVOLVING DRUMS.
READ OR WRITE USUALLY USED TO DENOTE PRINTED OUTPUT
CARD SYMBOL FOR OPERATIONS INVOLVING HOLLERITH CARDS.

-132CORE 1 MAIN LOOP SYSTEM SIMULATOR INPUT CORE

-133 CORE 1 EXITS FROM CORE 1 ILLEGAL STATEMENTS

-134 CORE 1 DECLARATION SECTION CONT'D

-135CORE 1 CONNECTION SECTION

-136 CORE 1 SYNONYM SECTION T

-137 SYNONYM SECTION CONT'D INPUT PARAMETER AND DESIRED RESULTS SECTION

-138 CORE 1 I.P. & D.R. CONT'D FUNCTION SUBSTITUTIONS

-139 CORE 1 INTERNAL FUNCTION - CONCK. (CONSISTENCY CHECK) F T F

-140 SYSTEM SIMULATOR CORE #2 ELDES (ELEMENT DESCRIPTION PROCESSOR)

-141CORE #2 (CONT'D)

-142CORE #2 (cont'd) F

CORE #2 (CONT'D)

-144CORE #2 (cont'd)

-145 SYSTEM SIMULATOR CORE NO. 3 SETUP CORE F ERROR RTN T

-146IPRELM CORE NO. 3 INITIAL PROCESSING, INPUT PARAMETERS AND DESIRED RESULTS

-147CORE # 3 SSORDR, MATRIX ORDERING ROUTINE

-148SSORDR (CONT)

-149 TAPMV.(TH,TE) TAPE MOVER

-150CORE #3 SYNELM SYNONYM ELIMINATION

-151

-152 CORE # 2 QCORE TAPEIN, ELEMENT TAPE CHECKOUT ROUTINE

-153 CORE 3 TSTDMP, TEST DUMP ROUTINE, FOR CHECKING OUT PROCEDURES
ELTPPR, ELEMENT DESCRIPTION TAPE PRINTER

-154 D.R.R. PROGRAM CONNECTION MATRIX IS RESTORED. AT THIS POINT, ALL DESIRED RESULTS THAT CAN BE SATISFIED WITHOUT COMPUTATION HAVE BEEN REMOVED.

-155 D.R.R. PROGRAM THE ASSOCIATIVE MEMORY IS SET TO RECEIVE ENTRIES FOR STATEMENT COLLECTIONS. THE PARM VECTOR HAS BEEN SET. DRUM=2 DRMADD = O..3

-156 D.R.R. PROGRAM Page 3 P2 P.4 P.4

-157 D.R.R. PROGRAM
NOTE: WHEN IPCNT=0 THE STATEMENT COLLECTION PRODUCES USEFUL INFORMATION WITH NO EXTRA WORK.
NOTE: AT BACK 2, STATEMENT COLLECTION UTILITY HAS BEEN COMPLETELY INVESTIGATED. THE PROCESS CONTINUES UNTIL THERE ARE NO MORE STATEMENT COLLECTIONS, THEN SETEOF. TRANSFERS TO LOOP 3. AT LOOP 5, THERE ARE SOME DESIRED RESULTS FOR WHICH NO STATEMENT COLLECTION HAS BEEN ASSIGNED.

-158 D.R.R. PROGRAM NO

-159D.R.R. PROGRAM

INTERNAL FUNCTION EXTCHK. YES YES DR1(LI)=AND.(DR1(LI) NOT. (AND. (AND. (DR1 (LI),EXT1), IP1 (ELCOL(IAT, III))))) DR2(LI)=AND. (DR1 (LI),NOT. (AND. (AND. (DR1(LI),EXT1), IP1(ELCOL(IAT, II2))))) MAIN PROGRAM YES NO DR1(LI)=AND. (DR1 (LI),NOT. (AND. (AND. DR1(LI),EXT1) IP1(ELCOL (1-IAT))))) YES

-161 INTERNAL FUNCTION ELCHK.

-162INTERNAL FUNCTION REMOVE Page 1

-163INTERNAL FUNCTION REMOVE Page 2

-164 TABLE (INDEX, 35) = DRMADD THE NEXT SECTION IS USED WHENEVER THE NUMBER OF STATEMENT COLLECTIONS EXCEEDS AVAILABLE STORAGE. THE PARAMETER WITH THE GREATEST NUMBER OF COLLECTIONS DISCARDS ONE COLLECTION; MOST PROBABLY THE ONE WITH THE LEAST WEIGHT.

-165INTERNAL FUNCTION LISTSC.

-166INTERNAL FUNCTION RDRUM1.

-167 INTERNAL FUNCTION CHECK. (QQI) INTERNAL FUNCTION INSERT. INTERNAL FUNCTION RDRUM2.

INTERNAL FUNCTION WRDRUM INTERNAL FUNCTION WRDRUM1

-169 PROLOGUE - MAIN PROGRAM PROLOGUE GENERATION SECTION IS TREATED AS A SUBSECTION OF THE PROLOGUE CORE.

-170INTERNAL FUNCTION PCODE (CONSTRUCTS UNIQUE THREE CHARACTER CODES.)

-171 INTERNAL FUNCTION PRLOG ci RETURN

-172 INTERNAL FUNCTION PSCAN (QIQ) EXECUTE OUTPUT BFR(12II1, 12) EXECUTE PUNCH 12. BFR( 12II1) IIJ = IIJ + 1

-173 INTERNAL FUNCTIONS SSCAN, SSCAN1 PAGE 1 QFFQ=FSUB2 QPQ=PSUB2 PCOUNT=-1 PCOUNT=0 SW=0 BFR((J2-1)/6) = INSRTC.(BFR((J2-1)/6), J2-6*((J2-1)/6), CHAR)

-174 INTERNAL FUNCTIONS SSCAN, SSCAN1 PRWORD = OR.(PRLIST(J1), OCTDEC.(ATTNO(J))) BFR1((J4-1)/6) = INSRTC.(BFR1((J4-1)/6), J4-((J4-1)/6)*6, EXTRC.(PRWORD, J3))

-175 INTERNAL FUNCTION OUTPT1. (XXX) YES

INTERNAL FUNCTION STORER. INTERNAL FUNCTION INSRT. BFR1((J2-1)/6) = INSRTC.(BFR1((J2-1)/6), J2-((J2-1)/6)*6, EXTRC.(QWRD, JJ))

-177 INTERNAL FUNCTION PSUB1 (QQ)

-178 INTERNAL FUNCTION PSUB2. Page 1 PRBLN = 1B, ATBLN = OB, IDBLN = OB, I4 = 1 PROWRD = O, ATWORD = OC, IDWORD = D NO

-179 INTERNAL FUNCTION PSUB2

-180INTERNAL FUNCTION PSUB2

INTERNAL FUNCTION PSUB2 THROUGH PSBLP, FOR II = ELREL (1-IAT,ELREL(IAT,1)) 1,ELREL(IAT, II) / \ELREL(IAT, 1) / RCOUNT = RCOUNT + 1 ATTNO (RCOUNT) = ELCOL(1-IAT, 11

INTERNAL FUNCTION PSUB2 Page 4

-183 INTERNAL FUNCTION PSUB2 Page 5 ERRNO = 6008 EXECUTE SELPGM.(DIAGNO) BFR1((J4-1)/6) = INSRTC.(BFR1((J4-1)/6), J4-((J4-1)/6)*6, EXTRC.(PRWORD, JJ))

-184 INTERNAL FUNCTION PNAME. (QCNT)

-185INTERNAL FUNCTION - SKIP (YY)

-186 INTERNAL FUNCTION PLIST. VECTOR VALUES CARD 2: REMARK CARD MASK FOR LISTING I-O PARAMETERS

-187 INTERNAL FUNCTION - FSUB (QFQ.) Page 1

-188INTERNAL FUNCTION FSUB (Q.F.Q.) YES

-189 INTERNAL FUNCTIONS FSUB1, FSUB2

-190 INTERNAL FUNCTION - LABELS SBL = SBL + 1 LOCLBL=LOCLBL + 1

-191PROGEN SECTION NELNAM CHECK TO SEE THAT THIS COLLECTION IS UNIQUE:

-192PROGEN SECTION

-193EPILOGUE; MAIN PROGRAM AND INTERNAL FUNCTION

-194 DIAGNOSTIC CORE NO YES

STEPWISE REGRESSION FLOW DIAGRAM AND CORE LAYOUTS

-196 CORE LAYOUTS (MAIN) (SUB) STARTER STARTER 1/2 2/2 TAPE 1 RDRTRM PARAM DATAIN RDTRWT CMTRWT TERMIN READ 1 PKTRM TERMIN STUDENT TAPE 1 RDRTRM GRADER NRML PRINT 4 ENTRM PKTRM TERMIN RDRTRM TAPE 1 SUB-SUB FORMAT TRMCHK ENTRM XLOG RAM2A PICKV PICKS RAM2C RAM2B PICKE EXP(3 TRMCHK ENTRM PFNCT EXP(3 EXP(2 TAPE 1 PRINT 2 PRINT 3 SQRT DGFRDM VARSRT VARCHK MATRAN RGRTRM SEQPGM SELPGM SELPGM TRMCHK ENTRM (FUNCTION) SUB-SUB-SUB SELPGM SELPGM ZFNCT SELPGM EDITOR REGRESSION REGRESSION 1/2 2/2 SUMSQ RSDSUM PRCRCN RGRSSN WINDUP SELPGM SEQPGM SQRT PRINT 1 PREDCT TSTLVL FLVL DTAB XTAB ERROR TAPE 1

-197 CORE LAYOUTS CONT'D (MAIN) STATEMENT GENERATOR (SUB) PROCES PROLOG DIMNSN CTERM RDRTRM GENTRM SUMEQN SUB-SUB ERROR GTRM EXPNT (FUNCTION) SUB-SUB-SUB SELPGM

-198STARTER PROGRAM First Part U

-199 STARTER PROGRAM Second Part PARAM NOVAR=NOIND + 1 NO Z = NOTRMS+1

-200 DATAIN O

-201 CMTRWT RDTRWT TERMIN

-202 TRMCHK. ENTRM RDRTRM

-203-'PKTRM PICKV FUNCTION PICKE

PICKS

-205STUDENT PROGRAM <0 <0

-206 STUDENT PROGRAM (CONTINUED)

-207 NRML PRINT 4

-208 EDITOR PROGRAM J1 = NOINIT + 2 - I J2 = J1 - I INVAR(J1) = INVAR(J2)

-209 ZFNCT 0

-210 PFNCT FUNCTION FOR INTEGER POWERS, ROOTS AND RECIPROCALS.

-211 REGRESSION PROGRAM FOR THE STEPWISE FITTING OF DATA RESIDUAL SUMS OF SQUARES AND CROSS PRODUCTS LOAD ARRAY AND COMPUTE SUMS OF SQUARES AND CROSS PRODUCTS PARTIAL CORRELATION COEFFICIENTS STEPWISE REGRESSION COMPLETE JOB AND DECIDE WHAT TO DO NEXT

-212 SUMSQ ARRAY(NOVAR+1, NOVAR+1) = ARRAY(NOVAR+1, NOVAR+1) + WHT

-213 RSDSUM

-214 PRCRCN

-215 RGRSSN I.I

-216 DGFRDM

-217 COMBINED WITH VINSRT TO ALLOW INSERTION OF TERMS, OR STANDARD OPERATION, AS DESIRED. VARSRT VARIABLE I1 NOT IN EQUATION VARIABLE I1 IN EQUATION

-218VARSRT (CONT'D)

-219COMBINED WITH VINCHK TO ALLOW INSERTION OF TERMS, OR STANDARD OPERATION, AS DESIRED. VARCHK

-220 TSTLVL

-221 MATRAN 0 0

-222 RGRTRM PREDCT

-223 WINDUP

-224MAD OR FORTRAN STATEMENT GENERATING PROGRAM 0

-225 EXPNT

-226 PRINT 1 TAPE 1 FLVL
FLVL: TWO VERSIONS AVAILABLE 1) Functional Representation FLVL = F(P,DEFR) 2) Table Interpolation.

COMMENTS ON THE SYSTEM SIMULATOR FLOW DIAGRAMS

The flow diagrams for the System Simulator may be more easily followed through the combined use of the comments presented here and the references given to the main text. Where page numbers are parenthesized, the page refers to the associated explanation in the main text of the paper.

The System Simulator Core 1, diagrammed on page 132, is charged with the initial translation of the source program presented by the user. (51) The first task is the initialization of the status of the machine, including the determination of the condition of the tape units, the blanking of core and drums, and the setting of switches and counters. Beginning then at RESET1, the Simulator reads the cards into memory one at a time. As each card is read into memory, the card information is scanned character by character using the procedures indicated in the scope of LOOP. Blanks are ignored and illegal punctuation is detected. Legal punctuation, together with the action of the counter HOLCNT, is used to establish the type of statement being scanned and the action to be taken. Whenever more than 6 characters have been found without finding legal punctuation, the Simulator anticipates that a Declaration of some kind may be at hand. By using the indices K1 and K for the statement label array S, it is possible to transfer directly to the appropriate section of the program for any declaration and to the section of the declaration analysis appropriate for the punctuation found. Because of the limitations of storage, the additional analysis required for an Element Description is done by ELDES in core 2. Otherwise,

the declarations for Connections, Input Parameters, Desired Results, Synonyms, Function Substitutions and the data following them can be treated entirely within core 1. The diagrams on page 134 indicate the settings of the switches for the processing of the various declarations.

The Connection section on page 135 may be entered through S(6), S(7), S(8), or S(9), depending upon the punctuation encountered in the scanning loop for the counter index K. Within the section, the Boolean variables CTO, CEL, CAT, CAID, and so on are used to retain the structure of the statement being treated and thus to direct the processing of the statement. As might be suspected from the mnemonic symbol names, CTO is associated with the connective TO (23), CEL with the Element name, CEID with the Element Identifier, CAT with the Attachment name and CAID with the Attachment Identifier. The diagram on page 135 also indicates the treatment of the special unary and binary elements. (24) The subroutine CON CK on page 139 establishes the consistency of the Connection matrix and assures in this way that analysis of ambiguously defined systems will not be attempted.

The Synonym section on page 136 analyzes the form of the synonym encountered (42) and saves the result for later elimination of synonyms from the Connection matrix. In a similar but simpler way, the Input Parameters and Desired Results section on page 137 retains its information in core for later use, while the Function Substitution section on page 138 records its information on magnetic tape.

The consistency check subroutine CON CK on page 139 is used to insure the detection of ambiguity in the Connection matrix. If more than one connection statement has been given for a given identified attachment point, CON CK determines whether the associated attachment is such that

the connection is a duplicate of an earlier connection. If this is true, the subroutine removes the duplicate statement and compresses the matrix. Otherwise, the connection is ambiguous and the error is reported.

The Simulator Core 2, known as ELDES, is the section of program used whenever an Element Description is encountered. See page 140. After saving a section of the data on tape to provide space for the processing of the Element Description, and after initializing counters and switches, the analysis of the Element Description begins. Since several assertions are recognized within the scope of the Element Description, each card is scanned to extract the information required. A somewhat different structure for handling the punctuation is employed. This structure is somewhat more convenient for the Element Description processing because of the periodic occurrence of the M.A.D. statements, which must be passed untouched to the Library tape. Only the encounter of the Declaration DESCRIPTION FINISHED can return the program to core 1, after completing the processing at S(4) on page 143.

When the end of input data is detected by the occurrence of an End Of File mark or by the declaration NEXT SET OF DATA, the processing goes to the Setup Core 3. In the diagrams beginning on page 145, the Setup Core eliminates synonyms (SYN ELM subroutine), constructs the Boolean parameter words (51) (IPR ELM subroutine), constructs the indirect addressing lists for the connection matrix (52) (SS ORDR subroutine) and positions the tapes for further processing. The details of these subroutines may be followed by reference to the associated text as indicated. Several other small utility routines, such as the tape mover routine TAPMV,

and the check out routines TAPEIN, TST DMP (Test Dump) and EL TP PR (Element Tape Printer), are shown for completeness.

The Desired Result Reduction Program (core 4) begins on page 154. After initialization, the search for parameters for which program must be generated begins. First, all parameters that can be satisfied without introducing computation are removed. This may be done by matching the requested parameter with an Input Parameter or by taking advantage of the Broad Scope concept. (27,56) If the request remains after the execution of EXT CHK (Scope Check), the associative memory (61) is set to receive entries for every pertinent Statement Collection. Since tape movement is very time consuming, the flow diagram on page 155 computes the shortest path for the tape movement. The diagram on page 156 analyzes each collection of statements to determine its utility in yielding the desired results and checks to be certain that any special conditions are also satisfied. The diagram on page 157 selects the collection to be used in the program, inserts the collection in the program (INSERT subroutine), removes the yielded results and adjusts the connection matrix (REMOVE subroutine). The section on page 158 details the probabilistic selection mechanism. (60) Finally, the diagram on page 159 details the testing procedure for the completion of the program generation and the repetition of the process if required. The routines shown from page 160 through page 168 detail the various subroutines indicated in the main core 4.

When the program has been completed by core 4, the remaining task is the production of the object program itself. (63) The generation is accomplished in three core loads of program. The first of these

generates the Prologue section. (47) As a part of this, the unique parameter code is constructed by PCODE on page 170. (35) The routine PRLOG accomplishes the actual prologue generation using the primary scan PSCAN and the secondary scan SSCAN. Since multiple copies of some statements must be generated for input-output requirements, the use of two scanning routines to analyze an input buffer BFR and generate an output buffer BFR1 proves to be very effective in saving generation time. The routines PSCAN on page 172 and SSCAN and SSCAN1 on page 173, along with the smaller routines for output (OUTPT1), saving the Epilogue section (STORER), the word insertion routine (INSRT), the input-output statement generator (PSUB1), the general statement generator (PSUB2) (33,40), the parameter name generator (PNAME), the continuation card routine (SKIP) and the parameter dictionary output routine (PLIST), are structured to be included wherever they are required in any of the program generation cores.

The routine FSUB is charged with the determination of the contents of the C symbols (33) and produces either floating statement labels or function substitutions, depending on the double asterisk symbol. Depending on the decision, the routine FSUB1 or FSUB2 on page 189 will be used for function substitutions, or the routine LABELS on page 190 will be used for floating statement labels.

The program generation section PROGEN on page 191 makes use of the common routines mentioned above as indicated in the diagram. The generation must invert the order of the program found by the Desired Result Reduction Program. (63) The section also checks to eliminate any identical collections that may have been selected by the Desired Result

Reduction Program. Finally, upon completion of the Program Generation, the Epilogue section is called upon to provide the final cards required by the program to transfer control back to the Prologue for testing and completion of the simulation. (47)
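The two duties assigned to PROGEN in these comments, inverting the order found by the Desired Result Reduction Program and eliminating identical collections, may be illustrated by a minimal sketch in Python. This is not the MAD code of the Simulator. The function name, the collection names and the rule of keeping the copy that comes earliest in execution order are assumptions made only for illustration.

```python
def order_for_generation(selected):
    """Return collections in execution order: reversed, duplicates dropped.

    `selected` is the list of statement-collection names in the order the
    reduction found them (hypothetical names; last computation found first).
    """
    seen = set()
    ordered = []
    for name in reversed(selected):
        if name not in seen:        # keep only the first copy met after reversal
            seen.add(name)
            ordered.append(name)
    return ordered


# The reduction works backward from the desired results, so the collection
# needed last is found first; the generated program must run the other way.
found = ["EXHAUST_LOSS", "EXPANSION_LINE", "INLET_STATE", "EXPANSION_LINE"]
print(order_for_generation(found))
```

Keeping the earliest copy after reversal means that a collection required by two later computations is generated once, ahead of both of them.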

COMMENTS ON THE STEPWISE REGRESSION FLOW DIAGRAMS

The flow diagrams for the Stepwise Regression may be more easily followed through the combined use of the comments presented here and the references given to the main text. Where page numbers are parenthesized, the page refers to the associated explanation in the main text of the paper.

The Starter program shown on page 198 is charged with the responsibility of entering and storing all of the control parameters, data and accumulated learning for each problem. The initial section may also save the accumulated learning and terms from an earlier problem, depending upon the test of NOEXIT. A non-zero NOEXIT corresponds to an unsuccessful execution of the previous problem, so it is then necessary to retain these arrays for later restarting of the problem. The subroutine PARAM brings all of the control parameters into storage. DATA IN reads in and saves on magnetic tape all of the raw data supplied for the problem. Next, depending upon the parameter IFTRWT, the accumulated learning is either read in from an earlier trial through RDTRWT or initialized to equal probability by CMTRWT. The various possible types of analysis (101) are initialized by the program following the test of the parameter IFCNST. The parameter NOINT controls the input of suggested terms from the supplied data deck. Finally, READ 1 calls in and saves the desired name for the subroutine to be generated upon completion of a successful analysis. If enough terms have been given in the data, the program transfers control directly to the EDITOR core. Otherwise, the second section of the starter program is selected to generate enough terms to fill the allowed regression matrix.

The second section of the Starter program conducts a selection of terms to fill the regression matrix by using the accumulated learning as it has been initialized by the first section. The routine PK TRM selects each new term by generating three term selection parameters (122). TERM IN then inserts the term in the set of trial terms, or returns for additional attempts by PK TRM if the term selected should happen to be identical to any previously selected and entered term. The routines PARAM, DATA IN, CMTRWT, RDTRWT and TERM IN, shown on pages 199 through 201, execute their tasks in the straightforward manner shown. TERM IN is a "skeleton" routine (a routine that consists of a sequence of calls upon other routines), calling first upon TRM CHK to verify the uniqueness of the term under consideration and then calling upon ENTRM for the entry of the term if it is unique. TRM CHK on page 202 checks the uniqueness of the term by searching the list of previously entered terms as it has been built up on the magnetic drum.

The selection mechanism for the program-selected terms is contained in the diagrams for PK TRM, PICKV, PICKE and PICKS. The skeleton for the selection process is the routine PK TRM, which must choose the interaction order using PICKV, the variables to be used in the interaction using PICKS, and the function of the chosen variable using PICKE within PICKS. The probabilistic selection mechanism (60,61) is the basis of all three of the routines. Since PICKS must select the number of variables specified by PICKV, and since this total could include most or all of the allowed variables, the mechanism is modified in this case to reduce the set of possible variables after each selection. In this way every trial will produce a unique new variable.
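The selection process described above may be sketched in Python as follows. The sketch assumes that the accumulated learning is held as three lists of positive weights (for interaction order, variable and function) and that a term is represented as a tuple of (variable, function) pairs. These representations and all of the function names are illustrative assumptions; the diagrams on pages 202 through 204 remain the authority for the actual routines.

```python
import random


def pick_index(weights, rng):
    """Choose an index with probability proportional to its weight."""
    r = rng.random() * sum(weights)
    running = 0.0
    for i, w in enumerate(weights):
        running += w
        if r < running:
            return i
    return len(weights) - 1          # guard against floating-point round-off


def pick_variables(var_weights, count, rng):
    """PICKS role: choose `count` distinct variables, shrinking the
    candidate set after each selection so that no variable repeats."""
    candidates = list(range(len(var_weights)))
    chosen = []
    for _ in range(count):
        w = [var_weights[c] for c in candidates]
        chosen.append(candidates.pop(pick_index(w, rng)))
    return chosen


def pick_term(order_weights, var_weights, func_weights, rng):
    """PK TRM role: interaction order, then variables, then a function each.
    Assumes the largest order does not exceed the number of variables."""
    order = pick_index(order_weights, rng) + 1               # PICKV role
    variables = pick_variables(var_weights, order, rng)      # PICKS role
    return tuple(sorted((v, pick_index(func_weights, rng))   # PICKE role
                        for v in variables))


def fill_trial_terms(n_terms, order_weights, var_weights, func_weights, rng):
    """TERM IN role: enter each new term only if it is not already present.
    Assumes n_terms does not exceed the number of distinct possible terms."""
    terms = []
    while len(terms) < n_terms:
        term = pick_term(order_weights, var_weights, func_weights, rng)
        if term not in terms:        # TRM CHK role: uniqueness test
            terms.append(term)       # ENTRM role: entry of the term
    return terms
```

With one variable, twenty functions of equal weight and four terms per trial, `fill_trial_terms(4, [1.0], [1.0], [1.0] * 20, random.Random(1))` returns four distinct single-variable terms.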

The Student program (so termed because the learning mechanism is located in this section) on pages 205 and 206 carries out the simple learning process. After arranging to carry forward all successful terms from the previous trial, the student program grades the previously used selection mechanism using the routine GRADER. The one exception to the grading occurs when the previous trial failed due to an impending overflow or underflow of the floating point data. In this case, only the faulty term is graded. When the grading is completed, the selection mechanism is normalized so that the mean probability is unity. Finally, the terms for the next trial are chosen in the same manner as was done in the second section of the starter program. If the selection of the student program followed the successful completion of an analysis, the student program also retains the accumulated learning for future use.

The routine GRADER on page 206 utilizes the "half-life" concept in rewarding or penalizing the selection mechanism. (105,122) This method preserves the positive probability of a term even after repeated failure. This, in turn, insures that every possible term receives some consideration. The routines NRML and PRINT 4 are shown on page 207. These routines carry out the normalization of the accumulated learning matrix and the printing of the status of the matrix.

The EDITOR program on page 208 processes the raw observation data to form the terms chosen by the student program or given by the user. If the user should wish to supply a function that is not one of the set allowed by the "standard" ZFNCT and PFNCT routines, the following modifications should be made:

1) ZFNCT controls the interaction of the functions of the variables. If anything other than a cross-product interaction is desired, the change must be incorporated in the B loop.

2) Ordinarily, the only changes desired will be made in PFNCT. This is an interpretive routine that selects the desired function for the variable by interpreting the function number given in the term matrix. If special terms were desired for some value of J, the test for this value could be inserted after the entry and the appropriate action taken.

The user should also take care to cause the proper printing for the special function to occur, for use in interpreting the results of the regression analysis. An appropriate change should also be made in the subroutine generation program on page 224. This change would be made in the general term generator GEN TRM.

The stepwise regression program is structured as shown on page 211. Each of the major sections is written as a subroutine to allow for increased flexibility in keeping the analysis abreast of the best current methods. SUMSQ on page 212 loads the regression matrix with the processed data from the Editor program. In the loading process, the sums of the squares and the cross-products are accumulated. Printing of the result of the loading is available under the control of the parameter IFRAW. If the type of analysis selected by the parameter IFCNST requires the adjustment of the sums of squares and

cross-products about the means, the residual sum routine RSDSUM on page 213 carries out the computation. The product moment coefficient of correlation for the terms with each other and with the dependent variable is calculated by the partial correlation routine PRCRCN. This routine also makes available the standard deviations for each of the terms and the dependent variable, if desired.

The regression analysis itself displays the interesting structure shown on page 215. It should be noted that only the degree of freedom routine DGFRDM and the matrix transformation routine MATRAN have a single logical connection to the switch N. The switch N directs the analysis through successive steps of sorting the terms in VARSRT (78), checking the variance contribution in VARCHK and transforming the matrix. After DGFRDM, VARSRT or VARCHK, the analysis may be terminated through the routine RGRTRM, depending upon the results.

The degree of freedom routine DGFRDM on page 216 insures that the number of degrees of freedom is continually revised as the analysis progresses. Whenever the variance of the dependent variable is non-positive due to roundoff error or machine error, or whenever there are no more degrees of freedom remaining, the analysis is terminated.

VARSRT on pages 217 and 218 sorts the variables into the sets Xi 1 and Xi 2 (78). Depending upon the result of the sorting, the selected term will be checked by the F level test and the analysis will proceed through VARCHK, or the analysis will terminate if no more terms are available.
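The loading and adjustment steps attributed above to SUMSQ, RSDSUM and PRCRCN may be sketched in Python as follows. The sketch uses the usual textbook definitions of weighted sums of squares and cross-products, their adjustment about the means, and the product moment correlation. It is offered under the assumption that the routines follow these definitions; the function names and the list-of-lists storage are illustrative and do not reproduce the program's ARRAY layout.

```python
import math


def weighted_sums(rows, weights):
    """SUMSQ role: accumulate the sum of weights, the weighted sums and the
    weighted sums of squares and cross-products over all observations."""
    n = len(rows[0])
    sw = 0.0
    s = [0.0] * n
    sp = [[0.0] * n for _ in range(n)]
    for row, w in zip(rows, weights):
        sw += w
        for i in range(n):
            s[i] += w * row[i]
            for j in range(n):
                sp[i][j] += w * row[i] * row[j]
    return sw, s, sp


def about_means(sw, s, sp):
    """RSDSUM role: adjust the sums of squares and cross-products
    so that they are taken about the means."""
    n = len(s)
    return [[sp[i][j] - s[i] * s[j] / sw for j in range(n)] for i in range(n)]


def correlations(c):
    """PRCRCN role: product moment correlation of every pair of columns.
    Assumes every column has a positive adjusted sum of squares."""
    n = len(c)
    return [[c[i][j] / math.sqrt(c[i][i] * c[j][j]) for j in range(n)]
            for i in range(n)]


# Each row holds the terms followed by the dependent variable.
rows = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
sw, s, sp = weighted_sums(rows, [1.0, 1.0, 1.0])
print(correlations(about_means(sw, s, sp)))
```

The diagonal of the adjusted matrix divided by the sum of weights (or by that sum less one) gives the variances from which the standard deviations mentioned above would follow.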

The routine VARCHK checks the F level of the selected term to insure that the risk of committing an insertion or deletion error is not exceeded (81-83). If the requirements are satisfied, the regression matrix is transformed by MATRAN (31, 120) about the pivot element $A_{kk}$ using the relations:

$$A_{ij} = A_{ij} - A_{ik} A_{kj} / A_{kk}, \qquad i = 1, \ldots, n; \quad j = 1, \ldots, n; \quad i \neq k; \quad j \neq k$$

$$A_{ik} = -A_{ik} / A_{kk}, \qquad i \neq k$$

$$A_{ki} = A_{ki} / A_{kk}, \qquad i \neq k$$

$$A_{kk} = 1 / A_{kk}$$

where n = number of terms.

The regression analysis is terminated by RGRTRM. The result of the final step is printed and the predictions of the data using the regression equation are printed, on command, by PREDCT.

Upon completion of the regression analysis, WINDUP checks the postulated criteria given by the user against the properties of the generated relation. If further analysis is indicated and allowed by the number of trials, the program returns to the student program. Otherwise, the subroutine for the equation generated by the last trial is produced by the statement generating program on page 224, after which the program returns to the starter program.
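The matrix transformation carried out by MATRAN may be sketched in Python as follows. The sketch assumes the customary in-place pivot transformation of stepwise regression: elements off the pivot row and column are reduced by the product of their pivot-column and pivot-row elements divided by the pivot, the pivot row is divided by the pivot, the pivot column is divided by the pivot and negated, and the pivot is replaced by its reciprocal. The negative sign on the pivot column is part of that assumption. The function name and the list-of-lists storage are illustrative only.

```python
def matran(a, k):
    """Transform the square matrix `a` in place about the pivot a[k][k]."""
    n = len(a)
    pivot = a[k][k]
    # Elements off the pivot row and column use the old row and column values,
    # so they are updated first.
    for i in range(n):
        if i == k:
            continue
        for j in range(n):
            if j == k:
                continue
            a[i][j] -= a[i][k] * a[k][j] / pivot
    for j in range(n):
        if j != k:
            a[k][j] = a[k][j] / pivot        # pivot row
    for i in range(n):
        if i != k:
            a[i][k] = -a[i][k] / pivot       # pivot column (sign assumed)
    a[k][k] = 1.0 / pivot
    return a


a = [[4.0, 2.0], [2.0, 3.0]]
matran(a, 0)
matran(a, 1)
print(a)   # after pivoting on every diagonal element, `a` holds the inverse
```

Under this convention a second transformation about the same pivot restores the matrix, so one routine can serve both the insertion and the deletion of a term.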

ILLUSTRATIVE EXAMPLE

The Stepwise Regression Program with Simple Learning may be best appreciated by presenting the program with a set of data for which a predicting equation is desired. As a substitute for that experience, the following example is presented. This problem was one of many presented to the program during its development for the purpose of verifying the validity of the procedure. Since data arising from experimental sources always contain some random error components, the example was constructed using a normally distributed random number subroutine to add to a defined function the effect of random error. The function used in this illustration was the following:

Y = 4.0 * X**2 - 16.0 * X + 15.0 + (EPSILON)

where EPSILON is a random normally distributed error with a mean value of zero and a standard deviation of 0.25. In order to show the action of the Simple Learning mechanism, the control parameters are so chosen as to allow 20 of the "standard" functions of X but only allow the regression analysis to have access to 4 of these functions at any one time. Thus, on a fairly simple scale, the behavior of the Simple Learning mechanism may be observed. A complete discussion of the data deck and control cards may be found in the Communication of the Problem to the Program beginning on page 98. Obviously, the capacity of the Stepwise Regression Program would allow the solution of this problem in a single trial if so desired. Random restarting of the problem should be expected to produce variations in the sequence and number of trials from those given here, but the final result will be the same.
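The construction of the example data may be sketched in Python as follows. The relation and the standard deviation of 0.25 are those stated above. The number of observations (48, as in the data deck), the uniform draw of X between -10 and 10, the seed and the function name are assumptions made for illustration, since the text does not state how the X values were chosen.

```python
import random


def make_observations(n=48, seed=0, sigma=0.25, lo=-10.0, hi=10.0):
    """Return n (X, Y) pairs with Y = 4 X**2 - 16 X + 15 + EPSILON.

    EPSILON is normal with mean zero and standard deviation `sigma`.
    The uniform range for X is an assumption, not taken from the text.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(lo, hi)
        eps = rng.gauss(0.0, sigma)
        data.append((x, 4.0 * x ** 2 - 16.0 * x + 15.0 + eps))
    return data


for x, y in make_observations()[:3]:
    print(f"{x:15.8E} {y:15.8E}")
```

Because the draws are random, the values produced will not reproduce the cards of the data deck.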

The data was supplied without any accumulated learning deck or suggested terms. On the first trial, the terms X**(-6), X**2, X**5 and X**(-4) were chosen randomly from among the terms allowed by the 20 functions specified on the control card. Of these, X**2 and X**5 were sufficiently correlated with the data to be included in a predicting equation. The postulated standard error and coefficient of determination criteria were not satisfied, however, so the learning mechanism was called into the computation to assist the selection of new terms to be tested. On the second trial, neither of the new terms X**4 and X**(1/4) was a better choice than the terms X**2 and X**5 found previously. The third trial suggested X**(1/5) and X**3 as new candidates and retained X**3 in addition to the earlier X**2 and X**5. On the fourth attempt, only one new term was suggested for trial, X. Since X**2 and X were together far superior to any other combination of the trial terms for this attempt, the resulting equation:

Y = 3.99999857 * X**2 - 15.999832 * X + 15.0000304

satisfied both criteria and the analysis was terminated for the problem.

This example, while too simple to be of much practical value, is a fairly reasonable illustration of the technique. The value is even more apparent when the technique is applied to more complicated situations. The execution time for this problem was approximately five minutes, with about half of that time going to the loading of the magnetic program tape on the tape drive and initiating the problem. More representative figures for more complicated problems run on the order of fifteen to twenty minutes, with the actual time depending heavily upon the number of data sets.
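The learning step called between the trials of this example, the grading by GRADER followed by the normalization by NRML, may be sketched in Python as follows. The exact grading rule is given in the main text (105, 122) and is not reproduced here. The sketch assumes one simple reading of the "half-life" concept, in which a weight is halved on failure and doubled on success, so that it never reaches zero. The factor of 2 and the function names are assumptions.

```python
def grade(weights, index, success, factor=2.0):
    """Reward or penalize one selection weight by a fixed factor.

    Multiplying or dividing (rather than adding or subtracting) keeps the
    weight positive however often the term fails. The factor is assumed.
    """
    graded = list(weights)
    if success:
        graded[index] = graded[index] * factor
    else:
        graded[index] = graded[index] / factor
    return graded


def normalize(weights):
    """Scale the weights so that their mean is unity."""
    mean = sum(weights) / len(weights)
    return [w / mean for w in weights]


weights = [1.0, 1.0, 1.0, 1.0]           # equal probability at the start
weights = grade(weights, 1, True)        # term 1 entered the equation
weights = grade(weights, 3, False)       # term 3 was rejected
print(normalize(weights))
```

After normalization the rewarded term is selected more often and the penalized term less often, while every term retains some chance of selection.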

* DATA
1 EXAMPLE PROBLEM FOR RELATION Y =4.*X**2 -16.X +15.+(EPSILON)
1.0001.05.05 1 20 4 1.099.5 10 1i1111l 13 3 0.3E01 0.3E01 0.15E01 1 3 4 5 3 122 4 1 23295 8185 13 69
FORMAT(F5.0,2F15.8)
   1.-0.77635685E 01 0.38030785E 03 DATA0001
   2. 0.72477829E 01 0.10915799E 03 DATA0002
   3. 0.18261584E 01-0.87888591E 00 DATA0003
   4. 0.68130090E 01 0.91661219E 02 DATA0004
   5.-0.31857?31E 01 0.10656648E 03 DATA0005
   6. 0.13964993E 01 0.45702802E-00 DATA0006
   7. 0.29685392E 01 0.27526532E 01 DATA0007
   8.-0.93314382E 01 0.51260412E 03 DATA0008
   9.-0.24912679E 01 0.79685632E 02 DATA0009
  10. 0.32323989E 01 0.50756451E 01 DATA0010
  11.-0.?1791103E 01 0.10629232E 03 DATA0011
  12.-0.86843699E 01 0.45562153E 03 DATA0012
  13.-0.80777898E 01 0.40524609E 03 DATA0013
  14.-0.65720271E 01 0.29291764E 03 DATA0014
  15. 0.86774063E 01 0.17735252E 03 DATA0015
  16.-0.82241484E 01 0.41713149E 03 DATA0016
  17.-0.37972297E 01 0.13343099E 03 DATA0017
  18. 0.61368134E 01 0.67453766E 02 DATA0018
  19. 0.70016510E 01 0.99067087E 02 DATA0019
  20. 0.85431173E 01 0.17?25098E 03 DATA0020
  21. 0.30293?79E 01 0.32386171E 01 DATA0021
  22.-0.71202164E 01 0.33171670E 03 DATA0022
  23.-0.50384158E 01 0.19715650E 03 DATA0023
  24. 0.47421?81E 01 0.29077920E 02 DATA0024
  25. 0.43137113E 01 0.20414905E 02 DATA0025
  26.-0.10008156E 01 0.35019451E 02 DATA0026
  27.-0.83967?83E 01 0.43136642E 03 DATA0027
  28. 0.98397?10E 01 0.24484918E 03 DATA0028
  29.-0.46616156E 01 0.17650787E 03 DATA0029
  30.-0.99740?75E 01 0.57251016E 03 DATA0030
  31. 0.81320423E 01 0.14940762E 03 DATA0031
  32.-0.29972?48E 01 0.98891438E 02 DATA0032
  33. 0.45478?62E 01 0.24966075E 02 DATA0033
  34. 0.28013?60E 01 0.15693002E 01 DATA0034
  35.-0.77303?18E 01 0.3771855E 03 DATA0035
  36. 0.66956154E 01 0.871?8816E 02 DATA0036
  37. 0.71400?23E 00 0.56152710E 01 DATA0037
  38.-0.63002?88E 01 0.27457628E 03 DATA0038
  39. 0.11567351E 01 0.18445273E 01 DATA0039
  40. 0.26133195E 01 0.50497638E 00 DATA0040
  41.-0.33071?46E 01 0.11166356E 03 DATA0041
  42.-0.62067?32E 01 0.26840303E 03 DATA0042
  43.-0.92314617E 01 0.50358116E 03 DATA0043
  44. 0.35399178E 01 0.84858461E 01 DATA0044
  45.-0.35085416E 01 0.12037566E 03 DATA0045
  46.-0.16?50303E 01 0.53612783E 02 DATA0046
  47.-0.26867548E 01 0.86862338E 02 DATA0047
  48.-0.77692993E 01 0.38077562E 03 DATA0048
EXAMPL

EXAMPLE PROBLEM FOR RELATION Y =4.*X**2 -16.X +15.+(EPSILON)
STARTER PROGRAM     PROBLEM NO. 1
RAW DATA

[The printout lists each of the 48 observations, with WEIGHT = 1.00000, in the form shown by the two entries below; the other entries are illegible in the source.]

OBSERVATION NO. 14.   WEIGHT = 1.00000
     X( 1) = -0.6572027E 01     X( 2) =  0.2929176E 03
OBSERVATION NO. 18.   WEIGHT = 1.00000
     X( 1) =  0.6136813E 01     X( 2) =  0.6745377E 02

EDITOR PROGRAM     PROBLEM NO. 1     SOLUTION PASS NO. 1
NO. OF INDEPENDENT VARIABLES = 1
NO. OF TRIAL TERMS = 4
TRIAL TERM DEFINITIONS FOR PASS NO. 1
TERM( 1) = INTERACTION OF ORDER 1, WHERE THE COMPONENTS ARE DEFINED TO BE --
     COMPONENT( 1) = X( 1).P. -6
TERM( 2) = INTERACTION OF ORDER 1, WHERE THE COMPONENTS ARE DEFINED TO BE --
     COMPONENT( 1) = X( 1).P. 2
TERM( 3) = INTERACTION OF ORDER 1, WHERE THE COMPONENTS ARE DEFINED TO BE --
     COMPONENT( 1) = X( 1).P. 5
TERM( 4) = INTERACTION OF ORDER 1, WHERE THE COMPONENTS ARE DEFINED TO BE --
     COMPONENT( 1) = X( 1).P. -4
TERM( 5) = X( 2), DEPENDENT VARIABLE.

STEPWISE REGRESSION     PROBLEM NO. 1
NO. OF DATA SETS = 48
NO. OF TERM CHOICES = 4
PROBABILITY OF 1) ERROR IN ENTERING TERM = 5.0000 0/0
               2) ERROR IN DELETING TERM = 5.0000 0/0
WEIGHTED DEGREES OF FREEDOM = 48.00
STANDARD ERROR OF Y = 0.166283123E 03

STEP NO. 2     TERM ENTERED ?     F LEVEL = 0.647523925E-01
STANDARD ERROR OF Y    = 0.5517846?7E 02
COEFF OF DETERMINATION = 0.894571610E 00
MULTIPLE CORLTN COEFF  = 0.945817955E 00
CONSTANT TERM          = 0.168174595E 02
TERM NO.     COEFFICIENT          STD ERR OF COEFF
TERM- 2       0.399137430E 01      0.278128640E-00
TERM- 3      -0.258267805E-02      0.273117624E-03
REGRESSION TERMINATED AFTER 2 STEPS.

[The diagonal elements and the table of predicted results versus data points for observations 1 through 48 (columns OBS. NO., Y - SIGMA, PREDICTIONS Y, Y + SIGMA, DATA POINTS, DEVIATION (DATA - PRED.), PERCENT) are not recoverable from the source. A ? marks an illegible digit.]

MAXIMUM ABSOLUTE DEVIATION = 0.96316??E 02, (SEE OBS. NO. 30., LINE NO. 30)
MAXIMUM ABSOLUTE PERCENT DEVIATION = 8566.050, (SEE OBS. NO. 40., LINE NO. 40)
POSTULATED CRITERIA
     STANDARD ERROR OF Y    = 0.5000000E 00
     COEFF OF DETERMINATION = 0.9990000E 00
FITTED CURVE PROPERTIES
     STANDARD ERROR OF Y    = 0.5517847E 02
     COEFF OF DETERMINATION = 0.8945716E 00
FITTED CURVE MEETS NEITHER CRITERIA.
PASS NUMBER 2 BEGUN FOR PROBLEM NO. 1     10 TOTAL PASSES ALLOWED.

EDITOR PROGRAM     PROBLEM NO. 1     SOLUTION PASS NO. 2
NO. OF INDEPENDENT VARIABLES = 1
NO. OF TRIAL TERMS = 4
TRIAL TERM DEFINITIONS FOR PASS NO. 2
TERM( 1) = INTERACTION OF ORDER 1, WHERE THE COMPONENTS ARE DEFINED TO BE --
     COMPONENT( 1) = X( 1).P. 2
TERM( 2) = INTERACTION OF ORDER 1, WHERE THE COMPONENTS ARE DEFINED TO BE --
     COMPONENT( 1) = X( 1).P. 5
TERM( 3) = INTERACTION OF ORDER 1, WHERE THE COMPONENTS ARE DEFINED TO BE --
     COMPONENT( 1) = X( 1).P. 4
TERM( 4) = INTERACTION OF ORDER 1, WHERE THE COMPONENTS ARE DEFINED TO BE --
     COMPONENT( 1) = X( 1).P. 1/4
TERM( 5) = X( 2), DEPENDENT VARIABLE.

STEPWISE REGRESSION     PROBLEM NO. 1
NO. OF DATA SETS = 48
PROBABILITY OF 1) ERROR IN ENTERING TERM = 5.0000 0/0
               2) ERROR IN DELETING TERM = 5.0000 0/0
WEIGHTED DEGREES OF FREEDOM = 48.00
STANDARD ERROR OF Y = 0.166283123E 03

STEP NO. 2     TERM ENTERED 2     F LEVEL = [illegible]
STANDARD ERROR OF Y    = 0.5517846?7E 02
COEFF OF DETERMINATION = 0.894571610E 00
MULTIPLE CORLTN COEFF  = 0.945817955E 00
CONSTANT TERM          = 0.168174595E 02
TERM NO.     COEFFICIENT          STD ERR OF COEFF
TERM- 1       0.399137430E 01      0.278128640E-00
TERM- 2      -0.258267805E-02      0.273117624E-03
REGRESSION TERMINATED AFTER 2 STEPS.

[The diagonal elements and the first fifteen rows of the table of predicted results versus data points (columns OBS. NO., Y - SIGMA, PREDICTIONS Y, Y + SIGMA, DATA POINTS, DEVIATION (DATA - PRED.), PERCENT) are only partly legible in the source. A ? marks an illegible digit.]

[Pages of computer printout that are illegible in the source.]

STEPWISE REGRESSION     PROBLEM NO. 1
NO. OF TERM CHOICES = 4
PROBABILITY OF 1) ERROR IN ENTERING TERM = 5.0000 0/0
               2) ERROR IN DELETING TERM = 5.0000 0/0
WEIGHTED DEGREES OF FREEDOM = 48.00
STANDARD ERROR OF Y = 0.166283123E 03

STEP NO. 3     TERM ENTERED 2     F LEVEL = 0.133637249E-00
STANDARD ERROR OF Y    = 0.244193383E 02
COEFF OF DETERMINATION = 0.9798104??E 00
MULTIPLE CORLTN COEFF  = 0.989853755E 00
CONSTANT TERM          = 0.146384627E 02
TERM NO.     COEFFICIENT          STD ERR OF COEFF
TERM- 1       0.398712836E 01      0.123308040E-00
TERM- 2       0.417513050E-02      0.510339618E-03
TERM- 4      -0.546617232E 00      0.401052549E-01
REGRESSION TERMINATED AFTER 3 STEPS.

PREDICTED RESULTS VERSUS DATA POINTS

[The diagonal elements and the table entries for observations 1 through 14 are illegible in the source. A ? marks an illegible digit.]

[Predicted results versus data points, pass 3, continued. A ? marks a digit that is illegible in the source.]

OBS. NO.    Y - SIGMA    PREDICTIONS Y      Y + SIGMA     DATA POINTS      DEVIATION       PERCENT
                                                                         (DATA - PRED.)
  15.    0.13869653E 03  0.16311587E 03  0.18753521E 03  0.17735252E 03   0.14236648E 02      8.027
  16.    0.40687139E 03  0.43129072E 03  0.45571006E 03  0.41713148E 03  -0.14159237E 02     -3.394
  17.    0.74341639E 02  0.98760977E 02  0.12318031E 03  0.13343099E 03   0.34670010E 02     25.983
  18.    0.50384638E 02  0.74803977E 02  0.99223315E 02  0.67453766E 02  -0.73502111E 01    -10.897
  19.    0.68312383E 02  0.92731722E 02  0.11715106E 03  0.99067086E 02   0.63353643E 01      6.395
  20.    0.13039259E 03  0.15481193E 03  0.17923126E 03  0.17025097E 03   0.15439051E 02      9.068
  21.    0.12677906E 02  0.37097245E 02  0.61516584E 02  0.32386170E 01  -0.33858629E 02  -1045.466
  22.    0.31327043E 03  0.33768976E 03  0.36210910E 03  0.33171669E 03  -0.59730682E 01     -1.801
  23.    0.14779282E 03  0.17221216E 03  0.19663150E 03  0.19715650E 03   0.24944334E 02     12.652
  24.    0.31602037E 02  0.56021376E 02  0.80440714E 02  0.29077920E 02  -0.26943456E 02    -92.660
  25.    0.26772121E 02  0.51191460E 02  0.75610799E 02  0.20414905E 02  -0.30776555E 02   -150.755
  26.   -0.52434778E 01  0.19175861E 02  0.43595200E 02  0.35019451E 02   0.15843589E 02     45.242
  27.    0.42066642E 03  0.44508576E 03  0.46950509E 03  0.43136641E 03  -0.13719345E 02     -3.180
  28.    0.24061318E 03  0.26503251E 03  0.28945185E 03  0.24484918E 03  -0.20183337E 02     -8.243
  29.    0.12304358E 03  0.1474629?E 03  0.1718822?E 03  0.17650787E 03   0.29044943E 02     16.455
  30.    0.5171149?E 03  0.54153427E 03  0.56595360E 03  0.57251015E 03   0.3097588?E 02      5.411
  31.    0.10841110E 03  0.13283043E 03  0.15724977E 03  0.14940762E 03   0.16577183E 02     11.095
  32.    0.39747377E 02  0.64166716E 02  0.88586055E 02  0.98891437E 02   0.34724721E 02     35.114
  33.    0.29390577E 02  0.53809916E 02  0.78229254E 02  0.24966075E 02  -0.28843841E 02   -115.532
  34.    0.10212438E 02  0.34631777E 02  0.59051115E 02  0.15693001E 01  -0.33062476E 02  -2106.829
  35.    0.36573885E 03  0.39015819E 03  0.41457752E 03  0.37771855E 03  -0.12439640E 02     -3.293
  36.    0.61074256E 02  0.85493595E 02  0.10991293E 03  0.87198815E 02   0.17052202E 01      1.95?
  37.   -0.79464436E 01  0.16472895E 02  0.40892234E 02  0.56152709E 01  -0.10857624E 02   -193.35?
  38.    0.24373405E 03  0.26815339E 03  0.29257273E 03  0.27457628E 03   0.6422885?E 01      2.339
  39.   -0.52833369E 01  0.19136001E 02  0.43555340E 02  0.18445273E 01  -0.17291474E 02   -937.447
  40.    0.82021186E 01  0.32621457E 02  0.57040796E 02  0.50497638E 00  -0.3211648?E 02  -6359.997
  41.    0.51947930E 02  0.76367269E 02  0.1007866?E 03  0.11166356E 03   0.3529629?E 02     31.6??
  42.    0.23605778E 03  0.26047712E 03  0.28489646E 03  0.26840033E 03   0.7923213?E 01      2.952
  43.    0.48011517E 03  0.50453451E 03  0.52895385E 03  0.50358115E 03  -0.95336151E 00     -0.189
  44.    0.18255420E 02  0.42674759E 02  0.67094098E 02  0.84858460E 01  -0.34188914E 02   -402.893
  45.    0.60688604E 02  0.85107942E 02  0.10952727E 03  0.12037566E 03   0.35267717E 02     29.298
  46.    0.42782802E 01  0.28697619E 02  0.53116958E 02  0.53612783E 02   0.24915163E 02     46.472
  47.    0.29017781E 02  0.53437120E 02  0.77856458E 02  0.86862337E 02   0.33425217E 02     38.48?
  48.    0.36904796E 03  0.39346730E 03  0.41788664E 03  0.38075562E 03  -0.12711681E 02     -3.33?

MAXIMUM ABSOLUTE DEVIATION = 0.3529629E 02, (SEE OBS. NO. 41., LINE NO. 41)
MAXIMUM ABSOLUTE PERCENT DEVIATION = 6359.997, (SEE OBS. NO. 40., LINE NO. 40)
POSTULATED CRITERIA
     STANDARD ERROR OF Y    = 0.5000000E 00
     COEFF OF DETERMINATION = 0.9990000E 00
FITTED CURVE PROPERTIES
     STANDARD ERROR OF Y    = 0.2441934E 02
     COEFF OF DETERMINATION = 0.9798105E 00
FITTED CURVE MEETS NEITHER CRITERIA.
PASS NUMBER 4 BEGUN FOR PROBLEM NO. 1     10 TOTAL PASSES ALLOWED.

EDITOR PROGRAM     PROBLEM NO. 1     SOLUTION PASS NO. 4
NO. OF INDEPENDENT VARIABLES = 1
NO. OF TRIAL TERMS = 4
TRIAL TERM DEFINITIONS FOR PASS NO. 4
TERM( 1) = INTERACTION OF ORDER 1, WHERE THE COMPONENTS ARE DEFINED TO BE --
     COMPONENT( 1) = X( 1).P. 2
TERM( 2) = INTERACTION OF ORDER 1, WHERE THE COMPONENTS ARE DEFINED TO BE --
     COMPONENT( 1) = X( 1).P. 5
TERM( 3) = INTERACTION OF ORDER 1, WHERE THE COMPONENTS ARE DEFINED TO BE --
     COMPONENT( 1) = X( 1).P. 3
TERM( 4) = INTERACTION OF ORDER 1, WHERE THE COMPONENTS ARE DEFINED TO BE --
     COMPONENT( 1) = X( 1).P. 1
TERM( 5) = X( 2), DEPENDENT VARIABLE.

STEPWISE REGRESSION     PROBLEM NO. 1
NO. OF DATA SETS = 48
NO. OF TERM CHOICES = 4
PROBABILITY OF 1) ERROR IN ENTERING TERM = 5.0000 0/0
               2) ERROR IN DELETING TERM = 5.0000 0/0
WEIGHTED DEGREES OF FREEDOM = 48.00
STANDARD ERROR OF Y = 0.166283123E 03

STEP NO. 2     TERM ENTERED 4     F LEVEL = 0.1154?4945E-02
STANDARD ERROR OF Y    = 0.44005?2?E-01
COEFF OF DETERMINATION = [illegible]
MULTIPLE CORLTN COEFF  = 0.999999963E 00
CONSTANT TERM          = 0.150000304E 02
TERM NO.     COEFFICIENT          STD ERR OF COEFF
TERM- 1       0.399999857E 01      [illegible]
TERM- 4      -0.159998320E 02      0.11005727?E-02
REGRESSION TERMINATED AFTER 2 STEPS.

PREDICTED RESULTS VERSUS DATA POINTS

[The diagonal elements and the table entries for observations 1 through 14 (columns OBS. NO., Y - SIGMA, PREDICTIONS Y, Y + SIGMA, DATA POINTS, DEVIATION (DATA - PRED.), PERCENT) are only partly legible in the source; the legible deviations are of the order of 0.0001. A ? marks an illegible digit.]

[Pages of computer printout that are illegible in the source.]

MAD EXTERNAL FUNCTION STATEMENTS FOR PREDICTING EQUATION PRODUCED BY LAST REGRESSION STEP

* COMPILE MAD
* PUNCH OBJECT
EXTERNAL FUNCTION (X)
ENTRY TO EXAMPL.
T([illegible]) = [illegible]E 01
T(2) = T(2) * X(1) [remainder illegible]
T(3) = -0.159[illegible]832E 02
T(3) = T(3) * X(1)
T(0) = 0.
THROUGH SUM, FOR I = 1, 1, I .G. 3
SUM T(0) = T(0) + T(I)
FUNCTION RETURN T(0)
END OF FUNCTION

FORTRAN II SUBROUTINE STATEMENTS FOR PREDICTING EQUATION PRODUCED BY LAST REGRESSION STEP

* COMPILE FORTRAN, PRINT SAP
* PUNCH OBJECT
FUNCTION EXAMPL(X)
DIMENSION T(3) [remainder illegible]
T(1) = 0.1[illegible]E 02
T(2) = [illegible]
T(2) = T(2) * X(1) [remainder illegible]
T(3) = -0.15[illegible]E 02
T(3) = T(3) * X(1)
EXAMPL = 0.
[loop statement illegible]
EXAMPL = EXAMPL + T(I)
RETURN
[remainder illegible]

PRESENT STATUS OF SELECTOR ARRAYS

[Tables of interaction numbers and functional numbers with their weights; entries illegible in the source.]
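The MAD and FORTRAN II listings above appear to share one pattern: each term T(I) is first set to a regression coefficient, then multiplied by the variables that enter that term, and the terms are finally summed into the function value. The following is a minimal Python sketch of that pattern only. The names `make_predictor` and `terms`, the representation of a term as a coefficient with a table of variable powers, and all numerical coefficients are assumptions made for illustration; they are not taken from the original output.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# A term is (coefficient, {variable index: power}); this layout is hypothetical.
Term = Tuple[float, Dict[int, int]]


def make_predictor(terms: List[Term]) -> Callable[[Sequence[float]], float]:
    """Build a predicting function that sums coefficient-times-variables terms."""

    def predict(x: Sequence[float]) -> float:
        total = 0.0                       # corresponds to T(0) = 0.
        for coeff, powers in terms:       # corresponds to the summing loop over I
            t = coeff                     # T(I) = coefficient
            for index, power in powers.items():
                t *= x[index] ** power    # T(I) = T(I) * X(...)
            total += t                    # T(0) = T(0) + T(I)
        return total

    return predict


# Hypothetical coefficients, chosen only so the arithmetic is easy to check:
# 2.0 + 0.5 * x0**2 - 1.5 * x0
example = make_predictor([(2.0, {}), (0.5, {0: 2}), (-1.5, {0: 1})])
print(example([4.0]))  # 2.0 + 8.0 - 6.0 = 4.0
```

Keeping each term as a separate entry mirrors the generated listings, where a regression step that adds or removes a variable only adds or removes the statements for one T(I).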
