THE UNIVERSITY OF MICHIGAN
INDUSTRY PROGRAM OF THE COLLEGE OF ENGINEERING
A STUDY OF AUTOMATIC SYSTEM SIMULATION PROGRAMMING
AND THE ANALYSIS OF THE BEHAVIOR OF PHYSICAL
SYSTEMS USING AN INTERNALLY STORED PROGRAM
COMPUTER
Franklin H. Westervelt
A dissertation submitted in partial fulfillment
of the requirements for the degree of
Doctor of Philosophy in The
University of Michigan
1960
October, 1960
IP470
PREFACE
In the relatively short time during which the high speed digital
and analog computers have been in use, many remarkable applications have
been made. Particularly in the area of simulated behavior of physical
systems the interest has been high and the utility great. While a great
many systems have been programmed and investigated, the general application
of the digital computer to system simulation has not yet been as widespread
as the applications of analog computers. This has been true for many reasons
despite the inherently greater flexibility and more positive error control
offered by the digital machine. Perhaps the chief reason for this lies in
the more difficult encoding of the analysis. The need to reduce each
analysis to machine code has already resulted in several levels of machine
languages that are designed to assist the user in bringing his problem to
the machine.
Since many engineering problems, of which the system simulation
is a good example, require extensive analysis prior to the time at which
advanced languages can be of assistance, it may be expected that the need to
study a variety of simulations and large systems would result in the development of methods to assist the analysis as well as the later computation of
results. The approaches to the assistance in analysis have been varied.
The range of methods extends from the production of a generalized system
program from which a specific system of the same type may be approximated
by an interpretive selection to more truly analytical programs capable of
producing other programs to simulate specific systems of rather specific
types.
This paper treats the development of two techniques to handle a
very generalized system simulation. The first technique, referred to here
as The Simulator Program, is a procedure, referred to as an algorithm, for
producing programs automatically on the digital computer that are simulation
programs for very general systems. The second technique, referred to here
as The Stepwise Regression Program with Simple Learning, is an algorithm
for producing analytical expressions and subroutines for use in representing the performance of the components of systems. The subroutines may be
used by the programs produced by the simulator program or by other programs
as desired. Together these techniques provide a heretofore unavailable
method for undertaking the study of large systems.
The cooperation and direction given me by the members of the
doctoral committee, Associate Dean G. V. Edmonson as chairman, Professors
B. A. Galler, J. J. Martin, N. R. Scott, C. A. Siebert, G. J. Van Wylen
and Mr. R. D. Allen of Consumers Power Company in Jackson, Michigan, has
been greatly appreciated.
Special thanks are extended to both Consumers Power Company and
Commonwealth Associates, Incorporated for the enabling grant that made
the work possible. Thanks are also extended to the many people at the
Computing Center and the College of Engineering whose assistance was most
valuable. In particular, thanks are extended to Mr. J. S. Squire and Mr.
M. P. Anderson for their valuable assistance.
BACKGROUND OF THE STUDY
In order to establish an orientation of the developments presented in this paper, it may prove desirable to review briefly the
developments of earlier workers in pertinent areas. The simulation problem has attracted the attention of many workers. The conventional approach
taken by most earlier efforts was the construction of a special purpose
program designed to encompass as general a description of the system to
be simulated as might be feasible. The solution of the simulation of a
specific physical system then depended upon that system being representable by some subsection of the more general program. The specific system
was usually selected by means of control parameters from the more general
program. Experience with such techniques made obvious the inherent difficulty of representing accurately the many variations of the systems
that may be suggested for study.
This deficiency led to the consideration of programs () that
produced other programs which in turn accomplished the desired simulation.
In this way, the computer began to be utilized as an analytical aid and
was enabled to generate programs from a description of the system and a
specified set of physical laws and relationships with which the system
may be described. DYANA, for example, is able to generate programs for
any system representable by a general network and by the principles contained in Kirchhoff's laws and/or D'Alembert's principle.
The task of generalizing the simulation problem remained and
the Simulator Program presented in this paper is a workable solution of
that problem. Specifically, the generalization allowed by this technique
extends the use of the computer for the analysis of any system describable
as a network and whose component parts may be characterized by the application of relationships involving the parameters of the system as determined at the node, or interconnection, points of the network. The nature
of the physical laws and relationships required for the analysis does
not influence the logical structure of the Simulator. Thus, the programs
generated by DYANA are, in effect, members of the set of programs that
may be generated by the Simulator. The relationships supplied by the
user to characterize the components of his system may then be time dependent or time independent as desired.
The second development presented in this paper concerns the
production of expressions for the prediction of the behavior of physical
phenomena. The methods of "least squares" and multiple regression have
received much attention. (9,10,11,12) The stepwise regression technique
of Efroymson as extended by Dallemand offers a powerful technique to
assist the Simple Learning developed in this paper to extend the treatment of data representation problems to include all orders of interaction
between multiple functions of several independent variables. The stepwise regression analysis provides the independent evaluation required by
the heuristic selection mechanism employed by the Simple Learning program
to generate the terms to be used in the prediction equation for the data.
In this extension of the techniques of "artificial intelligence" many of
the objections to the earlier methods of regression analysis have been
answered. Previous methods were often forced to make rather drastic
simplifications of the statistical models in order to allow the problems
to be solved in a reasonable time, even on the largest and fastest computers.
The incorporation of Simple Learning has allowed the regression analysis
to gain access to every possible term within the scope of the functions
allowed by the user for all orders of interaction in multiple independent
variable problems and still obtain the desired equations in practically
feasible time. Since the equations generated are often required by the
components of the systems treated by the Simulator, extensions of earlier
methods were made to cause the production of the resulting equations in
subroutine form ready to be used.
It is important to understand that the stepwise regression
portion of the analysis may be replaced by other techniques as improved
methods are developed and still retain the benefits offered by the Simple
Learning methods presented here. At present, however, the stepwise regression analysis as extended here is regarded as a very suitable technique. The extensions include a treatment of the truncation and roundoff
errors generated during the analysis and an improved treatment of the
constant term when the analysis is conducted with respect to the normal
coordinate axes.
The Simple Learning techniques presented are also extended from
earlier efforts. (2,)4,5) The basic principle may be regarded as analogous
to learning through reinforcement. By arranging to increment the probability of an action after encountering success and decrementing the probability after failure, earlier workers had indicated the potential ability
of a mechanism to simulate learning. Some interesting observations of
random mechanisms had also pointed up possible advantages of an initially
random mechanism that would gradually become more nearly stepwise in its
action as the reinforcement process took place. Earlier workers also
pointed out some of the pitfalls awaiting learning mechanisms. By incorporating the experiences of earlier workers and introducing a "half-life" concept of reinforcement, the mechanism presented in this paper
displays rather promising properties while retaining simplicity. The
learning mechanism employed by the program is termed "simple" because
the modification of each portion of the selecting mechanism is controlled
by a single parameter and each modification occurs individually when the
success or failure of each portion of the mechanism is determined. The
more complex problem of the interrelation of success and failure patterns
is not treated by the present mechanism.
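The reinforcement principle described above may be illustrated by a minimal sketch in Python. The sketch is not the mechanism of the dissertation: the update rule (multiplying or dividing one weight by a single factor, here called `gain`), the function names, and the example data are all assumptions made for illustration, and the "half-life" concept is not reproduced. It shows only how a selection that starts out uniformly random can drift toward the options that have succeeded, with each modification controlled by one parameter and applied to one option at a time.

```python
import random

def choose(weights, rng):
    """Pick an index with probability proportional to its weight."""
    total = sum(weights)
    r = rng.random() * total
    running = 0.0
    for i, w in enumerate(weights):
        running += w
        if r < running:
            return i
    return len(weights) - 1  # guard against floating-point round-off

def reinforce(weights, index, success, gain=2.0):
    """Raise one option's weight after success, lower it after failure.

    Only the chosen option is modified and a single parameter (gain)
    controls the change. The rule itself is assumed for illustration.
    """
    updated = list(weights)
    updated[index] = updated[index] * gain if success else updated[index] / gain
    return updated

def probabilities(weights):
    """Convert weights to selection probabilities."""
    total = sum(weights)
    return [w / total for w in weights]

rng = random.Random(0)
weights = [1.0, 1.0, 1.0]   # initially a purely random selection
useful = {2}                # hypothetical: only option 2 ever "succeeds"
for _ in range(20):
    i = choose(weights, rng)
    weights = reinforce(weights, i, success=(i in useful))
print(probabilities(weights))
```

After a number of trials the probability of the successful option exceeds its initial value of one third, so that the selection becomes progressively less random.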
Thus it may be observed that the contributions of many workers
in many rather diverse areas have served as the foundation upon which
the present work was built. In turn, the development and applications
of the methods and concepts presented here will serve to further extend
this area in the future.
TABLE OF CONTENTS
Page
PREFACE................................................ ii
BACKGROUND OF THE STUDY...................................... iv
LIST OF FIGURES........................................ ix
INTRODUCTION................................................ 1
SUMMARY OF RESULTS......................................... 6
I. THE SIMULATOR PROGRAM................................... 8
THE STRUCTURE OF A PROCEDURE TO GENERATE ALGORITHMS
TO SIMULATE PHYSICAL SYSTEMS............................ 8
IMPLEMENTING THE SIMULATOR............................... 19
Communication of the System Information to
the Program......................................... 19
THE STRUCTURE OF THE SIMULATOR TRANSLATOR................ 51
II. STEPWISE REGRESSION PROGRAM WITH SIMPLE LEARNING........ 66
IMPLEMENTING THE STEPWISE REGRESSION PROGRAM WITH
SIMPLE LEARNING................................... 98
Communication of the Problem to the Program........... 98
The Structure of the Program.......................... 118
CONCLUSIONS........................................... 124
SYSTEM SIMULATION FLOW DIAGRAMS AND CORE LAYOUTS............ 126
STEPWISE REGRESSION FLOW DIAGRAM AND CORE LAYOUTS............ 195
COMMENTS ON THE SYSTEM SIMULATOR FLOW DIAGRAMS.............. 227
COMMENTS ON THE STEPWISE REGRESSION FLOW DIAGRAMS............ 233
ILLUSTRATIVE EXAMPLE......................................... 239
BIBLIOGRAPHY.................................... 254
LIST OF FIGURES
Figure Page
1 Generator Electrical Losses as a Function of
Load, Hydrogen Pressure, and Power Factor......... 72
2 The Gaussian Distribution........................... 77
3 Surface Showing Values of F which May be Exceeded
by Chance with Stated Probability............... 82
INTRODUCTION
The simulation of the behavior of physical systems is one of the
most economically promising uses of the digital computer because the simulation process affords the user an opportunity to examine system performance
without requiring a capital investment in the actual hardware of the system.
The simulation of a system also is attractive theoretically because of the
opportunity extended to conduct controlled investigations and to avoid the
"noise" problems associated with actual systems. The simulation process is also attractive in those cases in which some degree of hazard is present
since an "explosion" occurs only on paper.
In spite of the obvious advantages of simulation on the digital
computer only a relatively small number of physical systems have thus far
been studied in this way. The reason for this state of affairs hinges
primarily on the difficulty of communicating general system problems to
the computer in the form of a procedure capable of producing the desired
results.
The work described in this paper attacks this problem by producing
two procedures of immediate use in helping to generate system simulation
programs for digital computers. The procedures have been coded and verified
for the IBM 704 computer but the techniques are more generally applicable.
The two procedures are the following:
1) The Simulator Program.
This procedure produces simulation programs
as algorithms in compiler language ready for
translation and execution by the machine.
2) The Stepwise Regression Program with
Simple Learning.
This procedure produces predicting equations in
subroutine form for the description of the behavior of the components of the system being simulated.
Both procedures produce machine translatable programs automatically
ready for immediate processing by the machine. Taken together the procedures
offer for the first time a method of direct communication for general system
problems to the machine for analysis and production of algorithms for the
simulation of the system. The use of the machine for analysis as opposed to
calculation is not yet widespread. The implementation of the techniques discussed in this paper may be of much more general interest in this area.
Specifically, the machine must be presented with methods of proceeding with
a problem when the best available information is only capable of indicating
the relative possibility of success for the alternative paths. Methods of
probabilistic choice and mechanisms for altering the likelihood of choice
from "experience" were also implemented in the development of the procedures.
The programs discussed contain workable implementations of these methods.
These methods are a first step toward more sophisticated "artificial
intelligence" techniques.
In order to understand the problem encountered in the generation
of algorithms by a machine, consider the simulation process itself. If a
system is to be simulated, the behavior of each component of the system
must be related to the other components in such a way that the physical laws
and relationships pertinent to the problem are preserved. In the past, the
preservation of this consistency was the responsibility of the human programmer. The result of his effort of analysis was a procedure or algorithm
by which he, or the machine, could proceed step by step from the data supplied to the results desired. In general, the analysis of the system to
produce this algorithm depends on three kinds of information: 1) The System Definition, 2) The Given or Known Data, 3) The Desired or Unknown
Results. The algorithm may be specialized to the defined system and
designed to accept the given data and from this information produce the
desired results. The algorithm must be so constructed that the solution
can proceed from step to step toward the result.
The Simulator Program is designed to produce these algorithms.
The same three basic classes of information are supplied to the Simulator.
The Simulator Program then carries out an analysis of this information
to produce an algorithm capable of the required performance (or, if it
should prove to be incapable of producing the algorithm due to insufficient
or otherwise inadequate information, error diagnostics are produced).
Because of the possibility that several methods may exist that will yield
the desired results, some interesting heuristic methods must be employed
by the Simulator.
When the program for the particular system has been produced by
the Simulator, it is both printed and punched on cards ready for compilation and execution. The produced program will, in general, require many
special characteristics of the components of the system to be available
to the program. These characteristics are usually functions of one or
more parameters of the system. The Stepwise Regression Program with Simple Learning is an extension of earlier stepwise regression techniques
to allow consideration of all orders of interaction in multiple independent variable problems.
The classical approach to such a problem may be shown to be quite unmanageable
on present (and even projected) computers without drastic simplification. The
Simple learning mechanism employed in this program avoids such a pitfall and
allows an accumulation of "experience" to be directed toward the acceleration
of the generation of the predicting equation for the desired characteristic.
The technique has been employed in many varied problems and has been carefully verified in many known cases. The procedure produces the predicting
equation for a given problem together with a complete statistical analysis
and a punched card subroutine ready to be used with the simulation program
(or by any other program, as desired).
The Simulator Program together with the Stepwise Regression Program with Simple Learning allows the direct simulation of the performance
of any system that may be characterized by a general network. The use of
these programs to produce specialized programs for each system avoids the
loss of accuracy of representation sometimes suffered in attempting to use
a general representation in an interpretive mode. Since the information
presented to both programs is designed to be fairly easily obtained, the
manhour cost of programming system simulation problems can be greatly
reduced. At the same time, revisions of simulation programs to keep pace
with design and operating changes in the actual system can be made economically feasible. Finally, the use of these programs during the design
phases can allow study of a greater number of possible configurations than
was previously practical.
The feature of, perhaps, greatest importance is the opportunity
extended to the user to obtain increasingly accurate representations of
the actual behavior of a system while studying a simulation of the system.
The most recent information available on any part of the system may be
incorporated immediately in the simulation by the techniques described
here. Thus the problem of revising and correcting a former simulation
program becomes quite secondary.
SUMMARY OF RESULTS
Two programs have been produced for assisting the representation
of the performance of physical systems on the digital computer. The Simulator Program is designed to produce the analysis of physical systems in the
form of an algorithm in machine translatable language. The user of the Simulator Program must supply: 1) The System Definition, 2) The Given or Known
Parameters, 3) The Desired or Unknown Parameters. The Simulator Program then
attempts to construct the required algorithm for the problem. In doing so,
use is made of a Library of methods pertinent to the system. The Library is
accessible to the user, so that new methods may be easily inserted. The methods
are grouped under the heading of Element Descriptions. That is, each possible
element that may occur in the system is described in the Library. New elements
may be easily added and older descriptions may be revised within the structure
of the Simulator Program.
The second program, The Stepwise Regression Program with Simple
Learning allows the generation of predicting equations for the behavior
of the components of the system in a form required by the simulation program produced by the Simulator. This program allows the consideration of
multiple independent variable problems with interactions of all orders
allowed. Since the classical approach to this problem is of such magnitude
that present and projected computers cannot adequately cope with the
solution in many cases, the Simple Learning technique was developed to allow
a solution to be made.
The two programs thus allow the simulation of very general physical
systems from a rather basic set of information available on the systems.
The resulting reduction of manhours of programming should allow the extension
of systems simulation techniques in many areas not previously practical.
Furthermore, the availability of these techniques should allow the study,
and thus lead to understanding, of more complicated systems and components
than those previously treated.
I. THE SIMULATOR PROGRAM
THE STRUCTURE OF A PROCEDURE TO GENERATE ALGORITHMS TO
SIMULATE PHYSICAL SYSTEMS
In order to understand a procedure that can generate algorithms
to simulate physical systems, first consider the nature of the problem. In
general, a physical system consists of a collection of components or elements
that are interconnected to each other in various ways.
For example, a typical vibrating mechanical system consists of masses,
springs, dampers, levers and so on. These components or elements of the system
are interconnected to each other to form the desired system. One such interconnection might be, as an illustration, the attachment between a mass element and a spring element.
The behavior of each component is determined by various physical
laws and relationships that are determined by the nature of the component.
In particular the behavior may be expressed in terms of the values of parameters at the points of interconnection of the component to the other
members of the system.
For the general system there may be a very large number of different components and associated with each component a large number of methods
and procedures that allow the performance to be calculated. In order to
select those procedures needed to produce a simulation of the system
additional constraints must be imposed. These constraints consist of
those parameters for which values will be supplied as initial and/or
boundary conditions.
For purposes of discussion, suppose that the simulation of a system
is regarded as a method or procedure that will allow the calculation of the
values of the various parameters belonging to the system. The task of the
simulator program is directed at the problem of determining the method of
calculation just mentioned. The actual calculations of the values of the
parameters will be produced by the program method produced by the simulator
program. The Simulator program enters the problem as an analytical rather
than as a calculational method. It is this use of the computer on the level
of producing programs that in turn are used to produce calculations that
allows the most powerful applications of the technique. In so doing, the
program has assumed a very large burden in the solution of system simulation
problems. This in turn will allow the user to study more complicated and
more accurately represented systems.
Specifically, the task of the simulator is not to be construed
as that of determining all possible values of all possible parameters but
rather that of determining the values of specific parameters subject to
specific initial and/or boundary conditions. If the procedure for defining
the algorithm can be made quite general then the parameters selected for
display and the conditions imposed can be made quite general. The problem
is always that of determining when a sufficient set of information has
been supplied and of producing the procedure when a sufficient set of information is present.
The information requirements are easily set down. The determination
of the sufficiency is most difficult. The requirements are: 1) The Definition
of the System. 2) The Definition of the Components of the System. 3) The
Specification of the Constraints to be imposed on the System.
The Definition of the System consists of the specification of the
elements or components of the system and the way in which these components
are interconnected. For the purpose of this discussion let the Definition of
the System be complete when all of the components of the system are defined and
there are no possible interconnection points of any component that are not
connected to some other point. To secure the completeness it may be necessary
to define some components to act as sources, sinks or boundaries.
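The completeness condition just stated can be sketched as a short check in Python. This is an illustration under stated assumptions and not the Simulator's own coding: the data layout (a dictionary of components and a list of connected point pairs), the function name `is_complete`, and the element names are all hypothetical.

```python
def is_complete(components, connections, library):
    """Return (complete, problems) for a System Definition.

    components:  {component name: (element type, [attachment point names])}
    connections: list of pairs of (component name, point name)
    library:     set of element types for which a definition is available
    """
    problems = []
    connected = set()
    for a, b in connections:
        connected.add(a)
        connected.add(b)
    for name, (element_type, points) in components.items():
        if element_type not in library:
            problems.append(f"no definition for element type {element_type!r}")
        for point in points:
            if (name, point) not in connected:
                problems.append(f"unconnected point {name}.{point}")
    return (not problems, problems)

library = {"mass", "spring", "boundary"}
components = {
    "m1": ("mass", ["left"]),
    "s1": ("spring", ["a", "b"]),
}
connections = [(("m1", "left"), ("s1", "a"))]
print(is_complete(components, connections, library))

# Adding a boundary component closes the open end of the spring.
components["wall"] = ("boundary", ["p"])
connections.append((("s1", "b"), ("wall", "p")))
print(is_complete(components, connections, library))
```

The first call reports the free end of the spring as an unconnected point; the second succeeds once a boundary element has been defined to receive it.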
The Definition of the Components of the System consists of the
specification of the methods or procedures by which values of parameters at
the various interconnection points of the components may be found in terms
of the values of parameters at the same and other interconnection points
of the same component. Strictly speaking, completeness of the Definition
of the Components requires an exhaustive collection of all possible methods
of parameter calculation. In other words, every feasible method or technique
that can be applied to a given component must be made part of the collection.
Otherwise, it is always conceivable that a program will not be generated by
the simulator because a technique was omitted from the collection. Such completeness will
seldom, if ever, be achieved in practice. Usually the most the Definition
of Components can be expected to do is to embrace the most generally productive methods.
Since the decision of what constitutes general productivity is at
best highly subjective, a procedure charged with constructing a calculational
procedure from these methods must not be involved directly with this decision
or else its utility is almost certain to be limited by the decision. If
possible, the simulator procedure should allow easy extension and/or modification of the Component Definitions independently of the simulator procedure
itself. That is, the method of analysis and use of Component Definitions
should be independent of the contents of the Component Definitions.
The Specification of Constraints imposed on the System consists of
the values of the parameters to be construed as initial and/or boundary
(operating) conditions for the system. Completeness of these Specifications
is generally dependent upon the completeness of the Component Definitions.
If very complete Component Definitions are available for the system
then it is possible that several different sets of values of the parameters
can be made to produce a given value of another parameter. For example, in
the superheat region for steam the enthalpy of the steam may be found if the
values of any two independent properties, such as pressure, temperature,
entropy, specific volume and so on, are known. If many methods are available
in the component definition for the determination of enthalpy then almost
any combination of two parameters will allow the calculation. If only a
few methods have been included in the component definition obviously the
parameters given as constraints must be so chosen that these methods apply.
In other words, if the Component Definitions are very nearly complete, then
calculation procedures can be found for almost any set of constraints that
may be given. If this is not the case, then the set of constraints must be
large enough to include those that are needed for the calculation using the
available methods. In almost every case however there will be some minimum
set of constraints required for any given desired parameter.
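The dependence of the usable constraints on the richness of the Component Definition may be made concrete with a small Python sketch. The method table below is hypothetical: the method names and the pairs of properties they accept are assumed for illustration and do not reproduce any actual library of steam property routines.

```python
def applicable(result, known, methods):
    """Return the names of the methods that yield `result` from `known`.

    methods: list of (method name, result parameter, set of required inputs)
    """
    return [name for name, output, inputs in methods
            if output == result and inputs <= known]

# A sparse component definition: only two ways to find enthalpy.
sparse = [
    ("h_from_p_t", "enthalpy", {"pressure", "temperature"}),
    ("h_from_p_s", "enthalpy", {"pressure", "entropy"}),
]
# A richer definition adds a third pair of independent properties.
rich = sparse + [
    ("h_from_t_v", "enthalpy", {"temperature", "specific_volume"}),
]

given = {"temperature", "specific_volume"}
print(applicable("enthalpy", given, sparse))   # no method applies
print(applicable("enthalpy", given, rich))     # one method applies
```

With the sparse definition the given pair of properties is of no use, and the constraints would have to be chosen to suit the available methods; with the richer definition the same constraints suffice.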
The interaction between the Constraints and Component Definitions
for a given system is extremely complex. The determination of a minimum set
of constraints for a given set of component definitions and a given system
definition may be of interest in some cases but the general procedure cannot
usually recognize whether a failure to yield an algorithm to produce a particular result is due to lack of constraints or deficiency in component
definitions. Furthermore, when several alternative calculational methods
exist at one or more points in a system it is possible that a valid procedure can be found before all possible procedures have been examined.
In addition to the three information requirements previously
mentioned, the generation of a specific program must be viewed as a selection
of a reduced collection of methods and their arrangement to yield specific
values for certain parameters. Otherwise the simulation of a system would
be required to be exhaustive and again this is, in general, not possible.
To be generally useful a procedure for generating a simulation program should
allow for unrestricted specification of desired information and be charged
with the task of establishing a method of calculating this information within the framework of the previous three information requirements whenever
possible.
The development of an algorithm to accomplish this generation must
be concerned with the recognition of conditions in which it can be established
that a method of calculation cannot be found. This may seem strange until
it is considered that when a method cannot be found and the condition recognized
then it becomes possible to either terminate the attempt or restart the attempt
from another direction. These things could not be done otherwise. If the
situation can be recognized when further progress cannot be made in generating
a program it is then possible to formulate a simple procedure for generating
a program.
A program may be said to be "nonextendable" when there exists at
least one required parameter that satisfies the following conditions: 1) The
parameter is not given as a constraint and 2) there is no method in either
component definition of the two components directly involved with the parameter
that will yield the value of the parameter. For example, suppose that the
current flow is needed through a thermistor and that no method is available
that can produce the current flow value in this case. Then, if further direct
progress is to be made the value must either be given or an appropriate method
must be added to the Component Definition collection. If neither of these
things can be done then the progress depends upon finding an alternative chain
of methods that avoids the need for the current flow through the thermistor.
If there have been no points earlier in the work at which alternative paths
could have been chosen then there is no possibility of extending the
method beyond this point. The program is said to be "nonextendable".
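The two conditions on a single required parameter can be written as a small predicate. The Python sketch below is illustrative only: the function name `request_is_blocked`, the representation of each Component Definition as the set of parameters it can yield, and the thermistor data are assumptions. The further requirement that no earlier alternative paths exist is not tested here.

```python
def request_is_blocked(parameter, constraints, yields_a, yields_b):
    """True when a required parameter cannot be obtained directly.

    constraints: set of parameters whose values are given
    yields_a, yields_b: sets of parameters that the Component Definitions
        of the two components meeting at the point are able to produce
    """
    if parameter in constraints:          # condition 1 fails: value is given
        return False
    # condition 2: neither component definition offers a method
    return parameter not in yields_a and parameter not in yields_b

thermistor = {"resistance"}               # hypothetical component definition
source = {"voltage"}                      # hypothetical neighboring component

print(request_is_blocked("current", set(), thermistor, source))
print(request_is_blocked("current", {"current"}, thermistor, source))
print(request_is_blocked("current", set(), thermistor | {"current"}, source))
```

The first call reports the blocked request; the second and third show the two direct remedies named above, namely giving the value as a constraint or adding a method to the Component Definition.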
If a program is nonextendable and each previously established
required parameter could be found in no more than one way, then the problem may be said to be not well posed. A problem is not well posed, in
general, when there does not exist any collection of methods to yield all
of the requested information subject to the imposed constraints and the
definitions. It should be noted that a problem may be not well posed even
when several methods exist at previous stages. The determination of the
well posed condition when this occurs may require an exhaustive investigation of all possibilities.
An algorithm that uses the definition of a nonextendable program
to generate a simulation program for a system is the following:
1) Check to be certain that the System Definition is complete.
That is, determine if there is a Component Definition available for every
element of the System Definition and if every attachment point is connected
to some other point. (A Complete System Definition may not be correct but
it is capable of analysis.)
2) Check each attachment point in an ordered search to locate
any point at which there is requested information. A "request" for a parameter
may have occurred in one of two ways, a) the value of the parameter may have
been desired by the user of the program, b) the value of the parameter may
have been required as an input for a method selected previously. A "request"
cannot occur in any other way.
2A) If no such point can be found in the entire system an
algorithm for the simulation of the system has been found.
(If no such point were found in the first search, the problem
is trivial but the preceding statement is still valid.)
2B) When such a point is found, remove the request for
information by one of the following methods:
2B1) Matching the request with a specified constraint.
That is, if a requested result has been given as a
constraint value then the value of the requested result
is known without any further calculation.
2B2) Finding a calculational procedure that applies
at this point that will yield the requested result.
If more than one method applies, pick one and indicate
that a choice has been made.
2C) Whenever the request is removed by matching with a constraint no new requests are created. Whenever the request is
removed by finding a method, the method may introduce new
requests for information. When this is true, cause the entire
step 2 to be repeated again after completing the current search.
2D) Whenever a request cannot be removed, the program may be nonextendable. Test the choice indicator and a) if no previous
choices have been made, the program is not well posed, b) if
previous choices have been made, cause the program generation
to return to an earlier state, make another choice and try
again.
3) Repeat the search indicated by step 2 until 2A is satisfied
or until an upper limit on the number of trials has been exceeded or until
the problem is shown to be not well posed.
4) Whenever step 2A is satisfied, the required algorithm consists
of the methods found by step 2B2 executed in the reverse order of their determination. That is, the first method determined yields the last request
required of the program. The second yields the next to the last and so on.
If any method introduces new requests, these results must be found before
the method can be used. This is precisely the situation that will be
obtained: since the methods to determine these results are found later,
the execution of the methods in reverse order will produce the results
before they are needed by the method requesting them.
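The four steps above can be illustrated by a short Python sketch. It is only a minimal model written for this discussion, not the simulator itself: the class Method, the function generate, and the example method names are all hypothetical, methods are tried in list order rather than by a weighted choice, and the trial limit of step 3 is omitted. Each request carries the position of the method that made it, so that a result is accepted from a method already chosen only when that method will run before the requester.

```python
from typing import NamedTuple, Optional

class Method(NamedTuple):
    name: str
    inputs: frozenset   # parameters the method requests
    outputs: frozenset  # parameters the method yields

def generate(desired, constraints, methods) -> Optional[list]:
    """Return method names in execution order, or None if not well posed."""
    def search(pending, chosen):
        if not pending:                       # step 2A: no requests remain
            return chosen
        (param, requester), rest = pending[0], pending[1:]
        if param in constraints:              # step 2B1: match a constraint
            return search(rest, chosen)
        # A method determined after the requester is executed before it.
        if any(param in m.outputs for m in chosen[requester + 1:]):
            return search(rest, chosen)
        for m in methods:                     # step 2B2: try each method
            if param in m.outputs and m not in chosen:
                new = tuple((p, len(chosen)) for p in sorted(m.inputs))
                found = search(rest + new, chosen + (m,))
                if found is not None:
                    return found
        return None                           # step 2D: back up or give up

    chosen = search(tuple((p, -1) for p in desired), ())
    if chosen is None:
        return None
    return [m.name for m in reversed(chosen)]  # step 4: reverse order

# Hypothetical method catalogue for illustration only.
METHODS = [
    Method("H_FROM_FLOW", frozenset({"FLOW"}), frozenset({"ENHLPY"})),
    Method("H_FROM_PS", frozenset({"PRESS", "ENTRPY"}), frozenset({"ENHLPY"})),
    Method("S_FROM_PT", frozenset({"PRESS", "TEMP"}), frozenset({"ENTRPY"})),
]
print(generate(["ENHLPY"], {"PRESS", "TEMP"}, METHODS))
```

In this example the first method tried for ENHLPY requests FLOW, which cannot be removed, so the search returns to the earlier state and makes another choice, as in step 2D.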
An essential part of the previous procedure lies in the technique
of picking a method whenever more than one method is available. Obviously,
if an inflexible selection is made, that is, always choosing the "best"
method (no matter how "best" may be defined), repeating the generation when
a program has been found to be nonextendable would always lead back to the
same point. Therefore, the selection should be made flexible and, in
particular, allow equal chance of selection for equally promising methods
and occasionally the selection should allow choice of methods not locally
"best." Thus the technique must set some scale by which the characteristic
"equally promising" and, in fact, the degree of "promise" can be measured.
Many such scales could be specified. One scale that is easily determined
and contains some measure of the "promise" characteristic is the ratio of
the number of useful results produced by a method to the number of new
information requests the method will make. Since the objective is to
eliminate the requests by finding methods that require only constraint
information, such a scale would place greater weight on methods that make
the smallest number of new requests, but would also consider the number of
useful results yielded.
For example, a method that requires one new result while yielding
one requested result is equally promising when compared to a method requiring four new results to yield four requested results. However, a method
yielding four requested results and requiring only two new results would
be scaled twice as promising as either previous method. On this scale, a
method that produces any results without requiring any new results would
automatically be selected. Also a method that produces no results is
automatically rejected. The important point to be understood is that
the selection cannot be fixed so that the method of greatest finite weight
is always selected since it is possible that one of the parameters that
this method would require may not be capable of calculation due to incompleteness of Component Definitions or Constraint Specifications while a method of
less weight may avoid this difficulty. However, the methods of greatest
weight should tend to be selected if the program is ever to be finished.
The result of these considerations is to produce a selection method that
operates probabilistically in the choice.
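A minimal Python sketch of such a selection follows. The ratio scale is the one described above; the use of the ratio directly as a selection weight is an assumption made here for illustration, since the text fixes only the scale and the requirement that the choice be probabilistic. The names promise and pick are hypothetical.

```python
import math
import random

def promise(n_results, n_new_requests):
    """Ratio of useful results produced to new requests made."""
    if n_results == 0:
        return 0.0                 # produces nothing: always rejected
    if n_new_requests == 0:
        return math.inf            # needs nothing new: always selected
    return n_results / n_new_requests

def pick(candidates, rng):
    """candidates: list of (name, n_results, n_new_requests) tuples."""
    scored = [(promise(r, q), name) for name, r, q in candidates]
    scored = [(w, name) for w, name in scored if w > 0]
    if not scored:
        return None
    free = [name for w, name in scored if math.isinf(w)]
    if free:
        return free[0]
    names = [name for _, name in scored]
    weights = [w for w, _ in scored]
    # Assumed rule: chance of selection proportional to the promise ratio,
    # so equally promising methods have equal chances and a method of
    # lesser weight is still chosen occasionally.
    return rng.choices(names, weights=weights, k=1)[0]

rng = random.Random(0)
print(pick([("ONE_FOR_ONE", 1, 1), ("FOUR_FOR_TWO", 4, 2)], rng))
```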
The implementation of this procedure in the form of a program for
the digital computer requires two tasks to be performed: 1) The creation
of an artificial language to allow communication of the information concerning the system between the human user and the machine program, 2) the
preparation of the foregoing simulator procedure as a program capable of
accomplishing the translation of the artificial language into a simulation
program. Since the first of these tasks is strongly associated with the
human and his simulation problem, while the second task may, for the casual
user, be regarded as a problem removed from his immediate consideration, the
discussion of the implementation of the simulator is divided into two parts
along this division. Of course, the second part is vital to the use of the
first but its operation is of concern to a relatively small number of people
by comparison. It must be understood, however, that the generation of the
simulation programs is accomplished by the translator. The translator is
thus the procedure of greatest importance in the solution of the simulation
problem on the machine.
IMPLEMENTING THE SIMULATOR
Communication of the System Information to the Program
Communication of the information concerning the system to be
simulated to the simulator program is the first, and for the simulator
user, the most important, step in the generation of a program to simulate
the system. This transfer of information is accomplished through the medium
of an artificial language that is designed to be reasonably like the user's
own and, at the same time, contain a structure that is recognizable by the
program so that the information content may be extracted. Thus the user
may expect to use familiar alphanumeric characters and standard punctuation
symbols in all but a few cases.
Since the user is often not acquainted with the detailed operation
of computing machines, some effort has been made to remove restrictions in
the formats for source program preparation. (The source program is the
collection of punched cards containing the user's system information. The
simulator program produces an object program from the source program. The
object program is a machine translatable program from which the machine can
produce the desired simulation when supplied with data.) In general, the
source program may be punched anywhere in columns 1 through 72 on IBM cards.
Statements may run over from card to card without requiring special continuation symbols. While most users will tend to place a single statement per
card for convenience in checking and correcting source programs, more than
one statement may be placed on a card, if the user so desires. Statements
are terminated with a period (decimal point symbol) as in conventional
writing. In a few cases, notably in Element Descriptions, where the user
wishes to convey a specific object program language to the simulator,
format restrictions will be imposed and emphasized at that place.
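The free format described above may be sketched in Python as follows. The function name split_statements is hypothetical, and the sketch assumes that a period never occurs inside a statement and that runs of blanks are not significant; neither assumption holds for the M.A.D. statements of an Element Description, where format restrictions apply.

```python
def split_statements(cards):
    """Split card images into statements terminated by periods."""
    # Only columns 1 through 72 of each card carry source text.
    text = "".join(card[:72].ljust(72) for card in cards)
    # A statement may run from card to card; the period ends it.
    return [" ".join(piece.split()) for piece in text.split(".") if piece.strip()]

cards = [
    "CONNECTIONS. PUMP1, 16(OUTLET), TO,",
    "PIPE, 23, TO, HOTWEL. INPUT PARAMETERS.",
]
for statement in split_statements(cards):
    print(statement)
```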
Structure of the Simulator Source Language
Information must be transmitted to the simulator from three basic
areas and, if desired, further implemented by a fourth "utility" area. The
basic areas are 1) the System Definition, 2) the Input-Output Requirements,
and 3) the Characterization of Component Performance. In every problem, the
user will be involved with the first two of these areas directly. If the
problem requires modification or extension of the libraries on Component
Performance, the user will also be involved in the third area. The language
requirements for each area are interrelated so that the user may carry over
most of the structure from one area to the next.
I. The System Definition
In order to simulate a system, the specific system to be
considered must first be separated from the set of all possible systems
allowed by the simulator. This requires the communication of 1) the names
of the actual components that are found in the system, 2) the way in which
these components are attached (connected) to each other.
This information is transmitted to the simulator program by
statements occurring within the range of a CONNECTIONS declaration. A
Declaration does not perform any calculation but instead prepares the
program to receive the information that follows. The range of any declaration begins immediately after the declaration and continues until terminated
by any other declaration or by the end of input data cards. The form of
the system definition declaration is: CONNECTIONS.
Connection Statements
Connection Statements are chains of symbolic names transmitting
the precise components and attachments and their interconnection to the
simulator. The symbolic names are of four types:
1. Element Name
Any six or fewer alphanumeric characters may be an Element
Name. The Element Name used in a Connection Statement must either agree
exactly with the corresponding Element Description Name for that component
or be made to agree by use of a synonym. There are no restrictions as to
the order of appearance of alphabetic or numeric characters.
Examples of Acceptable Element Names
PUMP; TRBIN1
6L6; 12AT7
MASS; SPRING; DAMPER
2. Element Identification
Since more than one element of a given kind may be found
within a system, means must be provided to identify each different element.
Since the desired effect is to convey uniqueness of the system, the user
may use any six or fewer alphanumeric symbols to identify the elements of
the system. If an element is identified, all occurrences of the same
element must exactly agree in identification. If an element occurs only
once in a system, the element identifier may be omitted.
Examples of Acceptable Identifiers
1; 2; 3
A; B; C
A1; 2B; 5C6
MAIN; SCNDRY
3. Attachment Name
Every element enters the system definition by the way in
which it is connected to the rest of the system. That is, an element
name cannot, except for unary and binary elements, occur without an
associated attachment name. The attachment name consists of six or less
alphanumeric characters and must agree with the attachment names given
in the Element Description for the element associated with the attachment
name. If the user desires the agreement can be obtained through the use
of synonyms.
Examples of Acceptable Attachment Names
INLET; EXIT
1; 2
ENTRY3; OUTLET
GRID; PLATE; CTHODE; SCREEN
4. Attachment Identification
If more than one occurrence of an attachment name with its
identified element exists in a set of CONNECTIONS statements, an ambiguous
situation arises. The attachment, in effect, has been made to several
different places with but one physical contact point. The user has the
option of defining an element with branching or junction properties to
resolve this problem or, if it is more convenient or desirable, the option of writing the element description of the component to allow for
attachment identification. The attachment identifier is any alphanumeric
name of 6 or less characters. A unique identified attachment of an identified element may occur only once in each program. The acceptable forms
are like those of element identifiers. As with element identifiers, the
user is free to create whatever symbolic attachment identifiers that may
be needed.
The Connective, TO, and Connections Punctuation
The foregoing forms of symbolic names are sufficient to define
a unique connection point in a system. Let the following generalized
symbolic names be defined:
Let EL1 be any allowable element name.
EID1 be any allowable element identifier.
AT1 be any allowable attachment name.
AID1 be any allowable attachment identifier.
The following forms of connection points are then allowed:
EL1(AT1) if there is only one element EL1 and only one
AT1 on EL1 in the system.
EL1, EID1(AT1) if there is more than one EL1 but only
one AT1 on EL1, EID1.
EL1(AT1, AID1) if there is only one EL1 but more than
one AT1 on EL1.
EL1, EID1(AT1, AID1) if there is more than one EL1 and
more than one AT1 on EL1, EID1.
Let CONN be any one of the following connection point forms:
EL1 (AT1)
EL1, EID1 (AT1)
EL1 (AT1, AID1)
EL1, EID1 (AT1, AID1)
Then a CONNECTIONS Statement is of the form: CONN, TO, CONN.
The connective, TO, establishes the joining together of the connections.
The punctuation should appear as written in the definition.
Examples of Acceptable CONNECTIONS Statements
PUMP1, A23 (OUTLET, B), TO, HEATER, 5(INLET1).
6L6, I(PLATE), TO, XFRM, OUTPUT (TAP1, 3).
SPRING, 15B (END1), TO, LEVER, 6C (AT, 3).
The Special Cases of Unary and Binary Elements
The unary elements (one attachment) and binary elements
(two attachments) allow special treatment in writing connections statements. These elements may be written without specifying attachment names
since the attachment is immediately obtained from the context. If the
full connection statement notation is used, no error will result but some
saving in programming will be lost. The simulator program will assign the
attachment names 1 and 2 to binary elements and the attachment name 1 to
unary elements. The user must take care to use these names if reference
is made to these elements using the complete notation.
Examples of Connection Statements with Unary and Binary Elements
PUMP1, 16(OUTLET), TO, PIPE, 23, TO, HOTWEL.
2N133, 1(CLLCTR, 1), TO, CAP, 3, TO, 2N132, 1(BASE, 1).
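The connection point forms, including the shortened forms for unary and binary elements, can be recognized mechanically. The following Python sketch is one possible reading of the notation and not the simulator's own scanner; the function name parse_connections and the tuple layout (element name, element identifier, attachment name, attachment identifier) are assumptions of the sketch, with None standing for an omitted part.

```python
import re

# EL1, optional EID1, then optional (AT1) or (AT1, AID1).
_POINT = re.compile(
    r"\s*(\w+)\s*(?:,\s*(\w+))?\s*"
    r"(?:\(\s*(\w+)\s*(?:,\s*(\w+))?\s*\))?\s*"
)

def parse_connections(statement):
    """Return one (EL, EID, AT, AID) tuple per connection point."""
    body = statement.strip().rstrip(".")       # the period ends the statement
    points = []
    for part in re.split(r",\s*TO\s*,", body):  # the connective: , TO,
        match = _POINT.fullmatch(part)
        if match is None:
            raise ValueError(f"not a connection point: {part!r}")
        points.append(match.groups())
    return points

print(parse_connections("PUMP1, 16(OUTLET), TO, PIPE, 23, TO, HOTWEL."))
```

A point returned without an attachment name is a unary or binary element, for which the attachment names 1 and 2 would be assigned from context.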
II. The Input-Output Requirements
In addition to the System Definition, the user must state the
information requirement to be imposed on the system. That is, the parameters
for which values will be supplied as data for initial and/or boundary conditions and the parameters for which the user expects the program to produce
values must be stated to the simulator. These parameters are listed under
one or the other of the input-output declarations: 1) INPUT PARAMETERS.
2) DESIRED RESULTS. The user must give the source program the appropriate
declaration followed by a list of the pertinent parameters. The list of
parameters consists of 1) the name of the parameter, 2) the symbolic location of the parameter in the system.
Parameter Names
Parameter names are 6 or fewer alphanumeric characters of which
the first character must be alphabetic. These names must agree exactly
with the parameter names used by the element descriptions. Synonym modification is not allowed.
Examples of Acceptable Parameters (with Associated Connections)
PRESS (SUPHTR, 1(EXIT)).
VOLTS (6SN7, 3(GRID)).
Specification of More Than One Parameter at a Point
More than one parameter may be specified at a point by giving a
list separated by commas and followed by the point designation. For example:
FLOW, PRESS, TEMP (TRBIN1, 1(INLET)).
FORCE, VELCTY, ACCLRN (LINK, 3(END1)).
Parameter Range
The range of the input-output requirements of a parameter is
established by the point designation. If the point designation is EL1,
EID1 (AT1, AID1) the parameter requirement will have the range of exactly
one point. If the designation is EL1, EID1 (AT1) and there is more than
one AT1 on EL1, EID1 in the system definition, the range will be for all
AT1 on EL1, EID1. If the designation is EL1 (AT1), the range is for all AT1
on all EL1. In general the range is defined to cover every connection bearing the designation. This concept is very useful for writing parameter
input-output requirements but may trap the unwary user. (If, for example,
the user gave the designation EL1, the effect is to place the parameter
requirement at every attachment on every occurrence of EL1.)
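The range rule amounts to treating every omitted part of a designation as matching anything. A small Python sketch under that reading follows; the function names and the representation of a point as an (EL, EID, AT, AID) tuple, with None for an omitted part, are assumptions made for illustration.

```python
def in_range(designation, point):
    """True if the point bears every part given in the designation."""
    return all(d is None or d == p for d, p in zip(designation, point))

def requirement_range(designation, system_points):
    """All connection points covered by a parameter requirement."""
    return [pt for pt in system_points if in_range(designation, pt)]

# Hypothetical system: element EL1 occurs twice.
POINTS = [
    ("EL1", "1", "AT1", "A"),
    ("EL1", "1", "AT1", "B"),
    ("EL1", "2", "AT1", None),
    ("EL1", "2", "AT2", None),
]
print(len(requirement_range(("EL1", "1", "AT1", "A"), POINTS)))   # one point
print(len(requirement_range(("EL1", None, None, None), POINTS)))  # every point
```

The second call shows the trap mentioned above: the bare designation EL1 places the requirement on every attachment of every occurrence of the element.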
III. Characterization of Components
The simulator must have precise information concerning the
way in which the physical laws or relationships are to be treated for each
component in the system. Fundamentally, the problem is one of allowing
the program analytical access to a large collection of possible methods.
The methods are catalogued as to the input information required and the
output results produced. The simulator program then searches the catalogue
for the most appropriate methods to use in the generated program.
If the library of component descriptions is complete for a given
system, the user will obtain a program to simulate the performance of the
given system after supplying only the System Definition and the Input-Output requirements. If the component library is inadequate for any
reason the user must then supply additional information concerning the
component. This is done by using the declaration "ELEMENT DESCRIPTION."
followed by a collection of assertions and statements conveying the
information.
Assertions Within Element Descriptions
An Assertion, like a declaration, conveys special information to
the simulator program but unlike the declaration does not terminate the
scope of the declaration. The "ELEMENT DESCRIPTION." language has four
assertions and two forms of statements.
Element Name Assertion
The element name assertion defines the six or less alphanumeric
symbolic name by which the description will be recognized. This is the true
name of the element description. Let ELNAME stand for any symbolic
name, then the element name assertion is: NAME OF ELEMENT ELNAME.
Examples of Allowable Element Name Assertions
NAME OF ELEMENT PUMP1.
NAME OF ELEMENT 2N133.
NAME OF ELEMENT SPRING.
Parameter Scope Assertion
So that an attachment may be identified, the concept of parameter
scope must be implemented. The parameter scope concept classifies parameters
into Broad Scope parameters and Narrow Scope parameters by applying the
following rules:
A parameter is a Broad Scope parameter if when considering
this parameter at an identified attachment it is true that when
the value of the parameter has been established at any one of
the identified attachment points it has automatically been
established for every other identified point of that attachment.
A parameter is a Narrow Scope parameter otherwise. In
particular, a parameter is a Narrow Scope parameter when a
requirement for a value of this parameter at an attachment
point automatically requires individual determinations of
the value of the parameter at each identified point of the
attachment.
The user may declare a parameter to be of Broad Scope with the
assertion: BROAD SCOPE PARAM1, PARAM2, PARAM3. where PARAM1, PARAM2,
PARAM3 stand for any symbolic parameter names.
The simulator program will assume a parameter to be of Narrow
Scope unless otherwise specified in every component description using the
parameter contained in either the Permanent or the Temporary Library.
A parameter not used by an Element Description is thus excluded from
this assumption. Failure to properly assert the scope of parameters may
result in failure to generate programs that should lie within the scope
of the simulator, but programs generated will yield correct results even
though redundant calculations were programmed.
Examples of Parameter Scope Assertions
BROAD SCOPE PRESS, TEMP.
BROAD SCOPE VOLTS.
Typical broad scope parameters are pressure, temperature, voltage,
since if these parameters are established at any identification of any
attachment all other identifications of the same attachment must have the
same value for the parameter. Typical narrow scope parameters are flow,
current and similar parameters since the value for the attachment requires
the values at every identification of the attachment.
Library Status Assertion
The elevation of an Element Description to Permanent Library
Status should be made only after the Element Description is thoroughly
checked and very generally useful. When the user feels that these requirements are met the status may be made permanent by giving the assertion:
PERMANENT.
After an ELEMENT DESCRIPTION is entered in the Permanent Library
it may be removed only by rewriting the entire Permanent Library. Once
entered in the Permanent Library the Element Description does not have to
appear with the source program deck to generate programs using the element.
The preceding three assertions may be made in any order but, if
given, must follow the Element Description declaration and precede the
first Statement Collection assertion for a given element description.
Statement Collection Assertion
The assertion Statement Collection prepares the simulator so that
the Statements following the assertion will be processed to form the element
description capability. The statement collection assertion has the form:
STATEMENT COLLECTION.
Collection Capability Statement
Immediately following the statement collection assertion, the
collection capability is stated. The capability language conveys the input
parameter requirements and the result capability of the collection. The
parameters are given in exactly the parameter language form of the Input-Output requirements. The Collection Capability requirements add three
special words to the simulator language: 1) WITHOUT, 2) THEN, 3) ESTIMATE.
The Connection Implication Then
The word THEN, set off by commas, separates and identifies for the
simulator the output results of a statement collection. This may be best
illustrated by example.
Suppose the user wishes to convey to the simulator the capability
of an element description that would allow the determination of enthalpy at
a point if pressure and entropy were known. This capability might be
expressed: PRESS, ENTRPY(OUTLET), THEN, ENHLPY(OUTLET).
A capability statement may involve any number of input parameters
and any number of output results but only one capability statement may be
given by the user for each statement collection.
The Restrictive Without
To avoid the problems associated with having many similar element
descriptions for components that are basically alike but have different
attachments the user is permitted to restrict a Statement Collection to
apply only to those elements of the type that are without certain attachment points. For example, this allows one element description to be written
for a turbine stage and treat both stages with an extraction point and
stages without an extraction point. The collection handling a turbine
stage without an extraction might use the capability statement:
FLOW (INLET), WITHOUT (EXPT), THEN, FLOW (OUTLET).
Clearly the word without conveys the information required to
prevent the use of the statement collection that would follow the capability
statement in the case of the turbine stage with an extraction point.
ESTIMATE, The Iterative Solution Collection Indicator
Often physical systems are simulated most conveniently through
iterative algorithms. That is, the program is so structured that an initial
estimate is improved by repeated calculation. If the user wishes to present
a collection of M.A.D. statements to the simulator library to allow the use of
such a technique, the form is such that useful results are apparently
produced without requiring any input information. Such a collection should
be used only if the parameter produced by the collection has been calculated
independently by some other method. In order to inform the simulator that
a statement collection is iterative in form the word ESTIMATE is given
followed by the list of parameters to be estimated (and later calculated).
When this is done, the simulator will allow the use of the iterative
collection only when the parameter has been calculated by some other means
later in the program.
An acceptable capability statement for the ESTIMATE control word
is: ESTIMATE, FLOW (EXTRCT).
Restricted Collections
Certain types of statement collections that are extremely useful
in writing element descriptions are such that if they occur in a program
they may not be used more than once at any given attachment point. As an
example of such a collection consider the continuity equation for mass flow
applied at a branching junction. Suppose, for illustration, that there is
a single inlet stream designated at the attachment INLET but five outlet
streams at the attachments EXIT, 1; EXIT, 2; EXIT, 3; EXIT, 4 and EXIT, 5.
If the flow were found at the inlet and at exits 1, 2, 4, 5 then continuity
allows the flow at 3 to be found by the relation:
\[ \mathrm{FLOW}_{3} = \mathrm{FLOW}_{\mathrm{INLET}} - \sum_{i=1,\; i \neq 3}^{5} \mathrm{FLOW}_{\mathrm{EXIT},\,i} \]
Clearly this is a most useful collection form but its use must
be restricted to one occurrence at any given attachment. Of course, the
collection may be used once and only once at any pertinent attachment point
in the system and, therefore, the collection may appear several times in a
program, each time at a different attachment.
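As a numerical illustration of the continuity relation, the short Python sketch below finds the one unknown exit flow from the inlet flow and the four known exit flows. The function name and the flow values are invented for the example.

```python
def missing_exit_flow(inlet_flow, known_exit_flows):
    """Continuity at a branch: the unknown exit flow is the inlet flow
    less the sum of the exit flows already found."""
    return inlet_flow - sum(known_exit_flows.values())

# Hypothetical flows at EXIT, 1; EXIT, 2; EXIT, 4 and EXIT, 5.
known = {1: 10, 2: 20, 4: 30, 5: 15}
print(missing_exit_flow(100, known))  # flow at EXIT, 3
```

If the same relation were applied a second time at this junction, for example to find the flow at EXIT, 4 from a flow at EXIT, 3 that was itself obtained this way, no new information would result, which is why the collection is restricted to one use per attachment.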
The collection capability statement for this type of collection
uses an identified attachment name for at least one of the attachment names
that apply. The capability statement for the preceding illustration is:
FLOW (INLET), FLOW (EXIT, i), THEN, FLOW (EXIT)
or the equally correct forms
FLOW (INLET), FLOW (EXIT), THEN, FLOW (EXIT, j)
or
FLOW (INLET), FLOW (EXIT, i), THEN, FLOW (EXIT, j)
Note that the symbols i and j are completely arbitrary and therefore
open to the user's choice.
Only one form is not recognized as meaningful by the simulator.
That form omits both occurrences of the identified attachment on the opposite
sides of the implication THEN. The previous illustration written incorrectly
is:
FLOW(INLET), FLOW(EXIT), THEN, FLOW(EXIT)
This form effectively says that the exit flow is known if the
exit flow is known. This is not meaningful to the simulator because of
the apparent redundancy.
The Collection of Statements
Immediately following the capability statement the user must
supply the program statements that will produce the results claimed.
This simulator program adopts the Michigan Algorithm Decoder language
as the medium for the program statements. The user must write the
Statement Collection conforming exactly to the format and coding restrictions and conventions of that language. Complete details and manuals for
the M.A.D. language are available from the Computing Center of The
University of Michigan. It is presumed here that the user is familiar
with the M.A.D. language.
The simulator program allows the complete structure of the M.A.D.
language to be employed in implementing statement collections. Two special
symbols are the multiple punch symbols plus zero (+0) and minus zero (-0).
The plus zero is punched by depressing and holding the multiple punch key
on the IBM 026 keypunch and then striking the plus sign and the numeric zero.
The resulting combination of holes in the IBM card is 12-0. (The equivalent
12-2-8 combination will not be read properly by the peripheral 714 Card Reader.)
The minus zero is produced by depressing and holding the multiple punch key and
then striking the minus (not the dash) and the zero. The resulting combination
of holes is 11-0. (The equivalent 11-2-8 will not be read properly by the
714 Card Reader.)
These symbols function as special brackets or parentheses for
the simulator. Since the user should have no need for these symbols in his
M.A.D. statements their use is specifically restricted to convey information
from the M.A.D. statements to the simulator.
Meaning and Use of the Symbol -0
The minus zero symbol -0 is used to delimit two types of statement segments:
1) function substitutions, 2) floating statement labels.
The function substitution use allows modifications to be made in
the actual program generated by the simulator at the time of the execution of
the generation. For example, suppose that the element description for the
element TRBIN1 has need for the stage efficiency of the turbine stage and
the different turbine stages require different functions to describe the
efficiency.
The statement collection might be written:
EFF = -0ETA-0.(+0FLOW(INLET)+0)
and at execution of the program, if this collection were required a check
would be made to determine whether a substitution of names was desired for
ETA. Thus the program might be made to produce ETA1 for ETA when stage one
is used, ETA2 for ETA when stage two is used, and so on. If no substitution
is given at execution time, the program will use ETA. The -0 symbols are
deleted from the object program.
The floating statement label allows the use of statement labels
in statement collections. Since an element may appear any number of times
in a system and the same statement collection might therefore occur any
number of times in a program simulating the system, the user must not use
fixed statement labels within a statement collection. If this were done
the program generated might be ambiguous. Floating statement labels allow
the simulator to generate unique statement labels and thus eliminate any
ambiguity in labels.
The user wishing to use a floating statement label writes:
-0**XXXXX-0
where the X's represent any statement label of six or fewer alphanumeric
characters the user may care to use in his statement collection. The floating
statement label is initiated by the minus zero (-0) followed by two asterisks (*) and
is closed by a second minus zero. The simulator will generate a unique statement label,
replace the entire floating statement label by the unique label and use
this unique label whenever the same floating statement label is found in
this local statement collection. Should the collection appear again in
the object program another unique statement label will be generated.
Example of Floating Statement Label
THROUGH -0**A-0, FOR I = 1, 1, I .G. 10
WHENEVER T .G. TMAX, TRANSFER TO -0**A-0
-0**A-0 CONTINUE
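The replacement of floating statement labels can be sketched in Python as follows. The minus zero punch is written here as the two characters -0, which is a convention of this sketch only, and the form of the generated labels (L00001, L00002, and so on) is likewise hypothetical; the text states only that the generated label is unique.

```python
import itertools
import re

# Minus zero, two asterisks, a label of one to six characters, minus zero.
_FLOATING = re.compile(r"-0\*\*(\w{1,6})-0")

def expand_labels(collection, counter):
    """Replace each floating label in one copy of a statement collection."""
    local = {}  # the same floating label maps to the same unique label

    def unique(match):
        name = match.group(1)
        if name not in local:
            local[name] = f"L{next(counter):05d}"
        return local[name]

    return [_FLOATING.sub(unique, line) for line in collection]

collection = [
    "THROUGH -0**A-0, FOR I = 1, 1, I .G. 10",
    "WHENEVER T .G. TMAX, TRANSFER TO -0**A-0",
    "-0**A-0 CONTINUE",
]
counter = itertools.count(1)
first_copy = expand_labels(collection, counter)
second_copy = expand_labels(collection, counter)
print(first_copy[2], "/", second_copy[2])
```

Because the counter is shared while the label table is local to each call, a second appearance of the collection in the object program receives a fresh label.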
The Meaning and Use of the Symbol +0
The plus zero symbol +0 is used to delimit the parameter and attachment
names for which the simulator must generate unique variable names of 6
or less alphanumeric characters. If it is possible to extract the first
three characters of the parameter name and produce a unique symbol, this
will be done. In this way the mnemonic significance will be preserved so
far as possible. If conflicts occur, the simulator will generate a unique
symbol for the parameter and use this symbol throughout the object program.
In generating the unique symbol the procedure is to first build up a three
nonblank alphabetic character symbol and if conflict still exists modify
this symbol with probability 0.10 of changing the first character to a
random alphabetic, probability 0.30 of changing the second character to
a random alphanumeric, and probability 0.60 of changing the third character
to a random alphanumeric. The process continues until a unique symbol is
generated. A maximum of 70 different parameters may occur at each attachment in any system to be simulated.
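One reading of this procedure is sketched below in Python. It assumes that each trial changes exactly one character position, chosen with the stated probabilities 0.10, 0.30 and 0.60; the text does not say whether the three changes are exclusive, so this is an interpretation. The function name unique_code and the padding of short names with the letter X are also assumptions.

```python
import random
import string

def unique_code(param, used, rng):
    """Return a three character code for param not already in used."""
    code = (param[:3] + "XXX")[:3]  # first three characters, padded (assumed)
    while code in used:
        # Assumed: one position is altered per trial, the third most often.
        pos = rng.choices([0, 1, 2], weights=[0.10, 0.30, 0.60], k=1)[0]
        if pos == 0:
            pool = string.ascii_uppercase                  # random alphabetic
        else:
            pool = string.ascii_uppercase + string.digits  # random alphanumeric
        code = code[:pos] + rng.choice(pool) + code[pos + 1:]
    used.add(code)
    return code

rng = random.Random(0)
used = set()
first = unique_code("PRESS", used, rng)    # no conflict: keeps PRE
second = unique_code("PRESSR", used, rng)  # conflict with PRE: modified
print(first, second)
```

Weighting the third character most heavily keeps the leading characters, and with them most of the mnemonic value, unchanged in the majority of cases.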
Let PARAM represent any true parameter name; ATTCH represent any
true attachment name; AID represent any attachment identifier, then the
following forms are allowed within the scope of a pair of plus zero (+0) symbols:
+0 PARAM (ATTCH, AID) +0
+0 PARAM (ATTCH) +0
+0 (ATTCH, AID) +0
+0 ATTCH, AID +0
+0 (ATTCH) +0
+0 ATTCH +0
No other forms are permitted within the scope of +0.
The attachment within the scope of the +0 will cause a unique three
digit attachment point number to be determined from the System Definition.
The unique parameter name code and three digit attachment code are combined
to form a six character variable name for use by the M.A.D. translator. In
case no parameter occurs in the +0 scope, as in the latter four allowable
forms, the parameter code is generated as three blank columns. The user
may use this feature in whatever way may be logically useful in his statement collection.
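The composition of the six character name can be shown in a few lines of Python. The function name variable_name is hypothetical; the layout, a three character parameter code followed by a three digit attachment number, with blanks when no parameter is given, follows the description above.

```python
def variable_name(param_code, attachment_number):
    """Join a parameter code and an attachment number into six characters."""
    if not 0 <= attachment_number <= 999:
        raise ValueError("attachment number must fit in three digits")
    # The code is padded or cut to three columns; an empty code gives blanks.
    return f"{param_code:<3.3}{attachment_number:03d}"

print(repr(variable_name("PRE", 17)))
print(repr(variable_name("", 5)))
```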
Because of possible ambiguity, the user may not use more than
one identified attachment with a given statement. Any number of occurrences
of the same identified attachment may appear with the same or different
parameter names within the same statement. When an identified attachment
is encountered, the simulator will produce a copy of the statement for
every identified attachment occurring in the System Definition for the
identified element concerned with this statement collection.
The only exception to this rule occurs when a statement
collection involves an identified attachment name agreeing exactly with
the point of attachment in the system that caused the collection to be
chosen. For example, suppose that the point OUT is identified on some
element and that a parameter, say PRESSR, occurring at OUT has caused a
collection to be selected in which the flow at OUT is treated for identified
attachments named OUT. In this case, a copy of the statement will be produced for every identified attachment OUT except the current one for which
the parameter PRESSR was required. The "all except the current point"
rule thus becomes "all points" if the current point is not of the same
+
name. Furthermore, if the attachment name within the 0 scope is not
identified, the rule is to use the current point designation if the name
agrees with the attachment name, and otherwise to use the first occurrence of the
attachment name on the current element encountered in the System Definition.
These rules allow the user of the Element Description Library
to write simple but very powerful statements involving identified attachments. Summarizing, the user may:
1. Write a collection referring to all identified attachments
of an element.
2. Write a collection referring to all identified attachments
except the current one, if the current point name agrees with the
identified attachment.
3. Write a collection referring only to the first occurrence
of an attachment name or to the current point if it has the same name,
with priority awarded to the current point.
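The three rules summarized above may be expressed as a small selection routine. The following Python sketch is illustrative only; the function name and the representation of an attachment as a (name, identifier) pair are assumptions, and the list element_attachments is taken to be in the order of the System Definition.

```python
def attachments_for_statement(attach_name, identified, element_attachments, current):
    """Return the attachments for which a copy of the statement is produced.

    attach_name         -- attachment name written within the scope
    identified          -- True if the reference carries an identifier
    element_attachments -- (name, identifier) pairs of the current element
    current             -- (name, identifier) of the point that caused
                           the collection to be chosen
    """
    matches = [a for a in element_attachments if a[0] == attach_name]
    if identified:
        if current[0] == attach_name:
            # "All except the current point."
            return [a for a in matches if a != current]
        # "All points."
        return matches
    if current[0] == attach_name:
        # Priority is awarded to the current point.
        return [current]
    # Otherwise the first occurrence in the System Definition.
    return matches[:1]
```

With attachments OUT 1, OUT 2, OUT 3 and IN 1 on an element and OUT 2 as the current point, an identified reference to OUT yields OUT 1 and OUT 3, while an unidentified reference to OUT yields OUT 2 alone.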
+
Examples of the Use of 0.
EXECUTE TR BINl (OPRESS(INLET)O,OENHLPY( INET)
1, OPRESS(OUTLET)O, OENHLPY(OUTLET )O;OETAO
2 FLOW(OUTLET)0
The numeric 1 and 2 in column 11 are continuation marks exactly
as in the usual M.A.D. language.
The following group of statements might be used to sum the current
flow at an attachment that may be identified.
CURRNT =0.
CURRNT =0CURRNT(BASE, 1)0+CURRNT
If K other components were attached to BASE by using attachment
identification in the System Definition the result of the two previous
ELEMENT DESCRIPTION statements will be K+1 object program statements that
will produce the total current flow at the BASE attachment.
All other characters occurring outside the scope of 0 and/or
+
0 symbols, including blanks, are automatically passed directly to the
object program. Correct M.A.D. card formats are automatically produced,
as are continuation cards if needed. Remark cards are automatically produced,
as in M.A.D., by placing a character R in Column 11.
End of Element Description Declaration
As many statement collections can be given as may be required to
completely state the performance characteristics of the component. When
the description has been completed the process is terminated by giving the
declaration: DESCRIPTION FINISHED.
This declaration must be given. Failure to do so will cause
a simulator error that will finally throw the job off the computer.
Example of an Element Description
The following element description is intended to illustrate the
Element Description Language and would undoubtedly require more capability
to be of general use. The increased capability may be obtained by adding
as many more statement collections as needed.
ELEMENT DESCRIPTION. NAME OF ELEMENT TRBIN. BROAD
SCOPE PRESS, TEMP. PERMANENT.
STATEMENT COLLECTION.
FLOW(INLET), FLOW(EXPT), THEN, FLOW(EXIT).
FLOW1=0.
FLOW1=FLOW1+0FLOW(INLET,1)0
FLOW1=FLOW1-0FLOW(EXPT,1)0
0FLOW(EXIT)0=FLOW1
STATEMENT COLLECTION.
PRESS(INLET), FLOW(EXIT), THEN, PRESS(EXIT), PRESS(EXPT).
0PRESS(EXIT)0=0PRESS(INLET)0/0PRATIO0.(0FLOW(EXIT)0)
WHENEVER FIRST, READ FORMAT DATA(1), PDP0EXPT0
0PRESS(EXPT)0=(1.-PDP0EXPT0)*0PRESS(EXIT)0
STATEMENT COLLECTION.
PRESS, ENHLPY, FLOW(INLET), PRESS(EXIT),
THEN, ENHLPY(EXIT), KWH(SHAFT).
FLOW=0.
FLOW=FLOW+0FLOW(INLET,1)0
EXECUTE TRBIN1.(0PRESS(INLET)0,0ENHLPY(INLET)
1 0,0PRESS(EXIT)0,0ETA0.(FLOW1),FLOW,0ENHLPY
2 (EXIT)0,0KWH(SHAFT)0)
EQUIVALENCE (0PRESS(INLET)0,0PRESS(INLET,1)0)
EQUIVALENCE (0PRESS(EXIT)0,0PRESS(EXIT,1)0)
STATEMENT COLLECTION.
FLOW(INLET), WITHOUT(EXPT), THEN, FLOW(EXIT).
0FLOW(EXIT)0=0.
0FLOW(EXIT)0=0FLOW(EXIT)0+0FLOW(INLET,1)0
DESCRIPTION FINISHED.
IV. Utility Declaration
The foregoing three areas of information constitute a
necessary and sufficient amount of information for the simulator program
to accomplish the generation of programs. There are some additional
features of the simulator language which are not strictly essential but
which allow the user to produce better programs in some cases and to
produce programs more easily in others. Still other declarations are
for use in "housekeeping", that is, for making the job of Library Maintenance easier. These declarations and their associated statements are
called the "utility" declarations. The declarations are:
1) FUNCTION SUBSTITUTIONS.
2) SYNONYMS.
3) NEW ELEMENT TAPE.
4) NEXT SET OF DATA.
FUNCTION SUBSTITUTIONS Declaration
When the author of the library Element Descriptions so desires
the descriptions may be written so that minor changes can be made at the
time of the program execution. The proper method of writing this capability
into Element Descriptions was treated in that section. The remaining task
is that of allowing the user to exploit this capability in his programs.
This is done by giving a FUNCTION SUBSTITUTIONS. declaration followed by
function substitution statements. The declaration form is: FUNCTION SUBSTITUTIONS.
Function Substitution Statements
The function substitution statements convey the substitution to
be made and the scope of the substitution to the simulator. Any word of six
or fewer alphanumeric characters may be substituted for any other word
enclosed in 0 in a statement collection. The form of the statement is
defined as follows:
Let NUWORD be any new word of six or fewer characters.
OLDWRD be any old word of six or fewer characters.
Let EL be any symbolic element name.
EID be any symbolic element identification.
Then the allowable function substitution forms are:
NUWORD, OLDWRD.
NUWORD, OLDWRD(EL).
NUWORD, OLDWRD(EL,EID).
The effect of this statement is to replace OLDWRD, when it occurs
within 0 and at the element specified, by NUWORD. The scope of the
substitution is controlled as follows:
1) For the statement form, NUWORD, OLDWRD(EL,EID).
The substitution will take place only for statements
generated for EL, EID.
2) For the statement form, NUWORD, OLDWRD(EL).
The substitution will take place for statements generated
for any occurrence of EL.
3) For the statement form, NUWORD, OLDWRD.
The substitution will take place for any object program
statements generated.
If the function substitution declaration is not used, the original
contents of the 0 in the statement collection will be used. If the user
misspells the OLDWRD, the substitution will not occur. If the user misspells the NUWORD, the object program will contain the error.
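The three scopes of substitution may be sketched as a lookup. The following Python fragment is an illustration only. The text does not state which statement prevails when several apply to the same word, so the sketch assumes that the most specific scope wins; the function name and the tuple layout are likewise hypothetical.

```python
def resolve_function(old_word, element, element_id, substitutions):
    """Return the word to use in place of `old_word` at (element, element_id).

    substitutions -- tuples (new_word, old_word, el, eid); el and eid are
                     None when the statement omits them.
    """
    best_word, best_rank = old_word, -1
    for new_word, old, el, eid in substitutions:
        if old != old_word:
            continue  # a misspelled OLDWRD simply never matches
        if el is None:
            rank = 0  # form NUWORD, OLDWRD.
        elif el == element and eid is None:
            rank = 1  # form NUWORD, OLDWRD(EL).
        elif el == element and eid == element_id:
            rank = 2  # form NUWORD, OLDWRD(EL,EID).
        else:
            continue
        # Assumption: the most specific applicable statement is used.
        if rank > best_rank:
            best_word, best_rank = new_word, rank
    return best_word
```

When no statement applies, the original word is returned unchanged, in agreement with the rule that the original contents of the statement collection are then used.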
Examples of Function Substitutions
Suppose that the element description TRBIN contains the statement:
DELTAH=DELTAH*0ETA0.(0FLOW(INLET)0)
and several components TRBIN occur in the system. The characteristics of
each turbine may be different and the user may obtain different functions
to represent these characteristics by using the following statements:
FUNCTION SUBSTITUTIONS.
ETA1, ETA(TRBIN, 1).
ETA2, ETA(TRBIN, 2).
ETA3, ETA(TRBIN, 3).
When this is done the object program statements will use the
functions ETA1, ETA2, and ETA3 in place of ETA.
SYNONYMS Declaration
The synonyms declaration allows the use of different symbolic
names in writing connection points. This declaration also may be used to
condense the connection point symbol strings to a single word or a few
words. Both of these uses are introduced for the convenience of the user.
It is not necessary to use synonyms to write programs, but the use of
synonyms may be of considerable assistance.
The form of the declaration is: SYNONYMS.
The synonyms declaration is followed by any number of synonym
statements.
Synonym Statements
Synonym statements convey to the simulator the true symbolic
name or string and the symbols that are synonymous with the true name or
string. The only restriction is that the true name or string must be the first
part of the statement. Equal sign symbols are used to separate the true
part of the statement and any synonymous parts. If any portion of a
connection symbol string is written as a single zero (0) that portion
will be left untouched. Let the following symbols be defined:
Let ELT be any true element name;
EIDT be any true element identification;
ATT be any true attachment name;
AIDT be any true attachment identification;
ELS be any element name synonym;
EIDS be any element identification synonym;
ATS be any attachment name synonym;
AIDS be any attachment identification synonym.
Then the allowable forms of synonyms are:
Type I
ELT=ELS1=ELS2=...=ELSn.
0,EIDT=0,EIDS1=0,EIDS2=...=0,EIDSn.
(ATT)=(ATS)1=(ATS)2=...=(ATS)n.
(0,AIDT)=(0,AIDS)1=(0,AIDS)2=...=(0,AIDS)n.
The single zero must be punched as indicated. The substitution of
the true name will occur unconditionally for any synonym on the right.
Type II
ELT,EIDT=ELS1=...=ELSn.
ELT,EIDT(ATT)=ELS1=...=ELSn.
ELT,EIDT(ATT,AIDT)=ELS1=...=ELSn.
ELT(ATT)=ELS1=...=ELSn.
ELT(0,AIDT)=ELS1=...=ELSn.
(ATT,AIDT)=(ATS)1=...=(ATS)n.
In the preceding group, the occurrence of the single synonym
symbol will cause the use of the entire true symbol string.
Type III
ELT(ATT)=(ATS)1=...=(ATS)n.
ELT(0,AIDT)=(0,AIDS)1=...=(0,AIDS)n.
ELT(ATT,AIDT)=(ATS)1=...=(ATS)n.
In the preceding group, the synonyms are restricted to apply only to the true attachment names and/or attachment
identification associated with the true element name independent of element
identification.
Type IV
0,EIDT(ATT)=(ATS)1=...=(ATS)n.
0,EIDT(ATT,AIDT)=(ATS)1=...=(ATS)n.
0,EIDT(0,AIDT)=(0,AIDS)1=...=(0,AIDS)n.
In the preceding group, the synonym substitution occurs for all
elements identified EIDT regardless of the element.
Type V
ELT,EIDT(ATT,AIDT)=ELS,EIDS(ATS,AIDS)1=...=ELS,EIDS(ATS,AIDS)n.
In this type of synonym, the synonymous groups are replaced by
the true names in a one to one substitution.
No other synonym forms are allowed.
The utility of these groups is best illustrated by example.
1) Suppose the synonym statement is given: PUMP1=PMP1=PMP=P.
Whenever the user writes PMP1, PMP, or P as an element name in a
connections or input-output statement the name PUMP1 will be used as the true
name.
2) Suppose the synonym statement is given: PUMP1, MAIN=PMP1.
Whenever PMP1 is used as an element name in a connections or
input-output statement, the element name PUMP1 and the element identification
MAIN will be used as the true names.
3) Suppose the synonym statement is given: PMP1(OUTLET,PRIMRY)=(X1).
Whenever (X1) is used as an attachment with PMP1, regardless of
element identification, the symbols OUTLET and PRIMRY will be used as true
names.
As the user gains familiarity with the synonym capability,
large reductions in the amount of punching required for connections and input-output requirements may occasionally be obtained. It must be emphasized, however, that this is completely a matter of convenience and does not increase
the capability of the simulator.
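The simplest of the synonym forms, the Type I element name chain, may be sketched as follows. The Python fragment is illustrative only; it handles nothing beyond a chain of plain element names separated by equal signs, and the function names are hypothetical.

```python
def parse_element_synonyms(statement):
    """Parse 'TRUE=SYN1=SYN2=...=SYNn.' into a synonym-to-true-name table."""
    body = statement.strip().rstrip(".")
    parts = [part.strip() for part in body.split("=")]
    true_name, synonyms = parts[0], parts[1:]
    # The true name must be the first part of the statement.
    return {synonym: true_name for synonym in synonyms}


def true_element_name(name, table):
    """Replace a synonym by its true name; other names pass unchanged."""
    return table.get(name, name)
```

Applied to the first example above, the statement PUMP1=PMP1=PMP=P. causes PMP1, PMP and P each to be replaced by PUMP1, while any name not listed is left as written.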
NEW ELEMENT TAPE Declaration
The declaration NEW ELEMENT TAPE, if given, must precede the
entire collection of Element Descriptions that are to be used in the
program, and in particular precedes the group of statements known as
the Prologue and Epilogue. In this way, new collections of elements
can be made and new prologue and epilogue statements can be produced.
The form of the declaration is: NEW ELEMENT TAPE.
Immediately following this declaration the user must give a
set of prologue and epilogue statements. This collection of statements
will be common to every program generated using this element tape. The
prologue is charged with bringing in the input parameters, making certain
initializations, testing for the completion of the calculation and printing
the desired results. The epilogue is charged with transferring the program
back to the prologue for testing. The epilogue section is entered automatically at the end of the statements generated to simulate the system. The
prologue automatically precedes the statements generated to simulate the
system.
Rules for Writing Prologue and Epilogue Collections
The rules for writing the Prologue and Epilogue collections are
the same as for Element Descriptions with three exceptions:
1. The symbol 0 when encountered for the first and second time
triggers the repetitive generation of statements containing the input
parameters. Multiple copies of the statement will be generated; the copies will
differ only in the parameters, and the parameters will be grouped by attachment point. After the completion of this task, a complete dictionary of the
input parameters will be produced on Remark Cards.
+
The symbol 0 when encountered for the third time triggers the
repetitive generation of statements containing the desired results
parameters. Multiple copies of the statement are again generated, one
statement for each desired attachment point as before. On the completion
of this generation, a second dictionary of Remark Cards is produced for
the desired results.
This is the only permitted use of the symbol 0, and it must occur in
the Prologue. The symbol 0 is not permitted in the Epilogue. The use of
+ I
three 0 symbols thus allows 1) reading in the input parameters, 2) printing
of the input parameters for verification, 3) printing the desired result
parameters as solutions.
+
The symbol 0 as used in the Element Description is not affected
in any way by the Prologue and Epilogue rules.
2. The minus sign occurring in column 1 must appear twice in
the Prologue. The first occurrence marks the point after which the program
has completed testing and is ready to produce the desired results printing.
The second occurrence marks the end of the Prologue. Both minus signs in
column 1 must appear in the Prologue. The minus signs in column 1 must not
appear on Remark Cards.
3. The plus sign, occurring in column 1, must appear once at
the end of the Epilogue. The occurrence terminates the processing of the
Prologue and Epilogue. The plus sign is located on the last actual
executable statement card, and must not appear on a Remark Card.
Typical Prologue and Epilogue Collection
The following prologue and epilogue collection is offered as an
example of a generally useful collection. Many of the basic features of
this collection would be common to any collection of prologue and epilogue
statements. The statements themselves must conform to M.A.D. formats and
restrictions.
NEW ELEMENT TAPE.
Column
1 11
R SYSTEM SIMULATION PROGRAM
R PROLOGUE BEGINS
START READ FORMAT ICARD, NOTRYS
VECTOR VALUES ICARD=$7I10$
TRYCNT=1
FIRST=1B
REPEAT=0B
INTEGER NOTRYS, TRYCNT
BOOLEAN FIRST, REPEAT
READ FORMAT WORDS, DATA(1)...DATA(12)
VECTOR VALUES WORDS=$12C6*$
READ FORMAT DATA(1),0
PRINT FORMAT DATA, 0
DIMENSION DATA(12)
VECTOR VALUES DATA=$1H0$
TRANSFER TO BEGIN
BACK WHENEVER TRYCNT.L.NOTRYS.AND.REPEAT
REPEAT=0B
FIRST=0B
TRYCNT=TRYCNT+1
TRANSFER TO BEGIN
END OF CONDITIONAL
WHENEVER REPEAT, PRINT FORMAT REMARK,
1 NOTRYS
VECTOR VALUES REMARK=$18H0NO CONVERGENCE IN,I5,
1 8H TRIALS.*$
READ FORMAT WORDS, DATA(1)...DATA(12)
PRINT FORMAT DATA (1),
TRANSFER TO START
R END OF PROLOGUE
BEGIN CONTINUE
R END OF GENERATED SIMULATION PROGRAM
R EPILOGUE BEGINS
TRANSFER TO BACK
R END OF EPILOGUE
+ END OF PROGRAM
Program Continuation Declaration
If there is more than one system to be simulated in one approach
to the computer, a statement is needed to signal the end of one problem
and set signals to return for further problems upon completion of the
current one. The declaration accomplishing this signalling is: NEXT SET
OF DATA.
This statement must be contained on the card preceding the first
card of the next problem. Otherwise, the first card of the next problem
will be skipped in processing. If there is no next problem, there is no
NEXT SET OF DATA. declaration and return is made to terminate the simulator
program.
General Simulator Problem Considerations
The statements, assertions and declarations of the Simulator
Language may usually be presented without particular concern for ordering
and grouping of the statements. That is, CONNECTIONS declarations and
statements may be intermixed with INPUT PARAMETERS, DESIRED RESULTS,
FUNCTION SUBSTITUTIONS and SYNONYMS. Only those statements pertaining
to the Libraries are somewhat restricted in order. NEW ELEMENT TAPE must
precede the Prologue and Epilogue and all ELEMENT DESCRIPTION declarations,
assertions and statements. The statements in ELEMENT DESCRIPTION are
restricted somewhat. Reference should be made to that section of this
paper for the exact restrictions. Beyond this restriction it should be
noted that, while no error will result to prevent execution of the simulator,
considerable savings in time of processing can be made by placing all Element
Descriptions containing the assertion PERMANENT before those without this
assertion. If this is not done, the Temporary Library must be saved,
the new Permanent Library entry made, and the Temporary Library restored after
each Permanent entry. This is not a fast procedure at best but will be
done if required for processing.
The current simulator is restricted in the size of system that
can be simulated. The version for 704 Electronic Data Processing Machines
with 8192 word core storage, 8192 word drum storage and 6 tapes will
accommodate 200 connection statements. Simple revisions for 32768 word
core storage machines would accommodate 1000 connection statements. Each
connective TO produces one connection statement.
Provisions are made for 125 Element Descriptions. Each element
is allowed a maximum of 20 different kinds of attachments. If attachment
identifiers are used, no limit is placed on the number of identifiers for
each attachment. Each attachment and the system may treat up to a total
of 70 different parameter types. That is, each attachment is involved
with the same set of parameter types and the total number of different
types in the system may not exceed 70. The number of INPUT PARAMETERS
may not exceed 200. The same limit applies to DESIRED RESULTS. The
number of SYNONYMS may not exceed 400. The number of FUNCTION SUBSTITUTIONS
may not exceed 400. The number of card images allowed in one statement
generated for the object program is limited by M.A.D. to the first card and up to
nine continuation cards. The number of statements in a statement collection
as well as the number of statement collections in an element description is
limited only by the length of a magnetic tape reel. No practical limitation
is expected in this area for some time.
There is no limit on the number of times an element may appear
in a system. The only restriction is that each unique element attachment
may be used only once. This is a restriction to prevent ambiguity in the
system definition and not a size limitation.
The scope of a parameter is assumed to be narrow unless the
parameter is specifically defined to be a broad scope in every element
description in both libraries that refers to the parameter.
With the exception of the declaration NEXT SET OF DATA and the
statements for Element Descriptions the simulator statements may be prepared
anywhere within columns 1 through 72 of IBM cards and may run from card to
card or contain more than one statement per card. No continuation marks
are used but every simulator statement (but not the M.A.D. statements in
element descriptions), assertion and declaration terminates with a period
(decimal point, punched 12-3-8).
Remark Cards for M.A.D. that will be produced in every object
program carry an R in column 11. Remark Cards for the simulator current
job carry a division slash / in column 1. Any card with / in column 1 is
ignored, but printed, by the simulator.
THE STRUCTURE OF THE SIMULATOR TRANSLATOR
After a language for communication of information concerning a system
to be simulated is established, the job of the simulator program remains.
That job is the translation of the various statements allowed by the language
into an algorithm or solution procedure for the system simulation requested.
This is accomplished by several sections of program. The sections
and their function are:
1) Preprocessing
2) Desired Result Reduction
3) Program Generation
The preprocessing phase consists of decomposing, analyzing and regenerating the information from the source program statements in a form more
easily handled by the machine. Input Parameters and Desired Results are
saved in a very condensed form. Since each attachment point may have up to
70 parameters and these may fall into two groups (input parameters and desired
results), each point must retain information on 140 items. Thus 200
attachments require 28000 items to be stored. These items are, fortunately,
Boolean constants. In particular, the Boolean constant for an Input Parameter is 1 (True) if the parameter has been stated to be given. The constant
is 0 (False) otherwise. For desired results, the constant is 1 if the parameter is required as an output and 0 otherwise. Since the 704 computer is
a binary machine it is possible to identify each of the 36 binary digits in
the 704 word with a specific parameter and thus save the status of 36 parameters in a single storage location. The entire parameter status is compressed
into four words for each attachment by the Preprocessing section.
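The packing of parameter status into 36 bit words may be illustrated by a brief sketch. It is written in Python for illustration, not in 704 machine code; the assignment of parameter number to bit position is an assumption, as are the function names.

```python
WORD_BITS = 36        # binary digits in one 704 word
MAX_PARAMS = 70       # parameter types allowed in a system
WORDS_PER_GROUP = 2   # two words hold 72 bits, enough for 70 parameters


def pack_flags(flags):
    """Pack a collection of parameter numbers (0..69) into two words."""
    words = [0] * WORDS_PER_GROUP
    for index in set(flags):
        if index not in range(MAX_PARAMS):
            raise ValueError("parameter number out of range")
        # Assumed layout: parameter k is bit k mod 36 of word k // 36.
        word, bit = divmod(index, WORD_BITS)
        words[word] += 2 ** bit
    return words


def is_set(words, index):
    """Test whether parameter `index` is marked in the packed words."""
    word, bit = divmod(index, WORD_BITS)
    return (words[word] // 2 ** bit) % 2 == 1


def attachment_words(inputs, desired):
    """Four words per attachment: two for inputs, two for desired results."""
    return pack_flags(inputs) + pack_flags(desired)
```

On this layout the 28000 Boolean items for 200 attachments occupy 800 words, since each attachment requires four.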
A second task of the Preprocessing section is to generate an image
of the system to be simulated within the machine. This is done by forming
a connection matrix. This array retains the nature of each attachment point.
Each attachment is entered as the joint between two elements. Four items are
required to specify uniquely an attachment on an element and thus eight locations specify a connection. The result is an 8 x n matrix, where n is the
number of connections in the system. The matrix entries are the true names
of the elements, attachments and identifiers either as supplied directly by
the connection statements or as replaced by synonyms.
The connection matrix thus generated may be quite disordered so far
as efficient processing is concerned. After the matrix is completely entered
in the machine a sorting is done using an indirect list address array to
arrange the matrix in the order of occurrence of the Element Descriptions on
magnetic tape. The indirect address lists allow the matrix to remain stationary
in memory while the effective order is completely changed. Since the finding
of information on magnetic tape is the most time consuming of all the operations every effort is made to save tape movement. The ordering also provides
an effective method for locating all occurrences of an element in the connecting array without searching the entire array. This is done as follows.
Let the connection array be denoted:
EL1L EID1L AT1L AID1L EL1R EID1R AT1R AID1R
EL2L EID2L AT2L AID2L EL2R EID2R AT2R AID2R
...
ELnL EIDnL ATnL AIDnL ELnR EIDnR ATnR AIDnR
Where
EL is any true element name
EID is any true element identifier
AT is any true element attachment name
AID is any true attachment identification
and the subscripts iL and iR denote the attachment point
occurring to the "left" and to the "right" and the ith
such attachment point.
The ordering procedure is then:
1) Order the matrix by the ELiL (and group each
EL and EID) according to the arrangement of
descriptions on the magnetic tape. Call this
ordering vector "L".
2) Order the matrix by the ELiR in the same way
and call this ordering vector "R".
Let i be incremented from 1 through the number of connections, say n.
Let L(i) be the value of the ith location in the L vector. Then the
ith member of the ordered left-hand element names is found in row L(i) of the matrix. This type of list
addressing is known as "indirect" addressing.
3) Construct a vector for the vector L whose values
are the locations of the first occurrence of each
element addressed through L in the vector R. If
no such occurrence exists in R, then insert the
negative of the first occurrence of the element
in L itself. Call this vector "L TO R".
4) Construct a similar vector for the vector R relating the occurrences in R to L. Call this
vector "R TO L".
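The ordering procedure above can be sketched compactly. The Python below is an illustration under simplifying assumptions: each side of a connection is reduced to a single key standing for the identified element (the text groups by EL and EID), every key is assumed to appear on the tape, and positions are counted from 1 so that a negative entry is unambiguous. The function names are hypothetical.

```python
def ordering_vector(names, tape_order):
    """Return matrix row numbers (from 1) arranged in tape order.

    The matrix itself is never moved; only this index vector is sorted.
    """
    rank = {name: position for position, name in enumerate(tape_order)}
    rows = range(1, len(names) + 1)
    return sorted(rows, key=lambda row: rank[names[row - 1]])


def cross_vector(order_a, names_a, order_b, names_b):
    """Build "A TO B": for each position in ordering A, the first position
    of the same element in ordering B, or the negative of its first
    position in A when it does not occur in B."""
    first_in_b = {}
    for position, row in enumerate(order_b, start=1):
        first_in_b.setdefault(names_b[row - 1], position)
    first_in_a = {}
    for position, row in enumerate(order_a, start=1):
        first_in_a.setdefault(names_a[row - 1], position)
    result = []
    for row in order_a:
        name = names_a[row - 1]
        result.append(first_in_b.get(name, -first_in_a[name]))
    return result
```

For a three-connection system with left elements PUMP, VALVE, PUMP, right elements VALVE, TANK, TANK, and tape order TANK, PUMP, VALVE, the vectors are L = 1, 3, 2 and R = 2, 3, 1; both "L TO R" and "R TO L" are then -1, -1, 3.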
With these vectors the task is simplified for finding the entire set
of occurrences of any identified element. The location of all occurrences is
accomplished in the following way. (The method is given for L but is equally
valid, with appropriate changes, for R.)
1) If L TO R at a point is negative, the element does
not occur in R. The first occurrence in L is found
by taking the absolute value of L TO R.
2) Each entry in L agrees with the first as long as
the corresponding L TO R corresponds to the first
L TO R.
3) If L TO R is positive, the first L occurrence is
found by going first to the first occurrence of
the element in R and then back to L by using the
associated R TO L value. The same test as in 2
applies to equivalent identified elements.
4) If L TO R is positive, the value gives the first
R occurrence. The succeeding values in R are for
the same identified element so long as R TO L for
each succeeding value agrees with the first R TO L
value.
The remaining task of Preprocessing is to save all input-output
requirements and function substitutions for the Program Generation. In
addition, should the library complement be incomplete, the Preprocessing
must construct the library entries. Each Element Description is saved
as two files on the tape known as ELTAPE. The first saves the contents of
each collection capability in 80 word blocks. This allows two words for
input parameters and two words for desired results at each of the 20 allowable attachments. The words are in the Boolean form previously described.
Since two words allow for 72 parameters and only 70 parameters are allowed,
the remaining bits are available for special use. In particular, the last
bit in the desired result word is used to signal that this capability must
apply to elements without this attachment.
The second file contains the M.A.D. statements to generate the
capability contained in the first file. The first capability group in the
first file corresponds to the first collection of M.A.D. statements in the
second file and so on.
Upon completion of the Preprocessing, the status of the storage is
as follows:
1) The connection matrix is in and contains only true
names. The matrix is ordered and the input-output
requirements have been packed in Boolean parameter words.
2) The Element Descriptions are processed and saved in
groups of two files per description on tape ELTAPE,
with all permanent descriptions first.
3) The function substitutions and input-output parameters in complete notation are saved for the program generation phase on an erasable tape.
At this point, control is passed to the Desired Result Reduction
section. This section is charged with the actual generation of the algorithm
for simulating the system. The procedure for accomplishing this task is
almost the reverse of the usual procedure used by humans in attempting the
same task. The human approach, largely because of the extremely large
storage capacity of the human brain, is a search that proceeds from the
known parameters and is directed toward the desired results. This approach
could be implemented in the machine but because of storage limitations may
become quite unworkable. The difficulty is that the machine program cannot
reject a method until it can be shown to be unnecessary in the program to
obtain the desired results. Thus the program would be forced to enumerate
all the possible methods available from the input parameters plus the results
of the first set and so on. The number of methods available grows rapidly
and if the problem is well posed the desired results will eventually be
encompassed. However, this constitutes an exhaustive search with only a
small fraction of the methods actually of use.
Therefore a different approach is used. Essentially, the algorithm
is produced in reverse by working from the desired results toward the input
parameters. In this way every step generated is necessarily of use in the
program. The program is, of course, backward, in that the first statement
collection specified is the last one needed and so on, but this is easily
taken care of by the Program Generation section. The method of production
of the algorithm is the following:
1) Inspect the "Desired Result" Boolean words for each connection
point in the matrix. Whenever no Desired Result bits can be found in the
entire matrix the algorithm is completed.
2) Whenever a connection is found for which results are desired,
steps must be taken to satisfy the request for results.
2A) The requested results may be input parameters. If this
is so, remove the corresponding desired result bits.
2B) The requested results may occur at any identified attachment
and be of broad scope. If this is so and the result (as
an input parameter) can be found at any of the identified
attachments, remove the corresponding desired result bits.
2C) If requested results still remain after steps "2A" and "2B",
then some additional program must be added to obtain the results.
2C1) Find all of the statement collections for both elements
that occur at this attachment point that are useful in obtaining the requested results. That is, ignore any collections
that are "without" attachments specified for this element or
collections that do not happen to produce any of the desired
results.
2C2) Check each useful collection to determine its effectiveness. The effectiveness is the ratio of the number of requested results the collection produces to the number of new requests for results the collection will produce. A new request
for results will occur if any of the parameters required by
the collection is not an input parameter or already requested by previous statements.
If the number of new requests for results is zero, the
collection is always inserted in the algorithm. This collection produces results without requiring any new information.
(The only exception to this rule occurs with the iterative
ESTIMATE collection. In this case, no new information request is apparent; however, the ESTIMATE collection is restrained from inclusion in the program until the parameter
in question is found by at least one independent calculation
method.)
Otherwise, retain the effectiveness ratio as the weight
of the collection.
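The effectiveness computation of step 2C2 may be sketched as follows. The Python fragment is illustrative: parameters are represented as sets of names, the function name is hypothetical, a zero count of new requests is signalled by an infinite weight, and the special treatment of the ESTIMATE collection is not modelled.

```python
import math


def effectiveness(produces, requires, requested, inputs, already_requested):
    """Ratio of requested results produced to new requests created.

    Returns math.inf when no new requests arise, which marks the
    collection as one that is always inserted in the algorithm.
    """
    useful = produces.intersection(requested)
    # A requirement is a new request unless it is an input parameter
    # or has already been requested by previous statements.
    new_requests = requires.difference(inputs).difference(already_requested)
    if not new_requests:
        return math.inf
    return len(useful) / len(new_requests)
```

A collection that produces one requested result and needs one parameter not yet available has weight 1.0; one that produces two requested results at the cost of four new requests has weight 0.5.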
2C3) When all of the collections have been examined for
effectiveness and if desired results still remain, select
a set of the collections that will produce the requested
results.
At this point, the methods in which a parameter could
be found using a method only once at a given attachment
point are checked and discarded if already used. Otherwise,
these methods are simply placed in competition with any other
techniques available.
2C3A) First check to be certain that every requested
result can be found in at least one way. If any result cannot be so found, the problem may not be well
posed. The problem is not well posed if no "branches"
have occurred previously in the generation. A
"branch" occurs when a choice is made between more
than one method of determining a requested result.
2C3B) Whenever there is exactly one method for producing a result, this method must be included at
this point in the algorithm. The method is inserted,
the results produced by the method have the corresponding bits removed and any new requests occurring anywhere in the matrix have the corresponding bits inserted.
2C3C) After all single-method results are taken care
of, there remain only results for which there are several
methods of calculation. Since only one method will be
used for each result, the selection will constitute
a "branch" in the algorithm generation if the method
selected is not always forced to be the same one.
If one considers the available methods, each with
its associated weight, the simulator should tend to
choose the method of greatest weight. However, the
simulator should be allowed to select the method with
a probability of selection proportional to the weight. If this is not done, one may
anticipate that in some cases the method of greatest
weight may contain a parameter that is incapable of
calculation (considering the input parameters) and
therefore the program could not be generated. If,
however, the simulator makes the selection probabilistically, the method of greatest weight is most
likely to be selected but other methods may be selected
in its place. The probabilistic selection is automatically made and the "branch" Boolean constant is
set to one. In this way, if later there should arise
a case in which no method is available, the simulator
may make another trial and possibly work out a satisfactory algorithm by having the chance to choose another method at this point. This is a situation in
which the locally "best" method is not always the
globally "best" but tends to be so.
3) Steps 1 and 2 are repeated over and over. Each time the
requested results are satisfied, a new set of requests is generated, except when the request matches an input parameter. If the problem is well
posed then a sequence of methods may be found such that all desired results
are satisfied, through the sequence, by input parameters. When this has
been done, an algorithm for the simulation of the system has been produced.
The algorithm produced tends to be optimal since at each state
the method of greatest weight was most likely to be employed but the simulator cannot, with limited storage, view the generation of the algorithm
beyond a single step. Thus occasionally the simulator may produce several
steps that might be condensed if more information were available. In particular, it may happen that identical sets of statements may be produced
in the algorithm at different stages of the generation. This redundancy
is easily detected and the final algorithm will contain only the first
occurrence of the set.
The method of probabilistic selection is also used to discard
the least likely method should there be found too many methods to apply
at a given point.
The method of probabilistic selection for picking an item from
a group of n weighted items is the following:
Let Wi > 0 be the weight of the ith item from a group of
n total items, let
$$W = \sum_{i=1}^{n} W_i$$
be the total weight of the group, and let N0 be a random number selected from a uniformly distributed set of
random numbers on the interval from 0 to W.
Then the Kth member of the group of n items will be selected for
the smallest K such that
$$\sum_{i=1}^{K} W_i \ge N_0 .$$
N0 is most likely to fall in the subinterval for which Wi is maximum
but may fall anywhere in the interval. A modification of this method to
pick the least likely item (for discards, etc.) consists of defining a new set
of weights pi = 1/Wi and making the selection using pi in place of Wi. In particular, it should be noted that equally probable alternatives receive equal
chances and every alternative, no matter how small its weight may be, receives some consideration and may be chosen at any time. This method should
find many applications in future programs.
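The selection rule above can be sketched in a few lines of Python. This is only an illustrative sketch, not the original program: the function names `select_weighted` and `select_least_likely` are hypothetical, and the pseudo-random generator of the Python standard library stands in for whatever source of uniform random numbers the simulator used.

```python
import random

def select_weighted(weights, rng=random):
    """Return the (0-based) index K chosen with probability proportional to weights[K]."""
    if not weights or any(w <= 0 for w in weights):
        raise ValueError("weights must be a non-empty sequence of positive numbers")
    total = sum(weights)               # W, the total weight of the group
    n0 = rng.uniform(0.0, total)       # N0, uniform on the interval from 0 to W
    running = 0.0
    for k, w in enumerate(weights):
        running += w                   # partial sum W1 + ... + WK
        if running >= n0:              # smallest K whose partial sum reaches N0
            return k
    return len(weights) - 1            # guard against floating-point round-off

def select_least_likely(weights, rng=random):
    """Pick an item for discard by selecting with the reciprocal weights 1/Wi."""
    return select_weighted([1.0 / w for w in weights], rng)
```

Over many trials the item of greatest weight is chosen most often, yet every item keeps a nonzero chance, which is the property the text relies on when a later trial must be able to take a different branch.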
Since there is no way to predict either the number of parameters
that may be needed at a point or the number of methods available for any
parameter, it is necessary to allow an extremely flexible storage assignment so that the storage may be completely used. This is done by means
of an "associative memory" list for the storage region. This list functions
as follows:
1) Associated with each parameter at the attachment point is a
storage location whose value is:
1A) Zero if the parameter is not required.
1B) Minus one if the parameter is required and no method
has yet been found to yield the parameter.
1C) Otherwise, the value is a positive integer giving the location of the beginning of the list of methods for this parameter
in the "associative memory".
2) Each entry in the associative memory list gives the location
of the next (associated) entry. The final entry is denoted by a minus sign.
3) New entries are made by consulting the associative memory list
beginning at the zeroth location. The value of this location is the next
available location in the memory. An addition to the end of any list is
made in the available location, the value that was in this location is stored
in the zeroth location, and the former list end is changed to refer to the new
list end.
4) Whenever a result collection is selected, the storage space is
reassigned as available storage by placing the starting location for the list
in the zeroth location, and the value formerly in the zeroth location at the
end of the list being removed. In this way the entire list is made available
with only two storage reassignments.
5) If the capacity of the storage is exceeded before all the parameters have been treated, space can be created by selecting the parameter
with the greatest number of methods and picking the least likely method
using the probabilistic selection technique. The location thus chosen is
made available by giving its address to the zeroth location and reassigning
the preceding list location to skip this location and refer to the next
item in the list.
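The list handling in rules 2 through 4 can be illustrated with a small Python sketch. It is a modern restatement under stated assumptions, not the original storage layout: the class name `LinkedStore`, the use of -1 as the end marker (standing in for the "minus sign"), and the explicit tracking of each list's last cell by the caller are all choices made here for illustration.

```python
class LinkedStore:
    """Several linked lists sharing one array, with cell 0 heading the free list."""
    END = -1  # assumed stand-in for the "minus sign" that marks a final entry

    def __init__(self, size):
        # link[i] is the cell that follows cell i; initially every cell is free.
        self.link = list(range(1, size)) + [self.END]
        self.item = [None] * size

    def append(self, head, tail, value):
        """Add value to the end of a list; returns the (head, tail) cells of the list."""
        cell = self.link[0]                 # next available location
        if cell == self.END:
            raise MemoryError("associative memory is full")
        self.link[0] = self.link[cell]      # value formerly in the cell goes to cell 0
        self.link[cell] = self.END          # the new cell becomes the list end
        self.item[cell] = value
        if tail is None:                    # first entry of a new list
            return cell, cell
        self.link[tail] = cell              # former list end refers to the new end
        return head, cell

    def release(self, head, tail):
        """Return a whole list to free storage with two link reassignments."""
        self.link[tail] = self.link[0]
        self.link[0] = head

    def items(self, head):
        """Collect the values of a list, following the links from its head."""
        values, cell = [], head
        while cell != self.END:
            values.append(self.item[cell])
            cell = self.link[cell]
        return values
```

Because a released list is spliced onto the front of the free list as a unit, the cost of freeing it does not depend on its length, which matches the "only two storage reassignments" noted in rule 4.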
Upon completion of the removal of all the desired results and those
created during the removal of others, the algorithm is completed and written
in reverse order on magnetic tape. This type of storage is used because it is
not possible to predict the storage required for a program in advance. This
is somewhat unfortunate since the program is generated in the form of a "push
down" list. A push down list is a list such that each entry occurs at the
beginning (rather than the end) of the list and thus moves the former first
item to second place, the former second to third, and so on. Thus the items
are "pushed down" on the list. Removal of items from the list occurs from the
beginning of the list with the last item entered. Thus the effective order
of the list is reversed. This is precisely the action that must occur in the
program generated, since the last statement collection found must be the first
used in the program and the first collection found is the last one used in
the program. The difficulty is resolved by moving the tape backward two records and forward one, working from the last record written toward the first.
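The tape motion just described can be imitated with a short Python sketch. The list `records` is a hypothetical stand-in for the tape, the integer `position` for the head, and the function name `read_reversed` is introduced here only for illustration; the sketch assumes the head starts just past the last record written.

```python
def read_reversed(records):
    """Yield records last-written first, using only forward reads and backspacing."""
    position = len(records)          # head rests just past the last record written
    if position == 0:
        return
    position -= 1                    # backspace once to reach the last record
    while True:
        yield records[position]      # reading forward one record...
        position += 1                # ...leaves the head just past it
        if position < 2:             # the first record has now been read
            break
        position -= 2                # backward two records, ready to read forward one

collections_found = ["found first", "found second", "found last"]
program_order = list(read_reversed(collections_found))
```

The result is the same order that popping a push down list would give, obtained without knowing in advance how many records the algorithm will need.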
As this process is begun after the completion of the algorithm, the
simulator is at this point generating the simulation program using the
Program Generator. The first output of the Program Generator is the Prologue and its associated input-output statements. The Prologue is followed
by the program. The algorithm is stored in a short code giving: the connection matrix row number and index in the L vector, so that the unique connection can be located by the Program Generator; the element description
name, given by position in the element name vector and tagged with a plus (+)
sign if the element involved was the leftmost (and a minus (-) sign if the
rightmost); and finally the statement collection number. The Program Generator
section moves to the second file in the desired element description and next
to the appropriate statement collection. Finally, the statement collection is
processed and produced both on cards and in print to form the desired simulation program. The rules by which the processing takes place have been stated
in the section describing Element Descriptions. Briefly, the M. A. D. statements are written using floating statement labels and special codes for
function substitutions and for parameter and attachment codes. The Program
Generator assigns unique fixed statement labels for the floating statement
labels. Any function substitution is checked for possible modification. If
a substitution has been requested, the substitution is made; otherwise the
original text is retained. The parameter and attachment codes are reduced
to a six-character variable name code for each parameter-attachment combination
occurring. Some of the six characters may be blanks. Nonidentified attachment parameters are immediately coded and inserted in the output statement.
Identified attachments cause multiple copies of the statement to be generated.
One copy is made for each different identifier. To avoid possible ambiguity,
only one attachment may be identified in each statement. However, this
attachment may occur any number of times with any number of parameters within
one statement. In this way the effect of a special junction element is produced without specifically requiring such an element.
As noted in the Collection Capability section, there is one exception
to this rule. Namely, when an attachment is identified and the current point
in the connection matrix agrees exactly with this attachment name, a copy of
the statement is produced for all occurrences except the current one. Finally,
if an attachment is not identified, the current attachment point will be selected
if the name agrees with the name occurring in the collection statement. Otherwise, the first occurrence of the name is used in the code.
Upon completion of the program generation the control is returned
to the Preprocessing Section to process any other system simulation problems
that may be waiting. Since the output of this program is a program in M. A. D.
code and on punched cards, the simulation program may be used as it is produced
or modified easily before using it to simulate the system. To use the program
as it is generated, the user need only supply the data and special subroutines
needed and the special cards needed by the executive system for the data processing system. The program will be translated into machine code and executed
using the data supplied.
II. STEPWISE REGRESSION PROGRAM WITH SIMPLE LEARNING
The representation of the characteristic performance of the various components of a system is vital to the simulation problem. It is not
sufficient to obtain a relation which merely fits the available data, if
the relation is to be used for predictive purposes, because such a relation
may bear only superficial resemblance to the actual performance at other
points. A much more desirable relation would consist of terms suggested by
the nature of the physical laws governing the component performance but
using only those terms which may be shown to be substantiated by the available data. The Stepwise Regression Program was written to establish this
relation and produce, in addition to the analysis of the data as just
described, the actual M. A. D. statements needed for the predicting equation
subroutine.
The use of simple learning by the program allows the program to
deal with a much more general solution of the predicting equation problem
than has been previously possible. The usual engineering problem consists
of many independent variables (pressure, temperature, load, etc.) which
affect the behavior of the dependent variable (e.g., efficiency, loss, etc.)
that is to be predicted. In addition, these variables are usually found to
enter in nonlinear manners (e.g., raised to powers or roots or even more
complicated forms). Also, in the usual problem the dependent variable
performance is often affected by interaction between the independent
variables and functions of the independent variables. The simplest sort
of example of such an interaction is the Perfect Gas relation
$$PV = MRT$$
In this case, the variables P and V interact so that T may not
be determined by a relation consisting of terms using P alone plus terms
using V alone but may be found by using a term involving the interaction
between P and V.
Thus the size of an engineering problem of several independent
variables, each of which may be represented by several functions, grows
very rapidly when all possible interactions are allowed. An illustration
will indicate the magnitude of the problem.
A common selection of twenty functions for a single independent
variable problem, that is, a problem which may be expressed as
$$Y = \sum_{i=1}^{20} b_i F_i(X),$$
will require about 30 seconds to solve on the IBM 704 using conventional
stepwise regression techniques. If an apparently only slightly more
complicated problem involving three independent variables, each of which
has twenty functions, were to be attempted, considering all interactions,
the number of terms to be considered increases from 20 to 9260, the size
of the matrix involved grows from 21 by 21 to 9261 by 9261, and the IBM 704
time becomes approximately 2500 machine hours.
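The term count quoted above can be checked with a short calculation. It rests on an assumption about how the interactions are formed, namely that every term is a product in which each of the three variables either contributes exactly one of its twenty functions or is absent, with the empty (constant) product excluded:

```latex
\[
(20 + 1)^3 - 1 = 9261 - 1 = 9260 .
\]
```

Restoring the one extra row and column gives the 9261 by 9261 matrix; the same reasoning for a single variable gives $(20+1)^1 - 1 = 20$ terms and a 21 by 21 matrix.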
It is clear that, without a technique capable of reducing this
problem by several orders of magnitude, the general problems encountered
in engineering will have to be treated in strongly simplified terms.
Indeed, this has been precisely the motivation for earlier linearized
system models.
The simple learning mechanism developed for use with the Stepwise
Regression Program has been used successfully to produce predicting relations
in much less time than required for more conventional methods. The following
discussion will describe, first, the Stepwise Regression Program and,
second, the Simple Learning mechanism employed by the program. This program
represents one of the first applications of "artificial intelligence" in an
area of immediate practical interest.
Discussion of Stepwise Regression
The following describes a computer program useful in determining
the relationships existing among a group of up to 60 variables or functions
of variables at each program pass. Taking one of the variables to be a
dependent variable, the program results in a linear predicting equation
using the current set of predictor variables or terms and selecting from
this set a "minimal" set. The program allows simple learning to occur
concerning the most satisfactory terms, thereby extending the
usefulness in determining equations that take account of possible variable
interactions of all orders. The program further allows the generation of
equations using either stepwise buildup or stepwise purification at the
discretion of the user.
This discussion concerns some extensions, carried out by the
author, of the work originated by Mr. M. A. Efroymson of Esso Standard
Research and Engineering Co. and carried forward by Mr. J. E. Dallemand
of General Motors Research Staff. The problem considered is that of
determining a predicting equation from a collection of data. The method
of analysis deals with the situation which arises when data have been
collected on many variables, of which one is regarded as a dependent or
response variable and the remainder of the set is regarded as a set of
independent or predictor variables. It may be anticipated that the
method will be useful in experimental situations involving unknown
complicated interactions between many variables and complicated relationships (functions) of the variables. In particular, when the data are
already available, or where it is difficult to control variables systematically, or where the conduct of a systematic experiment would disrupt
the normal operation of a system too severely, this method will be useful.
Specifically, this method is useful in obtaining answers to
questions like the following:
(1) What linear combination of the independent variables, or
functions of the independent variables, or interactions
(cross products) of independent variables and functions of
independent variables best explains the data on the
dependent variable?
(2) How good is this relationship (obtained in (1))?
(3) What is the linear relationship between the "best" single
independent variable (function of an independent variable,
or interaction) and the dependent variable? Also, what is
the relation for the "best" two, three, or other subset of
the possible predictor terms?
(4) For each subset of (3), how good is the relationship?
(5) What is the smallest set of predictor terms that will make
statistically significant contributions toward explaining
the statistical variation in the dependent variable? (The
user may set the level of significance.)
(6) How good is the relationship in (5) and how good is the
prediction?
(7) How much of the behavior of the dependent variable is still
unexplained by the equation? (This is the Standard Error of
Estimate.)
(8) If there is theoretical justification for suggesting certain
terms to explain the behavior of the dependent variable,
what is the "best" relationship for this set?
(9) How good is this relationship (8) and what can be done to
explain the behavior not explained by the present theory?
I. The Stepwise Regression Method
The Stepwise Regression analysis deals with a set of p
independent variables denoted X1, X2, ..., Xp and a single dependent
variable Y. Let N be the number of observations made on each of these
p+1 variables, yielding N(p+1) data in all. The objective of the analysis
is to generate a relation of the form
$$Y = b_0 + b_1 X_1 + b_2 X_2 + \cdots + b_p X_p . \qquad (1)$$
The bi, i = 0, 1, 2, ..., p, are the coefficients or multipliers
of the various Xi. It should be clear that one could not distinguish
between the previous case of p independent variables and the case of p
linearly independent functions of a single independent variable or any
other combination of numbers of independent variables and function choices
for these variables totalling p terms in all. Therefore, the discussion
here treats the problem as if there were p independent variables without
loss of generality.
To focus these statements on a physical problem, consider the
following: Suppose that measurements have been made of the electrical
losses of a hydrogen cooled generator. Figure 1 shows the general behavior
of the variables and indicates that at least three factors must be considered. It is assumed that measurements or observations are available of
(1) gross electrical load on the generator, (2) hydrogen pressure, and
(3) power factor, as well as the corresponding electrical loss.
The formal relation (1) might be interpreted as the linear
relation:
$$\mathrm{GENLOS} = b_0 + b_1 \cdot \mathrm{GKW} + b_2 \cdot \mathrm{HPRESS} + b_3 \cdot \mathrm{PFCTOR}$$
where
GENLOS = generator electrical loss
GKW = gross generator load
HPRESS = hydrogen pressure
PFCTOR = power factor
However, Figure 1 indicates that such a linear relation may not
represent the actual behavior. More complicated analytical models may be
suggested to the Stepwise Regression Program by making appropriate definitions of some pseudovariables.
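Suppose that the pseudovariables Xi are defined:
$$\begin{aligned}
X_1 &= \mathrm{GKW}, & X_2 &= \mathrm{GKW}^2, & X_3 &= \mathrm{GKW}^3, \\
X_4 &= \mathrm{HPRESS}, & X_5 &= \mathrm{HPRESS}^2, & X_6 &= \mathrm{HPRESS}^3, \\
X_7 &= \mathrm{PFCTOR}, & X_8 &= \mathrm{PFCTOR}^2, & X_9 &= \mathrm{PFCTOR}^3,
\end{aligned}$$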
and, of course, the list may be longer and as complicated as needed to
describe the physical problem. The "standard" types of terms automatically
available to every problem include integer powers, integer roots, and the
reciprocals of these terms. Provision is made to insert any other special
terms desired as well (such as logarithms, exponentials, etc.). Then a
relation of the form (1) is:
$$\mathrm{LOSS} = b_0 + b_1 X_1 + b_2 X_2 + \cdots + b_9 X_9$$
or its equivalent
$$\mathrm{LOSS} = b_0 + b_1 \cdot \mathrm{GKW} + b_2 \cdot \mathrm{GKW}^2 + \cdots + b_9 \cdot \mathrm{PFCTOR}^3$$
[Figure 1. Generator electrical losses as a function of load, hydrogen pressure, and power factor. Horizontal axis: generator load (GKW). Symbolic names: Generator Electrical Loss = GENLOS; Generator Load = GKW; Hydrogen Pressure = HPRESS; Power Factor = PFCTOR.]
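Again, it often happens that interaction may occur between the
variables and the functions of variables. Once again a relation of
form (1) may result by defining:
$$\begin{aligned}
Z_1 &= X_1 = \mathrm{GKW} \\
Z_2 &= X_2 = \mathrm{GKW}^2 \\
&\;\;\vdots \\
Z_9 &= X_9 = \mathrm{PFCTOR}^3 \\
Z_{10} &= X_1 X_4 = \mathrm{GKW} \cdot \mathrm{HPRESS} \\
Z_{11} &= X_1 X_5 = \mathrm{GKW} \cdot \mathrm{HPRESS}^2 \\
&\;\;\vdots \\
Z_{36} &= X_6 X_9 = \mathrm{HPRESS}^3 \cdot \mathrm{PFCTOR}^3 \\
Z_{37} &= X_1 X_4 X_7 = \mathrm{GKW} \cdot \mathrm{HPRESS} \cdot \mathrm{PFCTOR} \\
&\;\;\vdots \\
Z_{63} &= X_3 X_6 X_9 = \mathrm{GKW}^3 \cdot \mathrm{HPRESS}^3 \cdot \mathrm{PFCTOR}^3
\end{aligned}$$
The formal relation (1) is now
$$\mathrm{LOSS} = b_0 + b_1 Z_1 + b_2 Z_2 + \cdots + b_{63} Z_{63}$$
or its equivalent
$$\mathrm{LOSS} = b_0 + b_1 X_1 + b_2 X_2 + \cdots + b_{63} \left( \mathrm{GKW}^3 \cdot \mathrm{HPRESS}^3 \cdot \mathrm{PFCTOR}^3 \right).$$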
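The enumeration of the 63 interaction terms can be reproduced with a short Python sketch. It is illustrative only: the function names are hypothetical, each term is represented simply as a mapping from variable name to power, and the order in which the terms are produced is not claimed to match the Z numbering used above.

```python
from itertools import product

def interaction_terms(variables, max_power=3):
    """List every product of the variables raised to powers 0..max_power, constant excluded."""
    terms = []
    for powers in product(range(max_power + 1), repeat=len(variables)):
        if any(powers):                      # skip the all-zero (constant) term
            terms.append(dict(zip(variables, powers)))
    return terms

def evaluate(term, observation):
    """Value of one term for one observation, given as a name -> measurement mapping."""
    value = 1.0
    for name, power in term.items():
        value *= observation[name] ** power
    return value

terms = interaction_terms(["GKW", "HPRESS", "PFCTOR"])
single_variable_terms = [t for t in terms if sum(1 for p in t.values() if p) == 1]
```

With three variables and powers up to three, the list holds 4 x 4 x 4 - 1 = 63 terms, of which 9 involve a single variable, in agreement with the pseudovariables X1 through X9 and the terms Z1 through Z63.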
The problem consists of finding those Z's which contribute to the
explanation of the dependent variable (LOSS) with sufficient importance, as
indicated by the measured data, to allow their retention in a predicting
equation. And, having found the set of Z's meeting the importance criterion, the problem continues to the determination of the best possible estimates for the b's. In this way, a minimal relation is generated which may
be used to predict LOSS for given values of GKW, HPRESS, PFCTOR. This
relation is automatically generated by the Stepwise Regression Program.
In addition, the Stepwise Regression Program produces on punched cards the
M. A. D. function corresponding to the generated relation and having any arbitrary function name desired. In this case, suppose that the desired
function name is GENLOS. The program would produce the M. A. D. External
Function
GENLOS.(X1, X2, X3)
where X1, X2, and X3 are now symbolic names for the arguments GKW, HPRESS,
PFCTOR, in machine translatable form ready for inclusion as part of a simulation program (or any other application). Thus one may later write the
relation
NETKW = GKW - GENLOS.(GKW, HPRESS, PFCTOR) - MECLOS
as a M. A. D. statement to be used in a simulation program, and the result
will be the net power generated (NETKW). It is clear that no loss of generality has resulted by considering the formal relation (1). It should
also be clear that the X terms in (1) may represent either the actual
measurements of the independent variables or functions of these measurements without requiring any change in technique.
In the remainder of this discussion, the symbol X will be used and the
meaning may be understood in its most general sense.
The bi in (1) are determined in such a way that, if one forms the
sum of the squares of the differences between the observed values of Y and
the predicted values of Y arising from the use of (1), then that sum will
be minimum. Notice that the process of squaring the differences insures that
all errors, positive and negative, contribute toward increasing the sum.
This is commonly referred to as the method of "least squares."
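In symbols, the least squares criterion may be written as follows. The subscript t, marking the t-th of the N observations, is notation assumed here for illustration, and unit weights are assumed for every observation:

```latex
\[
S(b_0, b_1, \ldots, b_p) \;=\; \sum_{t=1}^{N} \Bigl( Y_t - b_0 - \sum_{i=1}^{p} b_i X_{it} \Bigr)^{2} ,
\]
```

and the coefficients are those values for which S is smallest. If the observations carry unequal weights, each squared difference would presumably be multiplied by the weight of its observation before summing.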
The importance of the Stepwise Regression method lies in the process
of "building" the expression (1) a term at a time, always insisting that the
terms be inserted in order of their relative importance to the explanation
of the behavior of Y. Furthermore, checks are made continually regarding
the continued importance of terms in the equation, and only those terms will
be inserted into (or removed from) the equation (1) that meet certain
significance tests which can be controlled by the user. Thus the final
equation will comprise a "minimal" set of terms. Since terms may be removed
from the equation as well as inserted into the equation, the method of
Stepwise Regression also allows the generation of a relationship by "purifying" an initially large set of terms with very little added burden to the
user. Experience indicates that the purification process occasionally
produces valuable additional information in certain problems.
A number of statistics are computed before the task of building
the predicting equation begins. These statistics may be printed out to help
give further insight into interrelationships in the data and are used by
the program for executing the task. Included among these statistics are
the mean (average value) for each variable, the standard deviation (a measure
of variability) for each variable, and the correlation coefficient for each
pair of variables. The correlation coefficient measures the linear relationship existing between the pair of variables, and ranges from +1.00 (perfect
direct relationship) through 0.0 (no relationship) to -1.00 (perfect inverse
relationship).
Figure 2 shows the interpretation of the standard deviation and
the mean. If the scatter of data is due to random uncontrollable error,
then the Gaussian distribution will model the variability with respect to
the predicting equation. Taking the mean or average value to be that
indicated most likely by the data, the width of plus and minus one standard
deviation will embrace an interval about the mean within which the expectation of the true value is 68%. As indicated by the figure, if the interval
is doubled, the expectation grows to more than 95%, and if the interval
triples, the expectation is 99.8%. In other words, based on the data
measured on the physical component in question, one may expect to encounter
a true value of the dependent variable lying more than three standard
deviations away from the predicting relation value with a long term frequency of 1 in 500.
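The percentages quoted for the Gaussian distribution can be recomputed from the error function. The sketch below is only a numerical check using the Python standard library; the function name `fraction_within` is introduced here for illustration.

```python
import math

def fraction_within(k):
    """Fraction of a Gaussian population lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    inside = fraction_within(k)
    outside_odds = 1.0 / (1.0 - inside)
    print(f"{k} sigma: {100 * inside:.2f}% inside, about 1 in {outside_odds:.0f} outside")
```

The computed fractions are roughly 68.3%, 95.4%, and 99.7%, so the figures of 99.8% and 1 in 500 given in the text for three standard deviations are best read as round numbers; the computed frequency is nearer 1 in 370.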
Figure 3 illustrates this discussion with respect to a predicting
equation. If the predicting equation produces the estimate of the true
value of the predicted variable indicated by the central heavier curve,
then the bands to either side may be understood to indicate the range within
which the true value may be expected to lie with the stated frequencies.
Thus a predicting equation with very small standard error of estimate will
more accurately represent the true behavior of the variable than will a
predicting equation with large standard error of estimate.
II. Generation of a Predicting Equation
Consider a simple example: suppose that an experiment has
been made consisting of a set of observations of six variables. Regarding
one of the six as a dependent variable and the remaining five as predictor
or independent variables, the analysis determines the "minimal" set of
variables which may be used in a relation of form (1), where, in this
case, p = 5.
[Figure 2. The Gaussian distribution. Horizontal axis: deviations in standard deviation units.]
[Figure 3. Predicting equation Y = F(X) as an approximation to true values of Y. Cross section A-A shows the probability that the true value of Y lies within the interval.]
The first step is to find that variable Xi which best predicts
Y. This is done by correlating each of the Xi to Y and then selecting that
Xi which has the greatest "correlation coefficient"* in absolute value.
If more than one Xi shares the largest value, take the Xi with the lowest
subscript i; that is, take the first such Xi encountered. Suppose that in
this instance the best i is 4. The first predicting equation is then
$$Y = b_0 + b_4 X_4 \qquad (2)$$
The b0 and b4 satisfy the least squares criterion.
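The first step can be sketched in Python as follows. This is a minimal illustration, not the original program: the function names are hypothetical, every observation is given unit weight, each column is assumed to show some variation (otherwise the correlation is undefined), and columns are numbered from zero rather than from one.

```python
import math

def correlation(x, y):
    """Product-moment correlation coefficient of two equally long columns (unit weights)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def first_step(columns, y):
    """Return (index, b0, b) for the single predictor with the largest |correlation| to y."""
    best, best_r = 0, abs(correlation(columns[0], y))
    for i in range(1, len(columns)):
        r = abs(correlation(columns[i], y))
        if r > best_r:                 # strict comparison keeps the lowest index on ties
            best, best_r = i, r
    x = columns[best]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    return best, mean_y - slope * mean_x, slope   # least squares b0 and b for one predictor
```

For a single predictor the least squares coefficients have the closed form used in the last lines, so no matrix work is needed until a second term enters the equation.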
Succeeding steps are of slightly different form. First, the Xi
are sorted into two subsets Xi,1 and Xi,2. The set Xi,1 consists of all
those variables that are in the predicting equation at the time of sorting.
The set Xi,2 consists of all those variables that are not yet in the
predicting equation.
* The correlation coefficient is defined as the product-moment coefficient
of correlation: Let
$$x_i x_j = \sum_t W_t X_{it} X_{jt} - \frac{\left( \sum_t W_t X_{it} \right)\left( \sum_t W_t X_{jt} \right)}{\sum_t W_t}$$
where t ranges over the observations,
n = number of independent variables,
j = i, i+1, i+2, ..., n+1,
i = 1, 2, ..., n+1.
Then let
$$\sigma_i = \sqrt{x_i x_i}, \qquad i = 1, 2, \ldots, n+1$$
and the correlation coefficient r is then
$$r_{ij} = \frac{x_i x_j}{\sigma_i \sigma_j}$$
with the properties
$$r_{ji} = r_{ij}, \quad i, j = 1, 2, \ldots, n+1; \qquad r_{ii} = 1.000; \qquad -1 \le r_{ij} \le 1 .$$
For each of the members of Xi,1 the analysis computes an "importance factor"** which is a measure of the relative contribution of the
variable to the predicting equation. The smallest of these importance factors
is isolated. If the variable associated with this factor is less important
than the user requires for the variable to be retained in the equation, then
that variable is removed from the equation before continuing.
The scale used to determine whether a variable meets the "importance" criterion is simply the probability or chance that the user is willing
to take that a variable may be left in the predicting equation that should
have been removed. Figure 4 illustrates the nature of this "importance"
scale. The F-test measures the extent to which a variable will contribute
toward explaining the dependent variable behavior, and tests this contribution against a purely chance correlation by comparing the variance with and
without the term. The hypothesis tested is that the variance is equal in
both cases and that any difference is due only to chance. Thus, the term
will be used only when the difference in variance cannot be explained by
chance alone. Thus, in Figure 4, if one selects a probability of committing
an insertion error (that is, inserting a term into the predicting equation
that really does not belong in the equation) and finds the number of degrees
of freedom (roughly the number of weighted observations), the value of F
indicated by the surface is such that if the value of F displayed by the
"best" term exceeds the value on the surface, then the risk is less than the
probability chosen. As might be expected, the value of F on the "threshold"
surface goes to zero as the probability goes to 1 (certainty of committing
an error). In that case, any nonnegative value of F equals or exceeds the
"threshold" and the result would be to insert every term whether correlated
or not. Conversely, if one goes toward zero probability (certainty of not
committing an error) the threshold value grows, approaching infinity in the
case of zero probability. Thus no value of F can exceed this threshold and
so no terms can be inserted. For any reasonable probability, the effect of
the number of data can be assessed. As the number of data grows large, the
"threshold" value approaches a constant dependent only on the probability.
As the number of data approaches zero, the risk of error is held constant by
requiring larger and larger F values, with infinity as the value corresponding to "no data." With such a test, the Stepwise Regression Program can
control the generation of a predicting equation so that each term possesses
a maximum risk of appearing incorrectly. Of course, many, perhaps most,
terms actually appearing in the final equation exceed the threshold by
substantial amounts and thus represent greatly reduced risks. The test
insures that every term is at least as good as the risk specified.
** The "importance factor" is found by using the variance contribution for
each variable. Initially, the correlation matrix rij defined earlier is
equal to the regression matrix aij:
$$a_{ij} = r_{ij}; \qquad i, j = 1, 2, \ldots, n+1 .$$
Then the variance contribution for the ith variable is:
$$V_i = \frac{a_{iy}\, a_{yi}}{a_{ii}}, \qquad i = 1, 2, \ldots, n$$
where the subscript y is understood to be the dependent variable
subscript (n+1).
NOTE: If Vi > 0, then Xi is not yet in the regression equation and the
Vi > 0 may be regarded as the relative contribution by the respective
Xi in explaining the as yet unexplained variance in the dependent
variable Y.
If Vi < 0, then Xi is currently in the regression equation. The Vi
for all Vi < 0 may be regarded as the relative contribution by the
respective Xi to the regression prediction of Y.
As each term is added or deleted from the regression equation, the
regression matrix aij is modified to contain the corresponding effect.
If the user takes a probability of error for removing variables of
0.05, then the odds are 1 in 20 that a term may be left in the predicting
equation incorrectly. Obviously, if the user wants to make this error very
rarely, he may set the probability of that error very low, say 0.01 or 0.001.
This situation requires one additional remark. When one asks that the chance
of committing an error be made small, the chance of committing the converse
error must become large. In this case, one increases the risk of removing
variables that really belong in the equation by decreasing the risk of
leaving variables that do not belong in the equation. If the chance of
leaving a variable incorrectly were set by the user at 1 in 10000, it is
also possible that insufficient data may have been accumulated to allow the
retention of any variables in the equation, and the analysis can do no more
than predict the average value of the dependent variable Y by the appropriate
b0. The remedy is clear: if one wishes to set high standards, the price is
additional experimentation to produce additional evidence to support the case.
If and when all the importance factors exceed the minimum value
required for retention of the set Xi1, the analysis then examines the set
Xi2. For each element of this set a "potential importance factor" (as
defined earlier) is determined and the largest of these isolated. These
factors measure the relative contribution which each variable not presently
in the predicting equation might make to the equation if it were put in. The
largest of these is associated with the "best" variable at this stage. Once
again a comparison is made to insure that the risk of inserting a variable
incorrectly is in agreement with the significance of the "best" variable
before the insertion is allowed to occur. Again, the user specifies the
risk he is willing to take of a variable being incorrectly inserted into the
predicting equation, understanding 0.05 to mean 1 chance in 20 of the error
occurring and recognizing that reducing the chances of incorrectly inserting
variables increases the chances of omitting correct variables from lack of
evidence.
Figure 4. Surface showing values of F which may be exceeded by chance with stated probability.
Suppose that, in the example considered, X4 has been retained, and
that of X1, X2, X3, X5, the variable X1 best explains the behavior of Y not
explained by X4. If there is sufficient evidence to support the insertion
of X1, then a new predicting equation is formed by least squares:
$$Y = b_0^{(1)} + b_1 X_1 + b_4^{(1)} X_4 \qquad (3)$$
The superscripts on b0 and b4 indicate that these coefficients have undergone
one modification in the process and are new values. At this point the
variables are again sorted and checked for importance and the procedure
repeated. The analysis ceases when either all the X variables have been
inserted into the predicting equation or none of the X variables that remain
as possible candidates for the equation is sufficiently important to allow
insertion.
Continuing the example, suppose that on the third step X5 is
introduced, yielding:
$$Y = b_0^{(2)} + b_1^{(1)} X_1 + b_4^{(2)} X_4 + b_5 X_5 \qquad (4)$$
Further suppose that X1 and X5 behave together in such a way that
the result is like having X4 in the equation twice. In such a case, the
importance of X4 might be considerably reduced. Suppose that this is the
case and that the importance of X4 falls below the limit set by the user.
Then X4 is removed and the equation becomes:
$$Y = b_0^{(3)} + b_1^{(2)} X_1 + b_5^{(1)} X_5 \qquad (5)$$
In step 5 suppose that X2 is added, giving
$$Y = b_0^{(4)} + b_1^{(3)} X_1 + b_2 X_2 + b_5^{(2)} X_5 \qquad (6)$$
Now suppose that neither X3 nor X4 is sufficiently significant to
allow its insertion. The final prediction equation produced is (6). The
analysis makes available a number of statistics at each step which may be
interpreted as a measure of goodness of fit or prediction, as well as the b
values and the importance level for the term considered at that step.
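
The insert-and-delete cycle walked through above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the program's own code: it assumes the variance contribution has the form V_i = a_iy a_yi / a_ii on a regression matrix that starts out equal to the correlation matrix, it uses a Gauss-Jordan pivot to move a variable into or out of the equation, and it applies the usual partial F ratios. The names `sweep`, `stepwise`, `f_in` and `f_out` are hypothetical, and fixed F thresholds stand in for the user's error probabilities. No tolerance test on the pivot is included.

```python
def sweep(a, k):
    """Gauss-Jordan pivot on a[k][k]; a second pivot on k undoes the first."""
    n = len(a)
    p = a[k][k]
    out = [[a[i][j] - a[i][k] * a[k][j] / p for j in range(n)] for i in range(n)]
    for j in range(n):
        out[k][j] = a[k][j] / p
    for i in range(n):
        out[i][k] = -a[i][k] / p
    out[k][k] = 1.0 / p
    return out


def stepwise(r, n_obs, f_in, f_out, max_steps=100):
    """Return (sorted indices of retained variables, R squared).

    r is the correlation matrix; its last row and column belong to Y.
    """
    a = [row[:] for row in r]
    y = len(a) - 1
    inside = set()
    for _ in range(max_steps):
        # Variance contributions: negative for variables already in the equation.
        v = [a[i][y] * a[y][i] / a[i][i] for i in range(y)]
        dof = n_obs - len(inside) - 1  # residual degrees of freedom
        if inside:
            # Deletion test on the least important retained variable.
            worst = min(inside, key=lambda i: abs(v[i]))
            f_remove = abs(v[worst]) * dof / a[y][y] if a[y][y] > 0 else float("inf")
            if f_remove < f_out:
                a = sweep(a, worst)
                inside.discard(worst)
                continue
        outside = [i for i in range(y) if i not in inside]
        if not outside or dof < 2:
            break
        # Insertion test on the most promising candidate.
        best = max(outside, key=lambda i: v[i])
        resid = a[y][y] - v[best]
        f_enter = v[best] * (dof - 1) / resid if resid > 0 else float("inf")
        if f_enter <= f_in:
            break
        a = sweep(a, best)
        inside.add(best)
    return sorted(inside), 1.0 - a[y][y]


# Two predictors: the first correlates strongly with Y, the second weakly.
r = [
    [1.0, 0.0, 0.9],
    [0.0, 1.0, 0.1],
    [0.9, 0.1, 1.0],
]
kept, r_squared = stepwise(r, n_obs=20, f_in=4.0, f_out=4.0)
print(kept, round(r_squared, 4))
```

In this small case only the first predictor is retained; the second explains too little of the remaining variance to pass the insertion test.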
2. The Statistical Model
Suppose that the physical system giving rise to the preceding
example was such that it could be hypothesized that the system could be
characterized or described by the mathematical model:
$$Y = B_0 + B_1 X_1 + B_2 X_2 + B_3 X_3 + B_4 X_4 + B_5 X_5 + E, \qquad (7)$$
where the Bi (i = 0, 1, 2, ..., 5) are unknown and possibly some of them may be
zero. E is a random error variable term which accounts for the inability to
obtain strictly reproducible data when observing the physical system. Setting
aside the consideration of E for the moment, the problem is that of obtaining
the best estimates of the Bi. It may be observed immediately that this is
the problem just considered, resulting in Equation (6), and that the Bi are
estimated by bi, respectively. The best estimates of B3 and B4 are zero.
Turning attention once again to E in (7), it is clear that Equation (6)
is not quite complete. The randomness of E makes the prediction of
E impossible. What is possible is the determination of the likelihood of E
being inside a range of values. In other words, because of E the
measurements obtained are not exactly repeatable even if all the X's could be set
at exactly their former values. Therefore, the estimates are possibly, but
not necessarily, in error due to the influence of E. A more nearly complete
treatment of (7) would
(1) estimate the Bi as before;
(2) estimate the possible errors in the Bi; and
(3) estimate the variability of E.
The Stepwise Regression Program automatically estimates each of
the three items desired. The estimate of the Bi has already been discussed.
The possible errors in the Bi are indicated by quantities SBi called the
"Standard Error of the Coefficient" for each i. These values are such that
if one forms the interval
$$b_i - S_{B_i} \le B_i \le b_i + S_{B_i} \qquad (8)$$
then the "true" value of Bi may be expected to be included by this interval
in about 68% of all cases (see Figure 2). If one extends the interval to
form
$$b_i - 2S_{B_i} \le B_i \le b_i + 2S_{B_i} \qquad (9)$$
then this interval should include the true value in about 95% of all cases.
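
The 68% and 95% figures are what one obtains if the estimate is assumed to be normally distributed about the true value; that assumption is not stated explicitly in the text and is made here only for illustration. A minimal sketch, with hypothetical function names:

```python
import math


def normal_coverage(k):
    """Probability that a normal variate lies within k standard deviations."""
    return math.erf(k / math.sqrt(2.0))


def coefficient_interval(b, s_b, k=1.0):
    """Interval of half-width k standard errors about the estimate b."""
    return (b - k * s_b, b + k * s_b)


print(round(normal_coverage(1), 4))  # 0.6827
print(round(normal_coverage(2), 4))  # 0.9545
print(coefficient_interval(2.0, 0.25, k=2.0))  # (1.5, 2.5)
```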
The variability of E is measured by a statistic called "The
Standard Error of Estimate." This is roughly the standard deviation of the
E. Adding additional terms to the predicting equation usually results in
reducing the standard error of estimate. The amount of reduction is a
measure of the contribution made by that variable toward the explanation of
Y. When the analysis is completed, this statistic measures the behavior of
Y not explained by the predicting equation and reflects the remaining
observational errors and, of course, possible errors in the hypothetical
model. The precision of the predicting equation is reflected by the
magnitude of the Standard Error of Estimate (Sy) such that if one uses the
predicting equation (6) to estimate Y and then forms a band about the curve
predicted by (6) of plus and minus Sy (i.e., the band is 2Sy in width and
centered on the curve from (6)), then the "true" value of Y may be expected
to be included by this band in about 68% of all cases (see Figure 3). Again,
doubling the band width to ± 2Sy raises the expectation to about 95% of all
cases. In other words, when enough experimental observations of a physical
system are made accurately on good instruments so as to minimize
observational errors, and when the hypothetical model correctly describes the physical
system, then Sy will be small and the predicting equation may be used to
estimate Y with a measure of the precision of this estimate interpreted as
indicated.*
The analysis produces two other valuable statistics at each step of
the estimation process. The "Coefficient of Determination"** is interpreted
as the proportion of the total variation in Y that is explained by the
predicting equation. The possible values lie in the range from +1.00 (perfect
prediction) to 0.0 (no prediction). Statisticians familiar with the "Multiple
Correlation Coefficient," which is the positive square root of the Coefficient
of Determination, will find it displayed also.
* It should be mentioned in passing that the interpretation of Sy and SB
should be as stated here and that it is not correct to say that about 68%
of all observed values will lie within the intervals indicated for ± S,
and so on.
** The Coefficient of Determination (R2) is found by subtracting the
regression matrix element ayy (which measures the dependent variable
variance) from unity. That is, R2 = 1 - ayy.
II. Artificial Intelligence Applied to the Stepwise Regression
Method
Section I of this discussion treated the use of the Stepwise
Regression Method as it applied to those cases in which the entire set of
variables and functions of variables can be represented by a single
collection of small enough size to allow complete retention within the memory of
the machine. In the case of the IBM 704 with 8192 word core storage, the
size of the problem is limited to 60 variables, which require, in addition
to several linear arrays of 60 elements, a matrix 61 by 61 or 3721 locations.
While some expansion might be realized by adroit programming, a little study
of the nature of the problem indicates that an expansion in capacity of
several orders of magnitude together with a new concept of programming will
be required to handle problems of the types commonly encountered in research.
To understand the nature of the problem encountered, consider the
following example. Suppose that, as in the example of Section I, an
experiment has been made consisting of a set of observations of six
variables. Once again we regard one of the variables as a dependent variable and
the remainder as predictor or independent variables. Assuming for the
moment that only linear behavior is to be expected from any variable (a
drastic simplification), it is apparent that the formal relation (1)
$$Y = b_0 + b_1 X_1 + \cdots + b_p X_p \qquad (1)$$
is not completely descriptive of even this simplified case. This is because
of the possible existence of interactions between variables. Extending the
example proposed to include interaction requires the inclusion of sets of
terms of the forms 1) Xi, 2) XiXj, 3) XiXjXk, 4) XiXjXkXl and in this
case the single term X1 X2 X3 X4 X5 as possible candidates for the
predicting equation. The number of such terms is found in the following way.
Let there be K groups of ni distinct objects (i = 1, 2, ..., K)
and let there be selected j objects,
$$j \le \sum_{i=1}^{K} n_i,$$
at a time to form combinations. The number of such combinations is readily
obtained for the case ni = 1 for all i (the present example case). The
number is (for ni = 1)
$$N_j = \frac{K!}{(K-j)!\, j!}$$
and the case of K = 5 produces the table
TABLE OF Nj FOR ni = 1 AND K = 5
j     Nj     ΣNj
1      5       5
2     10      15
3     10      25
4      5      30
5      1      31
The table shows that even the simple example chosen has expanded
the required storage capacity from a 6 + 1 square matrix of 49 locations to
a 32 + 1 square matrix of 1089 locations. Furthermore, the usual problem
does not permit the assumption of ni = 1. The more general case may be
determined if
(1) ni is constant for all i
(2) selection occurs always between groups and not within groups.
Then
$$N_j = \left[\frac{K!}{(K-j)!\, j!}\right] n^j \qquad (13)$$
Condition (1) is not unreasonable and condition (2) simply requires
that ni be large enough to include whatever terms might be desired generated
within the smaller group. That is, if one considers X2 and X3 and wishes
also to consider X5 = X2 * X3, then condition (2) requires that X5 be made
a member of the ni (and not generated from X2 and X3).
Suppose that in the example 10 function choices are suggested for
each of the five variables (the use of 20 or more is not uncommon in
problems concerning a single independent variable). Neglecting interactions
the problem requires 52 * 52 locations. Considering interactions and using
(13), one obtains the table:
TABLE OF Nj FOR ni = 10 AND K = 5
j        Nj       ΣNj
1        50        50
2      1000      1050
3     10000     11050
4     50000     61050
5    100000    161050
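
Both tables follow directly from equation (13), and the arithmetic can be checked with a short Python sketch. The function name `term_counts` is hypothetical; the running total in the third column is simply the cumulative sum of Nj.

```python
from math import comb


def term_counts(k, n):
    """Rows (j, N_j, running total) with N_j = C(k, j) * n**j, as in (13)."""
    rows, total = [], 0
    for j in range(1, k + 1):
        n_j = comb(k, j) * n ** j
        total += n_j
        rows.append((j, n_j, total))
    return rows


for row in term_counts(5, 10):
    print(row)
```

Calling `term_counts(5, 1)` gives the first table, and `term_counts(3, 20)` ends with a total of 9260 terms for three variables with 20 function choices each.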
Obviously this is outside the range of even projected computers
since the matrix now requires (161052)^2 locations. The cost of solution by
conventional methods is also prohibitive since the solution of a 3-variable
problem with 20 function choices per variable (which requires (9260)^2
locations) has been estimated to require 2500 hours on the 704.
Conventionally, work has progressed in this field by the expedient
of setting the coefficients of all but a very few of these terms identically
equal to zero. The formal relation (1) is such a reduction. This method,
while enabling some attack to be made on otherwise nearly hopeless problems,
suffers greatly for several reasons. First of all, the choice of omitted
terms is a process of discarding thousands of terms to retain one. Secondly,
the usual practice of relying on apparent fit to select terms before the
regression process begins may result in the omission of exactly the terms
needed.*
A procedure is needed to conduct a search through thousands of
possible terms engaging only a few dozen at a time to produce the predicting
equation. To be as effective as possible, it would be very desirable to use
each experience with the problem, whether successful or not, to learn more
about the nature of the terms that are generally useful and thereby
accelerate the search. Such techniques as "learning" and the "acquiring of
experience" are generally associated with nonmechanistic organisms. Since
it is proposed that these techniques be simulated by the 704 computer, this
is the application of artificial intelligence to the problem.
* Experience with the regression program on single independent variable
problems indicates that the terms added successively to the predicting
equation bear little relation after the first step to their partial
correlation coefficients with respect to the dependent variable. This is
because the added terms are always charged with explaining the as yet
unexplained variation in the dependent variable. Consequently, if the
first term entered explains the dependent variable behavior quite well,
the next term may be of quite different character in order to explain
what is left by the first term.
The program has been written so that the machine is not presented
with the condensed subset, as usually happens, but instead is given access
to all possible terms and interactions within the bounds of the number of
variables considered and the number of function choices per variable allowed.
As usual in problems of this type, no straightforward procedure can be
given to proceed to the solution that does not also appear economically
prohibitive. It is not a matter of instructing the machine how to solve
the problem, but instead of instructing the machine how to "learn" to solve
the problem. Specifically, the machine must "learn" how to select terms so
that the set of terms chosen contains those terms needed to produce predicting
equations of high precision. Much remains to be done in this new and vital
area. The present effort contains only the most rudimentary learning but is
written in such a way that more sophisticated learning models can be inserted
fairly easily. Experience with the simple learning mechanism has been
extremely encouraging.
Turning to the example of 5 variables and 10 functions per
variable, the following discussion describes the nature of the learning scheme
used by the program. Suppose that neither the nature of the more
likely terms nor the relative importance of the various term classifiers
is known a priori. A "term classifier" is one of the set of (1)
interaction order identifier, (2) variable identifier, or (3) function identifier,
and is used to classify terms as to the degree of interaction, variables
involved, and functions of variables involved in the term. If such knowledge
is presumed known before commencing the solution, means are provided to
suggest either the initial set of terms to try or the initial distribution
of weight among the term classifiers or both or neither. In the present
case, neither is assumed to be supplied, so that the discussion may be understood
for any other case where more information is given initially.
Since term classifiers are not assumed to be supplied, the program
assumes no previous experience with the problem and accordingly sets the
relative likelihood of all terms equal. This is accomplished by considering
each of the classifiers as an array the elements of which are the lengths of
the components of a vector. Each component is initially set to a unit
length.
Next the initial set of terms must be generated by the program.
Each of the 161,050 possible terms in this example is equally likely at
this stage. The program uses a pseudorandom number generator to select
(1) an interaction classifier, (2) variables to satisfy the interaction
selected, and (3) a function for each variable chosen for the interaction.
As each term is selected, a check is made to be sure that it is not a
duplicate of an earlier term chosen for the current pass. When the number
of terms (less than 60) requested by the user for each pass has been
chosen and entered in a term matrix, the program calls upon an editor
program to examine the data and the term matrix and thereby generate the
set of edited data required by the regression analysis program. The
editing process consists of operating on the raw data by referring to the term
matrix for the definitions of the terms and to subroutines to carry out
the generation of the terms. Each raw observation is converted into the
edited data and a magnetic tape recording of the result is made. When all
the data have been edited, the program turns to the Stepwise Regression
Program to carry out the analysis exactly as before with respect to the set
of terms chosen by the program. Upon completion of the Regression Program
for this selection of terms, a check of the generated predicting equation
is made to see if:
(1) the Coefficient of Determination is as large as the user
specified,
(2) the Standard Error of Estimate is as small as the user
specified,
(3) the number of passes executed has not exceeded the limit
set by the user.
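
The end-of-pass check can be written as a single predicate. The sketch below is an illustration with hypothetical names; it assumes, as the description of the solution control card states, that conditions (1) and (2) must both hold for the fit to be accepted, and that exhausting the pass limit stops the run regardless.

```python
def should_stop(r2, std_error, passes_done, r2_goal, std_error_goal, max_passes):
    """True when the fit is good enough or the pass budget is spent."""
    fit_good_enough = r2 >= r2_goal and std_error <= std_error_goal
    return fit_good_enough or passes_done >= max_passes


# Fit not yet good enough and passes remain: further work is allowed.
print(should_stop(0.90, 0.5, passes_done=3, r2_goal=0.95,
                  std_error_goal=0.4, max_passes=10))  # False
```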
If further work is allowed as the result of these checks, the
program proceeds to examine the results of the pass just completed and in
so doing acquires "experience" concerning the types of terms most suitable
for future use.
This "experience" is acquired by the student program as follows.
Each term is checked against the list of terms included in the predicting
equation. If a term has been successfully used in the relation, the
student (1) retains the term to be used again, and (2) increases the
probability of trying similar terms by incrementing the lengths of the
vector components of the classifier arrays that chose the term. If the
term was not successful, the student decrements the lengths of the vector
components that selected the term. By modifying the vectors by an amount
proportional to their current size, no term will ever be reduced to zero
probability but may have its probability made arbitrarily small but
positive. In this way the arbitrary setting of huge blocks of coefficients to
zero is avoided and any term may at any time be used successfully and
thereby become a member of the predicting equation until supplanted by a still
better term.
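
The selection and grading steps can be illustrated with a short Python sketch. Everything named here is an assumption made for illustration: the three classifier arrays are plain lists of positive weights, a single function array is shared by all variables, a term is represented as a tuple of (variable, function) pairs, and the 10% learning rate is arbitrary. The source does not give the actual increments used.

```python
import random


def pick(weights, rng):
    """Pick an index with probability proportional to its weight."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


def draw_term(order_w, var_w, func_w, rng):
    """Draw an interaction order, that many distinct variables, and one
    function per variable. Requires len(order_w) <= len(var_w)."""
    order = pick(order_w, rng) + 1
    variables = []
    while len(variables) < order:
        v = pick(var_w, rng)
        if v not in variables:
            variables.append(v)
    return tuple(sorted((v, pick(func_w, rng)) for v in variables))


def draw_pass(count, order_w, var_w, func_w, rng):
    """Collect `count` distinct terms for one pass, rejecting duplicates."""
    terms = set()
    while len(terms) < count:
        terms.add(draw_term(order_w, var_w, func_w, rng))
    return sorted(terms)


def learn(weights, index, success, rate=0.1):
    """Lengthen or shorten one component in proportion to its current size,
    so that it can become small but never reaches zero."""
    weights[index] *= (1.0 + rate) if success else (1.0 - rate)


rng = random.Random(0)
order_w, var_w, func_w = [1.0] * 5, [1.0] * 5, [1.0] * 10  # unit lengths
terms = draw_pass(20, order_w, var_w, func_w, rng)

# Grade one term as unsuccessful: shrink every component that selected it.
term = terms[0]
learn(order_w, len(term) - 1, success=False)
for variable, function in term:
    learn(var_w, variable, success=False)
    learn(func_w, function, success=False)
```

Because each update is multiplicative, a component that fails repeatedly shrinks geometrically yet stays positive, which is the property the text relies on to keep every term reachable.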
After the student program completes the study of the previous
run, the "experience" gained is utilized to select a new set of trial terms
for the next pass. That is to say, the previously successful terms are
retained from the former pass and the term matrix is filled out with terms
chosen by using the modified classifier arrays and the random selection
process. Since the classifier arrays have been modified, the selection of
new terms no longer occurs with equal probability for all interactions,
variables, and functions. Thus the search is less random and becomes more
nearly stepwise as success and failure direct the modification of the
classifier arrays. So long as it is possible to retain terms used
successfully on the previous pass and still select some additional term or terms,
the program retains the previously successful terms. In this way, the new
pass will always be at least as "successful" as the last pass. If, however,
a new pass is called for and there is no room for additional terms, the
program has encountered a "traffic jam" since a new pass would not be
requested if the old selection had been good enough. In this situation,
a fresh start is needed but old "experience" may still provide valuable
assistance in the selection of terms. The student program discards the old
selection of terms (printout of the discarded set is automatic so that
human study can be made of it) and selects a complete new set while
retaining the "experience" imbedded in the classifier arrays. In this case the
machine is completely able to handle the "traffic jams" without outside help.
Another pitfall which might be encountered by the program concerns
the case in which the solution has progressed to a locally maximally
successful predicting equation. In this case, any change appears to make the
predicting equation less useful and yet the present predicting equation is not
good enough. An interesting property of the Stepwise Regression method for
choosing the most desirable terms results in the ability of the program to
work itself out of such a situation. In fact, several instances have been
observed in which the program accepted somewhat poorer overall fits for one
or two trials in order to retain particularly good terms and on a succeeding
trial found the fitted predicting equation to be several times better than
the best previous equation.
In any case the process repeats itself, studying, grading,
selecting, editing, fitting, until the conditions on the goodness of fit are met
or until the desired number of passes has been used, whichever comes first.
While one cannot be certain that the very best predicting equation possible
has been found after any predetermined number of passes (a characteristic
of iterative processes generally), the procedure insures that the best
solution to date is preserved and that all trials contribute to the
improvement of the selection process.
The learning scheme employed by the student program embodies many
of the principles discussed by Friedberg, Dunham, and North in their
articles on "Learning Machines" in the I.B.M. Journal. The student program
extends these ideas and incorporates the advantages of both random search
and stepwise search. Initial passes search rather randomly looking for
promising leads. As evidence accumulates, the mode of search becomes
increasingly stepwise as the number of "good" terms retained grows. Thus the
search narrows itself into promising areas and progress is made toward
solution until either a solution is found or the allowable number of passes is
exceeded or until a "traffic jam" forces the random search to
begin again. Random searching of the early stages is most promising since
a poor start does not inhibit progress. Later stages have experienced
some success and therefore the modifications are less drastic to allow the
previous leads to be followed as far as they may prove to be profitable.
During the solution of any particular problem, it may happen
that, when the data are operated upon by the editor program to produce
the edited data, the size of the numbers generated may overflow or
underflow the size of the IBM 704 word. In floating point arithmetic this may
occur whenever the editor produces a nonzero number with absolute value
outside the range 10^-18 to 10^+18 because of a later production of the sums
of the squares and cross products of the terms by the Stepwise Regression
Program. In these circumstances, the student program cannot experience
"learning" for those terms of correct size since they have not yet been
tried for the actual curve fitting, but the student program must "learn"
about the selection of terms acceptable to the 704. Occasionally, it has
been observed that the terms suggested by the curve fitting process and the
terms acceptable in size to the 704 may not agree. The present learning
mechanism is capable of correcting itself in this case without requiring
human intervention.
Some final remarks may be of assistance in understanding the
analysis. First of all, a given set of data may result in more than one
predicting equation of a specified goodness of fit. This corresponds to
the existence of several mathematical models of the system. Classically,
this situation leads to the development of experiments capable of
distinguishing between the models and the retention of those models which best
describe the greatest variety of consistent circumstances accurately. By
randomly restarting the problem this possibility may be investigated. If
the program produces different predicting equations upon random restarting,
more evidence is needed. Failure to produce different equations, however,
does not guarantee freedom from such difficulty but decreases the
probability of this difficulty. Secondly, if previous experience with a problem is
available, prudence usually dictates that the initial pass make full use of
it. The program provides ready means for saving previous results and for
restarting with any or all of the previous classifier arrays and term
selections intact. This same philosophy may be extended to initial runs in
which the user's training and experience or previous encounters with similar
problems may serve to generate an initial selection and/or weighting. The
penalty for a poor guess is an increased number of passes, but a good guess
results in considerable saving.
The Stepwise Regression Program with Simple Learning has been used
successfully on many test problems and actual physical component modeling
problems. In addition, interest in this technique has been generated in
many diverse areas of the physical and social sciences. The ability to
know precisely the worth of each and every term in a predicting equation, as
well as the worth of the equation as a whole, as it is supported by actual
evidence, should enable extensions of knowledge in many fields.
IMPLEMENTING THE STEPWISE REGRESSION PROGRAM
WITH SIMPLE LEARNING
Communication of the Problem to the Program
As it was in the case of the Simulator Program, the immediate concern
of the user of the Stepwise Regression Program is to communicate the problem
to be solved to the program. Since the problem is essentially computational,
the link is established through the use of a set of control cards.
The program is designed to be very flexible in the analysis of the
problem. Thus, the user must select the specific operations to be performed
and the constraints to be imposed. The user must supply, in addition to the
observed data, the following control cards:
1. Title card
2. Problem control parameter card
3. Solution control parameter card
4. Output control card
5. Simple Learning control card
6. Core and Tape Layout card
7. Initial Random Number card
Depending on the contents of the Problem control parameter card,
one, several or all of the following groups of cards may be required:
8. Ordered Term Insertion cards
9. Data deck
9A. Format specification card
9B. Observed data deck
9C. End of data card
10. Accumulated Learning deck
11. Initial pass terms deck
12. Output Function Name card
In the foregoing list, items 8, 10, and 11 may be present or absent
from the input deck depending on the contents of the Problem control
parameter card. The remainder must be present in every input deck. The order
of the deck follows exactly the order of the list.
Title Card
The title card allows the user to present any title that may be desired to be printed at the beginning of a new problem. Only one card may
be used and the title may appear anywhere within columns 1 thru 72. Ordinarily,
the user will place the number one in column one so that the printing for the
problem will begin on a new page. Whatever appears in column one is the
printer carriage control character. If a blank card is used, the printer will
simply single space the paper.
Problem Control Parameter Card
The function of the parameters on this card is to allow the specific
problem being treated to be handled in accordance with the user's wishes.
The format of the card is (I5, 3F10.5, 7I5, I2). The parameters, in order,
are:
1) Problem Number, an integer modulo 32768. (cols. 1 thru 5)
2) Tolerance for division and round off error. A floating point
bound such that if the magnitude of any divisor is less than this value no
division will occur. This value is also used to limit round off error in the
matrix manipulation. Typical values are 0.0001 to 0.0005. (cols. 6 thru 15)
3) Probability of insertion error. A floating point number in the
open interval from 0. to 1. The value is the probability allowed by the user
that the least significant term inserted into the predicting equation is
erroneous. A value of 0.05 represents a risk of 1 chance in 20, a value 0.01
represents a risk of 1 in 100, and so on. (cols. 16 thru 25)
4) Probability of a deletion error. A floating point number in the
open interval from 0. to 1. The value is the probability allowed by the
user that a term removed from the predicting equation for lack of support
should have been allowed to remain in the equation.
The probability of a deletion error must not exceed the probability
of an insertion error. If it does, the program may reject every term offered.
(cols. 26 thru 35)
5) Number of independent (predictor) variables. An integer less than
or equal to 59. (cols. 36 thru 40) This value plus one is the total number
of variables in the problem.
6) Number of functions to be considered for each independent variable.
An integer less than or equal to 60. (cols. 41 thru 45) The choice of
functions to be used by the program is determined by the subroutine PFNCT in
the Editor Program Core (No. 4); the output section of the Program Generator
Core (No. 8) should obviously be made to agree with these functions. The
user is free to replace PFNCT if the need arises in any particular problem.
The "standard" version provides automatically integer powers, integer roots
and their reciprocals. The extent of the set so generated depends on the
number of functions. The order of these functions is (in MAD notation):
Function No. 1 ................ X(I).P.1
             2 ................ X(I).P.-1
             3 ................ X(I).P.2
             4 ................ X(I).P.-2
             5 ................ X(I).P.1/2
             6 ................ X(I).P.-1/2
and so on, repeating the pattern of functions 3, 4, 5 and 6 above for each
increasing integer.
Typical values for this parameter are: 10 (yielding functions thru
X(I).P.-1/3), 22 (yielding functions thru X(I).P.-1/6), 38 (yielding functions
thru X(I).P.-1/10).
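
The ordering of the standard functions can be expressed as a mapping from function number to exponent. The Python sketch below is an illustration only; `standard_exponent` is a hypothetical name and is not the PFNCT subroutine itself. It assumes the pattern stated above: functions 1 and 2 are the first power and its reciprocal, and each later group of four covers the next integer as power, reciprocal power, root, and reciprocal root.

```python
from fractions import Fraction


def standard_exponent(func_no):
    """Exponent applied to X(I) by function number func_no (1-based)."""
    if func_no < 1:
        raise ValueError("function numbers start at 1")
    if func_no == 1:
        return Fraction(1)
    if func_no == 2:
        return Fraction(-1)
    group, slot = divmod(func_no - 3, 4)
    m = group + 2  # the integer 2, 3, 4, ... for this group of four
    return [Fraction(m), Fraction(-m), Fraction(1, m), Fraction(-1, m)][slot]


for number in (1, 2, 3, 4, 5, 6, 10, 22, 38):
    print(number, standard_exponent(number))
```

Under this reading, function counts of 10, 22 and 38 end on the exponents -1/3, -1/6 and -1/10 respectively, which agrees with the typical values quoted.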
7) Number of terms to be tried at each solution pass. An integer
less than or equal to 59. ('cols. 46 thru 50)
8) Number of terms whose order of insertion is specified in the input data. An integer less than or equal to the item (7). Usually this variable is set to zero, but may be set positive and thus force complete control,
over the order of insertion of terms by the user. (cols. 51 thru 55)
9) Number of terms initially defined by the user. An integer less
than or equal to item. (7). Defined terms under this control will be used
subject to the statistical analysis of the program unless overridden by item
(8). If less than the total terms in (7) are defined by the user, the program
will attempt to generate enough new terms to satisfy (7). (cols. 56 thru 60)
10) Parameter controlling the type of regression analysis executed
by the program. An integer (colSo 61 thru 65) operating as follows:
10A) If greater than zero) the data is treated with respect
to the coordinate axes and the constant term is always suppressed to zero.
lOB) If equal to zero) the data is treated with respect to a
set of axes translated to the means of the variables. The constant term is not suppressed.
10C) If less than zero, the data is treated as in (A) but the
constant term is not suppressed. The constant term is treated
just like every other term,. except that the constant is always
inserted as the first term the relation and held until the next
term is tried. After this point9 all constraints are removed.
(Type C is most useful in dealing with physical data. Type A
is most useful when other information dictates a zero constant. Type B is
most useful when dealing with data that tends to group itself about the means
(biological and sociological problems).)
11) Parameter indicating whether the data is all of unit weight (parameter value not equal to zero), or weighted individually (parameter value equal
to zero). (cols. 66 thru 70) If the parameter is zero, each set of data must
carry a value of its weight.
12) Parameter indicating whether the program has previous "experience"
with the problem. If not equal to zero (or blank) the program assumes that the
accumulated learning deck is present. Integer variable in columns 71-72.
Solution Control Card
The solution control card communicates to the program the conditions
under which the program is to cease calculation. The format is (2F10.5, I5).
The parameters, in order, are:
1) Estimated Coefficient of Determination. Floating point variable
in the range of 0. to 1.0. This parameter is the user's estimate of the
expected goodness of fit between the predicting equation surface and the data.
Perfect agreement is represented by 1.0, no agreement is represented by 0.0.
Typical values for physical problems run from .95 to .999. (cols. 1 thru 10)
2) Standard Error of Dependent Variable. Floating point variable
whose value is the user's estimate of the standard error of the dependent
variable represented in this data. The value reflects the probable errors
present in the data in units of the same kind as the data. (cols. 11 thru 20)
3) Number of Passes allowed for this problem. Integer variable
whose value is the allowed number of complete passes to be made on the problem.
Typical values run from 1 to 10. (cols. 21 thru 25)
The program will run until both conditions (1) and (2) are met, or
until the passes are used up, whichever comes first. Condition (1) is met
when the program has found an equation whose Coefficient of Determination
exceeds or equals the specified value. Condition (2) is met when the program
has found an equation whose Standard Error of the Dependent Variable is
less than or equal to the specified value. Both conditions must be met in
order to terminate the program before the allowed number of passes are used.
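The termination rule just described can be stated compactly. The following is a minimal sketch only; the function and argument names are hypothetical and do not come from the original program.

```python
def should_stop(r_squared, std_error, passes_used,
                target_r_squared, target_std_error, max_passes):
    """Return True when calculation should cease.

    The run ends when BOTH statistical targets are met, or when the
    allowed number of passes has been used, whichever comes first.
    """
    criteria_met = (r_squared >= target_r_squared
                    and std_error <= target_std_error)
    return criteria_met or passes_used >= max_passes
```

Meeting only one of the two statistical conditions is therefore not enough to end the run early.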
Output Control Card
The program must perform a variety of subsidiary calculations during
the equation generation process. The output control card allows the user to
suppress those extra calculations and printing for which he has no need. If
a blank card (no suppression) is used, all calculations will be printed.
Since, for most problems, this represents a very sizeable volume of printing,
the user is cautioned to select only those items of real interest. Punching
a numeric 1 in the column corresponding to the item number given below will
suppress printing of the calculation. Either a numeric 0 or a blank allows
the printing to occur.
The suppressible output parameters, in order by column number, are:
Column No. 1) Raw Sums of Squares and Cross Products
2) Average (Mean) values
3) Residual Sums of Squares and Cross Products
4) Standard Deviations
5) Partial Correlation Coefficients
6) Intermediate Steps in Regression process
7) Predictions using the intermediate step equations
8) Predictions using the final equation
9) Values of terms for each set of observations
If all of the above are suppressed, the output will consist of:
1) Listing for verification of all raw data.
2) Definitions of terms used for each pass.
3) Final equation found for each pass, with pertinent statistics.
That is, the F level of the last term treated, the standard error of the dependent variable, the coefficient of determination, the multiple correlation coefficient, the constant term, if any, and the coefficients and their
standard errors for all terms finally retained in the equation.
4) The diagonal elements of the regression matrix.
5) The equation produced by the last pass in M.A.D. subroutine
form both printed and punched on cards ready for processing.
6) The final status of the "learning" mechanism punched on cards
for use in restarting future problems.
7) The terms to be used for the next presentation of the problem
to the program (on cards).
8) A pseudorandom number card to allow the random number generator
to continue the sequence.
For most applications, the automatically produced output is sufficient.
The next most generally interesting results are the items (6) and (8) in the
first group. If (4), (7) and (8) are suppressed, some calculation is also
suppressed, thus shortening execution time.
The user is cautioned again that the request for all of this printing
will, in general, produce a very sizeable output.
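The column convention of the Output Control Card can be illustrated with a short sketch. The item names are taken from the list above; the function name printed_items and the use of a Python string to stand for the punched card are assumptions made for illustration.

```python
OUTPUT_ITEMS = [
    "Raw Sums of Squares and Cross Products",
    "Average (Mean) values",
    "Residual Sums of Squares and Cross Products",
    "Standard Deviations",
    "Partial Correlation Coefficients",
    "Intermediate Steps in Regression process",
    "Predictions using the intermediate step equations",
    "Predictions using the final equation",
    "Values of terms for each set of observations",
]


def printed_items(card: str) -> list:
    """Items that remain printed for a given card image.

    A '1' in column k suppresses item k; a '0' or a blank allows it.
    """
    card = card.ljust(len(OUTPUT_ITEMS))
    return [name for col, name in enumerate(OUTPUT_ITEMS)
            if card[col] != "1"]


# Keep only items (6) and (8); suppress the other seven.
print(printed_items("11111 1 1"))
```

A completely blank card leaves all nine items printed.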
Simple Learning Control Card
The user may, at this stage in the development of artificial intelligence programs, control the characteristics of the learning mechanism. Use
of the external function structure for the program allows fairly easy modification of the various parts of the program. With the "standard" learning
mechanism as it is now used, data is accumulated concerning three kinds of
selections:
1) Order of Interaction
2) Variables Entering Interaction
3) Functions of the Variables.
A term is generated by selecting an interaction order, next the
variables to be concerned in the interaction and finally the functions of
the variables selected. The term is the cross product of the functions
of the variables selected. The program must "learn" which interactions
are most useful in explaining the data, which variables are most useful,
and which functions of these variables are most useful. The program "learns"
by trying to use terms selected by the program to explain the data. If the
mechanism has selected a term which is supported by the data and retained
by the regression analysis, the mechanism that selected that term is
modified so as to be more likely to select terms of a similar character. On
the other hand, if a term is not supported by the data and is, thus, of no
apparent utility in the equation it is cast out and the mechanism is adjusted
to be less likely to select terms of similar character. Since the probability
of selection of any component of any allowed term should be bounded positive,
the program uses a "half life" constant to modify the probabilities. In this
way, the relative probability of any term may be made arbitrarily small, but
remains positive. The usual card is of format (I5, 4E16.8). If the mechanism
is modified to require more constants, succeeding cards (up to 2) are of format (E21.8, 3E16.8).
The parameters are:
1) Number of constants used by "learning" mechanism. Integer in
cols. 1 thru 5. (The standard mechanism uses 3 constants. The numeric 3 is
punched in col. 5.)
2) The Constants used by the "learning" mechanism.
The standard mechanism uses:
2A) The "half life" of the Interaction selector. Typical
value is 3.0E00. This means that three consecutive successes
will double the present probability (or conversely, three
consecutive failures will halve the present probability).
2B) The "half life" of the variable selector. Typical value
is 3.0E00.
2C) The "half life" of the function selector. Typical value
is 1.5E0. Since any function may be used relatively infrequently, it is somewhat desirable to take more powerful action
on each encounter.
A great deal of work remains to be done in exploring "learning"
mechanisms. It should be observed here that as the constants are made
larger, the mechanism "learns" more slowly. In fact, for very large values
of the constants the mechanism is essentially deactivated. Very small
values of the constants, on the other hand, may cause wildly erratic behavior of the mechanism since each encounter so strongly distorts the relative probabilities.
The values given have received much use and appear to give quite
stable operation although not necessarily optimum convergence.
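The effect of the "half life" constants can be illustrated as follows. The text does not give the exact update formula, so this sketch assumes a multiplicative factor of 2 raised to the power 1/(half life), which reproduces the stated behavior: that many consecutive successes double a relative probability and that many consecutive failures halve it. The name update_weight is hypothetical.

```python
def update_weight(weight: float, half_life: float, success: bool) -> float:
    """Scale a relative probability after one success or failure.

    ASSUMED rule: multiply by 2**(1/half_life) on success and divide by
    the same factor on failure, so the weight always remains positive.
    """
    factor = 2.0 ** (1.0 / half_life)
    return weight * factor if success else weight / factor


w = 1.0
for _ in range(3):                      # three consecutive successes
    w = update_weight(w, 3.0, success=True)
print(round(w, 6))                      # doubled, to rounding
```

With a very large half life the factor approaches one and the mechanism is effectively deactivated, while a very small half life makes each encounter change the weight drastically, as noted above.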
The user will ordinarily duplicate this card and the next one from
run to run.
Core and Tape Layout Card
In order to allow easy extension of this program in the future, the
multiple core program arrangement can be changed and tape layout changed
without disrupting the entire program by using this card. The present core arrangement and tape layout are the following. (Users wishing to modify the
layout are advised to study the program flow charts carefully.)
1) Starting Program is in two consecutive core loads. The first
core of the Starting program is now core 1. Punch 1 in column 5.
2) The Student Program is one core load and is now core 3. Punch
3 in column 10.
3) The Editor Program is one core load and is now core 4. Punch 4
in column 15.
4) The Regression Program and Program Generator Program are three
core loads, the first two of which are the Regression program. These must
be consecutive core loads. Punch 5 in column 20.
5) The Processed Data erasable tape is now tape 3. Punch 3 in
column 22.
5A) The Selector mechanism is now stored as the first record
on tape 3. Punch 1 in column 24.
5B) The Raw Data after processing into term values is now
stored beginning as the second record on tape 3. Punch 2 in
column 26.
5C) The Terms selected for each pass are now stored as the
second record on tape 3. Punch 2 in column 28.
6) The Raw Data erasable tape is now tape 4. Punch 4 in column 34.
6A) The raw data is now stored on tape 4 in binary beginning
as the first record on tape. Punch 1 in column 36.
Space is allocated in storage for five tape record assignments for
each tape. At this time, the only assignments are the above.
Initial Random Number Card
The random number subroutine used by the Simple Learning Mechanism
may be reset arbitrarily at the beginning of each problem. Since the program
will produce one of these cards at the end of the problem, the sequence may
be continued easily. The format is 3I10. The first number is any odd integer modulo 32768 (decimal). The second number is any integer modulo 32768 (decimal).
The third number is any integer modulo 32 (decimal). The subroutine combines these
integers to form one 35 binary digit odd integer. This integer serves as
the first member of the pseudorandom number sequence generated by the
library subroutine RAM2.
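The three fields hold 15, 15 and 5 binary digits respectively, which together make up the 35 binary digits mentioned. The text does not say how the subroutine arranges them, so the following sketch assumes one plausible layout in which the odd first number occupies the low-order digits (which guarantees an odd result). The name combine_seed is hypothetical.

```python
def combine_seed(first: int, second: int, third: int) -> int:
    """Pack the three card values into one odd 35-bit integer.

    ASSUMED layout (not stated in the text): third number in the top
    5 bits, second in the middle 15 bits, first (odd) in the low 15 bits.
    """
    if not 0 <= first < 32768 or first % 2 == 0:
        raise ValueError("first number must be odd and less than 32768")
    if not 0 <= second < 32768:
        raise ValueError("second number must be less than 32768")
    if not 0 <= third < 32:
        raise ValueError("third number must be less than 32")
    return (third << 30) | (second << 15) | first
```

Whatever the actual arrangement, the largest possible combined value is 2**35 - 1, which fits the 35 binary digit magnitude stated above.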
Ordered Term Insertion Card(s)
If the parameter (8) of the Problem Control Parameter Card is nonzero, the user must supply a set of cards to define the order in which terms
are to be inserted. This allows an arbitrary equation to be generated without regard to the statistical analysis, after which the statistical analysis
may be used to discard those terms that are not sufficiently important to
meet the deletion error criterion. If a theoretical relation is available
for which a study is being made to determine how the relation may be improved,
this feature may be useful. Otherwise, one must assume the risk that some of
the theoretical terms will be displaced in the search. The user must be aware
that the use of these term order cards is a severe constraint on the analysis
and treat the results accordingly.
The format is 14I5. Each five columns contain an integer whose
value is the number of a term to be inserted. The first number is the first
term to be inserted, the second number is the second term and so on. For
example, if parameter (8) on the Problem Control Card were three and column
five on the Ordered Term Insertion Card were six, column ten were three and
column fifteen were one, the effect would be to insert term six, then term
three and then term one after which the Stepwise Regression Program would
examine the equation to be sure that these terms meet the deletion error
criterion. If any of these terms fail the test they will be discarded.
When all of the terms in the equation meet the deletion error test, the remaining terms not yet in the equation will be tested for insertion. From
this point on, the standard analysis is followed.
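The reading of an Ordered Term Insertion Card in format 14I5 can be sketched as below, using the example just given. The function name parse_term_order is hypothetical, and treating a blank field as zero follows the usual FORTRAN input convention rather than anything stated in the text.

```python
def parse_term_order(card: str, count: int) -> list:
    """Read `count` term numbers from consecutive 5-column integer fields."""
    if not 0 <= count <= 14:
        raise ValueError("one card holds at most 14 fields in format 14I5")
    card = card.ljust(70)
    fields = [card[5 * i: 5 * i + 5] for i in range(count)]
    # A blank field is read as zero (assumed FORTRAN-style behavior).
    return [int(f.strip() or 0) for f in fields]


# Six in column five, three in column ten, one in column fifteen.
card = "    6    3    1"
print(parse_term_order(card, 3))  # [6, 3, 1]
```

Here `count` plays the role of parameter (8) on the Problem Control Parameter Card.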
Data Deck Preparation
1. Format Specification Card
Since the data may come from various sources, the data deck allows
the data format to be specified at execution time. This is done by using a
standard FORTRAN format statement beginning with the word FORMAT (beginning
in column seven and ending in column thirteen), followed by any allowable format specification that can be placed on one card and terminating with a
right parenthesis in or before column 72. For example, the following format
statements would be acceptable:
Column
7
FORMAT (5F10.2,E16.8)
or
FORMAT (4E16.7,F10.1/4E15.8)
This card must immediately precede the data deck, and is known as the
Format Specification Card.
2. Data Cards
Following this card are the data cards. Arbitrary formats are allowed
as described above. The data must, however, be listed for each observation in the
following order. (All values are floating point numbers.)
1) Observation Number. There must be a positive observation
number for every observation. Run number one must appear once and only once.
2) Independent Variables. These values are listed in order
following the observation number. The values must correspond to one observation group.
3) Dependent Variable. The variable whose value is to be predicted
must follow the predictor or independent variables.
4) Weight of this Observation group. The weight of the group may
be specified or can be assumed to be unity depending on the parameter (11) of
Problem Control Card.
3. End of Data Card
The actual data is then followed by a complete blank data set which
acts as a termination for the data. If any Observation Number is blank or
less than or equal to zero, the data input is terminated at that point.
Therefore, the user must take care in preparing the data deck so that the
entire set of data will be read into machine storage.
The program automatically counts the weighted data sets to establish
the degrees of freedom for the analysis. In this way, new data can be added
to the data deck and/or old data can be deleted very easily.
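The field order and the end-of-data rule can be summarized in a short sketch. It assumes that each data set has already been converted to a list of floating point values under the user's format; the name read_observations and the tuple layout are hypothetical.

```python
def read_observations(rows, n_independent, weighted):
    """Collect observations until a blank or non-positive observation number.

    Each row is: observation number, the independent variables in order,
    the dependent variable, and (only if `weighted`) the weight.
    """
    observations = []
    for row in rows:
        obs_number = row[0] if row else 0.0
        if obs_number <= 0:             # blank data set terminates input
            break
        x = row[1:1 + n_independent]
        y = row[1 + n_independent]
        w = row[2 + n_independent] if weighted else 1.0
        observations.append((x, y, w))
    return observations
```

Any data set placed after the terminating blank set is never read, which is why care is needed in assembling the deck.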
The Accumulated Learning Deck
Whenever a multiple independent variable problem occurs in which
a large number of functions are allowed for each independent variable and
interactions of all orders are admitted, the result is the generation of a
very large set of possible terms that may appear in a predicting equation.
Since the most desirable equation consists of a "minimal" set of these terms
comprising those terms most significant in explaining the dependent variable
behavior, it becomes apparent that the analysis must usually perform a
selection process while dealing with a segment of the entire set of terms
at each encounter.
If it is possible to verify the validity of terms independently
from their method of initial selection, then it becomes feasible to allow
the machine to select the terms using some heuristic method. The terms so
selected will not always be the correct ones or even the "best" ones, although the method of selection should certainly tend to operate in this
way. The important point to observe is that the validity of the term is
tested by the regression analysis independently of the selection and the
regression analysis is, therefore, not affected in any way by the heuristic
method of selection. Because of this, the heuristic term selection method
is free to select terms using any convenient scale for choosing the terms.
If the terms so selected are shown to have validity by the regression
analysis then the heuristic method that selected the valid term is modified so as to be more likely to select similar terms. A converse action
occurs whenever the term is shown to be invalid.
At the completion of each solution pass the current status of the
selector mechanism is represented by a set of values which give:
1) The relative probability of each Interaction Order
2) The relative probability of each Independent Variable
3) The relative probability of each function of each variable.
Whenever no accumulated learning deck is available, parameter (12)
of the problem control parameter card is set equal to zero. The result is
that the program will then assign equal unit relative probability to all
interactions, variables and functions of variables.
If previous encounters with similar problems have occurred, however,
the program has already had "experience" with a similar problem and can be
allowed to take advantage of these encounters by providing the accumulated
learning deck that was automatically produced at the conclusion of the former
problem together with the new problem data. If the user desires to transmit
this information to the program, parameter (12) of the problem control parameter card is set equal to one and the accumulated learning deck is placed
after the data deck.
The user can also suggest his own experience to the program by preparing an accumulated learning deck. The format is 5E14.7. The relative
probabilities are inserted in the following order:
1) Relative Probabilities for Interactions from first order to the
maximum order for the problem.
2) The sum of the preceding probabilities (1).
3) Relative Probabilities for Independent Variables from the first
to the last in the same order as they appear in the data deck.
4) The sum of the preceding probabilities (3).
5) Relative probabilities for each function of the first variable
followed by the sum of these probabilities.
6) Relative probabilities as in (5) for the second, third, etc.
variables.
The preceding items are punched successively in the available
fields as specified by the format. No blank fields are permitted between
groups since every field is interpreted consecutively.
The accumulated learning produced by the machine program has each
relative probability normalized so that the mean relative probability is
unity. In this way, a problem may be easily expanded to more variables,
functions, etc. and still retain previous "experience" by making all new
entries of unit value.
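The ordering of the deck and the normalization to unit mean can be sketched as follows. The function names are hypothetical, and the Python "%14.7E" conversion is used only as a stand-in for the 5E14.7 card format (the digits are placed slightly differently from true FORTRAN output, but each field is 14 columns wide).

```python
def build_learning_deck(interactions, variables, functions_per_variable):
    """Return card images holding the relative probabilities in deck order.

    Every group is followed immediately by its sum, and the values are
    packed five to a card with no blank fields between groups.
    """
    values = list(interactions) + [sum(interactions)]
    values += list(variables) + [sum(variables)]
    for funcs in functions_per_variable:
        values += list(funcs) + [sum(funcs)]
    return ["".join("%14.7E" % v for v in values[i:i + 5])
            for i in range(0, len(values), 5)]


def normalize_to_unit_mean(weights):
    """Rescale relative probabilities so that their mean is unity."""
    mean = sum(weights) / len(weights)
    return [w / mean for w in weights]


# Two interaction orders, two variables, two functions per variable,
# all of unit relative probability: 12 values on 3 cards.
for card in build_learning_deck([1.0, 1.0], [1.0, 1.0], [[1.0, 1.0]] * 2):
    print(card)
```

Because the stored values have unit mean, a new variable or function can be introduced with the value 1.0 without disturbing the relative standing of the existing entries.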
Because of the independent regression analysis of the terms chosen
from the accumulated learning, it must be emphasized that this mechanism cannot force the adoption of incorrect terms. Rather, such an incorrect set
of "experience" would be modified progressively by the program. If the
"experience" supplied is valid for the current problem the result is to
speed the generation of the desired equation but invalid "experience" can
only temporarily delay this generation.
In general, if good experience is available from previous similar
problems or from the user's background the user is strongly advised to make
use of it.
Initial Pass Terms Deck
The user may suggest any initial terms that may be desired for the
first pass. If the suggested terms stem from theoretical consideration and
the theory is in agreement with the data, such a suggestion will speed the
generation of the predicting equation by insuring an early treatment of
likely terms. Any number of terms may be given initially up to the total
number of terms allowed for each pass. The number of terms to be so defined by the user is given by the parameter (9) on the problem control
parameter card. If fewer than the total number of terms to be tried at
each pass (parameter (7)) are given initially by the user, the machine program
will generate the remainder of the set by using the accumulated learning and
the selector mechanism.
At the end of each problem and immediately following the production
of the accumulated learning deck, the program produces a set of terms to begin the next encounter with the problem. If the problem is continued later
these terms may be supplied by simply including these cards following the
accumulated learning deck.
If the user wishes to suggest terms the procedure is the following:
(The card format is 14F5.0)
1) Produce a term card (or cards if sufficient variables are
present) for each term desired.
2) Treat each consecutive field in the above format specification
as in one to one correspondence with the independent variables in the problems.
2A) Insert the appropriate function number in the field corresponding to the desired variable,
2B) Leave blank (or zero) every variable field not associated
with the term.
3) Insert the interaction order in the field immediately following
the last variable field.
For example, suppose the user is dealing with two independent
variables and the "standard" PFNCT subroutine and desires to form the term:
X(1).P.3*X(2).P.1/2
The term is specified by punching seven in column five, six in column
ten and two in column fifteen. See "standard" functions in PFNCT as defined
by number. The seven selects the function integer power three and the appearance in column five assigns this function to variable one in this term. The
two in column fifteen declares the term to be a second order interaction and
thus produces the desired multiplication of the previous functions of the
variables. In this way any desired term allowed by the subroutine PFNCT (and
hence allowed by the user) can be specified as an initial term. This is true
even when the function numbers in the initial term specification exceed the
value of parameter (6) on the problem control card. Of course these terms
are, in this instance, excluded from automatic generation but will, nevertheless, be used correctly and preserved from pass to pass correctly so long as
they are in agreement with the data. Once discarded from the set of terms,
only those terms allowed by parameter (6) can be regenerated.
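The layout of a term card can be illustrated with a short sketch that produces the card image for the example above. The function name encode_term_card is hypothetical; whole numbers are punched right-justified in each five-column field, which is acceptable input under format 14F5.0.

```python
def encode_term_card(function_numbers, interaction_order):
    """Build one term card image.

    `function_numbers` has one entry per independent variable, in data
    deck order; zero means the variable does not enter the term. The
    interaction order follows the last variable field.
    """
    fields = list(function_numbers) + [interaction_order]
    if len(fields) > 14:
        raise ValueError("more than 14 fields would need a further card")
    return "".join("%5d" % f for f in fields)


# Function 7 for variable one, function 6 for variable two, order 2.
print(repr(encode_term_card([7, 6], 2)))  # '    7    6    2'
```

The result places seven in column five, six in column ten and two in column fifteen, as described in the example.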
Output Function Name Card
At the conclusion of each problem the program produces, in printed
form and on punched cards, the external function subroutine for the last
regression equation found by the analysis. This function is ready for
immediate translation by the MAD translator and may be used in any program
as the user may desire. The Output Function Name Card assigns a name to
this subroutine. If a blank card is supplied at this place in the input
deck the function name will be left blank. Otherwise the desired name is
entered somewhere in the columns 1 thru 72 on the Output Function Name Card.
The program will use the last six nonblank alphanumeric characters as the
function name. If a total of fewer than six nonblank alphanumeric characters
appear in columns 1 thru 72, these characters are taken to be the function
name. The rules for allowable function names are those of the MAD translator.
If the desired function name consists of exactly six alphanumeric
characters, the user is then free to insert any desired comment before the
desired name. The comment will, in this case only, be ignored.
Examples of allowable function names:
ETA17; PRATIO; TORQUE; EFF23
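The naming rule can be sketched in a few lines. The function name function_name is hypothetical, and Python's isalnum test is used here as an approximation of "alphanumeric character" on the card.

```python
def function_name(card: str) -> str:
    """Last six nonblank alphanumeric characters in columns 1 thru 72."""
    chars = [c for c in card[:72] if c.isalnum()]
    return "".join(chars[-6:])


print(function_name("ETA17"))                    # ETA17
print(function_name("EFFICIENCY CURVE PRATIO"))  # PRATIO
print(function_name("MY FUNC ETA17"))            # CETA17
```

The last case shows why a leading comment is safe only with a name of exactly six characters: with a shorter name, characters from the comment are drawn into the function name.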
The Structure of the Program
The Stepwise Regression Program, like the Simulator, is structured
in several sections. Each section performs certain tasks which cause, as a
result of the performance, a selection of a new section of the program to
be performed. There are seven basic sections:
1. Input Section
2. Initial Term Section
3. Student Section
4. Editor Section
5. Regression Statistics Section
6. Stepwise Regression Analysis Section
7. Program Generation Section
The input section brings into the program all of the data associated with a given problem. The control parameters and data are all entered
at one time so that if any data set encounters trouble the input tape will
be properly positioned for the next problem. As discussed in the section
on communicating the problem to the program, the initial terms may be
supplied by the user or chosen by the machine as desired. If supplied, the
input section will then bypass the Initial Term Section and the Student
Section.
If the initial terms were not supplied, the program calls the Initial Term Section to choose the terms for the first pass. The selection is
based on the Accumulated Learning supplied by the user. If no Accumulated
Learning is given, the program assumes initially that all possible terms are
equally probable and proceeds with this assumption. The terms are selected
by choosing:
1. Interaction Order
2. Variables for the interaction
3. Functions of the Variables.
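The three-stage choice can be sketched as a sequence of weighted random draws. This is an illustration under stated assumptions, not the original routine: the names pick and select_term are hypothetical, Python's random module stands in for the library random number subroutine, and it is assumed that a variable is not used twice within one term.

```python
import random


def pick(weights, rng):
    """Index chosen with probability proportional to its positive weight."""
    r = rng.random() * sum(weights)
    for i, w in enumerate(weights):
        r -= w
        if r < 0:
            return i
    return len(weights) - 1             # guard against rounding


def select_term(order_w, variable_w, function_w, rng):
    """Return {variable index: function index} describing one term."""
    order = pick(order_w, rng) + 1      # interaction orders start at 1
    order = min(order, len(variable_w))
    remaining = list(range(len(variable_w)))
    term = {}
    for _ in range(order):
        k = pick([variable_w[v] for v in remaining], rng)
        v = remaining.pop(k)            # ASSUMED: no variable repeated
        term[v] = pick(function_w[v], rng)
    return term


rng = random.Random(1960)
print(select_term([1.0, 1.0], [1.0, 1.0, 1.0], [[1.0, 1.0]] * 3, rng))
```

With all weights equal to one, as when no Accumulated Learning is given, every order, variable and function is equally likely to be drawn.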
After selecting the initial terms, the control passes to the Editor
Section. In this section, the terms defined earlier are evaluated for every
data point. If an imminent overflow or underflow of the machine register
capacity is found, the faulty term is rejected and the problem is passed to
the Student Section so that the Accumulated Learning may be adjusted to tend
to avoid such a recurrence. If no such machine limitation is found, the control passes to the Regression Statistics Section.
The Regression Statistics Section computes raw sums of squares and
cross products, means, standard deviations, sums of squares and cross products adjusted to means, and simple (product-moment) correlation coefficients.
Since these statistics are generated in a conventional manner, reference is
made here to suitable texts for elaboration (11, 12, 18, 1:). The important item
of interest is that all of these statistics are available for the study of
the data. Upon completion of these tasks, the results in the form of the
regression matrix are passed along with the control to the Stepwise Regression Analysis Section.
In the Stepwise Regression Analysis Section the techniques of Efroymson
and Dallemand are employed but modified slightly to allow more flexible
manipulation of the analysis by the user. Four basic analyses may be performed:
1. Analysis for fit of data about means
2. Analysis for fit of data with respect to the coordinate planes
with constant term suppressed.
3. Analysis for fit of data with respect to the coordinate planes
using constant term.
4. Controlled term insertion order analysis.
Of these analyses, the fourth is most risky since the user overrides
the statistical analysis. If the user finally removes the imposed constraints
on the term insertion process, however, the Stepwise Regression Analysis Program
will automatically discard any terms that are not sufficiently correlated with
the data. In this way, occasionally, a special problem may be studied to
advantage. The other three analyses are basically similar except as they are
related to the coordinate systems and the constant term. The analysis proceeds as follows:
1) Select the term with greatest contribution to the explanation of
the as yet unexplained variance of the data.
2) Compare the variance contribution for this term with a random
variable to determine whether the contribution could be due to chance.
3) Insert the term if and only if the variance contribution exceeds
that of a random variable by whatever amount the user wishes to specify.
Commonly the term is inserted if the risk is less than one in twenty to one
in one hundred.
4) Review all terms in the equation and reject any that may have
been reduced in importance below the user's standard (by combinations of other
terms, etc.).
5) Continue this process until none of the terms not in the equation
can meet the standard set in (3).
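Steps (2) and (3) amount to a variance-ratio (F) test on the candidate term. The sketch below shows the usual form of that test, computed from the residual sums of squares before and after the term is tried. It is a generic illustration and is not taken from the original program; the function names and the choice of critical value are assumptions.

```python
def f_to_enter(rss_before, rss_after, n_obs, n_terms_after, has_constant=True):
    """Partial F ratio for the term just tried.

    Numerator: reduction in residual sum of squares due to the term
    (one degree of freedom). Denominator: residual mean square after
    the term is inserted.
    """
    dof = n_obs - n_terms_after - (1 if has_constant else 0)
    if dof <= 0:
        raise ValueError("not enough observations for this many terms")
    return (rss_before - rss_after) / (rss_after / dof)


def accept_term(rss_before, rss_after, n_obs, n_terms_after, f_critical):
    """Insert the term only if its F ratio exceeds the user's standard."""
    return f_to_enter(rss_before, rss_after, n_obs, n_terms_after) > f_critical


# 23 observations, second term reduces the residual from 100 to 40.
print(f_to_enter(100.0, 40.0, 23, 2))           # 30.0
print(accept_term(100.0, 40.0, 23, 2, 4.35))    # True
```

The value 4.35 used here is approximately the one-in-twenty critical value of F for 1 and 20 degrees of freedom; the user's standard in step (3) corresponds to the choice of this critical value. The same ratio, computed for terms already in the equation, serves for the review in step (4).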
Upon completion of the Stepwise Regression Analysis the coefficients
for each term found to be valid by the analysis are produced together with
the corresponding standard errors for the coefficients. Other statistics
produced at this point are the multiple correlation coefficient, the coefficient
of determination, the standard error of the dependent variable with respect
to the regression equation and the regression constant, if any. The diagonal
elements of the inverse regression matrix are also printed for study. If desired, the user may request the calculation and printing of a point by point
comparison of the data and the predicting equation results. This calculation
displays, in addition to the actual value produced by the regression equation,
the predicted values plus and minus one standard deviation. This band of
values may be expected to include the true value of the dependent variable
68 percent of the time. Finally, the deviations and the percentage deviations
of the predicted points and data points are produced. During the process the
largest absolute deviation and the largest absolute percentage deviation are
sorted out and printed.
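The point by point comparison can be sketched as follows. The sign convention for the deviation (data minus prediction) and the use of the data value as the base for the percentage are assumptions, since the text does not state them; the function name compare is hypothetical.

```python
def compare(observed, predicted, std_error):
    """Tabulate each point and find the largest absolute deviations.

    Each row holds: data value, predicted value, prediction minus and
    plus one standard error, deviation, and percentage deviation.
    """
    rows = []
    for y, y_hat in zip(observed, predicted):
        dev = y - y_hat                              # ASSUMED sign convention
        pct = 100.0 * dev / y if y != 0 else float("nan")
        rows.append((y, y_hat, y_hat - std_error, y_hat + std_error, dev, pct))
    max_dev = max((abs(r[4]) for r in rows), default=float("nan"))
    # r[5] == r[5] is False only for NaN, which is skipped here.
    max_pct = max((abs(r[5]) for r in rows if r[5] == r[5]),
                  default=float("nan"))
    return rows, max_dev, max_pct


rows, max_dev, max_pct = compare([10.0, 20.0], [9.0, 22.0], 1.0)
print(max_dev, max_pct)  # 2.0 10.0
```

The band from the third to the fourth entry of each row is the one expected to contain the true value about 68 percent of the time.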
Until the regression equation produced by the Stepwise Regression
analysis satisfies the criteria set by the user, the Analysis next returns
to the Student Section to reevaluate the Accumulated Learning stored in the
selecting mechanism. The criteria set by the user are:
1. Standard error of the dependent variable
2. Coefficient of determination
3. Maximum number of solution passes allowed.
The analysis continues until the generated equation properties equal or better
the first two criteria, or until the allowed number of passes are consumed.
This action takes place by recognizing the separation of terms by the regression analysis into those terms sufficiently correlated with the data to be
included in the regression equation and those terms not this well substantiated.
All of the terms inserted in the regression equation are retained for the next
program unless this would not allow the selection of any new terms. In addition, the values of all portions of the selecting mechanism involved in choosing
the successful terms are increased so as to make the selection of similar terms
more likely. Finally, the values of all portions of the selecting mechanism
involved in selecting the unsuccessful terms are reduced to make the selection
of similar terms less likely.
The modification of the selecting mechanism occurs on an exponential
decay basis. This technique allows the user to specify the number of failures
to reduce a particular element of the term selection mechanism by one half.
Conversely, this value is the number of successes to double the relative
likelihood of the element of the selecting mechanism. The elements of the
selecting mechanism are:
1. The interaction selector
2. The variable selector
3. The variable function selector.
After the selecting mechanism is modified by this process the terms for a
new attempt by the regression analysis are selected using the modified
mechanism. In this way extremely large sets of possible terms may be
searched in a very effective manner. An interesting property of the technique
is the ability of the method to work out of "local optima" in the production of the regression equation. That is, the location of very highly
correlated terms may result in an equation which may fit less well than an
earlier more complicated relation. This discovery may often be used on
later trials to modify the selecting mechanism and thus find an equation
better than any previous relation.
When the criteria set by the user are satisfied by either producing
an equation that meets the specified statistical criteria or by using up the
allowed number of trials, the control is passed to the Program Generation
Section. This section produces the final equation as a subroutine both in
print and on punched cards ready to be included in any program as may be desired. Versions are available to produce the subroutine in either the M.A.D.
language or in the FORTRAN language.
Upon completion of the output of the subroutine form of the regression equation, the program takes one additional step. The control is passed
to the Student section and the Accumulated Learning is once again modified
and the set of terms chosen for a reentry of the problem at any later time.
This information is preserved on punched cards in the exact format expected
by the Input section. It is important to recognize that this information
is pertinent not only to the problem at hand but also may be used to expedite
the solution of any similar problem. This carrying forward of artificial
"experience" is an important and unusual feature of the Stepwise Regression
Program with Simple Learning. By this means, the program is enabled to
accelerate the generation of predicting equations by recognizing the information latent in previous encounters with similar problems. It is also important to observe that if such artificial "experience" is incorrect for the
attempt being made, no effect upon the generated equation will be observed
except for a use of one or more passes to correct the Accumulated Learning
and proceed to the generation of the predicting equation. This action occurs
because the data itself is the only finally determining source of information
upon which the predicting equation can be based. The Accumulated Learning,
if correct, can accelerate the determination, but if incorrect cannot prevent
the determination.
CONCLUSIONS
The programs which have been discussed in this paper constitute two
new and advanced tools for the study and analysis of the behavior of physical
systems. The Simulator Program provides a tool for undertaking the simulation
of complicated systems. The flexibility inherent in this technique of analysis is
made possible by providing for extension and modification of the library within the structure of the Simulator. It is important to understand that the
Simulator Program provides a means for bringing to the analysis of systems the very
best methods and most applicable techniques. In this way, the Simulator Program proceeds from the information supplied by the user (the System Definition
and the Constraints on Input Parameters to be supplied to, and the Results
desired from, the generated simulation program) to produce a procedure, or
algorithm, to simulate the operation of the defined system when translated
and executed on the digital computing machine.
The Stepwise Regression Program with Simple Learning provides the
technique which can supply the programs produced by the Simulator with subroutines to implement the methods extracted from the Library of Element
Descriptions. This technique is more generally applicable and has already
been utilized in the process of its verification to supply useful predicting
equations for data taken from many diverse sources. These sources have included calibration data for thermocouples, fatigue life data for plastic
gears, electron tube characteristics, electric transmission line loss
characteristics, thermodynamic properties of steam, tool life characteristics,
psychological test data and data on the effects of various drugs on human
subjects. In addition to these diverse areas listed to illustrate the versatility
of the technique, the technique has been applied successfully to the determination of the characteristics of the steam power plants simulated. These characteristics include the expansion line characteristics of the turbine, the extraction pressure characteristics, the exhaust loss characteristics, the generator
loss characteristics and special flow leakage characteristics.
In all of these areas, the work thus far has been most encouraging.
There is, however, a great deal of work remaining in perfecting, extending and
increasing the generality of the techniques. The methods developed thus far
hold considerable promise in other areas. Perhaps some of the most significant extensions will come from the study of the results produced by these
techniques by allowing a powerful analysis of the data sets.
SYSTEM SIMULATOR
FLOW DIAGRAMS AND CORE LAYOUTS
(Comments to assist the interpretation of the flow diagrams in this section may be found beginning on page 227.)
126
127
SAP SUBROUTINES NOT FLOW DIAGRAMMED
ONLINE
INPUT
OUTPUT
OCTDEC
INSRTC
EXTRC
INBIT
IFBIT
NOT
OR
SKPFIL
TAPSEL
OCT
IFABIT
ANDBIT
OUTPT1
PNCH
PNCH12
RNDM1B
AM1BLD
AM1CNT
AM1RIM
AM1SET
AM1SUB
SIGN
AND
128 
CORE LAYOUT
(MAIN)
CORE #1
INPUT
EXTERNAL FUNCTION
NUCOP
TAPE IN
INTERNAL FUNCTION
CONCK
SAP
INPUT
OUTPUT
OCTDEC
ONLINE
SAVTPH
INSRTC
EXTRC
CORE # 2
ELEMENT
DESCRIPTION
CORE # 3
MATRICES
SETUP
TAPMV
TAPE IN
ELTPPR
TSTDMP
SYNELM
IPRELM
QCORE
TAPE IN
SSORDR
NUCOR
INSRTC
EXTRC
INBIT
IFBIT
TAPSEL
OCTDEC
INPUT
OUTPUT
ONLINE
OR
NOT
SKPFIL
IFABIT
TAPSEL
OUTPUT
INSRTC
EXTRC
OCT
OCTDEC
TAPSEL
OR
NOT
ONLINE
SKPFIL
INBIT
129
MAIN
EXTERNAL FUNCTION
INTERNAL FUNCTION
SAP
CORE # 4
DESIRED
RESULT
REDUCTION
EXTCHK
ELCHK
INSERT
REMOVE
LISTSC
RDRUM 1
RDRUM 2
CHECK
WRDRUM
WRDRUM 1
SKPFIL
IFABIT
ANDBIT
NOT
AND
AM1SET
IFBIT
AM1CNT
RNDM1B
AM1SUB
AM1RIM
OR
CORE # 5
PROLOGUE
PCODE
PRLOG
PSCAN
SSCAN
OUTPT 1
STORER
PSUB 1
PSUB 2
PNAME
SKIP
PLIST
FSUB
FSUB 1
FSUB 2
LABELS
INSRT
SSCAN 1
CALCS IS A DUMMY INTERNAL
FUNCTION
SKPFIL
EXTRC
RNDM1B
INSRTC
OUTPUT
PNCH
PNCH 12
130
MAIN
EXTERNAL FUNCTION
INTERNAL FUNCTION
SAP
CORE # 6
PROGRAM
GENERATOR
PSCAN
SSCAN
SSCAN 1
PSUB 1
PSUB 2
OUTPT 1
SKIP
QYFQ
LABELS
FSUB 2
FSUB 1
INSRT
QFFQ
QPQ
SKPFIL
NOT
TAPSEL
EXTRC
AND
INSRTC
SIGN
IFBIT
OR
OCTDEC
OUTPUT
PNCH12
CORE # 7
EPILOGUE
EPILOG
PSCAN
SSCAN
SSCAN 1
SKIP
FSUB
FSUB 1
FSUB 2
LABELS
INSRT
QPQ
SKPFIL
EXTRC
OUTPUT
OUTPT 1
PNCH12
INSRTC
OCTDEC
OR
QPQ AND QFQ ARE DUMMY INTERNAL
FUNCTIONS
CORE # 8
SELPGM
DIAGNOSTIC
131
INTERPRETATION OF BOXES
COMPUTATION
EXECUTION: REFERS TO A SUBROUTINE OR FUNCTION
WHENEVER STATEMENTS
FORM: THROUGH (STATEMENT NAME), FOR (VARIABLE) = (INITIAL VALUE), BY STEPS OF (NUMBER), UNTIL (CONDITION IS)
CONNECTORS: FOR REMOTE CONNECTIONS
FOR MARKING ENDS, OR SCOPES, OF ITERATIONS.
TAPE SYMBOL: FOR OPERATIONS INVOLVING TAPES.
FOR OPERATIONS INVOLVING DRUMS.
READ OR WRITE: USUALLY USED TO DENOTE PRINTED OUTPUT
CARD SYMBOL: FOR OPERATIONS INVOLVING HOLLERITH CARDS.
132
CORE 1 MAIN LOOP
SYSTEM SIMULATOR INPUT CORE
133
CORE 1
EXITS FROM CORE 1
ILLEGAL STATEMENTS
134
CORE 1
DECLARATION SECTION CONT'D
135
CORE 1
CONNECTION SECTION
136
CORE 1
SYNONYM SECTION
137
SYNONYM SECTION CONT'D
INPUT PARAMETER AND DESIRED RESULTS SECTION
138
CORE 1
I.P. & D.R. CONT'D
FUNCTION SUBSTITUTIONS
139
CORE 1
INTERNAL FUNCTION CONCK.
(CONSISTENCY CHECK)
140
SYSTEM SIMULATOR
CORE #2 ELDES
(ELEMENT DESCRIPTION PROCESSOR)
141
CORE #2 (CONT'D)
142
CORE #2 (CONT'D)
143
CORE #2 (CONT'D)
144
CORE #2 (CONT'D)
145
SYSTEM SIMULATOR
CORE NO. 3 SETUP CORE
146
IPRELM CORE NO. 3 INITIAL PROCESSING, INPUT PARAMETERS AND
DESIRED RESULTS
147
CORE # 3 SSORDR, MATRIX ORDERING ROUTINE
148
SSORDR (CONT)
149
TAPMV.(TH,TE)
TAPE MOVER
150
CORE #3 SYNELM SYNONYM ELIMINATION
151
152
CORE # 2 QCORE
TAPEIN, ELEMENT TAPE CHECKOUT ROUTINE
153
TSTDMP, TEST DUMP ROUTINE, FOR CHECKING OUT PROCEDURES
CORE 3
ELTPPR, ELEMENT DESCRIPTION TAPE PRINTER
154
D.R.R. PROGRAM
CONNECTION MATRIX IS RESTORED.
AT THIS POINT, ALL DESIRED RESULTS THAT CAN BE SATISFIED WITHOUT
COMPUTATION HAVE BEEN REMOVED.
155
D.R.R. PROGRAM
THE ASSOCIATIVE MEMORY IS SET TO
RECEIVE ENTRIES FOR STATEMENT COLLECTIONS.
THE PARM VECTOR HAS BEEN SET.
DRUM=2
DRMADD
= O..3
156
D.R.R. PROGRAM
Page 3
157
D.R.R. PROGRAM
NOTE: WHEN IPCNT=0 THE STATEMENT COLLECTION PRODUCES USEFUL
INFORMATION WITH NO EXTRA WORK.
NOTE: AT BACK 2, STATEMENT COLLECTION UTILITY HAS BEEN COMPLETELY
INVESTIGATED. THE PROCESS CONTINUES UNTIL THERE ARE NO MORE STATEMENT
COLLECTIONS, THEN SETEOF. TRANSFERS TO LOOP 3.
AT LOOP 5, THERE ARE SOME DESIRED RESULTS FOR WHICH NO STATEMENT
COLLECTION HAS BEEN ASSIGNED.
158
D.R.R. PROGRAM
159
D.R.R. PROGRAM
INTERNAL FUNCTION EXTCHK.
DR1(LI) = AND.(DR1(LI), NOT.(AND.(AND.(DR1(LI), EXT1), IP1(ELCOL(IAT, II1)))))
DR2(LI) = AND.(DR1(LI), NOT.(AND.(AND.(DR1(LI), EXT1), IP1(ELCOL(IAT, II2)))))
MAIN PROGRAM
DR1(LI) = AND.(DR1(LI), NOT.(AND.(AND.(DR1(LI), EXT1), IP1(ELCOL(1IAT)))))
INTERNAL FUNCTION ELCHK.
THROUGH LOOPC1, FOR II1 = ELREL(IAT, I)1, ELREL(1IAT, ELREL(IAT, I) ELREL(1IAT, II1)
162
INTERNAL FUNCTION REMOVE
Page 1
163
INTERNAL FUNCTION REMOVE
Page 2
TABLE(INDEX, 35) = DRMADD
THE NEXT SECTION IS USED WHENEVER THE
NUMBER OF STATEMENT COLLECTIONS EXCEEDS
AVAILABLE STORAGE. THE PARAMETER WITH THE
GREATEST NUMBER OF COLLECTIONS DISCARDS
ONE COLLECTION; MOST PROBABLY THE ONE WITH
THE LEAST WEIGHT.
165
INTERNAL FUNCTION LISTSC.
166
INTERNAL FUNCTION RDRUM1.
167
INTERNAL FUNCTION CHECK. (QQI)
INTERNAL FUNCTION INSERT.
INTERNAL FUNCTION RDRUM2.
INTERNAL FUNCTION WRDRUM
INTERNAL FUNCTION WRDRUM1
169
PROLOGUE MAIN PROGRAM
PROLOGUE GENERATION SECTION IS TREATED AS A SUBSECTION OF THE
PROLOGUE CORE.
170
INTERNAL FUNCTION PCODE
(CONSTRUCTS UNIQUE THREE CHARACTER CODES.)
171
INTERNAL FUNCTION PRLOG
172
INTERNAL FUNCTION
PSCAN (QIQ)
EXECUTE
OUTPUT
BFR(12II1, 12)
EXECUTE
PUNCH 12.
BFR( 12II1)
IIJ = IIJ + 1
173
INTERNAL FUNCTIONS SSCAN, SSCAN1 PAGE 1
QFFQ=FSUB2
QPQ=PSUB2
PCOUNT= 1
PCOUNT=O
SW=O
BFR((J26)/6)=
INSRTC (BFR ((21)/6),
J26(J2J1)/6), CHAR
174
INTERNAL FUNCTIONS SSCAN, SSCAN1
PRWRD = OR. (PRL1ST
(Jl),OCTDEC(ATTNO(J)))
BFRI((J41) 6) =
INSTRC(BFR( (J41)
6),J4((J41) 6)6,
EXTRC (PRWORD,
J3))
175
INTERNAL FUNCTION OUTPT1. (XXX)
YES
INTERNAL FUNCTION
STORER.
INTERNAL FUNCTION
INSRT.
BFR1(J21)/6)=INSRTC.
(BFR1(J21)/6), ((J21)/6)
*6, EXTRC. (QWRD, JJ))
177
INTERNAL FUNCTION PSUB1 (QQ)
SLP2, FOR
=12, 1
A(2000+A(1800+I3)
4 A(2000+A(l1800
\ +12)) j
178
INTERNAL FUNCTION PSUB2.
Page 1
PRBLN = 1B, ATBLN = OB, IDBLN = OB, I4 = 1
PROWRD = O, ATWORD = OC, IDWORD = D
NO
179
INTERNAL FUNCTION PSUB2
180
INTERNAL FUNCTION PSUB2
INTERNAL FUNCTION PSUB2
THROUGH PSBLP,
FOR II = ELREL
(1IAT,ELREL(IAT,1))
1,ELREL(IAT, II) /
\ELREL(IAT, 1) /
RCOUNT = RCOUNT + 1
ATTNO (RCOUNT) =
ELCOL(1IAT, 11
INTERNAL FUNCTION PSUB2 Page 4
183
INTERNAL FUNCTION PSUB2 Page 5
ERRNO = 6008
EXECUTE SELPGM. (DIAGNO)
BFR1((J41)/6)=
INSRTC (BFR1
(J41)/6), J4
=((J41)/6)*6,
EXTRC. (PRWORD,
JJ)
184
INTERNAL FUNCTION
PNAME. (QCNT)
185
INTERNAL FUNCTION SKIP (YY)
186
INTERNAL FUNCTION PLIST.
VECTOR VALUES CARD 2
REMARK CARD MASK
FOR LISTING I0
PARAMETERS
187
INTERNAL FUNCTION FSUB (QFQ.)
Page 1
188
INTERNAL FUNCTION FSUB (QFQ.)
189
INTERNAL FUNCTIONS FSUB1, FSUB2
YES
NO
NOT OK
190
INTERNAL FUNCTION  LABELS
SBL = SBL + 1
LOCLBL=LOCLBL
+ 1
191
PROGEN SECTION
NELNAM
CHECK TO SEE THAT THIS
COLLECTION IS UNIQUE:
192
PROGEN SECTION
193
EPILOGUE; MAIN PROGRAM AND INTERNAL FUNCTION
194
DIAGNOSTIC CORE
NO
YES
STEPWISE REGRESSION
FLOW DIAGRAM AND CORE LAYOUTS
196
CORE LAYOUTS
(MAIN)
(SUB)
STARTER
STARTER
1/2
2/2
TAPE 1
RDRTRM
PARAM
DATAIN
RDTRWT
CMTRWT
TERMIN
READ 1
PKTRM
TERMIN
STUDENT
TAPE 1
RDRTRM
GRADER
NRML
PRINT 4
ENTRM
PKTRM
TERMIN
RDRTRM
TAPE 1
SUBSUB
FORMAT
TRMCHK
ENTRM
XLOG
RAM2A
PICKV
PICKS
RAM2C
RAM2B
PICKE
EXP(3
TRMCHK
ENTRM
PFNCT
EXP(3
EXP(2
TAPE 1
PRINT 2
PRINT 3
SQRT
DGFRDM
VARSRT
VARCHK
MATRAN
RGRTRM
SEQPGM
SELPGM
SELPGM
TRMCHK
ENTRM
(FUNCTION)
SUBSUBSUB
SELPGM
SELPGM
ZFNCT
SELPGM
EDITOR
REGRESSION
REGRESSION
1/2
2/2
SUMSQ
RSDSUM
PRCRCN
RGRSSN
WINDUP
SELPGM
SEQPGM
SQRT
PRINT 1
PREDCT
TSTLVL
FLVL
DTAB
XTAB
ERROR
TAPE 1
197
CORE LAYOUTS CONT'D
(MAIN)
STATEMENT GENERATOR
(SUB)
PROCES
PROLOG
D3IMNSN
CTERM
RDRTRM
GENTRM
SUMEQN
SUBSUB
ERROR
GTRM
EXPNT
(FUNCTION)
SUBSUBSUB
SELPGM
198
STARTER PROGRAM
First Part
199
STARTER PROGRAM
Second Part
PARAM
NOVAR=NOIND + 1
NO Z = NOTRMS+1
200
DATAIN
201
CMTRWT
RDTRWT
TERMIN
202
TRMCHK.
ENTRM
RDRTRM
203
PKTRM
PICKV
FUNCTION
PICKE
PICKS
205
STUDENT PROGRAM
206
STUDENT PROGRAM (CONTINUED)
207
NRML
PRINT 4
208
EDITOR PROGRAM
J1 = NOINIT + 2  I
J2 = J1  I
INVAR(J1) = INVAR(J2)
209
ZFNCT
0
210
PFNCT
FUNCTION
FOR INTEGER POWERS, ROOTS AND RECIPROCALS.
211
REGRESSION PROGRAM
FOR THE STEPWISE
FITTING OF DATA
RESIDUAL SUMS
OF SQUARES
AND CROSS PRODUCTS
LOAD ARRAY
AND COMPUTE SUMS
OF SQUARES AND
CROSS PRODUCTS
PARTIAL CORRELATION
COEFFICIENTS
STEPWISE
REGRESSION
COMPLETE JOB AND
DECIDE WHAT TO
DO NEXT
212
SUMSQ
ARRAY(NOVAR+1, NOVAR+1) =
ARRAY(NOVAR+1, NOVAR+1) + WHT
213
RSDSUM
214
PRCRCN
215
RGRSSN
I.I
216
DGFRDM
217
COMBINED WITH VINSRT TO ALLOW INSERTION
OF TERMS, OR STANDARD OPERATION, AS DESIRED.
VARSRT
VARIABLE Il
NOT IN
EQUATION
VARIABLE Il
IN EQUATION
218
VARSRT (CONT'D)
219
COMBINED WITH VINCHK TO ALLOW INSERTION
OF TERMS, OR STANDARD OPERATION, AS DESIRED.
VARCHK
220
TSTLVL
221
MATRAN
0
0
222
RGRTRM
 ,,,
0
PREDCT
223
WINDUP
224
MAD OR FORTRAN STATEMENT GENERATING PROGRAM
225
EXPNT
226
PRINT 1
TAPE 1
FLVL
TWO VERSIONS AVAILABLE:
1) Functional Representation
2) Table Interpolation.
COMMENTS ON THE SYSTEM SIMULATOR FLOW DIAGRAMS
The flow diagrams for the System Simulator may be more easily
followed through the combined use of the comments presented here and
the references given to the main text. Where page numbers are parenthesized the page refers to associated explanation in the main text of
the paper.
The System Simulator Core 1 diagrammed on page 132 is charged
with the initial translation of the source program presented by the user. (51)
The first task is the initialization of the status of the machine including
the determination of the condition of the tape units and the blanking of
core and drums and the setting of switches and counters. Beginning then
at RESET1 the Simulator reads card after card into memory one at a time.
As each card is read into memory, the card information is scanned character by character using the procedures indicated in the scope of LOOP.
Blanks are ignored and illegal punctuation is detected. Legal punctuation
together with the action of the counter HOLCNT is used to establish the
type of statement being scanned and the action to be taken. Whenever more
than 6 characters have been found without finding legal punctuation, the
Simulator anticipates that a Declaration of some kind may be at hand. By
using indices K1 and K for the statement label array S, it is possible
to transfer directly to the appropriate section of the program for any
declaration and to the section of the declaration analysis appropriate
for the punctuation found.
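The scanning rules just described may be illustrated by a minimal sketch in Python, a modern language used here only for illustration and not the language of the Simulator. The set of legal punctuation marks, the name `scan_card` and the form of the returned values are assumptions made for the example. Only the ignoring of blanks, the rejection of illegal punctuation and the treatment of a run of more than 6 characters without punctuation as a possible Declaration are taken from the text.

```python
# Hypothetical sketch of the card-scanning loop; the names and the
# punctuation set are assumptions, not taken from the Simulator itself.
LEGAL_PUNCTUATION = set(",.()=")   # assumed set of legal marks
MAX_NAME_LENGTH = 6                # longer runs suggest a Declaration


def scan_card(card):
    """Scan one card image character by character.

    Returns (tokens, maybe_declaration). Blanks are ignored. A character
    that is neither alphanumeric nor legal punctuation raises ValueError.
    """
    tokens = []
    current = []               # characters seen since the last punctuation
    maybe_declaration = False
    for ch in card:
        if ch == " ":
            continue           # blanks are ignored
        if ch.isalnum():
            current.append(ch)  # len(current) plays the role of HOLCNT
            if len(current) > MAX_NAME_LENGTH:
                maybe_declaration = True
        elif ch in LEGAL_PUNCTUATION:
            if current:
                tokens.append("".join(current))
                current = []
            tokens.append(ch)
        else:
            raise ValueError("illegal punctuation: %r" % ch)
    if current:
        tokens.append("".join(current))
    return tokens, maybe_declaration
```

In this sketch a card such as `DESCRIPTION FINISHED` is flagged as a possible Declaration because its characters run past six without any punctuation being met.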
Because of the limitations of storage, the additional analysis
required for an Element Description is done by ELDES in core 2. Otherwise,
the declarations for Connections, Input Parameters, Desired Results,
Synonyms, Function Substitutions and the data following them can be
treated entirely within core 1. The diagrams on page 134 indicate
the settings of the switches for the processing of the various declarations. The Connection section on page 135 may be entered through S(6),
S(7), S(8), or S(9) depending upon the punctuation encountered in the
scanning loop for the counter index K. Within the section, the Boolean
variables CTO, CEL, CEID, CAT, CAID, and so on are used to retain the structure of
the statement being treated and thus to direct the processing of the
statement. As might be suspected from the mnemonic symbol names, CTO
is associated with the connective TO (23), CEL is associated with Element
name, CEID with Element Identifier, CAT with Attachment name and CAID
with Attachment Identifier. The diagram on page 135 indicates the treatment of the special unary and binary elements. (24) The subroutine CON CK
on page 139 is for the purpose of establishing the consistency of the
Connection matrix and to assure in this way that analysis of ambiguously
defined systems will not be attempted.
The Synonym section on page 136 analyzes the form of the synonym
encountered (42) and saves the result for later elimination of synonyms
from the Connection matrix. In a similar but simpler way, the Input
Parameters and Desired Results section on page 137 retains its information in core for later use while the Function Substitution section on
page 138 records its information on magnetic tape.
The consistency check subroutine CON CK on page 139 is used to
insure the detection of ambiguity in the Connection matrix. If more than
one connection statement has been given for a given identified attachment
point, CON CK determines whether the associated attachment is such that
the connection is a duplicate of an earlier connection. If this is true,
the subroutine removes the duplicate statement and compresses the matrix.
Otherwise, the connection is ambiguous and the error is reported.
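The rule applied by CON CK may be sketched as follows. This is an illustration in Python under stated assumptions, not the routine itself: a connection is represented here as a hypothetical pair (attachment point, target), and the name `consistency_check` is invented for the example.

```python
# Hypothetical sketch of the CON CK rule. Representing a connection as a
# (point, target) pair is an assumption made for illustration.
def consistency_check(connections):
    """Remove duplicate connections and report ambiguous ones.

    `connections` is a list of (attachment_point, target) pairs in the
    order the statements were read. Returns the compressed list.
    Raises ValueError if one attachment point is given two different targets.
    """
    seen = {}            # attachment point -> target already recorded
    compressed = []
    for point, target in connections:
        if point not in seen:
            seen[point] = target
            compressed.append((point, target))
        elif seen[point] == target:
            continue     # duplicate of an earlier connection: drop it
        else:
            raise ValueError("ambiguous connection at %r" % (point,))
    return compressed
```

A repeated identical statement is silently dropped, which corresponds to compressing the matrix, while two different targets for one attachment point stop the analysis.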
The Simulator Core 2 known as ELDES is the Section of program
used whenever an Element Description is encountered. See page 140.
After saving a section of the data on tape to provide space for the processing of the Element Description and after initializing counters and
switches, the analysis of the Element Description begins. Since several
assertions are recognized within the scope of the Element Description, a
scanning occurs of each card to extract the information required. A somewhat different structure for handling the punctuation is employed. This
structure is somewhat more convenient for the Element Description processing because of the periodic occurrence of the M.A.D. statements which
must be passed untouched to the Library tape. Only the encounter of the
Declaration DESCRIPTION FINISHED can return the program to core 1 after
completing the processing at S(4) on page 143.
When the end of input data is detected by the occurrence of an
End Of File mark or by the declaration NEXT SET OF DATA, the processing
goes to the Setup Core 3. In the diagrams beginning on page 145, the
Setup Core eliminates synonyms (SYN ELM subroutine), constructs the
Boolean parameter words (51)(IPR ELM subroutine), constructs the indirect
addressing lists for the connection matrix (52) (SS ORDR subroutine) and
positions the tapes for further processing. The details of these subroutines may be followed by reference to the associated text as indicated.
Several other small utility routines such as the tape mover routine TAPMV,
and the check out routines TAPEIN, TST DMP (Test Dump) and EL TP PR
(Element Tape Printer) are shown for completeness.
The Desired Result Reduction Program (core 4) begins on page
154. After initialization, the search for parameters for which program
must be generated begins. First, all parameters are removed that can be so
treated without introducing computation. This may be done by matching
the requested parameter with an Input Parameter or by taking advantage
of the Broad Scope concept. (27,56) If the request remains after the
execution of EXT CHK (Scope Check), the associative memory (61) is set to
receive entries for every pertinent Statement Collection. Since tape
movement is very time consuming, the flow diagram on page 155 computes
the shortest path for the tape movement. The diagram on page 156 analyzes
each collection of statements to determine its utility in yielding the
desired results and checks to be certain that any special conditions are
also satisfied. The diagram on page 157 selects the collection to be
used in the program and inserts the collection in the program (INSERT
subroutine) and removes the yielded results and adjusts the connection
matrix (REMOVE subroutine). The section on page 158 details the probabilistic selection mechanism. (60) Finally, the diagram on page 159 details the testing procedure for the completion of the program generation
and the repetition of the process if required. The routines shown from
page 160 through page 168 detail the various subroutines indicated in the
main core 4.
When the program has been completed by core 4, the remaining
task is the production of the object program itself. (63) The generation
is accomplished in three core loads of program. The first of these
generates the Prologue section. (47) As a part of this, the unique parameter code is constructed by PCODE on page 170. (35) The routine PRLOG
accomplishes the actual prologue generation using the primary scan PSCAN
and the secondary scan SSCAN. Since multiple copies of some statements
must be generated for input-output requirements, the use of two scanning
routines to analyze an input buffer BFR and generate an output buffer
BFR1 proves to be very effective in saving generation time. The routines
PSCAN on page 172 and SSCAN and SSCAN1 on page 173, along with the smaller
routines for output (OUTPT1), saving the Epilogue section (STORER), the
word insertion routine (INSRT), the input-output statement generator
(PSUB1), the general statement generator (PSUB2)(33,40), the parameter
name generator (PNAME), the continuation card routine (SKIP) and the
parameter dictionary output routine (PLIST), are structured to be included
wherever they are required in any of the program generation cores. The
routine FSUB is charged with the determination of the contents of the C
symbols (33) and produces either floating statement labels or function
substitutions depending on the double asterisk symbol. Depending on the
decision, the routine FSUB1 or FSUB2 on page 189 will be used for function substitutions or the routine LABELS on page 190 will be used for
floating statement labels.
The program generation section PROGEN on page 191 makes use
of the common routines mentioned above as indicated in the diagram. The
generation must invert the order of the program found by the Desired
Result Reduction Program. (63) The section also checks to eliminate any
identical collections that may have been selected by the Desired Result
Reduction Program. Finally, upon completion of the Program Generation,
the Epilogue section is called upon to provide the final cards required
by the program to transfer control back to the Prologue for testing and
completion of the simulation. (47)
COMMENTS ON THE STEPWISE REGRESSION FLOW DIAGRAMS
The flow diagrams for the Stepwise Regression may be more
easily followed through the combined use of the comments presented
here and the references given to the main text. Where page numbers
are parenthesized the page refers to associated explanation in the
main text of the paper.
The Starter program shown on page 198 is charged with the
responsibility of entering and storing all of the control parameters
and data and accumulated learning for each problem. The initial section may also save the accumulated learning and terms from an earlier
problem depending upon the test of NOEXIT. A nonzero NOEXIT corresponds to an unsuccessful execution of the previous problem, so it is
then necessary to retain these arrays for later restarting of the problem. The subroutine PARAM brings all of the control parameters into
storage. DATA IN reads in and saves on magnetic tape all of the raw
data supplied for the problem. Next, depending upon the parameter
IFTRWT, the accumulated learning is either read in from an earlier trial
through RDTRWT or initialized to equal probability by CMTRWT. The various possible types of analysis (101) are initialized by the program
following the test of the parameter IFCNST. The parameter NOINT controls the input of suggested terms from the supplied data deck. Finally,
READ 1 calls in and saves the desired name for the subroutine to be
generated upon completion of a successful analysis. If enough terms
have been given in the data, the program transfers control directly to
the EDITOR core. Otherwise, the second section of the starter program
is selected to generate enough terms to fill the allowed regression matrix.
The second section of the Starter program conducts a selection
of terms to fill the regression matrix by using the accumulated learning
as it has been initialized by the first section. The routine PK TRM
selects each new term by generating three term selection parameters (122).
TERM IN then inserts the term in the set of trial terms or returns for
additional attempts by PK TRM if the term selected should happen to be
identical to any previously selected and entered term.
The routines PARAM, DATA IN, CMTRWT, RDTRWT and the routine
TERM IN shown on pages 199 through 201 execute their tasks in the straightforward manner shown. TERM IN is a "skeleton" routine (a routine that consists of a sequence of calls upon other routines) calling first upon TRM CHK
to verify the uniqueness of the term under consideration and then calling
upon ENTRM for the entry of the term if it is unique. TRM CHK on page 202
checks the uniqueness of the term by searching the list of previously
entered terms as it has been built up on the magnetic drum.
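The cooperation of the term selector with the uniqueness check may be sketched as below. The sketch is in Python and rests on assumptions: the callable `pick_term` stands in for PK TRM, an in-memory set stands in for the list of terms kept on the magnetic drum, and the limit `max_attempts` is a safeguard added for the example that is not described in the text.

```python
import random

# Hypothetical sketch: pick_term stands in for PK TRM, and the set `seen`
# stands in for the list of entered terms held on the magnetic drum.
def fill_trial_terms(pick_term, needed, max_attempts=10000):
    """Collect `needed` distinct terms, drawing again whenever a duplicate appears."""
    entered = []
    seen = set()
    attempts = 0
    while len(entered) < needed:
        attempts += 1
        if attempts > max_attempts:
            raise RuntimeError("could not find enough unique terms")
        term = pick_term()
        if term in seen:        # uniqueness check: term already entered
            continue
        seen.add(term)          # entry of the unique term
        entered.append(term)
    return entered


# Example: draw 4 distinct function numbers out of 20 with equal probability.
rng = random.Random(1)
example_terms = fill_trial_terms(lambda: rng.randint(1, 20), 4)
```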
The selection mechanism for the program selected terms is contained in the diagrams for PK TRM, PICKV, PICKE and PICKS. The skeleton
for the selection process is the routine PK TRM which must choose the
interaction order using PICKV, the variables to be used in the interaction using PICKS and the function of the chosen variable using PICKE
within PICKS. The probabilistic selection mechanism (60,61) is the basis
of all three of the routines. Since PICKS must select the number of
variables specified by PICKV and since this total could include most or
all of the allowed variables, the mechanism is modified in this case to reduce the set of possible variables after each selection. In this way
every trial will produce a unique new variable.
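The modification described for PICKS amounts to weighted selection without replacement. A minimal sketch in Python follows. The name `pick_variables`, the representation of the accumulated learning as a plain list of weights and the use of a running sum for the draw are assumptions made for illustration.

```python
import random

# Hypothetical sketch of selection without replacement; the weight list
# stands in for the relevant part of the accumulated learning.
def pick_variables(weights, count, rng=random):
    """Choose `count` distinct indices with probability proportional to weight.

    After each selection the chosen index is removed from the candidate
    set, so that no variable can be selected twice.
    """
    if count > len(weights):
        raise ValueError("cannot pick more variables than are allowed")
    candidates = list(range(len(weights)))
    chosen = []
    for _ in range(count):
        total = sum(weights[i] for i in candidates)
        r = rng.random() * total
        running = 0.0
        pick = candidates[-1]          # fallback guards against round-off
        for i in candidates:
            running += weights[i]
            if r < running:
                pick = i
                break
        candidates.remove(pick)        # reduce the set of possible variables
        chosen.append(pick)
    return chosen
```

Because the candidate set shrinks after every draw, a request for all of the allowed variables terminates after exactly that many draws rather than relying on repeated rejection of duplicates.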
The Student program (so termed because the learning mechanism
is located in this section) on pages 205 and 206 carries out the simple
learning process. After arranging to carry forward all successful terms
from the previous trial, the student program grades the previously used
selection mechanism using the routine GRADER. The one exception to the
grading occurs when the previous trial failed due to an impending overflow
or underflow of the floating point data. In this case, only the faulty
term is graded. When the grading is completed, the selection mechanism
is normalized so that the mean probability is unity. Finally, the terms
for the next trial are chosen in the same manner as was done in the
second section of the starter program. If the selection of the student
program followed the successful completion of an analysis, the student
program also retains the accumulated learning for future use.
The routine GRADER on page 206 utilizes the "half-life" concept in rewarding or penalizing the selection mechanism. (105,122) This
method preserves the positive probability of a term even after repeated
failure. This, in turn, insures that every possible term receives some
consideration. The routines NRML and PRINT 4 are shown on page 207.
These routines carry out the normalization of the accumulated learning
matrix and the printing of the status of the matrix.
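The grading and normalization steps may be sketched as follows. The sketch is in Python and is only one possible reading of the "half-life" idea: it is assumed here that a penalty halves a weight and a reward doubles it, and the names `grade` and `normalize` are invented for the example. What the text does state is that a weight stays positive after repeated failure and that normalization makes the mean probability unity.

```python
# Hypothetical sketch. The factor of 2 for both reward and penalty is an
# assumption; the text states only that a "half-life" concept is used.
def grade(weights, index, success, factor=2.0):
    """Return a copy of `weights` with one entry rewarded or penalized."""
    new = list(weights)
    if success:
        new[index] = new[index] * factor
    else:
        new[index] = new[index] / factor   # halving never reaches zero
    return new


def normalize(weights):
    """Scale the weights so that their mean is unity."""
    mean = sum(weights) / len(weights)
    return [w / mean for w in weights]
```

A multiplicative penalty of this kind can reduce a weight indefinitely without making it zero, which is the property needed for every term to retain some chance of selection.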
The EDITOR program on page 208 processes the raw observation
data to form the terms chosen by the student program or given by the
user. If the user should wish to supply a function that is not one of
the set allowed by the "standard" ZFNCT and PFNCT routines, the following modifications should be made:
1) ZFNCT controls the interaction of the functions of
the variables. If anything other than a cross-product
interaction is desired, the change must be incorporated
in the B loop.
2) Ordinarily, the only changes desired will be made in
PFNCT. This is an interpretive routine that selects
the desired function for the variable by interpreting
the function number given in the term matrix. If
special terms were desired for some value of J, the
test for this value could be inserted after the entry
and the appropriate action taken. The user should also
take care to cause the proper printing for the special
function to occur for use in interpreting the results
of the regression analysis. An appropriate change
should also be made in the subroutine generation program
on page 224. This change would be made in the general
term generator GEN TRM.
The stepwise regression program is structured as shown on
page 211. Each of the major sections is written as a subroutine to
allow for increased flexibility in keeping the analysis abreast of the
best current methods. SUMSQ on page 212 loads the regression matrix
with the processed data from the Editor program. In the loading process, the sums of the squares and the cross-products are accumulated.
Printing of the result of the loading is available under the control
of the parameter IFRAW. If the type of analysis selected by the parameter IFCNST requires the adjustment of the sums of squares and
cross-products about the means, the residual sum routine RSDSUM on page
213 carries out the computation.
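The accumulation performed during loading and the later adjustment about the means may be sketched as below. This is an illustration in Python, not the SUMSQ or RSDSUM routines themselves. The names `raw_sums` and `adjust_about_means` and the use of unweighted observations are assumptions made for the example.

```python
# Hypothetical sketch of the loading and adjustment steps. Observations are
# taken as unweighted here, which is an assumption made for illustration.
def raw_sums(rows):
    """Accumulate the sums, and the sums of squares and cross-products.

    `rows` is a list of observations, each a list of m values (the terms
    followed by the dependent variable). Returns (n, sums, sscp).
    """
    m = len(rows[0])
    sums = [0.0] * m
    sscp = [[0.0] * m for _ in range(m)]
    for row in rows:
        for i in range(m):
            sums[i] += row[i]
            for j in range(m):
                sscp[i][j] += row[i] * row[j]
    return len(rows), sums, sscp


def adjust_about_means(n, sums, sscp):
    """Return the sums of squares and cross-products taken about the means."""
    m = len(sums)
    return [[sscp[i][j] - sums[i] * sums[j] / n for j in range(m)]
            for i in range(m)]
```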
The product moment coefficient of correlation for the terms
with each other and with the dependent variable is calculated by the
partial correlation routine PRCRCN. This routine also makes available
the standard deviations for each of the terms and the dependent variable,
if desired.
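The product moment coefficients follow directly from the sums of squares and cross-products about the means. A minimal sketch in Python is given below. The name `correlations` is invented for the example, and the use of n - 1 as the divisor for the standard deviations is an assumption, since the text does not give the divisor.

```python
import math

# Hypothetical sketch; `adjusted` holds sums of squares and cross-products
# about the means. The divisor n - 1 is an assumption.
def correlations(adjusted, n):
    """Return (r, std): the correlation matrix and the standard deviations."""
    m = len(adjusted)
    for i in range(m):
        if adjusted[i][i] <= 0.0:
            raise ValueError("term %d has no variance" % i)
    r = [[adjusted[i][j] / math.sqrt(adjusted[i][i] * adjusted[j][j])
          for j in range(m)] for i in range(m)]
    std = [math.sqrt(adjusted[i][i] / (n - 1)) for i in range(m)]
    return r, std
```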
The regression analysis itself displays the interesting structure shown on page 215. It should be noted that only the degree of freedom routine DGFRDM and the matrix transformation routine MATRAN have a
single logical connection to the switch N. The switch N directs the
analysis through successive steps of sorting the terms in VARSRT (78),
checking the variance contribution in VARCHK and transforming the matrix.
After DGFRDM, VARSRT or VARCHK the analysis may be terminated depending
upon the results through the routine RGRTRM. The degree of freedom routine DGFRDM on page 216 insures that the number of degrees of freedom
is continually revised as the analysis progresses. Whenever the variance of the dependent variable is nonpositive due to roundoff error or
machine error or whenever there are no more degrees of freedom remaining
the analysis is terminated.
VARSRT on pages 217 and 218 sorts the variables into the sets
Xi 1 and Xi 2 (78). Depending upon the result of the sorting, the selected term will be checked by the F level test and the analysis will
proceed through VARCHK or the analysis will terminate if no more terms
are available.
The routine VARCHK compares the F level of the selected term
to insure that the risk of committing an insertion or deletion error is
not exceeded (81-83). If the requirements are satisfied, the regression
matrix is transformed by MATRAN (31, 120) using the relations:
$$A_{ij} = A_{ij} - \frac{A_{ik} A_{kj}}{A_{kk}}, \qquad i = 1, \ldots, n; \quad j = 1, \ldots, n; \quad i \neq k; \quad j \neq k$$
$$A_{ik} = -\frac{A_{ik}}{A_{kk}}, \qquad A_{ki} = \frac{A_{ki}}{A_{kk}}, \qquad i \neq k$$
$$A_{kk} = \frac{1}{A_{kk}}$$
where $n$ is the number of terms.
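The transformation relations given above may be sketched in Python as follows. This is an illustration and not the MATRAN routine. The name `transform` is invented, every right-hand side is taken to use the values held before the step, and the negative sign on the pivot column is assumed from the usual in-place Gauss-Jordan form because the printed relation is damaged in the source.

```python
# Hypothetical sketch of one transformation step on pivot term k. The sign
# convention for the pivot column is an assumption (see the lead-in).
def transform(a, k):
    """Return a new matrix obtained from `a` by the pivot step on term k."""
    n = len(a)
    pivot = a[k][k]
    if pivot == 0:
        raise ZeroDivisionError("pivot element is zero")
    b = [row[:] for row in a]
    for i in range(n):
        for j in range(n):
            if i != k and j != k:
                b[i][j] = a[i][j] - a[i][k] * a[k][j] / pivot
    for i in range(n):
        if i != k:
            b[i][k] = -a[i][k] / pivot   # pivot column
            b[k][i] = a[k][i] / pivot    # pivot row
    b[k][k] = 1.0 / pivot
    return b
```

Under this convention the step is its own inverse, so applying it a second time on the same k restores the matrix. That is what allows one relation to serve both for inserting a term and for deleting it. Applying the step once for every k yields the inverse of the original matrix.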
The regression analysis is terminated by RGRTRM. The result
of the final step is printed, and the predictions of the data using the
regression equation are printed, on command, by PREDCT.
Upon completion of the regression analysis, WINDUP checks the
postulated criteria given by the user against the properties of the
generated relation. If further analysis is indicated and allowed by
the number of trials, the program returns to the student program. Otherwise, the subroutine for the equation generated by the last trial is produced by the statement generating program on page 224 after which the
program returns to the starter program.
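The control exercised by WINDUP over the sequence of trials may be sketched as a simple loop. The sketch is in Python and is hypothetical throughout: the callables `run_trial` and `criteria_met` stand in for the regression analysis and for the test of the user's criteria, and the form of the returned values is an assumption.

```python
# Hypothetical sketch of the trial loop; both callables are stand-ins.
def run_trials(run_trial, criteria_met, max_trials):
    """Repeat trials until the criteria are met or the allowance is used up.

    Returns (result, trials_used, satisfied). The result of the last trial
    is returned in either case, since a subroutine is produced from it.
    """
    result = None
    for trial in range(1, max_trials + 1):
        result = run_trial(trial)
        if criteria_met(result):
            return result, trial, True
    return result, max_trials, False
```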
ILLUSTRATIVE EXAMPLE
The Stepwise Regression Program with Simple Learning may be
best appreciated by presenting the program with a set of data for which
a predicting equation is desired. As a substitute for that experience,
the following example is presented. This problem was one of many presented to the program during its development for the purpose of verifying the validity of the procedure. Since data arising from experimental
sources always contains some random error components, the example was constructed using a normally distributed random number subroutine to add to
a defined function the effect of random error.
The function used in this illustration was the following:
Y = 4.0 * X**2 - 16.0 * X + 15.0 + (EPSILON)
where EPSILON is a random normally distributed error with a mean value
of zero and a standard deviation of 0.25.
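Data of the kind used in this illustration may be generated as sketched below. The sketch is in Python and is not the subroutine used in the original work. The name `make_data`, the range chosen for X, the number of observations and the seed are all assumptions. The sign of the linear term is taken as negative, which is how the damaged equation in the source is read here.

```python
import random

# Hypothetical sketch of constructing the example data. The range of X,
# the seed and the function name are assumptions made for illustration.
def make_data(n, sigma=0.25, seed=0):
    """Return n pairs (x, y) with y = 4 x**2 - 16 x + 15 + normal error."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(0.5, 4.0)
        epsilon = rng.gauss(0.0, sigma)   # mean zero, standard deviation sigma
        y = 4.0 * x ** 2 - 16.0 * x + 15.0 + epsilon
        data.append((x, y))
    return data
```

Setting `sigma` to zero reproduces the defined function exactly, which is convenient for checking a fitting procedure before the random error is added.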
In order to show the action of the Simple Learning mechanism,
the control parameters are so chosen as to allow 20 of the "standard"
functions of X but only allow the regression analysis to have access to
4 of these functions at any one time. Thus on a fairly simple scale
the behavior of the Simple Learning mechanism may be observed. A
complete discussion of the data deck and control cards may be found in the
Communication of the Problem to the Program beginning on page 98.
Obviously, the capacity of the Stepwise Regression Program would allow the
solution of this problem in a single trial if so desired. Random
restarting of the problem should be expected to produce variations in the
sequence and number of trials from those given here, but the final result
will be the same.
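The restriction to 4 of the 20 functions on any one trial may be pictured
with the following sketch. It is not the Simple Learning mechanism itself:
a uniform random draw stands in for the learned selection, and the names
choose_trial_terms, candidates and retained are hypothetical.

```python
import random


def choose_trial_terms(candidates, retained, n_slots, rng):
    """Fill n_slots trial terms: keep retained terms, draw the rest at random.

    Assumption: terms that proved useful on an earlier trial are carried
    forward, and the remaining slots are filled without repetition from
    the other allowed functions.
    """
    kept = list(retained)[:n_slots]
    pool = [term for term in candidates if term not in kept]
    new_terms = rng.sample(pool, n_slots - len(kept))
    return kept + new_terms


candidates = [f"F{i}" for i in range(1, 21)]  # 20 allowed functions (labels only)
rng = random.Random(0)

first = choose_trial_terms(candidates, [], 4, rng)
# Suppose the first two terms of the first trial entered the equation.
second = choose_trial_terms(candidates, first[:2], 4, rng)
print(first, second)
```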
The data were supplied without any accumulated learning deck
or suggested terms. On the first trial, the terms X**6, X**2, X**5 and
X**4 were chosen randomly from among the terms allowed by the 20 functions
specified on the control card. Of these, X**2 and X**5 were sufficiently
correlated with the data to be included in a predicting equation. The
postulated standard error and coefficient of determination criteria
were not satisfied, however, so the learning mechanism was called into
the computation to assist the selection of new terms to be tested.
On the second trial, neither of the new terms, X**4 and X**(1/4), was
a better choice than the terms X**2 and X**5 found previously. The third
trial suggested X**(1/5) and X**3 as new candidates and retained X**3 in
addition to the earlier X**2 and X**5.
On the fourth attempt, only one new term was suggested for
trial, X. Since X**2 and X were together far superior to any other
combination of the trial terms for this attempt, the resulting equation:
     Y = 3.99999857 * X**2 - 15.999832 * X + 15.0000304
satisfied both criteria and the analysis was terminated for the problem.
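The fit obtained on the fourth attempt can be imitated with an ordinary
least-squares calculation on the two terms X**2 and X. The sketch below is
not the stepwise procedure of the program; it solves the normal equations
directly, uses noise-free points of the defined function in place of the
data deck, and takes the degrees of freedom for the standard error as the
number of points less the number of coefficients, which is an assumption.
The names solve, fit and fit_quality are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit(points, terms):
    """Least-squares coefficients [constant, c1, c2, ...] for the given terms."""
    rows = [[1.0] + [term(x) for term in terms] for x, _ in points]
    ys = [y for _, y in points]
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(ata, aty)


def fit_quality(points, terms, coeffs):
    """Return (standard error of Y, coefficient of determination)."""
    n, k = len(points), len(coeffs)
    mean_y = sum(y for _, y in points) / n
    sse = sst = 0.0
    for x, y in points:
        pred = coeffs[0] + sum(c * term(x) for c, term in zip(coeffs[1:], terms))
        sse += (y - pred) ** 2
        sst += (y - mean_y) ** 2
    return (sse / (n - k)) ** 0.5, 1.0 - sse / sst


# Noise-free points of Y = 4 X**2 - 16 X + 15 (stand-in for the data deck).
points = [(float(x), 4.0 * x**2 - 16.0 * x + 15.0) for x in range(-5, 6)]
terms = [lambda x: x**2, lambda x: x]

coeffs = fit(points, terms)
std_error, determination = fit_quality(points, terms, coeffs)
print(coeffs, std_error, determination)
```

Without the random error the coefficients 15, 4 and -16 are recovered to
within rounding; with the error added they differ slightly, as in the
equation found by the program.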
This example, while too simple to be of much practical value,
is a fairly reasonable illustration of the technique. The value is even
more apparent when the technique is applied to more complicated
situations. The execution time for this problem was approximately five
minutes, with about half of that time going to the loading of the magnetic
program tape on the tape drive and initiating the problem. More
representative figures for more complicated problems run on the order of
fifteen to twenty minutes, with the actual time depending heavily upon the
number of data sets.
[Listing of the data deck. The control cards are reproduced as scanned. In the data cards, ? marks a digit that is illegible in the scanned original; the X value on card 30 shows only seven digits in the scan.]
* DATA
1 EXAMPLE PROBLEM FOR RELATION Y =4.*X**2 -16.X +15.+(EPSILON)
1.0001.05.05 1 20 4 1.099.5 10
1i1111l 13
3 0.3E01 0.3E01 0.15E01
1 3 4 5 3 122 4 1
23295 8185 13 69
FORMAT(F5.0,2F15.8)
   1.-0.77635685E 01 0.38030785E 03     DATA0001
   2. 0.72477829E 01 0.10915799E 03     DATA0002
   3. 0.18261584E 01-0.87888591E 00     DATA0003
   4. 0.68130090E 01 0.91661219E 02     DATA0004
   5.-0.31857?31E 01 0.10656648E 03     DATA0005
   6. 0.13964993E 01 0.45702802E 00     DATA0006
   7. 0.29685392E 01 0.27526532E 01     DATA0007
   8.-0.93314382E 01 0.51260412E 03     DATA0008
   9.-0.24912679E 01 0.79685632E 02     DATA0009
  10. 0.32323989E 01 0.50756451E 01     DATA0010
  11.-0.31791103E 01 0.10629232E 03     DATA0011
  12.-0.86843699E 01 0.45562153E 03     DATA0012
  13.-0.80777898E 01 0.40524609E 03     DATA0013
  14.-0.65720271E 01 0.29291764E 03     DATA0014
  15. 0.86774063E 01 0.17735252E 03     DATA0015
  16.-0.82241484E 01 0.41713149E 03     DATA0016
  17.-0.37972297E 01 0.13343099E 03     DATA0017
  18. 0.61368134E 01 0.67453766E 02     DATA0018
  19. 0.70016510E 01 0.99067087E 02     DATA0019
  20. 0.85431173E 01 0.17025098E 03     DATA0020
  21. 0.30293?79E 01 0.32386171E 01     DATA0021
  22.-0.71202164E 01 0.33171670E 03     DATA0022
  23.-0.50384158E 01 0.19715650E 03     DATA0023
  24. 0.47421?81E 01 0.29077920E 02     DATA0024
  25. 0.43137113E 01 0.20414905E 02     DATA0025
  26.-0.10008156E 01 0.35019451E 02     DATA0026
  27.-0.83967?83E 01 0.43136642E 03     DATA0027
  28. 0.98397?10E 01 0.24484918E 03     DATA0028
  29.-0.46616156E 01 0.17650787E 03     DATA0029
  30.-0.9974075E 01  0.57251016E 03     DATA0030
  31. 0.81320423E 01 0.14940762E 03     DATA0031
  32.-0.29972?48E 01 0.98891438E 02     DATA0032
  33. 0.45478?62E 01 0.24966075E 02     DATA0033
  34. 0.28013?60E 01 0.15693002E 01     DATA0034
  35.-0.77303?18E 01 0.37771855E 03     DATA0035
  36. 0.66956154E 01 0.87198816E 02     DATA0036
  37. 0.71400?23E 00 0.56152710E 01     DATA0037
  38.-0.63002?88E 01 0.27457628E 03     DATA0038
  39. 0.11567351E 01 0.18445273E 01     DATA0039
  40. 0.26133195E 01 0.50497638E 00     DATA0040
  41.-0.33071?46E 01 0.11166356E 03     DATA0041
  42.-0.62067?32E 01 0.26840033E 03     DATA0042
  43.-0.92314617E 01 0.50358116E 03     DATA0043
  44. 0.35399178E 01 0.84858461E 01     DATA0044
  45.-0.35085416E 01 0.12037566E 03     DATA0045
  46.-0.16950303E 01 0.53612783E 02     DATA0046
  47.-0.26867548E 01 0.86862338E 02     DATA0047
  48.-0.77692993E 01 0.38075562E 03     DATA0048
EXAMPLE PROBLEM FOR RELATION Y =4.*X**2 -16.X +15.+(EPSILON)
STARTER PROGRAM
PROBLEM NO. 1
RAW DATA
[The printout lists OBSERVATION NO. 1 through 48, each with WEIGHT = 1.00000 and the values X( 1) and X( 2) read from the data cards. The individual values are illegible in the scanned original.]
EDITOR PROGRAM
PROBLEM NO. 1
SOLUTION PASS NO. 1
NO. OF INDEPENDENT VARIABLES = 1
NO. OF TRIAL TERMS = 4
TRIAL TERM DEFINITIONS FOR PASS NO. 1
TERM( 1) = INTERACTION OF ORDER 1, WHERE THE COMPONENTS ARE DEFINED TO BE
     COMPONENT( 1) = X( 1).P. [exponent illegible]
TERM( 2) = INTERACTION OF ORDER 1, WHERE THE COMPONENTS ARE DEFINED TO BE
     COMPONENT( 1) = X( 1).P. 2
TERM( 3) = INTERACTION OF ORDER 1, WHERE THE COMPONENTS ARE DEFINED TO BE
     COMPONENT( 1) = X( 1).P. 5
TERM( 4) = INTERACTION OF ORDER 1, WHERE THE COMPONENTS ARE DEFINED TO BE
     COMPONENT( 1) = X( 1).P. [partly illegible] 4
TERM( 5) = X( 2), DEPENDENT VARIABLE.
STEPWISE REGRESSION
NO. OF DATA SETS = 48
PROBABILITY OF
1) ERROR IN ENTERING TERM = 5.0000 0/0
2) ERROR IN DELETING TERM = 5.0000 0/0
WEIGHTED DEGREES OF FREEDOM = 48.00
STANDARD ERROR OF Y = 0.166283123E 03
STEP NO. [illegible]
TERM ENTERED [illegible]
F LEVEL = 0.647523925E01
STANDARD ERROR OF Y = 0.5517846?7E 02
COEFF OF DETERMINATION = 0.894571?10E 00
MULTIPLE CORLTN COEFF = 0.945817955E 00
CONSTANT TERM = 0.168174595E 02
TERM NO.     COEFFICIENT       STD ERR OF COEFF
TERM 2     0.399137430E 01     [illegible]
TERM 3    -0.258267805E-02     0.273117624E-03
REGRESSION TERMINATED AFTER 2 STEPS.
DIAGONAL ELEMENTS
VAR. NO.   VALUE
[Four values; illegible in the scanned original.]
PREDICTED RESULTS VERSUS DATA POINTS
OBS. NO.   PREDICTIONS (Y - SIGMA, Y, Y + SIGMA)   DATA POINTS   DEVIATION (DATA - Y)   PERCENT
[Table for observations 1 through 48; the columns are scattered and illegible in the scanned original.]
MAXIMUM ABSOLUTE DEVIATION = 0.9631????E 02, (SEE OBS. NO. 30., LINE NO. 30)
MAXIMUM ABSOLUTE PERCENT DEVIATION = 8566.050, (SEE OBS. NO. 40., LINE NO. 40)
POSTULATED CRITERIA
STANDARD ERROR OF Y = 0.5000000E 00
COEFF OF DETERMINATION = 0.9990000E 00
FITTED CURVE PROPERTIES
STANDARD ERROR OF Y = 0.5517847E 02
COEFF OF DETERMINATION = 0.894571?E 00
FITTED CURVE MEETS NEITHER CRITERIA.
PASS NUMBER 2 BEGUN FOR PROBLEM NO. 1
10 TOTAL PASSES ALLOWED.
EDITOR PROGRAM
PROBLEM NO. 1
SOLUTION PASS NO. 2
NO. OF INDEPENDENT VARIABLES = 1
NO. OF TRIAL TERMS = 4
TRIAL TERM DEFINITIONS FOR PASS NO. 2
TERM( 1) = INTERACTION OF ORDER 1, WHERE THE COMPONENTS ARE DEFINED TO BE
     COMPONENT( 1) = X( 1).P. 2
TERM( 2) = INTERACTION OF ORDER 1, WHERE THE COMPONENTS ARE DEFINED TO BE
     COMPONENT( 1) = X( 1).P. 5
TERM( 3) = INTERACTION OF ORDER 1, WHERE THE COMPONENTS ARE DEFINED TO BE
     COMPONENT( 1) = X( 1).P. 4
TERM( 4) = INTERACTION OF ORDER 1, WHERE THE COMPONENTS ARE DEFINED TO BE
     COMPONENT( 1) = X( 1).P. 1/4
TERM( 5) = X( 2), DEPENDENT VARIABLE.
STEPWISE REGRESSION
PROBLEM NO. 1
NO. OF DATA SETS = 48
PROBABILITY OF
1) ERROR IN ENTERING TERM = 5.0000 0/0
2) ERROR IN DELETING TERM = 5.0000 0/0
WEIGHTED DEGREES OF FREEDOM = 48.00
STANDARD ERROR OF Y = 0.166283123E 03
STEP NO. 2
TERM ENTERED 2
F LEVEL = [illegible]
STANDARD ERROR OF Y = 0.5517846?7E 02
COEFF OF DETERMINATION = 0.894571?10E 00
MULTIPLE CORLTN COEFF = 0.945817955E 00
CONSTANT TERM = 0.168174595E 02
TERM NO.     COEFFICIENT       STD ERR OF COEFF
TERM 1     0.399137430E 01     [illegible]
TERM 2    -0.258267805E-02     0.273117624E-03
REGRESSION TERMINATED AFTER 2 STEPS.
DIAGONAL ELEMENTS
VAR. NO.   VALUE
[Four values; illegible in the scanned original.]
PREDICTED RESULTS VERSUS DATA POINTS
OBS. NO.   PREDICTIONS (Y - SIGMA, Y, Y + SIGMA)   DATA POINTS   DEVIATION (DATA - Y)   PERCENT
[Table for observations 1 through 48. The table, the remainder of the printout for pass 2, and the printout pages that follow it up to the stepwise regression output for pass 3 are illegible in the scanned original.]
STEPWISE REGRESSION
PROBLEM NO. 1
NO. OF TERM CHOICES = 4
PROBABILITY OF
1) ERROR IN ENTERING TERM = 5.0000 0/0
2) ERROR IN DELETING TERM = 5.0000 0/0
WEIGHTED DEGREES OF FREEDOM = 48.00
STANDARD ERROR OF Y = 0.166283123E 03
STEP NO. 3
TERM ENTERED 2
F LEVEL = 0.133637249E00
STANDARD ERROR OF Y = 0.244193383E 02
COEFF OF DETERMINATION = 0.979810469E 00
MULTIPLE CORLTN COEFF = 0.989853755E 00
CONSTANT TERM = 0.146384627E 02
TERM NO.     COEFFICIENT       STD ERR OF COEFF
TERM 1     0.398712836E 01     0.123308040E 00
TERM 2     0.417513050E-02     0.510339618E-03
TERM 4    -0.546617232E 00     0.401052549E-01
REGRESSION TERMINATED AFTER 3 STEPS.
DIAGONAL ELEMENTS
VAR. NO.   VALUE
[Values illegible in the scanned original.]
PREDICTED RESULTS VERSUS DATA POINTS
OBS. NO.              PREDICTIONS                          DATA         DEVIATION
        Y - SIGMA         Y           Y + SIGMA           POINTS        (DATA - Y)      PERCENT
[The rows for observations 1 through 14 are illegible in the scanned original. In the rows below, ? marks an illegible digit.]
 15.  0.13869653E 03  0.16311587E 03  0.1875352?E 03  0.17735252E 03  0.14236648E 02      8.027
 16.  0.40687139E 03  0.43129072E 03  0.45571006E 03  0.41713148E 03 -0.14159237E 02     -3.394
 17.  0.74341639E 02  0.98760977E 02  0.12318031E 03  0.13343099E 03  0.34670010E 02     25.983
 18.  0.50384638E 02  0.74803977E 02  0.99223315E 02  0.67453766E 02 -0.73502111E 01    -10.897
 19.  0.68312383E 02  0.92731722E 02  0.11715106E 03  0.99067086E 02  0.63353643E 01      6.395
 20.  0.13039259E 03  0.15481193E 03  0.17923126E 03  0.17025097E 03  0.15439051E 02      9.068
 21.  0.12677906E 02  0.37097245E 02  0.61516584E 02  0.32386170E 01 -0.33858629E 02  -1045.466
 22.  0.31327043E 03  0.33768976E 03  0.36210910E 03  0.33171669E 03 -0.59730682E 01     -1.801
 23.  0.14779282E 03  0.17221216E 03  0.19663150E 03  0.19715650E 03  0.24944334E 02     12.652
 24.  0.31602037E 02  0.5602137?E 02  0.80440714E 02  0.29077920E 02 -0.26943456E 02    -92.660
 25.  0.26772121E 02  0.51191460E 02  0.75610799E 02  0.20414905E 02 -0.30776555E 02   -150.755
 26. -0.52434778E 01  0.19175861E 02  0.43595200E 02  0.35019451E 02  0.15843589E 02     45.242
 27.  0.42066642E 03  0.44508576E 03  0.46950509E 03  0.43136641E 03 -0.13719345E 02     -3.180
 28.  0.24061318E 03  0.26503251E 03  0.28945185E 03  0.24484918E 03 -0.20183337E 02     -8.243
 29.  0.12304358E 03  0.1474629?E 03  0.1718822?E 03  0.17650787E 03  0.29044943E 02     16.455
 30.  0.51711493E 03  0.54153427E 03  0.56595360E 03  0.57251015E 03  0.3097588?E 02      5.411
 31.  0.10841110E 03  0.13283043E 03  0.15724977E 03  0.14940762E 03  0.16577183E 02     11.095
 32.  0.39747377E 02  0.64166716E 02  0.88586055E 02  0.98891437E 02  0.34724721E 02     35.114
 33.  0.29390577E 02  0.53809916E 02  0.78229254E 02  0.24966075E 02 -0.28843841E 02   -115.532
 34.  0.10212438E 02  0.34631777E 02  0.59051115E 02  0.15693001E 01 -0.33062476E 02  -2106.829
 35.  0.36573???E 03  [illegible]     0.41457752E 03  0.37771855E 03 -0.12439640E 02     -3.293
 36.  0.61074256E 02  [illegible]     0.10991293E 03  0.87198815E 02  0.17052202E 01      1.95?
 37. -0.7946443?E 01  0.16472895E 02  0.40892234E 02  0.56152710E 01 -0.10857624E 02   -193.35?
 38.  0.24373405E 03  0.26815339E 03  0.29257273E 03  0.27457628E 03  0.64228859E 01      2.339
 39. -0.52833369E 01  0.1913600?E 02  0.43555340E 02  0.18445273E 01 -0.17291474E 02   -937.447
 40.  0.82021186E 01  0.32621457E 02  0.57040796E 02  0.50497638E 00 -0.32116481E 02  -6359.997
 41.  0.51947930E 02  0.76367269E 02  0.1007866?E 03  0.11166356E 03  0.3529629?E 02     31.60?
 42.  0.23605778E 03  0.26047712E 03  0.28489646E 03  0.26840033E 03  0.79232?3?E 01      2.95?
 43.  0.4801151?E 03  0.50453451E 03  0.52895385E 03  0.50358115E 03 -0.95336151E 00     -0.18?
 44.  0.1825542?E 02  0.426747??E 02  0.67094098E 02  0.84858460E 01 -0.34188914E 02   -402.893
 45.  0.60688604E 02  0.85107942E 02  0.10952727E 03  0.12037566E 03  0.35267717E 02     29.298
 46.  0.42782802E 01  0.28697619E 02  0.53116958E 02  0.53612783E 02  0.24915163E 02     46.472
 47.  0.290177??E 02  0.5343712?E 02  0.77856458E 02  0.86862337E 02  0.33425217E 02     38.4?1
 48.  0.36904796E 03  0.3934673?E 03  0.41788664E 03  0.38075562E 03 -0.12711681E 02     -3.33?
MAXIMUM ABSOLUTE DEVIATION = 0.3529629?E 02, (SEE OBS. NO. 41., LINE NO. 41)
MAXIMUM ABSOLUTE PERCENT DEVIATION = 6359.997, (SEE OBS. NO. 40., LINE NO. 40)
POSTULATED CRITERIA
STANDARD ERROR OF Y = 0.5000000E 00
COEFF OF DETERMINATION = 0.9990000E 00
FITTED CURVE PROPERTIES
STANDARD ERROR OF Y = 0.2441934E 02
COEFF OF DETERMINATION = 0.9798105E 00
FITTED CURVE MEETS NEITHER CRITERIA.
PASS NUMBER 4 BEGUN FOR PROBLEM NO. 1
10 TOTAL PASSES ALLOWED.
EDITOR PROGRAM
PROBLEM NO. 1
SOLUTION PASS NO. 4
NO. OF INDEPENDENT VARIABLES = 1
NO. OF TRIAL TERMS = 4
TRIAL TERM DEFINITIONS FOR PASS NO. 4
TERM( 1) = INTERACTION OF ORDER 1, WHERE THE COMPONENTS ARE DEFINED TO BE
     COMPONENT( 1) = X( 1).P. 2
TERM( 2) = INTERACTION OF ORDER 1, WHERE THE COMPONENTS ARE DEFINED TO BE
     COMPONENT( 1) = X( 1).P. 5
TERM( 3) = INTERACTION OF ORDER 1, WHERE THE COMPONENTS ARE DEFINED TO BE
     COMPONENT( 1) = X( 1).P. 3
TERM( 4) = INTERACTION OF ORDER 1, WHERE THE COMPONENTS ARE DEFINED TO BE
     COMPONENT( 1) = X( 1).P. 1
TERM( 5) = X( 2), DEPENDENT VARIABLE.
PROBLEM NO. 1
NO. OF DATA SETS = 48
NO. OF TERM CHOICES = 4
PROBABILITY OF
1) ERROR IN ENTERING TERM = 5.0000 0/0
2) ERROR IN DELETING TERM = 5.0000 0/0
WEIGHTED DEGREES OF FREEDOM = 48.00
STANDARD ERROR OF Y = 0.166283123E 03
[One line illegible in the scanned original.]
STEP NO. 2
TERM ENTERED 4
F LEVEL = [illegible]
STANDARD ERROR OF Y = [illegible]
COEFF OF DETERMINATION = [illegible]
MULTIPLE CORLTN COEFF = 0.9999999??E 00
CONSTANT TERM = 0.150000304E 02
TERM NO.     COEFFICIENT       STD ERR OF COEFF
[The lines for TERM 1 and TERM 4 are illegible in the scanned original.]
REGRESSION TERMINATED AFTER 2 STEPS.
OiT GOiIL ELEMENTS
il'UA... M O'. A' LUE
1 O !02 '301
______________________________________2__.31 5741 771E00iii____2
_________________________TFDFi____LT'S__DTF__FT___________40. 0.3..70297E 01
E_ _E ' I D._ I T E D` __.I I E TT RES..T!_ L PE.
_OBS, HO,.___________ ____________________________ Di 1'TPREDICTIONS___1__ _____________DE____,_______!_______T___ _.:_N__________HIAi_____
Li  SIGMAi'E,1 V+ S!GNA POINTS (RTA :,Di pEOR.'CE;NT
_ 1. 0.D3i 7802i. 3 71E 03i_0.i380.0771iE.3_ ___Q_3 0,38.5171E 0.3' 5 0.3';.0.30735E 0.3 0.1..35144E0.
2. 0.1091 1 40E0.i 0.50E 0.3 0.i190207 1.i9570iE i0.3 0. 20 7 3.1 1 iE 3 0i.7S201294E04 0.000Q
3. 0.'92279042E i Di  0,78 4lE i0jiOSSS49 _ 0.8._'4 937E 00 _ l0.8788859'1E 0 0. 1 01 01 497E 3 0. 01 1
4. 0..'1S17.320E 7 3 0. 632 0 13 2 70 E. 91 705.3.3QE 02 1. 9 7 6612 9E 02 0. 1 06811 5 2E 0.3 0.O00
5. 0.10652237E 03 0.10656638E 03 0.10661038E 03 0.10656647E 03 0.10204315E-03
7. 0.27087816E 01 0.27527871E 01 0.27967926E 01 0.27526531E 01 0.13393164E-03 0.002 7
8. 0.51256030E 03 0.51260430E 03 0.51264830E 03 0.51260412E 03 0.18310547E-03 0.000
9. 0. 79641 547 E 02 Q.79685552E 02 .796,5631E 02 _ 0. 91 54968E04 _ 0. 000
10. 0.50317791E 01 0.50757846E1E! 01  0.13959408E03 0.00
1 1. 0. 1+ 90624821 0. 0 22E 0i0. 10629221 2E 03.3 0 . 1.3E 0 0. 101 29983E 03 0. 000
2. 0 0 55 77.45562149E 03 0.4556, 549E 0. 3 0.4556215E 0.3 0.3432275E04 0.000
13. 0. 40520 1 96.3 0. 40524577E 03 00.4,'_ 524597E 0.3 1 22070.31E.3 0.000
14. 0.29287345E 03 0.29291745E 0C_3, 0.29296146,SE 0 3. 2 7 0.3 0. 1792907.E03 0.000
MAD EXTERNAL FUNCTION STATEMENTS FOR PREDICTING EQUATION PRODUCED BY LAST REGRESSION STEP
* COMPILE MAD
* PUNCH OBJECT
EXTERNAL FUNCTION (X)
ENTRY TO EXAMPL.
T( 2) = 0..',!'i86 E 01
T( 2) = T( 2) * X( 1).P.( 2)
T( 3) = 0.159:'832E 02
T( 3) = T( 3) * X( 1)
T( 0) = 0.
THROUGH SUM, FOR I = 1, 1, I .G. 3
SUM T( 0) = T( 0) + T( I)
FUNCTION RETURN T( 0)
END OF FUNCTION
FORTRAN II SUBROUTINE STATEMENTS FOR PREDICTING EQUATION PRODUCED BY LAST REGRESSION STEP
* COMPILE FORTRAN
PRINT SAP
* PUNCH OBJECT
FUNCTION EXAMPL(X)
DIMENSION T( 3)
T( 1) = 0. 1 O5 3. 4E' 02
T( 2) = . 9':''3, 57E 1:iq. 13 7
T( 2) = T( 2) * X( 1) ** 2
T( 3) = 0. 1 5?:9832 0E 02
T( 3) = T( 3) * X( 1)
EXAMPL = 0.
DO 1 I = 1, 3
1 EXAMPL = EXAMPL + T( I)
RETURN
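
The generated MAD and FORTRAN II listings share one structure: each retained term is stored as a coefficient, multiplied by its power of X( 1), and the terms are then summed together with the constant. The following is a minimal sketch of that structure in Python, offered only as an illustration. The name make_predictor and the coefficients in the usage lines are hypothetical and are not the values of the printout, whose constants are not fully legible in the scan.

```python
def make_predictor(constant, terms):
    """Build a predicting function of the generated form.

    constant: the constant term of the regression equation.
    terms:    list of (coefficient, variable_number, power) triples,
              with variables numbered from 1 as X( 1) in the listings
              (assumed encoding).
    """
    def predict(x):
        # T holds the constant and one entry per retained term,
        # mirroring the T array of the generated listings.
        t = [constant]
        for coefficient, var, power in terms:
            t.append(coefficient * x[var - 1] ** power)
        total = 0.0
        for value in t:
            total += value
        return total
    return predict

# Hypothetical coefficients, chosen only to make the arithmetic easy to follow.
example = make_predictor(1.0, [(2.0, 1, 2), (3.0, 1, 1)])
print(example([2.0]))  # 1 + 2*4 + 3*2
```

Keeping the terms in an array before summing follows the generated code, where the loop over T is independent of how many terms the regression retained.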
F' ES' T 5T TUS OF SELECTOr iA RTA'S..I TEi i: T: I Pi3 __: _ F,:'.r'__. ._
INTERACTION NO.   WEIGHT   INTERACTION NO.   WEIGHT   INTERACTION NO.   WEIGHT   INTERACTION NO.   WEIGHT
t.... :.;3:H 0 o 0 I c...'_',, j j j,,.I i._i  i H T S  I.'3 i F iii i? i': i 0i!,
iR I F' iEBLE h RRlF:'~'F:Ir i;L iLT L.' I G TiqEI L — NO *.EIG T 4'iE HT,T!i f itF ii EE t;,,.' [  iT
T t IOti i 1
L.' O,:F i, C EI,it  _ = 1.'::r Oi.TN,',,'E!,I:.*,,'_lqS: 1F i iE:L l.[..f_J i,_l!_ t.... E
FUNCTION NO.   WEIGHT   FUNCTION NO.   WEIGHT   FUNCTION NO.   WEIGHT   FUNCTION NO.   WEIGHT
1 1'. i u12 9 Ei 5 5 E 01 i  4: E::; r 1: 1 E: I 
1 L ——,20. _ 1.._ 7402E L _ ] ___..:S1 1 0. 7 to — E 0 1:.It. ___. l7 2 OH _,.,. ,,',:;1141 i., J 7 4:I 1.E 09 i,. —'.'.:'E. 1, ";;;E*t,I.i402 1!i0. _ 10472' 14E,  ' I E 0 2 0 0E. i4 * _ 0 2.!_L. —'  i:.
DM, OF'F i. I t.,,;!i.;n iO i i_' E! 
BIBLIOGRAPHY
Automatic Programming and Artificial Intelligence
1. Theodoroff, T. J. and Olsztyn, J. T. DYANA: Dynamics Analyzer
Programmer. General Motors Research Staff Publication, 1959.
2. Friedberg, Dunham, North, A Learning Machine. Parts I and II,
Vol 35 No. 1 and 3, IBM Journal Research and Development, 1959.
3. Minsky, M. L. Heuristic Aspects of the Artificial Intelligence
Problem. Group Report 3435, MIT Lincoln Laboratory, December, 1956.
4. Minsky, M. L. Artificial Intelligence and Heuristic Programming.
Proceedings of the International Conference on the Mechanization of
Thought, London, November, 1958.
5. Newell, A., Shaw, J. C. and Simon, H. A. Elements of a Theory of
Human Problem Solving. Psych. Rev. 65.
Mathematics, Logic and Statistics
6. Schreier, O. and Sperner, E. Modern Algebra and Matrix Theory.
Chelsea Publishing Co., 1951.
7. Hartree, D. R. Numerical Analysis. Oxford at the Clarendon Press,
1958.
8. Hildebrand, F. B. Introduction to Numerical Analysis. New York:
McGraw-Hill Book Co., Inc., 1956.
9. Cramer, H. The Elements of Probability Theory and Some of Its
Applications. New York: John Wiley and Sons, 1959.
10. Fisher, R. A. Statistical Methods and Scientific Inference. Oliver
and Boyd, 1956.
11. Davies, O. L. Statistical Methods in Research and Production.
Oliver and Boyd, 1947.
12. Dallemand, J. E. Stepwise Regression Program on the IBM 704. General
Motors Research Staff, GMR 199, 1958.
13. Weyl, H. Philosophy of Mathematics and Natural Science. Princeton
University Press, 1949.
14. Tarski, A. Introduction to Logic. Oxford University Press, 1951.
15. Church, A. Introduction to Mathematical Logic. Princeton University
Press, 1956.
16. Tintner, G. Econometrics. New York: John Wiley and Sons, 1952.
17. Klein, Textbook of Econometrics. Row, Peterson and Co., 1957.