# ENGINEERING RESEARCH INSTITUTE THE UNIVERSITY OF MICHIGAN ANN ARBOR THE LOGIC OF AUTOMATA AFOSR TN-56-539 ASTIA AD-110-358 Project 2512 AIR FORCE OFFICE OF SCIENTIFIC RESEARCH MATHEMATICS DIVISION, CONTRACT NO. AF 18(603)-72, FILE 4.12 AIR RESEARCH AND DEVELOPMENT COMMAND WASHINGTON, D. C. December 1956 ensm umrøs68 . #### TABLE OF CONTENTS | | Page | |----------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------| | LIST OF FIGURES | iv | | ABSTRACT | v | | OBJECTIVE | v | | 1. INTRODUCTION | 1 | | 2. AUTOMATA AND NETS 2.1 Fixed and Growing Automata 2.2 Characterizing Tables and a Decision Procedure 2.3 Representation by Nets, the Coded Normal Form | 2<br>2<br>11<br>20 | | <ul> <li>TRANSITION MATRICES AND MATRIX FORM NETS</li> <li>3.1 Transition Matrices</li> <li>3.2 Matrix Form Nets</li> <li>3.3 Some Uses of Matrices</li> </ul> | 31<br>31<br>33<br>38 | | 4. CYCLES, NETS, AND QUANTIFIERS 4.1 Decomposing Nets 4.2 Truth Functions and Quantifiers 4.3 Nerve Nets | 45<br>45<br>49<br>53 | | BIBLIOGRAPHY | 55 | | DISTRIBUTION LIST | 57 | #### LIST OF FIGURES | Figure | | Page | |--------|-----------------------------|------| | 1 | Normal form net. | 22 | | 2 | Decoded normal form net. | 34 | | 3 | Matrix box of order 4. | 37 | | 4 | Matrix form binary counter. | 39 | | 5 | Net of degree 3. | 47 | | 6 | Two simple nets. | 53 | #### ABSTRACT Classes of automata are distinguished: fixed and growing, deterministic and probabilistic. Then we present methods for analysing and synthesizing fixed, deterministic automata by four kinds of state tables. Our use of these tables gives a decision procedure for determining whether or not two automaton junctions behave the same. Matrix theory is applied to some of the state tables, and theorems are proved about the resulting matrices and a corresponding normal form automaton. Finally, we analyze fixed, deterministic automaton nets in terms of cycles. #### OBJECTIVE The aim of this paper is to develop systems and techniques of mathematical logic which are useful in analyzing the structure and behavior of automata. #### 1. INTRODUCTION\* We are concerned in this paper with the use of logical systems and techniques in the analysis of the structure and behavior of automata. In Section 2 we discuss automata in general. A new kind of automaton is introduced, the growing automaton, of which Turing machines and self-duplicating automata are special cases. Thereafter we limit the discussion to fixed, deterministic automata and define their basic features. We give methods of analyzing these automata in terms of their states. Four kinds of state tables—complete tables, admissibility trees, characterizing tables, and output tables—are used for this purpose. These methods provide a decision procedure for determining whether or not two automaton junctions behave the same. Finally, a class of well-formed automaton nets is defined, and it is shown how to pass from nets to state tables and vice versa. A coded normal form for nets is given. In Section 3 we show how the information contained in the state tables can be expressed in matrix form. The (i,j) element of a transition matrix gives those inputs which cause state $S_i$ to produce state $S_j$ . Various theorems are proved about these matrices and a corresponding normal form (the decoded normal form or matrix form) for nets is introduced. In Section 4 we first show how to decompose a net into one or more subnets which contain cycles but which are not themselves interconnected cyclically. We then discuss the relation of cycles in nets to the use of truth functions and quantifiers for describing nets. We conclude by relating nerve nets to other automaton nets. 1 <sup>\*</sup>We wish to thank Irving M. Copi, Calvin C. Elgot, John H. Holland, and especially Jesse B. Wright for many helpful suggestions. #### 2. AUTOMATA AND NETS #### 2.1. FIXED AND GROWING AUTOMATA To begin with we will consider any object or system (e.g., a physical body, a machine, an animal, or a solar system) that changes its state in time; it may or may not change its size in time, and it may or may not interact with its environment. When we describe the state of the object at any arbitrary time, we have in general to take account of: the time under consideration, the past history of the object, the laws governing the inner action of the object or system, the state of the environment (which itself is a system of objects), and the laws governing the interaction of the object and its environment. If we choose to, we may refer to all such objects and systems as automata. The main concern of this paper is with a special class of these automata: viz., digital computers and nerve nets. To define this class as a subclass of automata in general, we will introduce various simplifying and specifying assumptions. It will become clear that in adopting each assumption we are making a deliberate and somewhat arbitrary decision to confine our attention to a certain subclass of automata. For example, by altering some of the decisions we arrive at the rather interesting concept of indefinitely growing automata, which include the well-known Turing machines as particular cases. A. Discrete Units.—The first decision we make is to use only discrete descriptions; this means that between any two moments and between any two elements (particles, cells, etc.) there is a finite number of other moments or elements. This decision is a consequence of our interest in digital computers. It carries with it a commitment to emphasize discrete mathematics in the analysis of the systems under investigation: recursive function theory and symbolic logic. Hence our problems differ from the more common ones in which time, the elements of a system, the states (color, hardness, etc.) of an element, and the interaction of an object with its environment are all treated as continuous. When this is done, the emphasis is naturally placed on classical analysis and its applications. In contrast with digital computers, which use discrete units, analog computers simulate in a continuous manner, and for the study of them continuous (nondiscrete) mathematics is especially appropriate. It should be remarked that, though discrete mathematical systems are generally more useful for the investigation of discrete automata, a very common (and perhaps at the present time even primary) scientific use of digital computers is to represent (approximately, of course) continuous mathematics (e.g., to solve differential equations). In effect, this procedure involves finding discrete mathematical systems which adequately approximate the particular continuous system at hand. - B. <u>Deterministic</u> <u>Behavior</u>.—We will not deal with elements capable of random (nondeterministic) behavior. Rather, we will assume that at each time the complete state of an object is entirely determined by its past history, including the effects of its environment throughout the past. Statistics could be employed to treat deterministic automata containing large numbers of elements (cf. the kinetic theory of gases), but we will not do this here. - C. <u>Finitude of Bases</u>.—We will always exclude an "actual infinity" of states: each system will contain a finite number of elements at each time, and each element can be in only one of a finite number of states at any time. We will reserve for an independent decision the possibility of a "potential infinity" of elements and states, i.e., whether or not the number of elements or states of each element may change with time and in particular increase without bound. The finitude of time requires separate treatment. The discreteness condition (A) implies that there are never infinitely many moments between two given times. But this still leaves open the question as to how many different times are to be considered. For a general theory it would be inelegant to take a definite finite number, say, $10^{18}$ or $10^{27}$ , as an upper bound to the number of possible moments. It seems desirable to allow time to increase indefinitely and to study the behavior of an object through all time. The one remaining choice is the question of an infinite past. The assumption of an infinite past has the advantage of making the entire time sequence homogeneous, thereby destroying a major part of the individuality of each moment of time. However, in the presence of our deterministic assumption an infinite past would be inconvenient (for reasons to be given in Section 3.3), so we stipulate that there is a first moment of time (zero). The internal state of the automaton at time zero will be some distinguished state. Speaking arithmetically, the two alternatives are to represent moments of time by positive (or nonnegative) integers or by all integers (including the negative ones). - (C1) An infinite future but a finite past: every nonnegative integer represents a time moment and vice versa; the number zero represents the beginning of time. - (C2) An infinite future and an infinite past. Our decision is to adopt the first of these two alternatives. D. Synchronous Operations.—Some computers contain component circuits which operate at different speeds according to their function, location, the time, or the input at the time. Such circuits are called "asynchronous," in contrast to synchronous circuits which work on a uniform time scale (usually under the direction of a central control clock). There are really two aspects to asynchronous operation: (1) the actual intervals of time between operations vary with the time; (2) different parts of the computer pass through their states at different rates, these rates depending, in general, on the information held in the circuits. We will assume that all elements of a given system (net) operate at the same rate, and we will call this the "synchronous" mode of operation. In particular, this means an element may operate at each nonnegative integer t. This assumption does not imply that all time intervals are equal; e.g., the interval from time 7 to time 8 can be one microsecond, while the interval from time 8 to time 9 is 12 hours. Thus our assumption does not exclude possibility (1) of the preceding paragraph. It involves some restriction with regard to point (2), but not as much as one might think. For example, we can represent different parts of an asynchronous machine by subsystems operating at different rates, interconnecting these with logical representations of interlocks. And just as discrete systems (e.g., digital computers) can be used to simulate continuous systems [cf. the second paragraph under the discussion of assumption (A)], so synchronous computers can be used to simulate asynchronous ones. E. Determination by the Immediate Past and the Present.—The stipulation that the behavior of an automaton at time t+l is determined by the past leaves unsettled the question as to how the remote past influences the present. We will assume that such influence occurs indirectly through the states at intermediary time moments, so that to calculate the state of the automaton at t+l it is not necessary to know its state for any time earlier than the preceding moment t. Thus we assume that for the present to be influenced by an event which happened in the remote past a record of that event must have been preserved internally during the intervening time. This postulate corresponds closely to the actual mode of operation of automatic systems. Of course, a computer with, e.g., a microsecond clock, may have delay lines, e.g., 500 microseconds long, so that a stored pulse is not accessible to the arithmetic unit at every clock time, but such a delay line is naturally represented as a chain of 500 unit (microsecond) delays, with the input and output of the chain connected to the rest of the computer. Indeed, the assumption we are now making causes no significant loss in generality, because for each fixed N such that an automaton A can always remember what happened during the N immediately preceding moments (but not more), we can easily devise an automaton A' which, though capable of directly remembering only the immediate past, simulates completely the automaton A. When we think of automata which have unbounded memory or, in particular, ones which can remember everything that has happened in the past, we encounter a basically different general situation. In such cases, the information to be retained increases with time, whence for any automation of fixed capacity there may come a time beyond which it can no longer hold all the information accumulated in its life history. Thus, for a machine to remember all past history it is necessary and sufficient that it grow in some suitable fashion. Such growth can be accomplished by the appearance at each moment of a new delay element for each automaton input. In short, in the presence of the postulate of determination by the immediate past, the alternative of remembering all past history is best studied in connection with growing automata. Another problem connected with determination by the immediate past is the role of present inputs in determining the outputs. It would seem natural to stipulate that the environment at time t+l cannot influence the outputs of the automaton at time t+l but only at time t+2. That is the case with neural nets, since each neuron has a delay built into it. Strictly speaking, it is true in computer nets, but because the delay in a switch may be small compared to the unit delay of the system, it is convenient to regard this switching action as instantaneous. Thus the well-formed nets of Burks and Wright, which will be discussed further in Section 2.3, permit outputs which are switching functions of the inputs. F. Automata and Environment.—Supposedly the change of state of a solipsist is independent of the environment, and the environment is not affected by the solipsist; cf. Leibnitz's concept of a "monad." A "solipsistic" automation would be one which (1) changed its state independently of any environmental changes and (2) whose output did not influence the environment. We are not primarily interested in such automata. Rather, we will consider how the environment (the inputs) affects the automaton but not how the output of an automaton affects its environment. The last point is related to the ordinary method of representing inputs and outputs. It is well known that there are significant and useful logical symbols for the internal action of automata (see Section 2.3). The standard method of representing inputs and outputs is in terms of the binary states of input and output wires. This is not directly applicable in simulating such "inputs" as light and sound waves, physical pressures, etc., and such "outputs" as physical actions. Theoretically, we can, however, just as well interpret certain standard binary elements as representing these. For some purposes, we may want to add as new primitives representations of the lights and keys commonly used on computers (see Burks and Copi, p. 306), as well as symbols representing additional methods of sensing and other methods of acting on the environment that automata are capable of. Von Neumann has done some work along this line, but it has not been published (see Shannon, p. 1240). It might be suggested that one ought to devise a symbolism for <sup>\*</sup>References are to the bibliography at the end of the paper. magnetic or paper tape input and output to a computer; but that is unnecessary because such devices are very well represented by net diagrams for a serial type of storage (see Burks and Copi, 2 p. 313, ftn. 9). We will not here attempt to devise separate notations for the various kinds of interactions possible between an automaton and its environment, but will content ourselves with the customary way of simulating inputs and outputs by binary states of wires. Even subject to this restriction there are a number of alternatives to consider. The most general case would be to identify the environment partly or wholly with certain automata so that interaction occurs among these and the particular automaton under study (cf. the many-body problem of mechanics). A simple case would be to identify the whole environment with another automaton (cf. the two-body problem of mechanics). Accordingly, we have the following alternatives. - (FO) An object changes its state automatically, independent of the environment. - (F1) An automaton changes its state in accordance with its structure and the inputs (the environment). - (F2) Different automata interact with one another. We will be primarily concerned with (F1). In other words, we will assume that the automaton has no influence over what inputs it receives, and that in general the inputs do have effects on the internal state (i.e., state of internal cells) of the automaton. As a consequence, we can define the units or atoms of which an automaton is compounded into two classes: input cells and internal cells, or input and internal wires, or input and internal junctions. The situation under (FO) becomes a special case of that under (F1) when either the number of input cells (or wires) is zero (a limiting case) or the effects of the inputs are more or less canceled so that the automaton behaves in an input-independent manner. The latter case is exemplified by a logical element whose output wire is always active regardless of the state of the input wire. (F2) may also be regarded as a special case of (F1). Since the inputs and outputs of an automaton are wires, two automata may be interconnected to produce a single (more complex) automaton, of which the original automata are parts or subsystems. Thus we regard (F2) as a special case of analyzing a complex machine into interrelated submachines. A common application of this concept is to be found in the design of a general-purpose computer. Typically such a computer is divided into Arithmetic Unit, Storage, Input-Output Unit, and Control (see, for example, Burks and Copi, 2 p. 301). The utility of making such divisions lies partly in the relatively independent functioning of these units and partly in the (related) fact that it is conceptually easier to under- stand what goes on in terms of these parts. The kind of structuring under discussion usually occurs at more than one level; e.g., the Parallel Storage (of Burks and Copi, 2 pp. 307-313) divides naturally into a switch and 4096 bins (each storing a word), and the bins are in turn "composed" of cells (each storing one bit of a word). G. Exclusion of Growth.—While we have adopted the postulate of finite bases, we have yet to decide whether the structure of an automaton, the number of its cells, or the number of possible states of each cell are to be allowed to change with time. If changes are permitted but are confined by a preassigned finite bound, we might as well have used a fixed automaton which embodies this bound to begin with. Hence the really interesting new case is that of a growing automaton which has no preassigned finite upper bound on the possible number of cells or cell states. Structural changes (e.g., rewiring a given circuit) do not seem to generate unbounded possibilities, although in special studies, such as investigations into the mode of operation of the human brain (cf. Rochester et al. 4), the use of a structurally changing automaton is more illuminating than the use of the corresponding fixed automaton. In any case, we can, theoretically, reduce all three kinds of growth to increase in either the number of cells or the number of possible cell states: given any growing automaton, we can find another which functions in the same way but grows only in the number of its cells (or, alternatively, only in the number of possible states for each cell). For every and all forms of growth, it seems natural (in the context of our deterministic assumption) to require that the process be effective (recursive). We will therefore assume once and for all that each definition of a growing automaton determines an effective method by which we can, for each time t, construct the automaton and determine its state for that time. An important particular case corresponds to primitive recursive definitions, each of which yields a method by which we can construct the automaton for t = 0 and, given the automaton and its state at t, we can construct the automaton at t+1. The growth may not depend on the state of the inputs, but the possibility of its doing so is provided for. Moreover, "growth" is taken to include shrinkage as well as expansion. Thus we could have a "growing" computer which expands and contracts as the computation proceeds, having at each time period just the capacity needed to store the information existing at that time. Two types of automata, <u>fixed</u> and <u>growing</u>, can be characterized as follows: - (G1) The structure and cells of the automaton are fixed once and for all, and each cell is capable of a fixed number of states. - (G2) The automaton may grow (expand <u>and</u> contract) in time in a predetermined effective manner. In this paper we will be concerned entirely with fixed automata, except for some remarks on growing nets in this subsection. These remarks are intended to elucidate the concept of a growing net and to indicate why we think it is important. But before beginning on them we wish to specify (G1) further by stipulating that each cell, junction, or wire is capable of two states, on and off, firing and quiet; we will later correlate these with one and zero, and with true and false. We could of course allow each cell to have any fixed finite number of possible states and different cells to have different numbers of states. But it is better to fix the number of states at the constant two. There are a number of reasons for this. The wires and cells of many automata and most digital computers do in fact have two significant states. When this is not the case we can always represent a cell with q possible states by p two-state cells for any $p \ge \log_2 q$ (e.g., ten of the sixteen different states of four binary net wires can represent ten discrete electrical states of a single circuit wire), so by adapting our system to the commonest case we do not lose the power to treat the nonbinary cases. This commitment to two-valued logic need not blind us to the fact that there may be cases where multivalued logic is more convenient; the point is that our logic can handle these cases and we have no interest at the moment in exploiting whatever advantages multivalued logic might have here. We return now to growing nets, mentioning first some special cases of them already known. A Turing machine (see Turing; 5 Kleene; 6 Wang 7, 8) may be regarded as an automaton with a growing tape. Usually the tape is regarded as infinite, but at any time only a finite amount of information has been stored on it, so it is essentially a finite but expanding automaton net (of Burks and Copi, 2 p. 313, ftn. 8). If, in a Turing machine, we take as input cells the squares included on the minimum consecutive tape position which contains all marked squares at the moment, then the growth consists simply of the expansion and contraction of the tape. Or if we use the formulation of Wang 7 which eliminates the erasure operation, a Turing machine is a growing automaton with an even more limited type of growth—namely, an expansion of the tape. In contrast, a growing automaton may in general grow anywhere, not only at the periphery but also internally (by having new elements arise between elements already present). Though a Turing machine is a special kind of growing automaton, it has as much mathematical (calculating) ability as any growing automaton; for every type of computation can be done by some Turing machine, and the mathematical ability of an automaton is limited to computation. In view of this situation one might wonder why the general concept of a growing net is of interest. Its importance can be shown by the following considerations. John von Neumann has developed some models of self-reproducing machines (von Neumann; 9 Shannon, 5 p. 1240; Kemeny, 10 pp. 64-67). These are machines which grow until there are two machines, connected together, the original one and a duplicate of it; the two machines may then separate. Hence they are clearly cases of growing nets. The basic process to be simulated or modeled in the growth and reproduction of living organisms is the complete process from a fertilized egg to a developed organism which can produce a fertilized egg. For this purpose we would need to design a relatively small and simple automaton which would grow to maturity (given an appropriate environment) and would then produce as an offspring a new small automaton. Von Neumann's models can be construed either at the level of cells or at the level of complete organisms, but in either case they seem to provide only a partial solution. The process of cell duplication is only one component of the complete process described above, and the self-reproduction of a completely developed entity omits the important process of development from infancy to maturity. Hence the model we suggest is a type of growing automaton not yet covered in the literature. A second novel type of growing automaton is a generalized Turing machine in which growth is permitted at points other than at the ends of the tape. A typical Turing machine, although logically powerful, is clumsy and slow in its operation. Consequently, to design a special-purpose Turing machine or to code a program for a universal Turing machine is a complicated and laborious process (although a completely straightforward one). What complicates the task is the linear arrangement of information on a single tape, which requires the tape or reading-head to be moved back and forth to find the information. That movement may be reduced somewhat by shifting the old information around to make room for the new information, but this operation also contributes to the complexity of the whole process. To develop this point further, we will discuss in more detail the relation between recursive functions and Turing machines. Turing worked in terms of computable numbers; Kleene $^6$ and Wang $^{7,8}$ work in terms of recursive functions. Since our discussion has been in terms of functions, we will use Kleene's and Wang's works as our references. The basic mathematical result underlying the significance of Turing machines is the following. Mathematicians have rigorously characterized a set of functions, called partial recursive functions, and this set of functions is in some sense equivalent to what is computable (Kleene, 6 ch. XII). Each partial recursive function is definable by a finite sequence of definitions, each definition being of one of six possible forms (Kleene, 6 pp. 219 and 279). It is known how to translate each sequence of definitions into a special-purpose Turing machine and into a program for a general-purpose Turing machine (Kleene, 6 ch. XIII; Wang 7). This translation, while rigorous and straightforward, is often complicated for the reasons, among others, mentioned in the preceding paragraph. Simpler and more direct translations can be made by using growing nets in which growth is allowed to occur whenever it simplifies the construction, not just at two places, i.e., at the ends of the tape, where it is allowed to occur in the conventional Turing machine. Such growing nets will be generalizations of a Turing machine. We can arrive at a third novel kind of growing automaton by generalizing a general-purpose computer in the way we generalized a Turing machine in the last paragraph. The usual general-purpose computer consists of a fixed internal computer together with one or more tapes. As in the Turing machine, these tapes may be regarded as expanding at the ends whenever needed; in practice the expansion is handled by an operator replacing tape reels, using either blank tape or tape reels from a library of tapes. In writing programs for such a machine, the programmer needs to keep track of two things: - (1) the development of the computation, in terms of the growth of old blocks of information and the appearance of new blocks of information; - (2) shifting the information from one kind of storage to another (e.g., from a serial to a parallel storage) and moving the information about within a storage unit. Both of these components of computation are essential. But (1) seems more basic for understanding the nature of the computation, and at any rate it is helpful to be able to study each of the components in isolation. This can be done with growing nets, for we can eliminate (2) by providing for growth wherever it is needed to accommodate new information or new connections to old information. We feel that the study of growing automata would contribute to the theory of automatic programming. The development of a powerful theory of automatic programming has so far been impeded by the many details involved in actual computation; by eliminating (2) we would eliminate many of these details and would focus attention on the more basic component (1). We turn now to fixed automata which satisfy the assumptions (A), (B), (Cl), (D), (E), (Fl), and (Gl). In summary, we arrive at the following definition of a (finite) automaton: Definition 1: A (finite) automaton is a fixed finite structure with a fixed finite number of input junctions and a fixed finite number of internal junctions such that (1) each junction is capable of two states, (2) the states of the input junctions at every moment are arbitrary, (3) the states of the internal junctions at time zero are distinguished,\* and (4) the internal state (i.e., the states of the internal junctions) at the time t+1 is completely determined by the states of all junctions at time t and the input junctions at time t+1, according to an arbitrary preassigned law (which is embodied in the given structure). An abstract automaton is obtained from an automaton by allowing an arbitrary initial internal state. <sup>\*</sup>This condition applies only to those junctions whose state at a given time does not depend on the inputs at the same time; cf. condition (4) following. Several aspects of this definition call for comment. In it automata states have been defined in terms of junction states. This follows Burks and Wright, where each wire has the state of the junction to which it is attached, and the nuclei or cell bodies are not regarded as having states but as realizing transformations between junctions or wires. An alternative would be to define automata states in terms of cell states. Condition (4) places some restrictions on the way automata elements are to be interconnected, but it does not completely specify the situation; this will be discussed further in Section 2.3. The initial state of the internal junctions also calls for discussion. In the definition of an abstract automaton this is taken more or less as an additional input which can be changed arbitrarily. As a result, two abstract automata, to be equivalent, must behave the same for each initial state picked for the pair of them. On the other hand, for most applications to actual automata, it is best to assume a single initial state. The word "structure" in the above definition can be avoided if we speak exclusively in mathematical terms and consider the transformations realized by automata and abstract automata. We will do so in the next subsection, returning to a more detailed investigation of the structure of automata in the following subsection (viz., 2.3). #### 2.2. CHARACTERIZING TABLES AND A DECISION PROCEDURE Consider for a moment automata whose internal states are determined only by the immediate past and hence are not influenced by the present inputs. Let there be M possible input states, $I_0$ , $I_1$ , ..., $I_{M-1}$ , and N possible internal states, $S_0$ , $S_1$ , ..., $S_{N-1}$ . Even though each junction of an automaton is capable of only two states, we do not require M and N to be powers of two. For one thing, when an automaton is being defined, the values of M and N are stipulated and are not necessarily powers of two. Also, when an automaton is given, not all possible combinations of internal junction states may occur because of the structure of the automaton, and not all possible combinations of input junctions may be of interest (because, e.g., the automaton is to be embedded in a larger automaton where not all of the possible inputs will be used). We will assign nonnegative integers to the input and internal states. Let I and S range over these numbers, respectively. A complete automaton state is represented by the ordered pair $\langle I,S \rangle$ . (If the automaton has no inputs, then there are no I's and the complete automaton state is just S.) Let $S_O$ be the integer assigned to the distinguished initial internal state; $S_O$ will usually be zero, but not always. An abstract automaton differs from a nonabstract one just in not having a distinguished initial state. Since the input states are represented by numbers, a complete history of the inputs is a numerical function from the nonnegative integers 0, 1, 2, ... (representing discrete times) to integers of the set $\{I\}$ . That is, it is an infinite sequence I(0), I(1), I(2),..., I(t), ...; it may be viewed as representing the real number $\{I(0)+[I(1)/K]+[I(2)/K^2]+...\}$ , in which K is the maximum of the set $\{I\}$ . By our convention that the initial internal state is $S_0$ we have $S(0) = S_0$ . By the assumption of complete determination by the immediate past we have for all t $$S(t+1) = \tau[I(t), S(t)],$$ where $\tau$ is an arbitrary function from the integer pairs $\{<I,S>\}$ to the integers $\{S\}$ . Or, in other words, as the input function I and the time t are the independent variables, $\tau$ is an arbitrary function of two arguments (one ranging over functions of integers and another ranging over integers) whose values are integers. It follows by a simple induction that for each infinite sequence I(0), I(1), ..., I(t), ..., repeated application of the function $\tau$ yields a unique infinite sequence S(0), S(1), ..., S(t), ..., with $S(0) = S_0$ . Since for many purposes we are interested not only in the existence of values of the function $\tau$ , but also in finding them, we will assume that $\tau$ is defined effectively, though actually much of our discussion would be valid without this restriction. We next broaden our theory so as to include automata whose internal state at t+1 depends also on the inputs at t+1. To do this we allow P "output" states $O_0$ , $O_1$ , ..., $O_{P-1}$ such that $$O(t) = \lambda[I(t), S(t)],$$ where $\lambda$ is again an arbitrary effective function. In general, the complete state of an automaton at any time is given by the ordered triad <1,5,0>. In specific cases I, S, or O may be missing. We can now give an analytic definition of automata and abstract automata by means of these transformations. <u>Definition 2:</u> An automaton is in general characterized by two arbitrary effective transformations ( $\tau$ and $\lambda$ ) from pairs of integers to integers. These integers are drawn from finite sets $\{I\}$ , $\{S\}$ , and $\{O\}$ . $\{S\}$ contains a distinguished integer $S_O$ . The transformations are given by $$S(0) = S_0$$ $S(t+1) = \tau[I(t), S(t)]$ $O(t) = \lambda[I(t), S(t)]$ . If we omit the condition $S(0) = S_0$ , we obtain an abstract automaton. Thus, speaking analytically, the study of finite automata is essentially an investigation of the rather simple class of transformations $\tau$ and $\lambda$ in the above definition. The definition of the class as thus given is superficially very general in allowing $\tau$ and $\lambda$ to be arbitrary calculable functions. On account of the very restricted range and domain of these functions, however, that generality is only apparent. We can find a simple representation of the class of automaton transformations in the following way Since $\tau$ is effective, and since its domain and range are finite, we can effectively find for each pair $\langle I,S \rangle$ the value of $\tau[I(t),S(t)]$ . Hence we can produce a table of M x N pairs $\langle I,S \rangle$ , S'> such that if $\langle I,S \rangle$ is part of the state at t, then S' is part of the state at t+1. We shall call this set the M-N complete table of the given automaton. Each complete table is a definition of the function $\tau$ . In a similar way we can construct an output table for the automaton, each row being of the form $\ll I,S>$ , O>. Such a table defines the function $\lambda$ . It is important that the function $\tau$ and the complete table involve a time shift, while the function $\lambda$ and the output table do not. Hence for an investigation of the behavior of an automaton through time the complete table is basic, the output table derivative. That is, by means of the complete table we can compute S(1), S(2), S(3), ... from the inputs I(0), I(1), I(2), ... and leave the determination of O(0), O(1), and O(2), ... for later. Note that to stipulate that the output at time t cannot be influenced by the input at the same time is to require $\lambda$ to be such that $O(t) = \lambda[S(t)]$ . When this is the case the states O(t) and the output table can be dispensed with since O(t) is completely dependent on S(t). (When the behavior of individual junctions is being investigated, the output table may nevertheless be convenient.) For these various reasons the state numbers $\{I\}$ and $\{S\}$ are more basic than the state numbers {0}, and the complete table is much more important than the output table. This being so, we will often concentrate on automata whose internal states are determined only by the immediate past and ignore the output table. Since the states S at t are so defined that they do not depend on the inputs at t, any I can occur with any S, and hence there are M x N possible pairs <I,S> in the complete table. There are N possible values of S'. Hence there are N possible complete tables for an M-N automaton. We will say that two abstract M-N automata, in whatever language they may be described, are equivalent, just in case they have the same complete table. That this definition is proper can be seen from the following considerations. If the complete tables are the same, then for the same in- itial internal state and the same input functions, the initial complete states in the two automata are the same, and the same complete state at any moment plus the same input functions always yield the same complete state at the next moment. On the other hand, if two complete tables are different, there must be a pair <<I,S>, S'> in one table but not in both, such that by choosing suitable input functions and a suitable internal state represented by <I,S>, we can have the complete state represented by <I,S> realized at time zero, and yet the complete state at time one will have to differ. Since we can find effectively the complete table of a given automaton (the details of this process will be explained in the next subsection) and compare effectively whether two complete tables are the same, we have a decision procedure for deciding of two given abstract M-N automata whether or not they are equivalent. The situation is more complex with automata which have predetermined initial internal states, for two M-N automata with different complete tables may yet behave the same (be equivalent). This is possible because it can happen that for every pair <<I,S>, S'> which occurs in one table but not in the other, we can never arrive at the internal state S from the distinguished initial internal state So, and hence can never have the complete state <I.S>. no matter how we choose the input functions. In such a case, two M-N automata with the same prechosen initial internal state may behave the same under all input functions, despite the fact that they have different complete tables. Hence, identity of complete tables is a sufficient but not necessary condition for equivalence of two automata. To secure a necessary and sufficient condition, it suffices to determine all the internal states which the given initial internal state can yield when combined with arbitrary input words, and then to repeat this process with the internal states thus found, etc. If the two complete tables coincide insofar as all the pairs occurring in these determinations are concerned, the two automata are equivalent. To establish that such determinations will always terminate in a finite time requires an argument: since there are only finitely many pairs in each complete table, the process of determination will repeat itself in a finite time. To describe the procedure exactly, we introduce a few auxiliary concepts. We can think of a tree with the chosen initial internal state $S_O$ as the root. From the root M branches are grown, one for each possible input word $I_i$ with the corresponding internal state at the next moment $S_{Oi}$ at the end. These M branches can be represented by $<\!\!<\!I_O, S_O\!\!>\!, S_{OO}\!\!>\!, \ldots, <\!\!<\!I_{M-1}, S_O\!\!>\!, S_O, M-1\!\!>\!>\!,$ which all belong to the complete table. If all the numbers $S_{OO}, \ldots, S_{O,M-1}\!\!>\!,$ which all belong to the complete table. If not M branches are grown on each $S_{Oi}$ such that $S_{Oi} \neq S_O$ , and such that $S_{Oi}$ does not equal any $S_{OP}$ for which M branches have already been grown, and we arrive at $S_{OiO}, \ldots, S_{O,i,M-1}$ . If all the numbers $S_{Oij}$ (i,j arbitrary) already occur among $S_O$ , $S_{Oi}, \ldots, S_{O,M-1}$ , then the tree stops its growth. If not, then M branches are grown on each $S_{Oij}$ such that it does not equal $S_O$ , or any of the $S_{OP}$ , or any $S_{OPQ}$ for which M branches have already been grown. That is, whenever in the construction we come to an S, if it is already on the tree we stop, else we grow M branches on it, one for each I. This process is continued as long as some new internal state is introduced at every height. Since there are altogether only M a priori possible internal states, the height (i.e., the number of distinct branch-levels) of the tree cannot exceed M. For any M-N automaton, we can construct such a tree which will be called the admissibility tree of the automaton. We can, of course, start with any state S as the assumed initial state, and this gives us an admissibility tree relative to S for an abstract automaton. Those values of S (including $S_O$ ) which appear in the admissibility tree are called <u>admissible</u> internal states. All other values of S are inadmissible. If we collect together the ordered pairs which represent branches of the admissibility tree, we obtain a (proper or improper) subset of the complete table, which we shall call the M-N characterizing table of the automaton. (As in the case of the admissibility tree, it is easy to define the concept of a characterizing table relative to S for an abstract automaton.) In order that two M-N automata be equivalent (i.e., behave the same), it is necessary and sufficient that they have the same characterizing table. Since there is an effective method of deciding whether two M-N automata have the same characterizing table, we have a decision procedure for testing whether two M-N automata are equivalent. Quite often we are not interested in the whole automaton, but rather in the transformations which particular cells (junctions, wires) of an automaton realize. To discuss this aspect of the situation we need to correlate states of automata with states of the elements of automata. We will do this in two stages; first, by putting state numbers in binary form (in the present subsection), and, next by correlating zero and one with junction states (in the next subsection). The state numbers I and S are nonnegative integers. The binary representation of states is made simply by putting each state number in binary form, making all the I the same length, and making all the S the same length (by adding vacuous zeros at the beginning when necessary). Let m,n be the number of bits of I,S, respectively, in a characterizing table or complete table in binary form. Clearly m is the least integer as large as (or larger than) the logarithm (to the base two) of the maximum I; similarly with n. Let the bits of I be called $A_0$ , $A_1$ , ..., $A_{m-1}$ , so that $I = A_0$ $A_1$ ... $A_{m-1}$ where the arch signifies concatenation. Similarly, let the bits of S be called $B_0$ , $B_1$ , ..., $B_{n-1}$ , so that $S = B_0$ $B_1$ ... $B_{n-1}$ . In the next subsection we will associate the A's with input junctions and the B's with internal junctions. We speak of the characterizing table in binary form as an m-n characterizing table. It follows from our discussion of characterizing tables that the function $\boldsymbol{\tau}$ of Definition 2, given by $$S(0) = S_0$$ $S(t+1) = \tau[I(t), S(t)]$ is a rather simple primitive recursive function (with a finite domain) and that the function S(t) is defined primitive recursively relative to the input function I(t). If our interest is in the transformation realized by a particular internal junction, we use another primitive recursive function $\sigma_1$ such that $\sigma_1(n)$ gives the i-th binary digit of n (i = 0, 1, ...). Hence each such junction realizes a transformation $\sigma_1[S(t)]$ or $B_1(t)$ which is primitive recursive relative to I(t) (Burks and Wright, Theorem XIV; Kleene, 11 Theorem 8). Since the magnitudes of m and n affect the number of junctions of the corresponding automaton, it is of interest to obtain a minimal representation in terms of bits. Given a characterizing table, one can so rewrite the state numbers as to minimize m and n. That is accomplished by so assigning the numbers that the largest I is smaller than the least power of 2 greater than or equal to M, and similarly for S and N. A special case occurs when the states $I_0$ , $I_1$ , ..., $I_{M-1}$ are assigned the numbers 0, 1, ..., M-1, respectively, and the states $S_0$ , $S_1$ , ..., $S_{N-1}$ are assigned the numbers 0, 1, ..., N-1, respectively (note that the distinguished state $S_0$ is assigned the number zero). A characterizing table put in this form is said to be in coded normal form. Automata nets corresponding to this form will be discussed in the next subsection. (Note that minimizing a complete table does not suffice here, because the number of inadmissible states may be such as to require more bits for representing the set of states than for representing the set of admissible states.) Another special type of automaton is the decoded normal form automaton; it is of interest in connection with the application of matrices to the analysis of nets. In a decoded normal form characterizing table the input words are coded as for the coded normal form, but the internal states $S_{\rm O}$ , $S_{\rm I}$ , ..., $S_{\rm N-1}$ are assigned the numbers $2^{\rm O}$ , $2^{\rm I}$ , ..., $2^{\rm N-1}$ ; here an N bit word is needed to represent the N internal states. For six internal states we would have the numbers 100000, 010000, 001000, 000100, 000010, 000001; notice that $S_{\rm O}$ has a one on the extreme left, i.e., for $S_{\rm O}$ , $B_{\rm O}$ is one and all other $B_{\rm I}$ are zero. Automata nets corresponding to decoded normal form characterizing tables will be presented in Section 3. Each of the A's and B's (bit positions of the binary representations of I and S) is a binary variable. Hence the complete table and, more importantly, the characterizing table are (when put in binary form) kinds of truth tables. Thus we have to large extent reduced the problem of automata description and analysis to the theory of truth functions. Of course the S' in <T,S>, S'> is the state at t+1, while S is the state at t, so we need to distinguish different times here and hence to use propositional functions (see Section 4). Nevertheless, as the characterizing table shows, we need only a very special form of the theory of quantifiers, in which each time step is a matter of the theory of truth functions. So great is the advantage of this partial reduction to the theory of truth-function logic that we will hereafter assume that all characterizing tables are in binary form. Consequently, we may henceforth use any of the techniques of the theory of truth functions which are applicable, not merely the (often cumbersome) truth-table technique. We return now to the transformations realized by individual elements of the automata, which involves considering the bits of S, i.e., the B's. In the next subsection each B will be associated with an internal junction, so the analysis is also in terms of junctions. The basic problem is to compare the behavior of two bits or junctions, which may or may not belong to the same automaton or characterizing table. If the two junctions to be compared belong to the same automaton, then they realize the same transformation (behave the same) if and only if the corresponding bits in the S (or S') entries of the characterizing table are everywhere the same. (The state $S_0$ need not appear in the S' column of $\langle I,S \rangle$ , S' $\rangle$ ; every other state which is in the S column is in the S' column and vice versa. Of course, all bits are the same in $S_0$ .) This is so because the values of S are the admissible states of the automaton, and at each moment the internal junctions of the automaton are in just one of these states. Hence the question as to whether two junctions of an automaton behave the same can be decided effectively. If the two junctions are in two different automata, then it is in general not necessary that the automata have the same number of junctions, i.e., that the characterizing tables have the same number of columns, for them to behave the same. Since the transformations depend ultimately only on the time and the inputs, the number of internal junctions need not be the same; since the behavior of an internal junction may be independent of some inputs, even the number of input junctions may be different. Suppose the two junctions belong to an m1-n1 and an m2-n2 automaton. Then a necessary and sufficient condition for these junctions to realize the same transformation is that there exist some new m<sub>3</sub>-n<sub>3</sub> automaton, with n<sub>3</sub> = n<sub>1</sub>+n<sub>2</sub>, m<sub>1</sub>+m<sub>2</sub> $\geq$ $m_3 = \max(m_1, m_2)$ , which is obtained from the two given automata by connecting a subset of the inputs of one to a subset of inputs of the other in a one-one fashion, and in which the internal junctions under consideration realize the same transformation. That supplies an effective procedure because there is only a finite number of inputs to each automaton and hence only a finite number of ways to interconnect them, and for each way the question of equivalent behavior can be decided effectively. When the process is conducted on the characterizing tables it involves identifying certain of the columns of the I part of the tables. It is allowed that the subset of inputs which are interconnected may be null, in which case $m_3 = m_1 + m_2$ and the resulting automaton is just the result of juxtaposing the two original automata. For just as the behavior of a junction or cell may be independent of one of the inputs, so it may be independent of all of the inputs. In this case the junction changes from one state at t to another at t+1 in a uniform manner independent of the states of the inputs at t. In other words, it realizes a transformation which is independent of the input functions; we will call such a transformation an inputindependent transformation (it was called a "constant transformation" in Burks and Wright, p. 1358) and speak of the junction as an input-independent junction. The number of internal states of an automaton is finite, and an automaton is completely determined by the immediate past, hence all inputindependent transformations must be periodic (Burks and Wright, 1 Theorem I). Therefore no automaton can realize the simple primitive recursive input-independent transformation which has the value one if and only if t is a square (0, 1, 4, 9, ...) (Burks and Wright, Theorem II; Kleene, 11 Section 13). A very special type of automaton is one whose internal junctions are all input-independent junctions. In such a net, which we call an <u>input-independent net</u>, there may be input junctions, but these cannot influence the internal state at any time. For each such automaton, complete and characterizing tables can be found which have no input states. The admissibility tree provides an effective means for deciding whether the behavior of an internal junction or cell is independent of a specified input and hence for deciding whether the behavior of a junction is independent of all inputs (i.e., realizes an input-independent transformation). For this purpose it is helpful to identify all occurrences of a given state on the admissibility tree. Then one can trace the behavior of the automaton by proceeding in cycles around the tree. We will not describe the procedure in detail, but will make a few comments about it. By a direct inspection of the characterizing table we can tell whether a change in an input junction A at t can make a difference in B at t+1. Repeating this process we can find all the junctions that A can influence directly, all that these can influence directly, etc. Since the net is finite, this process will terminate. That is, because of the finite nature of the net there is an interval of time q such that if A can influence the behavior of B, it can do it within the time interval q; this interval may be determined from the structure of the net. If no input junction influences $B_j$ , then $B_j$ realizes an input-independent transformation, which has been already stated to be periodic. This special case of input-independence can be discovered directly from the characterizing table, for a junction $B_j$ realizes an input-independent transformation if and only if for each S there is a unique value of $B_j$ in S', no matter what I is. The behavior of the input-independent transformation during its initial phase and during its main period can be found from the admissibility tree. (The problem of deciding whether or not two junctions $B_j$ and $B_k$ realize the same transformation is really a special case of the problem of deciding whether a junction realizes a particular input-independent transformation; for we can have $B_j$ and $B_k$ drive an equivalence element, whose output will be the simple input-independent transformation lllll... if and only if $B_j$ and $B_k$ realize the same transformation. See the next subsection.) In our discussion we have for some time ignored the output table. It too can be put in binary form, and since both the O and the S entries refer to the same time, the result is a straight truth table (in contrast to the characterizing table, where some columns refer to time t and some to t+1). Hence the preceding results are easily extended to include the case of output tables. We have not yet considered methods of minimizing the labor required to calculate the admissibility tree and the characterizing table. In many cases it is convenient to work with the equations describing a net by means of variables (see the next subsection) rather than with the values of these variables. In some cases one can go directly from such equations to the characterizing table. It is also possible to decompose many nets so as to reduce greatly the number of states to be considered (see Section 4.1). Other methods of simplifying the work will occur to one who is engaged in it and to one familiar with the methods for simplifying truth-table computation. Before proceeding further let us briefly summarize the concepts introduced in this subsection. Definition 3: An automaton is in general characterized by state numbers I,S and O. The complete table of an automaton is the set of all pairs << I,S>, S'> such that, for given I(t) and S(t), S' is the value of S(t+1). The characterizing table of an automaton is the subset of its complete table such that each S and S' in it is admissible. A state S is admissible if and only if it is the distinguished initial state $S_O$ or it can be arrived at from the initial state by choosing a suitable finite sequence of inputs. An admissibility tree is a graph used in computing the admissible states, beginning with $S_O$ and proceeding systematically. An output table is a table of pairs << I,S>, O>, stating a value O(t) for given values of I(t) and S(t). An input-independent junction realizes a transformation whose values are independent of the inputs. We will conclude this subsection by commenting on the relation of the decision procedure described above for testing whether two junctions realize the same transformation to other decision procedures. Recently a decision procedure for Church's formulation of computer logic has been announced.\* We are not acquainted with this decision procedure and hence cannot compare it with ours. However, we can prove the equivalence of Church's system to ours, from which it follows that the two decision procedures accomplish the same result. We do this in two steps. First, the definition of automata given in Section 2.1 is in all essential respects equivalent to that of a well-formed net in Burks and Wright (this will be shown in the next subsection). Church's simultaneous recursion is a slight generalization of the second definition of determinism given in Burks and Wright, 1 p. 1360; the difference lies in the fact that Church's "A's" and "B's" are independent of each other, whereas in the Burks-Wright definition of determinism each Ai has a certain relation to the corresponding $\mathbf{B}_{i}\,.\,$ Because of this relation between the two definitions, it follows directly that every transformation realized by a well-formed net is definable by a Church recursion. The converse may be shown by a net construction in which for each i a net is made for Ai and for $B_i$ , and the outputs are combined to give $A_i$ for t = 0 and $B_i$ for t > 0. It is perhaps worth noting that our decision procedure may be extended to give a method for deciding whether the transformations defined by a set of equations (Burks and Wright, p. 1358) are deterministic or not. This may be done by going through all the states <I,S> and seeing if for each of these the equations yield a unique S'. If a given S is admissible (by the admissibility tree) and does not yield a unique S' for each I, then the net (i.e., the transformations it realizes) is not deterministic. There is, in any event, only a finite number of cases to consider, so the procedure is effective. We remark finally that since monadic propositional functions of time may be used in describing net behavior, it might seem that known decision procedures for the monadic functional calculus directly apply here. However, the exact relation between quantifiers and net theory is not known, and in any event when quantifiers are used they required bounds, which are essentially dyadic (see Section 4.2). #### 2.3. REPRESENTATION BY NETS, THE CODED NORMAL FORM We turn now to the representation of automata by diagrams (called automata nets) which show the internal structure of automata. For this purpose we need to correlate the binary digits zero and one used in the preceding subsection to the physical states of wires, junctions, or cells. On the normal interpretation zero and one are associated with the inactive and active <sup>\*</sup>Joyce Friedman, "Some results in Church's restricted recursive arithmetic," The Journal of Symbolic Logic, 21, 219 (June, 1956); this is an abstract of a paper presented at a meeting of the Association for Symbolic Logic on December 29, 1955. states, respectively. A <u>dual interpretation</u> (zero to active, one to inactive) is also possible, and the two interpretations may be interrelated by the well-known principle of duality. It is clear from the developments of the preceding subsection that we need net elements capable of performing two kinds of operations: truth functions and delays. For these purposes we adopt two distinct kinds of elements: switching elements for truth-function operations and a delay element for the delay operation. Some standard logical connectives of the theory of truth functions are: $\cdot$ , & (two representations of conjunction, "and"), v (disjunction, "or"), $\downarrow$ ("neither-nor"), | ("not both"), $\equiv$ ("if and only if"), $\supset$ ("if...then..."), and -,~, ' (three representations of negation, "not"). Circuits for realizing all of these are common. As is well-known, all truth functions may be constructed from the dagger ( $\downarrow$ ) or from the stroke (|), so we shall in general assume sufficient primitive switching elements to realize these. Sometimes it is convenient to have an infinity of primitive switching elements, one for each truth function. Of course, in practice complicated switching functions are realized by compounding simple switching elements, but by representing such circuits by single net elements we can separate the problem of compounding these circuits from other problems in net analysis (see, for example, Fig. 1). A switching element consists of a nucleus together with input wires and an output wire. The termini of these wires are called junctions. Switching elements may be interconnected in switching nets in ways to be discussed subsequently. For examples, see Fig. 1 and other figures of this paper. Propositional variables are associated with each junction of net. Corresponding to each switching element there will be an equation of the theory of truth functions which describes the behavior of that element. For example, if a conjunction ("and") element has the variables $A_O$ and $A_1$ attached to its input junctions (and wires) and the variable $C_O$ attached to its output junction (and wire), it realizes the equation $C_O(t) \equiv [A_O(t) \& A_1(t)]$ , or, more succinctly, $C_O \equiv (A_O \& A_1)$ . The theory of switching nets corresponds to the theory of truth functions and is well developed (see Shannon; 12 Burks and Wright<sup>1</sup>). One aspect of the equation $C_O(t) \equiv [A_O(t) \& A_1(t)]$ needs discussion; it is that the value of the output is given at the same time t as the inputs. In the physical realization of a conjunction this, of course, cannot happen; the output will occur slightly later than the inputs. This suggests putting a delay in at the output of each switching element. Such a delay does in fact exist in each nerve cell. However, for purposes of theoretical analysis it is best to isolate the logical, nontemporal functions of automata from the temporal aspects of their behavior. Hence we can first construct the theory of switches, basing it on the theory of truth functions, and we can later augment this theory to deal with the additional complications brought in Normal form net. by delays. This organization of the subject has practical bearings as well as theoretical value, for to a certain extent the design of switches does and should proceed independently of the design of those parts of computers which produce the transitions from state to state. Hence our switching nets have no delays in them. When we come to formulate the rules governing their interconnection (formation rules for well-formed nets), we will take this factor of idealization into account and not permit interconnections that could lead to trouble because we have ignored it. Hence we will return to this topic at that time. The <u>delay element</u> consists of a nucleus with an input and an output wire; see Fig. 1. It delays an input signal one unit of time; i.e., its input wire state at time t becomes its output wire state at time t+1. We assume that its output wire is inactive (in the zero state) at time zero. If $A_O$ is the variable associated with its input and $E_O$ the variable associated with its output, its behavior is defined by the equations. $$E_O(O) \equiv O$$ $$E_O(t+1) \equiv A_O(t) .$$ Another way of expressing this is $E_O(t) \equiv (t>0) \& A_O(t=1)$ , where "=" signifies the primitive recursive pseudosubtraction $$x - y = x - y \text{ if } x \ge y$$ $x - y = 0 \text{ if } x < y$ Each switching element corresponds to a symbol (or complex of symbols) of the theory of truth functions. We will introduce the symbol $\delta$ to correspond to the delay element. If the input and output to the delay element are $A_O$ and $E_O$ as before, then $E_O(t) \triangleq \delta[A_O(t)]$ or, more succinctly, $E_O \equiv \delta A_O$ . Hence, $$\delta[A_O(O)] \equiv O$$ $$\delta[A_O(t+1)] \equiv A_O(t)$$ . In itself the $\delta$ operator does not, strictly speaking, take us beyond the theory of truth functions (see Section 4.2), but the $\delta$ operator together with a cycle rule which allows an output of a net to be connected back to an input of the same net does take us beyond truth-function theory to quantifier theory (see Section 4). We need now a set of formation rules such that all nets constructed by these rules represent automata, and all automata defined by characterizing tables and output tables may be represented by these nets. We will use the rules given in Burks and Wright, p. 1361, extending them to allow an arbitrary set of switching elements $\Sigma$ . Definition 4: A combination of figures is a well-formed net (w.f.n.) relative to the set $\Sigma$ if and only if it can be constructed by the follow- - (1) A switching element or a delay element is w.f. - (2) Assume $N_1$ and $N_2$ are disjoint w.f.n. Then, - (a) the juxtaposition of N<sub>1</sub> and N<sub>2</sub> is w.f.; - (b) the result of joining junctions $F_{q_1}, \ldots, F_{q_j}$ of $N_1$ to distinct input junctions $G_{p_1}$ , ..., $G_{p_j}$ of $N_2$ is w.f.; (c) the result of joining input junctions $F_p$ and $F_q$ of $N_1$ is w.f.; - (d) if all the wires connected to $F_D$ of $N_1$ are delay-element input wires, then the result of joining any $F_q$ of $N_1$ to $F_p$ is w.f. The ends of wires which do not impinge on a switching-element circle or a delay-element rectangle are called junctions. A junction with no output wires attached to it is called an input junction; all other junctions are called internal junctions (these are sometimes called output junctions). One can label each junction of a net with a variable. We will usually use Ao, A1, ... for input junctions, Co, C1, ... for switch output junctions (junctions driven by switching elements), and $E_0$ , $E_1$ , ... for <u>delay</u> output junctions (junctions driven by delay elements). A well-formed net (diagram) with every junction labeled with a variable is called a labeled net. One can also label the input junctions with functional constants designating and the internal junctions with functional constants naming the functions they actually realize (e.g., if the inputs to a conjunction are labeled with lll... and OlOlOl..., the output should be labeled OlOlOl...). The result is called a <u>net history</u>; cf. the concept of a net state in Burks, McNaughton, <u>et al</u>., 13 p. 207. Consider the net of Fig. 1. Every net of this form (with arbitrary numbers of delays and switching elements) is well-formed (relative to a sufficiently rich set of switching elements) by our rules. We say that the net of Fig. 1 is in normal form. A normal form net is organized as follows. It has a direct-transition switch, fed by the net inputs and the delay outputs, and driving the delay inputs. It has an output switch, fed by the delay outputs and the net inputs, and not driving any delay elements. Given a sufficiently rich set of switching elements, we can construct for each well-formed net a normal form net which behaves the same. We first place the delays of the original net in an array like that of Fig. 1. Then, for each delay element Ei, we analyze the original net to determine what switching element $T_i$ will produce the same result at $\epsilon_i$ as is produced by the switching circuitry of the original net. In the same way we find those switching elements Ta, Tb, ... whose outputs behave the same as the switch output junctions of the original net, or those switch output junctions we are particularly interested in. (The latter can be indicated by labeling them with triangles.) Similarly, given a set of switching elements $\Sigma$ rich enough to represent all truth functions, we can translate a normal form net into a w.f.n. made of those switching elements; e.g., if $\Sigma$ contains only the stroke element, we replace each T of Fig. 1 by an equivalent stroke-element switch. (Note that while a switching element Ti receives inputs from all the net input junctions and all the delay output junctions, its output need not depend on all of these. For example, if in the original net the input to delay element E<sub>2</sub> was the net input junction A<sub>0</sub>, then T<sub>2</sub> has the property that $\epsilon_2(t) \equiv$ $A_{O}(t)$ for all values of $A_{1}(t)$ , $E_{O}(t)$ , $E_{1}(t)$ , and $E_{2}(t)$ .) (Note also that any well-formed net can be arranged somewhat in the form of Fig. 1, if we allow the switches to be of other forms and allow the direction-transition switch to have junctions which do not drive delay inputs.) At this point we wish to make two comments about our representation of switches. The first concerns a topic we have mentioned earlier, the fact that physically it takes time for information to go from the inputs of a switch to the output, while in calculating the behavior of a switch we assume that the output occurs at the same time as the input. The reason for this assumption is that in many applications the switching time is much less than the delay time, so the logic of switching is treated separately from the logic of delay. We wish our theory to accommodate this case. The reader can imagine a small delay in the output of each switching element, with extra delays put in at various places to make the phasing correct. He can then imagine that each unit delay of Fig. 1 is reduced by the accumulated amount of delay in the switch driving it, so the total delay from delay output back to delay output is one unit. (The concept of rank as defined in Burks and Wright, 1 p. 1361, is useful here.) The concept of well-formed net has been so defined as to make this always possible, as is evident from Fig. 1. This way of regarding the matter conforms with practice in designing some machines; see De Turk et al. 14 Those automata with delay built into each switching element (e.g., neural nets) can also be accommodated within our theory; they correspond to special cases of w.f.n. and can be defined by modifying the formation rules. The second comment is connected with the fact that our switching elements represent the flow of information in only one direction; i.e., inputs and outputs are not interchangeable. There are many devices that permit information to flow in only one direction (vacuum tubes, transistors, etc.), but not all do; relays are one notable exception. Relay contacts permit information to flow in either direction, and hence bridge circuits can be made from them. Relays are electromechanical devices and hence are relatively slow. For this reason they are becoming less important as much faster electronic and solid-state devices become available and competitive in price. Further, because of the combination of a coil and contacts, relay automata present special problems, and no formation rules for them which take full account of all the uses that can be made of contacts and coils have been published. However, a new and promising device, the cryotron, also permits the information (in this case, current) to flow in either direction and hence can be used in bridge circuits (see Buck<sup>15</sup>). We will not attempt here to devise formation rules for all uses of relays, cryotrons, and whatever other devices there may be which are not unidirectional. It should be pointed out, however, that every well-formed switching net can be realized by a relay and by a cryotron circuit. Since every truth transformation (i.e., every switching function) can be represented by a well-formed switch and vice versa (Theorem XII of Burks and Wright<sup>1</sup>), our diagrams do represent ways of realizing all truth functions with nonunidirectional elements. Since our diagrams represent a unidirectional flow of information, it follows that the power of relays and cryotrons to pass information in two directions does not add to their power to do logic. It does make a difference in the number of elements needed. Thus a relay bridge circuit may do a certain job more economically than a relay contact network in the form of a well-formed switch. We return now to the problem of correlating w.f.n. and automata. As a first step we will define a set of state numbers $D_0$ , $D_1$ , ..., $D_q$ . Each D will express the states of the delay output junctions. Let these junctions be labeled $E_0$ , $E_1$ , ..., $E_q$ . Then D is the binary number $E_0$ $E_1$ ... $E_q$ . Since a delay output is assumed to be zero at time zero, D(0) = 0. Let $D_0$ be this initial state, i.e., $D(0) = D_0$ . We wish to justify this decision, but before doing so we need to discuss a question concerning the identity of an automaton. We may run a machine from Monday to Friday, turn it off at 5:00 P.M. Friday and then turn it on again at 8:00 A.M. the following Monday. Should we regard it as one machine or two? There is a similarity between this and a human (automaton?) going to sleep at night; however, when a human wakes up in the morning he still remembers quite a bit of his past history, while often (though not always) a computer starts a new life every time it is turned on anew. To preserve the identity of the machine before and after the gap of inaction, we can think of some simple ad hoc device such as a special input cell whose sole function is to turn the machine on and off in such a manner that, when a machine is in operation, stimulating this input cell will put the machine into a unique initial state; such an operation is often called an initial clear. In this way we can preserve the identity of a machine through all the different runs it makes. On this assumption there is only one initial state of an automaton, and we can identify it with the all-off or all-quiet state. Such an identification is natural since neurons, vacuum tubes, etc., are usually inactive when first turned on, and even if they are not this identification can be made by a suitable convention without much loss of generality. The situation is somewhat different if we choose to regard each machine run as a new automaton, but even here there will probably be a single initial state for all runs and it is convenient to identify it with the all-quiet state. Note that this does not commit us to identifying $S_{\rm O}$ with $D_{\rm O}$ . In fact, we shall not always do so (the complete decoded net of Section 3.2 is an example). Hence one can handle other initial states by identifying $D_{\rm O}$ with some value of S other than $S_{\rm O}$ . We now proceed to establish the equivalence of w.f.n. and automata. We show first how to derive a characterizing table and an output table for each w.f.n. We translate the given net into a normal form net. Label the inputs of this normal form net $A_0$ , $A_1$ , ... and let $I = A_0 A_1 ...$ ; for a two input net we would have, for example, $I_0 \equiv \overline{A}_0 \& \overline{A}_1$ , $I_1 \equiv \overline{A}_0 \& A_1$ , $I_2 \equiv \overline{A}_0 \& \overline{A}_1$ , and $I_3 \equiv A_0 \& A_1$ . Label the delay inputs $\epsilon_0$ , $\epsilon_1$ , ... and define $\Delta = \epsilon_0 \epsilon_1 ...$ Label the delay outputs $E_0$ , $E_1$ , ...; then $D = E_0 E_1 ...$ Let $T_0$ , $T_1$ , ... be the truth functions realized by the direct-transition switch. Then we have $$\epsilon_{i}(t) \equiv T_{i}[A_{O}(t), A_{I}(t), ...; E_{O}(t), E_{I}(t), ...]$$ $$E_{i}(t) \equiv \delta \epsilon_{i}(t)$$ for each i. Finally, we let $D_O$ be $S_O$ and each other D be an S, and thereby get a complete table. By the use of the admissibility tree we can construct the characterizing table. This procedure takes care of the transition part of a normal form net. To complete the analysis, we perform a similar construction for the switching elements of the original net, or for those switch outputs we are interested in as final outputs. Let $T_a$ , $T_b$ , ... be the truth functions realized by these outputs. Then we have $$C_{j}(t) \equiv T_{j}[A_{O}(t), A_{1}(t), ...; E_{O}(t), E_{1}(t), ...]$$ for each j, and this gives us the output table. A $\underline{\operatorname{coded}}$ $\underline{\operatorname{normal}}$ $\underline{\operatorname{form}}$ $\underline{\operatorname{net}}$ is a normal form net whose characterizing table is in $\underline{\operatorname{coded}}$ $\underline{\operatorname{normal}}$ $\underline{\operatorname{form}}$ . When the complete table is derived from a net, there will be a bit position for each input junction and each delay output junction. In this case the numbers m and n (of Section 2.1) are the numbers of input and delay output junctions, respectively. An m-n automaton has then $2^{m+n}$ possible complete states and $2^n$ possible internal states. If all input states are considered, we then have $(2^n)^{(2^{m+n})}$ different m-n automata complete tables. Many of these are the same except for the permutation of columns (i.e., of input and internal cells or junctions). Clearly, there is little significance whether a particular junction is labeled the 1st, 2nd, or the m-th. In other words, if we can find a way of identifying one-to-one the input junctions of two m-n automata so that they behave the same, they are equivalent even though they may have different characterizing tables. Analogously, permutations among the labels for the delay output junctions make no essential difference. It follows that there are actually only $(2^n)^{(2^{m+n})}/(m!)(n!)$ rather than (2<sup>n</sup>)(2<sup>m+n</sup>) distinct abstract m-n automata complete tables. Similarly, characterizing tables which are obtainable from one another by permuting columns are to be identified. There will be fewer than $(2^n)^{(2^{m+n})}/(m!)(n!)$ distinct m-n automata characterizing tables, for some of the distinct complete tables will differ only with regard to inadmissible states. In designing the transition part of an automaton it is in general desirable to maximize the number of admissible internal states (i.e., to minimize the number of inadmissible states), since the total number of states is a rough measure of the parts needed for construction, while the capacity for doing different things is in general proportional to the number of admissible states. We will call an automaton complete if all states S are admissible. (A stronger condition would be that all states S are admissible relative to every initial state, instead of just the distinguished initial state $S_0$ ; we will not give to automata satisfying the stronger condition any special name.) If the number of admissible states of an automaton is less than $2^{n-1}$ , we can always replace the automaton by a simpler automaton by using the coded normal form. In a coded normal form characterizing table n is the least integer as large as or larger than the logarithm of N to the base two (similarly for m). For automata with the same number of admissible states, it seems desirable to maximize the "recoverable" ones. Following Moore, $^{16}$ p. 140, we shall call an automaton strongly connected if and only if it is possible to go from every admissible state $\mathbf{S}_i$ to every admissible state $\mathbf{S}_j$ (i may be equal to j). An alternative definition can be given in terms of an admissibility tree in which all occurrences of a given state $\mathbf{S}_i$ are identified. An automaton is strongly connected if and only if for any ordered pair of states $\langle \mathbf{S}_i, \mathbf{S}_j \rangle$ (i may be equal to j), we can pass from $\mathbf{S}_i$ on the tree to $\mathbf{S}_j$ on the tree by a continuous forward route (plus backward jumps from a given state to the same state located lower on the tree). Since the possibility of repetition is important for an automaton, any admissible state which cannot be recovered adds rather little to the capacity of the automaton. Thus it would seem best in general to design a machine which is both complete and strongly connected. Hence, from a practical point of view, complete, strongly connected, and coded normal form automata are the most important. For the theory of automata, however, many nets falling outside this class are of interest. In particular, we will find decoded normal form automata nets of interest in connection with the use of matrices to analyze nets. It remains to show how to construct a well-formed net for any given characterizing table (or complete table) and output table. There are various ways of doing this, one of which is to identify $S_0$ with $D_0$ (if $S_0$ is not equal to zero, its value must be changed to zero) and to let every other S be a value of D. The general process of going from nets to tables is then just reversed. There are various ways of constructing the switches needed. Let us consider the matter with regard to the characterizing table $\ll I, S>$ , S'>. A single column of S' is to be identified with a particular $E_1$ (and $\varepsilon_1$ ). Delete all other columns of the S' part of the table. We then have a truth-table definition of our function $T_1$ , such that $$\epsilon_{i}(t) \equiv T_{i}[A_{O}(t), A_{I}(t), ...; E_{O}(t), E_{I}(t), ...]$$ . Given sufficient primitives, this can be realized by one switching element, as in Fig. 1. Given switches for "and," "or," and "not," it can be realized by using disjunctive normal form. Consider each row of the truth table. If $\epsilon_i$ is zero, do nothing; if $\epsilon_i$ is one, construct an element to sense $\widehat{\mathsf{IS}}$ of that row. The desired switch for $\epsilon_i$ is obtained by disjoining (using "or" on) all the outputs so obtained. We have thus established our first theorem. THEOREM 1: Given a well-formed net, we can construct a complete table, a characterizing table, and an output table describing its behavior. Given a complete table, characterizing table, and an output table, we can construct a well-formed net realizing these tables. This theorem establishes the equivalence of automata and nets, and since nets are idealized representations of digital computers, it follows that for most theoretical considerations automata without any special sensing and acting organs can be viewed as digital computers. We conclude this section by noting the similarity of well-formed net diagrams and flow diagrams used in programming. This similarity is what one would expect, since a net diagram describes the structure of a computer, and a flow diagram describes its behavior during a certain computation, and both symbolize recursive functions. While a program is stored in a computer, part of it (the coded representation of the operations) usually remains invariant through the computation; this means that during the computation not all states of the computer are used. For each such fixed program one could devise a special-purpose machine which would perform the same computation. This is a special case of the general principle that there is a great deal of flexibility with regard to what a machine is constructed to do versus what it is instructed to do. This suggests that there should be one unified theory of which the theory of automata structure and the theory of automata behavior (i.e., the theory of programming) are parts. Each program is in effect a definition of a recursive function. Since there is no effective way of deciding whether two definitions define the same recursive function, there is no effective way of deciding whether two programs will produce the same answer. Two different programs, each finite, may nevertheless produce the same answer because the feedback from the computation may be different in the two cases. #### 3. TRANSITION MATRICES AND MATRIX FORM NETS #### 3.1. TRANSITION MATRICES The transition part of a net controls the passage of the net from state to state and is therefore the heart of the net. In this subsection we introduce "the transition matrix," a table which describes a net by showing the various ways in which it may pass from one delay state to another. We use a characterizing table with M input (state) words, I, N internal-state words, S, and M x N rows, each of the form $\langle I,S \rangle$ , S'>, to define $\mathbb{N}^2$ direct-transition expressions $I_{ij}$ as follows. $I_{ij}$ is a disjunction of all those $I_k$ such that $\langle I_k, S_i \rangle$ , $S_j \rangle$ is a row of the characterizing table; if there are no such $I_k$ , then $I_{ij}$ is $\emptyset$ ("the false"). That is, $I_{ij}$ is a disjunction of all those input words (if any) which can cause the net to pass from state $S_i$ at time t to state $S_j$ at time t+1; it is allowed that i equals j. It is clear that each $I_{ij}$ is a disjunctive normal form expression of the input function variables. Note that the direct-transition expression " $\phi$ " is distinct from the direct-transition expressions "0," "00," "000," etc.; the former means that no direct transition between the two states is possible, while the latter mean that such a transition is brought about by making all the inputs zero. A <u>direct transition</u> from $S_i$ to $S_j$ in a net is a passage from state $S_i$ at t to state $S_j$ at t+1. Such a transition is possible only in the case where $I_{i,j} \neq \emptyset$ . We say that $\langle I_k, S_i \rangle$ (or $I_k S_i$ ) at time t <u>directly produces</u> $S_j$ at time t+1 only in the case where the net makes a direct transition from $S_i$ to $S_j$ under the influence of input $I_k$ at t. A transition from $S_i$ to $S_j$ in a net is a passage from state $S_i$ at t to state $S_j$ at t+w (w > 0). Such a transition is possible only where there exists a sequence of direct-transition expressions, none of which are $\emptyset$ , of the form $I_{ia_1}$ , $I_{a_1a_2}$ , ..., $I_{a_wj}$ . We say that $I_{ia_1}$ $S_i(t)$ , $I_{a_1a_2}(t+1)$ , ..., $I_{a_wj}(t+w-1)$ produces $S_j$ at t+w if and only if there is a transition from $S_i(t)$ to $S_j(t+w)$ under the direction of the listed inputs; this is a transition of w steps (or, alternatively, a transition of length w). It is convenient to arrange the information in an M-N characterizing table in a <u>direct-transition matrix of order N</u> by arranging the N<sup>2</sup> direct-transition expressions in square array. The following is a <u>direct-transition matrix schema</u> of order four: It is clear that a direct-transition matrix presents the same information as a characterizing table, but in a different way. For many purposes this form is more convenient, because it reflects the fact that the basic behavior of an automaton consists of a succession of transitions from one state to another. (Since a transition matrix is equivalent to a characterizing table, the formulae given in Section 2.3 for the number of M-N automaton characterizing tables apply here also.) The information contained in a complete table can also be expressed in matrix form. Since for an abstract automaton the complete table is the characterizing table, the matrix derived from the complete table is a direct-transition matrix. We give an example of a transition matrix. A matrix for a four-stage cyclic counter is (We have added the S's as a mnemonic aid, but, given a conventional ordering of them, they need not be written in.) Thus, an input $I_0$ (e.g., 0) causes the counter to stay in its given state, while an input $I_1$ (e.g., 1) causes it to advance to the next state (modulo 4). All other entries are $\phi$ 's since they represent cases where direct transitions are impossible. ### 3.2. MATRIX FORM NETS In this subsection we will present a net form closely related to the transition matrix. Our presentation is in two steps; the first is to construct a decoded normal form net. Consider for a moment a coded normal form net in relation to its coded characterizing table. Each S is associated with a D, and in general any arbitrary number of bits of D may be unity. Consider in contrast a decoded normal form characterizing table. Exactly one bit of each S is unity, which suggests associating each state S primarily with a single junction. That can be done for an N state decoded normal form table as follows: Let S = $B_0$ $B_1$ ... $B_{N-1}$ as before, and form a net with N delay elements so that D = $E_0$ $E_1$ ... $E_{N-1}$ . Of the $2^N$ delay words D, we use only N+1, namely, $D_0$ (= 000...) and the N words having exactly one bit which is unity and all other bits zero (100..., 010..., etc.). We next construct a junction C which is the output of a disjunctive element ("or") fed by $E_0$ and by the input-independent transformation 100.... Hence, $$C(0) \equiv 1$$ $$C(t) \equiv E_O(t)$$ , for all $t > 0$ . We now associate $B_0$ with C, and each $B_1$ with $E_1$ for 0 < i < N; that is, we equate $B_0$ $B_1$ ... $B_{N-1}$ and C $E_1$ ... $E_{N-1}$ . Hence we have N junctions $(C, E_1, \ldots, E_{N-1})$ such that each state S is associated primarily with one junction; namely, that junction which is active when the net is in that state. These junctions are called the state junctions of the net. See Fig. 2, where the state junctions are labeled C, $E_1$ , and $E_2$ , and are also labeled with the states $(S_0, S_1, and S_2)$ which they "represent." We now wish to construct a net containing wires C, $E_1$ , ..., $E_{N-1}$ , so connected that at each time exactly one of them is active (in state one) while all others are zero, and with the following inductive property. - (A) Junction C (representing $S_0$ ) is active at time 0. - (B) For any time t, if the junction labeled $S_i$ (i.e., C for i=0, $E_i$ for i>0) is active at time t, the net input at time t is $I_k$ , and $\langle\langle I_k, S_i \rangle, S_j \rangle$ is a row of the characterizing table, then the junction labeled $S_j$ will be activated at time t+1. To realize Condition (A), we construct a <u>starter</u> (see Fig. 2) to produce the input-independent transformation 100.... The starter output is then disjoined with $E_O$ to produce C. This will insure that $C(O) \equiv 1$ , and that Fig. 2. Decoded normal form net. $C(t) \equiv E_0$ for t > 0. The starter, which adds another delay element to the net, may be constructed without a cycle (see Section 4.3). The realization of (B) is more complicated. Since there is a state junction for each state (C for $S_0$ , $E_{i+1}$ for $S_{i+1}$ ), the concept of a direct-transition word is useful here. At each state junction $S_i$ we build N switches such that: $I_{i0}$ $S_i$ directly produces $S_0$ (i.e., activates C at the next moment of time); $I_{i1}$ $S_i$ directly produces $S_1$ (i.e., activates $E_1$ at the next moment of time); $I_{i,N-1}$ S<sub>i</sub> directly produces $S_{N-1}$ (i.e., activates $E_{N-1}$ at the next moment of time). A net to accomplish this purpose is shown in Fig. 2, which is actually a $\underline{\text{net}}$ $\underline{\text{schema}}$ rather than a net, since the $I_{\underline{i},\underline{i}}$ are not specified. The boxes labeled with the direct-transition expressions $I_{ij}$ are called <u>direct-transition switches</u>. (Note that these direct-transition switches are different from the direct-transition switch of Fig. 1, though both kinds of switches play the same basic role in a net.) A direct-transition switch $I_{ij}$ has an output at time t if and only if the input to the net is represented by a disjunct of $I_{ij}$ . Every such switch can be made of a disjunction driven by conjunctions of positive and negative inputs. For example, let $I_{10}$ be 0101 v 0110 v 1111, which may be written as $\overline{A_0A_1A_2A_3}$ v $\overline{A_0A_1A_2A_3}$ v $\overline{A_0A_1A_2A_3}$ , and the latter is readily realized by a switch. Of course, the number of inputs may vary from net to net; we will usually show four inputs in our figures (as we do in Fig. 2). If a particular $I_{ij}$ is $\emptyset$ , the switch $I_{ij}$ and the conjunction it drives may be deleted. If a net always passes directly from a state $S_i$ to a state $S_j$ , no matter what the inputs are (or because there are no inputs), then we can run a direct line from the $S_i$ junction to the input of the delay element driving the $S_j$ junction. An input-independent net thus becomes a string of delays (corresponding to the initial part of the input-independent transformation), driven by a starter and driving a cycle of delays (corresponding to the periodic part of the function); cf. Theorem XIII and the accompanying figure of Burks and Wright, $^1$ p. 1363. We call a net of the form of Fig. 2 a <u>decoded normal form net</u>. We will first explain why we call this form "decoded" in contrast to the "coded" form described earlier, and then discuss the exact relation between decoded and other nets. The terminology is justified by the fact that in a coded normal form net the numbers representing delay states (i.e., the D's) appear in coded form, while in the decoded normal form net the numbers representing the delay states (the D's, except that $E_0$ is replaced by C at time zero) appear in decoded form, in the sense in which the terms "coded" and "decoded" are used in switching theory. A decoding switch is a switch with the same number of output junctions $C_0$ , ... $C_{N-1}$ as there are admissible input words $I_0$ , ..., $I_{N-1}$ , and so connected that when the input state is $I_n$ the output junction $C_n$ is active and all other outputs are inactive. (This is a special case of the nets discussed in Burks, McNaughton, et al. $I_n$ ) The information on the inputs is in coded form, while that on the outputs is decoded. In our coded and decoded normal form nets, however, the coding and decoding applies to delay outputs rather than to switch outputs. Given any automaton net, we can construct a decoded normal form net which models that automaton net in the sense of having junctions which behave the same. Let us begin with the transition part of the given automaton net. Suppose it has n delay output junctions $F_0$ , $F_1$ , ..., $F_{n-1}$ and N admissible states $S_0$ , $S_1$ , ..., $S_{N-1}$ (N $\leq$ 2<sup>n</sup>). We next construct the transition part of a decoded normal form net with junctions C, $E_1$ , ..., $E_{N-1}$ , such that the i-th bit from the left of C $E_1$ ... $E_{N-1}$ is unity when the original automaton is in state $S_{i-1}$ . Any function $F_j$ is equivalent to a disjunction $S_{a_1}$ v $S_{a_2}$ v ... v $S_{ak}$ of just those states for which $F_j$ has the value one. Hence by disjoining the appropriate state junctions of the decoded normal form net we can obtain a junction $F_j$ such that $F_j(t) \equiv F_j(t)$ for all t. Figure 2 shows a decoded output switch which realizes $S_0$ v $S_1$ , $S_0$ v $S_2$ , and $S_1$ v $S_2$ . The "single-disjunct disjunctions" $S_0$ , $S_1$ , and $S_2$ are already represented in Fig. 2, and the input-independent outputs ( $S_0$ & $\overline{S}_1$ & $\overline{S}_2$ ) and ( $S_0$ v $S_1$ v $S_2$ ) (i.e., 000... and lll...) are readily obtained from the net if needed. This shows how to construct junctions of a decoded normal form net which behave the same as the delay output junctions of the original net. Any other junction of the original net whose behavior at time t does not depend on the state of the inputs at time t can be treated in the same way. For the remaining junctions we can build an output switch fed both by the decoded output switch and the net inputs. Alternatively, the decoded output switch can be replaced by a switch driven by the state junctions and the net inputs. Switches of these various kinds are allowed as parts of decoded normal form nets. We can now incorporate these results in a theorem. THEOREM 2: For any well-formed net with junctions $C_0$ , $C_1$ , ..., one can construct a decoded normal form net with junctions $C_0^{\dagger}$ , $C_1^{\dagger}$ , ... such that $C_1(t) \equiv C_1^{\dagger}(t)$ for all i and t. The transition part of a decoded normal form net can be drawn in matrix form to bring out its relation to the transition matrix. In Fig. 3 this is done for a transition matrix of order 4; the result is called a matrix box. The disjunction elements of Fig. 2 are omitted by the convention that Fig. 3. Matrix box of order 4. several wires can drive a line (see Burks and Copi, 2 p. 307). A normal form net with the transition part put in matrix-box form is called a <u>matrix form</u> net. Figure 4 is an example which lacks an output switch. A particular net of order 4 may be obtained from the schema of Fig. 3 by substituting the appropriate switches for the direct-transition-switch schemata. We illustrate this with the four-stage cyclic counter whose transition matrix was given in Section 3.1. If we let input $I_0 \equiv 0$ and input $I_1 \equiv 1$ (so the counter counts pulses rather than the absence of pulses), we get the matrix box of Fig. 4. Each $I_{ii} \equiv I_0 \equiv 0$ so we replace these direct-transition-switch schemata by negation elements. For each i, j such that j = i+1 modulo four, $I_{ij} \equiv 1$ , so we replace these direct-transition-switch schemata by single input lines. All other transition-switch schemata correspond to $\emptyset$ 's in the transition matrix defining the counter, so these are dropped; e.g., no direct transition from $S_2$ to $S_1$ is possible, so there is no direct coupling from $S_2$ to $S_1$ . It is manifest from Fig. 4 that the counter stays in its prior stage when the input is zero, but advances to the next stage (modulo four) when the input is one. #### 3.3. SOME USES OF MATRICES The discovery that matrices may be used to characterize automata nets opens up a number of interesting lines of investigation. In the present subsection we will discuss a few applications of matrices to the analysis of automata nets. A direct-transition matrix (whose elements are direct-transition expressions) characterizes the direct transitions of an automaton. We will first establish some properties of this matrix, and then generalize the concepts of direct-transition expression and direct-transition matrix to cover transitions of arbitrary length. Each row of a direct-transition matrix is a partition of M input words $I_0$ , $I_1$ , ..., $I_{M-1}$ into N columns $S_0$ , $S_1$ , ..., $S_{N-1}$ . Hence the disjunction of a row contains all the admissible input words. If these are all the possible words, the disjunction of a row is a tautology. If there are inadmissible input words, then the hypothetical whose consequent is the disjunction of the matrix row and whose antecedent is a disjunction of all the admissible input words, is a tautology. In this case we can say that the matrix row sums to a tautology relative to the admissibility conditions, or that it is a tautology in an extended sense of this word. (Note that this relative sense of "tautology" is relevant in minimality problems; in minimizing a switch we are not looking for a switch logically equivalent to the given one, but rather for a switch logically Fig. 4. Matrix form binary counter. equivalent to the given one relative to the admissibility conditions on the inputs.) A single element j of a row i may be a tautology, in which case all other elements in the row are $\emptyset$ ; this means that whenever the automaton is in state $S_i$ it makes a direct transition to state $S_j$ , no matter what the input is. No input word can occur more than once in a row, else the automaton would not be deterministic. The disjunction of the elements of a column is not in general a tautology, but cases where it is are of special interest as they are related to the concept of backward determinism. Definition 5: An abstract automaton is backwards deterministic if and only if for each finite sequence I(0), I(1), ..., I(t), S(t+1), there is a unique sequence S(0), S(1), ..., S(t) satisfying the complete table. A direct-transition matrix is backwards deterministic if and only if for each $I_k$ and $S_j$ there is at most one state $S_i$ such that $I_k$ $S_i$ directly produces $S_j$ . We give an example of a backwards-deterministic matrix of order 3. $$\begin{bmatrix} I_0 & v & I_1 & \phi & I_2 \\ \phi & I_0 & v & I_1 & v & I_2 & \phi \\ & I_2 & \phi & I_0 & v & I_1 \end{bmatrix}.$$ Another example of interest is $$\begin{bmatrix} I_0 & I_1 & I_2 & I_3 \\ I_1 & I_2 & I_3 & I_0 \\ I_2 & I_3 & I_0 & I_1 \\ I_3 & I_0 & I_1 & I_2 \end{bmatrix}.$$ Besides being backwards deterministic, this matrix has the property that a direct transition is possible from any state to any other state. We call such a matrix directly strongly connected; see Definition 6 below. THEOREM 3: The disjunction of every column of a direct-transition matrix is a tautology if and only if that matrix is backwards deterministic. An abstract automaton is backwards deterministic if and only if its direct-transition matrix is backwards deterministic. We prove first that having every column sum (logically) to a tautology is a necessary and sufficient condition for a direct-transition matrix to be backwards deterministic. If every column sums to a tautology, every input word must occur at least once in each column. But no input word could occur twice in a column, because in the N by N matrix every input word must appear exactly once in a row and there are exactly N occurrences of each input word. If an input word occurred twice in the same column, then at least one of the (N-1) remaining columns must miss that word and could not be a tautology. Hence for a given state $S_j$ at time t+1, and a given input word $I_k$ at time t, there is only one state $S_i$ which together with $I_k$ could have directly produced S. Therefore, having every column sum to a tautology is a sufficient condition for a direct-transition matrix to be backwards deterministic. The proof that it is a necessary condition is obtained by reversing the considerations just used. In a backwards-deterministic matrix no input word can occur twice in the same column. But each row contains exactly one occurrence of each input word and there are exactly N occurrences of each input word in the matrix. Hence, no input word is missing in any column, because otherwise it must occur twice or more in at least one other column. Therefore, every column must sum to a tautology. We show next that if a direct-transition matrix is backwards deterministic, the abstract automaton is backwards deterministic. Consider a finite sequence I(0), I(1), ..., I(t), S(t+1). It follows from the results of the preceding paragraph that there is exactly one S(t) which together with I(t) directly produced S(t+1). Iterating this argument, we see that there is a unique sequence S(0), S(1), ..., S(t) satisfying the complete table (for the given I(0), ..., I(t), S(t+1)). To prove the second part of the theorem in the other direction, we note that if a matrix is not backwards deterministic, there will be some $I_k$ and some $S_j$ such that there are two distinct states $S_a$ and $S_b$ , either of which will, together with $I_k$ , directly produce $S_j$ . Hence for the sequence I(0) = $I_k$ and S(1) = $S_j$ , there are two sequences [namely, $S(0) = S_a$ and $S(0) = S_b$ ] satisfying the complete table. In Section 2.1 we remarked that in the presence of our deterministic assumption an infinite past would be inconvenient. The first part of Theorem 3 may be used to justify this statement. In order to describe the behavior of a net over a certain period of time t, t+1, ..., t+w, we would naturally need to know the inputs I(t), I(t+1), ..., I(t+w) and the internal state S at one of these times (or perhaps at t+w+1). Now if every net were backwards deterministic, it would not matter for which time S was known. But for a net which is not backwards deterministic we must know S(t) to determine S(t+1), ..., S(t+w). Hence we might as well pick a time t=0 as a standard reference point for our analysis and always work forward from this time; we therefore allow t to range over the nonnegative integers only. [We could define backwards deterministic on the basis of each infinite sequence ..., I(-7), ..., I(0), ..., I(t), S(t+1) determining an infinite sequence ..., S(-7), ..., S(0), ..., S(t) and conduct the discussion in terms of this definition. Theorem 3 then holds with the following exception: the direct-transition matrix of a backwards-deterministic automaton may fail to be backwards deterministic with regard to states which cannot be recovered. For example, a backwards-deterministic automaton can have a transition matrix in which both $S_a$ $I_k$ and $S_b$ $I_k$ directly produce the same state $S_j$ , but in which no state and input combination directly produces $S_a$ or $S_b$ . We turn now to the task of generalizing the notion of direct-transition expression to cover nondirect transitions. Consider an example. Suppose it is possible to go from state three of an automaton to state seven with either a sequence $I_4$ , $I_6$ , or with $I_2$ . We could write this as $I_4I_6$ v $I_2$ , but it must be understood that juxtaposition here represents a noncommutative type of conjunction, since $I_6$ followed by $I_4$ may not carry the net from state three to state seven. We will sometimes use a special operation, called concatenated conjunction, to express the order-preserving conjunction needed here. Thus the above may be written $I_4 \cap I_6$ v $I_2$ . However, the concatenated-conjunction symbol, $\cap$ , may be omitted if the context makes clear what is intended. The noncommutative nature of concatenated conjunction can be brought out by making the role of time explicit: $I_4 \cap I_6$ is short for $I_4(t) \cdot I_6(t+1)$ , while $I_6 \cap I_4(t+1)$ . Clearly $I_4(t) \cdot I_6(t+1)$ is not equivalent to $I_6(t) \cdot I_4(t+1)$ . Concatenated conjunction can be used to build up transition words from direct-transition expressions. For example, we might have the transition expression $I_3, 2 \cdot I_2, 5 \cdot I_5, 7$ v $I_3, 7$ , or, more briefly, $I_3, 2I_2, 5I_5, 7$ v $I_3, 7$ . In a concrete case it might reduce to the transition expression $I_4 \cdot (I_6 \text{ v } I_9) \cdot I_5 \text{ v } (I_3 \text{ v } I_6)$ , or, more briefly, $I_4(I_6 \text{ v } I_9)I_5 \text{ v } (I_3 \text{ v } I_6)$ , which is of course equivalent to the transition expression $I_4 \cdot I_6 \cdot I_5 \text{ v } I_4 \cdot I_9 \cdot I_5 \text{ v } I_3 \text{ v } I_6$ , or, more briefly, $I_4I_6I_5 \text{ v } I_4I_9I_5 \text{ v } I_3 \text{ v } I_6$ . For this expansion we use a distribution principle for concatenated conjunction: (p v q) \cdot(r v s) \equiv (p \cdot r v p \cdot s) \equiv q \cdot s). We can now define the general concept of transition expression. A transition expression is a disjunction of concatenated conjunctions of direct-transition expressions, provided that if any concatenated conjunction contains a $\emptyset$ , it may be replaced by $\emptyset$ . Thus $I_{2,5}I_{5,3}$ v $I_{2,3}$ might become ( $I_3$ v $I_7$ ) $\emptyset$ v $\emptyset$ , which would reduce to $\emptyset$ . We allow direct-transition expressions as special cases of transition expressions. <u>Definition 6:</u> A <u>transition matrix</u> of order N is an N by N array whose elements are transition expressions. Two transition matrices of order N can be combined by the following operations, where $\alpha(a,b)$ , $\beta(a,b)$ are transition expressions for transitions from state a to state b. Matrix disjunction: $[\alpha(a,b)]$ v $[\beta(a,b)]$ = $[\alpha(a,b)$ v $\beta(a,b)]$ . Matrix concatenated conjunction: $[\alpha(a,b)]$ ? $[\beta(a,b)] = [\gamma(a,b)]$ , where $$\gamma(a,b) = \sum_{i=0}^{N-1} \alpha(a,i) \gamma \beta(i,b)$$ , where $\Sigma$ represents disjunction. Matrix (concatenated) power: $M^1 = M$ $M^n = N^{n-1} M$ Sum of matrix powers: $\sum_{i=1}^{n} M^{i} = M v M^{2} v ... v M^{n}$ The characteristic matrix C(M) of a transition matrix M is obtained by replacing the elements of M with zeros or ones, according to whether the elements do or do not reduce to $\emptyset$ , i.e., according to whether transitions from state a to state b are not or are possible by M. A direct-transition matrix M is directly strongly connected if and only if every element of C(M) is unity. Inequality between characterizing matrices is defined by $C(M) \leq C(N)$ if and only if for each element $\alpha(a,b)$ of C(M) and the corresponding element $\beta(a,b)$ of C(N), $\alpha(a,b) \leq \beta(a,b)$ , i.e., $\alpha \supset \beta$ . THEOREM 4: Let M be a direct-transition matrix of order n. A transition from state $S_j$ to state $S_j$ in exactly w steps is possible if and only if the element $\alpha(i,j)$ of $C(M^W)$ is one. A transition from $S_i$ to $S_j$ in w or less steps is possible if and only if the element $\alpha(i,j)$ of $$C\begin{pmatrix} x \\ \sum_{k=1}^{w} M^k \end{pmatrix}$$ is one. A transition from state $S_i$ to state $S_j$ is possible if and only if (a) for $i \neq j$ , the element $\alpha(i,j)$ of $$C\begin{pmatrix} n-1 \\ \sum_{k=1}^{n-1} M^k \end{pmatrix}$$ is one; (b) for i = j, the element $\alpha(i,j)$ of $$C\begin{pmatrix} n \\ \sum_{k=1}^{n} M^k \end{pmatrix}$$ is one. A net is strongly connected if and only if every element of $$C\begin{pmatrix} n \\ \sum_{k=1}^{n} M^k \end{pmatrix}$$ is unity. If $\alpha$ and $\beta$ are positive integers, then $$\mathbf{C} \begin{pmatrix} \alpha \\ \sum\limits_{k=1}^{\alpha} \mathbf{M}^k \end{pmatrix} \ \leqq \ \ \mathbf{C} \begin{pmatrix} \alpha + \beta \\ \sum\limits_{k=1}^{\alpha} \mathbf{M}^k \end{pmatrix}$$ $$C\begin{pmatrix} n \\ \sum\limits_{k=1}^{n} M^k \end{pmatrix} = C\begin{pmatrix} n + \alpha \\ \sum\limits_{k=1}^{n} M^k \end{pmatrix} .$$ To prove this theorem, we first examine the structure of the elements of $M^W$ , where M is the direct-transition matrix. Each element $\alpha(i,j)$ is a disjunction of concatenated conjuncts, each of the form $I_{ia_1}\ I_{a_1a_2}\ \dots\ I_{a_Wj}$ . Clearly a transition from $S_i$ to $S_j$ in exactly w steps is possible if and only if at least one of these concatenated conjunctions does not reduce to $\emptyset$ , i.e., if and only if the element $\alpha(i,j)$ of $C(M^W)$ is unity. A matrix $$\sum_{i=1}^{w} M^{k}$$ has as its elements transition expressions covering transitions in 1, 2, ..., or w steps, and hence the elements of $$C\left(\sum_{i=1}^{W}M^{k}\right)$$ are one or zero according to whether a transition from $S_i$ to $S_j$ can or cannot be made in w or less steps. It remains for us to show that beyond a certain power (n for i=j, n-l for i\neq j), raising M to a higher power does not add to the possible transitions that can occur, but only to the way in which they occur. Consider two distinct states, $S_i$ and $S_j$ . There are only n-2 other states to pass through. The automaton's being in one of these states $S_k$ for more than one moment of time does not increase the possibility of getting to $S_j$ from $S_i$ , since whatever can be accomplished from a later occurrence of $S_k$ can be accomplished from the first occurrence of $S_k$ . The argument is similar for possible transitions from $S_i$ to $S_i$ , with the difference that here we must consider n-l other states. ### 4. CYCLES, NETS, AND QUANTIFIERS #### 4.1. DECOMPOSING NETS In this section we discuss cycles in nets and their bearing on the application of logic to net analysis. As a first step we discuss the elimination of unnecessary cycles from nets. A well-formed net (w.f.n.) may have unused switching-element input wires. This is especially likely to be the case for a coded normal form net constructed from a characterizing table, for not all bits of D and I need influence a given delay input junction. By inspection of the characterizing table of a w.f.n., we can tell which bits are irrelevant to a switch output $C_i$ . A particular bit $A_j$ of I D is irrelevant to $C_i$ if and only if for each pair I D identical in every position $A_j$ the value of $C_i$ is the same. Using this criterion we can eliminate all the unused switch input wires by replacing the original switching elements with other elements which behave the same for all inputs and on which every switch input has an influence. The same process can be applied to an output table and an output switch. It should be noted that the above process is a minimization technique, i.e., a technique for producing a simpler net which realizes the same transformations as the original net. In Section 2.3 we showed how to minimize the number of delay elements by using a coded normal form. Other minimization methods are implicit in the results of preceding sections. For example, if two junctions of a net behave the same (cf. the decision procedure of Section 2.2), one may be eliminated. Note that for these minimization procedures we can work from complete tables, characterizing tables, and output tables; we need not refer to the net diagrams at all. However, our main interest at present is not in minimality in general, but in minimality only insofar as it relates to the number and nature of cycles in a net. For example, every normal form net with at least one delay element will have cycles, while the corresponding net with no irrelevant switching-element inputs may have either fewer cycles or perhaps no cycles at all. For this reason we shall hereafter consider only such nets. Our next task is to define a measure of the complexity of the cycles of a net. A sequence of junctions $A_1, A_2, \ldots, A_n, A_1$ (possibly with repetitions) constitutes a cycle if and only if each Aj is an input to an element whose output is $A_k$ , where $k \equiv j+1$ modulo n. Thus a junction occurs in a cycle if it is possible to start at that junction, proceed forward (in the direction of the arrows) through switching elements and delay elements, and ultimately return to the junction. A junction which does not occur in a cycle has degree zero, as does an input-independent junction. It should be noted that this definition assigns degree zero to some junctions occurring in cycles, i.e., to all inputindependent junctions which occur in cycles. The reason will become clear in the next subsection. For the same reason we require a further modification of the net before degrees are assigned to the remaining junctions of it. That modification is to replace all cycles containing both input-independent and non-input-independent junctions by cycles containing only one of these kinds of junctions. Let C be an input-independent junction occurring in a cycle with a non-input-independent junction E. Break the cycle at C by deleting the element whose output wire is joined to junction C; to make the net behave the same, we connect C to the output of a subnet which realizes the input-independent transformation originally realized by C. Such a subnet may be so constructed that it has only one cycle, and such that every junction in it is an inputindependent junction (Burks and Wright, 1 Theorem XIII, p. 1363). Thus, given any net N, we can find an equivalent net $N^{\dagger}$ with no more cycles than N and which has no cycles containing both input-independent and non-input-independent junctions. We say that a net with no irrelevant switching-element inputs and with no cycles containing both input-independent and non-input-independent junctions is in reduced form. We assign degrees to all the junctions of N' (and hence derivatively to all junctions of N) as follows. The degree of a non-input-independent junction which occurs in a cycle is the maximum number of distinct delay elements it is possible to pass through by traveling around cycles in which the junction occurs. Figure 5 shows a net with the degree of each junction in parentheses. (We stipulate that in Fig. 5 the switching functions are so chosen that no junction is input-independent, and the net is in reduced form.) Note that in order to get to both $E_2$ and $E_3$ from $C_0$ , it is necessary to pass through $E_1$ twice. The degree of a net is the maximum of the degrees of its junctions. Figure 5 is of degree 3. A net is entirely connected if and only if its degree is greater than zero and the number of delay elements in it is equal to its degree. This notion should be compared with the analogous notion of "strongly connected," defined in Section 2.3. We define directly entirely connected analogously to the notion "directly strongly connected" of Section 3.3; that is, in a directly entirely connected net it is possible to start at any delay output junction and proceed forward to any delay input junction, passing only through switching elements. One of these sets of notions concerns states; the other set concerns the bits used to represent states. Figure 5 is not entirely connected, but it may be completely decomposed into two nets of degree zero (the net $A_1$ - $E_0$ and the net $E_4$ - $E_1$ - $A_1$ - $C_3$ - $E_5$ ) and two entirely connected subnets ( $E_0$ - $C_0$ - $E_1$ - $A_2$ - $C_1$ - $E_2$ - $E_3$ and $E_0$ - $C_0$ - $E_4$ - $C_2$ ). A <u>maximal</u> entirely connected subnet associated with a net junction, say F, is the net formed of all junctions which occur in a cycle with F, together with the elements between these junctions, and the switch input junctions of all switches whose output junctions are in a cycle with F. Subnet 2 of Fig. 5 (the net $E_0-C_0-E_1-A_2-C_1-E_2-E_3$ ) is a maximal entirely connected subnet associated with the junctions $E_2$ , $C_1$ , $C_0$ , $E_1$ , $E_3$ . The part of this subnet which results by deleting the delay element between $E_1$ and $E_3$ is an entirely connected subnet of Fig. 5, but it is not maximal. Since any two junctions of a net either do or do not occur in the same cycle, each element of a net either belongs to a subnet of degree zero (i.e., is not in a cycle) or belongs to a unique, maximal, entirely connected subnet of the original net. (Note in this connection that "occurring in the same cycle" is a transitive relation.) A given net element may belong to several subnets of degree zero; e.g., the delay $C_3$ - $E_5$ of Fig. 5 belongs to subnet 4 and to the subnet consisting of itself. There are various ways to group the elements connected to junctions of degree zero into maximal subnets, of which we will give one. Let A be a junction of degree zero and B be any other junction of the net. Proceeding forward from B to A along a certain path, we can pass through $n(n \ge 0)$ maximal entirely connected subnets before arriving at A. Note that n is bounded, for if we could pass through a given maximal entirely connected subnet M and then (always proceeding forward in the direction of the arrows) later come back to M and pass through it again, it would not be the case that M is maximal. Since there are a finite number of junctions in the net and a finite number of paths from each to A, there is a maximum such number N to be associated with A. Then group into a maximal subnet of degree zero all the elements lying between junctions with the same maximal numbers N, together with the input wires of all switches whose output junctions are assigned the number N. Subnet 4 of Fig. 5 is a maximal subnet of degree zero. It is by now clear that any net in reduced form can be uniquely and effectively decomposed into maximal entirely connected subnets and maximal subnets of degree zero, i.e., into maximal subnets of various degrees. Figure 5 is uniquely decomposed into the four subnets shown there. To decompose a net, one need only find the degrees of the junctions, one by one, remove all input-independent junctions which occur in cycles with non-input-independent junctions, determine the classes of junctions belonging to the same cycles, and then determine the maximal subnets of degree zero. A rank can then be assigned to each maximal subnet inductively. A maximal subnet which has no net inputs or whose only inputs are net inputs is of rank O. A maximal subnet which has at least one input from another maximal subnet of rank r and no inputs from maximal subnets or rank greater than r, is of rank r+1. See Fig. 5 for an example of ranks. There may, of course, be several maximal subnets of the same rank. It is clear that every maximal subnet has a unique rank, for there cannot be two such subnets driving each other, else they would not be maximal (cf. Theorem IX of Burks and Wright I). It is worth noting that if each maximal subnet of a net is replaced by a single box with inputs and outputs, the result is a diagram without cycles. The following structure theorem summarizes these results. THEOREM 5: Any net in reduced form may be uniquely decomposed into (one or more) maximal subnets, each of which has a unique degree and rank. We conclude this subsection with a conjecture: For any degree d, there is some transformation not realized by any net of degree d. This means that there is no maximal degree such that any transformation can be realized by a net of this degree. Our grounds for making this conjecture are as follows. Consider counters with one input, designed to produce an output modulo m. When m is a power of two, one can construct a sequence of binary counters, each of degree one, each driving its successor, except the last one, which drives nothing but produces the desired output, and the whole net will be of degree 1. When m is not a power of two, the standard way of constructing the desired counter is to take a counter modulo a power of two, sense with a switch when it reaches m-1, and use that information to clear the counter back to zero. But such a feedback loop produces a net of arbitrarily high degree. Considerations of this sort lead us to believe that the conjecture is true. ### 4.2. TRUTH FUNCTIONS AND QUANTIFIERS We have already indicated (Section 2.2) the close correspondence between switching nets (switches) and the theory of truth functions (the propositional calculus, Boolean algebra). That correspondence permits us to assign variables to switch inputs and to associate with each switch output a truthfunctional expression which is a truth function of the input variables for that switch. Thus we can represent the output of a switch as an explicit function (in particular, a truth function) of its inputs. It is natural to seek analogs for well-formed nets in general. We will give an analog for nets of degree zero and then discuss the problem for nets of arbitrary degree. Consider first nets with delays but without cycles. For these we can express each output as an explicit function of the inputs by using the theory of truth functions enriched with the delay operator $\delta$ . Thus in Fig. 5 $E_0 \equiv \delta A_1$ . That this can always be done for noncyclic nets can be proved from the formation rules (with rule 5 deleted, of course); the considerations involve generalizations of those connected with the concept of rank in Burks and Wright, p. 1361 ff. We next mention two theorems for delay nets without cycles. The first concerns shifting a delay operator across a logical connective; for example, $\delta(A \& B) \equiv \delta A \& \delta B$ . To prove this formula, we apply the definition of $\delta$ to both sides: $$\delta[A(O) \& B(O)] \equiv \delta A(O) \& \delta B(O) \equiv O$$ $$\delta[A(t+1) \& B(t+1)] \equiv \delta A(t+1) \& \delta B(t+1) \equiv A(t) \& B(t)$$ . The legitimacy of this operation is connected to the fact that conjunction is a positive truth function (i.e., has the value zero when all its arguments are zero). In general, if P is a positive truth function, the following holds: $$\delta P(A_1, A_2, ...) \equiv P(\delta A_1, \delta A_2, ...)$$ . Both v and $\neq$ are also positive truth functions. A negative truth function is one which has the value one when all arguments are zero; $\sim$ , $\mid$ , $\downarrow$ , $\equiv$ , and $\supset$ are examples. To develop analogous principles for these, we need an operator $\delta$ ' defined by $$\delta'A(0) \equiv 1$$ $$\delta' A(t+1) \equiv A(t)$$ . If N is a negative truth function, we have $$\delta'N(A_1, A_2, ...) \equiv N(\delta A_1, \delta A_2, ...)$$ . If the negative function is not tautologous (i.e., not true for all values of its variables), then $$\delta N(A_1, A_2, ...) \equiv N(\delta^{\alpha_1}A_1, \delta^{\alpha_2}A_2, ...)$$ where each $\delta^{\alpha_i}$ is either $\delta$ or $\delta'$ . For example, $\delta \sim A \equiv \sim \delta'A$ . Note that two formulae which are the same except for the absence or presence of primes on deltas describe two functions which differ only initially; after some fixed time which is determined by the number of deltas involved they are equivalent. Shifting deltas across truth-functional connectives is equivalent to shifting all delay elements to inputs, so the resultant net consists of delays followed by a switch. This theorem can often be used to simplify expressions and nets. For example, consider a net which realizes $\delta(\delta A \neq A) \neq (\delta A \neq A)$ . Applying the theorem, we get $(\delta \delta A \neq \delta A) \neq (\delta A \neq A)$ , which by the theory of truth functions reduces to $\delta \delta A \neq A$ . The second theorem concerns input-independent transformations. By a result of Section 2 every such transformation is periodic, and hence is of the form $\emptyset \alpha \alpha \alpha \ldots$ , where $\emptyset$ and $\alpha$ are binary words. For example, in 1010100100100... $\emptyset$ is 1010 and $\alpha$ is 100. We call the length of $\alpha$ in bits the periodicity of the transformation, assuming that $\alpha$ is of minimum length. periodicity of our example is three (not six, or nine, or etc.). The second theorem states that the class of input-independent transformations realized by cycle-free nets is equivalent to the class of periodic transformations of period one. We omit a detailed proof. The essential point in showing that every input-independent transformation realized by a cycle-free net is of period one lies in the fact that an automaton without cycles cannot remember anything for more than a fixed period of time. To show that every periodic transformation of period one can be realized by a noncyclic net, we can use part of the construction of the figure for Theorem XIII of Burks and Wright. With this construction we can realize any transformation of the form \$0000.... realize a transformation of the form $\psi$ 1111..., we feed $\overline{\psi}$ 0000... through a negation element, where $\overline{\psi}$ is the bitwise complement of $\psi$ . Consider next input-independent nets, i.e., nets all of whose internal junctions realize input-independent transformations. These nets may have cycles. Nevertheless, we can express the behavior of a net output as an explicit function of the inputs (in a vacuous sense) without using quantifiers. To do so it suffices to state the times at which the junctions are active. Thus, for F(t) = 111010101..., we have $F(t) \equiv [(t=1) \ v \ (t \equiv 0 \ \text{mod } 2)]$ . We can now let an input of a noncyclic net be driven by an input-independent junction, and by making an appropriate substitution still obtain an expression for the output as an explicit function of the inputs. Thus, given $$C(t) \equiv A_O(t) \& A_1(t)$$ , we can identify A<sub>1</sub> with F above and obtain $$C(t) \equiv A_O(t) \& \delta[(t = 1) \ v \ (t \equiv 0 \ mod \ 2)]$$ $$\equiv A_O(t) \& [(t = 2) \ v \ \{(t > 0) \ \& \ (t \equiv 1 \ mod \ 2)\}] .$$ We can further extend out theory of truth functions to include expressions like those just used. By adding t = a, t > a, $(t-a) \equiv c \mod b$ , where t is a variable and a,b, and c are integers, we can describe any periodic function (using, of course, the truth-functional connectives). We call the theory obtained by adding these symbols and the operator $\delta$ the <u>extended theory of truth functions</u>. It is clear from the preceding discussion that the following theorem holds. THEOREM 6: For every junction of a net of degree zero, we can effectively construct a formula of the extended theory of truth functions which describes the behavior of the junction as an explicit function of the behavior of the inputs. This theorem provides the motivation for our decision in the preceding subsection to classify input-independent junctions occurring in cycles along with non-input-independent junctions not occurring in cycles, for both can be handled by our extended theory of truth functions. A much more difficult problem is to find formulae which describe the behavior of junctions of degree greater than zero as explicit functions of the net inputs. The natural place to seek such formulae is quantification theory, the next step beyond truth-function theory in the usual development of symbolic logic. The theory of quantifiersuses, in addition to the truth-functional connectives, the quantifiers "(x)" ("all x"), "( $\pm$ x)" or "( $\pm$ x)" ("some x"), etc. The functional expressions of net theory "A(t)," "B(t+3)," etc., are clearly monadic propositional functions or predicates. An essential feature of a deterministic net is that an output C(t) cannot depend on any inputs for times greater than t; hence the quantifiers used must be bounded. These bounds may be expressed by predicates such as "x < t" and "x $\leq$ y < t," which are basically dyadic (the second is triadic but is easily reduced to dyadic predicates). Hence the required form of quantification theory involves monadic predicates and bounded quantifiers ranging over the nonnegative integers. Figure 6A shows a very simple cyclic net; it is described by the bounded quantifier expression (4.2-1) $$E(t) \equiv (Ex):x < t \cdot A(x)$$ , which states that E is active at t if and only if A has been active at some prior time. The slightly more complicated cyclic net shown in Fig. 6B is described by the quantifier expression $$(4.2-2)$$ $C(t) \equiv (Ex)::x \leq t.A_{O}(x):.(y):x \leq y \leq t.\supset A_{1}(y)$ , which asserts that C is active at time t if and only if there is some nonlater time $\mathbf{x}$ at which $A_O$ was active and such that at that time and all later times $A_1$ was active. It is easy to give examples of quantifier formulae for much more complicated nets with cycles. Whether or not formulae of this type can be found for arbitrary w.f.n. is an open question. Fig. 6. Two simple nets. It should be noted that in the above examples the quantifier expressions do describe the output as an explicit function of the inputs; i.e., the only function variables on the right are input variables. That is analogous to using a truth-functional expression to describe a switch output as a truth function of its inputs alone. It stands in contrast to the recursive methods for describing net behavior used previously, in which the output was expressed as a function not only of the input junctions of the net but also (in general) of the internal junctions (at an earlier time). In some cases such a recursive formulation is the natural way of specifying the behavior of a desired circuit. On the other hand, it is often simpler and more direct to specify the behavior of a net in terms of the inputs alone by means of quantifiers and simple arithmetic predicates like "is odd," "is between m and n," etc. Hence it is of interest to develop a form of quantification theory that will facilitate this method of characterizing an automaton and to find both effective (in the purely theoretical sense) and practical ways of passing from formulae in the calculus to the corresponding automaton nets and vice versa. The problem of finding a quantifier formula for a net characterized recursively may be viewed as one of converting recursive definitions into explicit ones. As we have remarked in Section 2.2, the transformation realized by each delay output of a net is primitive recursive relative to the net inputs. Theoretically one can use the well-known procedures for converting primitive recursive functions (cf. Hilbert and Bernays, 17 pp. 412-421) to obtain the desired result. As it turns out, however, this method produces quantifier expressions in which some quantified variables range not over time but over the history of the states of the delay outputs. The quantifier expressions so obtained are intuitively always no more and actually less transparent than the corresponding recursive characterization. ### 4.3 NERVE NETS We will close this paper with a few remarks about nerve nets and cycles in nets. A nerve net is a special case of a well-formed automaton net, in which each neuron consists of a positive switching element driving a delay element. Hence our general results apply to nerve nets. Not all transformations realized by well-formed nets can be realized by nerve nets. According to Theorem 2, every transformation realized by a w.f.n. can be realized by a decoded normal form net. By the results of Section 4.1 the starter of a decoded normal form net may be constructed without cycles. Hence we can construct a decoded normal form net whose cycles pass through conjunctions and delays only. Hence every transformation realized by a w.f.n. can be realized by a w.f.n. in which the only positive switches occur in cycles. A neural net is a net in which only positive switches occur in cycles. It differs from a decoded normal form net in two basic respects: first, it has no starter, and second, every switch is combined with a delay. Hence, if a starter is added to the system of nerve nets, every automaton transformation can be realized by a nerve net, except that the nerve-net output may be later in time because each neuron has a delay built into it. Usually the total time lag can be made two, because a disjunctive normal form expression, e.g., $(p \cdot \overline{q})$ v $(\overline{p} \cdot q)$ , is a disjunction of conjuncts (see, for example, Kleene, ll Theorem 3). Kleenell has investigated the logic of nerve nets in some detail. He analyzes nets in terms of the kinds of events (input histories) they can detect, and he establishes the result that an event can be detected by a net if and only if the event is regular (Theorems 3 and 5). The reader is referred to page 22 of Kleenell for a definition of "regular"; we note here merely that an important ingredient of the notion of regularity is periodicity. For example, an input of the form $\alpha \alpha \ldots \alpha \beta$ , with an indefinite number of $\alpha$ 's, is regular. It is easy to construct a net which will be active at time t if and only if the history of its input is of the form $\alpha \alpha \ldots \alpha \beta$ , for an indefinite number of $\alpha$ 's; cf. the discussion of Section 4.2 on periodic transformations. The pervasiveness and importance of cycles in the analysis of automata and nerve nets are worth emphasizing. When cycles are permitted in automata nets, these nets become much more powerful, and, correspondingly, the logic required to treat them becomes much more complicated. There are many ways in which nets can involve cycles. We have just noted that by Kleene's results an important aspect of any input history which can be detected or distinguished by automata is the periodicity ingredient in its regularity. By our results of the previous subsection the internal structure of an automaton is analyzable into cycles; and by earlier results (see Section 2.2) any output which is independent of the inputs is periodic, and hence cyclic in character. The relations between these various cyclic aspects of automata remain to be investigated. It would be of interest to have a theory which shows how they are interconnected. #### BIBLIOGRAPHY - 1. Burks, Arthur W., and Jesse B. Wright, "Theory of Logical Nets," <u>Proc.</u> IRE, 41: 1357-1365 (1953). - 2. Burks, Arthur W., and Irving M. Copi, "The Logical Design of an Idealized General-Purpose Computer," <u>J. Franklin Inst.</u>, <u>261</u>: 299-314 and 421-436 (1956). - 3. Shannon, Claude, "Computers and Automata," <u>Proc. IRE</u>, <u>41</u>: 1234-1241 (1953). - 4. Rochester, N., J. H. Holland, L. H. Haibt, and W. L. Duda, "Tests on a Cell Assembly Theory of the Action of the Brain, Using a Large Digital Computer," <u>IRE Trans. on Information Theory</u>, 1956, pp. 80-93. - 5. Turing, A. M., "On Computable Numbers, with an Application to the Entscheidungsproblem," <u>Proc. London Math. Soc.</u> (Series 2), <u>42</u>: 230-265 (1936-37), with a correction, ibid., <u>43</u>: 544-546 (1937). - 6. Kleene, S. C. <u>Introduction to Metamathematics</u>. New York: D. Van Nostrand Company, Inc., 1952. - 7. Wang, Hao, "A Variant to Turing's Theory of Computing Machines" (to be published in J. Assn. Computing Machinery). - 8. Wang, Hao, "Universal Turing Machines: An Exercise in Coding" (to be published). - 9. von Neumann, John, "The General and Logical Theory of Automata," pp. 1-41 in Cerebral Mechanisms in Behavior, John Wiley and Sons, 1951. - 10. Kemeny, John G., "Man Viewed as a Machine," Scientific American, 192: 58-67 (1955). - 11. Kleene, S. C., "Representation of Events in Nerve Nets and Finite Automata," pp. 3-41 in <u>Automata Studies</u>, edited by C. E. Shannon and J. Mccarthy, Princeton Univ. Press, 1956. - 12. Shannon, Claude, "A symbolic analysis of relay and switching circuits," Trans. AIEE, 57: 713-723 (1938). - 13. Burks, Arthur W., Robert McNaughton, Carol H. Pollmar, Don W. Warren, and Jesse B. Wright, "Complete Decoding Nets: General Theory and Minimality," J. Soc. Ind. Appl. Math., 2: 201-243 (1954). - 14. De Turk, J. E., A. L. Garner, J. Kautman, A. W. Bethel, and R. E. Hock. Basic Circuitry of the MIDAC and MIDSAC. Ann Arbor: Univ. of Mich. Press, 1954. - 15. Buck, D. A., "The Cryotron—A Superconductive Computer Component," <u>Proc.</u> <u>IRE</u>, <u>44</u>: 482-493 (1956). - 16. Moore, Edward F., "Gedanken-Experiments on Sequential Machines," pp. 129-153 in Automata Studies, edited by C. E. Shannon and J. McCarthy, Princeton Univ. Press, 1956. - 17. Hilbert, D., and P. Bernays. <u>Grundlagen der Mathematik</u>. Vol. 1. Berlin: Springer, 1934. #### DISTRIBUTION LIST (One copy unless otherwise noted) ### Alabama The Air University Libraries Maxwell Air Force Base, Alabama #### California Applied Mathematics and Statistics Laboratory Stanford University Stanford, California Department of Mathematics University of California Berkeley, California #### Commander Air Force Flight Test Center Attn: Technical Library Edwards Air Force Base, California The Rand Corporation Technical Library 1700 Main Street Santa Monica, California Director, Office for Advanced Studies, Air Force Office of Scientific Research Post Office Box 2035 Pasadena 2, California #### Commander Western Development Division Attn: WDSIT Post Office Box 262 Inglewood, California #### Connecticut Department of Mathematics Yale University New Haven, Connecticut #### Florida Commander Air Force Armament Center Attn: Technical Library Eglin Air Force Base, Florida Commander Air Force Missile Test Center Attn: Technical Library Patrick Air Force Base, Florida #### Illinois Department of Mathematics Northwestern University Evanston, Illinois Institute for Air Weapons Research Museum of Science and Industry University of Chicago Chicago 37, Illinois Department of Mathematics University of Chicago Chicago 37, Illinois Department of Mathematics University of Illinois Urbana, Illinois ### Maryland Institute for Fluid Dynamics and Applied Mathematics University of Maryland College Park, Maryland Mathematics and Physics Library The Johns Hopkins University Baltimore, Maryland DISTRIBUTION LIST (Continued) ### Massachusetts Department of Mathematics Harvard University Cambridge 38, Massachusetts #### Commander Air Force Cambridge Research Center Attn: Geophysics Research Library L. G. Hanscom Field Bedford, Massachusetts #### Commander . Air Force Cambridge Research Center Attn: Electronic Research Library L. G. Hanscom Field Bedford, Massachusetts ### Michigan Department of Mathematics Wayne University Attn: Dr. Y. W. Chen Detroit 1, Michigan Willow Run Research Center University of Michigan Ypsilanti, Michigan #### Minnesota Department of Mathematics Folwell Hall University of Minnesota Minneapolis, Minnesota Department of Mathematics Institute of Technology Engineering Building University of Minnesota Minneapolis, Minnesota #### Missouri Department of Mathematics Washington University St. Louis 5, Missouri ### Missouri (Concluded) Department of Mathematics University of Missouri Columbia, Missouri Linda Hall Library Attn: Mr. Thomas Gillis Document Division 5109 Cherry Street Kansas City 10, Missouri #### Nebraska Commander Strategic Air Command Attn: Operations Analysis Offutt Air Force Base Omaha, Nebraska ### New Jersey The James Forrestal Research Center Princeton University Princeton, New Jersey Library Institute for Advanced Study Princeton, New Jersey Department of Mathematics Fine Hall Princeton University Princeton, New Jersey #### New Mexico Commander Holloman Air Development Center Attn: Technical Library Holloman Air Force Base, New Mexico Commander, Air Force Special Weapons Center, Attn: Technical Library Kirtland Air Force Base, Albuquerque, New Mexico ### DISTRIBUTION LIST (Continued) ### New York Professor J. Wolfowitz Mathematics Department White Hall Cornell University Ithaca, New York Department of Mathematics Syracuse University Syracuse, New York Mathematics Research Group New York University Attn: Professor M. Kline 45 Astor Place New York, New York Department of Mathematics Columbia University Attn: Professor B. O. Koopman New York 27, New York Department of Mathematical Statistics Fayerweather Hall Attn: Dr. Herbert Robbins Columbia University New York 27, New York Mr. I. J. Gabelman Rome Air Development Center Attn: RCOS Griffiss Air Force Base Rome, New York ${\tt Commander}$ Rome Air Development Center Attn: Technical Library Griffiss Air Force Base Rome, New York Institute for Aeronautical Sciences 2 East 64th Street New York 21, New York ### North Carolina Institute of Statistics North Carolina State College of A and E Raleigh, North Carolina Department of Mathematics University of North Carolina Chapel Hill, North Carolina Office of Ordnance Research (2) Box CM Duke Station Durham, North Carolina Department of Mathematics Duke University Duke Station Durham, North Carolina #### Ohio Commander Air Technical Intelligence Center Attn: ATIAE-4 Wright-Patterson Air Force Base, Ohio Commander Wright Air Development Center Attn: Technical Library Wright-Patterson Air Force Base, Ohio Commander (2) Wright Air Development Center Attn: ARL Technical Library, WCRR Wright-Patterson Air Force Base, Ohio (2) USAF Institute of Technology Attn: Technical Library, MCLI Wright-Patterson Air Force Base, Ohio Chief, Document Service Center (10) Armed Services Technical Information Agency, Knott Building - ^ Commandant Dayton 2, Ohio DISTRIBUTION LIST (Concluded) ### Pennsylvania Department of Mathematics Carnegie Institute of Technology Pittsburgh, Pennsylvania Department of Mathematics University of Pennsylvania Philadelphia, Pennsylvania #### Tennessee Commander Arnold Engineering Development Center Attn: Technical Library Tullahoma, Tennessee Dr. Alston S. Householder Oak Ridge National Laboratory Post Office Box P Oak Ridge, Tennessee #### Texas Defense Research Laboratory University of Texas Austin, Texas Department of Mathematics Rice Institute Houston, Texas Commander Air Force Personnel and Training Research Center Attn: Technical Library Lackland Air Force Base San Antonio, Texas ### Wisconsin Department of Mathematics University of Wisconsin Madison, Wisconsin Mathematics Research Center Attn: R. E. Langer University of Wisconsin Madison, Wisconsin #### Washington, D. C. Human Factors Operations Research Laboratories Air Research and Development Command Bolling Air Force Base Washington 25, D. C. Chief of Naval Research Department of the Navy Attn: Code 432 Washington 25, D. C. Department of Commerce Office of Technical Services Washington 25, D. C. Director of National Security Agency Attn: Dr. H. H. Campaigne Washington 25, D. C. Library National Bureau of Standards Washington 25, D. C. National Applied Mathematics Laboratories National Bureau of Standards Washington 25, D. C. Headquarters, USAF Director of Operations Attn: Operations Analysis Division, AFOOP Washington 25, D. C. Commander (2) Air Force Office of Scientific Research, Attn: SROAM Washington 25, D. C. Commander Air Force Office of Scientific Research, Attn: SRRI Washington 25, D. C. ### Belgium Commander (2) European Office, ARDC 60 Rue Ravenstein Brussels, Belgium 3 9015 02082 7799