UNIVERSAL EMBEDDING SPACES FOR AUTOMATA John H. Holland

THE UNIVERSITY OF MICHIGAN COLLEGE OF LITERATURE, SCIENCE, AND THE ARTS Communication Sciences Program Technical Report UNIVERSAL EMBEDDING SPACES FOR AUTOMATA John H. Holland ORA Project 06114 under contract with: AF AVIONICS LABORATORY SYSTEMS ENGINEERING GROUP AIR FORCE SYSTEMS COMMAND CONTRACT NO. AF33(615)-1162 WRIGHT-PATTERSON AIR FORCE BASE, OHIO administered through: OFFICE OF RESEARCH ADMINISTRATION ANN ARBOR May 1964

1. 1. INTRODUCTION In his unfinished manuscript, The Theory of Automata, von Neumann introduces an unusual space obtained by iterating a 29-state automaton to form an infinite 2-dimensional array. The 29-state automaton is so chosen that the resulting space is computation-universal ("logical-universal"): given any computation, there is a pattern of initial states which will cause that computation to be carried out in the space. Although particular examples and classes of computation-universal spaces have been considered since [Burks, 1960; Church, 1957; Holland, 1960-a; Moore, 1962; Myhill, 1962] our knowledge of the role of these spaces in the study of automata and computation remains meager, In spite of this lack, there can be little doubt of the possibilities offered by universal spaces. Von Neumann's particular usage shows just how far-reaching and unusual these may be: By using his space as a framework within which to represent the construction of automata from simple elements, he was able to give the first careful treatment of self-reproducing automata. In a different vein, computation theory offers several appealing opportunities for the use of universal spaces. For instance, a universal space can serve as a uniform formal context for comparing computation procedures as to complexity and efficiency. The usual approach via Turing machines fixes "work square", "description", and similar conventions at tile outset, typically with metalevel statements; thus effects of changes in convention are placed beyond the theory's reacl. In contrast, thie conventions associated with different computation procedures can be dealt with explicitly in a universal space. This is an important gain since changes in convention drastically affect the procedure employed (the machine required) to carry out a computation.

2. Tile present paper seeks to bring the possibilities of universal spaces into sharper focus. It begins (Compositions of Finite Automata) by characterizing the automata which can be built up from an arbitrary finite or countably infinite set of automata. The general idea of representing (or embedding) the structure of one automaton in another is then examined (Embeddings and Computation Procedures). Finally a general class of universal embedding spaces is defined, some important subclasses are distinguished and several properties of the subclasses apropos to studies of computation and construction are investigated (Universal Compositions). In treating construction, one cannot escape a preliminary commitment to definition of automata in terms of "connected" sets of component automata. A straightforward way to accomplish this is to treat an automaton as a device for transforming sequences into sequences: the two finite functions used to define an automaton, the transition and output functions, can easily be extended to functionals mapping sequences into sequences. (This contrasts with the usual treatment based upon acceptance of finite strings.) "Connection" then amounts to using the output sequence of one automaton as the input sequence of another. A function, called a "composition function" in part 2, can be used to specify which input sequences are to be identified with which output sequences. By restricting the class of composition functions appropriately, it can be assured that: (i) the composition of a finite set of automata is an automaton; (ii) the composition of a countably infinite set of automata effectively defines a unique output sequence for any component automaton for each choice of free input sequences and assignment of initial states (free input sequence = an input sequence not constrained by the composition function).

3. Von Neumann's space is an example of a composition on a countably infinite set of copies of a single 29-state automaton. Part 2 gives these ideas a more rigorous formulation. Functionals, of a restricted kind mapping sequences into sequences, become the mathematical objects of study corresponding to the component automata. From the mathematical point of view, then, the composition function induces a set of simultaneous equations on the functionals. These equations correspond to the constraints imposed by the identification of sequences. (In effect, the automaton corresponding to a given composition function is defined by imposing constraining equations on a set of functionals corresponding to a "free" product-automaton.) A composition function is considered admissible only if it satisfies a requirement of "local effectiveness"; this requirement assures a unique solution to the (possibly infinite) set of simultaneous equations. The unique solution defines a new functional. If the set of components is finite, there will exist some finite automaton with transition and output functions extending to this functional. That is, the result of composition in this case is again a finite automaton. Von Neumann's interest in construction procedures set three important requirements on the space he defined: (1) it had to be possible to represent (or embed) an automaton in that space as a connected set of components (that is, the representation had to be "structural" as well as "behavioral" so that one could study processes whereby simpler automata are connected (or composed) to produce more complex automata); (2) the space had to be homogeneous (so that the same component could be represented in the same way at any position in the space - an important consideration if construction procedures were not to be dependent upon their location in the space);

4. (3) given any computation, it had to be possible to represent in the space some automaton capable of carrying out that computation (to avoid a priori restrictions on the procedures which could be represented and, hence, studied). By defining similar properties for compositions one can characterize a fairly general class of computation-universal spaces. The generalization of property 1 to compositions depends in an essential way upon the notion of embedding - mapping one composition (the object composition) into another (the image composition) with preservation of structure and behavior. Specifically, for a mapping from one composition to another to be an embedding it must satisfy two conditions: the functionals corresponding to a component of the object composition and its image under the mapping must be the same; the images of sequences identified by the composition function of the object composition must be identified by the composition function of the image composition. With this definition homogeneity, property 2, becomes a requirement that, if a composition can be embedded at all, the same embedding procedure must serve mutatis mutandis to embed that composition anywhere in the image composition. Finally, property 3 becomes a requirement that, given any computation, some composition capable of carrying out that computation must be embeddable in the image composition (in what follows this requirement will be restricted to computations which can be carried out by finite automata - a weakening not altering essential points of the discussion). Part 3 of the paper develops these concepts. The above properties can be combined, as indicated, to define the set, U, of homogeneous, computation-universal compositions. It can easily be shown (part 4) that each member of this set is a composition over a countably infinite set of copies of a single automaton (a single generator) and that the elements

5. of the composition can be located in a regular fashion in an n-dimensional cartesian coordinate system, A natural subclass, Ud, can be defined by adding as a requirement one further property of the von Neumann space -- a requirement that inputs of the generating automaton affect its outputs only after a non-zero "delay". In other words, the generating automaton is required to be of the Moore type. The subclass Ud includes the natural n-dimensional generalizations of von Neumann's two-dimensional space [von Neumann, unpublished] and counterparts of Church's potentially-infinite automata [Church, 1957]; it excludes "most" of the iterative circuit computers [Holland, 1960-a] although these fall within the class U. A second sub-class, Uc, of U can be defined by strengthening property 3 to a requirement that all finite compositions be embeddable; i.e., elements of Uc must be composition-universal and homogeneous. Certain of the iterative circuit computers belong to the class Uc. Part 4 investigates and compares U, Ud, and Uc. It is proved there that "most" finite compositions can not be embedded in a composition of class Ud, even if the notion of embedding is considerably weakened. This is so even though every computation can be realized in the space by some embedded composition. The theorem turns on the failure of Kleene's representation theorem [Kleene, 1956] for the embedded compositions. It is a consequence of the theorem for Ud that two compositions which are embeddable in a space of type Ud may be further composed (interconnected) in such a way that the resulting composition is no longer embeddable (assuming a common "input rate" for all images) Stated in another way: Various construction procedures will not be realizable in spaces of type Ud unless "input rates" are permitted to go to zero as the complexity of the embedded compositions increases. A further consequence: Several different computation procedures may employ a

6 common subroutine, realizable by a single composition, and yet in a space of type Ud the embedded compositions corresponding to the different computations will not have a corresponding common part. Since there do exist compositions which are not only computationuniversal but composition-universal as well, the classes Ud and Uc must be non-empty and disjoint. Hence for a composition to be composition-universal it is necessary that its generating element be of the Mealy type - it must have states for which the input-to-output delay (lag-time) is zero. As a consequence compositions of type Uc, though locally effective, may in some states exhibit arbitrarily long (finite) connected sequences of elements with zero lag-time. Part 5 provides a closing commentary on definitions and results. Two plans for studying computation and construction procedures are contrasted: One may identify the class of "realistic" universal spaces with the class Ud, restricting the admissible realizations of finite automata (by requiring a less than exponential "fan out" in compositions) so as to retain universality. Alternatively one may identify the class of universal spaces with the class Uc, permitting channels with zero delay. It can be shown that any space of type Ud can be embedded in a space of type Uc; thus either approach can in fact be implemented via the spaces Uc.

7. 2. COMPOSITIONS OF FINITE AUTOMATA This study of computation-universal spaces is based upon the possibility of defining an automaton in terms of a "connected" set of component automata. Hence, the first task is to characterize the automata which can be defined by connecting elements of an arbitrary, finite or countably infinite, set of finite automata. A composition function will specify the connections in each case; thus we must characterize the automaton corresponding to each composition function. Since composition is to be defined over arbitrary sets of automata, the sets may include automata of the Mealy type. That is, the output of some component automaton, in some states, may depend upon its current input. (The logical nets of Burks and Wright [Burks and Wright, 1953], which are defined over a set of primitives including "non-delay" switches, are examples of such compositions). In order that the behavior of the resulting compositions be effectively defined it is necessary to restrict the class of composition functions by a requirement of "local effectiveness". Roughly, a composition is locally effective if the state of each component automaton at any time is uniquely determined by the transition functions of automata in some finite "neighborhood" of the component. That is, the state of a component at time t+l can be computed from the internal states and states of free inputs (inputs not constrained by connection), at time t, of a finite, connected subset of elements of the composition - a subset which may in general change with time. In the definitions below most of the complications come from this requirement. However, unless the class of compositions is to be unduly narrowed (excluding thereby "most" logical nets), the requirement cannot be avoided. It will be shown that the behavior of each component of a locally effective composition is uniquely and effectively defined.

8. Compositions will be defined for any indexed set of elements {a c a E A} satisfying the following conditions: A is a finite or countably infinite ordered set a <I, S, 0, f u > a a^ a' a' a a In Ia = a I, a cartesian product on finite sets I i=l a,1 a,i S, a finite set n 0 = n 0, a cartesian product on finite sets 0 a j=l aj a, fa a Sa: Sx u: I x S + O a a a a X and Y will designate the (lexicographically ordered) sets of all pairs of indices used to specify sets I. and 0 j respectively: X = {(a,i) % a c A and i - ma} Y = {(a,j) 3 a E A and j = n } Each quintuple a with attendant conditions constitutes one of the standard definitions of a finite automaton: the elements of I i being interpreted as the states of the ith input of a, the elements of S as the internal states a a and the elements of Oj as the states of the jth output. The quintuple is used to define recursively two families of functionals Fa and U (parametrised by S ) from infinite sequences on I (input sequences) to infinite sequences on S and 0 respectively (internal state sequences and output sequences): F: IN x S SN a a a a U: INx S N = (0,, 1 2,...} a a a a where IN = {I ~ I: N + I } a - C a sN = {S; S: N + S } ONa = r N a 0^ = (^ ^ ^: N ^ 01

9. The functions I, S, 0 are interpreted as infinite sequences. The recursive extension of fa and ua to F. and U can be given as follows: F(I (0),s) = s Fa(I (t+l),s) f(I(t), Fa(I(t)s)) U (I (t),s) = u (I (t), Fa( t)S)) s c Sa t c N An unrestricted composition, A, on {aa} is a quintuple <IA, SA, OA, fA' UA> such that: SA = 11a^ S {s: A + US b a e A and s(a) e S } A a-A aa a m I n a l,i = 1X I = {i: X I+ =IX ) x e X and i(x) I } - eaA i=l (Y. xcX x 0 =1 n na 0 = l 0 = {o: Y > UO Y y E Y and o(y) c 0 } A acA j=l apj yCY y yY Y Under interpretation, i: X + UI is a particular input state assignment to x x the inputs indexed by X, and IA is the set of all possible input state assignments (functions); SA and 0A are interpreted similarly. The transition function fA IA x SA + SA is thus a functional, from assignment functions into assignA A A A ment functions, satisfying the requirement: fA(i,s)(a) = s'(a) = fa(i(a), s(a)) where i e IA, s,s' c SA i(a) = (i(a,l1), i(a,2),..., i(a,m )) a Similarly uA: IA x SA + 0A satisfies uA(i,s)(a) = o(a) = u (i(a), s(a)) Each composition over {a } will be defined in terms of a composition function y satisfying the following conditions: (i) domain, Y' C Y (ii) range, X' C X (iii) y: Y' - X', 1-1 onto

10. (iv) Oy C I (y) Under interpretation the function y determines which inputs of automata in set {a } are connected to outputs of automata in the set. Property (iii) is a requirement that every output be connected to at most one input - "junctions" or "fan-outs" must be represented by a component automaton with multiple outputs. (Note that {a } may include "fan-out" elements with no delay: u (i,s) = i for y = (,l),...,(a,n ), where u =df proj ua(i,s)). Property (iv) assures the compatability of the symbols of the output of index y with the symbols of the input of index x - all encoding and decoding procedures must be carried out explicitly by component automata. Each x c X' in the range of y selects an input of the unrestricted composition A which is to be constrained by the requirement (Vt)[L(t) = O.yl(x)(t)] More formally the composition function induces a set of equations of the form U (I,s) = Iy(y) (*) (For y = (a,j), U(I (t),s) =df proj y U (I (t),s), an infinite sequence over 0 j; IYr(y) is defined similarly). Not all functions y satisfying conditions (i)-(iv) will yield a consistent set of equations: it may be impossible for the sequences 0 (t), y C Y', to satisfy both the equations induced by y and the equations defining the 0y in terms of the functions u. That is, y may specify the equivalent of a cycle of switches without delay, e.g. a negation element with its output connected to its input. Or, if {aa} is countably infinite, y may specify a two-way infinite line of switches, none of which involves delay. To assure the consistency of the equations (*), y must satisfy a further requirement which imposes "local effectiveness" on the equations. This can be accomplished as follows:

11. ee A sequence y,... y such that y c Y', 1 = h a L, will be called connected if, for h < L, proj Y(Yh) = proj Yh+ (where proj y = a if y * (,j)). If proj y = a, proj y, = w and l < o, the connected sequence will be called a sequence from a to w of length I. A connected sequence may also be infinite, either because an infinite number of distinct elements of Y' are involved or because there exists, say, a sequence y,..., y. such that proj v~(Y) = P^oj Y (such as would be generated from elements connected in a cycle by y, cfo [Holland, 1960]). A connected sequence p = {y,..., y} will be called acyclic if h f h' implies proj Yh f proj y 1 - h h' s y will be called consistent with respect to s e SA if, for each a e A, there is an integer s such that every connected sequence to a of length S,a is contains an element Yh = (6,j) such that u,j depends only on the state of a, s(6), and composition inputs of a, (6,i) E X-X'. The requirement that 6 6 u depend only on the state and composition inputs of a can be stated more formally as: (Vi,i e IA)[i X-X' = i 2X-X' = u j(i (6),s(6)) = u (i (6),s(6))] 1 2 _ 1 2 6,j 1 6,J 2 By composition input states (and composition output states) we mean possible state assignments for inputs (and outputs) not constrained by y: I A { ilX-X' i c IA yA A OyA { olY-Y' o c 0A} Theorem. If y is consistent with respect to s e SA then *each i' e I A has a unique extension to an i e IA satisfying equations (*). Proof outline: Let R be the set of all finite acyclic sequences to a, S,a 1{Yl..., Yh'..., ye}, such that uy, y = (6,j), depends only on s(6) and i " Jc ~~yl 1

12. composition inputs to a6, (6,i) c X-X', while each u, 1 > h - Q, depends upon at least one input indexed by x c X'. (Under interpretation, the first element of each sequence involves a "delay" while the remaining elements are "non-delay" switches, when A is a state s.) If y is consistent with respect to s then no sequence in R is of length greater than 9s and every connected sequence to a contains some sequence in R as a final sub-sequence. Note that Ra is finite and can be effectively determined in this case. Let B. a {( B3 = proj y for y an element of some sequence of R }..-S p, (1 1 S P a Bs a indexes the;et of automata having outputs belonging to sequences in Rs as For any i' C I A and for each x such that proj x E B, the equations induced by y determine i(x) uniquely, using just i'BSa and s|Bs a* Moreover, if proj x e B A, i a, the value for i(x) determined from Bs i is the same as that determined from Bs. All of this follows readily upon modifying the algorithm of Theorem X, [Burks and Wright, 1953], so that it applies to the sets Rs SPa Since for every x e X' there exists at least one set B such that proj x c B i' can be extended to i c IA as follows: i'(x) if x c X-X' i(x) = the unique value, u (i(proj x), s(proj x)), Y-l(x) 1 1 determined by the Burks-Wright algorithm if x c X'

13. For y consistent with respect to s, let s(i') e IA denote the extension of i' guaranteed by the above theorem. y will be called a locally effective composition function (LECF) if the following conditions are satisfied: (i) there exists a computable function s c SA for which y is consistent, (ii) if y is consistent for s E SA then, for every i' e IyA Y is consistent with respect to fA (s (i'),s). For Y a LECF define: S = {s c SA 9 y is consistent w.r.t. s and s is computable} y,A fyA(i',s) = f ( (i'),s) u A(i',s) = uA(P(i'),s) for i' C I A s c SA For each LECF y and unrestricted composition A the corresponding composition A, on {a } is the quintuple <I A'S,A'0 A'f, u >. (Where no confusion can arise the subscript A will be dropped, i.e., I will denote Iy A, etc.). Thus the set {A 3 A is an unrestricted composition and y is Y,A - LECF over A} will be the set of all (finite and countably infinite) compositions (see A and By of Fig. 1 as examples). The definitions given earlier for finite automata will be used for all compositions; thus SN = {S 3 S: N - Sy} Y -- Y Y etc. Under interpretation, a composition is so defined that there is at least a unit delay in every cycle of elements and in every infinite non-repeating sequence of elements; note that the delays need not appear at the same position each time and that they may, for instance, become less and less "dense" in the space with time.

14. Theorem. If A is over a finite set {a } then, for any LECF y over A, A is a finite automaton. Theorem. If A is over a countably infinite set {a } then, given any t c N and any a e A, S (t)|a can be calculated from S (o)|B(a,t) and I (o)|B(a,t),..., I (t)|B(a,t) where B is a computable function and B(a,t) is a finite -y subset of A containing a. (Under interpretation, this theorem says that the state of any element a of the composition at any time t can be determined from the initial state and composition input sequence up to time t of a finite portion of the composition, indexed by B(a,t) - B(a,t) will in general be an unbounded function of time. Repeated use of the Burks-Wright algorithm enables determination of B(a,t) for any a and t.) Theorem, The property LECF is decidable for all y over finite A. (The proof of this theorem follows from a variant of the algorithm for determining admissible states of a finite automaton, cf. [Burks and Wright, 1953]). Theorem. The property LECF is not decidable for all y over all A. (The proof of the theorem depends upon the representability of all Turing machines in particular A constrained by y: There exist y and A such that, if a6 goes into state s*, y is not consistent for any s for which s (6) = s*. Moreover, y can be so selected that deciding whether a6 takes on state s* depends upon a solution of the halting problem. Thus, decidability of LECF for y would yield a decision procedure for the halting problem). For some extensive sub-classes over countably infinite sets {a }, such as the class of iterative circuit computers, it can be proved that all associated y are LECF.

15. The following definitions will be useful: {a,..., ak} will be called a set of generators for A if {a r a E A} 1 -- Y consists only of copies of elements of {a,..., a}. 1 k B over {b 3 B c B} will be called a subcomposition of A over {a 3 a c A} if: (i) {b} c {a }, (ii) y' = vYYB where Y = {y c Y' 3 proj y E B and proj y(y) E B} 1

16. 3. EMBEDDINGS AND COMPUTATION PROCEDURES The central objective of an embedding is to mark out, in some (image) composition, a sub-composition which will exhibit the same local behavior as a given (object) composition. For every element in the object composition there must be distinct part (a sub-sub-composition) of the image which defines the same functionals (over images of the free input sequences). Among other things, this amounts to complete preservation of local properties of the information flow. Thus one can study such properties in the embedding. If many compositions can be embedded in a single image composition, various comparisons and interactions can be implemented and studied. Note that only part of the inputs (or outputs) of the image sub-composition may be required for the embedding; also it may be that several inputs (outputs) of the sub-composition will be required to "encode" a single input (or output) of the object composition. This "encodin' causes problems when one tries to reflect the connection scheme of the object composition in the image sub-composition. The definitions are cumbrous largely on this account. We will consider first a strict notion of embedding which preserves all local behavior on the original time-scale. Afterward this notion will be weakened so that the object composition can be partitioned and only behavior of and connections between elements of the partition need be preserved and this only on a dilated time-scale. Let A be a composition with associated index sets A, X, Y. Let B -a Y i s -YY with associated index sets A, X, Y be a sub-composition of CY. Let * be a mapping from A, X, Y and Sy, I, 0 to subsets of A, X, Y, S, I, 0 Yepciyel y 1(a) l l CY1 Y1i Y1 respectively (i.e. ((a) C A., 4(x) C X., etc.). 4 will be called a strict

17. embedding of A in C (on B ) if the following conditions are satisfied: - -Y 2 -1 (i) Distinct {elements, inputs, outputs} map onto distinct sets of {elements, inputs, outputs}: a # a = O $(a) n O(a ) = A, x # xl = — ~(x) n (x ) A, and y ~ y ==> )(y)n ( (y ) = A, where A is the null set. (ii) Each {input, output, composition (A ) input, composition (A ) output) of an element of index a maps onto a set of {inputs, outputs, composition (B ) inputs, composition (B ) outputs) of the subset of elements in B indexed by ~(a): -Y! proj 4(x) C $(proj x), projl (y) C O(proj y) 1 1 1 1 x c X-X' => >(x) C X-X', ye Y-Y' > (y) C Y -Y' 1 1 1 1 (iii) If y is connected to x by y then all outputs in 4(y) must be connected by y to all inputs in (x); O(Y(Y)) I Yl ((y)). (iv) If two {internal, input, output) states of A assign the same state to {element a, input x, output y} then the images of these states must assign the same state to {((a), (x), (y)}: s(a) = s (a) -> (s)|I(a):= (s )I0(a) i(x) = i (x) =>,(i)|I(x) =,(i )|q(x) o(y) = o () y -> (o0) (y) ( = (o )| (y) (Note that not all states of B need be images of states of A ). (v) When B is in state ~(s) the state transition function, f -Y Y -l 1 depends only on ~(s) and states assigned to images of composition inputs of A: i'l,(X-X') = i';l(X-X') -> f (i', ~(s)) = f (il, 4(s)) w1hri; Y1 where i', i' e I 1 Y 1

18. (vi) The state transition function of the image of a acting upon images of input and internal states of a must yield the image of the successor state produced by the transition function of a, f similarly for ua: f (a) ((i)|(a), f(S)I|(a)) = fa(ila, sla) U ( ) (p(i) 1(a), 4(s) 1~(a)) = 4Uu(i|a, s|a) where f is the transition function of the sub-composition indexed by ~(a) and similarly for u Conditions (v) and (vi) assure that, for every pair of local transformations (F,Ua) in the object composition, there is a sub-composition of the image with transformations (F((a), UW(a)) which can be restricted so as to be the same as (Fa,U ). Conditions (i)-(iv) assure the proper composition of these transformations in the image. Figure 1 presents an example of a strict embedding. If ( satisfies the following weakened requirements it will be called a weak and b-slow embedding or, briefly, an embedding: (i)-(vi) All requirements for strict embedding stand except as altered by (vii) and (viii). [weak] (vii) Let A be partitioned into a set of disjoint sub-compositions. -Y Each such sub-composition may be treated as a single element and be embedded accordingly. Thus y need only be preserved over connections between sub-compositions. More formally: Let A be partitioned by a partitioning, I,of its index set, A. Let {Ayly } represent the set of induced sub-compositions. Then condition (iii) is to apply only to y c Y -Y', P e n. All other conditions are modified similarly. P P (Under interpretation, a weak embedding permits one to ignore all structure and information flow within the sub-compositions {A yy }; only preservaion of the assciaed is required). preservation of the associated functionals, F and U, is required). p p

18a. A B y-1 X' = {x, x X' = {x,,, x 12 21 1 12 21 31 32 yl = y, {y I I ly 1 1 1 Yls y y2} yl = {y, ly, =y, y y 11 22 1 11 12 22 32 Y J l y, )2 1 r (1. 1 1 l Yl1 22 Yn Y 12 Y 7.2. 42 y(y) x x (not ue_, = J x x 12 Figure + IS l. 21zL 12 Y1y) N3 X^t l f t~cl~ fc'Yr —r ^r ^~ |.___ i ((not a~~4 c(a ) = {~ } <(y ):= {y } 4(x ) = {x y 2 3 21 31 12 12 *(y ) = {1y } p(x ) = {1x,O x 22 32 21 31 32 ((S ) C S x S, etc. 1I 1 2 [Note: proj {((ly ) =y } = {,8 } C {0,B } = 4(a ) = 4(proj y 1 11 1 11 22 1 2 1 2 1 1 11 and *(y(y )) = (x ) = {x,x {y (y ), ( 1y )} = y (y(y )) 11 21 31 32 1 11 1 22 1 11 etc., as required.] Figure 1. EXAMPLE OF STRICT EMBEDDING

19. [slow] (viii) A b-slow embedding yields an image with a time scale t' = bt, where t is the time index of A. That is, if oy is the output state of A at time t, the output state of the embedded composition will be *(o ) at time bt. To accomplish this, input sequences are "slowed" by the "insertion" of b-1 "no-signal" states between signals, cf. [McNaughton, 1962]. Formally, the definition depends upon an extension of f and uY to strings on I in the usual way (cf. [Rabin and 1 1 ~ Scott, 1959]). Ib will denote strings of length b over I and, if Y Y ~~~~~~~~~~~~1 1 i* C I!b-l (i) * i* will denote a string of length b with first (earliest) element 4(i). The following formal condition for a b-slow embedding weakens requirement (vi): For all i e I there exists a string i* e Ibi such that for all s e S and y e Y-Y' (with f and u extended to Ib ) Y! 1 Y 11 f (+(i) * i*, ~(s))|+6(a) - =(fy(i,s)(a)) uY ( (i) ~ i*, 4(s)) (y) = *(u (i,s)(y)) 1 ~ Thus the strings *(i) * i* of length b become the images of the input states i. The conditions assure that, if I = *(I ), then, for all t, S (bt) = *(S (t)) and 0 (bt) = ((0 (t)). (Under interpretation -Y -Y -y -y T 1 the signal-processing rate of the image is "slowed" by a factor b). The reader can develop a considerable understanding of slow embeddings by a close reading of McNaughton's discussion of "slow automata" [McNaughton, 1962 and draft].

20. The following additional definitions will be useful: An isomorphic embedding ) of A on B is a strict embedding such that: a 4)(a) x - 4)(x) y = 4(y) a = (a) Ix (x) y 4 (y) Two embeddings, 41 and 4, of A in Cy will be called identically 1 2 -Y - -Y oriented if: (i) there is an isomorphic embedding, 0, of the sub-composition indexed by ) (A) on the sub-composition indexed by 4 (A) such that O6 = 4; 1 2 1 2 (ii) for all a,a' e A and for any connected sequence p {from 4 (a) to (a'), from ) (a) to 4 (a)} there exists a connected sequence p' {from ~1 1 2 (a) to ) (a'),, from ) (a') to 4 (a')} such that proj p = proj p' 2 2 1 2 2 2 (where proj p (i., i.,..., i ) just in case Yh 2 l hp h (Bh,ih), 1 6 h X A). The characterization of homogeneous compositions (in the next section) will be based upon this definition of identical orientation. Prior to defining computation-universality for compositions we also need a relevant definition of computation. The definition which follows is related to that of Burks [Burks, 1960]: Let Ej =: N + Nj 4 N = {0, 1,..., j }}. Let r:. + zj be a functional from sequences into sequences. We wish to define a finite computation procedure for r (of course, only some r will admit of such procedures). Augment a composition A by two mappings: -.y A: N. U {n} + I' 1-1 onto, I' C I 1 1 Y Y Y A': r N. U {i} 1-1 onto, 0' C 0 2 under interpretation determines the "non-signale mappings The symbol'2' under interpretation determines the "non-signal" states, The mappings

21, X and X are extended to infinite sequences a c Zi and 0 c 0O (treating 1 2 1 -Y Y sequences as strings and allowing "signals" to be separated by "no signal" sequences) as follows: Xl(n) = X (n), Xb(n) = -l(n) (n), n c Ni U {Q} 1 1 1 1 1 A (a,b,T,s) = ([X T() X b(o(l)) xb(a(2)) *...],s), s c S X (O ) = X (O (t )) * A (t )) *.. 2 -Y 2 -Y 1 2 Y 2 where A (0 (t)) J 2 just in case t = t. 2 JY <A X, A, s> will be called a b-uniform computation procedure for r and r -Y * 1 -2 = I =. -. will be called (b-) uniformly computable if (Vo i., T e N)( T' E N)[A U A (a,b,T,s) = r(a)] and [U y (o,b,T,s)(t) # 2 just in case t = T1 + jb] That is, at the rate of one input (output) "signal" every b time-steps, and independent of elapsed time T before the first signal, A must transform signals corresponding to successive elements of a into signals corresponding to successive elements of r(a). The set of finitely computable r could be broadened considerably (and naturally) by using X*: Nd - Ib and X*: 0b 1 N4 in place of X and X. If, 1 i Y 2 Y J 1 2 under quite general conventions, some finite A can compute r then one can show that A can be composed with an acyclic (encoding) composition E so that — 1 the resultant composition computes r via A* and X*. Thus the extended defini1 2 tion would serve well for the study of finitely computable r. However, this line will be avoided here because it is more complicated and because the theorems and corollaries of part 4 hold for any class of finitely computable r containing the uniformly computable r as a subclass. Theorem: If r is finitely computable in the sense of [Burks, 19.60] then r is uniformly computable.

22. Theorem. Corresponding to each regular event over a given alphabet there is a distinct uniformly computable r. Note that the elements of E. can be interpreted as representations of fractional real numbers. With appropriate restrictions various non-trivial uniformly computable r can be interpreted as continuous real functions. Lemma. Let r: E.i -. and r': z. + z be arbitrary b-uniformly computable 2.^ J J k functionals except that the range of r is the domain of r'. Let <Ay X, X, s> and <A', X', X', s'> be arbitrary b-uniform computations for r and r' respecY' 1 2 tively. Then there exists an acyclic composition E such that: -Y1 (i) the composition outputs of A can be identified in 1-1 fashion with the composition inputs of E and the composition outputs of E can be'YJ -Y 1 identified in 1-1 fashion with the composition inputs of A', to form -Y a new composition C involving only A, A', and E -Y - yY' -y 2 1 (ii) the output functional of C satisfies the equation 2 U =U, U U Y2 Y Y Y (iii) <C, X, X', s">, for an appropriate s" E S, is a b-uniform — Y 1 2 2 2 2 computation procedure for r'r since f(o c Zi, T c N)[' U U U x (a,bt,s") = r'r(C)] 2.1 ~ 2 Y Y1 Y 1 (Under interpretation the output of A can be connected to the input of A', -Y Ily via the combinatorial recoding EY1 so that the resulting composition computes the composition of the functionals r and r'.) Theorem. Let b(A, r) = minb {b ) A can b-uniformly compute r}, undefined if A cannot b-uniformly compute r. Given r, r', and C as in the previous lemma, C can at best b -uniformly compute r'F where b = max{b(A,r), b(A',,r')}. (The "speed" of the faster computation procedure is wasted). Theorem. If A can b-uniformly compute r and if {(A ) is a b'-slow embedding of A then {(A ) can b'b-uniformly compute r.

23. 4. UNIVERSAL COMPOSITIONS Using the definitions thus far set down we can at last attain a definition, fairly general and fairly concise, of the principal objects of study. A composition V of index V will be called universal and homogeneous -v for (uniformly)computable r or computation-universal if: (i) for each (uniformly) computable r there is an embedding c, into V, of a composition A capable of (uniformly) computing r (It is assumed that the class of computable r, however defined, includes the class of uniformly computable r.); (ii) if < embeds A of index A in V then, given arbitrary a c A, 1 -- l -V S e ((a) and i' e V, there exists 2 embedding Ay in V with identi2 y -Y V cal orientation so that O(S) = i' (where 6 is the isomorphic embedding specified by the definition of identical orientation). Condition (ii) assures that, if a composition can be embedded at all, it can be embedded anywhere in the space with identical orientation. Thus, in a sense made precise in the next theorem, all "translations" of the original image are also images. As a result, properties of the image such as its connection scheme can be made independent of the "location" of the image. If the universal space is to be used to study growing automata and construction procedures this requirement is critical. A composition W of index W will be called universal and homogeneous for finite compositions or, briefly, composition-universal if:

24. (i) given any finite composition A there exists an embedding * of ly A in W -Y -i (ii) as for computation-universal. The work of von Neumann [von Neumann, unpublished] establishes the existence of computation-universal compositions. It can also be shown that there exist composition-universal compositions (the class of iterative circuit computers [Holland, 1960] is an effectively defined class of compositions containing a subset of composition-universal compositions). Note that the class of composition-universal compositions, Uc, is a subclass of the class of computation-universal compositions, U. Theorem. The following conditions are necessary for a composition V to be — v computation-universal: (i) the composition must be generated by a single element, (ii) it must be over a countably infinite set of elements, (iii) strings of output indices are commutative, i.e. if o = proj p 2 where p is a connected sequence from a to R, then for each permutation a' of a there exists a connected sequence p' from a to B such that o' = proj p'. Proof outline: For (i), let a and aB be any two elements of Vv and consider the one-element composition A which is simply a copy of aa. Obviously there exists *1 embedding Ay at a in V. Hence, by the second condition in the definition of computation-universal, there must be an embedding * of Ay at B and an isomorphic embedding 0 of 1 (A ) = a at 8 such that 0 1'. It follows immediately that the quintuples <Is, Sg, OB, fs, ug> and Ia Sa, Oa, fa, ua> must be identical. Ia, aS a, 0a' ua

25. (ii) follows at once from (i) and the existence, for arbitrary k, of r computable only by compositions A wherein the cardinality of S exceeds k. (iii) follows from the common cardinality of all input (output) index sets (since V, is generated by a single element) and the following argument: Let p = (y,y ) be any two-element connected sequence in V with, say, y = 12 1 (a,i), y = ( ),j), and proj v(y ) = 6. Let 6 be an identically oriented 2 1 2 isomorphic embedding of the sub-composition on elements a,6 with 6(6) = 6. But then y constitutes a one-element connected sequence from B to 6 whence 2 by identical orientation there must be a connected sequence p from a to O(a) such that proj p proj y = j. Moreover, since y constitutes a connected 21 2 1 sequence from a to 3, there must be a connected sequence p from 0(a) to 0e() 2 such that proj P = proj y i. The sequence p' = (p,p ) is connected, 2 21 21 from a to 0e() = 6, and proj p' = (j,i). Since proj p = (i,j), with i,j 2 2 arbitrarily chosen, this proves commutativity of indices for arbitrary pairs. From this, commutativity for arbitrary strings of indices can be established in the usual way.. * It is a consequence of this theorem that we can "co-ordinatise" the computation-universal spaces in terms of a subset n,..., n }, nk n of the output indices of the generator: Define a non-oriented connected sequence from a to S to be the same as a connected sequence except that, for each yh' either projl(yh+l) =(hor proj (yh) = proj) v(Yh+) is permitted. Count an index occurence as -1 in the latter case. Choosing some a as origin, the position of any a' is given by a k-tuple having its jth component equal to the sum of the (signed) occurences of index nj in any nonoriented connected sequence from a to a'. The above theorem assures the

26. uniqueness of this number; the resulting co-ordinatization is the usual k-dimensional (integral) cartesian grid. The theorem also enables us to define simply one further class of compositions: U = {V V is computation-universal with a Moore automaton as generating element). Elements of Ud are the most natural generalizations of von Neumann's "logicaluniversal" space because each exhibits the "propagation-delay" or non-zero lag time which plays an important part in von Neumann's development. Referring to the definition of slow embedding, lag time can be more formally defined: a composition A has lag-time b if (extending u~ to strings of length b+l): (3i, i' c I )(Vs c Sy)(Vi*, i** C Ib) [u (i*,s) = u (i**,s) and u (i'i*,s) 9 u (i'.i*,s)] If V C Ud has a generator with lag-time b and if the shortest connected sequence from a to a in V has length a, it will take at least bi units of time for the output of element aR to be affected by an input to aa. The central result of this section establishes the set Ud as disjoint from the set Uc of composition-universal compositions; i.e. elements of Ud are not composition-universal. It follows at once that the generating element of any composition of type Uc must be a Mealy type automaton. The result actually shows that, given any finitely computable r, "almost none" of the compositions capable of computing r can be embedded in a composition of type Ud. This has several consequences, stated as corollaries and discussed thereafter, for the study of computation (and construction) procedures in computation-universal spaces. In the following discussion: V will be a composition of type Ud on a generator g with n outputs and lag-time Tg s 1; G will be a finite set of

27. automata, arbitrary except that it contains at least one element with two or more input-dependent outputs (every set of generators complete for all finitely computable r must contain at least one such element); TG will be the minimum lag-time for elements of G having two or more outputs; [A ] will be the (unique) sub-composition of A consisting of all elements a of A such that, for some connected sequence p from B of length l, a occurs in proj p (i.e. [AY] is the sub-composition consisting of all elements belonging to connected sequences of length a from a). Theorem. Given any LV, G, and b e N there exists a composition A generated by G which cannot be embedded in V. (In fact for each V, G, and b there exists a constant c and a function e(a) such that the theorem holds for any A containing a sub-composition [Ay]B, Q = c, with at least e(Q) elements; the proof shows that e(Z) and c are such that "almost all" A over G satisfy the condition). Proof outline: Given V and G there will be a positive integer k such that no composition B over G consisting of more than k elements can be weakly embedded in the "1 generator g of V.'1 Because of commutativity of strings of output indices in V, fewer than (2 )n distinct elements can belong to any sub-composition [V.] of V (where n is the number of outputs of g). But then any composition A over G containing more than k(22)n elements will have an image in V, under weak embedding with at least one acyclic connected sequence of length > L. There are compositions A over G consisting of 2Q+1-l elements and having no acyclic sequence of length > i.

28. Since there exists co such that k(2Q)n < 2+'1-1 for -> c0, there are compositions over G which can only be embedded in V if some acyclic sequence in the image has length Z + Z > Z. By simply increasing c0 we can show, in fact, that there exists c for which, when Z a c (k+Q) Tg max(ZTG, b); thus for any A over G consisting of more than k(2k)n elements, with Z > c, the lag-time T(A ) of tle image under an embedding ( must exceed both b and the lag-time Tr of A Y -) -Y )' maxvA, b) A Therefore, if A as specified is to be embedded, the embedding 9 must be slow since other types of embedding (weak but not slow) preserve lag-time. Let C contain A as a sub-composition in such a way that some -Y2 composition outputs of C depend upon composition outputs of A. Since the 2Y lag-time of ((Ay) will be T (A ) >,TA the overall lag-time of composition -.Y) TY outputs of C(C ) must exceed that of C. Otherwise the slow embedding re2 2 quirement that composition output values occur synchronously will not be satisfied. (Under interpretation: since signals through {(A ) are unduly delayed, the proper phasing of signals at the output can only be restored by delaying the other signals accordingly.) The result will be an overall increase in the lag-time T of O(C ) to at least T(A ) > b. -Y 2 Consider the image'(Cy ) of in VY under a b-slow embedding 4. -Y2 2 For the moment assume 2b 2 T. Because T > b, 4(Sy (t)) can at best be deter2 mined in 4(CY ) at time (b-l)t + T > bt, even if (Sy (t-l)) is available at tim2 (l Ift n d tt S t) l s2 time (b-l)t. In fact, in order that P(SY (t-1)) always be determined by time 2

29. bt, for arbitrary t, it is necessary that f(S (t+l)) be determined directly from O(S (t-l)) rather than from f(S (t)). This is only possible if the'2 2 transition function of the image, in effect, has images of S (t-l), I (t-l) ~12 2 and TL (t) as arguments at time bt -a sufficient (and for particular C, 2 2 necessary) condition for determination of O(S (t+l)). 4(Iy (t)) will be ~^2 2 available at time bt, but I (t-l) will be available only if the set of 2 states of the image is the image of the set I x S. (Under interpretation: 2 2 new storage elements would have to be added to the embedded network to "hold" the image of I (t-l)). This violates requirements(iv) and (vi) of the 2 definition of embedding. Yet all of this is necessary to preserve the function U under 0 (by meeting the requirement that the image of Oy (t) always occur 2 2 at time bt in V_ under a b-slow embedding). Hence the assumption that Cy, 2 as specified, can be b-slow embedded in V leads to a contradiction. -v A similar argument requiring S (t-kl), (t-k), I (t-k+),... liy2 L2 2 I (t) applies when (k +l)b - T > k b. ~2 1 1 Note that "almost all" compositions on G as specified will include a sub-composition [A_], > c, containing more than k(2Z)n elements. Corollary. Given any finite composition A and any b-slow embedding ~ of A in space V of type Ud, there exists a composition A' which contains A -Y - y' -y as a sub-composition and which cannot be b-slow embedded in V. -v In fact, given any b E N, "almost all" compositions containing A cannot be b-slow embedded in V. Thus most schemes for composing A_ with other compositions cannot be represented in V using b-slow embeddings, even when A itself is embeddable. — %

30. Corollary. Given V e Ud, any b C N, and any set of generators G as specified above, there exists an infinite sequence of functionals r, F..., r such that: (i) each r. is b-uniformly computable by a composition [Aj] over G, JJ JYi (ii) [Aj] can at best be bj-slow embedded in V where bj > bj 1 y. — Jv - j b. c N. J Note that lim. b. = so that the computation rates of the images of J J the [Aj] in V must approach zero. If A can b-uniformly compute F then Yj - a composition [Cj], containing A and [A.] as subcompositions, can b-uniformly compute r' = r.F. However there is no b' e N such that all [Cj] J J Ji Y can be b'-slow embedded in V. Hence, although A can serve as a common "sub-routine" for computing all r, there need be no corresponding common subroutine for computations of the r' in V (assuming a minimal signal rate 1/b' j -V specified). The restrictions just noted do not hold for the co mposition-universal spaces Uc. In fact there exist composition-universal spaces such that any finite composition can be strictly embedded in the space. The spaces Uc are compositions and hence satisfy local effectiveness conditions; however, each such space is generated by a Mealy-type automaton, and may exhibit arbitrarily long (finite) connected sequences of elements with zero lag-time (cf. the similar situation for logical nets). Theorem. Given any composition V~ c Ud there exists a composition C c U y c in which Vv can be strictly embedded. Proof outline: As indicated earlier all compositions in U can be given Cartesian coordinates; one can then determine a unique dimension for each. V has a vsingle finite automaton as generator and that generator can be strictly embedded single finite automaton as generator and that generator can be strictly embedded

31. in a finite n-cube of some space C E U of dimension n. If the dimension -Y c of V is less than or equal to na set of image n-cubes can be arrayed in C with the same geometry as the elements of V. rY

32. 5. COIMENTARY The central definitions of this paper are structural - involving no statements quantified over time. Thus, the compositions are defined via a composition function which specifies the graph of connections between component automata, the behavior of the resulting structure being defined by a set of simultaneous equations induced by the composition function. An embedding selects a sub-composition of an image composition and constrains certain of its state components. These constraints are invariant (hereditary) under a natural restriction of the transition function of the image composition: a restriction wherein input states tc the sub-composition are restricted to images of the input states of the object composition. Only when the inputs to the image depart from this condition will the embedding cease to reflect the behavior of the object composition. The full set of finite compositions may for some purposes be too broad, containing "unrealistic" elements. A natural way to achieve a "realistic" subset is to admit only compositions over a selected set of "realistic" generators. For example, a generator may be designated "realistic" only if it has non-zero lag-time (the Moore type automata). The results of part 4 are intrinsic in this respect: No matter how one limits the set of "realistic" elements by selection of generators, almost none of the compositions over the selected generators will b'-slow embeddable, for any b' e N, in any V c Ud. Thus, given any b' e N, almost none of the b-uniform computation procedures over the generators can be embedded in V as b'b-uniform computation procedures. Instead of a computation rate l/b, almost all images will have

33. a rate less than E, for any 6 > o - an undesirable consequence if the space is to be used to study computation procedures. Two alternatives for avoiding this consequence present themselves: (i) One can accept the spaces Ud as the only "realistic" universal spaces and avoid the above consequence by further narrowing the set of admissible representations of finite automata. This can not be achieved by selecting a set of "realistic" primitives; it can only be done by restricting the connection procedures. (The heart of the difficulty lies in the unrestricted cascading of fanout elements.) Such a restriction would considerably modify current notions of representation and several theorems, such as the Kleene representation theorem for regular events (Theorem 3 [Kleene, 1956]), would be lost. (ii) One can use the spaces Uc with their concomitant of connected sequences of elements exhibiting negligible delay. This introduces no mathematical difficulty because the spaces Uc are compositions and hence satisfy strong local effectiveness conditions; nevertheless one may question how "realistic" such spaces are in allowing propagation of signals with negligible delay over arbitrarily long channels. The second alternative seems preferable to me: Criteria which argue for alternative (i) can be applied, via the last theorem of part 4, to restrict the tranfunction sition of spaces of type Uc. Thus such criteria can both be implemented and A studied under the second alternative. At the same time, there are useful restrictions of Uc not readily translatable to Ud. In this context the study of construction procedures, or growing automata (leading to studies of self-reproduction, adaptation, etc.) becomes the study of sequences of compositions. Such sequences become state sequences in a universal composition - an embedding must transform structural features of

34. the object composition into properties of the state pattern of the universal composition. The fixed connection scheme of the universal composition can only supply an invariant framework. Note that the state sequence corresponding to a sequence of compositions must involve departures from the constraints imposed by embedding. That is, the image of one composition can be altered to that of another only by departing from the conditions set by the embedding. A restriction of the transition function of the universal space will still be involved; it will be similar to, but weaker than, that required for embeddings. The restriction must be suchl that any state trajectory under this restricted function corresponds to a sequence of compositions. Note that a construction procedure involving structures that cannot be embedded will not be representable. More than this, given a universal space, there will be construction procedures, involving only embeddable structures, which still fail of representation. That is, some recursive enumerations of embeddable structures will not be representable through a restriction of the transition function. This is not necessarily undesirable: If it is assumed that the "size" of given composition in a construction sequence is a function of the "size" of its predecessor, then most recursive enumerations of compositions will not be construction sequences. Questions related to characterization of the construction procedures representable in various universal spaces remain open.

35. ACKNOWLEDGEMENTS I would like to thank Robert McNaughton for sending me an early draft of his extensive work on badly timed elements (a portion of which appeared in 1962 [McNaughton, 1962]); the work he has done and the work reported here are complementary in several parts. I would like to thank James Thatcher for many helpful comments and suggestions based on a careful reading of an early draft of the manuscript; in several places the notation derives directly from his comments. The research reported here was in part supported by U.S. Air Force contract AF 33(615)-1162.

36. REFERENCES Burks, A. W., (1960): Computation, behavior and structure in fixed and growing automata. Self-Organizing Systems. Marshall Yovits and Scott Cameron, Editors. New York, Pergamon Press (pp. 282-311). Burks, A. W., and Wright, J. B., (1953); Theory of logical nets. Proceedings of the Institute for Radio Engineers, 41, 1357-1365. Church, A., (1957): Application of recursive arithmetic to the problem of circuit synthesis. Summaries, Summer Institute for Symbolic Logic. Cornell University, Institute for Defense Analysis (pp. 3-50). Hartmanis, J., (1961); On the state assignment problem for sequential machines. Institute for Radio Engineers, Transactions on Electronic Computers EC-10, 3, 157-165. Holland, J. H., (1960); Cycles in logical nets. Journal of the Franklin Institute, 270, 3, 202-226. Holland, J. H., (1960-a): Iterative circuit computers. Proceedings of the Western Joint Computer Conference, San Francisco (pp. 259-265). Kleene, S. C., (1956): Representation of events in nerve nets and finite automata. Automata Studies. C. E. Shannon and J. McCarthy, Editors. Princeton, New Jersey. Princeton University Press (pp. 3-41). McNaughton, R., (1962): On nets made up of badly timed elements, I. Summer Conference Notes on Parallel Computers, Automata and Adaptive Systems. Ann Arbor. The University of Michigan. McNaughton, R., (draft): On nets made up of badly timed elements. Continuation of On nets made up of badly timed elements, I. Moore, E. F., (1962): Machine models of self-reproduction. Mathematical problems in the biological sciences. Providence. American Mathematical Society (pp. 17-33).

37, Myhill, J., (1962): Self-reproducing automata. Summer Conference Notes on Parallel Computers, Automata and Adaptive Systems, Ann Arbor. The University of Michigan. Rabin, M. 0., and Scott, D., (1959); Finite automaLa and their decision problems. IBM Journal of Research and Development, 3, 114-125. von Neumann, J., (unpublished): The theory of automata, construction, reproduction homogeneity.