COLLEGE OF LITERATURE, SCIENCE, AND THE ARTS
Computer and Communication Sciences Department

CELLULAR AUTOMATA AS MODELS OF NATURAL SYSTEMS

Andrew Gehret Barto

August 1975

THE UNIVERSITY OF MICHIGAN
Technical Report No. 183

with assistance from: National Science Foundation Grant No. DCR71-01997, Washington, D.C.


ABSTRACT

CELLULAR AUTOMATA AS MODELS OF NATURAL SYSTEMS

by Andrew Gehret Barto

Chairman: Bernard P. Zeigler

A view is taken of certain kinds of cellular automata that allows integration of some concepts and techniques from automata theory, linear systems theory, and the theory of partial differential equations. The emphasis is placed on the idea of modelling natural systems by by-passing continuous formulations. Elements of the abstract theory of harmonic analysis are used to relate solution techniques for linear partial differential equations with automata theoretic formulations of system simplification. It is shown that classes of automaton-homomorphisms exist which can be given natural interpretations in the context of modeling distributed systems. Some of these homomorphisms justify usual modeling practices; some indicate where care must be taken in the model building process; and others suggest novel simplification techniques.

ACKNOWLEDGEMENTS

I wish to thank the many people who have helped me during my doctoral studies. I am particularly indebted to Bernard P. Zeigler, the chairman of my doctoral committee, whose research has provided the framework in which this thesis was undertaken, and whose instruction, advice, and encouragement were crucial to its completion. I am indebted also to Arthur Burks, John Meyer, and Alan Wineman for their service on my doctoral committee and for their valuable participation in discussions of this research. I also thank Norman Foo who expressed an early interest in the problems considered here, and who has contributed in many ways to my education. I am grateful to the Logic of Computers Group which provided excellent research facilities and generous financial support and to all its members for their help. I particularly want to thank John Holland and Richard Laing for many inspiring discussions and Phil Pilgrim whose graphics programs contributed a great deal to this thesis. I wish to express my deepest gratitude to my wife Monna, who not only typed the manuscript, but whose support and encouragement made this project possible. The research contributing to this thesis was supported in part by National Science Foundation Grant No. DCR71-01997.

TABLE OF CONTENTS

ACKNOWLEDGEMENTS

CHAPTERS

1. Introduction

2. Notational Preliminaries
   2.1 Introduction
   2.2 Basic Notation
   2.3 Linear Algebra
   2.4 Group Theory
   2.5 Automata Theory

3. Cellular Automata
   3.1 Introduction
   3.2 Definition of a Cellular Automaton
   3.3 Group Graphs
   3.4 Connection with the Intuitive View
   3.5 Automorphism Groups
   3.6 Generalizations

4. Cellular Automata and Linearity
   4.1 Introduction
   4.2 Linear Cellular Automata
   4.3 Linear Networks and LCAs
   4.4 LCAs and I/O Analysis of Linear Systems
   4.5 Examples

5. Linear Cellular Automata and Harmonic Analysis
   5.1 Introduction
   5.2 The Discrete Fourier Transform
   5.3 The Convolution Theorem
   5.4 Extension to Vector-valued Functions
   5.5 Examples of the Discrete Fourier Transform and Convolution
   5.6 Generalizations
   5.7 Parallel Decomposition of Linear Cellular Automata
   5.8 Examples of Parallel Decomposition

6. Homomorphisms of Linear Cellular Automata by Spatial Frequency Filtering
   6.1 Introduction
   6.2 Homomorphisms
   6.3 Restrictions, Projections, and Embeddings
   6.4 Distributed Realizations of Sets of LSMs

   6.5 Homomorphisms of LCAs via Spatial Frequency Filtering
   6.6 Complexity of Homomorphic Images

7. Linear Cellular Automata and Sampling Theory
   7.1 Introduction
   7.2 Homomorphisms between LCAs having Different Structure Groups
   7.3 Generalized Sampling Theory
   7.4 Homomorphisms of LCAs via Filtering and Sampling
   7.5 Homomorphisms of LCAs via Spatial Aggregation
   7.6 Laminated LCAs

8. Linear Cellular Automata as Models of Non-uniform Structures
   8.1 Introduction
   8.2 Non-uniform Linear Cellular Automata
   8.3 Harmonic Analysis of Non-uniform LCAs
   8.4 Uniform Homomorphic Images of NULCAs
   8.5 Homomorphisms of NULCAs produced by Filtering, Sampling and Aggregation
   8.6 Discussion

9. Conclusion

REFERENCES

Chapter 1
Introduction

Many systems that occur in nature are traditionally viewed as very large collections of components distributed in space with interaction occurring between components that are near each other. Each component is usually considered to be essentially the same as every other and is often well understood when behaving in isolation. The usual way of modeling these kinds of systems is through partial differential equations (PDEs). Intuitively, this kind of model might be viewed as a network of an uncountably infinite number of components each interacting with infinitesimally close neighbors. When the components and their interactions are simple enough, many of these equations can be solved analytically using powerful results from functional analysis. However, for most equations one settles for computer generated approximations to specific solutions. Models within this tradition are ubiquitous in the physical sciences and have been very successful in accounting for a wide variety of phenomena: the behavior of sound waves in air, probability waves in quantum mechanics, diffusion of heat or concentration of liquids and gases, diffusion of neutrons in atomic reactors, atmospheric or oceanographic flow and turbulence, and the entire range of electro-magnetic activity. In fact, a great proportion of scientific and technological progress can be directly linked to the success of modeling efforts based on the PDE modeling formalism. And there is little doubt that models formulated in this tradition will continue to play a significant role in contributing to our understanding of the natural world.

However, successful models tend to bias our perception of experience to the extent that the invention of other modes of explanation becomes less and less likely. Models and modeling formalisms that have been fruitful for long periods of time become regarded as necessities for the representation of experience and are sometimes confused with experience itself. Whether this is true in the case of PDEs is a question we cannot attempt to answer here, but it is against the background of these remarks that we discuss the cellular automaton formalism as an alternative means of formulating hypotheses about the operation of the real world. The concept of a cellular automaton (also called a tessellation automaton or a homogeneous structure) originated with John von Neumann's design for a self-replicating machine. Although he planned to consider a model based on non-linear PDEs, only the conceptually simpler cellular model was completed (von Neumann [1966]). This modelling framework begins with a view of the components as actually separated in space and operating in discrete time. In fact, each component can be viewed as a computer - either a simple one as in John Conway's Game of Life which produces a 1 or 0 depending on how many neighbors are emitting 1s or 0s (Gardner [1970]), or a very complex one as in the computing elements of the ILLIAC IV computer. More generally, a cellular automaton is an array of uniformly interconnected identical finite state automata. For an array of two dimensions, one can think of a finite state automaton situated at each point on a plane having integer valued coordinates. All of the automata are identical in their operating characteristics, and each receives input from automata having neighboring positions in the array.
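To make the parallel-update idea concrete, the following is a minimal sketch (in Python, not from the thesis) of the Life rule just mentioned; the function name `life_step` and the wrap-around at the grid edges are illustrative assumptions.

```python
# A minimal sketch of Conway's Game of Life rule described above.
# Assumes a finite grid with wrap-around edges; `life_step` is an
# illustrative name, not a construct from the thesis.

def life_step(grid):
    """One synchronous update: every cell computes its next state in parallel."""
    rows, cols = len(grid), len(grid[0])
    new = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # Count the eight neighbors currently emitting 1s.
            live = sum(grid[(i + di) % rows][(j + dj) % cols]
                       for di in (-1, 0, 1) for dj in (-1, 0, 1)
                       if (di, dj) != (0, 0))
            # A live cell survives with 2 or 3 live neighbors;
            # a dead cell becomes live with exactly 3.
            new[i][j] = 1 if live == 3 or (grid[i][j] == 1 and live == 2) else 0
    return new

# A "blinker": three live cells that oscillate with period 2.
blinker = [[0, 0, 0, 0, 0],
           [0, 0, 1, 0, 0],
           [0, 0, 1, 0, 0],
           [0, 0, 1, 0, 0],
           [0, 0, 0, 0, 0]]
print(life_step(blinker))
```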

Uniformity of these interconnections means that the network "looks the same" from any location: if an automaton is connected in a particular way to, say, its right-hand neighbor, then any other automaton is connected to its own right-hand neighbor in the same way. Each automaton operates in discrete time and in synchrony with its neighbors so that the state of each automaton in the array changes to a new state at the same time. An important point to understand is that the components of the network operate in parallel and without centralized control. At each time step, each automaton "reads" input from its neighbors and then computes a new state. The result is simultaneous computing activity over the entire network. It's customary to call each automaton in the array a cell and the state of the entire array a configuration of cell states. This kind of conceptual framework has been elaborated and generalized and has provided a framework for modeling many kinds of systems (see Burks [1970], Yamada and Amoroso [1969,1971], or the recent survey by Aladyev [1974]). There is increasing interest in cellular automata as models of biological systems that exhibit a degree of spatial homogeneity such as certain areas of brain tissue (Mortimer [1970]), sections of heart muscle tissue (Foy [1974]), or colonies of bacteria (Goodman, Weinberg, and Laing [1971]). These systems differ from those typically modeled by PDEs in several ways. First, they exhibit spatial homogeneity at a level higher than the "microscopic" homogeneity of a vibrating string or a heat conducting slab. Hence, the components are viewed as being actually discrete in space (sometimes with connections to fairly distant components). Secondly, the components are often assumed to

possess comparatively sophisticated computational power (e.g. the power of a muscle cell or neuron to process information). There are also important differences in the use made of the resulting models. While one may hope for a closed form solution with the PDE formulation, a cellular formulation is rarely expected to yield a "solution" in this classical sense. Computer simulation is usually the only means possible for gaining insight into a cellular automaton's behavior, and the simulation model is considered not as an approximation but as a realization of the cellular model. In fact, the formalism of a cellular automaton allows us to abandon the idea that modeling with discrete time and space necessarily represents an unfortunate and inaccurate discretization of "real" time and space. It is not necessary to feel that, all else being the same, a finer "mesh" would be better. Although many important things are lost when notions of continuous time and space are abandoned, other things are gained. We feel that in some modeling situations the use of PDEs presents unnecessary mathematical complications. This is especially true since often discrete approximations to the PDEs must be considered anyway. For some applications it may even be true that the great care taken in numerical approximation introduces an empty kind of precision into the problem since considerable abstraction may have been involved in the initial continuous formulation. Thus, there are at least two approaches one can take in modeling what we have called distributed systems: the PDE approach and the cellular automaton approach. In some applications it is quite clear which kind of model is more appropriate. When processes appear to be continuous in space and time, the PDE formulation might be more

valuable. However, in many cases the choice is not clear and is usually decided by the researcher's background or by the tradition prevalent within a discipline. In particular, we feel that PDEs are often chosen because the modeller is unaware of or unfamiliar with other modeling techniques. On the other hand, discrete formulations often appear to be disconnected from any rich body of lore such as that which exists for PDEs. They seem to be ad hoc constructions that neither benefit from well developed theory nor seem likely to contribute to it. This impression is only partly true. Cellular automata are related to a body of knowledge, but one that is motivated by an interest in these systems as models of computation. The more abstract of such studies deal with issues concerning the specification of cellular automata having universal computational power (e.g. Codd [1968], Thatcher [1964]), and the relationship between cellular automata and formal languages (e.g. Aladyev [1973]). Holland's iterative circuit computer (Holland [1959]) is the first of numerous uses of the cellular automaton framework to suggest forms of computer architecture capable of extensive parallel processing. Studies having more direct applications have resulted from the fact that logical circuits with uniform or iterative structure can be inexpensively constructed (see the survey by Minnick [1967]). However, investigations like these have historically had little direct relevance to issues which seem important in the context of modeling natural systems. PDEs, on the other hand, can provide useful formulations since enough structure is present to allow useful specification of behavioral questions.

Theoretical studies of cellular automata tend to be either about a single member of the class or about the class as a whole. The study of special single examples has established the existence of cellular automata with special capabilities (universal computational and constructional power) but provides nothing that can be generalized to larger classes of automata. On the other hand, studied in their full generality, cellular automata do not have enough formal structure to support results useful from a modeling point of view. Consequently, no results have been obtained concerning the behavioral possibilities of large but well structured classes of cellular automata. PDEs, however, can provide extremely useful formulations precisely because of the possibility of this kind of behavioral analysis. Although it has been recognized that cellular automata are related to PDEs and numerical methods used for approximating their solutions (Aladyev [1974]), the PDE and cellular automaton traditions remain essentially disjoint, and no significant attempt has been made to investigate concepts of one tradition as they apply to the other. Several reasons for this lack of integration seem apparent. First, PDEs can be discussed in terms of well developed functional analytic theory so that they tend to be seen purely as structures within that theory. Second, systems typically modeled by cellular automata are so complex that formulation in terms of PDEs is very unnatural, and useful formulation in those terms is very difficult (if not impossible). A third reason for the conceptual separation of cellular automata and PDEs is simply that historically the ideas have come from different roots and the educational process has reflected this fact. We do not mean to imply that cellular automata and PDEs should be viewed as being

essentially the same. They clearly are not the same if only by virtue of the totally different kinds of questions that can be asked (and answered!) within each theory. They do, though, deal with the same sorts of phenomena, and some concepts and results from each approach can be interpreted in terms of the other. In this thesis we attempt to provide one kind of link between these approaches so that a rigorous cross-fertilization of these very different modeling traditions can occur. In particular, we discuss a class of systems that can be formally related to both the PDE and the cellular automaton formalisms. This class of systems consists of cellular automata that have linear sequential machines as cells. These systems are related to linear PDEs in the same way that linear sequential machines are related to continuous time linear systems. Although this restriction limits behavioral possibilities, one can begin to deal with behavioral questions and arrive at results comparable to those that exist for linear PDEs but without the complexities of continuous mathematics. On the other hand, it becomes possible to examine concepts developed in automata theory as they are exemplified in this special class of systems and thereby make the corresponding connections with the continuous theory. For purposes of modeling and simulation, perhaps the two most important links of this kind concern the relationship between known analytical means for solving certain PDEs and the conceptual framework, expressed in automata theoretic terms, which is developing for the process of model building and simulation (Zeigler [1972,1974]). In particular, we will discuss the separation of variables technique as it appears in a discrete situation and the automata theoretic technique

representing non-uniform structures as cellular automata. It is shown that in some instances spatial inhomogeneities can be ignored without producing misleading models of a system's operation. Since our purpose is primarily to integrate aspects of the PDE and cellular automaton modeling formalisms, we make no attempt to present results in their most general form. It is felt that such abstraction tends to obscure the basic algebraic simplicity of the theory. We do, though, indicate directions of generalization that can be undertaken in the future.

Chapter 2
Notational Preliminaries

2.1 Introduction

In this chapter we explain some of the notation and terminology used throughout this thesis. We have tried to use standard symbolism whenever possible, but since aspects of several formal traditions have been brought together, we have often found it necessary to make compromises. When each of several traditions has its own standard way of expressing the same ideas, we have tried to choose the most economical notation and the most suggestive terminology. Sec. 2.2 covers our usage of some basic mathematical notation. The other sections explain concepts from linear algebra, group theory, and automata theory which we will need later on. Notation that is very standard will be used with no explanation, and more specialized notation will be introduced as it is needed.

2.2 Basic Notation

Throughout the text we will use the symbols R, C, and Z to denote the real numbers, complex numbers, and the integers respectively. The script symbol ℛ will be reserved to identify an arbitrary ring. We will use Z+ to indicate the set of non-negative integers, i.e. {0,1,...}. For a set X with subset Y, X−Y = {x | x∈X and x∉Y} is the complement of Y in X. The cardinality of a set X will be denoted |X|. If η is a function from a set A into a set B, we will write η: A → B and indicate its action on elements of A by η: a ↦ b when a∈A, b∈B, and η(a) = b. The image of A in B under η is the set

η(A) = {η(a) | a∈A}. The function η is onto B when η(A) = B and one-one (or injective) when η(a) = η(a′) implies a = a′. A one-one and onto function is called a bijection. We will write η⁻¹(b) to indicate the set {a∈A | η(a) = b} whether η is injective or not. Suppose A, B, and C are sets, f: A → B, and g: B → C. The functions f and g can be composed to produce a function gf: A → C defined by a ↦ g(f(a)) for all a∈A. Note the order used in writing this composition. We will occasionally find it convenient to use the following simple terminology. Let Y be a subset of a set X. The map i: Y → X given by i(y) = y for all y∈Y, is called the inclusion map of Y into X. For two elements x and y of a group (or a ring or field), we will say that x divides y and write x|y when there is a positive integer d such that y = x+x+...+x with d terms on the right. We will also write y = x+x+...+x (d times). The symbol ≅ will be used in two ways: first, it indicates the isomorphism of various structures, and, secondly and less frequently, it indicates approximate equality. The context will make the meaning clear. Finally, we will abbreviate the phrase "with respect to" by writing simply w.r.t.

2.3 Linear Algebra

Let K be a field and let K^n denote the vector space of all n-tuples of elements of K. A typical vector will be written v = (v_0,...,v_{n−1}). Let Mat_K(n,m) denote the set of all n row × m column matrices over K. For a matrix A, Av will denote the image of v where

$$(Av)_i = \sum_{j=0}^{m-1} a_{ij} v_j$$

where a_{ij} is the entry in the i-th row and j-th column of A. We will also write A = (a_{ij}). Recall that Mat_K(n,m) is also a vector space over K. For any set X, if f: X → K^n we define n scalar-valued functions f_i: X → K, i = 0,...,n−1, such that for all x∈X, f(x) = (f_0(x),...,f_{n−1}(x)). We will often find it necessary to define matrix-valued functions of a set X. Suppose H: X → Mat_K(n,m). Then for every x∈X, H(x) is a matrix which we will denote by (h_{ij}(x)) where h_{ij}: X → K for i = 0,...,n−1; j = 0,...,m−1. We will use capital script letters to indicate matrix-valued functions. Almost all of the theory to follow deals exclusively with finite dimensional vector spaces. For some parts of the theory, however, we have chosen to use notation and terminology that is unconventional in the finite dimensional case. The reason for this is that we would like the notation to suggest extensions to the infinite dimensional theory that is the subject of functional analysis. To prevent this notation from giving rise to misconceptions, we will briefly describe its relationship to the more familiar symbolism of finite dimensional linear algebra. Let V be a finite dimensional vector space over K with basis {e_0,...,e_{n−1}}. Then, as we have indicated, a vector v∈V is usually written v = (v_0,...,v_{n−1}) when

$$v = \sum_{i=0}^{n-1} v_i e_i$$

with the v_i ∈ K. Another way to write this vector is as a function. Let

I = {0,...,n−1}. Then {e_0,...,e_{n−1}} = {e_i | i∈I} and we can view a vector v∈V as a function of I, i.e. v: I → K is given by v(i) = v_i. This slight notational bias makes it easier to consider index sets that are more than just sets. In particular, the index set will often be a finite group (actually it will be the underlying set of a finite group, a distinction we will make clear below). Functional analysis deals with the structures arising when the index set I is the set R. Needless to say, the uncountability of R adds numerous subtleties to the theory.

2.4 Group Theory

We will use some elementary concepts from the theory of finite abelian groups that can be found in any introductory text on group theory (e.g. Rotman [1973]). Let G be a non-empty set. If there is a binary operation on G that associates to each pair of elements x, y in G the element denoted by x+y in G such that: i) for all x,y,z∈G, (x+y)+z = x+(y+z); ii) there is an element e∈G called the identity element such that e+x = x+e = x for all x∈G; and iii) for every x∈G, there is an element −x of G called the inverse of x such that x+(−x) = (−x)+x = e; then G is said to be a group. We will consistently use the additive notation. For conceptual clarity, it will sometimes be necessary to distinguish between a group G and the set of elements of G. We will refer to this set as the underlying set of G - it has the same elements as G but there is no rule of composition. Usually, however, the symbol G will denote both the group G and its underlying set.
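As a concrete instance of axioms i)-iii), the following throwaway check (not part of the thesis; all names are illustrative) verifies them exhaustively for addition modulo 5.

```python
# A quick check of the three group axioms for the set {0,...,4}
# under addition modulo 5.  An invented illustration, not from the text.
N = 5
G = range(N)
add = lambda x, y: (x + y) % N

# i) associativity
assert all(add(add(x, y), z) == add(x, add(y, z))
           for x in G for y in G for z in G)
# ii) identity element e = 0
assert all(add(0, x) == x == add(x, 0) for x in G)
# iii) inverses: the inverse of x is N - x (mod N)
assert all(add(x, (N - x) % N) == 0 for x in G)
print("addition mod", N, "forms a group")
```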

A group is abelian or commutative if x+y = y+x for all x,y∈G. The order of G, denoted by |G|, is the number of elements in G. If the order of G is finite, we call G a finite group. A non-empty subset D of (the underlying set of) a group G is a subgroup if the elements of D form a group under the operation defined for G. By Lagrange's theorem, if D is a subgroup of G and G is finite, then |D| divides |G|. The subgroup of G consisting of the identity element alone is called the trivial subgroup of G. If X is a subset of G (not necessarily a subgroup), for every s∈G we will let s+X denote the set {s+x | x∈X}. Similarly, s−X = {s−x | x∈X}; this should not be confused with our notation for set complement (Sec. 2.2) in which both symbols refer to sets. If X is a subgroup of G, s+X is a (left) coset of X in G. Let X be a subset of G. The smallest subgroup of G containing X is the subgroup generated by X, denoted by {X}. When X has only one element x, {X} is called the cyclic group generated by x. If this cyclic group has order N, then we call it the cyclic group of order N and denote it by Z_N. This is justified by the fact that any two cyclic groups of the same finite order N are isomorphic to each other and hence isomorphic to the group formed by the set of integers {0,...,N−1} with modulo N addition as the group operation. If G and H are groups, then the set of all ordered pairs (g,h), g∈G and h∈H, forms a group under the operation

(g,h) + (g′,h′) = (g+g′, h+h′).

This group is denoted G ⊕ H and is called the (external) direct sum of G and H. For example, Z^d = Z ⊕ ... ⊕ Z (d times) is the group of all d-tuples of integers with component-wise integer addition as the group

operation. A mapping η: G → H between the two groups G and H is a group-homomorphism if η(g+g′) = η(g) + η(g′) for all g,g′∈G. The addition on the left is the operation of G and the one on the right is the operation of H. If η is also one-one and onto (a bijection) then it is a group-isomorphism and we'll write G ≅ H. If η: G → H is a group-homomorphism, the set K = {g∈G | η(g) = e}, where e is the identity element of H, is called the kernel of η and is a subgroup of G. The set η(G) is a subgroup of H. Our use of group theory does not go very far beyond the use of the elementary concepts described above. Our reason for using the idea of a group is that it precisely and compactly characterizes the intuitive idea of structural symmetry or uniformity. To see what group theory has to do with cellular automata, recall that we described a cellular automaton as a set of components, all identical, arranged so that each component is located at a point in space with integral coordinates. Moreover, the components are interconnected in a "uniform" way. This uniformity means that if the entire structure of the cellular automaton is shifted appropriately in space, then it looks exactly the same as it did before the shift. These shifts can be thought of as the elements of a group with the "sum" of two shifts given by the following: first do one shift, then, starting from the resulting position, do the other shift. It's easy to see that with this kind of "addition" the set of shifts does indeed form a group. In fact, for a cellular automaton having a component at each point in space with integer coordinates, the set of shifts that leave the structure unchanged is isomorphic to Z^3

(for 3-dimensional space), i.e. the group consisting of the coordinates of the cells. This group is the group of symmetries of the cellular automaton's structure which we'll later call its structure group. Now, as Holland [1960] first pointed out, other groups do as well as Z^d and describe other kinds of cellular automata. Some of these have non-abelian groups of symmetries consisting of various reflections and rotations in addition to simple shifts. Since our usage of these terms coincides with their usage in studies of crystals, it is quite literally true that cellular automata are crystals of automata. We will present these facts more precisely in Chapter 3. For more information on groups as models of real structures we refer the reader to Budden [1972] and Grossman and Magnus [1964]. In the chapters to follow, several sections are digressions into aspects of the theory of harmonic analysis on groups. In these sections, as in the current one, we will use G to denote a group. We will always use the symbol S, however, to denote a group that underlies the structure of a cellular automaton. The intent is to suggest the words "space" or "structure".

2.5 Automata Theory

The concept of a finite automaton or, what is the same thing, a finite state sequential machine has been developed to provide a model of discrete, deterministic, computational processes requiring only finite memory. The theory presented in the following chapters is formulated entirely within an automata theoretic framework in order to provide a connection between models of computational processes and more classical models based on differential equations. We have, however, dropped the

finite memory restriction in order to include models closely related to these more traditional modeling formalisms. Accordingly, the structures that are studied in this thesis are examples of automata or sequential machines having state sets that are possibly infinite. All of the major results, however, also apply to a non-trivial class of finite automata. We will use the terms automaton and sequential machine interchangeably to refer to a Moore type* sequential machine as defined, for example, by Hartmanis and Stearns [1966], but with the finiteness restrictions removed. An automaton, therefore, is a quintuple M = <X,Q,Y,δ,λ> where

i) X is a non-empty set of inputs;
ii) Q is a non-empty set of states;
iii) Y is a non-empty set of outputs;
iv) δ: Q × X → Q is the transition function; and
v) λ: Q → Y is the output function.

M is a finite automaton when X, Q, and Y are finite sets. This mathematical structure is intended to model a discrete, deterministic process in the following sense: at any time t∈Z+, the process (machine) is entirely characterized by a state q∈Q so that if it receives an input x∈X at time t, then its state at time t+1 is given by δ(q,x). Its output at time t+1 is λ(δ(q,x)). Given a starting state and a sequence of inputs, by successively applying the transition function, it is possible to generate the sequence of states through which the process passes. Similarly, it is possible to generate the output sequence.

*As opposed to a Mealy type sequential machine in which λ is a function of Q × X instead of just Q. See Hartmanis and Stearns [1966].
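The quintuple transcribes almost directly into code. The sketch below (an invented parity machine, not an example from the text) shows how successive applications of the transition function generate the state and output sequences.

```python
# A minimal sketch of a Moore-type quintuple M = <X, Q, Y, delta, lambda>.
# The parity machine here is a made-up example: it remembers whether it
# has seen an even or odd number of 1s.

X = {0, 1}                       # inputs
Q = {"even", "odd"}              # states
Y = {0, 1}                       # outputs
delta = {("even", 0): "even", ("even", 1): "odd",
         ("odd", 0): "odd",  ("odd", 1): "even"}   # transition function
lam = {"even": 0, "odd": 1}      # output function ("lambda" is reserved in Python)

def run(q, inputs):
    """Apply the transition function successively, collecting the outputs."""
    outs = []
    for x in inputs:
        q = delta[(q, x)]
        outs.append(lam[q])
    return outs

print(run("even", [1, 0, 1, 1]))  # -> [1, 1, 0, 1]
```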

One of the crucial problems faced in applying this model to any actual process is in deciding what aspects of the process are to be taken as states. The choice is not arbitrary since, according to the definition of an automaton, a state must be rich enough so that it can be used, together with input, to predict the next state. Of course another problem arises since it's not always obvious what is appropriately considered input to the system. Keeping these problems in mind, the reader can think of the prototypical example of modeling a digital computer as a (finite) automaton: a state at any time of its operation can be taken to be the contents of its entire memory including such things as data registers and instruction location counter. See Minsky [1967] for a careful development of the concept of state. In the next chapter we define a cellular automaton as a special kind of autonomous automaton or state-transition system. An autonomous automaton is simply a pair M = <Q,δ> where Q is a non-empty set of states and δ: Q → Q is the transition function. This structure represents a process in which there is no input and in which the output at time t can be considered to be the state at time t, i.e. λ(q) = q. If at time 0, M is in state q, we will write q_t = δ^t(q) = δ(δ(...δ(q))) (where δ is applied t times) to indicate the state of M at time t∈Z+. Although most of the theory we develop can be extended without difficulty to the case having input and output, we will stay almost entirely within the framework of autonomous sequential machines.*

*The exception to this is that the cells of a cellular automaton are sequential machines with input and output (from and to their neighbors). Viewed as a whole, however, a cellular automaton (as we will define one) is autonomous.
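For the autonomous case the same idea needs only the pair <Q,δ>. A small sketch with an invented transition function (squaring modulo 11, purely for illustration) generates the trajectory q_t = δ^t(q):

```python
# A sketch of an autonomous automaton M = <Q, delta> and the trajectory
# q_t = delta^t(q).  The particular delta is invented: squaring mod 11.

Q = range(11)
delta = lambda q: (q * q) % 11

def trajectory(q, steps):
    """States at times 0, 1, ..., steps, obtained by iterating delta."""
    states = [q]
    for _ in range(steps):
        q = delta(q)
        states.append(q)
    return states

print(trajectory(2, 6))   # -> [2, 4, 5, 3, 9, 4, 5]: the orbit enters a cycle
```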

While this restriction makes it impossible to talk about such important ideas as controllability, observability, and minimality (see, for example, Kalman, Falb, and Arbib [1969]), it removes a great deal of notational complexity. In addition, our desire is simply to provide points of contact between automata theory and other formal traditions, and we feel that for this purpose a study of the input-free case suffices. In the conclusion we indicate the appropriate directions of generalization and mention some results that can be obtained. The automata theoretic concepts to be described below are therefore given for the autonomous case only. The complete definitions can be found in Hartmanis and Stearns [1966]. When it is clear that the term automaton (or sequential machine) refers to a pair instead of a quintuple, we will drop the word autonomous. Let M_i = <Q_i,δ_i>, i = 1,2, be two sequential machines. M_2 is a subautomaton or submachine of M_1 if Q_2 ⊆ Q_1 and δ_2(q) = δ_1(q) for all q∈Q_2. A map F: Q_1 → Q_2 that is onto Q_2 is an automaton-homomorphism if F(δ_1(q)) = δ_2(F(q)) for all q∈Q_1. We will say that M_2 is a homomorphic image of M_1 and write M_1 —F→ M_2 or just M_1 → M_2. If F is also one-one, then it is an automaton-isomorphism and we'll write M_1 ≅_F M_2 or just M_1 ≅ M_2. When there is no danger of confusion between homomorphisms (isomorphisms) of groups and automata, we will omit the identifying prefix. A fact that will play an important role in the proofs of many theorems is that when automaton-homomorphisms are composed, the result is also an automaton-homomorphism, i.e. if M_i = <Q_i,δ_i>, i = 1,2,3, are automata, with M_1 —F_1→ M_2 and M_2 —F_2→ M_3, then M_1 —F_2F_1→ M_3. If both F_1 and F_2 are isomorphisms, then so is F_2F_1. It will often be convenient

to use the standard commutative diagram conventions to express facts of this kind. For example, the diagram below conveys all of these facts: i) F_1 is a homomorphism from M_1 to M_2; ii) F_2 is a homomorphism from M_2 to M_3; iii) F_3 is a homomorphism from M_1 to M_3 and F_3 = F_2F_1.

[Diagram: a commutative triangle with M_1 —F_1→ M_2, M_2 —F_2→ M_3, and M_1 —F_3→ M_3.]

Suppose M_1 and M_2 are two automata and M_1′ is a submachine of M_1. If M_1′ ≅_{F⁻¹} M_2 we say that M_1 realizes M_2. The map F is called an assignment of M_2 into M_1 (cf. Hartmanis and Stearns [1966]). Intuitively, M_1 can do everything M_2 can do and more. For any starting state q of M_2 there is a starting state F(q) of M_1′ such that the behaviors of M_1 and M_2 will be the same (with relabelling by F) for all time. When M_1 is started in a state not in the state set of M_1′, it may behave in a way entirely unrelated to any possible behavior of M_2. This concept is intended to model the situation which occurs when it's necessary to construct ("realize") a device (e.g. logical circuit) that behaves like M_2 but it's simpler for some reason to build one that behaves like M_1. If M_1 realizes M_2, building the simpler device suffices. Finally, we define a subclass of sequential machines which provides other points of contact between different modeling traditions. This class consists of sequential machines whose transition functions and output functions can be represented by linear operators. Although we will mostly need to consider only the autonomous case, we start with the definition from Harrison [1969] which includes input and output.

A (Moore type) linear sequential machine (or LSM) is a sequential machine M = <X,Q,Y,δ,λ> such that there is a field K and integers k,n,l∈Z+ such that X = K^k, Q = K^n, Y = K^l, and such that there are matrices A, B, and C over K (of appropriate sizes) so that δ and λ can be given as follows:

δ(q,x) = Aq + Bx
λ(q) = Cq

for all q∈Q and x∈X. Similarly, an autonomous LSM is a special kind of autonomous sequential machine and corresponds to an LSM with k = l = 0. We will denote an autonomous LSM by writing M = <K^n,A>. When K is a finite field, an LSM is a special kind of finite automaton. The concept of an LSM represents a formulation in the language of sequential machines of a class of models that goes by several different names. In the more mathematically oriented literature, e.g. the literature of numerical analysis, LSMs are called linear difference equations. From the systems theory point of view, they are discrete time, time invariant, finite dimensional linear systems (e.g. Kalman, Falb, and Arbib [1969]). In the more applications oriented areas involving signal processing, they are sometimes known as recursive digital filters (see Gold and Rader [1969]).
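As a hedged sketch of this definition, the matrices below are arbitrary illustrations over the finite field K = GF(2) (arithmetic mod 2), so the resulting LSM is a finite automaton whose transition and output functions are the linear maps δ(q,x) = Aq + Bx and λ(q) = Cq.

```python
# A minimal sketch of a Moore-type LSM over K = GF(2).  A, B, C are
# invented matrices, not an example from the text.

def mat_vec(A, v, p=2):
    return [sum(a * x for a, x in zip(row, v)) % p for row in A]

def vec_add(u, v, p=2):
    return [(a + b) % p for a, b in zip(u, v)]

A = [[0, 1], [1, 1]]   # state-transition matrix (n = 2)
B = [[1], [0]]         # input matrix (k = 1)
C = [[1, 0]]           # output matrix (l = 1)

def step(q, x):
    """delta(q, x) = Aq + Bx over GF(2)."""
    return vec_add(mat_vec(A, q), mat_vec(B, x))

q = [0, 0]
for x in ([1], [0], [1], [1]):
    q = step(q, x)
    print(mat_vec(C, q))   # output lambda(q) = Cq after each step
```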

Chapter 3
Cellular Automata

3.1 Introduction

The idea of a cellular automaton was first developed by John von Neumann, at the suggestion of S. Ulam, as the framework for a demonstration that no logical problems were involved in the idea that a "machine" could, at least in principle, reproduce itself. He began with the description of a structure (which most people could comfortably call a machine) that would be able to construct a copy of itself. This logical construction was left unfinished at his death but was completed by A. W. Burks and published as the Theory of Self-reproducing Automata [1966]. The self-reproducing structure was embedded in a particular two-dimensional example of what von Neumann called a cellular automaton. He defined the general notion of a cellular automaton as a uniform interconnection of identical finite state automata and viewed the theory of these networks as part of a more general theory of automata. From the beginning, then, cellular automata have been thought of in terms of the discrete mathematics of automata theory. Although von Neumann hoped to develop a continuous model of self-reproduction formalized by non-linear PDEs of the sort used to model fluid dynamics, only the discrete cellular automaton version was completed (von Neumann [1966]). However, investigations inspired by von Neumann's construction continued in the tradition of automata theory where the technical context is largely that of digital computation: computational and constructional universality, formal grammar theory, and analysis and synthesis of cellular hardware. Aladyev [1974] surveys a broad range of these studies. Minnick [1967] surveys the more applications oriented work on cellular logic.

The emphasis in this thesis, however, is on the view of cellular automata that is more closely allied with continuous mathematics. We give a formal definition of a cellular automaton that uses terminology and the conceptual view of functional analysis but is, nevertheless, essentially the same as the definition familiar to automata theorists. We then use this framework to indicate how the idea of a cellular automaton can be altered to make contact with various non-automata theoretic formalisms.

3.2 Definition of a Cellular Automaton

Before a formal definition of a cellular automaton is given, some notational preliminaries are required since we take a functional analytic point of view that is unusual in discussions of cellular automata. Instead of talking about n-tuples and d-tuples of n-tuples etc., we'll talk about functions and operators on function spaces. The purpose of using this terminology and the accompanying notation is to indicate connections to traditional mathematical structures that are obscure within the usual automata theoretic framework. Let S be a countable set and let V be a set with a distinguished element denoted by 0. The support of a function f: S → V is the set X ⊆ S such that f(s) ≠ 0 if and only if s∈X. A function f: S → V has finite support if its support is a finite set. Let F(S,V) denote the set of all functions from S to V that have finite support. For any subset X of S and any f∈F(S,V), let f|X denote the restriction of f to X, i.e. f|X: X → V is given by (f|X)(s) = f(s) for s∈X. Now suppose S is a group under an operation denoted by the symbol '+'. For any X ⊆ S and any s∈S, let s+X = {s+x | x∈X} and s−X = {s−x | x∈X}.

For every s∈S, it is convenient to define a shift operator L_s on the set F(S,V) such that (L_s f)(r) = f(s+r) for every f∈F(S,V) and r∈S. Under operator composition the set of shift operators forms a group isomorphic to S via the map that takes s into L_{−s}. This is true since L_s L_t = L_{t+s}, or L_{−s} L_{−t} = L_{−t−s} = L_{−(s+t)} (as can be verified using the definition of function composition). In case S is abelian, S and the set of shifts are also isomorphic via the map s ↦ L_s.* Ignoring for a moment our countability restriction of S, suppose S = V = R and '+' is the usual real addition. Fig. 3.1 indicates the action of a particular L_s.

[Figure 3.1: The effect of a shift operator L_s, s > 0.]

*If we defined a shift operator so that (L_s f)(r) = f(r+s), then the map given by s ↦ L_s would be an isomorphism for both abelian and non-abelian groups. The definition we have chosen, however, offers certain notational conveniences later on.
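On a cyclic group the shift operator is easy to exhibit concretely. The sketch below (with invented data on Z_8) also spot-checks the composition rule L_s L_t = L_{t+s} derived above.

```python
# A sketch of the shift operator on functions of the cyclic group Z_8:
# (L_s f)(r) = f(s + r), group addition taken mod 8.  Data is invented.

N = 8

def shift(s, f):
    """Return L_s f, representing a function of Z_8 as a list indexed by r."""
    return [f[(s + r) % N] for r in range(N)]

f = [1, 2, 0, 0, 0, 0, 0, 3]
print(shift(2, f))                 # L_2 f

# The shifts compose like the group: L_s L_t = L_{t+s}.
s, t = 3, 6
assert shift(s, shift(t, f)) == shift((t + s) % N, f)
```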

We define a cellular automaton as a special kind of sequential machine (see Chapter 2). Although it is possible to consider cellular automata with input and output, for simplicity only the autonomous case will be considered here. In this case a cellular automaton is a special kind of autonomous sequential machine (state-transition system) M = <Q,δ> where δ: Q → Q.

Definition 3.1
A cellular automaton is an autonomous sequential machine M = <Q,δ> with the following special properties:

i) There is a countable group S (additive) called the structure group and a set V called the cell state set with a distinguished element 0 called the quiescent state such that Q = F(S,V). Elements of Q are called configurations; and for every q∈Q and every s∈S, q(s) is called the state of cell s in configuration q.

ii) There is a finite subset X of S called the neighborhood template of M. The set s+X is the neighborhood of cell s.

iii) There is a mapping σ: Q → V such that if q∈Q is identically 0, then σq = 0; and such that for q_1,q_2∈Q with q_1|X = q_2|X, σq_1 = σq_2. The map σ is called the local transition function of M.

iv) δ: Q → Q is called the global transition function of M and is defined as follows: δ(q) = q′ where q′ is given by q′(s) = σL_s q for all s∈S.

For convenience we will denote the cellular automaton M by M = <S,V,X,σ> but also by M = <Q,δ> when it is to be considered as an unstructured sequential machine. In order for this definition to make sense it must be true that the

global transition function δ does indeed map Q into Q, i.e. it must be true that if q has finite support, then so does δ(q). This is true as the following argument shows. Suppose q∈Q has finite support Y. We'll show that the support of δ(q) is a subset of A = ∪_{y∈Y}(y−X) and therefore, being contained in a finite union of finite sets, is also finite. To see this, suppose s∈S but s∉A. This means that for all x∈X and all y∈Y, s ≠ y−x; or, for all x∈X, that s+x∉Y. Thus for any x∈X, (L_s q)(x) = q(s+x) = 0 since the support of q is Y. If f∈Q is the configuration identically equal to zero, we have that (L_s q)|X = f|X so that by part (iii) of the definition σL_s q = σf = 0. Thus (δ(q))(s) = σL_s q = 0, and we've shown that the support of δ(q) is contained in A and is therefore finite. Our definition of a cellular automaton, then, is internally consistent; but is it consistent with the various other forms of the definition that are found in the literature of cellular automata? Although the functional notation we use perhaps obscures the fact, our concept of a cellular automaton is essentially identical to that used elsewhere in the literature with the exceptions that we have allowed the structure group S to be any countable group instead of just the group Z^d, and we have allowed the cell state set V to be infinite. Our more general definition is given for two reasons: first, allowing the structure group to be any countable group lets us explore the relation of these structures to classical symmetrical structures which we do in the next section, and second, we want to include cases where V is a finite-dimensional vector space - also to include more classical structures (Chapter 4). Except for these two differences, we have defined what has been called an algorithmic cellular automaton (Burks [1970]). Even though the array of cells may be infinite, if all but a finite number of

cells are quiescent initially, then all but a finite number of cells will be quiescent at any time during the cellular automaton's operation. This property is the same as requiring the state set Q to consist only of configurations with finite support and showing, as we have done, that Q is closed under the action of the global transition function δ. The least standard part of our definition is (iii), the description of the local transition function σ. The conditions on σ given there imply that 1) if a cell's neighborhood contains only quiescent cells, then its next state will be quiescent, and 2) that σ is indeed local since its action does not depend on a function's values outside of the finite set X. The specification of the global transition function δ that we give in part (iv) differs from that given in other definitions (e.g. Yamada and Amoroso [1969]) only because we use the notion of a shift operator. A cell's next state is obtained by viewing it as located at the origin (the identity of S) and operating with σ. Thus, at least for S = Z^d, our cellular automata have the usual kind of uniformity. In the next section we discuss the meaning of uniformity with respect to an arbitrary countable group.

3.3 Group Graphs

As we have indicated, the concept of structural uniformity is usually defined with respect to Z^d as in Aladyev [1974]. However, the more basic notion is structural symmetry which group theory precisely characterizes as mentioned in Chapter 2. Def. 3.1 allows the structure group S to be finite or even non-abelian in order to place structural uniformity in this more basic context. It is necessary, however, to show that the idea of a cellular automaton with a non-abelian and/or finite structure group makes intuitive

sense. It was shown by Wagner [1966] that uniform interconnection of components can be characterized by group graphs based on any kind of group. The integral lattice and neighborhood template of a cellular automaton as it is usually defined can be regarded as a group graph based on Z^d. A group graph (slightly different from the graph of a group) is a graph whose vertices correspond to and are labeled with the elements of a countable group S, and for which there is a finite subset X of S such that there is an edge between vertices corresponding to s and r if and only if there is an x∈X such that r = s+x. We will say the graph is a group graph of S w.r.t. X. This graph is completely connected if and only if X generates S. (The set X will correspond to the neighborhood template called X in Def. 3.1.) Wagner has shown that this concept precisely characterizes what we mean when we say that a network "looks the same" from each component. Such a network corresponds to a group graph and every group graph gives rise to such a network.

Example 3.1
Consider, for example, the group S_3 of all the permutations of 3 objects. This group is non-abelian and has six elements, say a, b, c, d, e, and f, and the operation given by this table:

  + | a b c d e f
  --+------------
  a | a b c d e f
  b | b a f e d c
  c | c e a f b d
  d | d f e a c b
  e | e c d b f a
  f | f d b c a e

Let X = {b,e}. The group graph of S_3 w.r.t. X is shown in Fig. 3.2.

[Figure 3.2: The group graph of S_3 w.r.t. X = {b,e}.]

The crystal-like structure in Fig. 3.2 could very well underlie a cellular automaton by our definition. The shift operators would perform shifts of configurations according to the operation of the group S_3. Fig. 3.3 illustrates the action of L_b.

[Figure 3.3: The action of a shift operator on a function of S_3.]

Intuitively, a translation with respect to a non-abelian group requires a rotation in addition to a shift. This is more easily seen by studying the non-abelian tessellation of the entire plane shown in Fig. 3.4. We have indicated two sets, s_1+X and s_2+X, that would correspond to the neighborhoods of cells s_1 and s_2 respectively if the graph

in Fig. 3.4 were associated with a cellular automaton.

[Figure 3.4: A non-abelian tessellation of the plane.]

The reader is referred to Grossman and Magnus [1964] for an elementary presentation of these concepts. In what follows, however, we will deal almost exclusively with abelian groups.
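The group graph of Example 3.1 can be generated mechanically from the operation table. The sketch below is a throwaway transcription (not code from the thesis) that lists the edge set of Fig. 3.2: for each element s and each x in the template X = {b,e}, there is an edge joining s and s+x.

```python
# Rebuild the group graph of S_3 w.r.t. X = {b, e} from its operation table.
# `table[s][x]` gives s + x; the table transcribes the one above ('a' is
# the identity).

table = {
    'a': {'a': 'a', 'b': 'b', 'c': 'c', 'd': 'd', 'e': 'e', 'f': 'f'},
    'b': {'a': 'b', 'b': 'a', 'c': 'f', 'd': 'e', 'e': 'd', 'f': 'c'},
    'c': {'a': 'c', 'b': 'e', 'c': 'a', 'd': 'f', 'e': 'b', 'f': 'd'},
    'd': {'a': 'd', 'b': 'f', 'c': 'e', 'd': 'a', 'e': 'c', 'f': 'b'},
    'e': {'a': 'e', 'b': 'c', 'c': 'd', 'd': 'b', 'e': 'f', 'f': 'a'},
    'f': {'a': 'f', 'b': 'd', 'c': 'b', 'd': 'c', 'e': 'a', 'f': 'e'},
}
X = ['b', 'e']

edges = [(s, table[s][x]) for s in table for x in X]
for s, r in sorted(edges):
    print(f"{s} -- {r}")    # an edge joining vertex s and vertex s + x
```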

31 Let X = {x,...,n 1}. Then the input set I' = V. For any veV and (v0,..,v )e1Vn, let q be any configuration in Q with q(xi) = vi, i=O,...,n-l. Then 6'(v,(v...,Vn_)) = aq. eeX. Say x= = e. Then I' = Vi, and for any qcQ as in case i, we have 6'(v.,(Vl,.. v n )) = aq. 0 1 ~nm-1 case ii) It follows from the definition of a that in either case 6' is well defined. The details of this specification are somewhat awkward given our definition but not complicated.' Case i corresponds to the situation in which cells are not in their own neighborhood and thus do not immediately depend on themselves. Case ii provides for this immediate dependence. These sequential machines are interconnected so that at any time t the component v. of the current input vector to the machine at node s is the current state of the machine at node s+x.. The interconnection 1 graph of the network is a graph like the group graph of S w.r.t. X but the edges are directed and labeled. For any two group elements s and r and any x. X there is an edge labelled x. directed from r to s if and 1 1 only if r = s+x.. Fig. 3.5 shows the labelled and directed group graph corresponding to the S and X that generated Fig. 3.2. This graph is the same as the diagram of immediate effects (cf. Ashby [963]) of a cellular automaton M = <S,V (b,e},>. 3$^

32 b b b b Figure 3.5: Labeled and directed group graph of S5 w.r.t. X = {b,e}. Thus, the cells of the cellular automata can be regarded as sequential machines. If a cellular automaton is in configuration q at time t, then q(s) gives the state at time t of the sequential machine located at node s of the group graph. 3.5 Automorphism Groups In this section we point out an alternative view of cellular automata that is more closely related to the non-automata theoretic approach than that provided by Def. 3.1. Again, only the autonomous case will be considered even though the results below can be extended to the more general situation. An automorphism of an automaton M with no input is simply an isomorphism from M to M, i.e. it is a bijection on Q that commutes with the transition function 6. The composition of two automorphisms is again an automorphism, and'the set of.all autombrphisms' of M forms a group (Fleck [1962]) with function composition as the group

33 operation. We'll call this group G(M). Theorem 3.1 If M = <S,V,X,a> is a cellular automaton, then S is isomorphic to a subgroup of G(M). Proof: For any seS the shift operator L is an automorphism of M: each L is s s a bijection since L L is the identity on Q, and L = 6L since S -S S S (L (q)) (r) = ((q))(s+r) = Lsrq= oLLq = oL (L q) S+r r s r s = (6(Lsq))(r). Since the L, ssS, form a group isomorphic to S (s H — L ), we are done. S ' s Q.E.D. version of a result proved by Jump [1969 ]. The following theorem is a Theorem 3.2 (Jump) If M = <Q,6> is an autonomous finite state automaton with automorphism group G(M), then it can be realized by a cellular automaton with structure group S where S is any subgroup of G(M). Proof: See Jump [1969]. An example given by Jump shows that the realization concept is necessary in this theorem. For an automaton M = <Q,6> with automorphism group G(M), it is not necessarily true that M is isomorphic to a cellular automaton with any subgroup of G(M) as structure group. M is isomorphic to a submachine of the cellular automaton. With this caveat, however, the structure group of a cellular automaton is the same as a group of automorphisms of an (autonomous) automaton.

34 Finally, we emphasize that Def..1 does not require the structure group of a cellular automaton to be m-ximal in any sense. Any sequential machine can be viewed as a cellular automaton with a trivial structure group. We have not, therefore, delineated a subclass of automata to examine but have added additional structure to the idea of an automaton. 3.6 Generalizations Yamada and Amoroso [1969] indicate that if the group S and template X are allowed to be uncountably infinite then a cellular automaton (or tessellation automaton) might be viewed as a model of a continuous physic cal system with the local transition function being an integration. The notation used in our definition of a cellular automaton was chosen so that this kind of generalization might be made more precise using standard concepts from functional analysis. However, we do not adopt the more general framework in this thesis for two reasons. First, requiring S to be countable avoids topological considerations which tend to obscure the basic simplicity of our results. Second, our emphasis is on the modeling and simulation aspects of the cellular automaton formalism. We will not need to "discretize" a cellular automaton (according to our definition) in order to simulate it on a computer. The generalization to continuous cellular space does, though, serve to indicate the place occupied by the theory of cellular automata with respect to more classical mathematical formalisms. The natural generalization (and the one that allows the harmonic analysis results described in Chapter 5 to go through) is to allow S to be any locally compact topological group and the template X to be a

35 compact subset of S. A cellular automaton could then have structure group IR and any closed and bounded region X as a neighborhood template. The natural generalization of Q, the set of configurations, would be to let it consist of:functions with compact support. This definition would specialize to ours since any countable group with discrete topology is locally compact and, for the discrete topology, compactness and finiteness mean the same thing. It is possible to go further however. The idea of a finite neighborhood or region of influence is important when cellular automata are considered from a computational point of view. Computers cannot handle an infinite amount of input at each time step. Viewed from the point of view of continuous mathematics, however, this problem takes a different form. The analogous concern is with whether the local operators are bounded or not. In the linear theory, compact neighborhoods insure boundedness but are not necessary for it. Systems described by hyperbolic PDEs have compact regions of influence that determine a finite upper limit on propagation speed (e.g. wave equation). Systems described by parabolic equations, on the other hand, do not. The heat equation, for example, has unbounded neighborhoods but they have bounded influence. Thus, it is natural to define a norm on the set of functions from S to V and require Q, the set of configurations, to consist only of functions with finite norm. The local transition function 6 would then be required to be bounded with respect to this norm. 2 For example, if V = R, Q might be L (R), the set of square-integrable (in the Lebesque sense) real valued functions of R. A local transition function might be given by

36 00 caq = oo w(s)q(s)ds 2 where weL OR). The support of w would be the neighborhood template, but the integral will converge (in the norm of L(R)) even if the support of W is all of R. This follows from the fact that L2(R) is a Hilbert sp'ce and hence reflexive. See, for example Schechter [1971]. In this framework, then, the idea of a neighborhood loses significance. Much more general frameworks are possible. For example, we will be considering operators on spaces of group functions that are invariant with respect to various group generators and causal with respect to at least one (the generator for time). Our desire, however, is to stay within the sequential machine orientation. where the concept of state plays a central role

Chapter 4
Cellular Automata and Linearity

4.1 Introduction

In this chapter we begin the discussion of cellular automata that have the additional structure of linearity, or, what is the same thing, that are composed of cells which are linear sequential machines (LSMs). The resulting mathematical structures are very well studied and have applications in many different areas. It is basic knowledge within many disciplines that structural symmetry together with linearity gives rise to the notion of convolution of functions and hence to analysis techniques based on the Fourier transform. It is not clear, however, that cellular automata researchers realize that some of the objects of their study have close relatives within classical mathematics. It is apparently also true that few people trained in the more mathematical tradition are aware of the non-standard modeling formalism provided by the cellular automaton idea. In the following sections we present, with as little mathematical abstraction as is practical, basic definitions and equations that will be used in later chapters. We also include a discussion of iterative linear networks and numerous examples. The discussion following Example 4.5 contains a proof of a result by Amoroso and Cooper [1971] and Ostrand [1971] that is quite natural within the framework developed.

4.2 Linear Cellular Automata

In this section we formally define a linear cellular automaton (LCA) by adding appropriate structure to Def. 3.1. Our use of the

term "linear" is not the same as the use that occasionally appears in the cellular automata literature, where it describes arrays of cells that are linear strings, i.e. one-dimensional or associated with the group ℤ. We use the term LCA to refer to the subclass of cellular automata that are also linear systems because they have a vector space structure and a global transition function that is a linear operator. Some of these are one-dimensional arrays, but not all of them are.

Definition 4.1
A linear cellular automaton (LCA) is a cellular automaton M = <S,V,X,σ> such that:
i) the cell state set V = Kⁿ for some non-negative integer n and field K;
ii) there is a function W: S → Mat_K(n,n) with finite support X such that the local transition function σ: Q → Kⁿ is given by

    σq = Σ_{r∈S} W(r) q(r).    (4.2-1)

We will denote the LCA M by <S,Kⁿ,W>. Its state set will be denoted by Q and its global transition function by δ. The neighborhood X is implicit in the specification of W. Note that σ is indeed local, since if q₁, q₂ ∈ Q with q₁(r) = q₂(r) for r ∈ X, then

    σq₁ = Σ_{r∈S} W(r)q₁(r) = Σ_{r∈X} W(r)q₁(r) = Σ_{r∈X} W(r)q₂(r) = σq₂.

This is true because W(r) is the zero matrix for r ∈ S−X. We also point out that the zero vector of Kⁿ is a quiescent state since σ is a linear operator. Thus Def. 3.1 part (iii) is satisfied.

It follows from Def. 3.1 (iv) that the global transition function δ of an LCA is given by δ(q) = q′ where

    q′(s) = σL_s q = Σ_{r∈S} W(r) q(s+r) for all s ∈ S.    (4.2-2)

If we define a function H: S → Mat_K(n,n) by letting H(s) = W(−s) for all s ∈ S, we can write

    q′(s) = Σ_{r∈S} H(−r) q(s+r) = Σ_{−r∈S} H(r) q(s−r) = Σ_{r∈S} H(r) q(s−r).    (4.2-3)

The change of variable in the last step simply re-orders the summation, due to the group structure of S. Equation (4.2-3) is true even when S is non-abelian. The function H is the reflection of W and has support X̃ = {−x | x ∈ X}. The set X̃ indicates the set of all cells that are influenced by the cell at the origin (instead of those that influence the cell at the origin). Therefore, consistent with terminology for PDEs (see Duff and Naylor [1966]), the set s + X̃ = s − X is called the region of influence of cell s.

Letting H(s) = (h_ij(s)) for all s ∈ S, (4.2-3) can be written in the expanded form

    [ q′_0(s)     ]        [ h_00(s)      ⋯  h_0,n−1(s)   ] [ q_0(s−r)     ]
    [    ⋮        ] = Σ    [    ⋮                ⋮        ] [    ⋮         ]    (4.2-4)
    [ q′_n−1(s)   ]  r∈S   [ h_n−1,0(r)   ⋯  h_n−1,n−1(r) ] [ q_n−1(s−r)   ]

for all s ∈ S. The component functions h_ij will be called spatial impulse response functions since, if q = (q_0,…,q_n−1) is such that

    q_k(s) = 1 if k = j and s = e (the identity of S), and q_k(s) = 0 otherwise,

then q′_i = h_ij, i = 0,…,n−1. Letting W(s) = (w_ij(s)) for all s ∈ S, we call the w_ij the weighting functions of M. Observe that since the support of W is X, we know that X_ij ⊆ X, i,j = 0,…,n−1, where X_ij is the support of w_ij; similarly for H.

The operators given by equations (4.2-2) and (4.2-3) involve the natural generalizations of cross-correlation and convolution respectively. For f,g: S → K, i.e. for scalar-valued functions, we will occasionally write f *_S g for the function given by

    (f *_S g)(s) = Σ_{r∈S} f(r) g(s−r) for all s ∈ S.

This scalar convolution is commutative if and only if S is abelian. Since the function H will often provide the most convenient form for specifying an LCA, we will sometimes denote an LCA M by <S,V,H>, with the understanding that the symbol H is always interpreted as the reflection of the function W found in Def. 4.1.

The function H corresponds to what is called the Green's function for linear PDEs with constant coefficients (Duff and Naylor [1966] or Roach [1970]). For PDEs, however, the Green's function H_G is a function of time as well as space: H_G(s,t) gives the response at s (place) and t (time) to an impulse applied at the origin. In the case of discrete time, it is possible to specify this response for all time t ∈ ℤ⁺ by specifying it for t = 1; then H_G(s,t) for t > 1 is determined by the composition property of a state determined system (Kalman, Falb, and Arbib [1969]). Thus, if we were to let H_G be the discrete analog of a Green's function for an LCA, we would have H(s) = H_G(s,1) for all s ∈ S.

It is convenient to make several other observations about Def. 4.1. The set Q of configurations of an LCA is a vector space of vector-valued

functions with the usual component-wise operations of function addition and scalar multiplication. Each shift operator L_s, s ∈ S, is a linear operator on Q. The local transition function σ: Q → Kⁿ is also a linear operator, and it is bounded since W has finite support. Finally, the global transition function δ is a linear operator on Q and can be written in operator form as follows:

    δ = Σ_{r∈X} W(r) L_r.    (4.2-5)

This is verified by the following manipulations: for q ∈ Q,

    δ(q) = (Σ_{r∈X} W(r) L_r) q = Σ_{r∈X} W(r)(L_r q)

since the L_r are linear. Thus, for any s ∈ S,

    (δ(q))(s) = Σ_{r∈X} W(r)[(L_r q)(s)] = Σ_{r∈X} W(r) q(s+r) = Σ_{r∈S} W(r) q(s+r),

which is formula (4.2-2).

Within the framework provided by Def. 3.1, i.e. S countable and X finite, part (ii) of Def. 4.1 is the same as simply requiring the local transition function to be a linear operator. We have chosen to provide the representation W of the local transition function explicitly in order to simplify our presentation. In a more abstract theory formalized in terms of Hilbert space, the representation of σ is given in terms of an inner product which, in some cases, is a convergent integration (not necessarily over the set X) as indicated in Sec. 3.6. For our purposes, however, Def. 4.1 suffices, and we leave it to the more mathematically inclined reader to provide the appropriate generalizations.
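Before turning to network realizations, it may help to see Def. 4.1 computationally. The following Python sketch is our illustration, not part of the original development; the dictionary representation of finite-support configurations and all names are assumptions made for the example. It applies the global transition function (4.2-2) once, for S = ℤ.

    import numpy as np

    def lca_step(q, W, n):
        """One step of the global transition function (4.2-2):
        q'(s) = sum over r in X of W(r) q(s+r).
        q: dict {cell s (int): state vector in K^n}; absent cells are quiescent (zero).
        W: dict {offset r in X: n-by-n weighting matrix W(r)}."""
        zero = np.zeros(n)
        candidates = {s - r for s in q for r in W}   # cells that can become non-quiescent
        q_next = {}
        for s in candidates:
            v = sum(W[r] @ q.get(s + r, zero) for r in W)
            if np.any(v != 0):                       # keep the support finite
                q_next[s] = v
        return q_next

    # Example: n = 1 and neighborhood template X = {-1, 0, 1}, as in Example 4.1.
    W = {-1: np.array([[0.25]]), 0: np.array([[0.5]]), 1: np.array([[0.25]])}
    q = {0: np.array([1.0])}                         # an impulse at the origin
    for t in range(3):
        q = lca_step(q, W, n=1)                      # q is now the response H_G(., 3)

Starting from an impulse, the configuration after t steps is exactly the discrete Green's function H_G(·,t) discussed above.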

4.3 Linear Networks and LCAs

Intuitively, an LCA is a cellular automaton in which each cell is a Moore-type linear sequential machine (see Chapter 2) with state space dimension n. Moreover, the interconnections between the cells are linear, so that the LCA as a whole is a linear sequential machine (although not necessarily a finite dimensional one). We will make this view explicit, and provide an intuitive basis for understanding the various parts of Def. 4.1, by showing how an LCA can be realized by a linear network.

A sequential linear network is an interconnection of three kinds of components: unit delays, summers, and coefficient devices. Fig. 4.1 illustrates the function of these primitive components.

Figure 4.1: Sequential linear network components: (a) unit delay, y(t) = x(t−1); (b) summer, y = x₁ + x₂ + ⋯ + xₙ; (c) coefficient device, y = ax.

The arithmetic operations shown are those of an underlying field K, i.e. the summation in Fig. 4.1(b) is the operation of K, and a in (c) is any element of K. A network represents an ideal computational process in which a unit delay can store any field element, a summer performs exact addition, and a coefficient device performs exact multiplication by an arbitrary field element, even if the field is infinite. Of course any real implementation of a network over an infinite field

will involve round-off error, but for the moment we will ignore this issue. It is of course also impossible to construct, or to simulate under all circumstances, a network with an infinite number of components. For an LCA, however, we can illustrate such infinite networks by drawing just the structure in the neighborhood of one cell. This structure is, by definition, finite, and can be mentally iterated in space to give the complete structure. Ellipses in our diagrams will indicate this iteration. The following examples illustrate the network interpretation of Def. 4.1. Since it is very difficult to illustrate in detail the linear networks associated with LCAs having large S and n, we will take this opportunity to establish a diagram convention for use in the following chapters.

Example 4.1
Let M = <ℤ,ℝ,W> where W has support {−1,0,1}, i.e. M is an LCA with structure group ℤ, cell state set ℝ, neighborhood template {−1,0,1}, and local transition function specified by W: ℤ → Mat_ℝ(1,1) ≅ ℝ.

Figure 4.2: Linear network for an LCA with structure group ℤ and neighborhood template {−1,0,1}.

Figure 4.2 shows a section of an infinite linear network which realizes M. Note that 1) we have labeled the delays with the elements of the structure group ℤ; 2) we have not drawn the coefficient devices for W(s), s ∈ ℤ−X, since, by definition, those coefficient values are all zero; and 3) the neighborhood template X gives the labels of the delays that have wires leading to D₀, and a wire from D_s goes through a coefficient device with coefficient value W(s).

The more schematic representation shown in Fig. 4.3 is produced by drawing the labeled and directed group graph of ℤ with respect to X = {−1,0,1} as described in Sec. 3.4. Here, however, we label an edge arising from s ∈ X with a circle containing W(s), in order to suggest a coefficient device with value W(s). The nodes are square, to suggest unit delays, and are labeled with elements of ℤ. In the sequel, diagrams of this form will always be intended to represent an underlying linear network of unit delays, summers, and coefficient devices.

Figure 4.3: A schematic representation of an iterative linear network.

A final simplification is shown in Fig. 4.4(a), where only the edges directed into one node are shown. By the definition of an LCA, this is

the information necessary to uniquely specify the complete network. Fig. 4.4(b) similarly specifies a network labeled with the values of H. Since H(s) = W(−s), these networks are exactly the same.

Figure 4.4: (a) A minimal representation of an iterative network; (b) the same network represented by the values of H.

Example 4.2
Fig. 4.5 shows a simplified diagram for an LCA M = <ℤ,ℝ²,W> where the support of W is {−1,0,1}.

Figure 4.5: Diagram for M = <ℤ,ℝ²,W>.

This network can be viewed as consisting of two interconnected layers, each like the network in Fig. 4.4. The weighting functions w_ij specify the coefficients in the following way: w_ij(s) gives the coefficient for the connection to cell 0 of layer i from cell s of layer j. Alternatively, the structure within the dotted lines can be viewed as a cell, following Sec. 3.4. It is an LSM with state set ℝ², input set ℝ⁴, and transition function given by

    δ(q) = W(0)q + [W(−1), W(1)] x

for q ∈ ℝ², x ∈ ℝ⁴, and [W(−1), W(1)] ∈ Mat_ℝ(2,4) obtained by writing the matrices W(−1) and W(1) next to each other.

It is interesting to note for this example that if each matrix W(s), s ∈ S, happens to be symmetric, then the cells of M are internally uniform and connected in such a way that Fig. 4.5 has an axis of symmetry running horizontally between layers 0 and 1. In this case the LCA M = <ℤ,ℝ²,W> realized by that network is isomorphic to an LCA M′ = <ℤ ⊕ ℤ₂, ℝ, W′> for W′ appropriately defined. This is true since the standard term symmetric, applied to a matrix, means that the operator represented by that matrix has a group of automorphisms isomorphic to ℤ₂. In general, if S′ is a group of automorphisms of a single cell, then an LCA with structure group S composed of those cells is also an LCA with structure group S ⊕ S′ and cells of lower dimension.

4.4 LCAs and I/O Analysis of Linear Systems

The reader familiar with linear systems theory as presented in Chen [1970], or Schwarz and Friedland [1965], for example, will have recognized that the formal view of LCAs developed above is essentially the same as that used for the I/O analysis of a relaxed time-invariant linear system. We have given the underlying group the interpretation of space instead of time, and have not restricted it to ℝ or ℤ. The spatial uniformity required of a cellular automaton is formally identical to time invariance. Theorem 3.1 says that the global transition function of a cellular automaton commutes with a group of shift operators. Similarly, the definition of a time-invariant system requires the system operator to commute with a set of shift operators that forms a group isomorphic to ℝ or ℤ. Combining this with linearity, as we do in

Def. 4.1, gives rise to convolution in exactly the same way that it usually arises. H is formally the same as the usual temporal impulse response matrix. The spatial analog of causality would have the neighborhood s+X of cells extending only to "one side" of cell s, e.g., for S = ℤ, if x ∈ X then x < 0.

This formal identity of concepts, however, should not obscure the different interpretations given to them. Mathematically, of course, whether S is treated as space or time or anything else makes no difference (except that usually S is required to be singly generated, and the neighborhood one-sided, for the temporal interpretation). Any theorem is true for any interpretation. It is common practice, in fact, not to distinguish between space and time and to study "instantaneous" operators. Hennie [1961] does this for some cellular automata by looking at "space-time diagrams". We could formulate an LCA, for example, as a single convolution operation based on the group S×ℤ. Our view of an LCA as an autonomous sequential machine, however, means that we view its operation as an iteration in time of a convolution operation. At each time step, formula (4.2-3) is computed. (It should be clear by now that iteration means invariance with respect to certain shift operators: either spatial, as in "iterative network", or temporal, as in "iterative process".) Thus we have already taken advantage of the time invariance by incorporating it into a state space realization. The advantage of this view is twofold: 1) the explicit appearance of time is eliminated, and 2) concepts of automata theory are immediately applicable.

In light of the observations above, it is interesting to note that the "neighborhood standardization" technique presented by Yamada and

Amoroso [1971] can be viewed as a kind of "spatial realization". They show that a cellular automaton with an arbitrary neighborhood template is isomorphic to a cellular automaton in which each cell has only immediately adjacent neighbors. In the terms of this analogy, the usual idea of realization involves reducing the "temporal neighborhood" to the immediately adjacent "earlier" cell, so that the resulting state transition function is very local in time.

4.5 Examples

This section consists of examples of LCAs that have been studied in other contexts or have been used as models of natural systems. The purpose is to demonstrate the diversity of these kinds of structures and to indicate some of the many connections this study has with other areas of interest.

Example 4.3 Finite difference approximation to the heat equation.

The heat equation:

    ∂u/∂t = k ∂²u/∂x²,  u = u(x,t),
    u(x,0) = f(x) for −∞ < x < ∞, 0 ≤ t.    (4.5-1)

This equation is the classical formalism for modeling diffusion on a one-dimensional strip. In this case the strip is infinitely long, since we have not imposed any boundary conditions. The variables x and t are in ℝ and have the interpretations of space and time respectively, and u(x,t) gives the model's value for the temperature at point x and time t. To approximate the solution using a finite difference scheme, these variables are discretized and the solution computed at equally spaced points x_i, i ∈ ℤ, where x_{i+1} − x_i = Δx. (Of course, in practice only a finite

number of points can be considered.) Time is discretized also, so the solution is considered only at points t_i, i ∈ ℤ, where t_{i+1} − t_i = Δt. The second derivative can be approximated by

    ∂²u/∂x² ≈ [u(x+Δx,t) − 2u(x,t) + u(x−Δx,t)] / Δx²

and the first derivative by

    ∂u/∂t ≈ [u(x,t+Δt) − u(x,t)] / Δt.

Thus (4.5-1) is written

    [u(x,t+Δt) − u(x,t)] / Δt = (k/Δx²)[u(x+Δx,t) − 2u(x,t) + u(x−Δx,t)]

or

    u(x,t+Δt) = (kΔt/Δx²) u(x+Δx,t) + (1 − 2kΔt/Δx²) u(x,t) + (kΔt/Δx²) u(x−Δx,t).

This is a two-dimensional linear difference equation and can be thought of as specifying the LCA M = <ℤ,ℝ,W> shown in Fig. 4.6.

Figure 4.6: An LCA for a finite difference approximation to the heat equation.

If M is started with initial configuration q₀, where q₀(s) = f(sΔx), then the state of any cell s ∈ S at any time t ∈ ℤ⁺ will be an approximation to u(sΔx, tΔt) of the initial value problem (4.5-1).
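As a numerical illustration (ours; the parameter values and names are arbitrary assumptions, not from the text), the following Python sketch iterates this LCA on a finite window of cells, using a periodic boundary as a stand-in for the infinite strip:

    import numpy as np

    k, dx, dt = 1.0, 0.1, 0.004        # so k*dt/dx**2 = 0.4, a stable choice
    r = k * dt / dx**2                 # the coefficient k*(Delta t)/(Delta x)^2

    x = dx * np.arange(-50, 51)
    q = np.exp(-x**2)                  # initial configuration q0(s) = f(s*dx)

    for t in range(100):
        # q'(s) = r*q(s+1) + (1 - 2r)*q(s) + r*q(s-1); np.roll(q, -1)[s] = q[s+1]
        q = r * np.roll(q, -1) + (1 - 2*r) * q + r * np.roll(q, 1)

    print(q.max())                     # the peak decays, as diffusion predicts

The choice r < 1/2 anticipates the stability condition developed in Chapter 5.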

Under many circumstances this will not be a good approximation, since care must be taken in choosing Δx and Δt. We will return to this point in Chapter 5.

Example 4.4 Finite difference approximation to a simplified wave equation (discussed by Gary [1972]).

    ∂u/∂t + c ∂u/∂x = 0,  u = u(x,t),
    u(x,0) = f(x) for −∞ < x < ∞, 0 ≤ t.    (4.5-2)

The exact solution of this equation is simply u(x,t) = f(x−ct), which represents a wave propagating to the right (toward higher values of x) with velocity c. The shape of f is not changed as it moves. A difference scheme with no truncation error is given by

    u(x, t+Δt) = u(x − cΔt, t).

The LCA M = <ℤ,ℝ,W> indicated by Fig. 4.7 corresponds to this difference scheme. Letting the initial configuration q₀ be given by q₀(s) = f(sΔx), we have that q_t(s) = u(scΔt, tΔt), or, letting cΔt = 1, q_t(s) = u(s, tΔt). In other words, the structure group ℤ of M is scaled so that the correct velocity results. M can be thought of as a simple shift register.

Figure 4.7: LCA for the hyperbolic PDE ∂u/∂t + c ∂u/∂x = 0.

In this case, the discretization produces no error. We will return to this example in Sec. 7.6 when we discuss laminated cellular automata.

Example 4.5 Circulants

Let S = ℤ₄, the cyclic group of order 4. An LCA M = <ℤ₄,K,W> has a network like the one shown in Fig. 4.8 (we have let W(1) = W(2) = 0 to make the diagram simpler).

Figure 4.8: An LCA M = <ℤ₄,K,W>.

If the global transition function δ is written in matrix form, it looks like this:

    [ q′(0) ]   [ W(0) W(1) W(2) W(3) ] [ q(0) ]
    [ q′(1) ] = [ W(3) W(0) W(1) W(2) ] [ q(1) ]
    [ q′(2) ]   [ W(2) W(3) W(0) W(1) ] [ q(2) ]
    [ q′(3) ]   [ W(1) W(2) W(3) W(0) ] [ q(3) ]

The first row of the matrix is the weighting function, and the first column is the impulse response function (i.e. H(0) = W(0), H(1) = W(3), etc.). A matrix having this kind of cyclical structure is commonly called a circulant, or a discrete convolution matrix, in the engineering literature (e.g. Waltz [1966]). Mathematical approaches formalize this structure as the regular matrix representation of an element of the group algebra of ℤ₄ (see Curtis and Reiner [1962] or Boerner [1970]).

Example 4.6 Cellular automata that can reproduce arbitrary patterns

Amoroso and Cooper [1971] show that cellular automata can be designed which reproduce patterns in a sense to be defined below. Their systems are examples of LCAs which have an underlying finite field K. We'll consider only the one-dimensional case here.

Let M be an LCA with structure group ℤ and let q be any configuration of M with finite support Y ⊂ ℤ with |Y| = n. A configuration q′ contains k copies of the pattern found in q if there is a k-tuple of integers (s₁,…,s_k) such that

    ∩_{i=1}^{k} (s_i + Y) = ∅

and (L_{s_i} q′)|Y = q|Y for i = 1,…,k, i.e., q(y) = q′(s_i + y) for all y ∈ Y. The k copies are contained in a quiescent environment if the support of q′ is

    ∪_{i=1}^{k} (s_i + Y).

M reproduces the pattern found in q if, from initial configuration q, a configuration is reached that contains two or more copies of the pattern.*

*These definitions are from Amoroso and Cooper [1971], p. 459.

One result of Amoroso and Cooper can be stated as follows: the LCA M = <ℤ,ℤ_p,W> shown in Fig. 4.9, where, for p a prime, ℤ_p is the field of integers modulo p, will reproduce the pattern found in any configuration q. The resulting copies will be in quiescent environments.

Figure 4.9: An LCA that reproduces arbitrary patterns.

We give a proof of this result that is essentially the same as theirs but is based on the following more general ideas. Suppose M is an LCA <S,K,W> where W: S → Mat_K(1,1) ≅ K, i.e. the cells have one-dimensional state spaces. Let X be the neighborhood template of M, i.e. X is the support of W. Define, for each x ∈ X, w_x: S → K by

    w_x(s) = W(x) for s = x, and w_x(s) = 0 otherwise.

In other words, each w_x has just one non-zero value, which is at point x and equal to W(x). For each x ∈ X, let M_x be the LCA <S,K,w_x>, with global transition function denoted by δ_x. Each M_x is an LCA in which each cell has only a single neighbor. Fig. 4.10 shows M₀ and M₋₁ for the LCA in Fig. 4.9.

Figure 4.10: The LCAs M₀ and M₋₁ associated with the LCA shown in Fig. 4.9.

It is clearly not true in general that

    δᵗ(q) = Σ_{x∈X} δ_xᵗ(q)

for any configuration q and time t. The linearity of M implies that this is true for t = 1, but it is not necessarily true, even with linearity, for t > 1. It is not generally possible simply to run the M_x separately and then superimpose the results to find what M would have computed. However, under certain restrictions it is possible.

Theorem 4.1
Let M be an LCA <S,K,W> where S is abelian and K has characteristic p > 0. With the LCAs M_x, x ∈ X, defined as above, it is true that

    δ^{pᵐ}(q) = Σ_{x∈X} δ_x^{pᵐ}(q)

for any q ∈ Q and m ∈ ℤ⁺.

Proof: It is well known (e.g. Rotman [1973], p. 154) that the binomial theorem leads to the fact that for a,b ∈ K, where K has characteristic p > 0,

    (a+b)^{pᵐ} = a^{pᵐ} + b^{pᵐ} for m > 0.

This is clearly also true for any finite sum. Recall that the global transition function δ of M can be written in operator form as

    δ = Σ_{x∈X} W(x) L_x.

Since S is abelian, L_r L_s = L_s L_r, and we can apply the fact stated above so that

    δ^{pᵐ} = (Σ_{x∈X} W(x) L_x)^{pᵐ} = Σ_{x∈X} (W(x) L_x)^{pᵐ} = Σ_{x∈X} W(x)^{pᵐ} L_x^{pᵐ}.

Thus

    δ^{pᵐ}(q) = Σ_{x∈X} (W(x) L_x)^{pᵐ}(q) = Σ_{x∈X} δ_x^{pᵐ}(q),

since δ_x = W(x)L_x is the (linear) global transition function of M_x.
Q.E.D.

The Amoroso-Cooper result can be seen to follow from Theorem 4.1 as follows: since ℤ_p has characteristic p > 0, the LCAs M₀ and M₋₁ shown in Fig. 4.10 can be run separately, each starting with configuration q. For any m > 0, their configurations after pᵐ time steps can be superimposed to produce what M produces in pᵐ time steps. M₀ does nothing to q, since δ₀ is the identity operator. M₋₁ computes L₋₁^{pᵐ}(q), i.e. it shifts q to the right pᵐ cells. Superimposing these results for m large enough to eliminate any pattern overlap produces a configuration containing two copies, each in a quiescent environment, of the pattern in q.
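A small simulation makes this concrete. The sketch below is ours: it takes both weights of the Fig. 4.9 LCA to be 1, and the window size and pattern are arbitrary assumptions.

    import numpy as np

    p = 2
    q = np.zeros(32, dtype=int)
    q[2:6] = [1, 0, 1, 1]                 # an arbitrary pattern with support Y

    for t in range(p**4):                 # run for p^m = 16 steps
        q = (q + np.roll(q, 1)) % p       # q'(s) = q(s) + q(s-1)  (mod p)

    print(q)   # two copies of 1 0 1 1, sixteen cells apart, each in a
               # quiescent environment, exactly as the theorem predicts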

Amoroso and Cooper show a similar result for S = ℤ² and conjecture that it is possible for any finite dimension, i.e. for S = ℤᵈ. This conjecture was proven by Ostrand [1971]. By Theorem 4.1 and an argument similar to the one above, these results are clearly true for an arbitrary countable abelian group S. Theorem 4.1 expresses the essential fact underlying all of these results.

Finally, we note that pattern reproduction for arbitrary finite dimension can be accomplished using a simpler scheme than that presented by Ostrand. Simply construct a laminated LCA (see Sec. 7.6 below) using the LCA shown in Fig. 4.9. Fig. 4.11 shows the scheme for S = ℤ².

Figure 4.11: A laminated LCA that reproduces arbitrary patterns for S = ℤ².

Since there is no interference between horizontal slices in this space, any two-dimensional pattern will produce a copy of itself that

will appear to the right of the original position. The scheme presented by Ostrand produces two copies: one to the right and one above. A similar lamination of the LCA of Fig. 4.11 will produce a scheme for S = ℤ³.

Chapter 5

Linear Cellular Automata and Harmonic Analysis

5.1 Introduction

In the case of linear cellular automata, the simplicity with which the structure is specified, i.e. the fact that the corresponding networks are iterative, can be translated into simplicity of behavior. This is accomplished by using Fourier analysis, which is applicable in this case for the same reasons it is in other contexts: homogeneity of structure (e.g. time invariance and/or spatial invariance) together with linearity gives rise to the operation of convolution, which can be uncoupled by the Fourier transform. This situation is found in the I/O analysis of time-invariant linear systems (e.g. signal processing), in the study of linear PDEs with constant coefficients, and in models of lenses and imaging systems in optics (see Andrews [1970]).

Often, though, harmonic analysis techniques are learned in a specific context, and an understanding of the principles involved becomes tied to a specific application. Among the misconceptions arising in this way is the belief that functions need to be real or complex valued functions of one or several real variables in order for harmonic analysis techniques to apply. Less standard applications tend to be viewed as approximations to the continuous case or as necessarily derived from the continuous case. As a result, it is often felt that no real working knowledge and understanding can be achieved unless one feels comfortable with integrals, continuity, convergence properties, distributions, and other concepts necessary for dealing with functions of a real variable. That these beliefs are unjustified is becoming clear as elements of abstract harmonic analysis become known in applications-oriented areas.

Much of the interest in this approach is due to the greatly increased activity in harmonic analysis following the introduction of the Fast Fourier Transform (FFT) algorithm in 1965 for computing the Discrete Fourier Transform (DFT). The DFT can be understood completely within a purely algebraic framework that is a simple special case of a theory in which functions of locally compact topological groups are considered. The Fourier transform, Fourier series, Z-transform, the DFT, and even the Walsh transform are special cases in a unifying theory in which the structure of abelian groups plays a central role.

In this chapter we will precisely state the conditions under which harmonic analysis is applicable to LCAs and interpret the results in automata theoretic terms. Our theorems will be illustrated by several examples which serve to unify previously known techniques.

The literature pertaining to harmonic analysis and its applications is very extensive, since uniform, linear structures are ubiquitous as models of natural systems. Bracewell [1965] provides an intuitively satisfying view of Fourier series and transforms. The DFT and FFT are well covered in the engineering literature, such as the IEEE special issues on the FFT [1967] and [1969], Bergland [1969], or Cooley, Lewis, and Welch [1967]. For the more computer-oriented literature on digital signal processing, see Gold and Rader [1969] or Andrews [1970]. The Walsh transform is covered by Harmuth [1972]. Nicholson [1971] provides a concise algebraic development of the more abstract approach and gives an algebraic view of the FFT, as does Cairns [1971]. The mathematical literature is represented by Rudin [1962] and Hewitt and Ross [1963].

5.2 The Discrete Fourier Transform

In this section we will define in a general form what has come to be known as the discrete or finite Fourier transform (DFT). Although the DFT is usually defined over the complex numbers and for simple cyclic groups, we give a definition that allows certain finite fields and arbitrary finite abelian groups. Sec. 5.5 specializes the definition to familiar cases. We follow the presentation given by Nicholson [1971] and Aho, Hopcroft, and Ullman [1974].

Let G be a finite abelian group with |G| = N. We will denote the group operation by '+'. The well-known fundamental structure theorem for abelian groups (Rotman [1973]) says that G is isomorphic to a direct sum of cyclic groups:

    G ≅ Z_{n₁} ⊕ ⋯ ⊕ Z_{n_d}, where Π_{j=1}^{d} n_j = N and n_j | n_{j+1}.    (5.2-1)

Here Z_{n_j} denotes the cyclic group of order n_j, which can be conveniently thought of as represented by the integers with modulo n_j addition. We will call a decomposition as in (5.2-1) a canonical decomposition of G. Hence G can be identified with the set of d-tuples of integers {(s₁,…,s_d) | 0 ≤ s_j < n_j} with addition modulo (n₁,…,n_d). It will occasionally be convenient to denote an element s ∈ G as (s₁,…,s_d). The least integer m such that ms = s + ⋯ + s (m times) = 0 for every s ∈ G is the exponent of G. For G decomposed according to (5.2-1), m = n_d.

A primitive or principal pth root of unity is an element ω ∈ R which generates a multiplicative cyclic group of order p. In other words, ω is such that 1) ωᵖ = 1 and 2) ωʲ ≠ 1 for 0 < j < p.

The most familiar example of a principal pth root of unity is e^{2πi/p}, where i = √−1, in the ring of complex numbers. The complex exponential functions φ_x(t) = e^{ixt}, into which a large class of complex-valued functions of ℝ can be resolved by the usual Fourier transform, are special examples of functions called group characters. For the discrete case considered here, the group characters can be defined using only the decomposition (5.2-1) of G and roots of unity in the integral domain R. Suppose that R contains a principal mth root of unity, where m is the exponent of G (here m = n_d). Let ω be this root, and recall that F(G,R) = {f | f: G → R}. A set of group characters is defined as follows:

Definition 5.1
The set of group characters of G in R is the set Φ(G,R) = {φ_s | s ∈ G} ⊆ F(G,R), where

    φ_{(s₁,…,s_d)}((r₁,…,r_d)) = Π_{j=1}^{d} ω^{s_j r_j n_d/n_j}    (5.2-2)

for all (s₁,…,s_d), (r₁,…,r_d) ∈ G.

If we further assume that N⁻¹ ∈ R, where we take N = 1+1+⋯+1 (N times), then the following properties, which are well known for the case R = ℂ, can be proven (see Cairns [1971], Nicholson [1971], and Aho, Hopcroft, and Ullman [1974]):

1) Each φ_s ∈ Φ(G,R) is a homomorphism from G into the multiplicative monoid of R, i.e. φ_s(α+β) = φ_s(α)φ_s(β) for all α,β ∈ G.

In fact, Φ(G,R) is the set of all such homomorphisms, excluding the one identically equal to 0; see Hewitt and Ross [1963], p. 345. Nicholson [1971] proves this for integral domains.

2) Φ(G,R) is a group with group operation '·' given by pointwise multiplication of functions, i.e. (φ_s · φ_r)(α) = φ_s(α)φ_r(α) for all s,r,α ∈ G.

3) Φ(G,R) with the operation given in 2) is isomorphic to G under the isomorphism s ↦ φ_s. Thus φ_s(α)φ_r(α) = φ_{s+r}(α) for all s,r,α ∈ G.

4) φ_s(r) = φ_r(s) for all r,s ∈ G.

5) Each character is an eigenfunction of every shift operator L_s, s ∈ G (see Sec. 3.2). In particular, L_s φ_r = φ_r(s) φ_r for all s,r ∈ G.

6) The characters are orthogonal in the sense that

    (1/N) Σ_{α∈G} φ₋ₛ(α) φ_r(α) = 1 if r = s, and 0 otherwise.

7) Any f ∈ F(G,R) can be written as a unique linear combination of the characters as follows:

    f = (1/N) Σ_{α∈G} f̂(α) φ₋α, where f̂(α) = Σ_{β∈G} f(β) φ_α(β) for all α ∈ G.

The equations in 7) can be taken to define an invertible operator on F(G,R), which we will call the discrete Fourier transform with respect to G. We make this fact precise by stating it as a theorem. For a

proof of the existence and invertibility of the DFT in this general setting, see Nicholson [1971].

Theorem 5.1
Let G be a finite abelian group with |G| = N and exponent m. Let R be an integral domain containing N⁻¹ and a principal mth root of unity. There is an invertible operator F_G on F(G,R), called the discrete Fourier transform w.r.t. G, given by

    F_G f = g, where g(α) = Σ_{β∈G} f(β) φ_α(β) for all α ∈ G,    (5.2-3)

where the φ_α are the characters of G in R. Its inverse, denoted by F_G⁻¹, is given by

    F_G⁻¹ g = f, where f(β) = (1/N) Σ_{α∈G} g(α) φ₋α(β) for all β ∈ G.    (5.2-4)

This theorem gives one of several forms that the DFT can take. Cooley, Lewis, and Welch [1969], for example, normalize the group characters so that the factor 1/N appears in the forward transform rather than the inverse. Note also that from fact 3) above we can conclude that for every s,α ∈ G, φ_s(α) has a multiplicative inverse in R. This is because φ_s(α) φ₋ₛ(α) = φ_e(α), where e is the identity of G, and from Def. 5.1 it can be verified that φ_e(α) = 1 for all α ∈ G. Thus (φ_s(α))⁻¹ = φ₋ₛ(α) and, using fact 4), φ₋α(β) = (φ_β(α))⁻¹. Expression (5.2-4) is therefore equivalent to

    F_G⁻¹ g = f, where f(β) = (1/N) Σ_{α∈G} g(α) (φ_β(α))⁻¹.
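As a concrete check of Theorem 5.1 (our numerical sketch, using the familiar case R = ℂ and the group G = ℤ₂ ⊕ ℤ₄; the variable names are illustrative assumptions), the character table (5.2-2) can be built directly and its invertibility verified:

    import itertools
    import numpy as np

    n = (2, 4)                         # canonical decomposition: G = Z_2 (+) Z_4
    m = n[-1]                          # the exponent of G
    omega = np.exp(2j * np.pi / m)     # a principal m-th root of unity in C
    G = list(itertools.product(range(n[0]), range(n[1])))
    N = len(G)

    def phi(s, r):
        # the character phi_s evaluated at r, per (5.2-2)
        out = 1.0
        for sj, rj, nj in zip(s, r, n):
            out *= omega ** (sj * rj * (m // nj))
        return out

    F = np.array([[phi(s, r) for r in G] for s in G])   # forward DFT matrix
    F_inv = F.conj() / N               # over C, phi_{-s} is the conjugate of phi_s

    assert np.allclose(F_inv @ F, np.eye(N))            # F_G is invertible

The conjugation step uses the fact, noted above for R = ℂ, that (φ_s(α))⁻¹ = φ₋ₛ(α); the matrix is symmetric by fact 4), so no transpose is needed.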

Finally, note that F(G,R) is a module with the usual operations of function addition and multiplication by constants, and that F_G is a module homomorphism or, in case R is a field, a linear operator.

The discrete Fourier transform given in Theorem 5.1 has the same intuitive meaning as the more familiar Fourier transform and Fourier series. The group characters correspond to the complex exponential functions. Every function in F(G,R) is a superposition of the group characters, where each character in the sum is multiplied by an element of R. A Fourier coefficient f̂(α) gives the "amplitude" of the character φ₋α in the sum (in other formulations of the DFT, f̂(α) gives the amplitude of φ_α). We will often say that f̂ is the "frequency domain" representation of f. While the term "frequency" may not retain its literal significance in the more abstract cases, it can be given the following interpretation. Each group character φ_s is a homomorphism from G into the multiplicative structure of R. Therefore the kernel of φ_s is the subset of G on which φ_s is 1, and, since φ_s is a group homomorphism, this subset is a subgroup of G. Subgroups of an abelian group G consist of "regularly spaced" elements of G. One can therefore think of the frequency of a character in terms of the "spacing" of its kernel: low frequencies have widely spaced kernels, and vice versa.

In what follows we will always use one of two notational conventions to indicate the DFT w.r.t. G: 1) as above, F_G and F_G⁻¹ will denote the operator and its inverse; 2) for any function f, f̂ will denote F_G f in cases where it is clear what group is involved. For longer expressions the hat will be placed following the expression, e.g. (f *_S g)^ means F_G(f *_S g). The integral domain R underlying these operations will be clear from the context.

5.3 The Convolution Theorem

The significant fact about the Fourier transform, whether the continuous or the discrete one, is that it simplifies convolution. Letting G and R satisfy the hypotheses of Theorem 5.1, the convolution with respect to G of two functions f,g ∈ F(G,R) is the function f *_G g ∈ F(G,R) given by

    (f *_G g)(s) = Σ_{r∈G} f(r) g(s−r) for all s ∈ G.

Convolution, then, is a way to "multiply" functions in F(G,R). Another way to multiply functions is to multiply them pointwise. We'll use the symbol '·' to indicate this operation on F(G,R), i.e. for f,g ∈ F(G,R), f·g is the function given by

    (f·g)(s) = f(s)g(s) for all s ∈ G.

Each of these operations makes F(G,R) into a (different) commutative algebra: the operation given by '*_G' or '·' in addition to the usual module structure of pointwise addition and scalar multiplication. The algebra with the convolution operation is called the group algebra or group ring. The convolution theorem says that F_G is an isomorphism from the group algebra to the pointwise multiplication algebra. Since we know that F_G is an isomorphism of modules, we need only that it preserve the "multiplication" operation of the algebras. That this is true is well known for R = ℂ and is proven in the general form by Nicholson [1971].

Theorem 5.2 (Convolution theorem)
Let R and G satisfy the hypotheses of Theorem 5.1. Then for any f,g ∈ F(G,R) we have that

    (f *_G g)^ = f̂ · ĝ.

In other words, for convolution and the DFT both based on the same group G and integral domain R, the DFT of the convolution of two functions is the pointwise product of the DFTs of the functions.

5.4 Extension to Vector-valued Functions

It is convenient at this point to describe the well-known extensions of the DFT and the convolution theorem to vector-valued functions. Consider the space of functions F(G,Rⁿ), where Rⁿ is a module with the usual component-wise operations. Assume that G and R satisfy the hypotheses of Theorem 5.1. The DFT can be extended to F(G,Rⁿ) by applying the operator in (5.2-3) to component functions. For any f ∈ F(G,Rⁿ), recall the convention of writing f(s) = (f₀(s),…,f_{n−1}(s)) where f_i ∈ F(G,R). Define F_Gⁿ: F(G,Rⁿ) → F(G,Rⁿ) by

    F_Gⁿ(f) = (F_G f₀,…,F_G f_{n−1}) = (f̂₀,…,f̂_{n−1}) = f̂.    (5.4-1)

Since F_G is a bijection on F(G,R), F_Gⁿ is a bijection on F(G,Rⁿ), and its inverse is defined in the obvious way, i.e.

    (F_Gⁿ)⁻¹(f) = (F_G⁻¹ f₀,…,F_G⁻¹ f_{n−1}).    (5.4-2)

Let Mat_R(n,n) denote the set of n×n matrices with entries in R and let A: G → Mat_R(n,n). For any f ∈ F(G,Rⁿ), let A *_G f: G → Rⁿ be given by

    (A *_G f)(s) = Σ_{r∈G} A(r) f(s−r) for all s ∈ G.    (5.4-3)

This is what we'll mean by convolution for F(G,Rⁿ). Note that it is

not a way to "multiply" functions in F(G,Rⁿ), as it is for n = 1, since A is a matrix-valued function. The extended DFT, however, simplifies this generalization of convolution if we define the DFT of a matrix-valued function and the analog of pointwise multiplication in the following ways. For any A: G → Mat_R(n,n), we will write A(s) = (a_ij(s)), s ∈ G. Letting â_ij = F_G a_ij, 0 ≤ i,j < n, define Â: G → Mat_R(n,n) by Â = (â_ij). For f ∈ F(G,Rⁿ), let A·f denote the function in F(G,Rⁿ) given by

    (A·f)(s) = A(s)f(s) for all s ∈ G.    (5.4-4)

The generalization of pointwise multiplication, then, is pointwise action of matrices on vectors. In this framework, the convolution theorem appears in the following form. Since a concise proof of this generalization seems to be lacking in the elementary literature, we provide one here even though the result is well known (at least for R = ℂ).

Theorem 5.3
Let R and G satisfy the hypotheses of Theorem 5.1. Then for any f ∈ F(G,Rⁿ) and A: G → Mat_R(n,n), we have that

    (A *_G f)^ = Â · f̂.

Proof: According to (5.4-1) we need only show that

    F_G[(A *_G f)_i] = (Â·f̂)_i for i = 0,…,n−1.

From (5.4-3) we have, for s ∈ G and i = 0,…,n−1, that

    (A *_G f)_i(s) = Σ_{r∈G} [A(r) f(s−r)]_i
                   = Σ_{r∈G} Σ_{j=0}^{n−1} a_ij(r) f_j(s−r)
                   = Σ_{j=0}^{n−1} Σ_{r∈G} a_ij(r) f_j(s−r)
                   = Σ_{j=0}^{n−1} (a_ij *_G f_j)(s).

Hence (A *_G f)_i = Σ_{j=0}^{n−1} a_ij *_G f_j, i = 0,…,n−1. Now, since F_G is a module homomorphism, one can write

    F_G[(A *_G f)_i] = F_G[Σ_{j=0}^{n−1} a_ij *_G f_j] = Σ_{j=0}^{n−1} F_G(a_ij *_G f_j) = Σ_{j=0}^{n−1} â_ij · f̂_j

by Theorem 5.2. Making the argument s explicit, we have that

    Σ_{j=0}^{n−1} (â_ij · f̂_j)(s) = Σ_{j=0}^{n−1} â_ij(s) f̂_j(s) = (Â(s) f̂(s))_i = (Â·f̂)_i(s)

for all s ∈ G. Therefore F_G[(A *_G f)_i] = (Â·f̂)_i, i = 0,…,n−1.
Q.E.D.
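Theorem 5.3 is easy to check numerically. The sketch below is ours; it uses G = ℤ₈, n = 2, R = ℂ, and numpy's FFT, whose sign convention differs from (5.2-3) only in which principal root of unity is chosen, which does not affect the theorem.

    import numpy as np

    rng = np.random.default_rng(0)
    N, n = 8, 2
    A = rng.standard_normal((N, n, n))   # A(r): an n x n matrix for each r in G
    f = rng.standard_normal((N, n))      # f(s): a vector in R^n for each s in G

    # (A *_G f)(s) = sum over r of A(r) f(s-r), with s-r taken in Z_8
    conv = np.array([sum(A[r] @ f[(s - r) % N] for r in range(N))
                     for s in range(N)])

    A_hat = np.fft.fft(A, axis=0)        # entrywise DFT of the a_ij
    f_hat = np.fft.fft(f, axis=0)        # componentwise DFT of the f_j
    pointwise = np.einsum('sij,sj->si', A_hat, f_hat)   # (A_hat . f_hat)(s)
    assert np.allclose(np.fft.fft(conv, axis=0), pointwise)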

The reader may note that for n = 1, Theorems 5.2 and 5.3 are identical. In fact, in a more abstract presentation the more general situation would be discussed directly in terms of the theory of multilinear operators, and Theorem 5.2 would be presented as a special case of Theorem 5.3. We will therefore use the notation '*_G' and '·' for both the scalar-valued and vector-valued cases. We will also write F_G for F_Gⁿ, and F_G⁻¹ for (F_Gⁿ)⁻¹, when it is clear from the context that an operator on F(G,Rⁿ) is required.

5.5 Examples of the Discrete Fourier Transform and Convolution

In order to emphasize the generality of Theorems 5.1 and 5.2, we present several examples. The first is the most commonly described form of the DFT and convolution: R = ℂ and G a cyclic group. The last example is unfamiliar to most people and has intriguing possibilities for modeling certain natural systems.

Example 5.1
The most common form of the DFT is for the case when G = ℤ_N, the finite group of integers with modulo N addition, and R = ℂ, the complex numbers. The exponent of ℤ_N is N, and a principal Nth root of unity in ℂ is e^{2πi/N}. Following Def. 5.1, we have that

    φ_k(j) = e^{2πikj/N} for all j,k ∈ ℤ_N.

Thus we have that

    f̂(k) = Σ_{j=0}^{N−1} f(j) e^{2πikj/N}, k = 0,…,N−1,

and

    f(j) = (1/N) Σ_{k=0}^{N−1} f̂(k) e^{−2πikj/N}, j = 0,…,N−1.

In this case it can be seen that φ₋ₛ(α) = φ̄_s(α) for all s,α ∈ ℤ_N, where the bar indicates complex conjugation. The orthogonality relation given by fact 6) of Sec. 5.2 can therefore be stated in the more familiar form

    (1/N) Σ_{α∈G} φ̄_s(α) φ_r(α) = 1 if r = s, and 0 otherwise.

This is true whenever the underlying integral domain is ℂ. Convolution with respect to ℤ_N takes the form

    (f *_{ℤ_N} g)(k) = Σ_{j=0}^{N−1} f(j) g(k−j), k = 0,…,N−1.

The minus sign indicates subtraction modulo N, i.e. it refers to the operation of the underlying group. This is often called circular or wrapped convolution, since ℤ_N is a simple cyclic group (cf. Fig. 4.8).

A matrix interpretation of this DFT and convolution may help to clarify the concepts involved. While it is possible to represent the operator F_G as a matrix for any finite abelian group G, we will only show what happens for this special case. A function f: ℤ_N → ℂ is (naturally isomorphic to) a vector (f(0),f(1),…,f(N−1)) in ℂᴺ. F_{ℤ_N} can therefore be expressed as an N×N matrix, which we'll call U, having entry e^{2πikj/N} in the kth row and jth column. The inverse transform is represented by (1/N)U†, where U† is the conjugate transpose of U. For example, we have for N = 2:

    U = [ 1  1 ]    and    U⁻¹ = (1/2) [ 1  1 ]
        [ 1 −1 ]                       [ 1 −1 ],

and for N = 4:

    U = [ 1  1  1  1 ]    and    U⁻¹ = (1/4) [ 1  1  1  1 ]
        [ 1  i −1 −i ]                       [ 1 −i −1  i ]
        [ 1 −1  1 −1 ]                       [ 1 −1  1 −1 ]
        [ 1 −i −1  i ]                       [ 1  i −1 −i ].

If the rows and columns of U are labeled so that by the sth row of U we mean the same thing as the sth coordinate of a vector in ℂᴺ, then U is an N×N matrix whose sth row is the character φ_s viewed as a vector in ℂᴺ. The matrix U⁻¹ = (1/N)U† represents a change of basis for ℂᴺ in which the new basis vectors are its columns; thus the sth basis element is a normalized group character.

We now give a concrete matrix interpretation of the convolution theorem which makes that result particularly easy to understand. Suppose N = 4, f = (f(0),…,f(3)), g = (g(0),…,g(3)), and h = f *_{ℤ₄} g. Convolving with any fixed f is a linear operation on ℂ⁴ and thus has a matrix representation, which we'll call A_f. It is easy to check that in this case A_f is a 4×4 circulant (see Ex. 4.5):

    A_f = [ f(0) f(3) f(2) f(1) ]
          [ f(1) f(0) f(3) f(2) ]
          [ f(2) f(1) f(0) f(3) ]
          [ f(3) f(2) f(1) f(0) ].

We then have that h = A_f g. Pointwise multiplication by any fixed vector Uf is also a linear operation on ℂ⁴ and has a diagonal matrix representation D_f with the function (vector) Uf along the diagonal, i.e.

    D_f = [ (Uf)(0)     0        0        0     ]
          [    0     (Uf)(1)     0        0     ]
          [    0        0     (Uf)(2)     0     ]
          [    0        0        0     (Uf)(3)  ].

Using this observation, Theorem 5.2 says that UA_f g = D_f Ug or, recalling that U is invertible,

    A_f g = U⁻¹ D_f U g = (1/4) U† D_f U g.

Thus, the convolution theorem in this case says that for any f ∈ ℂ⁴, A_f is similar to the diagonal matrix D_f under the similarity transformation represented by U: A_f = U⁻¹ D_f U. The significant point, however, is that the same change of basis diagonalizes the matrix A_f for all vectors f. In other words, all matrices that have a structure like A_f can be diagonalized by the same change of basis. The particular entries in the matrix are not important, only their arrangement or pattern.
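The following sketch (ours; the random entries are an arbitrary assumption) verifies this matrix picture for N = 4: the same U diagonalizes every circulant A_f.

    import numpy as np

    N = 4
    K = np.arange(N)
    U = np.exp(2j * np.pi * np.outer(K, K) / N)      # U[k, j] = e^{2 pi i k j / N}
    U_inv = U.conj().T / N                           # (1/N) U-dagger

    f = np.random.default_rng(1).standard_normal(N)
    A_f = np.array([[f[(r - c) % N] for c in range(N)] for r in range(N)])

    D_f = np.diag(U @ f)                             # the Fourier coefficients of f
    assert np.allclose(A_f, U_inv @ D_f @ U)         # A_f = U^{-1} D_f U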

Example 5.2
Let R = ℂ and let G be a finite abelian group which decomposes so that

    G ≅ Z_{n₁} ⊕ ⋯ ⊕ Z_{n_d}.

Writing elements of G as d-tuples of integers, we have that

    f̂(k₁,…,k_d) = Σ_{j₁=0}^{n₁−1} ⋯ Σ_{j_d=0}^{n_d−1} f(j₁,…,j_d) e^{2πi(k₁j₁/n₁ + ⋯ + k_d j_d/n_d)}

for all (k₁,…,k_d) ∈ G, and

    f(j₁,…,j_d) = (1/N) Σ_{k₁=0}^{n₁−1} ⋯ Σ_{k_d=0}^{n_d−1} f̂(k₁,…,k_d) e^{−2πi(k₁j₁/n₁ + ⋯ + k_d j_d/n_d)}

for all (j₁,…,j_d) ∈ G. This is the general form of what is usually called the d-dimensional DFT. Convolution in this case takes the following form:

    [f *_G g](k₁,…,k_d) = Σ_{j₁=0}^{n₁−1} ⋯ Σ_{j_d=0}^{n_d−1} f(j₁,…,j_d) g(k₁−j₁,…,k_d−j_d)

for all f,g ∈ F(G,ℂ) and (k₁,…,k_d) ∈ G, where k_i − j_i is subtraction modulo n_i.

Example 5.3
The previous examples involved the infinite field ℂ. However, the requirement that R contain a principal mth root of unity does not imply that R must be infinite. Let R be the finite field with p elements, where p is prime. Then R ≅ ℤ_p and contains a principal (p−1)th root of unity, which we'll call ω (see Nicholson [1971]). It is easy to see that if m divides p−1, then ℤ_p contains a principal mth root of unity, namely ω^{(p−1)/m}. Thus F_G and F_G⁻¹ exist when R = ℤ_p and G has exponent m dividing p−1. In particular, for R = ℤ_p and G = ℤ_n, Fourier analysis is applicable when n divides p−1. Nicholson [1971] points out the computational significance of these results. In some cases it is possible to compute the DFT and convolution exactly using integer arithmetic. This fact, together with the fast algorithm for computing the DFT, underlies the Schönhage-Strassen fast integer multiplication algorithm (see Knuth [1969] or Aho, Hopcroft, and Ullman [1974]).

For example, let R = ℤ₅ and G = ℤ₄. It can be verified that 2 is a principal 4th root of unity in ℤ₅, so that the matrix representation

of the DFT (see Ex. 5.1) is as follows:

    U = [ 1  1  1  1 ]
        [ 1  2  4  3 ]
        [ 1  4  1  4 ]
        [ 1  3  4  2 ].

Its inverse U⁻¹ is given by

    U⁻¹ = 4 · [ 1  1  1  1 ]
              [ 1  3  4  2 ]
              [ 1  4  1  4 ]
              [ 1  2  4  3 ],

since 4⁻¹ = 4 in ℤ₅.
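This example can be checked with exact integer arithmetic, as the following sketch (ours) does; note that 3 = 2⁻¹ in ℤ₅, so the inverse matrix has entries 4·3^{kj} mod 5.

    # Exact DFT over R = Z_5 with G = Z_4, using omega = 2.
    p, N = 5, 4
    U = [[pow(2, k * j, p) for j in range(N)] for k in range(N)]
    U_inv = [[(4 * pow(3, k * j, p)) % p for j in range(N)] for k in range(N)]

    # U_inv @ U should be the identity, computed entirely mod 5.
    prod = [[sum(U_inv[r][k] * U[k][c] for k in range(N)) % p
             for c in range(N)] for r in range(N)]
    assert prod == [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]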

Example 5.4
Suppose R = ℂ and G = (ℤ₂)ⁿ. In this case the DFT is called the finite Walsh transform and is receiving more and more attention due to its very interesting properties (see Harmuth [1972]). The most suggestive fact about it is that each group character φ_s, or Walsh wave, has a range consisting only of +1 and −1. This is true because the exponent of G is 2, so the character values are square roots of unity: +1 and −1. Since these functions can be generated easily by digital means (e.g. switching), standard signal processing techniques applied in such areas as radio, TV, radar, and spectroscopy are being recast in terms of this underlying group, with the hope that digital technology may be more easily applied.

The usual way of presenting the finite Walsh transform and the corresponding convolution theorem is through a binary representation of G. Letting ℤ₂ = {0,1} with modulo 2 addition (or, what is the same thing, the logical operation of exclusive-or), the group G can be represented as the set of n-tuples of 0s and 1s with bit-wise exclusive-or as the group operation. This operation is often called dyadic addition, and the group G is often called the dyadic group (Harmuth [1972]). For s,r ∈ G, let s ⊕ r denote the dyadic sum of s and r, and note that dyadic subtraction is the same as dyadic addition. Then dyadic convolution is the same as dyadic correlation and can be written as follows: for f,g ∈ F(G,ℂ),

    (f *_G g)(s) = Σ_{r∈G} f(r) g(s ⊕ r) for all s ∈ G.

Ordering the elements of G as if they were binary representations of integers, the matrix representation of the Walsh transform for n = 2 is

    U = [ 1  1  1  1 ]
        [ 1  1 −1 −1 ]
        [ 1 −1 −1  1 ]
        [ 1 −1  1 −1 ].

For any n, the entries in the matrix representation of the finite Walsh transform are all 1s or −1s, and hence the matrix is real. Since the matrix for any DFT is symmetric, we have that

    U⁻¹ = (1/|G|) U† = (1/2ⁿ) U.

A forward Walsh transform is the same (except for scale) as a backward one. In most applications, the elements of G are viewed as encodings of the integers, so that elements of F(G,ℂ) are thought of as functions of the set {0,1,…,2ⁿ−1} with its usual interpretation as a set of time or space samples.
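Here is a small sketch (ours; the rows appear in binary-counting order, which can differ from the ordering of the matrix above by a row permutation) of the finite Walsh transform and the dyadic convolution theorem:

    import numpy as np
    from itertools import product

    n = 2
    G = list(product((0, 1), repeat=n))              # the dyadic group (Z_2)^n
    U = np.array([[(-1) ** sum(si * ri for si, ri in zip(s, r)) for r in G]
                  for s in G])                       # phi_s(r) = (-1)^(s . r)

    assert (U == U.T).all()                          # the matrix is symmetric
    assert (U @ U == len(G) * np.eye(len(G))).all()  # U^{-1} = U / |G|

    rng = np.random.default_rng(2)
    f, g = rng.integers(-3, 4, 4), rng.integers(-3, 4, 4)
    conv = np.array([sum(f[r] * g[s ^ r] for r in range(4)) for s in range(4)])
    assert ((U @ conv) == (U @ f) * (U @ g)).all()   # (f *_G g)^ = f-hat . g-hat

Note that the dyadic sum is the integer XOR operator here, and that all the arithmetic is exact.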

The clearest example of this "straight-line" representation occurs in the continuous case. Although they are not instances of the results we have stated above, the following facts follow from the general theory of harmonic analysis on locally compact topological groups. Let L²[0,1] denote the space of all square-integrable functions (in the Lebesgue sense) on the real interval [0,1]. Each such function can be expanded as a (perhaps infinite) sum of Walsh functions, with convergence according to the usual L² norm. The first 4 Walsh functions are graphed in Fig. 5.1.

Figure 5.1: The first 4 Walsh functions (from [0,1] to ℝ, with values ±1).

Compare these graphs with the columns (and rows) of U. Although we graph these functions in the familiar way, they are functions of the group (ℤ₂)^∞. This group can be identified naturally with the interval [0,1] but has a different topology. Since the usual group structure of the integers (or reals) often successfully models our notions of time or space, this "straight-line" view can lead to confusion. In particular, it might be felt that there is no convolution theorem for the Walsh transform. There is, of course, but the operation of dyadic addition does not model our usual view of translation in time or space. This does not mean, however, that the dyadic group, and hence dyadic convolution, may not provide a natural and useful model for real-world phenomena.

Two other facts about the finite Walsh transform should be mentioned. First, there is a fast algorithm for computing it which is a special case of the usual FFT. Since the characters have only the values +1 and −1, the algorithm need not involve complex arithmetic and is faster than the usual FFT. Second, although the term Walsh transform usually refers to cases in which the underlying integral domain is ℂ, the DFT on (ℤ₂)ⁿ works for other integral domains. In particular, it works for ℝ and, in accordance with the remarks in Ex. 5.3, for the field ℤ₃ consisting of only 3 elements.

5.6 Generalizations

It is possible to relax both the finiteness and commutativity conditions on the group G and still obtain results similar to those stated above. In fact, the theory of the DFT for R = ℂ is the simplest instance of a theory of harmonic analysis on locally compact topological groups (see Hewitt and Ross [1963]). While we will make no attempt to outline these generalizations precisely, it is possible to mention two major differences that can occur in more complex situations.

First, removing the finiteness condition on G permits G and the character group to be non-isomorphic. In general, the character group is isomorphic to a group G* known as the dual group of G. In case G is finite and abelian, G ≅ G*. This has important consequences, since the Fourier transform (defined appropriately) of a function of G is a function of G* (and, it turns out, vice versa, i.e. (G*)* = G). An example is the case of the usual Fourier series for periodic functions of ℝ. Suppose, for simplicity, the period is 1. Then each function can be viewed as a function of ℝ/ℤ = {x+ℤ | x ∈ ℝ}, which is sometimes called the circle group (see Lang [1967]). The group G is then ℝ/ℤ, but G* ≅ ℤ. This corresponds to the fact that the spectrum of a periodic function must consist of discrete multiples of the fundamental frequency. In other words, the Fourier transform of a periodic function of ℝ is a function of ℤ, and hence the term "series" applies. Although ℝ/ℤ is uncountable, its dual group is countable. (Dually, the Fourier transforms of suitable functions of ℤ are functions of ℝ/ℤ.)

The second kind of complication encountered in the more general theory is that for non-abelian groups the Fourier transform of a scalar-valued function is in general operator-valued. In the matrix framework described in Ex. 5.1, this means that the similarity transformation given by U and U⁻¹ does not necessarily diagonalize all the convolution matrices A_f. Instead of producing a matrix with single elements along the diagonal, it produces one with "blocks" along the diagonal. Therefore, the Fourier transform for non-abelian groups does not necessarily convert convolution into pointwise multiplication.

5.7 Parallel Decomposition of Linear Cellular Automata

In this section we interpret the application of Theorem 5.3 to linear cellular automata in the language of automata theory. This requires us to restrict attention to LCAs with finite abelian structure groups. These LCAs have finite dimensional state spaces and are therefore linear sequential machines (LSMs) in the usual sense (Harrison [1969]). Now, since most studies of LSMs do not contain discussions of module theoretic generalizations (i.e. where the field requirement is relaxed and rings are considered instead), we also restrict attention to the application of the DFT based on an underlying field. Thus we will be discussing a subclass of LCAs as defined by Def. 4.1. The reader should note that the existence of the DFT and convolution theorem for certain rings, as discussed in Sec. 5.2 and Sec. 5.3, leads to natural generalizations of the ideas presented here. However, since our purpose is primarily integrative, we will not pursue the theory in its most general form.

A finite set of autonomous sequential machines M_i = <Q_i,δ_i>, i = 1,…,N, can be regarded as a single sequential machine

    M = < Π_{i=1}^{N} Q_i, Π_{i=1}^{N} δ_i >

where

    (Π_{i=1}^{N} δ_i)(q₁,…,q_N) = (δ₁(q₁),…,δ_N(q_N))

for all (q₁,…,q_N) ∈ Π_{i=1}^{N} Q_i. We say that M is a parallel composition of the M_i. The machines M_i are just viewed as synchronously operating in parallel, i.e. they all change state at the same time and there are no connections between

them. We will write

    M = Π_{i=1}^{N} M_i.

These notions are usually defined for nonautonomous machines, and our results are generalizable to such cases, but we will only consider the autonomous case here (see Hartmanis and Stearns [1966] or Harrison [1969]).

Since each state transition of an LCA is a convolution, as indicated by (4.2-3), the convolution theorem can be applied (in the form given by Theorem 5.3). The resulting pointwise multiplication of functions is the same as the transition function of a parallel composition of simple automata. "Pointwise" and "parallel" mean the same thing: there are no interconnections between the components. This is a simple observation, but we state it as a theorem to precisely connect the terminology of the functional view with that of automata theory.

Theorem 5.4
Let M = <S,Kⁿ,H> be an LCA where S is a finite abelian group with exponent m and K is a field containing a principal mth root of unity. Then M is isomorphic to a parallel composition Π_{s∈S} M̂_s where, for each s ∈ S, M̂_s = <Q̂_s,δ̂_s> with Q̂_s = Kⁿ and δ̂_s = Ĥ(s). That is, M̂_s is an autonomous LSM with an n×n transition matrix given by (ĥ_ij(s)), where the h_ij are the impulse response functions of M (see 4.2-4).

Proof:
If the elements of S are put into a standard order, say S = {s₁,…,s_N} where |S| = N, then the state set of Π_{s∈S} M̂_s can be naturally identified with F(S,Kⁿ) by the correspondence (q₁,…,q_N) ↦ q ∈ F(S,Kⁿ), where q(s_i) = q_i ∈ Kⁿ for i = 1,…,N. Under this identification the transition

function δ̂ = Π_{s∈S} δ̂_s of Π_{s∈S} M̂_s is given by

    (δ̂ q)(s) = δ̂_s(q(s)) = Ĥ(s) q(s) for all s ∈ S,

or simply δ̂(q) = Ĥ·q. If δ is the global transition function of M, we know from (4.2-3) that δ(q) = H *_S q for all q ∈ F(S,Kⁿ). Since S and K are such that the DFT F_S exists, the convolution theorem (for vector-valued functions) says that

    (H *_S q)^ = Ĥ·q̂ for all q ∈ F(S,Kⁿ).

In other words, F_S δ(q) = δ̂(F_S q) for all q ∈ F(S,Kⁿ), so that F_S is an automaton-homomorphism. Since we also know that the DFT is invertible, F_S (for vector-valued functions) is the required isomorphism.
Q.E.D.

This result is formally identical to the corresponding theorem in which the group S is taken to represent time instead of space. That theorem forms the basis of frequency domain analysis for time-invariant linear systems with n input and n output wires (see, for example, Chen [1970] or Schwarz and Friedland [1965]). It should be noted, however, that the usual application of these results in linear systems theory produces an isomorphism of systems in the I/O sense. In our application, the Fourier transform is a state isomorphism. This difference allows concepts from systems theory and facts from harmonic analysis to make contact in novel ways. For example, in the next chapter we will discuss how filtering techniques can be used to produce homomorphic models of LCAs.

For the sake of further integrating automata theoretic terminology, we note that the parallel composition Π_{s∈S} M̂_s given in Theorem 5.4 realizes the LCA M through the assignment F_S. Therefore it is consistent with the usage of the term decomposition in automata theory (Hartmanis and Stearns [1966]) to say that Π_{s∈S} M̂_s is a parallel decomposition of the LCA M. The following two definitions provide some notational conveniences we will find useful in the succeeding chapters.

Definition 5.2
An LCA M = <S,Kⁿ,H> is decomposable if it satisfies the hypotheses of Theorem 5.4.

Definition 5.3
If M = <S,Kⁿ,H> is a decomposable LCA, then the parallel decomposition given by Theorem 5.4 will be denoted by M̂ = [S,Kⁿ,Ĥ].

In the following chapters, we will assume the natural correspondence described in the proof of Theorem 5.4 and take the state set of M̂ to be F(S,Kⁿ). Also, for W any subset of S, we will take the state set of Π_{s∈W} M̂_s to be F(W,Kⁿ), so that its transition function can be given by (Ĥ|W)·q̂ for q̂ ∈ F(W,Kⁿ).
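The content of Theorem 5.4 and Def. 5.3 can be seen numerically. The sketch below is ours, for S = ℤ₈, n = 1, K = ℂ, using numpy's FFT as F_S; the particular impulse response is an arbitrary assumption.

    import numpy as np

    N = 8
    H = np.zeros(N); H[[0, 1, N - 1]] = [0.5, 0.25, 0.25]   # impulse response on Z_8
    q = np.random.default_rng(3).standard_normal(N)         # a configuration

    # One step of M: delta(q)(s) = sum over r of H(r) q(s-r), as in (4.2-3)
    delta_q = np.array([sum(H[r] * q[(s - r) % N] for r in range(N))
                        for s in range(N)])

    # One step of M-hat: each component s is multiplied by H-hat(s), with no
    # interconnections between the components.
    assert np.allclose(np.fft.fft(delta_q), np.fft.fft(H) * np.fft.fft(q))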

5.8 Examples of Parallel Decomposition

Three examples of the application of Theorem 5.4 follow. The first two are based on the LCAs of Ex. 4.3 and Ex. 4.4, i.e. on the finite difference approximations to the heat equation and the simplified wave equation. In these cases, the parallel decomposition of the LCAs corresponds to a well-known technique for stability analysis of finite difference schemes. The third example is a simple system that has been used to model an aspect of morphogenesis. We present a well-known observation about this system as a special case of our theory of LCAs. The general context thus provided allows interesting connections to be made.

Example 5.5 Heat equation

The LCA M = <ℤ,ℝ,W> given in Ex. 4.3 as a representation of a finite difference scheme for the heat equation is not decomposable, since ℤ is infinite and ℝ contains only two principal roots of unity: +1 and −1. Therefore, consider the LCA M′ = <ℤ_N,ℂ,W′> where W′(s mod N) = W(s) for all s ∈ ℤ, i.e. M′ has a cyclic structure group, a complex cell state set, and the same weighting functions as M but viewed as functions of ℤ_N. Except for their complex state sets, the cells of M′ are the same as the cells of M, but M′ consists of N of them arranged in a loop. Since ℂ contains a principal Nth root of unity for any N, we know that M′ is decomposable.

If we relax, for the moment, the requirement that the state set of M consist only of configurations with finite support, then we can formalize the relationship between M′ and M as follows: a submachine of M′ is isomorphic to a submachine of M. Since W has real values

kΔt/Δx², 1 − 2kΔt/Δx², and kΔt/Δx², a real-valued initial configuration of M' will generate only cell states with zero imaginary part. Thus there is a submachine of M' with global state set F(Z_N,R). Call this submachine ReM'. The submachine of M to which ReM' is isomorphic has state set consisting of configurations (i.e., functions of Z) that are periodic with period N. To see this, consider an operator from F(Z_N,R) into F(Z,R) that periodically extends a function of Z_N into a function of Z. Think of "unrolling" the cyclic group so that a function's values on Z_N are repeated in a periodic way over all of Z. Clearly, then, the range of this operator consists of all functions of Z with period N. Since M and M' are spatially uniform and have the same kind of cells, it is not hard to convince yourself that this periodic extension operator is an isomorphism from ReM' to the submachine of M whose state set consists of configurations with period N. Call this submachine P_N M. We have that ReM' ≅ P_N M. We have relaxed the finite support assumption since the only function of Z with finite support and period N is identically equal to 0.

With this relationship in mind, the behavior of M can be studied for certain initial configurations by examining the behavior of M'. Since M' is decomposable, information about its behavior can be obtained by studying the behavior of its parallel decomposition M̂' = [Z_N,C,Ĥ']. The importance of these relationships is that M̂' is very simple to analyze since there are no interconnections between its components. Fig. 5.2 shows the LCA M' and its parallel decomposition M̂' for the case N = 4.

Figure 5.2: The LCA M' and its parallel decomposition M̂' for N = 4.

Notice that even though we've arranged its components in a circle, M̂' is not an LCA since the values of Ĥ' will in general all be different, i.e., it is not uniform. The circular arrangement is used only to suggest the fact that the function Ĥ' and the states of M̂' are functions of Z_N.

From the formula for the DFT given in Ex. 5.1, the function Ĥ' can be computed. Using some elementary facts about complex exponentials, the following expression is obtained:

    Ĥ'(s) = 1 − 2kΔt/Δx² + (2kΔt/Δx²)cos(2πs/N)   for all s ∈ Z_N.   (5.8-1)

The cosine is the usual one evaluated at the integers. Although the function H' (which is the same as W' since it is an even function) and Ĥ' technically have complex values, their imaginary parts are zero. Fig. 5.3 shows the real parts of these functions for N = 16 and kΔt/Δx² = 1/2. We've graphed these functions using a convention similar to that used for functions of R, but it should be remembered that they are functions of a discrete variable and need not be thought of as related in any way to continuous functions. In particular, they are not discontinuous functions of R.
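The function Ĥ' and the stability behavior discussed below (conditions (5.8-3) and (5.8-4)) can be computed mechanically. The following is only a sketch, with numpy's FFT standing in for the DFT of Ex. 5.1, and with the two values of kΔt/Δx² taken to be those of Figs. 5.3 and 5.4.

```python
import numpy as np

def H_hat(N, r):
    """DFT of the heat-scheme weighting function W' on Z_N, r = k*dt/dx**2."""
    W = np.zeros(N)
    W[0], W[1], W[-1] = 1 - 2 * r, r, r      # values of W' from Ex. 4.3
    return np.fft.fft(W)                     # coefficients of the decomposition

N = 16
for r in (1 / 2, 5 / 8):                     # the cases of Figs. 5.3 and 5.4
    Hh = H_hat(N, r)
    s = np.arange(N)
    # Agrees with (5.8-1): 1 - 2r + 2r*cos(2*pi*s/N); imaginary parts vanish.
    assert np.allclose(Hh, 1 - 2 * r + 2 * r * np.cos(2 * np.pi * s / N))
    unstable = np.sum(np.abs(Hh) > 1)        # components violating (5.8-3)
    print(f"r = {r}: {unstable} unstable components")
# prints: r = 0.5: 0 unstable components;  r = 0.625: 5 unstable components
```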

[Figure: panels a) Real part of H'; b) Real part of Ĥ'.]
Figure 5.3: The functions H' and Ĥ' for N = 16 and kΔt/Δx² = 1/2.

In numerical analysis the decomposition M̂' is often used to determine the stability of the finite difference scheme to which the LCA M' corresponds (see Duff and Naylor [1966] or Gary [1972]). To see how this is done, note that the components of M̂' behave very simply: if δ_s is the transition function of component s, we have for all q ∈ F(Z_N,C) that

    δ_s^t(q(s)) = [Ĥ'(s)]^t (q(s))   for t = 0,1,2,...   (5.8-2)

Thus, for M̂' to be stable it is necessary and sufficient that

    |Ĥ'(s)| ≤ 1 for all s ∈ Z_N.   (5.8-3)

If this is not true for some α ∈ Z_N, then the state of component α in the parallel decomposition M̂' will grow without bound, and we say that M̂' is unstable.

[Figure: panels a) Real part of H'; b) Real part of Ĥ'.]
Figure 5.4: The functions H' and Ĥ' for N = 16 and kΔt/Δx² = 5/8.

When Ĥ'(α) < −1, not only does the magnitude of the state grow, but its sign changes at every time step. Using expression (5.8-1), condition (5.8-3) can be written

    |1 − 2kΔt/Δx² + (2kΔt/Δx²)cos(2πs/N)| ≤ 1   for all s ∈ Z_N.

Since k > 0, this reduces to the condition

    kΔt/Δx² ≤ 1/2   (5.8-4)

since |cos(2πs/N)| ≤ 1 for all s ∈ Z_N.

The functions graphed in Fig. 5.3 show the situation for kΔt/Δx² = 1/2, and one can see that the values of Ĥ' all lie between +1 and −1. Fig. 5.4, on the other hand, shows what can happen when (5.8-4) is not satisfied. There are five values of Ĥ' that are less than −1. The corresponding five components of M̂' are therefore unstable, and hence we say that the entire machine M̂' is unstable. Since M' and M̂' are isomorphic, this implies the instability of M'. It is not the case, however, that all submachines of M' are unstable. Let Y ⊆ S be such that |Ĥ'(s)| > 1 if and only if s ∈ Y. If M̂' is started in a state q such that q(s) = 0 for s ∈ Y, then for all t ∈ Z⁺, δ̂^t(q) has the same property, i.e., [δ̂^t(q)](s) = 0 for s ∈ Y. The set of states with this property is thus the state set of a submachine of M̂', and this submachine is clearly stable. However, the submachine ReM' of M' with state set consisting of real-valued configurations is not stable when (5.8-4) fails. Since ReM' ≅ P_N M, we can conclude that condition (5.8-4) also guarantees the stability of P_N M. It also guarantees the stability of the entire system M, but we have not developed the framework necessary to show this.

Example 5.6  Simplified wave equation

The LCA M = <Z,R,H> discussed in Ex. 4.4 as a finite difference approximation to a simplified wave equation is not decomposable for the same reasons the LCA for the heat equation isn't: it has an infinite structure group, and R doesn't contain enough roots of unity. However, the remarks made above for the approximation to the heat equation hold here as well, and we'll consider an LCA M' = <Z_N,C,H'> that has the same kind of cells as M but a cyclic structure group and a complex cell state set. M' is decomposable, and Fig. 5.5 shows M' and its parallel decomposition M̂' = [Z_N,C,Ĥ'] for N = 4. The impulse response function H' is simply (0,1,0,0), and if you refer to the matrix U of the DFT for N = 4 given in Ex. 5.1, you will see that Ĥ' = (1,i,−1,−i), i.e., it is the second row of U.

Figure 5.5: The LCA M' and its parallel decomposition M̂' for N = 4.

An understanding of the relationship between this M' and M̂' will go far toward clarifying the nature of the spatial and the spatial frequency representations of a decomposable LCA. In fact, an intuitive understanding of these systems constitutes an essential step in understanding much of what mathematics has to say about PDEs and wave phenomena. Therefore, despite their simplicity, we will discuss the systems of Fig. 5.5 in some detail.

First, it is obvious what kind of behavior M' exhibits: it is a circular shift register. Any initial configuration will simply circulate around the loop without changing size or shape. The behavior of M̂' is simple also since its components are independent. Each component is an oscillator with a different frequency. Since when complex numbers are multiplied their arguments add and their moduli multiply (De Moivre's Theorem), the behavior of component a of M̂' can be seen directly from the argument and modulus of Ĥ'(a): the argument gives the frequency, and the modulus determines whether the oscillation grows or decays in amplitude. In this case all the moduli are 1, so the components produce undamped oscillations. A graph of the arguments of the coefficients looks like this:

[Graph: the argument of Ĥ'(a) plotted against a = 0,1,2,3, rising linearly in steps of π/2.]

It turns out that this graph is linear for any number of cells (even an uncountably infinite number). Thus, for N = 4, the frequencies are such that component 0 simply stores its initial state without change for all t ∈ Z⁺. The state of component 2 changes sign at each time step, and components 1 and 3 return to their initial states after every 4th time step.

How are the very different behaviors of M' and M̂' related? Suppose M' is started in initial configuration (1/4)φ_a, a ∈ Z_4, i.e., its initial configuration is a normalized group character. According to Theorem 5.4 and the definition of F_{Z_4}, the corresponding initial state of M̂' is F_{Z_4}((1/4)φ_a) = (0,...,1,...,0) where the 1 is in the a-th position. Then the next state of M̂' is (0,...,Ĥ'(a)·1,...,0) = (0,...,Ĥ'(a),...,0). Since M' and M̂' are isomorphic, this means that the next configuration of M' is F_{Z_4}^{-1}(0,...,Ĥ'(a),...,0) = Ĥ'(a)·(1/4)φ_a. In other words, the group characters are eigenvectors of the linear global transition function of M'. The corresponding eigenvalues are the values of Ĥ'. Since any configuration can be represented as a linear combination of the group characters, and since the transition function of M' is linear, it is possible to find the coefficients in the linear combination (i.e., compute F_S), multiply each by the appropriate value of Ĥ', and use the resulting values as the coefficients in the group character expansion of the next state of M' (i.e., compute F_S^{-1}). The transition function of M̂' is just the diagonalized representation of the global transition function of M' (see Ex. 5.1).

But how are the group characters shifted (i.e., the action of M')

by means of multiplying them by constants? Suppose M' is in configuration φ_2 = (1,−1,1,−1). Then the next configuration is this function shifted (circularly) to the right, i.e., (−1,1,−1,1). But this is −φ_2. Similarly, i·(1,−i,−1,i) = (i,1,−i,−1), etc. Fact (5) of Sec. 5.2 expresses this more simply: the characters are eigenfunctions of the shift operators L_a, a ∈ S. Here, the global transition function of the LCA M' is L_1: F(Z_4,C) → F(Z_4,C).

Group characters in C (at least for simply generated groups) can be viewed as helices (in the space S × C) whose "pitch" is given by their frequency. Multiplying by a complex constant can be thought of as a "twist" of the helix and, therefore, results in a translation. The linear graph of the argument of Ĥ' reflects the fact that higher pitched helices need to be twisted more to produce the same lateral translation as their lower pitched counterparts. If a value of Ĥ' has modulus not equal to 1, the corresponding helix will shrink (<1) or grow (>1) in diameter as it is twisted (cf. the previous example where, in addition, the argument of each value of Ĥ' is zero so that no twists are involved). Although such imagery is not directly applicable in the more abstract cases (e.g., Walsh functions), it captures all the essential intuition needed to understand the relationship between a decomposable LCA and its parallel composition.
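The eigenvector relationship is easy to confirm numerically. Below is a minimal sketch for N = 4: each character φ_a(s) = i^{as} is an eigenvector of the circular shift, with an eigenvalue of modulus 1 whose argument is linear in a (the direction of rotation depends on the sign convention adopted in Ex. 5.1).

```python
import numpy as np

N = 4
for a in range(N):
    phi = np.array([1j ** (a * s) for s in range(N)])  # character phi_a of Z_4
    shifted = np.roll(phi, 1)          # one step of the circular shift register
    ratios = shifted / phi             # componentwise eigenvalue candidates
    lam = ratios[0]
    assert np.allclose(ratios, lam)    # phi_a is an eigenvector of the shift
    assert np.isclose(abs(lam), 1.0)   # modulus 1: an undamped oscillation
    print(a, lam)                      # arguments grow linearly in a
```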

Example 5.7  Rashevsky-Turing morphogenic model

Rosen [1970] discusses a simple example of a class of models proposed by Rashevsky in 1940 and by Turing in 1952 as possible mechanisms underlying morphogenic processes. The example provides one reasonable view of how an initially homogeneous collection of cells can give rise to cellular differentiation. We will present this model in an extremely simple version that retains only the barest essentials necessary to produce the required behavior.

The system consists of two cells which are thought of as actual biological cells. The state of each cell is characterized by the concentration of a substance, called a morphogen, which is taken to determine the characteristics of a cell (the model Rosen discusses has two morphogens per cell). Suppose each cell produces the morphogen at a particular rate and, depending on the parameter values chosen, the morphogen either diffuses or is actively transported to the other cell. Let us assume that a cell can manufacture and store an arbitrary amount of morphogen so that we can use the following simple model: let M = <Z_2,R,H> be the LCA with H(0) = a and H(1) = b. This LCA is decomposable, and Fig. 5.6 shows M and its parallel decomposition M̂. In this case the DFT has the matrix representation:

    [ 1    1 ]
    [ 1   −1 ]

Figure 5.6: A two-celled LCA and its parallel decomposition.

Thus, component 0 of M̂ keeps track of the total concentration of the morphogen in M, and component 1 keeps track of the difference between the concentrations in cells 0 and 1. Suppose a+b = 1 so that the total concentration remains unchanged, but let a−b > 1. Then, if the initial state of component 1 of M̂ is non-zero (say it is positive), the amplitude of the spatial frequency component φ_1 = (1,−1) will increase without bound as M operates. Using a term from the theory of signal processing, we can say that φ_1 is a frequency that resonates in M.

What does this spatial resonance mean in terms of the model's morphogenetic interpretation? First, note that any initial

configuration of concentrations that is a constant function (i.e., the same value for each cell) is an equilibrium configuration of M since the Fourier coefficient corresponding to φ_1 is 0. Thus, in the isomorphic system M̂, the initial state of component 1 is 0 so that neither component changes state. However, if there is the slightest difference between the concentrations in cells 0 and 1 of M, this difference will resonate. In other words, a constant configuration is an unstable equilibrium of M. Thus, the model shows that it is not logically impossible for an initially homogeneous structure to differentiate provided there is a source of outside disturbances, however small. Note, though, that M remains uniform in the sense of our definition of a cellular automaton since the function H does not change. The differentiation occurs between the states of the cells. This has no bearing on the argument, however, since presumably biological cells in different states can exhibit different physical characteristics. Finally, we would like to make the simple observation that in this modeling situation the instability of the system M is a desirable aspect of its behavior and has not been introduced by careless approximation techniques.
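The resonance argument can be replayed in a few lines. The sketch below uses illustrative parameter values not taken from the text (a = 1.1, b = −0.1, so that a+b = 1 and a−b = 1.2 > 1): the total concentration is conserved while any initial difference between the two cells grows geometrically.

```python
import numpy as np

# Two-cell Rashevsky-Turing LCA: H(0) = a, H(1) = b.  Illustrative values:
# a + b = 1 conserves the total, a - b > 1 makes phi_1 = (1,-1) resonate.
a, b = 1.1, -0.1
T = np.array([[a, b],
              [b, a]])                 # global transition of M = <Z_2,R,H>

q = np.array([1.0, 1.000001])          # nearly homogeneous initial state
for t in range(10):
    q = T @ q

total, diff = q[0] + q[1], q[0] - q[1]
print(total)   # ~2.000001: component 0 of M-hat, unchanged since a+b = 1
print(diff)    # ~ -1e-6 * 1.2**10: component 1 grows by a-b each step
```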

Chapter 6
Homomorphisms of Linear Cellular Automata by Spatial Frequency Filtering

6.1 Introduction

The concept of an automaton homomorphism provides one suggestive idealization of what it means for one system to be a model of another. In this chapter we present a number of ways that homomorphisms of linear cellular automata can arise and interpret these homomorphisms in the context of modeling and simulation. The point of view taken is that when a model is formulated as an LCA, two related sets of issues arise. First, the original LCA proposed as a model might be so complex that it does little to increase our understanding of the real system. It might even be too complex to simulate on a computer due to storage and/or time limitations. Under these circumstances it is natural to ask how the original LCA can be simplified and what relationship can be expected to hold between the LCA and its simplified form. The second set of issues involves the justification for using the LCA formalism in specifying a model. While Chapter 8 deals more directly with the latter problems, the results developed here and in Chapter 7 provide insight into various simplification procedures and the kinds of difficulties that can arise when they are not used carefully.

Following an intuitive discussion of automaton-homomorphisms, we define several operators that will be used to express relationships between LCAs and their simplifications. Sec. 6.4 is somewhat of a digression. It contains a result we will use later on in discussing homomorphisms, but the presentation was chosen to emphasize some ideas

that are less central to our line of development. Finally, we discuss the basic facts about LCAs and spatial frequency filtering. Although this chapter and the next contain numerous examples, the possibilities for system simplification are not adequately illustrated since it is impossible to draw diagrams for all but the least complex systems.

6.2 Homomorphisms

Let M = <Q,δ> and M' = <Q',δ'> be sequential machines. In Chapter 2 we said that if there is an onto map ξ: Q → Q' (not necessarily one-one) such that ξδ(q) = δ'(ξ(q)) for all q ∈ Q, then M' is a homomorphic image of M. The map ξ is called an automaton-homomorphism and induces an equivalence relation on Q such that q1 ≡ q2 if and only if ξ(q1) = ξ(q2). Since ξ is a homomorphism, it is true that if q1 ≡ q2 then δ(q1) ≡ δ(q2), i.e., equivalence classes are mapped to equivalence classes by the transition function δ. Recall our convention of writing ξ: M → M' to show that ξ is a homomorphism from M to M'.

The notion of an automaton homomorphism, which we have presented in its simplest form, has been very suggestive both formally and intuitively in studies of the modeling and simulation process. The reader is referred to Ashby [1963], Burks [1973] and Zeigler [1970, 1974] for more complete discussions. Here we would like only to bring out several intuitive aspects of the idea that are particularly well illustrated by the homomorphisms of LCAs that we discuss below.

Thinking of the sequential machine M as representing a "real" system, the homomorphic image M' can be viewed as a model of M that captures certain aspects of M's behavior. Operating with the map ξ causes certain distinctions which are made in the state set Q of M to be lost, so that

ξ(Q) is a view of Q that is "less distinct" in some sense. Each equivalence class produced by the equivalence relation described above consists of states of M that look the same when viewed through ξ, i.e., that are blurred to the same image. That ξ is a homomorphism means that the level of resolution of the image of any state of M is sufficiently detailed so that the next state of M can be predicted from it to the same level of resolution. In other words, ξ provides a less distinct view of M, but one that is not misleading as far as the behavior of M is concerned.

It turns out that for LCAs this metaphorical language can be taken quite literally. For a decomposable LCA M, since all the components in its decomposition M̂ operate independently of each other, the view of M resulting from "filtering out" any subset of these components is a homomorphic image of M. This homomorphic image is literally a low-resolution image of M in the spatial, visual sense. It's as if the LCA's behavior were being viewed through a fog. That the resulting image is a homomorphic image means that each blurry view of a configuration contains enough information so that the blurry view of the next configuration can be predicted exactly. An observer, compelled to watch only through the fog, could therefore formulate a valid deterministic model for what he was able to see. Although these observations correspond to fairly elementary mathematical concepts, we feel that the connection provided to the concept of automaton-homomorphism, and hence to theoretical investigations of the modeling process, warrants the careful development that we provide in this chapter.

6.3 Restrictions, Projections, and Embeddings

In this chapter and the one that follows, we will be discussing various homomorphisms and isomorphisms between LCAs, between LCAs and parallel compositions, and between different parallel compositions. According to Def. 4.1 and the remarks at the end of Sec. 5.7, the state sets of these kinds of systems can be viewed as sets of functions. Thus, a map from one state set to another is an operator on a set of functions, i.e., it takes one function and produces another. An example that we have already seen is the Fourier transform as it is used in Theorem 5.4. We now define three other operators on functions, each much simpler than the Fourier transform, that will be used to relate the state sets of various machines. Let S and I be sets such that there is an injection p of I into S, i.e., p: I → S is one-one. Let V be a set with a distinguished element denoted by 0.

Definition 6.1  The restriction w.r.t. p(I) is the operator Res_{p(I)}: F(S,V) → F(I,V) given by Res_{p(I)} f = fp for f ∈ F(S,V).

Definition 6.2  The projection w.r.t. p(I) is the operator Proj_{p(I)}: F(S,V) → F(S,V) given by

    (Proj_{p(I)} f)(s) = f(s) for s ∈ p(I), and 0 otherwise,   for all s ∈ S.

Definition 6.3  The embedding w.r.t. p(I) is the operator Emb_{p(I)}: F(I,V) → F(S,V)

given by

    (Emb_{p(I)} f)(s) = f(p^{-1}(s)) for s ∈ p(I), and 0 otherwise,   for all s ∈ S.

Suppose S = {0,1,...,7} and I = {x,y,z}. Fig. 6.1 illustrates the action of the above three operators for the map p: I → S shown at the top of the figure.

Figure 6.1: The action of the operators Res_{p(I)}, Proj_{p(I)}, and Emb_{p(I)}.
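A direct transcription of Defs. 6.1-6.3 may also be helpful. In the sketch below, functions of S are represented as Python dictionaries and the injection p as a dictionary from I to S (all particular values are illustrative); the three composition identities stated next are checked at the end.

```python
# A sketch of Defs. 6.1-6.3 with functions of S represented as dicts
# and the injection p: I -> S as a dict (all values are illustrative).
def res(p, f):
    """Restriction w.r.t. p(I): Res f = f . p, a function of I."""
    return {i: f[s] for i, s in p.items()}

def proj(p, f, zero=0):
    """Projection w.r.t. p(I): keep f on p(I), zero elsewhere on S."""
    image = set(p.values())
    return {s: (v if s in image else zero) for s, v in f.items()}

def emb(p, g, S, zero=0):
    """Embedding w.r.t. p(I): place g(p^{-1}(s)) on p(I), zero elsewhere."""
    inverse = {s: i for i, s in p.items()}
    return {s: (g[inverse[s]] if s in inverse else zero) for s in S}

S = range(8)
p = {'x': 1, 'y': 2, 'z': 4}                      # an injection of I into S
f = {s: float(s) for s in S}                      # an arbitrary f in F(S,R)

# The composition identities 1)-3) stated below:
assert proj(p, proj(p, f)) == proj(p, f)          # Proj Proj = Proj
assert res(p, proj(p, f)) == res(p, f)            # Res Proj = Res
assert emb(p, res(p, f), S) == proj(p, f)         # Emb Res = Proj
```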

It should not be difficult to see that for any injection p the following properties hold:

1) Proj_{p(I)} Proj_{p(I)} = Proj_{p(I)},
2) Res_{p(I)} Proj_{p(I)} = Res_{p(I)}, and
3) Emb_{p(I)} Res_{p(I)} = Proj_{p(I)}.

These operators are perhaps more familiar when p is an inclusion map (see Sec. 2.2). Suppose W is any subset of S and p: W → S is the inclusion map. Then we have that Res_{p(W)} f = f|W in the notation of Sec. 3.2; the projection operator is given by

    (Proj_{p(W)} f)(s) = f(s) for s ∈ W, and 0 otherwise;

and the embedding operator is given for f ∈ F(W,V) by

    (Emb_{p(W)} f)(s) = f(s) for s ∈ W, and 0 otherwise.

When p: I → S is not an inclusion map, the situation is exactly the same except that the subset of S is p(I), and we can label its elements with elements of I. We will let the context indicate what set V denotes in particular instances: it will be either a field K, the vector space K^n, the vector space Mat_K(n,n), an integral domain R, or the module R^n. In each case the distinguished element 0 will be the usual zero element of the space, i.e., the zero of K, the zero vector in K^n, the zero matrix in Mat_K(n,n), etc. The reader can assume that, unless otherwise indicated, the same set S, set I, and injection p underlie all usages of the symbols Res, Proj, and Emb. We will write Res_W, Proj_W, and

Emb_W in case W ⊆ S and p is the inclusion map. In many cases, such as in the proofs of theorems, it will be possible to drop the subscripts completely without causing confusion. When it is clear what I and p are, we will simply write Res, Proj, and Emb.

We now establish some basic facts about the operator Proj and decomposable LCAs. Let M = <S,K^n,H> be a decomposable LCA with the decomposition M̂ = [S,K^n,Ĥ]. Let δ̂: F(S,K^n) → F(S,K^n) be the transition function of M̂, i.e., (δ̂q)(s) = Ĥ(s)·q(s). Let p be an injection of a set I into S. Then from the fact that M̂ is a parallel composition where each component M_s is linear we have the following:

Lemma 6.1
δ̂(Proj_{p(I)}(F(S,K^n))) ⊆ Proj_{p(I)}(F(S,K^n))

Proof:
Let q ∈ Proj_{p(I)}(F(S,K^n)) and s ∈ S − p(I). Then (δ̂q)(s) = Ĥ(s)·q(s) = Ĥ(s)·(0) = 0. Thus δ̂q ∈ Proj_{p(I)}(F(S,K^n)).   Q.E.D.

In other words, if the parallel composition M̂ is in a state at time t such that the components in the set S − p(I) are in state zero, then those components will be in state zero at time t+1. Thus, if M̂ is started in such a state, those components in S − p(I) will stay in state zero as long as the machine operates. It therefore makes sense to define a submachine of M̂:

Definition 6.4
M̂_{p(I)} is the submachine of M̂ with state set Proj_{p(I)}(F(S,K^n)).

From Theorem 5.4 we know that M ≅ M̂, so that there is a submachine of M isomorphic to M̂_{p(I)}. In fact, this submachine can be defined as follows:

Definition 6.5
M_{p(I)} is the submachine of M with state set F_S^{-1} Proj_{p(I)}(F(S,K^n)).

We have, then, that M̂_{p(I)} ≅ M_{p(I)}. A state of M_{p(I)} is one whose Fourier transform is zero on the set S − p(I). If p(I) is a set of "low frequencies," the state set of M_{p(I)} consists of only smooth functions of S. Finally, since the operator F_S^{-1} Proj_{p(I)} F_S will appear so frequently, we provide an abbreviation for it:

Definition 6.6
ξ_{p(I)} is the operator F_S^{-1} Proj_{p(I)} F_S: F(S,V) → F(S,V).

Following our usual practice, we will let the context indicate the set V, so that different usages of ξ_{p(I)} may refer to different operators. For I ⊆ S and p the inclusion map, we will write ξ_I. Intuitively, this operator can be interpreted as a "filter": it filters out those frequencies in the set S − p(I). We return to this idea in Sec. 6.5.

6.4 Distributed Realizations of Sets of LSMs

In Sec. 5.7 we pointed out that the decomposition M̂ of an LCA M could be considered a realization of M through the assignment F_S. Since F_S is a bijection on the set F(S,K^n), it is equally possible to consider the LCA a realization of its decomposition through the assignment F_S^{-1}. Moreover, under appropriate conditions, any parallel composition Π_{i∈I} M_i of LSMs is a part of the decomposition of some LCA. We can therefore consider that LCA to be a realization of Π_{i∈I} M_i. The next theorem states this observation precisely using the Emb operator. Let I be any finite set, let A: I → Mat_K(n,n), and assume that |S| = N.

Theorem 6.1
Let {M_i = <K^n,A(i)> | i ∈ I} be a finite set of autonomous LSMs and let p: I → S be any injection of I into a finite abelian group S. If K contains a principal mth root of unity, where m is the exponent of S, then the LCA M = <S,K^n,H>, where H = F_S^{-1} Emb_{p(I)} A, is a realization of the parallel composition Π_{i∈I} M_i through the assignment F_S^{-1} Emb_{p(I)}.

Proof:
We need to show that Π_{i∈I} M_i is isomorphic to a submachine of M. This submachine is M_{p(I)} (Def. 6.5) and the isomorphism is F_S^{-1} Emb_{p(I)}. To see this, note that M is decomposable with decomposition M̂ = [S,K^n,Emb_{p(I)}A] having transition function δ̂ given for q ∈ F(S,K^n) by

    (δ̂q)(s) = (Emb_{p(I)}A)(s)·q(s),   s ∈ S.

The transition function δ̄ of Π_{i∈I} M_i, on the other hand, is given as follows (if we identify the state set of Π_{i∈I} M_i with F(I,K^n)): for f ∈ F(I,K^n),

    (δ̄f)(i) = A(i)·f(i),   i ∈ I.

Noticing that Emb_{p(I)}: F(I,K^n) → Proj_{p(I)}(F(S,K^n)) is a bijection, we have that

    Emb_{p(I)}: Π_{i∈I} M_i ≅ M̂_{p(I)}

as the following argument shows:

i) for s ∈ p(I),
    (Emb δ̄f)(s) = δ̄f(p^{-1}(s)) = A(p^{-1}(s))·f(p^{-1}(s)) = (Emb A)(s)·(Emb f)(s) = (δ̂ Emb f)(s), and

ii) for s ∈ S − p(I),
    (Emb δ̄f)(s) = 0 = (δ̂ Emb f)(s).

Therefore we have

    F_S^{-1} Emb_{p(I)}: Π_{i∈I} M_i ≅ M_{p(I)}.   Q.E.D.

Note that the elements of any set of LSMs {M_i = <K^{n_i},A_i> | A_i ∈ Mat_K(n_i,n_i), i ∈ I} can be given the same state space with dimension n = max_{i∈I} n_i by augmenting the deficient A_i's with zeros, i.e., by introducing dummy components. Since each augmented M_i realizes the original M_i, we can apply Theorem 6.1 and conclude that a parallel composition of any set of LSMs over K can be realized non-trivially by an LCA provided K contains the necessary principal root of unity. (By non-trivial, we mean realizations other than an LCA with one large cell and the trivial structure group.)

Since the field C contains a principal mth root of unity for any positive integer m, we have the following:

Corollary 6.1
The parallel composition of any finite set of finite dimensional LSMs over C can be non-trivially realized by an LCA.

If the LSMs all have transition functions given by matrices containing only real numbers, it can be shown that the realizing LCA has weighting functions that are real and even. This follows from the fact that real and even functions are Fourier transforms of real and even functions (see, for example, Cooley, Lewis, and Welch [1969]).

We have explicitly stated these facts using the notion of realization because certain properties of the assignment F_S^{-1} Emb_{p(I)} are suggestive of interesting applications. In particular, for a function f, it is true that every value of its transform f̂ depends on every value of f; and conversely, every value of the inverse transform of f̂ depends on every value of f̂. This follows from the fact that the support of each group character φ_s is all of S. It can therefore be said that the process represented by a single LSM in Π_{i∈I} M_i is distributed by the assignment F_S^{-1} Emb_{p(I)} throughout the entire structure of the cellular automaton M. Of course, other linear operators have similar properties. All that is required is that the rows of the matrix representations contain few zeros. With the Fourier transform, however, a heterogeneous collection of LSMs can be realized by a uniform structure. We will call the realization given by Theorem 6.1 a distributed realization of Π_{i∈I} M_i.
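The construction in Theorem 6.1 can be exercised numerically. The following sketch, for n = 1, S = Z_8, and K = C (the LSM coefficients A and the injection p are arbitrary choices, and numpy's inverse FFT stands in for F_S^{-1}), builds the LCA weighting function and confirms that each chosen LSM reappears as a component of the LCA's decomposition while H itself is supported on every cell.

```python
import numpy as np

# Distributed realization (Theorem 6.1) for n = 1, S = Z_8, K = C.
# The LSMs and the injection p below are arbitrary illustrations.
N = 8
A = {0: 2.0, 1: 0.5, 2: -1.0}           # three one-dimensional LSMs <C,A(i)>
p = {0: 1, 1: 3, 2: 6}                  # injection of I = {0,1,2} into Z_8

embA = np.zeros(N, dtype=complex)
for i, s in p.items():
    embA[s] = A[i]                      # Emb_{p(I)} A

H = np.fft.ifft(embA)                   # H = F_S^{-1} Emb_{p(I)} A

# The decomposition of the realizing LCA recovers each LSM coefficient...
assert np.allclose(np.fft.fft(H), embA)
# ...while every cell of the LCA participates: H has full support.
assert np.all(np.abs(H) > 1e-12)
```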

It would not have been misleading to call these realizations holographic realizations. In fact, the connection to holography can be made fairly precise. Suppose a holographic image is to be made of an illuminated object. One can think of each point on the object's surface as an electro-magnetic oscillator having an amplitude depending on its brightness and a frequency depending on its color. The relative phases of the oscillations depend on the relative "depths" of the corresponding points on the object's surface. Each oscillation can be modeled by a simple LSM over C with a one-dimensional state space. Then the object can be thought of as the parallel composition of these LSMs. The holographic image corresponds to the distributed realization of this collection of oscillators in the following way. A holographic image is recorded on a photographic plate over a short interval of time. Think of taking a time-exposure of the distributed realization as it is operating. The result will be a superposition of all the configurations assumed by the cellular automaton in that time interval. In general, this superposition will not contain the desired information about the object. However, if all of the independent oscillators making up the object have approximately the same frequency, the distributed realization has a small enough neighborhood so that the time-exposure captures essential information pertaining to the object's three-dimensional appearance. This condition corresponds to the requirement of a coherent light source (e.g., a laser) for making holographic images.

We return now to the major theme of this chapter and use Theorem 6.1 to prove the existence of certain kinds of homomorphisms of LCAs. For more information on the holographic process, we refer the reader to Stroke [1969].

6.5 Homomorphisms of LCAs via Spatial Frequency Filtering

In Sec. 6.2 we said that since all the components in the decomposition of an LCA M operate independently, when any subset of them is "filtered out", the remaining components form a homomorphic image of the complete set. This smaller set of components can then be realized by another LCA M' such that a submachine of M' is a homomorphic image of M. The simplest example of this procedure is shown in Fig. 6.2. The LCA M = <Z_2,R,H> is decomposable into the parallel composition M̂ by the DFT w.r.t. Z_2 whose matrix appears in Ex. 5.1. The component M_1 is eliminated from M̂ to produce Π_{s∈W} M_s (W ⊆ S is the set of components retained). The map Res_W shown in the figure is an automaton-homomorphism. The LCA M' is a distributed realization of Π_{s∈W} M_s. Since we know from Theorem 6.1 that the assignment F_{Z_2}^{-1} Emb_W is an isomorphism from Π_{s∈W} M_s to the submachine M'_W of M', the composition of F_{Z_2}^{-1} Emb_W, Res_W, and F_{Z_2} (which turns out to be the map ξ_W given by Def. 6.6) is an automaton-homomorphism. Since S − W = {1}, we say that the map ξ_W filters the group character φ_1 = (1,−1) from the LCA M (see Sec. 5.2).

[Figure: M with cells q(0) and q(1); its decomposition M̂ with components M_0 (coefficient a+b, state q(0)+q(1)) and M_1 (coefficient a−b, state q(0)−q(1)); Res_W removes M_1, and F_{Z_2}^{-1} Emb_W yields M'.]
Figure 6.2: M'_W is a homomorphic image produced by filtering the character φ_1 = (1,−1) from the LCA M = <Z_2,R,H>.

Formally, we have the following theorem:

Theorem 6.2
Let M = <S,K^n,H> be a decomposable LCA and W any subset of S. Then the submachine M'_W of the LCA M' = <S,K^n,H'> with H' = ξ_W H is a homomorphic image of M.

Proof:
The set {M_s = <K^n,Ĥ(s)> | s ∈ W} satisfies the hypotheses of Theorem 6.1 for p the inclusion map of W into S. Then the LCA M' is the distributed realization of Π_{s∈W} M_s given by Theorem 6.1 since

    H' = ξ_W H = [F_S^{-1} Proj_W F_S] H = F_S^{-1} Proj_W Ĥ = F_S^{-1} Emb_W (Res_W Ĥ).

In other words, the function called A in Theorem 6.1 is Res_W Ĥ. It is easy to check that the map Res_W is a homomorphism from M̂ = Π_{s∈S} M_s to Π_{s∈W} M_s. Then composing the maps shown in the diagram, we have that

    F_S^{-1} Emb_W Res_W F_S = F_S^{-1} Proj_W F_S = ξ_W

is the required homomorphism.

    M ──F_S──> M̂ = Π_{s∈S} M_s ──Res_W──> Π_{s∈W} M_s ──F_S^{-1} Emb_W──> M'_W   Q.E.D.

We have maintained the distinction between the LCA M' and the submachine M'_W whose state set consists of all "smooth" configurations of M'. However, the difference between M' and M'_W is very slight. It is easiest to see this by looking at M̂' and M̂'_W, the parallel decompositions of M' and M'_W respectively. M̂' and M̂'_W consist of exactly the same components M_s = <K^n,Ĥ'(s)>, s ∈ S, where Ĥ' = F_S ξ_W H = Proj_W Ĥ, but

any state f of the submachine M̂'_W must be such that f(s) = 0 for s ∈ S − W. Now suppose M̂' is started in an arbitrary initial state q ∈ F(S,K^n). Then the next state, say q', is given by

    q'(s) = (Proj_W Ĥ)(s)·q(s),   s ∈ S.

Thus, for s ∈ S − W, q'(s) = 0·q(s) = 0 so that q' is a state of the submachine M̂'_W. The only difference, therefore, between M̂' and M̂'_W is that the initial state of M̂' is sometimes not a state of M̂'_W. Since these machines are isomorphic to M' and M'_W respectively, we have that the submachine M'_W is always entered in the first transition of the LCA M'. Technically, however, we need to distinguish between M' and M'_W since our definition of an automaton-homomorphism (Sec. 2.5) requires the map ξ_W to be onto.

Theorem 6.2 is an automata theoretic restatement of the classical principle which, if the structure group S were allowed to be R, could be succinctly stated as the Principle of Rayleigh. We quote from Duff and Naylor ([1966] p. 87):

"Principle of Rayleigh: A continuous system can be approximated by a finite system, so that the eigenvalues and eigenvectors of the finite system will approximate to a finite number of the eigenvalues and eigenfunctions of the continuous system."

Here we see that projection onto a subspace spanned by eigenvectors is an example of the concept of automaton homomorphism.

The symbol W was chosen to suggest the term "window" used in the signal processing literature (e.g., Gold and Rader [1969]). The operator ξ_W = F_S^{-1} Proj_W F_S represents an ideal band-pass filter of signals which here we interpret as spatial signals, i.e., they are cellular

automaton configurations. The frequency band is the set W, and we say that windowing in the frequency domain is filtering in the spatial domain.

The filtering operation can be represented more concretely through the use of characteristic functions. Let V be a vector space with zero vector denoted by 0 and let χ̂_W ∈ F(S,V) be given by

    χ̂_W(s) = 1 for s ∈ W, and 0 otherwise,   for all s ∈ S.

The function χ̂_W is the characteristic function of W. Again, in using this symbol, we leave it to the context to indicate what space V is. It is clear that Proj_W f̂ = χ̂_W·f̂ for all f̂ ∈ F(S,V). Letting χ_W = F_S^{-1} χ̂_W, we have for all f ∈ F(S,V) that

    (F_S^{-1} Proj_W F_S)(f) = (F_S^{-1} Proj_W)(f̂) = F_S^{-1}(χ̂_W·f̂) = χ_W *_S f

by the convolution theorem. Thus, letting V = Mat_K(n,n), the function H' of M' is given by

    H' = χ_W *_S H   (6.4-1)

and (with V = K^n) the homomorphism ξ_W is given by

    ξ_W q = χ_W *_S q for all q ∈ F(S,K^n).   (6.4-2)

For the temporal interpretation of the group S, ideal filters of the kind we have described are not physically realizable since χ_W given in (6.4-2) is a non-causal operator (Gold and Rader [1969]) for W ≠ S. This is true since the support of χ_W is not "one-sided" for any W ≠ S. For our spatial interpretation, however, this fact creates no difficulty. Fig. 6.3 a) shows the functions χ_W and χ̂_W associated with the homomorphism ξ_W of Fig. 6.2. The "window" W is the set {0}. Fig. 6.3 b)

shows the action of ξ_W on a particular function of Z_2. Each value of ξ_W q is the average of the values of q. This is always the case for W = {e}, where e is the identity element of the underlying group.

[Figure: a) χ̂_W = (1,0) and its inverse transform χ_W = (1/2,1/2); b) a q with values 1/2 and 1, and ξ_W q = χ_W *_S q constant at 3/4.]
Figure 6.3: Filtering on Z_2 with window W = {0}: a) the characteristic function χ̂_W and its inverse transform χ_W, b) the action of ξ_W on a q ∈ F(Z_2,R).

Section 6.2 intuitively described an automaton homomorphism as a map that produces a low resolution version of a given automaton. When the set W underlying the homomorphism ξ_W = F_S^{-1} Proj_W F_S consists of the lower multiples of the generators of S and their inverses, the map ξ_W might be called a "low-pass" filter and can be thought of as a smoothing operator. Figure 6.4 illustrates the action of such a map on a particular function q ∈ F(Z_8,C) for W = {6,7,0,1,2}.
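Both of these facts can be verified directly. The following sketch (numpy conventions standing in for F_S; the H and q used are arbitrary, and the second window is the low-pass W of Fig. 6.4) checks that ξ_{0} averages and that ξ_W commutes with the dynamics, as Theorem 6.2 asserts.

```python
import numpy as np

N = 8
rng = np.random.default_rng(1)
H = rng.standard_normal(N)                    # an arbitrary weighting function
q = rng.standard_normal(N)                    # an arbitrary configuration

def xi(W, f):
    """xi_W = F_S^{-1} Proj_W F_S: the ideal band-pass ('window') filter."""
    chi_hat = np.zeros(N)
    chi_hat[list(W)] = 1.0                    # characteristic function of W
    return np.fft.ifft(chi_hat * np.fft.fft(f))

# For W = {0}, xi_W replaces q by its average, as in Fig. 6.3.
assert np.allclose(xi({0}, q), np.mean(q))

# xi_W is an automaton-homomorphism: stepping with M and then filtering
# equals filtering and then stepping with the filtered LCA (Theorem 6.2).
W = {6, 7, 0, 1, 2}                           # the low-pass window of Fig. 6.4
delta  = lambda f: np.fft.ifft(np.fft.fft(H) * np.fft.fft(f))       # H *_S f
H_filt = xi(W, H)                             # H' = xi_W H, per (6.4-1)
deltaW = lambda f: np.fft.ifft(np.fft.fft(H_filt) * np.fft.fft(f))  # H' *_S f
assert np.allclose(xi(W, delta(q)), deltaW(xi(W, q)))
```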

[Figure: q̂ and its windowed version for W = {6,7,0,1,2}.]
Figure 6.4: Low-pass filtering of q ∈ F(Z_8,C).

We refer the reader to the picture processing literature (e.g., Andrews [1970]) for evidence that the homomorphic images of LCAs produced by low-pass spatial-frequency filtering are indeed low-resolution or blurry images.

Finally, we would like to point out an interesting fact about the situation illustrated in Fig. 6.2. If we let δ denote the transition function of M and δ_0 denote the transition function of M_0, then the crucial fact underlying the existence of the homomorphism ξ_W is represented by the following commutative diagram:

    (q(0),q(1)) ────δ────> (aq(0)+bq(1), bq(0)+aq(1))
        │                          │
    Res_W F_{Z_2}              Res_W F_{Z_2}
        │                          │
        v                          v
    q(0)+q(1)  ───δ_0───>  aq(0)+bq(1)+bq(0)+aq(1) = (a+b)(q(0)+q(1))

The equality on the right follows from the distributive property required of rings. In fact, the commutativity of this diagram is equivalent to the distributive property of the underlying commutative ring. This is always the case, in fact, for W = {e}, where e is the identity of S, since the character φ_e for any group S is identically equal to 1. We see, though, that any W ⊆ S will produce a commutative diagram. This suggests a generalization of distributivity that perhaps has interesting applications. We have not yet explored the mathematical literature to see if this observation has been incorporated into some algebraic framework.

6.6 Complexity of Homomorphic Images

The homomorphism ξ_W, W ⊆ S, defined in the previous section is a map between LCAs that have the same structure group, so that M and M' have the same number of cells. Thus, despite the lower resolution of the LCA M', it is not clear how M' is less complex than M, something we would desire of a homomorphic image for simulation purposes.

In fact, M' might be considered more complex than M since in general the neighborhoods of cells in M' are larger than those of cells in M. This is true since filtering the function H always tends to increase its support. Intuitively, M' is less complex than M in the sense that the cells of M' process less information than the cells of M: they are sensitive to a narrower spatial frequency band. We will not pursue the precise information theoretic characterization of this observation but will ask, instead, how this kind of complexity reduction can be exploited for simulation purposes. Stated another way, how can this "information theoretic" complexity measure be interpreted in terms of more standard complexity measures associated with the number of components and the density of interconnections? We will examine one answer to this question in the next chapter by interpreting aspects of sampling theory in the language of LCAs.

Chapter 7
Linear Cellular Automata and Sampling Theory

7.1 Introduction

A natural technique for modeling a "real" homogeneous medium by a cellular automaton is to view each cell of the automaton as a representation of a small area of the system to be modeled. For example, a cellular automaton specified to model a section of fairly homogeneous neural tissue might be formulated so that each "cell" of the automaton is intended to model a small block of neural tissue containing thousands of neurons. In other words, the cellular automaton is an aggregated model of neural tissue. Each state variable of the model (i.e., each cell of the cellular automaton) represents a set of components of the real system (i.e., a set of neurons). Examples of this kind of aggregation are Mortimer's model of the mammalian cerebellar cortex [1970] and Foy's model of cardiac muscle tissue [1974]. Intuitively, it is felt that the behavior of the aggregated model ought to be essentially like that of the more complex natural system except for a lack of fine spatial detail. While these kinds of models tend to be much more complicated than the linear cellular automata considered here, it is nevertheless important to understand some of the difficulties that can arise when LCAs are aggregated. The reason for this is that even in the comparatively simple linear case, aggregated models tend to produce misleading behavior for reasons that seem very counter-intuitive. It is therefore not sufficient to appeal to intuitive explanations for a justification of the aggregation process.

To examine these problems using the framework developed so far, we will take the point of view that the underlying real system is some LCA with structure group S. We will then be interested in describing homomorphic images of this LCA that are also LCAs but have structure groups smaller than S. Using Zeigler's terminology [1974], we will postulate an LCA M as a base model and examine lumped models which are LCAs with fewer cells than M and which homomorphically preserve aspects of M's behavior.

Theorem 6.2, proved in the previous chapter, establishes a homomorphic relation between an LCA M and an LCA M', but M and M' have the same structure group S and hence the same number of cells. It's just as easy, however, to obtain a similar result in which M' has fewer cells than M. In the next section we develop this idea more carefully and then motivate the application of sampling theory to questions arising from using LCAs to model natural systems.

7.2 Homomorphisms between LCAs having Different Structure Groups

Referring back to Fig. 6.2 in the previous chapter, the LCA M' in the upper right of the figure is a particular distributed realization of Π_{s∈W} M_s, namely the one with structure group S. Any distributed realization, however, will give rise to a homomorphism analogous to ξ_W of Fig. 6.2. In particular, M' can have structure group S' where |W| ≤ |S'| ≤ |S|, since W can then be injected into S' instead of S. Using this observation we can prove the following theorem:

Theorem 7.1
Let M = <S,K^n,H> be a decomposable LCA, W any subset of S, and let M' = <S',K^n,H'> be a decomposable LCA such that |W| ≤ |S'|. Then for any injection p of W into the underlying set of S', if H' = F_{S'}^{-1} Emb_{p(W)} Res_W F_S H, then the submachine M'_{p(W)} of M' is a homomorphic image of M.

Proof:
This is proved in basically the same way that we proved Theorem 6.2. First, M'_{p(W)} is the submachine of M' with state set F_{S'}^{-1} Proj_{p(W)}(F(S',K^n)), i.e., the group S' is used in Def. 6.5 instead of S. Also, Res_W: F(S,K^n) → F(W,K^n) since, by our notation, Res_W = Res_{i(W)} where i is the inclusion map; but Emb_{p(W)}: F(W,K^n) → F(S',K^n). Then consider the following diagram:

    M ──F_S──> M̂ = Π_{s∈S} M_s ──Res_W──> Π_{s∈W} M_s ──F_{S'}^{-1} Emb_{p(W)}──> M'_{p(W)}

All we need to show is that M' with H' = F_{S'}^{-1} Emb_{p(W)} Res_W F_S H is a distributed realization of Π_{s∈W} M_s. Applying Theorem 6.1 with A = Res_W F_S H and S' taking the place of S (and M' taking the place of M), it can be seen that M' is indeed a distributed realization of Π_{s∈W} M_s. Thus, composing the maps in the diagram, the required homomorphism is

    F_{S'}^{-1} Emb_{p(W)} Res_W F_S.   (7.2-1)   Q.E.D.

For S = S' and p the inclusion map of W into S, this homomorphism reduces to ξ_W = F_S^{-1} Proj_W F_S given by Theorem 6.2.*

*It's interesting to note here that if S = S', W = S, and p is a permutation of the elements of S, then the map F_S^{-1} Emb_{p(S)} F_S is an isomorphism of the LCAs M and M', each having structure group S. In fact, if n = 1, all such isomorphisms are associated with permutations, so that there are exactly N! of them, where N = |S|. See Nicholson [1971]. We will not pursue the modeling theoretic interpretation of these maps.

There are several difficulties with the homomorphism F_{S'}^{-1} Emb_{p(W)} Res_W F_S. First, it is obviously complicated. Since it is a linear operator, it can be represented by a matrix, but the matrix does not in general help clarify matters. In Sec. 6.5 we showed that the map ξ_W has a concrete representation in terms of the convolution operation (see (6.4-1) and (6.4-2)). Here, however, the matrix representation is not so nicely structured. For example, let S = Z_3, S' = Z_2, W = {0,1}, and let p: W → Z_2 be the identity map. Then if C is the underlying field (and n = 1), the operator (7.2-1) maps C³ into C² and has the matrix representation:

    [ 1   (1+ω)/2    (1+ω²)/2 ]
    [ 0   (1−ω)/2    (1−ω²)/2 ]       (7.2-2)

where ω is a principal 3rd root of unity in C, i.e., ω = e^{2πi/3}.
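The matrix (7.2-2) can be reproduced mechanically. The sketch below builds the character matrices explicitly; depending on which of the two standard sign conventions Ex. 5.1 adopts, the computed entries may appear with ω and ω² interchanged.

```python
import numpy as np

# The matrix of F_{S'}^{-1} Emb Res_W F_S for S = Z_3, S' = Z_2, W = {0,1},
# p the identity map, following (7.2-1).
w = np.exp(2j * np.pi / 3)                       # principal 3rd root of unity

F3 = np.array([[w ** (a * s) for s in range(3)] for a in range(3)])      # F_S
F2 = np.array([[(-1.0) ** (a * s) for s in range(2)] for a in range(2)])
F2inv = np.linalg.inv(F2)                        # F_{S'}^{-1} = (1/2) F2

res_emb = F3[[0, 1], :]                          # keep rows a = 0,1 (Res, Emb)
T = F2inv @ res_emb                              # the operator (7.2-1)
print(np.round(T, 3))
# [[ 1    (1+w)/2   (1+w^2)/2 ]
#  [ 0    (1-w)/2   (1-w^2)/2 ]]   as in (7.2-2)
```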

In the general case, the most intuitively satisfying way to think about these kinds of operators may be to go back to expression (7.2-1), i.e., to view them as appropriate compositions of restrictions, embeddings, and DFTs.

Another difficulty with the homomorphism given by Theorem 7.1 is that in order for it to exist, both of the LCAs M and M' must be decomposable over the same field K. This means that the group S' must be chosen so that the DFT exists over K for S and S'. When K = C there is no problem, but for finite fields care must be taken in choosing the group S'.

Neither of these difficulties exists for the homomorphism ξ_W given by Theorem 6.2. It can be represented by a convolution operation, and since S = S', M' is decomposable whenever M is. As indicated in Sec. 7.1, however, we would like to consider cases in which M' has fewer cells than M. It turns out that when S' is a certain kind of subgroup of S, the situation is different, and all of these objections are met.*

*It seems likely that if S' is any subgroup of S, the same results can be proved. However, within the framework developed so far, it's easiest to provide S' with an explicit representation. Whether or not all subgroups can be so represented is a purely group theoretic question that we do not pursue here.

In order to establish these results, it is necessary to develop aspects of sampling theory, which we do in the next section. In Sec. 7.4 we will return to the discussion of automaton-homomorphisms.

7.3 Generalized Sampling Theory

As we have indicated, several results about LCAs are essentially cellular automaton theoretic interpretations of well known facts about filtering and sampling signals which, in our case, are spatial

signals (i.e., configurations of a cellular automaton). However, the group theoretic framework of our discussion requires us to develop the sampling results in similar group theoretic terms. This sort of presentation is not new, but it is 1) more abstract than the engineering literature (e.g., Cooley, Lewis, and Welch [1969]), and 2) less abstract than the mathematical literature (e.g., Hewitt and Ross [1963]). Our development owes the most to algebraic developments of the Fast Fourier Transform algorithm, in particular to Cairns [1971] and Cooley, Lewis, and Welch [1969], but provides the generalization from complex numbers to integral domains containing suitable roots of unity.

Let G be an arbitrary finite abelian group with |G| = N and canonical decomposition as in (5.2-1), i.e.,

    G ≅ Z_{n_1} ⊕ ... ⊕ Z_{n_d}   where Π_{j=1}^d n_j = N and n_i | n_{i+1}.   (7.3-1)

Let D be a finite abelian group, |D| = M, with canonical decomposition

    D ≅ Z_{m_1} ⊕ ... ⊕ Z_{m_d}   where Π_{i=1}^d m_i = M, m_i | m_{i+1}, and m_i | n_i.   (7.3-2)

Suppose n_i = m_i k_i for i = 1,...,d. Define two maps p: D → G and η: G → D as follows:

    p((a_1,...,a_d)) = (k_1 a_1, ..., k_d a_d)   for all (a_1,...,a_d) ∈ D   (7.3-3)

where k_i a_i = a_i + ... + a_i (k_i times), and

    η((s_1,...,s_d)) = (s_1 mod m_1, ..., s_d mod m_d) ∈ D   for all (s_1,...,s_d) ∈ G.   (7.3-4)

The following facts can be proven about p and η:
1) p is an injection of D into G which is a group-homomorphism. Hence p(D) is a subgroup of G and M | N.
2) η is a group-homomorphism of G onto D.
3) The kernel K of η is (isomorphic to) Z_{k_1} ⊕ ... ⊕ Z_{k_d}.

126 and Emb (:F(DL) + F(G,L) given by Defs. 6.1, 6.2, and 6.3 p(D) respectively, we define two more operators: Definition 7.1 The periodic extension w.r.t. n is the operator PE:F(D,L) + F(G,L) given by PE f = fn for all feF(D,L). Definition 7.2 The aggregation w.r.t. n is the operator Agg:F(G,L) + F(D,L) _n given by (Agg f)(a) = f(r) for all feF(G,L) and aED. n kn(r)=a All five of these operators are illustrated in Fig. 7.1 for G = Z, L = IR, and n and p as shown. Note that the operator Aggn averages over cosets of K and that Agg PE is the identity on F(D,L). We will let the context indicate what module L denotes in particular n instances: it will be an integral domain R, the module R, or the module of matrices MatR(n,n). Note that we need the structure of a R module so that Def. 7.2 makes sense. When no confusion is likely to occur, we will drop the subscripts from the names of these operators. The operator Emb is the natural generalization of the operator p (D) given by Cooley, Lewis, and Welch [1969] for cyclic groups which they called "Stretchk". Similarly, their operator "Samplek" corresponds to our Res (D However, they identify the functions Res (Df and PE (D)Res p)f by saying that Samplekf is regarded as extended to all the integers in a periodic way (hence our name "periodic extension"). Intuitively, Res takes equally spaced samples of a function where (D) k, the size of the kernel of n, is the sampling rate. For the example shown in Fig. 7.1, k = 2 and Res p f consists of every second value of f. p(D)

Figure 7.1: Action of the operators Res_{p(D)}, Proj_{p(D)}, Emb_{p(D)}, PE_η, and Agg_η on an f ∈ F(Z_8,R).
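For cyclic groups all five operators reduce to a few lines of code. The sketch below, for G = Z_8, D = Z_4, k = 2 (numpy conventions standing in for F_G and F_D), implements PE_η and Agg_η, verifies that Agg_η PE_η is the identity, and also checks, in anticipation of Theorem 7.3 below, the sampling relation F_D Res_{p(D)} = Agg_η F_G.

```python
import numpy as np

# G = Z_8, D = Z_4, k = 2; p(a) = 2a, eta(s) = s mod 4 (cyclic case).
N, M = 8, 4
k = N // M
f = np.random.default_rng(2).standard_normal(N)

res = f[::k]                                   # Res_{p(D)} f: every k-th value
pe  = lambda g: np.tile(g, k)                  # PE_eta g = g . eta
agg = lambda h: h.reshape(k, M).mean(axis=0)   # Agg_eta: average over cosets of K

assert np.allclose(agg(pe(res)), res)          # Agg PE = identity on F(D,R)

# Sampling relation (Theorem 7.3 part 1): F_D Res = Agg F_G, i.e. sampling
# in space aggregates -- aliases -- the spatial frequency spectrum.
assert np.allclose(np.fft.fft(res), agg(np.fft.fft(f)))
```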

The results of this section express relationships between the above five operators and the Fourier transforms F_G and F_D. First it is necessary to show that the conditions given in Theorem 5.1 which insure the existence of F_G also insure the existence of F_D. According to the canonical decompositions of G and D given by (7.3-1) and (7.3-2), the exponents of G and D are n_d and m_d respectively, and we know that n_d = k_d m_d. Consequently, if R is an integral domain containing a principal n_d-th root of unity ω, then it contains a principal m_d-th root of unity, namely ω' = ω^{n_d/m_d} = ω^{k_d}. Similarly, since N = kM, we know that if R contains N^{-1} then it contains M^{-1} = N^{-1} + ... + N^{-1} (k times). Therefore, let R be an integral domain containing both a principal n_d-th root of unity ω and N^{-1}. Following Def. 5.1 we can define the set of characters of D in R:

    Φ(D,R) = {θ_a | a ∈ D} ⊆ F(D,R)

where

    θ_{(a_1,...,a_d)}((b_1,...,b_d)) = Π_{j=1}^d ω^{a_j b_j n_d/m_j}   (7.3-5)

for all (a_1,...,a_d), (b_1,...,b_d) ∈ D. The facts 1-7 of Sec. 5.2 are true of Φ(D,R) so that we have the following lemma:

Lemma 7.1
For G and D with canonical decompositions given by (7.3-1) and (7.3-2) respectively, if R is an integral domain containing N^{-1} and a principal n_d-th root of unity, then both F_G and F_D exist for functions with values in R.

We will always denote the characters of G by φ_s, s ∈ G, and those of D by θ_a, a ∈ D. The next result gives a pertinent relation between Φ(G,R) and Φ(D,R).

Lemma 7.2
For all a ∈ D, PE_η θ_a = φ_{p(a)}.

Proof:
We need to show that θ_a η = φ_{p(a)} for a ∈ D. Directly from (5.2-2) we have for (a_1,...,a_d) ∈ D and (r_1,...,r_d) ∈ G that

    φ_{p(a_1,...,a_d)}((r_1,...,r_d)) = φ_{(k_1a_1,...,k_da_d)}((r_1,...,r_d))   (7.3-6)
        = Π_{j=1}^d ω^{k_j a_j r_j n_d/n_j}
        = Π_{j=1}^d ω^{a_j r_j n_d/m_j}   (since m_j k_j = n_j).

For each j, we have that r_j = r_j mod m_j + ℓ_j m_j where 0 ≤ ℓ_j < k_j. Thus (7.3-6) equals the following:

    Π_{j=1}^d ω^{a_j (r_j mod m_j) n_d/m_j} · ω^{a_j ℓ_j m_j n_d/m_j}.

However, since ω is a principal n_d-th root of unity, we have that

    ω^{a_j ℓ_j m_j n_d/m_j} = ω^{a_j ℓ_j n_d} = 1.

Thus (7.3-6) equals

    Π_{j=1}^d ω^{a_j (r_j mod m_j) n_d/m_j}
        = θ_{(a_1,...,a_d)}((r_1 mod m_1, ..., r_d mod m_d))   by (7.3-5)
        = [θ_{(a_1,...,a_d)} η]((r_1,...,r_d)).   Q.E.D.

Thus, we know that for each s in the subgroup p(D), the character φ_s of G is the periodic extension of a character of D. This fact can be applied directly to the formulas for F_G and F_D (see (5.2-3)) to obtain a relationship between the DFT w.r.t. D of a function f ∈ F(D,R) and the DFT w.r.t. G of the function Emb_{p(D)} f ∈ F(G,R). A similar relationship holds for the inverse DFTs.

Theorem 7.2
1) PE_η F_D = F_G Emb_{p(D)}
2) PE_η F_D^{-1} = k F_G^{-1} Emb_{p(D)}

Proof:
To prove part 1, let f ∈ F(D,R) and α ∈ G. Applying Def. 7.1 to the formula for the DFT w.r.t. D (see (5.2-3)) we have

    (PE_η F_D f)(α) = Σ_{b∈D} f(b) θ_{η(α)}(b).   (7.3-7)

From fact 4) stated in Sec. 5.2 we know that θ_{η(α)}(b) = θ_b(η(α)) for any α ∈ G and b ∈ D. Thus, from Def. 7.1 and Lemma 7.2, we have

    θ_b(η(α)) = (PE_η θ_b)(α) = φ_{p(b)}(α)

so that (7.3-7) becomes

    (PE_η F_D f)(α) = Σ_{b∈D} f(b) φ_{p(b)}(α).

Since p is an injection, we can view this as a sum over p(D) by making the change of variable β = p(b). Thus b = p^{-1}(β) and we have

    (PE_η F_D f)(α) = Σ_{β∈p(D)} (f p^{-1})(β) φ_β(α)

and by fact 4) of Sec. 5.2,

    = Σ_{β∈p(D)} (f p^{-1})(β) φ_α(β).

By the definition of the Emb operator (Def. 6.3) this is the same as

    Σ_{β∈G} (Emb_{p(D)} f)(β) φ_α(β).

But this is the DFT of Emb_{p(D)} f evaluated at α, so that

    (PE_η F_D f)(α) = (F_G Emb_{p(D)} f)(α).

This proves part 1. Part 2 is proved analogously but with the observation that p(−b) = β implies p(b) = −β since p is a group-homomorphism.   Q.E.D.

Corollary 7.1
k Emb_{p(D)} F_D = F_G PE_η

Proof:
From Theorem 7.2 part 2 we have PE_η F_D^{-1} = k F_G^{-1} Emb_{p(D)}. Applying F_G on the left and F_D on the right of both sides gives F_G PE_η = k Emb_{p(D)} F_D.   Q.E.D.

It turns out, even in the abstract case we are considering here, that this is the kind of smoothness that allows a rigorous connection to be made between a function and its sampled version. Within our framework these results take the form of a relationship between the DFT w.r.t. G of a function and the DFT w.r.t. D of its sampled version.

The operator Res_{p(D)} can be thought of as performing the sampling operation, where the samples are taken at equally spaced elements of G. Thus, we will establish a relationship between the operators F_G and F_D Res_{p(D)}. First, we need a relationship between the DFT of a function and the DFT of its projection, i.e., between the operators F_G and F_G Proj_{p(D)}. The appropriate relationship is a consequence of the following lemma:

Lemma 7.3
Agg_η φ_s ≡ 0 for s ∈ G − p(D).

Proof:
This is difficult to prove directly, so we will take the following approach: first we will show that for any s ∈ G, Agg_η φ_s is a homomorphism from D into the multiplicative monoid of R. Since Φ(D,R), the set of characters of D in R, contains all such homomorphisms that are not identically zero (fact 1, Sec. 5.2), we will then have that Agg_η φ_s is either a character of D or identically zero. But all the characters of D are accounted for by Lemma 7.2. We will then conclude that Agg_η φ_s ≡ 0 for s ∈ G − p(D).

To show for s ∈ G that Agg_η φ_s is a homomorphism from D into R we need to show that

    (Agg_η φ_s)(a+b) = (Agg_η φ_s)(a) · (Agg_η φ_s)(b)   for all a,b ∈ D.   (7.3-8)

Using Def. 7.2 of the Agg operator, the right hand side of (7.3-8) is

    [(1/k) Σ_{η(x)=a} φ_s(x)] · [(1/k) Σ_{η(y)=b} φ_s(y)] = (1/k²) Σ_{η(x)=a} Σ_{η(y)=b} φ_s(x+y).   (7.3-9)

If we make the change of variable x+y → r, then y → r−x and the inside summation is taken over all r such that η(r−x) = b or, since η is a group-homomorphism, all r such that η(r) = η(x)+b. Thus (7.3-9) becomes

    (1/k²) Σ_{η(x)=a} Σ_{η(r)=η(x)+b} φ_s(r)

but since η(x) = a inside the brackets, we have

    (1/k²) Σ_{η(x)=a} [ Σ_{η(r)=a+b} φ_s(r) ].

Now {x | η(x) = a} is a coset x+K of G (where x is such that η(x) = a) and therefore has cardinality k = |K|. Therefore (7.3-9) becomes

    (k/k²) Σ_{η(r)=a+b} φ_s(r) = (1/k) Σ_{η(r)=a+b} φ_s(r) = (Agg_η φ_s)(a+b)

which is the desired result. Therefore we know that for any s ∈ G, Agg_η φ_s is a homomorphism from D into R and hence is either a character of D or identically 0. According to Lemma 7.2, however, for all a ∈ D

    PE_η θ_a = φ_{p(a)}.

Aggregating both sides of this equality we have

    Agg_η PE_η θ_a = Agg_η φ_{p(a)}.

Since Agg_η PE_η is the identity on F(D,R), this means that for all a ∈ D

    θ_a = Agg_η φ_{p(a)}.

These are all the characters of D, so for s ∈ G − p(D) it must be true that Agg_η φ_s ≡ 0.   Q.E.D.

Applying this result directly to the formula for F_G, we prove that for any f ∈ F(G,R), the functions F_G f and F_G Proj_{p(D)} f are in the same class of the equivalence relation induced by Agg_η, i.e., when you aggregate them they look the same. In other words, Agg_η F_G = Agg_η F_G Proj_{p(D)}.

Lemma 7.4
1) Agg_η F_G = Agg_η F_G Proj_{p(D)}
2) Agg_η F_G^{-1} = Agg_η F_G^{-1} Proj_{p(D)}

Proof:
Clearly the operator Agg_η is a module homomorphism (or a linear operator for R a field). It suffices, therefore, to prove that Agg_η(F_G − F_G Proj_{p(D)}) maps every function in F(G,R) to the zero function of F(D,R). Let f ∈ F(G,R) and let α ∈ G. We have from (5.2-3) and Def. 6.2 that

    [(F_G − F_G Proj_{p(D)})f](α) = Σ_{β∈G} f(β) φ_α(β) − Σ_{β∈p(D)} f(β) φ_α(β)
        = Σ_{β∈G−p(D)} f(β) φ_α(β)
        = Σ_{β∈G−p(D)} f(β) φ_β(α)   since φ_α(β) = φ_β(α).

Therefore, since Agg_η is a module homomorphism, we have that

    Agg_η[(F_G − F_G Proj_{p(D)})f] = Agg_η( Σ_{β∈G−p(D)} f(β) φ_β ) = Σ_{β∈G−p(D)} f(β) Agg_η φ_β.

Now from Lemma 7.3 we know that Agg_η φ_β ≡ 0 for β ∈ G−p(D). Therefore [Agg_η(F_G − F_G Proj_{p(D)})]f ≡ 0 and part 1 is proved. Part 2 is proved analogously with the observation that if β ∈ G−p(D) then −β ∈ G−p(D), since p(D) is a subgroup of G.   Q.E.D.
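Lemma 7.3 can be checked concretely for G = Z_8, D = Z_4: aggregating a character of G yields a character of D when s ∈ p(D), and zero otherwise. A sketch, under the same conventions as before:

```python
import numpy as np

N, M, k = 8, 4, 2
w = np.exp(2j * np.pi / N)                     # principal Nth root of unity
agg = lambda h: h.reshape(k, M).mean(axis=0)   # Agg_eta over cosets of K

for s in range(N):
    phi = w ** (s * np.arange(N))              # character phi_s of G = Z_8
    if s % k == 0:                             # s in p(D): Agg phi_s = theta_{s/k}
        theta = np.exp(2j * np.pi / M) ** ((s // k) * np.arange(M))
        assert np.allclose(agg(phi), theta)
    else:                                      # s not in p(D): Agg phi_s = 0
        assert np.allclose(agg(phi), 0)
```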

We are now in a position to relate the operators F_G and F_D Res_{p(D)}.

Theorem 7.3

1) F_D Res_{p(D)} = Agg_η F_G
2) F_D^{-1} Res_{p(D)} = k Agg_η F_G^{-1}

Proof:

To prove part 1 we start with Theorem 7.2 part 1, i.e. we have that

PE_η F_D = F_G Emb_{p(D)}.

Aggregating both sides we arrive at

Agg_η PE_η F_D = Agg_η F_G Emb_{p(D)}.

Thus, since Agg_η PE_η is the identity on F(D,R), we get

F_D = Agg_η F_G Emb_{p(D)}.

Hence the equality holds also for restricted functions, or

F_D Res_{p(D)} = Agg_η F_G Emb_{p(D)} Res_{p(D)}.

Since Emb_{p(D)} Res_{p(D)} = Proj_{p(D)} we get

F_D Res_{p(D)} = Agg_η F_G Proj_{p(D)}.

Finally, applying Lemma 7.4 part 1, we get F_D Res_{p(D)} = Agg_η F_G and part 1 is proved. Part 2 follows similarly from Theorem 7.2 part 2 and Lemma 7.4 part 2.

Q.E.D.

Corollary 7.2

k F_D Agg_η = Res_{p(D)} F_G

Proof:

Immediate from Theorem 7.3 part 2 by applying F_D and F_G as in the proof of Corollary 7.1.

Q.E.D.

Recalling that operating with Res_{p(D)} can be thought of as sampling at rate k, we can think of Theorem 7.3 as saying that:

1) Sampling in space (time) corresponds to aggregating in the spatial (temporal) frequency domain.
2) Aggregating in space (time) corresponds to sampling in the spatial (temporal) frequency domain (Corollary 7.2).

In most applications emphasis is placed on statement 1. (This is especially true when the group G is interpreted as time, since sampling is a causal operation but aggregation is not.) The usual statement of this result takes the form of the well-known "sampling theorem" and can be roughly stated as follows (e.g. Schwarz and Friedland [1965]) for functions on ℝ: if a signal has a Fourier transform which vanishes for frequencies above a certain value, then it is completely determined by its values at a denumerable set of equally spaced sampling points; the higher the cut-off frequency, the smaller the sampling intervals. The emphasis is thus placed on the ability to exactly reconstruct a function from its values at sample points, or, in our terms, to exactly

recover a function f ∈ F(G,R) from Res_{p(D)}(f) ∈ F(D,R). Since both F_G and F_D are invertible, it suffices to recover f̂ = F_G f from F_D Res_{p(D)}(f), and we can see from Theorem 7.3 part 1 that this involves "disaggregating" Agg_η F_G f. Of course this can't be accomplished in general, since the map Agg_η loses information, i.e. is not one-one. If, however, we know a priori that the function f̂ ∈ F(G,R) has a non-zero value for exactly one element of every coset of K, and we know which element of the coset it is, then no information is lost by the map Agg_η. It is customary to say that no aliasing occurs. In particular, for G with the decomposition in (7.2-1), this is true if the support of f̂ is the set W = {(s_1,...,s_d) | 0 ≤ s_i < m_i}. Noting that W is the underlying set of D, it can be seen that

f̂(s_1,...,s_d) = k(Agg_η f̂)(s_1,...,s_d) for (s_1,...,s_d) ∈ W.

Then, given Agg_η(f̂), the function f̂ can be obtained by filling in the missing values with zero and multiplying by k, and the function f can be obtained by taking the inverse DFT w.r.t. G. A function like f is called band-limited, since its transform has an appropriately limited support. The recovery of a function with no higher frequency components (a smooth function) is very naturally interpreted as an interpolation procedure. However, these smooth functions are not the only kind of band-limited functions that can be recovered: they are a special case of functions whose transforms have one non-zero value on each coset of K. The reconstruction process in these other cases, though, does not resemble interpolation. Finally, we should point out that there is nothing profoundly unique about band-limited functions that makes their recovery possible. Any function f ∈ F(G,R) can be recovered from Res_{p(D)}(f) provided that we have,

in addition to Res_{p(D)}(f), enough other information about f; namely, we must know 1) that f belongs to a subset of F(G,R) on which the operator Res_{p(D)} is invertible, and 2) how to compute the inverse operator for that subset. The set of band-limited functions is one such subset and is important in applications because membership in it is often a natural consequence of underlying assumptions. For material related to these issues we refer the reader to de Boor's discussion of the method of projections in numerical analysis [1966] and to Zeigler's discussion of justifying conditions for the postulational approach to modeling [1974].

Our application of Theorems 7.2 and 7.3 to linear cellular automata will, however, involve band-limited functions, since the process of band-limiting (filtering) produces homomorphic images (Theorem 6.2). Thus, it will not be necessary to require a priori knowledge of a function's spectrum, since the object is not to reconstruct a function exactly but to recover only enough information for predictive purposes. Whether this situation underlies the importance of band-limited functions in other applications is an interesting question whose discussion may require careful thought about the process of building models.

Before returning to the theory of LCAs, it's necessary to point out that all the results in this section have been proved only for operators on the spaces F(G,R) and F(D,R), where R is an integral domain containing a principal m-th root of unity (m the exponent of G) and the multiplicative inverse of |G|. However, using the extensions of F_G and F_D to vector and matrix valued functions that we gave in Sec. 5.4, it is not difficult to see that it is possible to extend all of our results to operators

on the spaces F(G,Rⁿ), F(G,Mat_R(n,n)), F(D,Rⁿ), and F(D,Mat_R(n,n)): the theorems proved above are simply applied to all of the component functions.

7.4 Homomorphisms of LCAs via Filtering and Sampling

Let S and D be finite abelian groups with canonical decompositions as in (7.3-1) and (7.3-2) respectively. Let the maps p: D → S and η: S → D be defined according to formulas (7.3-3) and (7.3-4), and let K_η be the kernel of η. Assume also that |S| = N, |D| = M, and |K_η| = N/M = k. For every LCA M with structure group S we define an LCA with structure group D whose impulse response functions (and hence weighting functions) are simply related to those of M.

Definition 7.3

For any LCA M = <S,Kⁿ,H>, M_R is the LCA given by <D,Kⁿ,k Res_{p(D)} H>.

The impulse response functions of M_R are k times the restrictions of those of M. An immediate consequence of Lemma 7.1 is the following:

Lemma 7.5

If M = <S,Kⁿ,H> is a decomposable LCA, then M_R is decomposable also.

Proof:

The LCA M_R is decomposable if it satisfies the hypotheses of Theorem 5.4. We have that M satisfies them, so that M_R does also by Lemma 7.1.

Q.E.D.

Even though the structures of M and M_R are simply related, it is unfortunately not true that a simple relationship between their behaviors can always be found. For certain LCAs M, however, a simple relationship does exist. We need the following definition:

Definition 7.4

A subset W of S is compatible with η whenever |W ∩ η^{-1}(a)| = 1 for all a ∈ D.

In other words, W is compatible with η whenever it contains exactly one element of each coset of K_η. It's clear that if this is true, then |W| = |D| and we can write W = {w_a | a ∈ D} where η(w_a) = a.

Suppose M = <S,Kⁿ,H> is a decomposable LCA for which Ĥ(s) = 0 for s ∈ S − W. One can think of M as having "smooth" impulse response functions like those of the LCA M' given in Theorem 6.2: its cells are insensitive to a portion of the spatial frequency spectrum. Now consider the LCA M_R = <D,Kⁿ,k Res_{p(D)} H>. From Theorem 7.3 part 1 we know that the decomposition of M_R is given by M̂_R = [D,Kⁿ,k Agg_η Ĥ].

If W is compatible with η, then it is not hard to see that each non-zero component of the parallel composition M̂ = [S,Kⁿ,Ĥ] is isomorphic to a component of M̂_R and vice-versa. In fact, component w_a of M̂ is the same as component a ∈ D of M̂_R. However, since the first state of M need not be in Proj_W(F(S,Kⁿ)), there is only a homomorphism from M to M_R. This homomorphism becomes Res_{p(D)} when transformed back into the spatial domain. We then have this commutative diagram:

  M  ------- Res_{p(D)} ------->  M_R
  |                               |
  F_S                             F_D
  ↓                               ↓
  M̂ = [S,Kⁿ,Ĥ]  --- Agg_η --->  M̂_R = [D,Kⁿ,k Agg_η Ĥ]

More formally, we can prove the following theorem. Although the proof essentially follows the verbal argument above, we give it in a direct form in order to explicitly show that the homomorphism is the map Res_{p(D)}.

Theorem 7.4

Let W be any subset of S that is compatible with η. Suppose M is a decomposable LCA given by <S,Kⁿ,H> where Ĥ(s) = 0 for s ∈ S − W. Then Res_{p(D)} is a homomorphism from M to M_R.

Proof:

Let δ and δ_R be the transition functions of M and M_R respectively, and let q ∈ F(S,Kⁿ). We need to show that Res_{p(D)}(δq) = δ_R(Res_{p(D)} q). Or, expanding, and dropping unnecessary subscripts,

Res(H *_S q) = (k Res H) *_D (Res q).

By Lemma 7.5, M_R is decomposable, so (noting that F_D is linear) it suffices to show that:

F_D Res(H *_S q) = (k F_D Res H) · (F_D Res q).

By Theorem 7.3 part 1 (for vector valued functions) this becomes:

Agg F_S(H *_S q) = (k Agg Ĥ) · (Agg q̂).

Using the convolution theorem we can write:

Agg(Ĥ·q̂) = (k Agg Ĥ) · (Agg q̂).

Both sides of this equation are functions of D, so for any a ∈ D we need

(Agg(Ĥ·q̂))(a) = (k Agg Ĥ)(a) · (Agg q̂)(a).

Using Definition 7.2, this is the same as:

(1/k) Σ_{η(r)=a} (Ĥ·q̂)(r) = [ Σ_{η(r)=a} Ĥ(r) ] · [ (1/k) Σ_{η(r)=a} q̂(r) ].

But since W is compatible with η, Ĥ(r) ≠ 0 only in case r = w_a. We are thus left with the equality:

(1/k)(Ĥ·q̂)(w_a) = Ĥ(w_a) · (1/k) q̂(w_a).

Since Res_{p(D)} is clearly onto F(D,Kⁿ), we are done.

Q.E.D.

Since the only thing preventing Res_{p(D)} from being an isomorphism is that M can start in a larger number of states than M_R can, we can

prove the following:

Corollary 7.3

Let M be as in Theorem 7.4. The map Res_{p(D)} is an isomorphism from the submachine M_W of M (see Def. 6.5) to M_R.

Proof:

It's necessary only to show that Res_{p(D)} is a bijection when restricted to F_S^{-1} Proj_W(F(S,Kⁿ)), the state set of M_W. First, to see that Res_{p(D)} (so restricted) is onto F(D,Kⁿ), let f be any function in F(D,Kⁿ). Then the function F_S^{-1} q̂, where

q̂(s) = k(F_D f)(a) if s = w_a, and q̂(s) = 0 otherwise,

is in F_S^{-1} Proj_W(F(S,Kⁿ)), and Agg_η(q̂) = F_D f. From Theorem 7.3 part 1 it can be seen that F_D Res F_S^{-1} = Agg_η. Thus F_D Res F_S^{-1} q̂ = F_D f, and since F_D is one-one, Res F_S^{-1} q̂ = f. Therefore Res_{p(D)} is onto even when restricted to functions in F_S^{-1} Proj_W(F(S,Kⁿ)).

Now, to see that Res_{p(D)} is one-one, suppose q, q' ∈ F_S^{-1} Proj_W(F(S,Kⁿ)) and that

Res_{p(D)} q = Res_{p(D)} q'.

Then F_D Res(q) = F_D Res(q') or, by Theorem 7.3 part 1:

Agg_η F_S q = Agg_η F_S q'.

Then for all a ∈ D we have that

(Agg_η F_S q)(a) = (Agg_η F_S q')(a), or

(1/k) Σ_{η(r)=a} q̂(r) = (1/k) Σ_{η(r)=a} q̂'(r).

But, since W is compatible with η, we have

(1/k) q̂(w_a) = (1/k) q̂'(w_a), or q̂(w_a) = q̂'(w_a) for all w_a ∈ W.

And since q, q' ∈ F_S^{-1} Proj_W(F(S,Kⁿ)) we have that q̂(s) = q̂'(s) = 0 for s ∈ S − W. Therefore q̂ = q̂' and hence q = q'.

Q.E.D.

Again, as for the submachine of Theorem 6.2, the submachine M_W is always entered by the initial transition of M, so that we really need not be concerned with the distinction between M and M_W. Intuitively (bearing in mind this caveat) these results say that if it is known a priori that an LCA M has "smooth" weighting functions, then it is the same as an LCA in which the weighting functions are sampled versions of those of M. This machine will occupy a smaller "space" and will keep exact track of the sampled versions of the states of M. No information is lost, since the cells of M are insensitive to a part of the spatial frequency spectrum. In the LCA M_R this insensitivity is manifested as a smaller number of cells. As indicated by our remarks at the conclusion of Sec. 7.3, however, it is possible to eliminate the requirement of a priori knowledge about the cells of M. The price paid at this lower level of knowledge is that M_R will be a homomorphic image of M and will process less information than M: it will keep exact track of smoothed and sampled states of M.

To see how this is done, notice that for an LCA to satisfy the hypothesis of Theorem 7.4, it must have impulse response functions each with spectrum having support (contained in) the set W. The LCA M' of Theorem 6.2 satisfies this requirement since H' = Σ_W H = [F_S^{-1} Proj_W F_S]H. Taking F_S of both sides, we get Ĥ' = Proj_W Ĥ, so that Ĥ'(s) = 0 for s ∈ S − W. Moreover, M' is a homomorphic image of the LCA M of Theorem 6.2. Using this observation we can prove the following result:

Theorem 7.5

Let M = <S,Kⁿ,H> be a decomposable LCA and let W be any subset of S compatible with η. Then Res_{p(D)} Σ_W is a homomorphism from M to the LCA M″ = <D,Kⁿ,k Res_{p(D)} Σ_W H>.

Proof:

Let M' be the LCA <S,Kⁿ,Σ_W H>. Theorem 6.2 says that the submachine M'_W of M' is a homomorphic image of M, and from its proof we have that the homomorphism is Σ_W: F(S,Kⁿ) → F(S,Kⁿ). Since Σ_W = F_S^{-1} Proj_W F_S, it's clear that (F_S Σ_W H)(s) = 0 for s ∈ S − W. Thus, Corollary 7.3 applies to M', and since M″ = M'_R, we have the following commutative diagram:

  M  --- Σ_W --->  M'_W  --- Res_{p(D)} --->  M″

Q.E.D.
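The content of Theorem 7.5 can be checked numerically. Below is a minimal sketch (Python with NumPy); the maps η(s) = s mod 4 and p(a) = 2a and the window W = {0,1,2,3} are assumptions chosen to be compatible, and the weighting function is random:

```python
import numpy as np

# A numerical sketch of Theorem 7.5 for S = Z_8, D = Z_4 (k = 2), with the
# assumed maps eta(s) = s mod 4 and p(a) = 2a, and the compatible window
# W = {0,1,2,3} (one element of each coset of K_eta = {0,4}). Over the
# complex numbers the DFT exists, so any H gives a decomposable M.
N, Md = 8, 4
k = N // Md
rng = np.random.default_rng(0)

def dft(f):                   # F f(s) = sum_r f(r) w^{sr}, w = exp(2*pi*i/n)
    n = len(f); w = np.exp(2j * np.pi / n)
    return (w ** np.outer(np.arange(n), np.arange(n))) @ f

def idft(g):
    n = len(g); w = np.exp(-2j * np.pi / n)
    return (w ** np.outer(np.arange(n), np.arange(n))) @ g / n

def conv(f, g):               # cyclic convolution on Z_n
    n = len(f)
    return np.array([sum(f[r] * g[(s - r) % n] for r in range(n)) for s in range(n)])

proj = np.zeros(N); proj[:Md] = 1        # Proj_W for W = {0,1,2,3}
smooth = lambda q: idft(proj * dft(q))   # Sigma_W = F_S^{-1} Proj_W F_S
res = lambda q: q[::k]                   # Res_{p(D)}: sample at p(D) = {0,2,4,6}

H = rng.standard_normal(N)               # weighting function of an LCA M on Z_8
q = rng.standard_normal(N)               # a state of M
H2 = k * res(smooth(H))                  # weighting function of M'' on Z_4

lhs = res(smooth(conv(H, q)))            # map the next state of M down to D
rhs = conv(H2, res(smooth(q)))           # step M'' on the mapped state
assert np.allclose(lhs, rhs)
print("Res_{p(D)} Sigma_W is a homomorphism on this example")
```

Note that no band-limit is assumed on H here: the filtering map Σ_W supplies the smoothness, which is exactly the point of Theorem 7.5.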

We have succeeded in relating LCAs which have different structure groups. The map Res_{p(D)} Σ_W is used in two ways in Theorem 7.5: first, interpreted as operating on n×n matrix valued functions (and multiplied by k), it relates the structures of M and M″; second, operating on vector valued functions, it relates the states of M and M″ as indicated in the proof. Intuitively, operating with this map first filters a function and then samples it. It is therefore not necessary to know a priori that a state of M is appropriately band-limited. The homomorphic image M″ will keep track of the states of M exactly, whether they are band-limited or not, but only to a certain level of resolution. Ex. 7.2 illustrates Theorems 7.4 and 7.5 and their proofs.

Example 7.2

Consider the LCA M = <Z_4,C,H> where H is shown in Fig. 7.2. The group D is Z_2 and W = {0,1}. For the sake of clarity, interconnections in M and M' are shown only for cell 0; it should be kept in mind that these networks resemble the one shown in Fig. 5.2. The maps η and p are as follows:

  η: Z_4 → Z_2:  0 ↦ 0,  1 ↦ 1,  2 ↦ 0,  3 ↦ 1
  p: Z_2 → Z_4:  0 ↦ 0,  1 ↦ 2

Figure 7.2: Illustration of Theorems 7.4 and 7.5 for S = Z_4, D = Z_2, and W = {0,1}. The maps shown are between state sets.

Since η maps each element in W = {0,1} to a different element of Z_2, W is compatible with η. The maps shown in Fig. 7.2 are those on state sets. Note that the structures of M̂' and M̂″ are related by 2Agg_η and that the structures of M' and M″ are related by 2Res_{p(D)}.

In Sec. 6.5 we showed that the map Σ_W can be represented more concretely as convolution by χ̃_W = F_S^{-1} χ_W, where χ_W is the characteristic function of the window W. Thus the map Res_{p(D)} Σ_W can be given as follows:

(Res_{p(D)} Σ_W q)(a) = [Res_{p(D)}(χ̃_W *_S q)](a) = Σ_{r∈S} χ̃_W(r) q(p(a)−r) = Σ_{r∈S} χ̃_W(p(a)−r) q(r).

For the special case illustrated in Fig. 7.2, Res_{p(D)} Σ_W can be taken to map C⁴ into C² and has matrix representation given by:

  [ 1/2   (1+i)/4   0     (1−i)/4 ]
  [ 0     (1−i)/4   1/2   (1+i)/4 ]      (7.4-1)

The second row is the same as the first but moved (cyclically) two spaces to the right. Compare this matrix with matrix (7.2-2). This kind of operator can be interpreted as follows: to compute (Res_{p(D)} Σ_W q)(a) at a point a ∈ D, center the function χ̃_W over p(a) ∈ S, pointwise multiply, and then sum over r ∈ S. One can think of this operator as an instrument with which one views states of M. A probe (e.g. a kind of microelectrode) is inserted into a configuration at a point in p(D),

i.e. at one of an equally spaced number of sites. Instead of producing a reading that is the state of the cell at that site, though, it produces a value that is a weighted sum over an area around the site. In other words, the probe has a non-local receptive field. The weighting function is determined by the window W. Fig. 7.3 a) shows the weighting function of the "probe" associated with the map Res_{p(D)} Σ_W used in Ex. 7.2. It is a complex-valued weighting function which, with appropriate shifts, constitutes the rows of matrix (7.4-1). Fig. 7.3 b) shows a more familiar weighting function associated with low-pass filtering. It has real values, since the associated characteristic function of the window W is an even function. For large groups it is possible for χ̃_W to be even and W to be compatible with η.

Figure 7.3: Weighting functions associated with filtering and sampling "probes". a) the complex-valued weighting function of Ex. 7.2; b) a real-valued low-pass weighting function.
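Matrix (7.4-1) is easy to generate. The sketch below (Python with NumPy) computes the probe matrix for Ex. 7.2; the character convention ψ_s(r) = i^{sr} and the maps η(s) = s mod 2, p(a) = 2a are assumptions, and the conjugate convention would swap the (1+i)/4 and (1−i)/4 entries:

```python
import numpy as np

# The "probe" matrix (7.4-1) for Ex. 7.2: S = Z_4, D = Z_2, W = {0,1}.
N, Md = 4, 2
w = 1j                                         # principal 4th root of unity
Finv = (w ** -np.outer(np.arange(N), np.arange(N))) / N   # F_S^{-1}

chi_W = np.array([1, 1, 0, 0])                 # characteristic function of W
Xw = Finv @ chi_W                              # tilde-chi_W = F_S^{-1} chi_W
# Row a of the probe: (Res_{p(D)} Sigma_W q)(a) = sum_r Xw(p(a) - r) q(r)
probe = np.array([[Xw[(2*a - r) % N] for r in range(N)] for a in range(Md)])
print(np.round(probe, 3))
# [[0.5 +0.j    0.25+0.25j  0.  +0.j    0.25-0.25j]
#  [0.  +0.j    0.25-0.25j  0.5 +0.j    0.25+0.25j]]
```

The two rows are cyclic shifts of the single weighting function χ̃_W, which is the "receptive field" of the probe discussed above.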

We have shown, therefore, that the two objections raised in connection with expression (7.2-1) can be met by a map that relates LCAs with different structure groups. The first objection, i.e. the intuitive complexity of map (7.2-1), is met since the map Res_{p(D)} Σ_W has an interpretation as "smoothing and sampling" that might provide a plausible model for many kinds of real data gathering techniques. The second objection was that care must be taken in the case of finite fields so that the smaller LCA would be decomposable. This difficulty is avoided for LCAs related by Res_{p(D)} Σ_W since, when F_S exists, so does F_D.

There are, however, several facts about the map Res_{p(D)} Σ_W that create other problems. First, the functions χ̃_W will generally have very large support and will always have non-zero values arbitrarily far away, because they are inverse transforms of functions with small support. This means that the receptive fields of the corresponding "probes" are arbitrarily wide. For most reasonable cases (e.g. low-pass filtering) this is mitigated somewhat by the fact that these functions have values very close to zero at outlying points (cf. Fig. 7.3 b)) and might be approximated by more locally supported functions. But the second and more serious problem is that for the sampling rate required to produce a homomorphism, the receptive fields associated with neighboring sites generally overlap very non-trivially (an exception is treated in the next section). It is therefore misleading to think of a cell of the homomorphic image M″ as a model of a section or block of cells in a partition of the structure group of the original cellular automaton, e.g., in Yamada

and Amoroso's terms [1971], a kernel block (which is not the same as a coset).* In fact, as we remarked in Sec. 7.1, care must be taken when cellular automata are simplified by regarding a block of neighboring cells as a single, more complex cell. Such aggregations can indeed give rise to homomorphic images (see Yamada and Amoroso [1971]), but care must be taken to prevent the lumped model from producing artifactual behavior due to aliasing of spatial frequencies. One kind of aggregation that seems natural is to take the state of a block at any time t to represent the sum of the states at time t of all the cells that make up that block. In our framework, this kind of aggregation can be viewed as convolving in space with the characteristic function, say χ, of a block, and then sampling at a slow enough rate to remove overlap of blocks in space. This kind of "filtering" and sampling does not cleanly remove high spatial frequencies, since the DFT of χ is not a window. In fact, it has very broad support, so that the sampling rate required to eliminate overlap of the blocks in space is not fast enough to eliminate aliasing of spatial frequencies (at least for nontrivial blocks, i.e. blocks which are less than the whole space but bigger than single cells; an exception, however, is discussed in the next section).

* Filtering theory as developed for signal processing is concerned primarily with approximating ideal filters with ones that involve weighting functions that are more local, in particular, with ones that can be causally realized (in time). A variety of window shapes (i.e. not characteristic functions) have been investigated which have helpful properties. However, while approximate homomorphisms can be produced using these windows, which require less global aggregation, disjoint aggregation always involves undersampling. See Gold and Rader [1969].
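The breadth of a block's spectrum is easy to see numerically. A small sketch (Python with NumPy; the group Z_16 and the block {0,1,2,3} are illustrative choices, not taken from the text):

```python
import numpy as np

# Why block averaging is not clean filtering: on Z_16, the DFT of the
# characteristic function of the block {0,1,2,3} has broad support,
# unlike the window functions used for Sigma_W above.
N = 16
w = np.exp(2j * np.pi / N)
F = w ** np.outer(np.arange(N), np.arange(N))

chi_block = np.zeros(N); chi_block[:4] = 1
spectrum = np.abs(F @ chi_block)
print(np.round(spectrum, 2))
# Non-zero at 13 of the 16 frequencies: sampling every 4th cell after
# block-averaging therefore aliases high spatial frequencies.
```

Only three frequencies are annihilated, so the rate at which disjoint blocks permit sampling is far too slow to avoid aliasing, which is the point of the paragraph above.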

These observations are well known among people who are concerned with any form of signal processing. They are related to what is known as the "uncertainty principle" for Fourier transforms (see Bracewell [1965]). One can remove overlap in the spatial domain (i.e. produce a blocking of cells) or in the spatial frequency domain (i.e. eliminate aliasing), but not in both domains at the same time. To actually filter out the fine spatial detail requires non-local and overlapping "aggregation". Therefore, in order to justify simplifying a system by means of local disjoint aggregates, one generally has to appeal to assumptions other than a combination of linearity and uniformity.

An exception to these remarks is discussed next. We describe how a true homomorphic image of an LCA can be produced by certain kinds of structural blockings and aggregations.

7.5 Homomorphisms of LCAs via Spatial Aggregation

The operator given by Def. 7.2 was called an aggregation operator because it produces a set of sums over disjoint subsets of G, namely, the cosets of K. Applying Corollary 7.2 to an LCA shows that homomorphisms can be produced by aggregation in space. Let S, D, η, and p be as defined in the previous section. The kernel of η is K_η and |K_η| = k. For every LCA M with structure group S we define an LCA with structure group D that is an aggregated version of M:

Definition 7.5

For any LCA M = <S,Kⁿ,H>, M_A is the LCA given by <D,Kⁿ,k Agg_η H>.

An immediate consequence of Lemma 7.1 is the following:

Lemma 7.6

If M = <S,Kⁿ,H> is a decomposable LCA, then M_A is also decomposable.

We saw in the previous section that under certain conditions on M the restricted (i.e. sampled) version of M is a homomorphic image. The situation is more pleasant for the aggregated version of M: if M is decomposable, then M_A is a homomorphic image of M. Intuitively this theorem is true for the following reasons: first, Corollary 7.2 says that aggregating in space corresponds to sampling in the spatial frequency domain; but since the frequency domain representation of an LCA is a parallel composition, selecting any subset of components produces a homomorphic image (cf. Theorem 6.2). Moving back to the spatial domain, this selection appears as aggregation. These ideas are perhaps less well known than those underlying Theorems 7.4 and 7.5, since most interpretations treat S as a singly generated representation of time. This sort of aggregation in time would always be non-causal (or trivial); in fact, each aggregate would contain information from times arbitrarily far in the future.

Theorem 7.6

If M is a decomposable LCA, then Agg_η is a homomorphism from M onto M_A.

Proof:

This proof is analogous to that of Theorem 7.4. Let δ and δ_A be the transition functions of M and M_A respectively, and let q ∈ F(S,Kⁿ).

We need to show that Agg_η(δq) = δ_A(Agg_η q). Or, expanding, that

Agg_η(H *_S q) = (k Agg_η H) *_D (Agg_η q).

By Lemma 7.6, M_A is decomposable, so (noting that F_D is linear) it suffices to show that:

F_D Agg_η(H *_S q) = (k F_D Agg_η H) · (F_D Agg_η q).

By Corollary 7.2 (for vector valued functions) this becomes:

(1/k) Res F_S(H *_S q) = (Res Ĥ) · ((1/k) Res q̂).

Using the convolution theorem we can write:

(1/k) Res(Ĥ·q̂) = (Res Ĥ) · ((1/k) Res q̂).

From the definition of Res, this is the same as

(1/k)(Ĥ·q̂)∘p = (Ĥ∘p) · ((1/k) q̂∘p).

Both sides of this equation are functions of D, so for any s ∈ D we need

(1/k)(Ĥ·q̂)(p(s)) = Ĥ(p(s)) · (1/k) q̂(p(s)).

But this is just what is meant by pointwise multiplication of functions. Since Agg_η is clearly onto F(D,Kⁿ), we are done.

Q.E.D.

We give two examples of this result. The first shows what happens for a simple cyclic group, and the second shows an application which allows the dimension of the underlying space to be reduced.

Example 7.3

Let M = <Z_8,C,H>, D = Z_4, and let the group-homomorphism η be given by:

  η: Z_8 → Z_4:  0 ↦ 0,  1 ↦ 1,  2 ↦ 2,  3 ↦ 3,  4 ↦ 0,  5 ↦ 1,  6 ↦ 2,  7 ↦ 3

Fig. 7.4 shows M, M_A, and their decompositions. The functions H and 2Agg_η H are shown for cell 0 only for the sake of clarity.

Figure 7.4: A homomorphism from spatial aggregation: M_A is a homomorphic image of M. The figure shows M, M_A, and their decompositions, together with the maps 2Agg_η and Res_{p(D)}.

For the simple cyclic case shown in Fig. 7.4, it can be seen that each cell of M_A keeps track of the sum of the states of evenly spaced cells in M, i.e. those cells in a coset of K_η. Thus, while the blocks in the aggregation are disjoint, they do not consist of contiguous cells. However, this kind of aggregation has a very useful interpretation for simulation purposes. Suppose one is interested in experimenting with an LCA that has a very large cyclic structure group; one that is so large, in fact, that it can be considered infinite, i.e. non-cyclic. Suppose also that storage and/or time constraints prevent the simulation of any LCA with structure group bigger than Z_N. If the cells of the original LCA have neighborhoods small enough (e.g., in Fig. 7.4, d = g = f = 0), then the LCA having the same kind of cells as the larger LSM but with structure group Z_N is a natural candidate for simulation. According to Theorem 7.6, this smaller LCA M_A is a homomorphic image of the large one, M, via the map Agg_η. What inferences can be made about the behavior of M by experimenting with M_A? First, it is intuitively clear that for initial configurations with small enough support, the behavioral trajectories of M_A and M will be essentially identical until the configurations begin to wrap around Z_N. So much is also true for non-linear cellular automata. In the case of LCAs, when the configurations wrap around they superimpose (overlap and add). When this happens, the homomorphism Agg_η begins to act non-trivially by actually losing information, i.e. by producing aliasing in the spatial domain. In the spatial frequency domain, however, the situation appears differently. The decomposition of M_A is a sampled version of the

decomposition of M, as in Fig. 7.4. The larger N is, the closer together these samples are taken. Thus, if a very large LCA is to be specified and/or observed in the spatial frequency domain, it is not misleading to specify and/or observe a frequency sampled LCA. The significance of these facts is that for simulation purposes our theoretical restriction to LCAs with finite structure groups is no real limitation. Keeping in mind the action of the homomorphism Agg_η in both the spatial and spatial frequency domains, one finds that the infinite case, while perhaps more aesthetically satisfying, does not promise an increase in the understanding of linear cellular automata.

The next example shows how Theorem 7.6 can be used to reduce the dimension of an LCA.

Example 7.4

Let M = <Z_2 × Z_4,C,H>, D = Z_4, and η given by projection onto the second coordinate:

  η: Z_2 × Z_4 → Z_4:  (i,j) ↦ j

The map η is clearly a group-homomorphism. Fig. 7.5 shows M, M_A, and their decompositions.

Figure 7.5: Reducing the dimension of an LCA by spatial aggregation.

Compare this figure with Fig. 7.4: M of Fig. 7.4 and M of Fig. 7.5 have the same number of cells but have different structures, since Z_8 and Z_2 × Z_4 are not isomorphic as groups (no single generator exists for Z_2 × Z_4). Although the diagrams for these two LCAs appear to be the same except for a rearrangement of the cells, when the connections for all of the cells are filled in, the difference in structure becomes clear. For example, in Fig. 7.4 cell 1 influences cell 0 through the coefficient h, but in Fig. 7.5 cell (0,1) influences cell (0,0) through the coefficient d. Both of these LCAs, however, have the same aggregated version M_A. For M of Fig. 7.5, though, the cells in a block are adjoining in the structure specified by Z_2 × Z_4.
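Theorem 7.6 itself can also be exercised numerically. The identity Agg_η(H *_S q) = (k Agg_η H) *_D (Agg_η q) at the heart of its proof can be checked directly, without any transform; the following is a minimal sketch (Python with NumPy; S = Z_8, D = Z_4 with the assumed η(s) = s mod 4, and a randomly chosen H):

```python
import numpy as np

# A numerical sketch of Theorem 7.6 for S = Z_8, D = Z_4 (k = 2):
# aggregation over the cosets of K_eta commutes with the LCA dynamics,
# with no smoothness assumption on the weighting function H.
N, M_ = 8, 4
k = N // M_
rng = np.random.default_rng(1)

def conv(f, g):                          # cyclic convolution
    n = len(f)
    return np.array([sum(f[r] * g[(s - r) % n] for r in range(n)) for s in range(n)])

agg = lambda f: np.array([f[a::M_].sum() / k for a in range(M_)])  # Agg_eta

H = rng.standard_normal(N)               # any weighting function on Z_8
q = rng.standard_normal(N)               # any state
assert np.allclose(agg(conv(H, q)), conv(k * agg(H), agg(q)))
print("Agg_eta(H * q) = (k Agg_eta H) * (Agg_eta q)")
```

Each cell of the aggregated machine tracks the (scaled) sum of the states of one coset of K_η, exactly as described for Fig. 7.4 above.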

In addition to being an automaton-homomorphism from M to M_A, the map Agg_η: F(S,Kⁿ) → F(D,Kⁿ) given in Theorem 7.6 is an example of what Zeigler calls a strong structure preserving morphism (see Zeigler [1971, 1974]). This concept consists of three parts. First, there is a coordinate mapping that maps the coordinate set of M onto the coordinate set of M_A. In our case, these coordinate sets are the structure groups S and D respectively, and η is the coordinate mapping. It induces a partition of S consisting of the cosets of K_η (the kernel of η). Secondly, there is a family of coordinate state set mappings. In our case, each of these maps simply averages the states of cells in a coset of K_η to produce the state of a cell in M_A. These maps are required to produce an automaton-homomorphism when acting together. Theorem 7.6 says that this is true, since Agg_η is the result of averaging over each coset of K_η. Thus, in Zeigler's terminology, Agg_η is a weak structure preserving morphism. If, in addition to these properties, a third requirement is met, then Agg_η is a strong structure preserving morphism. This requirement is that the coordinate mapping (here η) is a digraph morphism. This means that there is a connection between two cells of M_A only if there are cells of M in the corresponding blocks that are connected. In other words, the aggregation process does not introduce misleading dependencies. This is true in our case, since the connection coefficients in M_A are sums of the appropriate connection coefficients in M. Dependences can disappear (sum to 0) but cannot arise (0+0 = 0).

In the next section we complete our application of sampling theory to LCAs by pointing out a connection between a well known fact about

DFTs and an idea that developed in an automata theoretic context.

7.6 Laminated LCAs

Yamada and Amoroso [1974] introduced the terms laminations and laminal subautomata for describing the structure of certain cellular automata. For the case of LCAs some potentially interesting connections can be made between these concepts and some well known facts from harmonic analysis. We will describe these concepts as they specialize to the case of finite abelian groups, noting that nothing fundamental is lost by abandoning their module theoretic framework.

We pointed out in Chapter 3 that the group graph underlying a cellular automaton is completely connected if and only if the neighborhood template X contains a set of generators for the structure group S. If this is not the case, then the cellular automaton is said to be laminated. In general, X will generate a subgroup of S. Suppose this subgroup is p(D) for some group D according to the definitions of Sec. 7.3. Consider an LCA M = <S,Kⁿ,H> where the support of H is X and X generates p(D); this means that X is the neighborhood template of M. Then it should be clear that Proj_{p(D)} H = H, and that cells in different cosets of p(D) in S cannot communicate with each other in any number of time steps. There are |S|/|D| = k distinct cosets of p(D), and therefore k different laminal subautomata of M can be defined. We summarize these ideas using our notation by the following definition. It is essentially the same as that of Yamada and Amoroso except that it allows the possibility of a laminal subautomaton being itself laminated. Let S, D, p, η, and k be as defined in Sec. 7.3 and denote the

cosets of p(D) by A_0, ..., A_{k−1}.

Definition 7.6

An LCA M = <S,Kⁿ,H> is said to be laminated w.r.t. D if Proj_{p(D)} H = H. For i = 0,...,k−1, M(A_i) is the subautomaton of M with state set Proj_{A_i}(F(S,Kⁿ)). These subautomata are called laminal subautomata of M.

Since cells associated with different cosets cannot influence each other, M is isomorphic to the parallel composition of the M(A_i). The isomorphism is given by

q ↦ (Proj_{A_0} q, Proj_{A_1} q, ..., Proj_{A_{k−1}} q) for q ∈ F(S,Kⁿ).

Yamada and Amoroso go further and point out that each laminal subautomaton is isomorphic to a nonlaminal automaton. Using our notation this result means that for each i = 0,...,k−1, M(A_i) is isomorphic to the LCA M_L = <D,Kⁿ,Res_{p(D)} H>. If M_L happens to be laminated also, further restrictions will finally result in a nonlaminal automaton, i.e. one whose neighborhood contains a generating set for its structure group. It's interesting to note that the isomorphism from M(A_i) to M_L can be interpreted simply as a change of spatial scale. It can also be seen that Res_{p(D)} is an automaton-homomorphism from M to M_L (this follows from the fact that Res_{p(D)} Proj_{p(D)} = Res_{p(D)}).

All of the facts described above can be formulated and proved for arbitrary cellular automata. What additionally happens in the linear case? It turns out that a laminated LCA has a specific kind of decomposition.

Theorem 7.7

Suppose M = <S,Kⁿ,H> is a decomposable LCA that is laminated w.r.t. D. Then M̂ = [S,Kⁿ,Ĥ] where Ĥ = PE_η F_D Res_{p(D)} H.

Proof:

From Theorem 7.2 part 1 we have that

F_S Emb_{p(D)} = PE_η F_D.

Then

F_S Emb_{p(D)} Res_{p(D)} = PE_η F_D Res_{p(D)},

or, since Emb_{p(D)} Res_{p(D)} = Proj_{p(D)},

F_S Proj_{p(D)} = PE_η F_D Res_{p(D)}.

But since M is laminated w.r.t. D, we have

Ĥ = F_S H = F_S Proj_{p(D)} H = PE_η F_D Res_{p(D)} H.

Q.E.D.

This result says that the decomposition of M consists of k identical sets of components, where each set is the parallel decomposition of the LCA that we've called M_L, i.e. the re-scaled version of a laminal subautomaton. Using the terminology of Cooley, Lewis, and Welch [1969], M_L can be "stretched" to become a laminal subautomaton of M. Ex. 7.5 illustrates the situation for an LCA like those discussed in Ex. 4.4 and Ex. 5.6.

Example 7.5

Let M = <Z_8,C,H> be the LCA shown in Fig. 7.6 a). Since each cell only depends on the cell two sites to its left, it is laminated w.r.t. D = Z_4. The re-scaled LCA M_L and its decomposition M̂_L are shown in Fig. 7.6 b). Note that M simply consists of two copies of M_L operating side-by-side.

Figure 7.6: Illustration of Theorem 7.7. a) M is laminated w.r.t. Z_4. b) The LCA M_L and its decomposition.
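The lamination in Ex. 7.5 can be replayed numerically. A minimal sketch follows (Python with NumPy); the particular H, an impulse two sites away, is a hypothetical stand-in for the weighting function drawn in Fig. 7.6:

```python
import numpy as np

# Lamination on Z_8: a weighting function supported on p(D) = {0,2,4,6}
# (even shifts only) never mixes the even and odd cosets, so M runs as
# two independent copies of M_L on Z_4.
def conv(f, g):
    n = len(f)
    return np.array([sum(f[r] * g[(s - r) % n] for r in range(n)) for s in range(n)])

H = np.array([0.0, 0, 1, 0, 0, 0, 0, 0])   # depends on the cell two sites away
q = np.arange(8.0)                          # an arbitrary state of M
q1 = conv(H, q)                             # one step of M

HL = H[::2]                                 # Res_{p(D)} H: re-scaled M_L on Z_4
assert np.allclose(q1[::2], conv(HL, q[::2]))     # even coset evolves as M_L
assert np.allclose(q1[1::2], conv(HL, q[1::2]))   # odd coset evolves as M_L
print("M consists of two copies of M_L operating side by side")
```

Restricting to either coset is just a change of spatial scale, in agreement with the remark following Definition 7.6.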

Recall that in Ex. 4.4 we discussed an LCA like that shown in Fig. 7.6 b) (but with infinite structure group) as representing a finite difference scheme for the simplified wave equation (4.5-2). Call this LCA M. We remarked that the approximation produced by this scheme has no truncation error. Of course this is a trivial observation, since the solution of initial value problem (4.5-2) can be explicitly written down, but it provides a point of contact between the terminology of numerical analysis and that of automata theory. If the PDE in (4.5-2) were thought of as specifying a cellular automaton with structure group ℝ, then the statement that the LCA M represents an approximation with no truncation error means that M is a homomorphic image of the PDE-specified cellular automaton. In fact, for the simplified wave equation, M is isomorphic to a laminal subautomaton of the continuous cellular automaton. If p: Z → ℝ is the inclusion map, then this homomorphism is Res_{p(D)}. In other words, M is related to the PDE-specified cellular automaton in the same way that M_L is related to M in Fig. 7.6. It seems possible that a theory of cellular automata generalized along the lines discussed in Sec. 3.6 would support a rigorous development of this observation.

Chapter 8

Linear Cellular Automata as Models of Non-uniform Structures

8.1 Introduction

In constructing models of systems whose components are distributed in space, whether the models are expressed as cellular automata or as partial differential equations, it is often assumed that small spatial non-uniformities exhibited by the "real" system can be ignored without causing the model to produce misleading behavior. This is presupposed by the choice of a cellular automaton model (according to our definition) and, in the case of linear PDEs, leads to an equation with constant coefficients. For the automata theoretic formulation, this simplifying premise is adopted because it simplifies the programming of a simulation model and permits a sometimes necessary reduction in time and/or storage requirements. In the case of PDEs, the assumption of constant coefficients leads to the possibility of applying powerful analysis techniques. In fact, many of these techniques are based on the ability to "separate the variables" of linear PDEs with constant coefficients (and for certain boundary conditions), which is the continuous analog of our parallel decomposition result for LCAs (Theorem 5.4).

Our intuition tells us that if we're interested in large scale spatial behavior, such as the speed and position of wave fronts, then any fine spatial inhomogeneities can be ignored. Formally, this intuitive feeling might be expressed in terms of cellular automata and automaton-homomorphisms: a map which preserves only large-scale aspects of configurations is a homomorphism from certain kinds of spatially inhomogeneous structures to certain cellular automata. In

this chapter we provide a way of looking at spatially inhomogeneous linear structures (i.e. general linear networks) which allows us to specify conditions under which they can be homomorphically modeled by LCAs. As in Chapter 7, the overall point of view follows Zeigler's concepts of base and lumped models (Zeigler [1974]). We will postulate that the "real" system has a certain formal structure (base model) and ask when certain classes of models (lumped models) can be expected to preserve aspects of its behavior. Our results show that for certain kinds of spatial non-uniformities, the intuition often employed in formulating uniform models can be rigorously defended.

8.2 Non-uniform Linear Cellular Automata

The title of this section is an inconsistent usage of our terminology, but we use it to emphasize the peculiar point of view to be adopted. By the definition given in Chapter 4, an LCA is a linear sequential machine that has a uniform structure. In this chapter we will study arbitrary linear sequential machines, but we will view them within the framework used to study uniform structures. Instead of viewing them as abstract LSMs, we will view them as LCAs having certain kinds of non-uniformities. The situation is analogous to standard treatments of time varying linear systems (Chen [1970], Schwarz and Friedland [1965]) in which behavioral trajectories are still regarded as functions of an underlying group (time) even though the system is not uniform with respect to the group (i.e. is not time invariant). In fact, the results we will derive here can automatically be regarded as results for time-varying linear systems with the appropriate considerations of causality and the usual non-cyclic models of time.

Let S be a finite abelian group and K a field.

Definition 8.1

A non-uniform linear cellular automaton (NULCA) is an autonomous linear sequential machine M = <Q,δ> which is associated with a finite abelian group S in the following way:

i) Q = F(S,Kⁿ) for some non-negative integer n.
ii) Let β_a: S → Mat_K(n,n) for each a ∈ S. The transition function δ: Q → Q of M is given by δ(q) = q' where

q'(s) = Σ_{r∈S} β_{s−r}(r) q(s−r)   (8.1-1)

for all s ∈ S.

We will denote the NULCA M by <S,Kⁿ,{β_a}>.

What are these functions β_a, a ∈ S, and how do they relate to the function H of an LCA? To answer these questions suppose q ∈ F(S,Kⁿ) is given by

q(s) = 1 for s = a, and q(s) = 0 otherwise.

Then according to (8.1-1), δ(q) = q' where

q'(s) = Σ_{r∈S} β_{s−r}(r) q(s−r) = β_a(s−a) = (L_a β_a)(s).

Thus, the response to an impulse at cell a is L_a β_a, so the component functions of β_a are not themselves the impulse response functions for cell a: they are instead the impulse response functions shifted to the left, i.e., shifted to the origin. Equivalently,

β_a specifies the impulse response of cell a, but the coordinate system is translated so that cell a is regarded as located at the origin. It's important to understand this point, so the reader is urged to study the example shown in Fig. 8.1.

Figure 8.1: The functions β_a and the impulse response functions IR_a, a ∈ Z_3, for a NULCA.

It can be seen, then, that a NULCA is an LCA M = <S,Kⁿ,H> when β_a = H for all a ∈ S, since when this is true equation (8.1-1) is the same as (4.2-3).* Note that the reflection β̃_a of a function β_a

* This is not a necessary condition for a NULCA M' to be an LCA, since M' might be uniform with respect to a group other than S and have cell state set K^m where m ≠ n. In fact, according to our definition, any LSM is an LCA over the trivial group.

(β̃_a(s) = β_a(−s)) does not correspond to what we've called the weighting function of cell a. This can be seen in Fig. 8.1: β̃_0 = (a,0,b), but the weighting function of cell 0 is (a,0,f).

It is also instructive to compare our definition with the usual representation of linear operators on function spaces. Suppose the underlying group is still S. Then an arbitrary linear operator A is represented by a function of S × S called the kernel of the operator (which is not related to the kernel of a group-homomorphism). Let A: S × S → Mat_K(n,n) denote this kernel. For q ∈ F(S,Kⁿ), Aq = q' where

q'(s) = Σ_{r∈S} A(s,r) q(r) for all s ∈ S.

In this case the kernel is the same as a matrix where A(s,r) is the entry in the s-th row and r-th column. Usually the term kernel is reserved for situations when the group is ℝ and the summation is integration, but it plays the same role as a matrix. Using this notation, a linear operator on F(S,Kⁿ) with kernel A can be viewed as the transition function of a NULCA by defining each β_a, a ∈ S, as follows:

β_a(s−a) = A(s,a),   (8.1-2)

or β_a(s) = A(s+a,a) for all s ∈ S. In fact, it should be clear that any linear operator on a finite dimensional vector space can be viewed as a NULCA. If N is the dimension of the vector space, take S to be an abelian group with |S| = N (e.g. Z_N) and derive the functions β_a, a ∈ S, from the matrix of the operator.
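As a concrete sketch (Python with NumPy; the 4×4 matrix is arbitrary), formula (8.1-2) converts a matrix into the local functions β_a, and the NULCA transition (8.1-1) then reproduces the matrix action exactly:

```python
import numpy as np

# Any linear operator A on K^N, viewed as a kernel on S x S (here S = Z_4),
# determines the local functions beta_a of a NULCA via (8.1-2).
N = 4
rng = np.random.default_rng(2)
A = rng.standard_normal((N, N))          # arbitrary kernel/matrix

beta = {a: np.array([A[(s + a) % N, a] for s in range(N)]) for a in range(N)}

def delta(q):                            # the NULCA transition (8.1-1)
    return np.array([sum(beta[(s - r) % N][r] * q[(s - r) % N] for r in range(N))
                     for s in range(N)])

q = rng.standard_normal(N)
assert np.allclose(delta(q), A @ q)
print("the matrix A and its NULCA form agree")
```

The dictionary beta plays the role of the family {β_a}: each β_a is one column of A re-indexed so that cell a sits at the origin.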

8.3 Harmonic Analysis of Non-uniform LCAs

Harmonic analysis techniques are generally reserved for the study of structures with the kind of uniformity we've been discussing here: invariance with respect to a group of shift operators. It is the very frequent assumption of this kind of uniformity, in fact, that makes harmonic analysis techniques so widely used. Other kinds of structural constraints on linear systems make different techniques applicable. For example, systems with spherical symmetry can be analyzed using Bessel and Legendre functions (Duff and Naylor [1966]). Beckmann's book on orthogonal polynomials [1973] describes many other assumptions that give rise to analogous techniques. It therefore seems to be mathematically unusual to apply the Fourier transform to systems that are not uniform in the appropriate way. However, the Fourier transform still can be used to study NULCAs; and, although parallel decompositions are not produced as they are in the uniform case, it is still possible to derive results that have interesting consequences for model building.

In what follows we will restrict our attention to NULCAs which have only one-dimensional "cells", i.e. which have state set F(S,K). It seems likely that our results can be extended to the case of n-dimensional cells, but we would like to avoid the notational complexity of such an extension.

Suppose S is a finite abelian group with |S| = N and exponent m. Let K be any field that contains a principal m-th root of unity. Then the group characters of S in K exist and are given by expression (5.2-2), where w is the root of unity. Then we know (Theorem 5.1) that F_S, the discrete Fourier transform w.r.t. S, exists and is invertible.

If M = <S,K,{β_a}> is a NULCA where S and K are as we have

just described, then F_S is an automaton-isomorphism from M to another linear sequential machine which we'll call M̂. Unlike the case of LCAs, however, M̂ is not necessarily a parallel decomposition of M. If A is the linear operator with kernel (i.e. matrix) given by (8.1-2), then the transition function of M̂ is the linear operator F_S A F_S^{-1}: F(S,K) → F(S,K). This operator need not have a diagonal matrix representation (as it does in the uniform case). We will therefore call the NULCA M a transformable NULCA, and the linear sequential machine M̂ the transform of M, since the terms "decomposable" and "parallel decomposition" would be misleading.

Instead of trying to relate the entire structure of a transformable NULCA M and its transform M̂, we will only show that certain reasonable constraints on the structure of M can insure that M̂ is of a special form with useful properties. In the next section, we use this fact to show how non-trivially uniform homomorphic images of NULCAs can be constructed. For a transformable NULCA with one-dimensional "cells" we now show that if all the functions β_a, a ∈ S, have the same β-th Fourier coefficient, then the β-th coefficient of any configuration at time t+1 depends only on the β-th coefficient of the configuration at time t.

In order to prove this result, it is convenient to introduce a standard notation for an inner product on the linear space F(S,K). For f,g ∈ F(S,K), let

<f,g> = Σ_{s∈S} f(s) g(s).

Using this notation, the orthogonality of the group characters of S in K stated in Sec. 5.2 becomes

<ψ_r,ψ_s> = N if r = −s, and <ψ_r,ψ_s> = 0 otherwise.   (8.3-1)

The discrete Fourier transform w.r.t. S (which exists and is invertible) can be written F_S f = g where

g(s) = <f,ψ_s> for all s ∈ S.   (8.3-2)

Theorem 8.1

Let M = <S,K,{β_a}> be a transformable NULCA with transition function δ. If for any β ∈ S there is a c_β ∈ K such that β̂_a(β) = c_β for all a ∈ S (i.e. the β-th Fourier coefficients of all the β_a's are the same), then (δ(q))^(β) = c_β q̂(β) for any q ∈ F(S,K).

Proof:

By the definitions of F_S and F_S^{-1}, and since δ is linear, we have for any q ∈ F(S,K)

(δ(q))^(β) = <δ(q),ψ_β> = (1/N) Σ_{a∈S} [<δ(ψ_{−a}),ψ_β> q̂(a)].   (8.3-3)

Now consider only the inner product <δ(ψ_{−a}),ψ_β>. Let δ^T be such that

<δ(ψ_{−a}),ψ_β> = <ψ_{−a},δ^T(ψ_β)>.   (8.3-4)

From standard finite dimensional linear algebra we know that δ^T exists, and δ^T(ψ_β) is given by

(δ^T(ψ_β))(s) = Σ_{r∈S} β_s(r−s) ψ_β(r) for s ∈ S.

This is true because from (8.1-1) we have

(δ(ψ_β))(s) = Σ_{r∈S} β_{s−r}(r) ψ_β(s−r) = Σ_{r∈S} β_r(s−r) ψ_β(r)

by a change of variable. Then δ^T, the transpose of δ, is obtained by interchanging r and s in β_r(s−r). By another change of variable, we get

(δ^T(ψ_β))(s) = Σ_{r∈S} β_s(r) ψ_β(r+s) = [ Σ_{r∈S} β_s(r) ψ_β(r) ] · ψ_β(s),

since ψ_β is a homomorphism from S into the multiplicative structure of K. Thus, by (8.3-2), we have

(δ^T(ψ_β))(s) = ψ_β(s) <β_s,ψ_β> = ψ_β(s) β̂_s(β).

Returning to (8.3-4) we therefore have that

<δ(ψ_{−a}),ψ_β> = <ψ_{−a},δ^T(ψ_β)> = Σ_{s∈S} ψ_{−a}(s) ψ_β(s) β̂_s(β)

= Σ_{s∈S} ψ_{−a}(s) ψ_β(s) c_β = c_β <ψ_{−a},ψ_β>.

Going back to (8.3-3) we have that

(δ(q))^(β) = (1/N) Σ_{a∈S} [<δ(ψ_{−a}),ψ_β> q̂(a)] = (1/N) Σ_{a∈S} [c_β <ψ_{−a},ψ_β> q̂(a)].

Hence, since the characters are orthogonal in the sense given by expression (8.3-1), this becomes

(δ(q))^(β) = c_β (1/N) N q̂(β) = c_β q̂(β),

which is the desired result.

Q.E.D.

We illustrate this theorem by the following example:

Example 8.1

Consider the NULCA M = <Z_4,C,{β_a}> shown in Fig. 8.2 a). Since the underlying field is C, M is transformable. We've contrived the functions β_a, a = 0,...,3, so that they have the same β-th Fourier coefficients for β = 0,1. Therefore, according to Theorem 8.1, the transform of M is such that components 0 and 1 do not receive input from any components but themselves. This is shown in Fig. 8.2 b). Note that

a) The functions β_a of M (values at r = 0, 1, 2, 3):

  β_0 = (6, 1+5i, 0, 1−5i)
  β_1 = (8, 6i, 0, −6i)
  β_2 = (4, 2+4i, 0, 2−4i)
  β_3 = (12, −2+8i, 0, −2−8i)

b) Their transforms (values at β = 0, 1, 2, 3):

  β̂_0 = (8, −4, 4, 16)
  β̂_1 = (8, −4, 8, 20)
  β̂_2 = (8, −4, 0, 12)
  β̂_3 = (8, −4, 16, 28)

Figure 8.2: A NULCA M and its transform M̂.

Theorem 8.1 says that components 0 and 1 are not influenced by other components; it does not imply that they may not influence other components. Note also that component 0 depends on itself via the Fourier coefficient β̂(0) that is common to all of the β_a's, and that component 1 depends on itself via β̂(1).

8.4 Uniform Homomorphic Images of NULCAs

The automaton-homomorphism Σ_W of Theorem 6.2, which maps states of an LCA M onto the "smoother" states of an LCA M', exists because the parallel composition M̂' consists of a subset of the independent components of M̂. More specifically, however, the map Σ_W is an automaton-homomorphism because those components of M̂ retained for the composition M̂' do not depend on components not retained. For a parallel composition, any subset of components forms a homomorphic automaton, since its components depend only on themselves.

It is clear, then, that a more general situation can be discussed. In particular, if M = <S,K,{β_a}> is a transformable NULCA such that the β_a's share some subset of Fourier coefficients, Theorem 8.1 says that each component of the transform M̂ which corresponds to one of these mutual coefficients does not receive input from any other component. It is therefore possible to form a homomorphic image of M by retaining only those components of M̂ corresponding to the mutual Fourier coefficients. Moreover, since these components do not depend on one another, they form a parallel composition that has a distributed realization with structure group S. The NULCA M, therefore, has a structurally uniform homomorphic image that can be produced by filtering out the aspects of its behavior which result from its

structural non-uniformity. Since any NULCA is automatically uniform with respect to the trivial structure group, we will be concerned with homomorphic images that possess, in some sense, maximal uniformity. Let us, therefore, explicitly define the maximal subset of S on which the spectra of all of the β_a's take on the same values.

Definition 8.2

For a transformable NULCA M = <S,K,{β_a}>, let U be the subset of S such that s ∈ U if and only if β̂_a(s) = β̂_b(s) for all a,b ∈ S.

We have then for all a,b ∈ S that β̂_a|U = β̂_b|U, and hence the following definition makes sense:

Definition 8.3

For M and U as in Def. 8.2, let H_U ∈ F(S,K) be such that Ĥ_U = Proj_U β̂_a, where a ∈ S is arbitrary.

From Theorem 8.1 we can prove the following result:

Theorem 8.2

Let M = <S,K,{β_a}> be a transformable NULCA. Then Σ_U is a homomorphism from M onto a submachine of the LCA M_U = <S,K,H_U>.

Proof:

It will suffice to show that Proj_U is a homomorphism from M̂, the transform of M, to the submachine of the parallel decomposition

M̂_U with state set Proj_U(F(S,K)) (see Def. 6.4). Theorem 8.2 will follow from composing the maps in this diagram:

  M  --F_S-->  M̂  --Proj_U-->  (submachine of M̂_U)  --F_S^{-1}-->  (submachine of M_U)

Let δ̂ and δ̂_U be the transition functions of M̂ and M̂_U respectively. Then we need to show that Proj_U δ̂ = δ̂_U Proj_U. More explicitly, we need for all q̂ ∈ F(S,K) and s ∈ S that

(Proj_U δ̂ q̂)(s) = (δ̂_U Proj_U q̂)(s).

First, assume s ∈ S − U. We know from Theorem 5.4 that

(δ̂_U Proj_U q̂)(s) = Ĥ_U(s)·(Proj_U q̂)(s) = 0·0 = 0.

By the definition of Proj_U we have also that (Proj_U δ̂ q̂)(s) = 0 for s ∈ S − U.

Therefore suppose s ∈ U. This means that β̂_a(s) = Ĥ_U(s) for all a ∈ S. If δ denotes the transition function of M, since M ≅ M̂ via F_S we know that F_S δ = δ̂ F_S, or

F_S δ F_S^{-1} = δ̂.

Thus we can apply Theorem 8.1 and get the following: for s ∈ U

(Proj_U δ̂ q̂)(s) = (δ̂ q̂)(s) = (F_S δ F_S^{-1} q̂)(s) = Ĥ_U(s)·q̂(s).

But δ̂_U is such that for s ∈ U we have

(δ̂_U Proj_U q̂)(s) = Ĥ_U(s)·(Proj_U q̂)(s) = Ĥ_U(s)·q̂(s).

Since Proj_U is clearly onto the set Proj_U(F(S,K)), we have that Proj_U is an automaton-homomorphism from M̂ to the submachine of M̂_U given in Def. 6.4. We conclude that Σ_U = F_S^{-1} Proj_U F_S is a homomorphism from M to the submachine of M_U having state set F_S^{-1} Proj_U(F(S,K)).

Q.E.D.

This theorem is illustrated in Fig. 8.3, where we have represented the LCA M_U and its decomposition M̂_U that correspond to the NULCA of Ex. 8.1 and Fig. 8.2. Since M_U is uniformly interconnected, we show H_U for cell 0 only. Compare the parallel composition M̂_U to the transform M̂ of M shown in Fig. 8.2 b). One can see that at any time t ∈ Z⁺ the state of component 0 in M̂_U will be exactly the same as the state of component 0 in M̂. Similarly, components 1 will agree for all time t ∈ Z⁺.

The map Σ_U = F_S^{-1} Proj_U F_S of Theorem 8.2 is exactly the same as the map Σ_W for W = U given in Theorem 6.2. Unlike the uniform case, however, not every subset of S will produce a homomorphism. It should be clear, though, that for any subset W of S that is also a subset of U, the map Σ_W is a homomorphism from the NULCA M to an LCA M'.

Figure 8.3: The LCA M_U is a structurally uniform homomorphic image of the NULCA M shown in Fig. 8.2 a). (Its weighting function is H_U = (1, 2+i, 3, 2−i), with transform Ĥ_U = (8, −4, 0, 0).)
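The claim is easy to check on the data of Ex. 8.1. A minimal sketch follows (Python with NumPy); the character convention ψ_s(r) = i^{sr} is an assumption, but it reproduces the tabulated transforms in Fig. 8.2 exactly:

```python
import numpy as np

# Theorem 8.2 on the data of Ex. 8.1: the Fourier coefficients of a state
# at the shared frequencies U = {0,1} evolve exactly as in the uniform
# LCA M_U, no matter how the other coefficients behave.
N = 4
F = (1j) ** np.outer(np.arange(N), np.arange(N))      # DFT matrix, w = i
beta = {0: np.array([6, 1+5j, 0, 1-5j]),
        1: np.array([8, 6j, 0, -6j]),
        2: np.array([4, 2+4j, 0, 2-4j]),
        3: np.array([12, -2+8j, 0, -2-8j])}
# shared coefficients: beta_a^(0) = 8 and beta_a^(1) = -4 for every a

def delta(q):                                         # NULCA transition (8.1-1)
    return np.array([sum(beta[(s - r) % N][r] * q[(s - r) % N] for r in range(N))
                     for s in range(N)])

rng = np.random.default_rng(3)
q = rng.standard_normal(N) + 1j * rng.standard_normal(N)
for s, c in [(0, 8), (1, -4)]:
    assert np.isclose((F @ delta(q))[s], c * (F @ q)[s])
print("components 0 and 1 of the transform depend only on themselves")
```

Components 2 and 3 of the transform mix under δ, but they never feed back into components 0 and 1, which is why Proj_U is a homomorphism.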

We make this explicit in the following corollary:

Corollary 8.1

Let M = <S,K,{β_a}> be a transformable NULCA. Then for any W ⊆ U (where U is given by Def. 8.2), Σ_W is a homomorphism from M to the submachine M'_W of the LCA M' = <S,K,H'> where H' = Σ_W H_U.

Proof:

The LCA M_U given by Theorem 8.2 satisfies the hypotheses of Theorem 6.2. We therefore have

  M  --- Σ_U --->  M_U  --- Σ_W --->  M'_W.

Thus, Σ_W Σ_U is the required homomorphism:

Σ_W Σ_U = F_S^{-1} Proj_W F_S F_S^{-1} Proj_U F_S = F_S^{-1} Proj_W Proj_U F_S = F_S^{-1} Proj_W F_S = Σ_W,

since W ⊆ U.

Q.E.D.

This corollary is the generalization to NULCAs of Theorem 6.2. Notice that if the NULCA M happens also to be an LCA, then U = S, since all the β_a's are the same function, namely they are all equal to the impulse response function H of the LCA. Therefore, the requirement that W ⊆ U reduces to the requirement that W ⊆ S, and Corollary 8.1 reduces to Theorem 6.2. In a similar manner, all the results proved in Chapter 7 for LCAs can be generalized to the case

of NULCAs. We make these generalizations explicit in the next section.

8.5 Homomorphisms of NULCAs produced by Filtering-Sampling and Aggregation

Let M = <S,K,{β_a}> be a transformable NULCA, and let M_U = <S,K,H_U> be the LCA of Theorem 8.2. That is, for U given by Def. 8.2, Ĥ_U = Proj_U β̂_a for any a ∈ S. Let S and D be finite abelian groups with canonical decompositions as in (7.3-1) and (7.3-2) respectively. Let the maps p: D → S and η: S → D be as defined by (7.3-3) and (7.3-4), and let K_η be the kernel of η with |K_η| = k. Def. 7.3 can be applied to the LCA M_U to define the sampled version M_UR of M_U. More specifically:

Definition 8.4

For M_U = <S,K,H_U>, the LCA M_UR is given by <D,K,k Res_{p(D)} H_U>.

Recalling that a subset W of S is compatible with η whenever |W ∩ η^{-1}(a)| = 1 (Def. 7.4), we can prove a generalization of Theorem 7.5:

Theorem 8.3

Let W be any subset of U that is compatible with η. Then Res_{p(D)} Σ_W is a homomorphism from the NULCA M onto the LCA M_UR.

Proof:

Theorem 8.2 says that

  M  --- Σ_U --->  M_U,

and Corollary 7.3 applies to M_U, so that we also have

  M_U  --- Res_{p(D)} --->  M_UR.

Thus, we have the following commutative diagram:

  M  --- Σ_U --->  M_U
    \                |
     Res_{p(D)} Σ_W  |  Res_{p(D)}
       \             ↓
        ---------->  M_UR

Q.E.D.

The above result is a generalization of Theorem 7.5, since the smoothed and sampled LCA M' = <D,K,k Res_{p(D)} Σ_W H> of that theorem is, for the case considered here, the same as M_UR = <D,K,k Res_{p(D)} H_U>. The reason for this is that Σ_W H_U = H_U, due to the fact that H_U is already a smooth function by virtue of its definition: Ĥ_U = Proj_U β̂_a. In other words, the "filtering" part of Theorem 7.5 has already been accomplished in passing from the NULCA M to the LCA M_U. The next example illustrates Theorem 8.3 and its proof:

Example 8.2

We will again consider the NULCA M = <Z_4,C,{β_a}> shown in Fig. 8.2 a). For this system, the set U is {0,1}. Suppose D = Z_2, and the maps η and p are as follows:

  η: Z_4 → Z_2:  0 ↦ 0,  1 ↦ 1,  2 ↦ 0,  3 ↦ 1
  p: Z_2 → Z_4:  0 ↦ 0,  1 ↦ 2

Figure 8.4: Illustration of Theorem 8.3 for S = Z_4, D = Z_2, and W = U = {0,1}. The maps shown are between state sets. Compare this figure with Fig. 7.2.

Therefore U is compatible with η, so that Theorem 8.3 holds for W = U = {0,1}. Fig. 8.4 shows the systems M, M_U, M_UR and their decompositions. The reader should compare Fig. 8.4 with Fig. 7.2, which illustrates the analogous facts for LCAs. The only difference between these figures is that in Fig. 8.4 M is not uniformly interconnected and its transform M̂ is not a parallel composition. The crucial fact provided by Theorem 8.1 is that Proj_U in Fig. 8.4 is still an automaton-homomorphism.

We should point out that the homomorphism Res_{p(D)} Σ_W of Theorem 8.3 is exactly the same map that we discussed in Sec. 7.4 as a homomorphism between two LCAs. Thus, even for the non-uniform case, it can be interpreted as filtering (Σ_W) followed by sampling (Res_{p(D)}), and can intuitively be thought of as a means of observing M using "probes" with overlapping receptive fields. For the case of NULCAs, the band of frequencies filtered out by Σ_W is wide enough so that any effects on behavior produced by structural non-uniformities are eliminated. The resulting uniform system will give exact information about the remaining frequencies.

We now turn to the results summarized in Theorem 7.6, which permit homomorphic images of LCAs to be formed by means of spatial aggregation. We can also extend this simplification technique to the case of NULCAs. Consistent with Def. 7.5 we define an aggregated version of M:

Definition 8.5

For M_U = <S,K,H_U>, the LCA M_A is given by <D,K,k Agg_η H_U>.

Theorem 7.6 can be applied directly to the LCA M_U, so that we can prove the following:

Theorem 8.4

Let M = <S,K,{β_a}> be a NULCA and let W be any subset of U. Then Agg_η Σ_W is a homomorphism from M onto a submachine of the LCA M_A.

Proof:

Theorem 8.2 says that

  M  --- Σ_U --->  M_U,

and Theorem 7.6 applies to M_U, giving

  M_U  --- Agg_η --->  M_A.

Then the submachine of M_A with state set Agg_η Σ_W(F(S,K)) is a homomorphic image of M_U. Call this submachine SubM_A. Then we have the following commutative diagram:

  M  --- Σ_U --->  M_U
    \              |
     Agg_η Σ_W     |  Agg_η
       \           ↓
        -------->  SubM_A

Q.E.D.

We illustrate this theorem with a non-uniform analog of the situation presented by Ex. 7.4. There we showed that spatial aggregation could be used to reduce the dimension of an LCA's structure group. Here we show that the same technique can be used under less restrictive conditions.

Example 8.3 Consider the NULCA M = <Z2 × Z2, R, {B_a}> shown in Fig. 8.5. Since R contains the principal 2nd root of unity, -1, M is transformable. In fact, the appropriate transform is an example of the finite Walsh transform whose matrix is shown in Ex. 5.4. Suppose n: Z2 × Z2 → Z2 is projection onto the first coordinate, so that (0,0) and (0,1) map to 0 while (1,0) and (1,1) map to 1. Then the kernel K of n is the subgroup {(0,0),(0,1)} of Z2 × Z2.

We've specified the functions B_a, a ∈ Z2 × Z2, such that the following holds: for any coset A of K, each cell in A is such that the sum of its outputs which go to cells also in A is a constant x ∈ R. The connections between cells in different cosets are uniform. Thus, the set U corresponding to this NULCA is {(0,0),(1,0)}, which is compatible with the group-homomorphism n. We can therefore apply Theorem 8.4 for W = U. Fig. 8.5 shows M, its uniformly interconnected homomorphic image M_U, and the aggregated version M_A.

We'd like to make explicit several observations about the simplification process illustrated in Fig. 8.5. First, notice that in going from M_U to M_A we've aggregated over the cosets of n. However, this kind of aggregation can also be viewed as smoothing followed by

[Figure 8.5: The NULCA M of Ex. 8.3, its uniformly interconnected homomorphic image M_U, and the aggregated version M_A.]

sampling. If X_U is the characteristic function of U, then its inverse transform X̌_U is equal to 0 on all of Z2 × Z2 except for values of 1/2 at (0,0) and (0,1). Convolving (w.r.t. Z2 × Z2) with X̌_U and then taking one sample from each block induced by n produces the same result as the map Agg_n.

A second observation is that although the simplification process of Fig. 8.5 can easily be justified without any mention of the DFT, the case shown, involving equal sums of influences within the blocks, is a special case of our more general method. That the response functions sum to the same value within a block is the same as saying that their e-th Fourier coefficients are the same, where e is the identity of the underlying structure group. As we mentioned earlier, this is the case for any group, since the character ψ_e is always a constant function. Here, however, we've shown that this notion of sum can be generalized to a subset of spatial frequency components.

Finally, we would like to point out that the LCA M_A of Fig. 8.5 is of the same form as the LCA discussed in Ex. 5.7 as a model of an aspect of morphogenesis. Following the analysis given for that example, we can conclude that when 2x-(y+z) > 1, any initial inhomogeneity in the state of M_A will resonate, and thus cause the difference between the states of cell 0 and cell 1 to increase without bound. Therefore, we can conclude that the non-uniform system M behaves in the same way when 2x-(y+z) > 1: the difference between the sums of cell states over the two blocks will increase without bound.
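The first of these observations can be checked by direct computation. In the sketch below the Walsh matrix is taken as the Kronecker square of the 2 × 2 pattern of Ex. 5.4, and the ordering of the cells of Z2 × Z2, like the convolution routine, is our own convention.

    import numpy as np

    # Cells of Z2 x Z2 ordered (0,0), (0,1), (1,0), (1,1).
    cells = [(0, 0), (0, 1), (1, 0), (1, 1)]
    W2 = np.array([[1, 1], [1, -1]])
    W = np.kron(W2, W2)                    # 4 x 4 Walsh matrix

    X_U = np.array([1, 0, 1, 0])           # characteristic function of
                                           # U = {(0,0), (1,0)}
    x_check = W @ X_U / 4                  # inverse Walsh transform of X_U
    assert np.allclose(x_check, [0.5, 0.5, 0, 0])  # 1/2 at (0,0) and (0,1)

    def conv2(x, h):
        # convolution with respect to the group Z2 x Z2
        y = np.zeros(4)
        for a, (i, j) in enumerate(cells):
            for b, (s, t) in enumerate(cells):
                y[a] += x[b] * h[cells.index(((i - s) % 2, (j - t) % 2))]
        return y

    x = np.random.default_rng(2).normal(size=4)
    smoothed = conv2(x, x_check)           # smoothing by U
    # taking one sample per block induced by n yields the block averages,
    # i.e. the map Agg_n under the average convention:
    assert np.allclose(smoothed[[0, 2]],
                       [(x[0] + x[1]) / 2, (x[2] + x[3]) / 2])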

8.6 Discussion

The results described in this chapter go part way toward justifying the intuition that small structural non-uniformities do not affect large-scale spatial features of a system's behavior. If we postulate that such a system can be validly modeled by a NULCA M = <S,K,{B_a}> (the base model), then the presence of small non-uniformities might reasonably be represented by requiring the B_a's to differ only in their higher spatial frequency components. That is, the set U would consist of only the lower frequencies. Theorem 8.2 would then say that if one is interested in aspects of the system's behavior having relatively large spatial scale (i.e., involving low spatial frequencies), then it is not misleading to assume that the components of the system are identical and uniformly interconnected (the LCA M_U).

The difficulty with this analysis, of course, is in postulating the NULCA M as a base model, and it turns out that this may involve some non-trivial assumptions about the structure of the "real" system. To make this point clear we will briefly discuss the dual situation in which NULCAs are characterized by weighting functions instead of impulse response functions. In Sec. 8.2 we pointed out that the reflection B̄_a of B_a, a ∈ S, cannot be interpreted as the weighting function of component a. Suppose, however, that we give the name C_a to the weighting function of component a. It seems reasonable that small spatial inhomogeneities can just as well be modeled by assuming that the C_a's are the same except for their higher spatial frequency components. In this case, however, the analog of Theorem 8.2 says that a homomorphic image can be formed by retaining the higher frequen-

cies, and that this homomorphic image is not necessarily uniformly structured.* An interesting question, therefore, is to ask under what conditions the smoothness of the weighting functions (the C_a's) implies the smoothness of the impulse response functions (the B_a's).** It seems plausible that very reasonable conditions can be given, but we have not pursued this question.

Another aspect of the results of this chapter arises if we ask how an LCA with time-varying impulse response functions might be modeled. If the temporal changes of the impulse response functions are confined to certain spatial frequencies, then it is possible to filter out these frequencies and obtain a uniformly interconnected and time-invariant homomorphic model. To see why this is true, note that at any time t, the time-varying LCA can be regarded as a NULCA for which the set U is the set of spatial frequencies which do not change in time. At time t+1, the system is a different NULCA, but the set U is the same. It is not hard to convince oneself that the single LCA M_U is a homomorphic image of each of these different NULCAs. Therefore, M_U is a homomorphic image of the original time-varying system. This is true even when the spatial impulse response function of a cell varies with time independently of changes in the impulse response functions

*More specifically, if in Fig. 8.2 the B_a's are replaced by C_a's, one obtains the dual of the digraph shown for M in Fig. 8.2 b). That is, the arrows are reversed. Thus, components 0 and 1 do not influence other components but may be influenced by them.

**This is not automatically true. If it were, components 0 and 1 of M shown in Fig. 8.2 b) would only influence themselves.

of other cells. It may thus be possible to view certain kinds of nonlinearities of a cell as a time variance of its spatial impulse response function. In such cases, a cellular automaton composed of such cells can be modeled by an LCA. Further investigations are needed to determine whether non-trivial nonlinearities can be modeled by non-trivial LCAs.
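A minimal sketch of this argument, under the same assumed broadcast and DFT conventions as before: at every step the high-frequency components of each cell's impulse response are redrawn, so the system is a different NULCA at each t, yet the single time-invariant LCA M_U built from the unchanging frequencies in U tracks the filtered trajectory exactly.

    import numpy as np

    n, U = 4, [0, 1]
    rng = np.random.default_rng(3)
    H_hat = np.zeros(n, dtype=complex)
    H_hat[U] = rng.normal(size=2)          # frequencies that never change

    def step_time_varying(x, t):
        # a different NULCA at every t, but with the same set U
        rng_t = np.random.default_rng(100 + t)
        y = np.zeros(n, dtype=complex)
        for b in range(n):
            Bb_hat = H_hat.copy()
            Bb_hat[2:] = rng_t.normal(size=2)  # time- and cell-varying part
            y += x[b] * np.roll(np.fft.ifft(Bb_hat), b)
        return y

    def smooth(x):                         # S_U
        X = np.zeros(n, dtype=complex)
        X[U] = np.fft.fft(x)[U]
        return np.fft.ifft(X)

    def step_lca(z):                       # the single time-invariant M_U
        return np.fft.ifft(np.fft.fft(z) * H_hat)

    x = rng.normal(size=n).astype(complex)
    z = smooth(x)
    for t in range(5):
        x = step_time_varying(x, t)
        z = step_lca(z)
        assert np.allclose(smooth(x), z)   # M_U is an image at every step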

Chapter 9

Conclusion

Throughout the preceding chapters we have tried to bring together parts of traditionally rather separate areas: the theory of cellular automata, techniques from digital signal processing, and rudiments of the theory of partial differential equations and their numerical approximations. At the same time, we have tried to maintain a self-contained and rigorous level of description without obscuring what we feel are the basic facts about the class of systems involved. We have therefore not attempted to ground the ideas in the most general framework and have found it necessary to touch only very lightly on substantial bodies of existing and, in some cases, well known theory. We would like to close with a discussion of some of the issues that are likely to arise in more general settings and some of the results that can be obtained.

Since the entire discussion has been restricted to the case of autonomous cellular automata, we first briefly describe what can be expected when provisions are made for input and output. If an LCA has input and output variables such that, as a whole, it is a non-autonomous LSM (Sec. 2.5), then one can extend the notion of structural uniformity so that it is still possible to form parallel decompositions. If the input and output connections are such that, from an I/O point of view, the LCA can be characterized as a linear operator with automorphism group S × T (where S is its structure group and T is time), then the LCA is isomorphic to a set of completely independent LSMs, each having input and output. Since the input, output, and state spaces

of these LSMs are the same as the corresponding spaces of a cell of the LCA, the results regarding minimality, controllability, and observability of the entire cellular automaton reduce to the corresponding questions for LSMs over these smaller spaces. For example, if M = <S,K,H> is a decomposable LCA which also has input space F(S,K) and output space F(S,K), then given the appropriate uniformity of input/output connections, M is controllable if and only if it is 1-controllable (see Harrison [1969]). This means that if such a non-autonomous LCA is strongly connected, then by choosing the correct input configuration, any state can be taken to any other state in one time step. If, on the other hand, M has cell state set K^2 and input and output spaces F(S,K), then it takes at most two time steps to go from any state to any other.

We indicated in Sec. 3.6 and Sec. 5.6 that the framework used in this presentation can be generalized to include structures with uncountable structure groups. This direction of generalization will permit cellular automata and partial differential equations to be considered particular members of the same class of models. Using concepts from a systems theoretic view of the modeling and simulation process, it will be possible to explore relationships between models having a countable number of dimensions and those having an uncountable number. It is felt that this approach can lead to questions and results which have a character different from those motivated by the desire to solve uncountably dimensional problems by means of finite dimensional systems. In particular, it is hoped that such studies will increase our repertoire of formalisms for modeling both natural systems and computational processes.

REFERENCES

1. Aho, A. V.; Hopcroft, J. E.; Ullman, J. D., The Design and Analysis of Computer Algorithms, Addison-Wesley, Reading, Mass., 1974.

2. Aladyev, V., "T-grammars and their Languages," Izv. AN ESSR, Biol., 22, No. 4, 1973.

3. _____, "Survey of Research in the Theory of Homogeneous Structures and their Applications," Mathematical Biosciences, 22, 1974.

4. Amoroso, S.; Cooper, G., "Tessellation Structures for Reproduction of Arbitrary Patterns," Journal of Computer and System Sciences, 5, 5, 1971.

5. Andrews, H. C., Computer Techniques in Image Processing, Academic Press, New York, 1970.

6. Ashby, W. R., An Introduction to Cybernetics, John Wiley & Sons, New York, 1963.

7. Beckmann, P., Orthogonal Polynomials for Engineers and Physicists, The Golem Press, Boulder, Colorado, 1973.

8. Bergland, G. D., "A Guided Tour of the Fast Fourier Transform," IEEE Spectrum, July 1969.

9. Boerner, H., Representation of Groups, North-Holland, Amsterdam, 1972.

10. Bracewell, R., The Fourier Transform and its Applications, McGraw-Hill, 1965.

11. Budden, F. J., The Fascination of Groups, Cambridge University Press, 1972.

12. Burks, A. W. (Ed.), Essays on Cellular Automata, University of Illinois Press, 1970.

13. _____, "Models of Deterministic Systems," Math. Sys. Theor., 8, 4, 1973.

14. Cairns, T. W., "On the Fast Fourier Transform on Finite Abelian Groups," IEEE Transactions on Computers, May 1971.

15. Chen, C. T., Introduction to Linear Systems Theory, Holt, Rinehart, and Winston, 1970.

16. Codd, E. F., Cellular Automata, Academic Press, New York, 1968.

17. Cooley, J. W.; Lewis, P. A. W.; Welch, P. D., "Application of the Fast Fourier Transform to Computation of Fourier Integrals, Fourier Series, and Convolution Integrals," IEEE Trans. Audio Electroacoustics, AU-15, 2, 1967.

18. _____, "The Finite Fourier Transform," IEEE Trans. on Audio and Electroacoustics, AU-17, 2, 1969.

19. Curtis, C. W.; Reiner, I., Representation Theory of Finite Groups and Associative Algebras, Interscience Publishers, 1962.

20. de Boor, C., "The Method of Projections as Applied to the Numerical Solution of Two Point Boundary Value Problems Using Cubic Splines," Ph.D. Thesis, University of Michigan, 1966.

21. Duff, G. F. D.; Naylor, D., Differential Equations of Applied Mathematics, John Wiley & Sons, New York, 1966.

22. Fleck, A. C., "Isomorphism Groups of Automata," J. Assoc. Comput. Mach., 9, 1962.

23. Foy, J., "A Computer Simulation of Impulse Conduction in Cardiac Muscle," Ph.D. Thesis, Computer & Communication Sciences, The University of Michigan, 1974.

24. Gardner, M., "The Fantastic Combinations of John Conway's New Solitaire Game 'Life'," Scientific American, Oct. 1970.

25. Gary, J., "The Numerical Solution of Partial Differential Equations," Manuscript, 1972.

26. Gold, B.; Rader, C. M., Digital Processing of Signals, McGraw-Hill, New York, 1969.

27. Goodman, E.; Weinberg, R.; Laing, R., "A Cell Space Embedding of Simulated Living Cells," Bio-Medical Computing, 2, 1971.

28. Grossman, I.; Magnus, W., Groups and their Graphs, Random House, 1964.

29. Harmuth, H. F., Transmission of Information by Orthogonal Functions, Springer-Verlag, New York, 1972.

30. Harrison, M. A., Lectures on Linear Sequential Machines, Academic Press, 1969.

31. Hartmanis, J.; Stearns, R. E., Algebraic Structure Theory of Sequential Machines, Prentice-Hall, Englewood Cliffs, N.J., 1966.

32. Hewitt, E.; Ross, K. A., Abstract Harmonic Analysis, Springer-Verlag, Berlin, 1963.

33. Holland, J., "A Universal Computer Capable of Executing an Arbitrary Number of Sub-Programs Simultaneously," Proceedings of the Eastern Joint Computer Conference, 1959.

34. _____, "Iterative Circuit Computers," Proc. of the 1960 Western Joint Computer Conf., 1960.

35. IEEE Trans. Audio Electroacoustics, Special issue on the Fast Fourier Transform and its application to digital filtering and spectral analysis, AU-15, 2, 1967.

36. _____, Special issue on the Fast Fourier Transform, AU-17, 2, 1969.

37. Jump, J. R., "A Note on the Iterative Decomposition of Finite Automata," Information and Control, 15, 1969.

38. Kalman, R. E.; Falb, P. L.; Arbib, M. A., Topics in Mathematical Systems Theory, McGraw-Hill, 1969.

39. Knuth, D. E., Seminumerical Algorithms, Addison-Wesley, Reading, Mass., 1969.

40. Lang, S., Algebraic Structures, Addison-Wesley, Reading, Mass., 1967.

41. Minnick, R. C., "A Survey of Microcellular Research," J. of the Assoc. for Comput. Mach., 14, 2, 1967.

42. Mortimer, J. A., "A Cellular Model for Mammalian Cerebellar Cortex," University of Michigan Technical Report, LCG-107, 1970.

43. Nicholson, P. J., "Algebraic Theory of Finite Fourier Transforms," J. of Computer and System Sciences, 5, 1971.

44. Ostrand, T. J., "Pattern Reproduction in Tessellation Automata of Arbitrary Dimension," J. of Computer and System Sciences, 5, 6, 1971.

45. Roach, G. F., Green's Functions: Introductory Theory with Applications, Van Nostrand Reinhold Company, London, 1970.

46. Rosen, R., Dynamical System Theory in Biology, Vol. 1, Wiley-Interscience, 1970.

47. _____, "Biological Systems as Organizational Paradigms," Int. J. of General Systems, 1, 3, 1974.

48. Rotman, J. J., The Theory of Groups, Allyn and Bacon, Boston, 1965.

49. Rudin, W., Fourier Analysis on Groups, Wiley-Interscience, New York, 1962.

50. Schechter, M., Principles of Functional Analysis, Academic Press, New York, 1971.

51. Schwarz, R. J.; Friedland, B., Linear Systems, McGraw-Hill, New York, 1965.

52. Stroke, G. W., An Introduction to Coherent Optics and Holography, Academic Press, New York, 1969.

53. Thatcher, J., "Universality in the von Neumann Cellular Model," University of Michigan Technical Report, LCG-42, Sept. 1964.

54. von Neumann, J.; Burks, A. W. (Ed.), Theory of Self-Reproducing Automata, University of Illinois Press, Urbana, 1966.

55. Wagner, E. G., "On Connecting Modules Together Uniformly to Form a Modular Computer," IEEE Trans. on Electronic Computers, EC-15, 6, 1966.

56. Waltz, F., "Some Properties of Circulants," University of Michigan, Dept. of Electrical Engineering, SEL Technical Report No. 4, 1966.

57. Yamada, H.; Amoroso, S., "Tessellation Automata," Information and Control, 14, 3, 1969.

58. _____, "Structural and Behavioral Equivalences of Tessellation Automata," Information and Control, 18, 1, 1971.

59. Zeigler, B. P., "Toward a Formal Theory of Modelling and Simulation, I," or "On the Formulation of Problems in Modeling and Simulation within Mathematical Systems Theory," Proceedings of the Sixth International Congress on Cybernetics, Namur, Belgium, 1970.

60. _____, "Toward a Formal Theory of Modelling and Simulation, II," University of Michigan Technical Report, LCG-120, July 1971.

61. _____, "The Base Model Concept," Theory and Application of Variable Structure Systems, Mohler, R. R.; Ruberti, A. (Eds.), Academic Press, New York, 1972.

62. _____, "A Conceptual Basis for Modelling and Simulation," Int. J. of General Systems, 1, 4, 1974.