THE UNIVERSITY OF MICHIGAN SYSTEMS ENGINEERING LABORATORY Department of Electrical Engineering College of Engineering Technical Report SEL-66-3 INITIATION TIMING IN A MODEL FOR PARALLEL COMPUTATION by Raymond Reiter March, 1966 This research was supported by United States Air Force contract AF 30(602)-3546.

Abstract This paper deals with a particular case of a model for parallel computation as formulated by Karp and Miller. A parallel computation is viewed as a directed graph in which a node represents a sequence of operations to be performed upon data on the node input branches with the results of an operation being placed upon the output branches. An operation associated with a node n may initiate only if there is at least one data item on each input branch to n. Upon initiation, n removes one data item from each input branch and upon termination, places one data item on each output branch. For such a computation graph G necessary and sufficient conditions that a set of real numbers (tri r=O,l,..,, and n. is a node of G) represent a sequence of ini1 tiation times for the nodes n. of G are given. A periodic set (t = t. + ry7 is given so that G computes periodically, and the minimum period it is determined in terms of the graph parameters. A maximal computation rate periodic schedule is also given for the case that G is required to compute synchronously, i.e. at integer times. Finally, in the case of a synchronous computation graph G, an analysis is given of the so-called free running execution of G and this is found to yield the maximum computation rate of Go iii

TABLE OF CONTENTS Section Page 1. Introduction. o...... o 1 2Ao A Periodic Schedule......... o. o..... 4 B. Extension to Weakly Connected Directed Graphs. 0..0 17 Co A Dual Assignment........ 19 Do Computational Algorithm... o. o. o o 21 3. Synchronous Computation Graphs.............. 24 4. A Maximal Rate Periodic Schedule for Synchronous Computation Graphs..............., o 47 5. Conclusion... o.. o.......... o. 52 References.. o. o o. o..... 53 v

1o Introduction The concept of a computation effected in a parallel fashion entails the notion of a number of processing units with interconnected data channels, with each unit capable of performing some function, provided only that the necessary input data is available to ito A model characterizing such a system has been developed by Karp and Miller of IBMo Briefly this model is as followso The computation is represented as a directed graph in which a node n. denotes an operation to be performed upon data which lie upon the input branches to nI. The results of the operation by node n, are placed _ 1 upon the output branches of n,. A branch, therefore, represents a data queue. With each branch boo directed from node n. to node no is associ~ J 1 j ated a quadruple (A ij Uij' Wi,' T. ), the elements of which are interpreted as Aoo - the initial number of data items on branch b.. 3J 1i U.i - the number of data items placed on b.i upon the termination of the operation associated with n. o Wij - the number of data items removed from bij under a first-in firstout queue disciplinee upon the initiation of the operation associated with node n. To. - a threshold. In order to initiate the operation associated with 1j node nj, the queue length on b i must be greater than, or equal to Tij for all input branches b.o to node no. 1j 1j J Let G be a computation graph with node set (n, ], For every branch b.i let To =U,. = Wo = 1 and let every directed loop of G contain at 1

2 least one initial data item. It is known (See [3.] or [5].) that such a graph represents a nonterminating computation. We seek an execution of G which is periodic in the sense that if a node nj first initiates at time tj. it will initiate thereafter at times t. + y, t. + 2y,.., where 7, 0 J the period, is the same for all nodes of G. Clearly, if the computation is to be controlled by a clock signal, such an execution is desirable. We shall show in Section 2 that such an execution exists for all such computation graphs. Moreover, it is possible to define a parameter rC for G such that the above periodic schedule is possible with y = TC and under which G computes at the maximum possible rate. In Sections 3 and 4 we consider various problems which arise when the frequency of the clock signal controlling the initiation of the nodes of G is a priori specified. It then turns out that the maximum computation rate periodic schedule of Section 2 is not applicable when i is not an integer. This leads us in Section 3 to define the notion of a synchronous computation graph computing under the so-called free running execution, Under this execution, a node initiates at integer times if and only if each input branch contains at least one data word. The principal result of Section 3 is that a synchronous computation graph under the free running execution computes at the maximum possible rate and that this rate is 1/r. In Section 4 we provide a periodic execution for synchronous computation graphs Go This execution has the following form: Let t = \/a where x and a are integers, and let n. be a node of G. Then there are integers t < t < < t + such that n Then there are integers t < t < < t < t + % such that n 1 1 1 1 1 initiates only at times

t.+, t.+,..., t +, 0 1 1 ti+.~ ti+k... t i.' +k ti+2%, t +2X,..., t ~+2k, Clearly, under this execution G computes at the maximum rate 1/t.

2,A. A Periodic Schedule Throughout this paper, unless stated otherwise, we shall be concerned with computation graphs G such that for each branch b.i= (ni,nj) we have T.. W.. U =. 1 and such that every directed loop of G contains at ij 1J 1i least one initial data item. Thus each node of G is eligible for initiation if and only if there is at least one data word on each of its input branches. With each node n. of G, associate a positive real number TV, 3J the execution time of n.. T. is to be interpreted as follows: if n. ini-------— 3 J 3J tiates at time t, then at time t, one data word is removed from each of the input branches to n., and at time t + T., one data word is placed upon each of the output branches of n.. We say that a directed graph is in-point connected if there exists a node n such that if n. n, then there is a directed path from n. to n. The node n is called an in-point. Let G be an in-point connected computation graph with in-point n1. If there is a branch from n. to nj, write n. -> n., and let A. be the 1J_' 1 J initial number of data words on this branch. By r(n.) for ni n1, we shall mean a path (which exists by definition) n. -n. n. - n.. - n. ik ik-1 11 where n. = n, and n. n. 11 1 1 Write kA. i. as A ir+l 1 r r=l Tt(ni) and ik r=i2 (ni) 4

5 We shall assume that G contains at least one directed loop. Let L be a loop of G, n n. -> n - o n -> n. il 12 1k'k+l where n. = n. k+1 11 ik Zor r=il Define = L k Air9 r=l' ir 9 r+l and let t = max tL } loops L Let y > An and consider the following assignment of real numbers t. to the nodes n. of G. t1 - 0 t =0 (1) t min m A- r] for n. # n, 1i it(ni) (ni) We shall prove (Corollary 3) that an execution exists for G such that for all nodes nj of G. n. initiates only at times tj, t. + 7y t. + 2y7 In [1], Cuninghame-Green considers a similar problem in which Ai. = 1 for all branches (ninj) but for which Ti is actually a branch (rather than node) parameter a... He defines a quantity X = max (Z a/} 1JL L L where 2 is the length of the loop L. Under the identification

6 X - a. = jAj - T., our problem becomes formally identical with his. Ciuinghame.-Green poses the problem of determining a set of initial starting times for the nodes of G such that a periodic execution with period rt is possible and he solves this for the special case that G is a loop. In this section we show that the assignment (1) with 7 = t is precisely such a set of initial starting times (Corollary 3). Lemma 1 If n. ->nj, then ti + T< tj + 7 Ai. l i - J ij PROOF: (1) Suppose n. = n.. Then n. -> n. is a loop of G whence 1 3 1 1 1 A< i < <n i,i i.e. t + T < t + y A. 1 i - i 1,1 (2) Suppose n. = n. Then n. -* n. is a path tc(n.) and we have immedij 11 3 ately t < 7 A. - T. = t. + 7 Ai. - - J 1,j i (3) Suppose ni = no. Let rt(nj) be a path for which t. = 7 Z A- E T t(nj) i(nj)

7 Then t = YA + y A Ai - Aj + T1 r(nj) C(nj) =y A - T - Y Ai,j +I 1 where we write L as the loop nl ->t(nj), and ~ A as the sum around L 3 J L of the initial branch weights of L, T as the sum of the execution L times of the nodes of L. But Lh < t < Y Z A L i.eo y 7 A - Z T > 0. L L Hence t. > - - A t + - 7.. j - 1,ij 1 jY A (4) Suppose ni, nj, ni n nl n. ~ nl. Let r(nj) be a path for which tj = y A-., c(nj) J(nj) If t(nj) does not contain ni, then it(n.):ni -~(nj), is a path from n. to nl whence

8 rc(ni) Tc(ni) 7 j + A'r 1,J 1 (n j -A(ni) y A, + t.. Otherwise t(n.) contains n.. Then j(nj) has the form n. n. - -. n. -... - n. 1s Isl s- 11 where n n, n = i, and n = nl Moreover n, -n. Then 3 1 1 1 r i S r r t. = 7 A - a a c(nj) T(nj) A.. Y A- T - T+T i Y A. t(nj) T(nj) 7ZA-2AT+7Z A - T + -7A. L L L L 1(T ) tn(n)n where we write L as the loop n. -4 n. - n. < n. i i i 1 r s s-1 r and t(n.) is the path from n. (=n.) to n 1 1 r n. -,n. - oe. nlo 1 1 r r-1

9 But 7 c A- Z T>ti i (n.) t(n) and -Lh < t < F A - L i.eo y A - T > O L L Hence t. >t. + T. - A. j - i i 1 Define x(n,t) = 0 if and only if node n initiates for the first time at some time t' > t. For k=l,2,..., define x(nt) = k if and only if node n initiates for the k-th time at some time t' < t but has not initiated for the (k+l)-th time at some time t" < t. Thus x(nt) is the number of initiations of node n up to, but not including, time t. If n. -- nj define bi.(t) = A. + x[ni (t-Ti)] - x(njt) i, i]j 1 ) where x[ni,(t-Ti) ] = x(n.it-Ti) if n:. does not initiate at time t-T. = x(n.it-i) + 1 if ni does initiate at time t-T.o 1 11 1 bij (t) is interpreted as the number of data words on branch (ninj) at time t,

10 Theorem 2 Let G be a computation graph. With each node n. of G associate a set of real numbers itr r t < r=Ol, a.,) There exists an execution of G such that n. initiates 0 1 r only at times t. t., 0o, t.,..., if and only if for all branches (ni,n.) of G we have r1 J r+A.. tr rj t. +. <t. 1 i — j PROOF r+-A,. Let n. - n. and suppose there exists r such that t. + T. > t. 1 j 1 1 ci Then r+A. r ij t. + T. = t. +. 1 1 j ij where e. > 0. Therefore ij r+A. or+A.. r+A. bi (t. A) = A + x[ni (t T) - (nt )'i c 1 -' i i]-x(njt Aij + x[ni(tr - ej] - (r+Ai) < A. + r - (r+A..) - ij'ci = 0 r+A.. which contradicts the condition that n. initiates at time t. 1 J J We shall prove, by induction on n, that for all nodes n. of G, n. nyl~~~~~~~J 0 initiates at time t.. Let the distinct numbers of the set (tI) be J i T1 < T2 <.o0 < T. For the case n=O, we apply induction on the subscript r of T r

11 0 0 0 Suppose r=l and let t = T1. Then if ni - nj, we have t? + T. < t. l j 1 l- 0j If Ai = 0, then t. + T t T1 and since T. > 0, t. < T1 a contradiction. Hence A.. > 1 for all branches (n.,n.) so that n. can initiate ij 1 <3 at time to, Assume, for all nodes n. for which ti < Tr, r < m, that n. initiates 1 1- 1 O 0 at time t. Suppose t = Tr+ll If Ai > 0 for all branches (n.,n.) of 1 j r-tl ij 13 G then no can initiate at time t.. Otherwise A.. = 0 for some branch (ni.,n) so that o ij t. + T. < t. t= T 1 1 - 3 r+l i.e. ti < T so that t = T for some q < r. By the induction hypothesis on r n. initiates at time t But then n. terminates at time 1 1 1 t. + T. < t so that at time t. the branch (ni,nj) contains a data word. 1 j j c a This is true for all input branches to n. so that n. can initiate at ~0~~~~~J time t.o This completes the induction on r and hence establishes the result for n=O0 Now assume, for all nodes n. of G, that n. initiates at times 1 n nl t 0.t o.t. We prove that n. initiates at time tn+ J Jf~~d, J J J Let T1 < T <.< < T be the distinct members of the set (t. } Again we apply induction on the subscript r of Tr* n+l Suppose r=l and let t = T1. We prove that Aij > 0 for all branches (ni,nj). For if not, there exists a branch (ni9nj) such that A.. = 0 Then n+l+A. tn+l +i <t. n+l T i 1- J j 1t ice. t < T1 a contradiction. Now for any branch (ni.,nj) 13

12 b(n+l). - bij(t = Ai. + X[n ] -x(n., t+ ) Suppose n+l < A... Then (n+l n+l bij(t. ) > Aij - x(nj, t.l ) ij 3 -- 3 3 Aij - (n+l) by the induction hypothesis on n > 0. n+l-A.. Otherwise n+l > A.. and t. ij is defined, so that - ij 1 n+l-A.. nl t. + T. < tn+ Therefore n+l-A.. niJ~t'nf 1 ) + tn+l b (tn+ > A.. + x[n,(t ) - x(n, t ) ii i - i I i But A.. > 1 so that n+l - A.. < n. Then by the induction hypothesis on n, -j 1jn+l-A.. x[n (t ) +] = n+l - A. + 1. n+l Also, by the induction hypothesis on n, x(nj. t ) = n+l. Hence n+l tn+l bi (t. ) 1 and n. can initiate at time t. n+l Assume, then, for all nodes n. for which t. < T, r < m, that 1 1 - r 0 1 n+l n. initiates at times t., t.,..., t. Let n. be a node for which 1 1 t. = Tr+. If Aj > 0 for all input branches (ninj) of nj, then by ~j r~r-J+l 1 3 the same argument as above for r-l, n+l b (t n+l) >1. 13 3 and n. can initiate at time t. l Otherwise A.. = 0 for some branch J J 1J (n o n)) Then n+l n t +l t.3 + T. <t = T i 1 - 3 r+l

13 n+!I i.e. t.1 < T so that t. = T for some q < r. By the induction hypoi Tr+l 1 q - 0 1 n+l thesis on r, n. initiates at times ti. ti...., t. Then (tn+l i+l + n+l bij(t x[ni,(t - i)]- x(n, tn ) n+l n+l But t. T > t. Therefore 3 l- 1 n+l n+l) + n+l bij(t ) > x[ni (ti ) ] - x(n t ) = (n2) - (n+l) by the induction hypothesis on n. Hence all input branches to n. contain n+l n+l at least one data word at time t so that n. can initiate at times t This completes the induction on r and hence on n. Corollary 3 Let G be in-point connected and y > Tot Then there exists an execution of G under which, for all nodes n, of G, n. initiates 3 3 only at times tj. t. + y, t. + 2y,.o.,. PROOF: For each node nk of G put t t + ry r=0,1,.. where t. is r i+ obtained from the assignment (1). Then by Lemma 1, t. + T. < tj j 1 1- J for every branch (ninj). The result now follows from Theorem 2. For an arbitrary nonterminating computation graph G, let us define the computation rate of a node n. of G to be x(njt) p = lim t J t-oo t if this limit exists. In the case of a computation graph G under a periodic execution with period p, we have p. = 1/p independent of n.. 3ss e h s d 3ecmfJ It thus makes sense in this case to define the computation rate of G to be p = l/p.

14 Theorem 4 Let G be in-point connected. With 7 = rc, the assignment (1) yields the maximum computation rate of G under any periodic execution. PROOF: Consider a periodic execution under which, for each node n. of G, no initiates at times tj, tj+p, tj+2p,.o.. Let L be a loop of G for which tL = it, and let A be the sum of the number of data words initially on the branches of L. Then for any node n. of L, x(nj, t. + Ap) - x(nj,t ) = A Thus, in time Ap, a data word on L passes once around the loop. But the minimum time in which this can be done is Z T.; i.e. Ap > Z -t. L c L J i.e. p > r = i t - L Note that the proof of Theorem 4 reveals that, with y = Tc, the assignment (1) yields a periodic execution of G such that if a node n. lies on a loop L with cL = C, then p. is the maximum computation rate Lj i of the node n. under any (periodic or not) proper execution of G. i In the actual calculation of times t. for a given graph G, in the case y = tr, the following observation will be found to be useful: If ni - n., and n.,n. are on a loop L for which tL = ic, then 1 J 1 J L t = t. + T. - A A.. j 1i1 ij PROOF: Let L be n. -n. - nkr-...-n r-n.e 1i j r i Then by Lemma 1

15 t. >t + T. - A. J - 1 1 i,J tk >t + T. - T A. k- j j,k t. >t + T - T A i- r r ri Therefore O>ET-TE A= O L L i.e. equality must hold for each of the above inequalities. Observe that all of the results of this section have been proved under the assumption that a node n. can initiate at any time that J there is sufficient data on all of its input branches, i.oe regardless of whether or not n. has terminated its previous initiation, If we J wish a model in which no node can initiate unless it has terminated its previous execution, we can obtain it by simply adding a branch (nj.,n) with A,. = 1 for all nodes no We call such a branch a selfJ' J JJ - loop. An Example ~~~~~~~~1 T 1/2n' = 3/2 ~1. 3~ ~~~~ "6 ) n4 7~6 n3~ 2 0 n3 1/2=n = 2 2

16 The initial numbers of data words is shown on the branches. t = 15/8 corresponding to the loop nl - n2 -> n3 - n4 -n5 nl. Then we have immediately, t1 = 0 t =0 t2 = t + 1- A2 = - 7/4 t 3t t3 2 2- t A23 = - 5/4 t4 = t3 + 3 - t A34 = - 5/8 5 t4 + T4 - t A45 = 3/8 Since (n6,n3) is the only path from n6 to n3, we have t6' A63 - C6 + 3 = 29/8. Similarly, = A76 - 7 + t6 = 13/8 t8 =t A87 -T8 + t 3. We note that it < T1, T3 T7 so that nl,n3 and n7 initiate at times when they have not yet terminated their previous initiation. If this is undesirable, we can place self-loops on nl, n3 and n7. It then follows that for this new graph, t = max (T1, T3, T7) = 5/2. The corresponding t. are: t = 0 t = - 1/2 t = t4 = 0 t = 1 t6 = 27/4 t7 = 19/4 t8 =27/4. 5

17 B. Extension to Weakly Connected Directed Graphs A directed graph G is said to be weakly connected if the undirected graph obtained from G by removing the arrows on the branches of G is connected. We shall show how the above periodic schedule for in-pointconnected computation graphs can be extended to weakly connected directed graphs. A subgraph G' of G is a maximal in-point connected subgraph if there is no subgraph properly containing G' which is in-point connected, If H is a subgraph of G, by G-H we mean that subgraph of G generated by those nodes of G which are not nodes of H. If H' is another subgraph of G. H+H' is that subgraph of G generated by the nodes of H and H'. Let G be a weakly connected computation graph and G' a maximal inpoint connected subgraph of G. If G' = G, the results of Section 2A are sufficient. Otherwise let G" be a maximal in-point connected subgraph of G-G'o It follows that in G, there is no path from a node of G" to a node of G'. Define a set of ordered node pairs N' = ( l{(ann) | n and n" are nodes of G' and G" respectively, and nk -nr'}T Assign real numbers t. to the nodes n; of G' by (1) of Section 2A, treating G' as a computation graph by itself. Similarly, assign real numbers t" to the nodes n" of G". Then, by Lemma 1, for all nodes j 0 n' n! of G' such that n' -on 1 J 1 J to + T. < t. + yAo. 1 1 -,0'Tj Similarly, for all nodes nn" of G" such that n -> ni tT + -k < to + Y~q Define a real number fa" as follows:

18 If N' =, at = 0. Otherwise a" = max ft + Tk t -7A Irk 2k' t - 7 kni P9 For all n" of GT' define t = tVI + cat Ther clearly, for all nodes ni,nj of G' + G" ti + i < t. + yAi. In general, suppose G = G' + G" +.. + G(r) where G(i) is a maximal in-point connected subgraph of G - (G' + G" + o.. + G (i-l) i=l12,o.org and that for all nodes ninj ofl such that n, - n. there exist real numbers t.itj for which t. + T. < t. + yA..o 1 1- J 13 If G = G, then we are through. Otherwise let G(r+l) be a maximal inpoint connected subgraph of G -. Then in G there is no path from a node of G(r+l) to a node of. Define N(r) = ((nn( ) nk and n r+l) are nodes of and G(rl) (r+l1) respectively, and nk -n n } Assign real numbers t r) to the nodes n ( ) of G ) by (1) of Section 2A. Define a real number a(r+l) as follows If N(r) = (rl) 0 Otherwise (r+l) max (t, + Tk - tr+l) - A) /r~k snr \ / f k k 1 naly for all of G defi Finally, for all n (r+1 of G(r+l) define 2

19 t - = t(l + (r+1) Then by the same reasoning as above, for all nodes n.inj of f + G(r+l) such that n. ->n, 1 3J t. + T. <t. + yA.. 1 i - J ij Since G has finitely many nodes, this process must terminate with G = G' +... + G(k) for some k. We then have t. + T. < t. + yA.. m i- J 13 for all n.,n. of G such that n. -> n. By Theorem 2, G then has a sequence of initiation times t. + ry, r=0,1,.,., for each node n. of G. C. A Dual Assignment Let G be a computation graph with the property that there exists a node nL such that for every node n. n1, there is a path 3t(ni) from n to no. G is said to be out-point connected, and n1 is called an out-point. Let t(ni) be n. - n. n. -n. here ni. 11 2 ik- 1 1k 1 and n. = n. k - k-l Write E A.. as E A Ir r+l r=l tr(n,) k-l and T as r r=i1 a ((n) For y > rJ, consider the following assignment of real numbers To to the nodes n, of G: 1

20 T = 0 T = max [ Tr- 7 A] for n, / nl ~(ni) c(n.) n(n.) Then it can be proved, in a manner similar to that of Section 2A, that an execution exists for G such that for all nodes n. of G, n. initiates J. J only at times T. + ry, r=01,...,. An extension of this result to weakly connected directed graphs is possible in essentially the same fashion as in B above except that N (r) is taken to be N() = ((n(,nlk) I an n are nodes of and G(rl) respectively, and n ( - nk Suppose G is strongly connected. Then we may choose a node n1 to be the in-point of the assignment (1) and the out-point of the above dual assignment. It then follows that for all nodes n. of G we have To < tio Moreover, if i.} is a set of numbers which satisfy! + T <. + yA.i 0 i 1 - j ij 1 for all nodes nin. of G such that n. -X n., i.e. such that nk initiates i3 1 3 only at times 0k + ry7 r=0,1,.o., then Tk < k0 < t for allk nodes - ko- kG for all nodes rL of G.

21 Do Computational Algorithms For computation graphs G with a large number of nodes the calculation of ir by inspection of all the loops of G, and of t. by inspection of all of the paths from n. to nl is a prohibitive task. Accordingly, the following more efficient algorithms are given. (i) Calculating r This algorithm is due to Lawler [4]. Define a recursion variable un) (m) as follows, where the indices i,j correspond to nodes ni,n. of G, and m,n are recursion indices. For m=0,l1. o u O)(m) = T if n. - n. and m = A. ij ~ ~ 0 = - otherwise. For m=Olo,.o, n=l,2,.., U() (m) = T. +A maxtjx(xu) (m-Ak) i Aik > 0}) ij k ] ik kik ik > 1K Here uj(m) = uk)(m) where N is such that uk (m) = uO (m). Let P be an upper bound on the number of consecutive branches b.. of G 1j for which A.. = 0. Then N < P. For all ij, compute the quantities u. (0), u. (l),..., u. (M) 013 u ( 13 13 () ij l0 j where M is an upper bound on the sum of the A's around any loop. Then t = max (uii(1), u.i(2)/2, u.i(3)/3,..., u.i(M)/M)} The comi 2 putation in this case grows as MPN where N is the number of nodes of Go

22 (ii) Calculating {ti and (T.i If n. -> n., define a.. = A. - T Then 1 3j Tij ij 1 t. = imin { Z a) t (n ) t (n) where Z a = y E A - A (ni) E(ni) (ni) Moreover, for any loop L, Z a > 0 since L - y7 A - Z T > n:Z A - Z T > 0 L L L L This condition guarantees the convergence of the following algorithm which is a slight modification of one given in [2]. 1) Start by assigning all nodes n. labels of the form [-,O(ni)] 1 where O(nl) = 0 (nl is the in-point of G), (ni) = oo n. / nl. 2) Search for a branch (ninj) such that O(nj) + a, < 0(ni)o Here oo + a = co). If such a branch is found, change the label on node n. to [nj,, (n,) + aij] and repeat. (That is, the new O(ni) is G(nj) + aij.) If no such branch is found, terminate. The value of 0(nk) upon termination is tk and the shortest path from nk to nl is obtained by tracing the sequence of first co-ordinate nodes from nk to n10 Suppose that G is not in-point connected and let nl be a node of Go Then this algorithm has the virtue of yielding minimal paths from all nodes n. of G to nl whenever they exist. Thus the set of all nodes nk for which upon termination (no) < oo represents a maximal in-point connected subgraph of G. We can combine this observation with the results

23 of Section 2B to yield the following procedure for computing (ti) for an arbitrary graph Go 1) Choose a node n1 of G and apply the above algorithm to yield G1, a maximal in-point connected subgraph of G, and a set (ti I n is a 1 node of G'}) 2) Determine G - G' and repeat (1) applied to G - G', yielding a maximal in-point connected subgraph G" of G - G', and a set I(t? I n' is a node of G"to 1 1 3) Determine the set (ti In'.' is a node of G") by the method of Section 2B. 4) Continue in the obvious fashion until all of the nodes of G are exhausted. The dual computation for the set (T. is as follows, If n. -> n. define a. = 7yA. -.. Then for any loop L, Z a > 0. 1J ij L - 1) Start by assigning all nodes n. labels of the form [-, G(ni)] where O(nl) = 0 (nl is the out-point of G), O(ni) = o, n.i n. 2) Search for a branch (ni,nj) such that 0(n.) + a. < 9(nj). (Here o + a = oo) If such a branch is found, change the label on node n. to [ni, O(ni) + ai], and repeat. If no such branch is found, terminate Then if n has value O(nk) upon termination, Tk = - 0(n') Again, if G is not out-point connected, this algorithm yields maximal paths from nl to all nodes n. of G whenever they exist. {T.} for an arbitrary graph G can be determined in a fashion parallel to that given above,

3. Synchronous Computation Graphs Suppose that the node initiation times of a computation graph G are to be controlled by a clock signal. Then node initiation times are constrained to be integer multiples of the clock period. But if it is not an integer the maximal rate periodic schedule of Section 2 is not suitable. Indeed, no maximal rate periodic execution of G, with integer initiation times, is possible in the case that T is not an integer. This remark follows from the proof of Theorem 4 where it is shown that if p is the period of a proper execution of G, then p > At, so that if G must have integer initiation times, we must have p > Frct (the greatest integer containing A). Then, clearly, if tA is not an integer, 1/p < 1/t, the maximal computation rate. In view of this contingency, we consider computation graphs for which the node execution times (T.i are all positive integers, and in which a node n. is permitted to initiate only at integer times. We call J such a computation graph a synchronous computation graph. The principal result of this section is the determination of an. execution (the free running execution), for a class of synchronous computation graphs, under which G computes at the maximal rate 1/to Theorem 5 considers the most general form of a computation graph, as defined in [3], 24

25 Theorem 5 Let G be a synchronous computation graph. Let ( be an execution under which, for all jj, node n. initiates at time t if and only if for all branches b.. into n., the ij number of data words on b.. is greater than or equal to'j T.jo Let' be any other execution. Then x(nj,t) > x'(njt). PROOF: The proof is by induction on t. For t=O the result is immediate, Assume, for t < n, that for all nodes n. of G, x(n.,t) > x'(njt). Let t = n+l and consider node n.o J (1) Suppose n. initiates at t=n under ~. Then x(n.,n+l) = 1 + x(n. n) > 1 + x' (nj.n) > x' (n.n+l) and the result holds, (2) Suppose n. does not initiate at t=n under both e and g' Then x(n.,n+l) = x(nj,n) > x' (n.n) = x' (njn+l) and again the theorem holds. (3) Suppose nj does not initiate at t=n under S but nj does initiate at t=n under K'. Then there are two possibilities, (a) x(nj,n) > x'(n.,n) in which case x(nj.n+l) = x(n.jn) > x (nJ^n) + 1= x'(nj n+l) o (b) x(nj,n) = x'(nj.n). Since n. does not initiate at t=n under g but does initiate at t=n under'I, there is a branch bij = (ni.nj) such that b. J(n+l) < T i and b! (n+l) > T.. where the prime indicates queue length under W'. Then, in particular, bi,(n+l) < b;,(n+l)9

26 i.e. Ai. + U. j x(nin+l) - Wj x(njn) < A + U. x' (ni n+l) - Wij x'((njn) where we write x(ni.n+l) as the number of terminations of node n. in time n+l under A, and primes correspond 1 to'o Hence x(ni.n+l) - x'(n.in+l) which contradicts I 1 the induction hypothesis x(nit) > x'(n.,t) for t=Ol,o,n. We call the execution & in the statement of Theorem 5 the free running execution. Theorem 5 then establishes that a synchronous computation graph executes at the fastest possible rate under the free running execution. Note that Theorem 5 has been proved under the assumption that a node ns can initiate at any time that there is sufficient data on all of its input branches; i.e. regardless of whether or not n. has terminated its previous initiation. If we wish a model in which no node can initiate unless it has terminated its previous execution, we can obtain it by simply adding a self-loop to n. for all n. of Go a J 3.1 Synchronous Graphs with T=1, U=W=T=1 We consider synchronous computation graphs with T. = 1 for all nodes n, and for all branches bi. = (n.,nj) we have U. = Wi. = T.. = 1. ~1 ij N 11 ij ij 1 Later we shall see how the results obtained with T,=l may be applied to a graph in which n. has associated with it an arbitrary positive integer execution time.

27 Let n be a node of G and for k > 1, Sk(n) a sequence of (not necessarily distinct) nodes, (nOnl,...,n p+l, such that (1) n+ -4n -...- - n1 * n with n = n. p+l (2) Aj k. j=1 P (3) - < k. j=l We call such a sequence Sk(n) a k sequence. Put A = A1 A1 = A21 A =A A = A p-l pp-1 p-1 A = k - A. i=O Define Ao = Ao and for 1 j _< p let A' = A. + A' + K. J J J-1 J where K. = - 1 if A! > 1 0 if A' 0. J-1 Define dk(Sk(n)) = p + A' - 1. Intuitively, the significance of dk(Sk(n)) is as follows: Let wk be the k-th data word "backed u~' from node n along the node sequence

28 Sk(n). Then dk(Sk(n)) is a measure of'"how far away" wk is from the node n. Example ok 3 0 1 n? 3 n1 n2 n3 S4(no) 0 1 2 3 4(n) d4(S4(nO)) = 3 0 2 0 0 3 nO n1 n2 n3 n4 n5 S4(n0) d4(S4(n0)) = 5 Define Tk(n) = max (dk(Sk(n))). Sk(n) We shall be interested in tracing the progress of the k-th data word through time along a k sequence Sk(n). To that end we make the following definitions: Let Ao,A1,...,A be a sequence of nonnegative integers. p Let e0 e1,..., e and 50 51..., 5 be sequences such that (1) E = 0 or 1 i=0,l1,.9.p-l, e = 0 (2) 8. = 0 or 1 i=Ol...p (3) =1 < > i+l = -i=0,l,..1 p-l. A sequence Ao(l), Al(l),..., A (1) is said to be 1-admissible if Ai(l) = A. + c. + 6i i=O,l,..,p. i 1 1 In general, if Ao(t-l), Al(t-l),..., A (t-l) is (t-l)-admissible, then the sequence Ao(t), Al(t),..., Ap(t) is said to be t-admissible if Ai(t) = Ai(t-l) + c. + 6. where e. and 6. satisfy the conditions (1) - (3) above.

29 Let Sk(n) be the k sequence np+l - n ->...- no, n0=n, and let A. = A i=0,l,...,p-l i Ai+lgi' l p-1 A = k - A. i=O Then the k-th data word wk "backed up" from node n lies on branch bp+ K P+l.P Let AO(t), Al(t),..., A (t) be a t-admissible sequence derived from AOg A,.o. A, and let r be the maximum subscript such that A (t) ~ 0. 0 p r Then at time t, wk lies on branch br+lr and the sequence Ao(t), Al(t),..., Ar(t) yields the branch distribution of data words "between" node nO and wk Define Ax(t) = Ao(t) and for 1 < j < p, let A!(t) = A (t) + A! _(t) + K.(t) where K.(t) = - 1 if A (t) > 1 = 0 if A _(t) = 0. Let r be the maximum subscript such that A (t) 0. Define d(Sk(n),t) = r + A'(t) - 1 k r d(Sk(n),t) is the "distance" of wk from node n at time t. Define Tk(nt) = max (d(Sk(n),t)). Sk (n) Lemma 6 A'(l) = A' + 9 v e {-1,0,1), r=0,l,...p r r A'(1) = A' + 1 =1 r r r A'(1) = A'- 1 E =0 r r r

30 PROOF: The proof is by induction on r. Take r=O., Then A( Ao(l ) = A) + A o + o + Eo Ao + ~o and E0 + 50 e {-1,0,1). Clearly, A6(1) - A + 1 = 1 A6(l) = A6 - 1 -= _ =o 0. Assume the result to hold for r=m-l and consider A'(l) A (1) + A^ (1) + K (1) m m m-1 m = A + E + 5 + A' (1) + K (1) m m m m-1 m A'l + c + + Al (1) At + A (1) - + m m m m-l m-1 Km() - Km (i) Suppose A (1) = A' m-1 m-1 Then K (1) = K and m m A'(1) = A' + E + m m m m i.e. A'(1) = A' + v. m m l-lrl' Al(l) =A' + 1= A'l(1) = A' - 1 = 0. m m m Therefore A'(1) = A + - 1 + 1 -1 - K A' + " 1- K = A' + v Finally - 1 1 > e = 1 c - -K =-1 -- e = 0. m m m (iii) Suppose A' (1) = A Then 1 T he induction hypothesis M~l m - 1 hmp =0 so 5 0. Also Ae >i 1 so that K = 1. Therefore A'(1) = A' + c - 1 + K (1) + 1 m m m m = A' + +K(1 m m m

31 and m + Km(l) e -1,0,1}. Clearly, e + K (1) = 1 > e = 1 m m m E + K (1) = -1 0. m m m Let AO(1), A1(l),..., Ap(l) be a 1-admissible sequence derived from AA1,..o.,A. Let r be the maximum subscript such that A' - 0. p r-l If no such r exists, take r=O. We say that AO(l), A1(l),. 9 A (1) is freely derived from A0g A^,..., A if p (i) = - 1 (ii) A' = 1 r < s < p-l S - Lemma 7 Let Ao(l), Al(1),..., A (1) be freely derived from Aog A,... Ap. Then Ap(l) A' - l p p PROOF: Suppose rfO. By Lemma 6 we have A' (l) e (0,1}, i.e. A' (l) + Kr(l) = 0 Then A'(1) = A (1) + A' (1) + K (1) p p p-l p = A(1i)+' [A (l A 1))+ + K( )] K (1) A ee e!* ~* W p-2 + o e ~ Ap(l) +. e A'(l) + Ar(l) t'K~( l) + K a (l) p r r-1 r Kr+l + o... + K(l) = A (1) +... + At(l) + Kr (l) +. + K (l) P r rsl P = c + A +... + A + 5 + K () +.o + K (1) p p r r r+l p =A + + A K + A + K+ + (K (l) -K p r p Kr+) + p +oo. (+ (I)(1) -K 4)+ c ~ 6 o r~~l r~l p r

32 But since A' = 0 we have r-1 Ap = A +... + A + +. + K P P r p Hence A'(l) = A' + (K(l) - K) +... + (1) - K + + p p p p r+l r+l P r If r=O we obtain the same expression for A'(1): p A'(l) = A (l) +... + A(l) + Kp() +... + Kl(l) A +...+ AO + 0 K(l) + K(l) A +.. + Ao + K +... + K6 + (K () K) p 0 p p 0 p p +... + (Kl(l) - K1) = A + (Kp(1) - K +..+ (K1()1 - K1)+E + 0 P P Now if A' > 1, p-l > s > r then by Lemma 6 A'(1) > 1 so that S - - K +(1) = K = - 1. Suppose A' = 1. Then = 1 and by Lemma 6 s+l s+l s s A'(1) > A' = 1, i.e. K +(1) = K1 = -1. S - s s+l s+l Since r is the maximum subscript such that A' = 0, these are the r-1 only possibilities for A'. Hence s A'(1) = A'+e + +. p p p r But e = 0, 6 - 1, i.e. A(l) = A' - 1. p r p p Lemma 8 Let G be a synchronous computation graph under the free running execution. Let Sk(n), np+1 -> n ->... ->n n0 =n, be a k-sequence for which d(Sk(n)) = Tk(n). Then for 0 < t < Tk(n) (i) d(Sk(n),t) = Tk(n) - t (ii) d(Sk(n),t) = Tk(n,t). PROOF: We prove the result for t=l. Then by (ii), the conditions for the lemma to hold are valid at t=l and we have a basis for iterating the

33 argument which established the result for',t1. If Tk(n) = 0, there is nothing to prove. Hence assume Tk(n) > 1. (i) Consider the sequence AoA1.o,.,A where A. = A.i+l i=lo P-l p 1 and p-1 A k - A. i=O Define, for i=Olooop,. = 1 if and only if node n. initiates at 1 1 t=O. Otherwise 5. = 0, For i=Olo,,,.,p-l, put c. = 1 if and only if 1 1 6i = - 1. Otherwise e. = O0 Finally, let e = O0 We prove that the i+l 1 p sequence AO(l), A (l),.., A (1) defined by Ai(l) = A. + e. + 6i p m i i=,0100oo0p is a 1-admissible sequence freely derived from Ao)Alo.oA o Let r be the maximum subscript such that A' = 0 If no such r r-l exists take r=O. We show that n initiates at t=O. Clearly, A > 0. r r Suppose there exists a branch b into n such that A = 0O r'+lr r r r'~r Let S'(n) be a k-sequence n n, >o - -n - n - n where k. qj+ q' r'+1 r' r'=r. Then since A' = 0 or in the case r=O, we must have r-ld(S{(n)) = q' + A' -1 q' +A,+, +A + K + + K - 1 q''~ r' qg r'+l But since A' = A + A' + K r' r' r- r' = 0, then Kr = 0. Hence dk(Sl(n)) = q' + Aq, + + A, + Kq + o + K Kk qr' j r'+~' r'+2 = q' + A + + A + K +.. + K - 1 p r q' r'+2 > q' + A +,,. + A - [q' + 1 - (r'+2)] - 1 p r = A + A.o + A + r P r - (Sk(n)) + 1

34 a contradiction. Hence A > 0 for all branches b into n and rr'+lr r n initiates at t=O. Now let A' = 1 for p-1 > s > r. Then by a similar r s - argument to that just given we must have that n+l initiates at t=O. It follows that AO(l), A1(l),.., A (1) is a 1-admissible sequence freely derived from AOA1..,A. By Lemma 7, A'(l) = A' - 1. Then if Ap > 1, p d(Sk(n),l) = d(Sk(n)) - 1 = Tk(n) - 1. Suppose A' =1. Then since Tk(n) > 1 we must have p > 1. Then p - A'1 E {0,1). If A' = 0 then since A ), A((l),..., A (1), is freely p4 p1 p derived from Ao,A1,...,A, we must have e = 1. Since A' = 0 implies p p-1 p-1 A -= 0 then 5 p =0 and A (1) = 1. By Lemma 6, A' 1(l) =1 and p-! p-1 p-1 p d(Sk(n),l) = p-l + Ap - 1 = (p + A' - 1) - 1 = (p+A -1)-l p (Sk(n)) - 1 Tk(n) - 1. Finally, we consider the case A' = 1. Let r be the maximum subscript p-l such that A' = 0. If no such r exists, take r=O. Then by the same argur ment as in the proof of Lemma 7, we obtain A' (1) = A' + + p-l p-l p-l r Since AO(l), A1(l),..., A (1) is freely derived from Ao,A,...,A p we have p-l=, = - 1 and p-1 9 r A (1) = A' = A' p- p-(n)l) = T so again d(Sk(n),l) = Tk(n) - 1.

35 (ii) Let S1(n) be a k-sequence and suppose d(Sk(n),l) < d(S (n),l). (1) Then d(Sk(n)) - 1 < d(SB(n),l) < dk(Si(n)), oeo, ~ dk(Sk(n)) < dk('S(n)). But dk(Sk(n)) = Tk(n) so that dk(S (n)) = Tk(n) By (i), d(Sl(n),l) = d(S(n)) - 1. By (1), d(Sk(n)) < dk(S(n)) a contradiction. Theorem 9 Let G be a synchronous computation graph under the free running execution. Then node n initiates for the k-th time at t = Tk(n) PROOF: Let Sk(n) be a k-sequence for which dk(Sk(n)) = Tk(n)~ Then by Lemma 8, d(Sk(n),Tk(n)) = 0 = Tk(n,Tk(n))o Thus at t = Tk(n) every input branch to node n contains a data word so that n initiates at time Tk(n), Finally, if wk is the k-th data word "backed up" from node n along the sequence Sk(n) at t=O, then at t=Tk(n), wk is the first data word "backed up" from node n along Sk(n), ie. x(nTk(n)) = k-1 and n initiates for the k-th time at t = Tk(n). Let ~(t) be the vector with components bi (t) for all branches (n9n ) of G.

36 Lemma 10 Let G be a strongly connected synchronous computation graph under the free running execution. Then there exists a nonnegative integer t' and a positive integer X such that for all t > t' b(t+\) ='(t). PROOF: Since G has T. = 1 for all nodes n. of G, V(t+l) is uniquely 1 1 determined by b(t), i.e. bij (t+l) = bij (t) if there is (not) a branch bki with bki(t) = 0 and there is (not) a branch brj with brj (t) = 0 b. (t+l) = b i(t) + 1 if for all branches b ki bki(t) > 1 and there is a branch br such that b (t) = O. bij(t+l) = bi (t) - 1 if for all branches b., b (t) > 1 and there ij ij *rj rj is a branch bki such that bki(t) = 0O Consider the sequence B = b(0), b(1),...,. Since G is strongly connected, the queue lengths of G are uniformly bounded above (See [3-] for a proof of this statement.) Hence there exist finitely many vectors b(t) so that some b(t) of B must repeat in the sequence. Therefore there exists a positive integer X and a nonnegative integer t' such that b(t'+X) = T(t')o But, by the above remarks, if V(t'+%) = V(t'), then t(t'+x+l) = V(t'+l), etc. Hence the Lemma. Lemma 11 Let G be strongly connected. Then for any pair of nodes n. and n. of G, x(n, t) lim = lo x(n.,t) t -> oo J

37 PROOF: Let ni -4 n. Then bik(t) = Aik + x[ni,(t-Ti)+] - (n, t) Since U. = W i= T.i= 1 for all branches bij= (ni.nk) of G, and since G is strongly connected, then bik(t) is uniformaly bounded above and G is nonterminating. (For the proofs of these statements, see [31.) Hence x[n9, (t-:i)+] x(n.,t) lim lim x( = 1 t-)oc x(nk't) t - o00 k^ Since G is strongly connected, there is a path from n. to n, so we: 0 may iterate the above argument yielding the statement of the lemma. Lemma 12 Let G be a strongly connected synchronous computation graph^ under the free running execution. Then for all t > t', b(t+~) = b(t) if and only if for all t > t', x(njt+X) - x(njt) = c a constant independent of n. and t. PROOF:~ Let nj initiate at time t. Since bij (t) = bi (t+~) for all branches b.j, n. must initiate at time t+\. Suppose n. initiates at times t t2.oo.t where t' < tl < t2 <.o1 < t < tl+\. Then by the above remark, n, initiates at times t1 + + t + A, t1 + 2.9 o.,o i.e. x(n.,t+X) - x(njt) = a., independent of t. Now let nk be another node of G. Then we obtains similarly, x(nkt+X) - x(nkt) = ai independent of t. By by Lemma 11, x(nkt+rk) lim = 1 lim x(n,t+rA) = r - oo j

38 Therefore C/a = lo =C- ~ ~%= 1. bi (t+x) = A + x(ni, (t+x-1)+) - x(nj,t+%) = A.i + x(ni,t+%) -: x(nj,t+%). But x(ni,t+X) - x(ni, t) = x(nj,t+%) - x(nj,t). He1 3 Hence bij(t+\) = A.i + x(ni.t) - x(nj.t) b.j. ( A. = b (t). It follows from Lemma 12 that V(t) has period X if and only if for all nodes n. of G. n. initiates at times 1 1 t1 tl-i ti9 tiV ~'P9 i t~ t. 1 o,0 t. t+ + k t, +,, t + 1 1.-1 t. + 2., t. + 2X,..., t. + 2., 1 1 1 where t' < ti < t1 <. < ta- < t + k. -1 1 1 1 Lemma 13 Let G be a strongly connected synchronous computation graph under the free running executionO Then for any node n lim Tk(n) _ k - oo k a PROOF: By Theorem 9, for any node n of G x(n,Tk(n)) = k-1. Therefore

39 Tk(n) Tk(n) x(n,Tk(n)) k-1l and Tk(n) Tk(n) lim k = lim - (1),r -~ cO x(n, Tk(n)), k v By Lemma 12, for all t > t' x(n,t+rk) - x(nt) = ra i.e. lnim x(n,t+r%) lim lim = lim, t+r% t+r% r -4 oo r -oo = a/\ Hence- by (1) Tk(n) lim k = k - oo Theorem 14 Let G be a strongly connected synchronous computation graph. Then /oa = max(lc ) PROOF: Let L be a loop, length 2, of G for which TIL = t and let A be the total number of data words on the branches of L. Then, since a data word requires at least I units of time to pass once around L, we must have x(nt+l) - x(n,t) < min(~,A)for any node n of L. By Lemma 12,

40 x(n,t+2\) - x(n,t) = la for all t > t'. By the above remark la < r min (1,A) (1) i.e. \/a > max (1,c) (i) Suppose Tr > 1. Let n be a node of G and let Sk(n), n -4n -. -. n -+ n.. - n, no=n, be a k-sequence such p+l p r+l' r nr that dk(Sk(n)) = Tk(n). Let r be the maximum node subscript such that if nr+l - nr -... - n is an m-sequence S (n), then d (S (n)) = r M m r exists by the definition of dl(Sl(n)). We shall prove that k-m is uniformly bounded above. Consider the (k-m+l)-sequence, Sk m+l(nr), n -p n - -o. n -, n. By the manner in which r was chosen, we p+l p r+l r must have -m+l(Sk-m+l(nr)) k -m (2) and also Tk(n)= dk(Sk(n)) = d(Sm(n)) + \_m+l(Sk m+ (3) We prove that dk-m+l (Sk-m+l (nr) = Tkm+ (nr) (4) For let S m+(n), n - n -.. -> n be such that k-m+l r) q q-1 r -m+l(S-m+l (r)) > dk-m+l(Sk-m+l(nr) ) Then Sk, -n -.n n ->... 1 - n is such that n9 q q-l r r-l d( n)) (S (n)) (S(n)) + dk-m+l(Sm+(r)) > m(Sm(n)) + dk -m+(Skm+l(nr)) = (Sk(n) ) = Tk (n) a contradiction.

41 Now suppose that k-m is not uniformly bounded above. Put s = k-m. Then there exists some subsequence (si] of (s) such that s. - oo. But then T ) (nd S (n)) 1.+l(nr s.+1 s.+1 r l im li 1 s+l by (4) si -0 o i si -400 i S. — o Is. -- o i 1 1 1 - urnlir s by (2) S. - 00 1 1 But Ts(nr) lim = A\/ > C > 1 by Lemma 13 and (1) S -4 o a contradiction. Hence there exists M such that k-m < M for all k. Now let P be a path from n to n and let I be the total length of p+l the circuit defined by the sequence np+l - n ->.. -> n and then back to n+ via P. Let L1, L2,..., Lt be all of the loops making up the above circuit, and for i=1,2,...,t let A. be the number of data words on L. and 2i the length of Li, Then by the choice of r, dm(Sm(n)) r < 2 = 1+'o + t = A /A1 +.. + At t/At t i=l i=l

42 iedSm(n)) i.e. <it -) A. 1=1 1 i=l i.e. by (3) Tk (n) - dkm +l(S- (nr) ) k k t < Ai i=l Now since k - m <M, lim -m+l (Sklm+l (nr)) lim 0 k - oo Also, since k-m < M and since P is a path we have lim k = 1. k - oo Ai i=l Hence Tk(n) lim -k < k. k - oo This coupled with (1) and Lemma 13 yields the desired result. (ii) Suppose T_ < 1. If k-m is not uniformly bounded above, we have, as before

43 Ts (nr) S. r lim = 1 s. So - 00 1 which is what we wanted to prove. Otherwise, we obtain, as before Tk(n) lim Tk() <. (5) k - oo If cr < 1, this contradicts (1) so that k-m is not uniformly bounded above. If i = 1 then (1) and (5) together with Lemma 13 yield the desired result. Corollary 15 If tr < 1, then x = a = 1. PROOF: By Lemma 12, for all t > t' and for any node n of G x(n,t+~) - x(n,t) = a = x by Theorem 14, But x(n,t+%) - x(n,t) = X if and only if node n initiates at times t, t+l,.. t+%-l, i.e. node n initiates at times t' t'+l, t'+2,...,. 3,2 Computation Graphs with T. > 2 These may be reduced to the case T.=1 for all n. as follows. Suppose n. has 1 2 pose n. has T. > 2. 1 1 —

44 Then replace ni by the following sequence of T. nodes each with the execution time 1. It should be clear that all of the above results concerning free running executions of synchronous computation graphs can now be translated into similar results for the case T. > 1, except that n. can initiate at unit time intervals, so that if T. > 2, ni can initiate before its previous execution has terminated, If this situation is undesirable, we may place a self-loop on n.o 1

45 An Example n2, 2=3 L ~t, =2 0,,,'_\, - J..2 1 ~ i/\,T n4T4=2 1 +T +n3,"3=1 i + -2 3+ 3 E= -_ --- = 2. We can transform G into an equivalent graph G' with unit node execution times. G': O"n i /E 7 2 0 O'' n~f k -> n no Mu\ 2 6

46 By simulating G' we obtain the following sequence of branch vectors: t b b b b b b b b b b12b23 b34 b45 b56 b61 b27 b78 b81 0 0 2 0 0 0 1 2 0 1 1 1 1 1 0 0 0 1 1 0 2 0 1 1 1 0 0 1 1 1 3 0 0 1 1 1 0 0 1 2 4 0 0 0 1 1 1 0 0 3 5 1 0 0 0 1 1 0 0 2 6 1 1 0 0 0 1 1 0 1 7 1 1 1 0 0 0 1 1 0 Then b(7) = (l1). Hence k = 6, a = 3. From the above table, we can determine the initiation times of nl, n2,:.n3, n4 of G, which are the initiation times of ni, n, n', n7, respectively. We obtain n1: 0, 4, 5, 6, 10, 11, 12, 16, 17, 18,.., n2: O, 1, 2, 6, 7, 8, 12, 13, 14, 18,..., n3: 3, 4, 5, 9, 10, 11, 15, 16, 17, 21,.o., n4: 0, 1, 2, 6, 7, 8, 12, 13, 14, 18,...,.

4, A Maximal Rate Periodic Schedule for Synchronous Computation Graphs In Section 3 we noted that if tC is not an integer, then no maximal rate periodic execution is possible for a synchronous computation graph. The remark following Lemma 12 suggests the following definition: Let X and a be positive integers. A synchronous computation graph G is said to have a (k,a) periodic execution if for each node n. of G, 0 1.-i 0 there exists integers t < t. < < t. < t. + x such that n. inii i1 1 1 1 tiates only at times t~ t1 tii^ i.9, o i' o 0 0 t. + 2, t + X,.., o ti + K 1 + 2 1 + 2 t + 2 Note that if nt is an integer, then the results of Section 2 yield that a (T 1l) periodic schedule exists for synchronous computation graphs, Theorem 14 suggests the following question: Let it = k/Qa where X and a are integers. Does a synchronous computation graph G have a ( pa) periodic execution? Clearly, if the answer is in the affirmative, then under this execution G computes at the maximal rate 1/It. The principal result of this section is to provide just such an execution for a wide class of graphs. We shall treat the cases it < 19, t > 1 separately. A. i <1 Then we cannot hope to achieve a computation rate of 1/ir since 1/it > 1 whereas a node can initiate at most once per time unit. However, this maximal rate of one initiation per unit time for a synchronous computation 47

48 graph can be achieved with y = 1 under the assignment of Section 2B Bo t >1 For any real number x we write fxl to be the smallest integer containing x, and [xJ to be the largest integer contained in x. Let G be a synchronous computation graph. Consider the following r assignment of integers t. to the nodes n. of G: i 1 t1 = rr + tir where t. is obtained as in Section 2B. Lemma 16 i) r+l r i) ti > t 1 1 ii) tk+ = kX + ti k=0,1,o i 1 p=0, l.eo9 PROOF: (i) Write / = x\/ = = LJ + R/a 0 < R < a-l Then Frr + t'i = rFLiJ r + Rr/a + t] L= J r+ [Rr/c + t'; Therefore tr+l - r = LtJ + r+) til - rrR/a + t > Ll > 1 since T > 1. (ii) ti+' = F (ka+p)\/a + t I r kx + P \/o + ti =kX +: \/a + til = k + t3 1

49 Theorem 17 Let G be synchronous computation graph with t = %\/C \ and a positive integers. Then G has a (%,a) periodic execution. PROOF: Suppose n. - n Then for r=0,1,.oo. we have 1 J tl +,=o = F r + tol +i - frtr + t. + Til since T, is an integer < rcr+tj + ti A.i 1 by Lemma 1 = Ft(r + A.j) + t r+Aij t, 1J J Then by Theorem 2 there exists an execution of G such that node 0 1 r Ln initiates only at times tk, tk0.0o tk1. oo The result now follows by Lemma 16, It is clear that for a synchronous computation graph PG = a/ = 1/n so that G computes at the maximal rate. The following computation aid may be found useful: If n. — > ni, and 1 3 ni,,n both are on the same loop L for which TL = i then tr = tr+A-ij r=O 9, t,=ti L, o 01o PROOF: Let L be n. -~ no - o n - n -- n T b Then by the proof 1of Thrp m 1 of Theorem 17,

50 r+A.. r 1J tr < t. 1 - j r+A. r+A..+A t. ij < t j jk A.. A r+A. +..+A +A.+...+A + A t j pm < t i pm mi m i m r r+LA Therefore t. < t - Z T by adding the above inequalities 1- L = F r(r+ A) + ti 1 -' L L = fr+ t. + T -Z T 1 L L rT= r + ti. since Z T is an integer L t. i.e. equality must hold for each of the above inequalities and in particular, for the first. The obvious dual results hold for synchronous computation graphs under the substitution of T. for t., i.e. Tr = rtr + Til r=Ol oo defines a (Na) periodic execution.

51 An Example n7 0 2 f5 T 1 50 E _ 8/6 = 4/3 corresponding to the loop n4 -> n5 -> n6 < n -> n~. Take X = 4, of = 3. By the methods of Section 2 we obtain: = 0 t2 = - 4/3 t3 = - 3 t4 = - 11/3 t5 = - 17/3 t6 = 22/3 t7 = - 13/3. Hence ~ 0 1 2 t1 = 0 tl = 2 t= 3 11 n2 0~1 2 07 —3 2 22=2 0 1 2 "3=3 n5 = 8/6 = 4/3 corresponding to the loop n- 5 > n - n n Take k = 4P cI = 3. By the methods of Section 2 we obtain: t = 0 t2 4/3 t3 = -3 t = -11/3 t1 = -3 t 6= -22/3 t7-7 = -13/3 Hene 0 1 2 t= 0 = 2 = 1 3 0 1 2 t 2 =- t2 2 0 1 2 t3 -3 t3= -1 t3 0 +0 5 t5 -4 t2 3 =5 t 5 = 0 1 2 t -41 2

5. Conclusion A maximal computation rate periodic schedule of node initiation times has been presented for computation graphs G with U = W = T = 1 for all branches of G. This schedule is of two forms, according as G computes asynchronously or synchronously. Furthermore, an analysis has been given of the so-called free running execution of G and this is found to yield the maximum computation rate of G. An open problem is a similar analysis for non-terminating computation graphs G in which U, W, and T are not restricted to be 1. Another area for future research is as follows: Suppose all nodes of G are capable of performing the same functions e.g. each node is a computer. Then under the various periodic schedules of this paper, what is the minimum number v of "computers" necessary to perform the computation without decreasing the computation rate of G? A more general problem is the following: Suppose the node set of G to be partitioned into sets N1,N2,...,N where all of the nodes n in N. are capable of performing the same functions. Again, what is the minimum number of "computers" necessary to perform the computation under the various periodic schedules without decreasing the computation rate of G? A related problem is the following: Suppose k < v "computers" are available. How can we choose the minimum value of y such that the computation may be effected with period y and such that the k "computers" are utilized. 52

References [1] Cuninghame-Green, Ro A., Describing industrial processes with interference and approximating their steady-state behavior. Operational Research Quarterly, vol. 13, No. 1. [2] Ford, L. R., Jr. -and Fulkerson, D. P,'Flows -'in'-Netitorks (Princeton: Princeton: University Press., 1962) pp. 131-133. [3] Karp, R, M. and Miller, R, E., Properties of a model for parallel computations: Determinacy, termination, queueing. IBM Research Paper RC-1285. [4] Lawler, E. L., Extremal cycles in weighted linear graphs. Unpublished paper, Systems Engineering Laboratory of The University of Michigan, Ann Arbor. [5] Reiter, R., A study of a model for parallel computation. The University of Michigan's Information Systems Laboratory Technical Report ISL-65-4. 53

UNIVERSITY OF MICHIGAN IIIII IlI UII111 11111II 3 9015 03695 4512