THE UNIVERSITY OF MICHIGAN
COLLEGE OF ENGINEERING
Department of Aerospace Engineering

ON THE MODELLING OF SYSTEMS FOR IDENTIFICATION
Part II. Time-Varying Systems

William L. Root

ORA Project 011306

supported by:
UNITED STATES AIR FORCE
AIR FORCE OFFICE OF SCIENTIFIC RESEARCH
AIR FORCE SYSTEMS COMMAND
GRANT NO. 72-2328
WASHINGTON, D. C.

administered through:
OFFICE OF RESEARCH ADMINISTRATION
ANN ARBOR

May 1973

ABSTRACT

Certain Banach spaces, denoted $\mathcal{L}_T^{(p)}$, of equivalence classes of functions of a real variable are introduced and investigated. These are to be used as system input and output spaces for systems operating for all time. Some basic properties of causal systems with finite memory are established. The concept of forming time-interval truncations of a time-varying system is formalized and investigated. Trajectories of such truncations are studied. It is proved that under certain reasonable conditions, the trajectories of a class of systems are generated by a strongly continuous semigroup of linear operators. It is also shown that $\varepsilon$-representations of the truncations can be generated by an induced semigroup. Thus, the evolution in time of classes of, in general, nonlinear, time-varying systems is described in the framework of a linear dynamical theory.

INTRODUCTION TO PART 2

A "system" as defined in Part 1 is simply an input space, an output space, and a mapping carrying inputs into outputs. In Part 1 some abstract structure and representation theory is established for classes of systems for which the system mappings are bounded and continuous, and for which certain conditions are satisfied by the input and output spaces and by the class of mappings itself. In Part 2 the interest is in systems and classes of systems where the inputs and outputs are functions of time. Again there is no restriction to linear or to time-invariant systems. The chief emphasis is on causal systems, and indeed on causal systems with bounded memory. The emphasis on causal systems needs no justification, but a question might be raised as to why one should consider systems with bounded memory. The primary answer is that the bounded memory condition turns out to fit very conveniently into the mathematical structure used, and since almost any system of interest has a decaying memory, it can be approximated as well as desired by a system with finite memory. The goal in this work is to set up approximate models or representations of classes of systems to be used in identification, so approximation is permissible. It is certainly realistic to stipulate that the observation periods for both inputs and outputs be of finite duration, and this requirement influences the mathematical structure chosen.

The first Section after this introduction is devoted to setting up and investigating certain function spaces which are appropriate for modelling input and output spaces for systems; the second to establishing basic facts about causal and bounded memory systems. In the third Section the concept of trajectories of time-limited truncations of systems is developed. The time-limited truncations are, roughly speaking, observable portions of a system which is operating for all time. In the fourth Section families of trajectories associated with a class of systems are considered. Under certain circumstances, these trajectories can be generated by semigroups of linear operators. The results of Part 1 are not used explicitly in Part 2 until near the end of the fourth Section. However, the work of Part 1 influences what is done in Part 2 throughout. Some of the material of Sections I and III appeared, with only partial proofs, in the conference paper [1].

I. SOME FUNCTION SPACES FOR INPUTS AND OUTPUTS

We want to be able to treat systems for which the inputs and outputs are functions of time, real- or vector-valued with finitely many components, and extending for infinite time. We do not want to require that the inputs and outputs always die out in some sense in the infinite future or infinite past. Hence, function spaces such as the $L_p$ spaces with $1 \le p < \infty$, which have the property that their constituent functions all get arbitrarily small (in some sense) as $t \to \pm\infty$, will usually not be suitable for modelling admissible collections of inputs or outputs. The spaces of bounded or essentially bounded functions are satisfactory on this score, and sometimes we shall use the space of bounded continuous functions on $R^1$ with the sup norm. However, there are certain non-standard function spaces that are suitable and especially convenient, and which will be used customarily. These are spaces of functions that are uniformly local-time $L_2$, provided with one of a family of norms to be given in the definition below. These spaces are only Banach spaces, not Hilbert spaces, but the local $L_2$ character is advantageous. Since it is quite as easy to define such spaces more generally using a local-time $L_p$ property, $1 \le p < \infty$, we do so, even though the local $L_2$ spaces are the ones chiefly desired.

Let $y$ be either a real-valued function of a real variable, or a vector-valued function that has finitely many real-valued components. In the second case, $|y(t)|$ will denote the Euclidean norm of the vector

$y(t)$. Define the operator $P_t$ by

$$[P_t y](s) = \begin{cases} y(s), & s \le t \\ 0, & s > t. \end{cases} \tag{1}$$

As usual, let $L_p(A)$ denote the Lebesgue space of $p$-integrable $N$-vector-valued functions on the measurable set $A \subset R^1$. It will always be assumed that $1 \le p < \infty$. The $L_p$ norm is written $\|\cdot\|_p$. Let $\mathcal{L}_0^{(p)}$ be the space of all functions $y$ that satisfy the following condition: $y \in \mathcal{L}_0^{(p)}$ iff for any $T$, $0 < T < \infty$, there is a positive number $K = K(T,y)$ such that $\|(P_{t+T} - P_t)y\|_p \le K$ for all $t \in R^1$. Obviously $\mathcal{L}_0^{(p)}$ is a linear space over the real numbers under the usual addition and scalar multiplication of functions. It is made into a normed linear space by the assignment of a norm:

$$\|y\|_T^{(p)} = \sup_t \|(P_{t+T} - P_t)y\|_p \tag{2}$$

where $T$ is an arbitrary fixed number, $0 < T < \infty$. We call the resulting normed linear space $\mathcal{L}_T^{(p)}$.

Proposition 2.1 $\mathcal{L}_T^{(p)}$ is a Banach space if its elements are interpreted to be the equivalence classes of functions in $\mathcal{L}_0^{(p)}$ that are equal a.e. Lebesgue.

Proof: It is immediately verifiable that $\mathcal{L}_T^{(p)}$ is indeed a normed linear space, so it remains only to show it is complete. Consider a particular set of half-open, half-closed intervals $(kT, (k+1)T]$ where $k$ is any integer, $-\infty < k < \infty$. Since for any $t$, $(t, t+T] \subset (kT, (k+1)T] \cup ((k+1)T, (k+2)T]$ for some $k$, we have

$$\sup_t \|(P_{t+T} - P_t)y\|_p \le \sup_k \|(P_{(k+2)T} - P_{kT})y\|_p \le 2\sup_k \|(P_{(k+1)T} - P_{kT})y\|_p$$

or,

$$\|y\|_T^{(p)} \le 2\sup_k \|(P_{(k+1)T} - P_{kT})y\|_p \le 2\|y\|_T^{(p)}.$$

Now let $\{y_n\}$ be a sequence such that for any $\varepsilon > 0$, $\|y_n - y_m\|_T^{(p)} \le \varepsilon$ whenever $m, n \ge n_0(\varepsilon)$. Then, for any integer $k$,

$$\|(P_{(k+1)T} - P_{kT})(y_n - y_m)\|_p \le \varepsilon \quad \text{for } m, n \ge n_0(\varepsilon).$$

Since $L_p(kT, (k+1)T]$ is complete, there is a $y^{(k)} \in L_p(kT, (k+1)T]$ which is the limit of the sequence $\{(P_{(k+1)T} - P_{kT})y_n\}$ regarded as a sequence of functions on $(kT, (k+1)T]$. Let $y$ be the equivalence class of functions on $R^1$ that are equal a.e. on each interval $(kT, (k+1)T]$ to any function representing $y^{(k)}$. Then $y \in \mathcal{L}_0^{(p)}$, and

$$\|y - y_n\|_T^{(p)} \le 2\sup_k \|(P_{(k+1)T} - P_{kT})(y - y_n)\|_p.$$

Since $\|(P_{(k+1)T} - P_{kT})(y - y_n)\|_p \le 2\varepsilon$ for $n \ge n_0(\varepsilon)$ and all $k$, $\|y - y_n\|_T^{(p)} \le 4\varepsilon$ for $n \ge n_0(\varepsilon)$. Thus $y$ is the limit of $y_n$. ∎

Henceforth, unless there is some particular reason to be precise, we shall refer to the "functions in $\mathcal{L}_T^{(p)}$" or the "functions in $L_p(A)$" instead of to the elements, which are properly equivalence classes of functions. Some elementary properties of the spaces $\mathcal{L}_T^{(p)}$ are noted in the next proposition and succeeding remarks.

Proposition 2.2 Let $N$, the number of (real) components of the vector-valued functions under consideration, be fixed. Let $T_1$, $T_2$ be any

positive numbers. Then,
(a) $\mathcal{L}_{T_1}^{(p)}$ and $\mathcal{L}_{T_2}^{(p)}$ are comprised of the same elements, and the norms on these two spaces are equivalent.
(b) If $p > q$, then any function belonging to $\mathcal{L}_T^{(p)}$ belongs also to $\mathcal{L}_T^{(q)}$. Also, convergence in $\mathcal{L}_T^{(p)}$ implies convergence in $\mathcal{L}_T^{(q)}$.

Proof: The elements of $\mathcal{L}_{T_1}^{(p)}$ and $\mathcal{L}_{T_2}^{(p)}$ are the elements of $\mathcal{L}_0^{(p)}$ (for the fixed $N$ in question). Suppose $T_1 < T_2$ and $m$ is an integer such that $mT_1 > T_2$. Then for $y \in \mathcal{L}_0^{(p)}$,

$$\|y\|_{T_1}^{(p)} \le \|y\|_{T_2}^{(p)} \le m\|y\|_{T_1}^{(p)}.$$

Part (b) follows from Hölder's inequality. In fact,

$$\|y\|_T^{(q)} = \sup_t \|(P_{t+T} - P_t)y\|_q \le \sup_t \left\{ \|(P_{t+T} - P_t)y\|_p \, T^{(p-q)/pq} \right\} = T^{(p-q)/pq}\,\|y\|_T^{(p)}. \; ∎$$

Let $\mathcal{N}(A)$ denote the set of functions in $\mathcal{L}_0^{(p)}$ which vanish a.e. outside the interval $A = (a,b)$, where $-\infty \le a < b \le +\infty$. Then clearly $\mathcal{N}(A)$ is a closed linear subspace of $\mathcal{L}_T^{(p)}$, for any $0 < T < \infty$. If $a$ and $b$ are finite, $\mathcal{N}(A)$ may be identified with either $L_p(A)$ or with a closed linear subspace of $L_p(R^1)$, and the 1:1 correspondence in either case is a linear homeomorphism. If $T = b - a$, the correspondence is isometric.

We denote the operations of translation by $c$ to the left or right, respectively, by $L_c$ and $R_c$; i.e., $(L_c u)(t) = u(t+c)$ and $(R_c u)(t) = u(t-c)$. $L_c$ and $R_c$ are linear operations on $\mathcal{L}_0^{(p)}$ which preserve norm in any $\mathcal{L}_T^{(p)}$, and $L_c = R_{-c} = R_c^{-1}$. The following identities hold for any function $u$ defined

on $R^1$ and any real numbers $a$, $b$, $c$:

$$L_c(P_b - P_a)u = (P_{b-c} - P_{a-c})L_c u, \qquad R_c(P_{b-c} - P_{a-c})u = (P_b - P_a)R_c u. \tag{3}$$

It will sometimes be convenient, in order to avoid an awkward locution, to apply $L_c$ or $R_c$ to elements of an $L_p(A)$, for some finite interval $A$. When this is done it will always be intended that $L_p(A)$ be identified with $\mathcal{N}(A)$, as above, so that the operation is defined. Care must be taken, of course, to ensure that this operation is meaningful.

Compactness of input spaces is required for much of the structure described in Part I. Because that structure is to be applied to what follows, compactness in some form will again often appear as a requirement, but usually not as the condition that an input space be a compact subset of an $\mathcal{L}_T^{(p)}$ space. A weaker condition is appropriate, one which says that ordinary compactness holds only locally in time (this is not the same as local compactness). This notion is formalized, and it is proved below that there is an abundance of subsets of $\mathcal{L}_T^{(p)}$ which have this property along with certain other desirable properties.

A subset $\mathcal{U}$ of $\mathcal{L}_0^{(p)}$ is T-compact if $(P_{t+T} - P_t)\mathcal{U}$ regarded as a subset of $L_p(t, t+T]$ is compact for every $t$. Relative T-compactness is defined correspondingly.

Proposition 2.3 If $\mathcal{U}$ is a compact subset of $\mathcal{L}_T^{(p)}$ it is T-compact, but the converse is not necessarily true. Also, if $\mathcal{U}$ is T-compact for a positive number $T$, it is $T_1$-compact for any other positive number $T_1$.

Proof: Obvious. ∎

Another property of input sets that will be essential is the following: a subset $\mathcal{U}$ of $\mathcal{L}_0^{(p)}$ will be said to have the projection property, denoted (P), if $u \in \mathcal{U}$ implies that $P_t u$, $(I - P_t)u$ and $(P_t - P_s)u$ belong to $\mathcal{U}$ for any real numbers $s$ and $t$. Of course, if $s = t$, $(P_t - P_s)u = 0$, so in particular the zero function must belong to $\mathcal{U}$.

Proposition 2.4 Let $\mathcal{U}_0$ be any compact subset of $L_p(0,T]$. Then there exists a set $\mathcal{U} \subset \mathcal{L}_0^{(p)}$ with the following properties:
1) $(P_T - P_0)\mathcal{U} \supset \mathcal{U}_0$ (using the identification explained previously between $L_p(0,T]$ and $\mathcal{N}(0,T]$)
2) $\mathcal{U}$ has property (P)
3) $\mathcal{U}$ is T-compact
4) $\mathcal{U}$ is invariant under time shift; thus if $u \in (P_{t+T} - P_t)\mathcal{U}$ then $L_t u \in (P_T - P_0)\mathcal{U}$, and vice versa.

Proof: We first enlarge $\mathcal{U}_0$ so as to have a set that is closed under the projections $(P_t - P_s)$, $0 \le s, t \le T$. Let $\mathcal{U}_0'$ be the subset of $L_p(0,T]$ consisting of all functions $u'$ satisfying $u' = (P_t - P_s)u$, $u \in \mathcal{U}_0$, $0 \le s, t \le T$. $\mathcal{U}_0'$ is compact in $L_p(0,T]$. In fact, let $\{u_n'\}$ be an infinite sequence of elements of $\mathcal{U}_0'$, $u_n' = (P_{t_n} - P_{s_n})u_n$. Form a subsequence of the positive integers, $\{n_i\}$, such that $t_{n_i} \to t$, $s_{n_i} \to s$, and $\|u_{n_i} - u\|_p \to 0$ as $i \to \infty$. Put $u' = (P_t - P_s)u$. Then

$$\|u_{n_i}' - u'\|_p \le \|(P_{t_{n_i}} - P_{s_{n_i}})(u_{n_i} - u)\|_p + \|[(P_{t_{n_i}} - P_{s_{n_i}}) - (P_t - P_s)]u\|_p$$

which is arbitrarily small for sufficiently large $i$. $\mathcal{U}_0'$ is closed under the projections $(P_t - P_s)$, since $(P_t - P_s)u'$ can always be written in the form $(P_{t'} - P_{s'})u$, $u \in \mathcal{U}_0$.

Now construct a subset $\mathcal{U}_1$ of $\mathcal{L}_0^{(p)}$ as follows: let the elements $u \in \mathcal{U}_1$ be defined by

$$u(t) = \begin{cases} u_0(t), & 0 < t \le T \\ u_{-1}(t+T), & -T < t \le 0 \\ u_1(t-T), & T < t \le 2T \\ \;\;\vdots \\ u_k(t-kT), & kT < t \le (k+1)T \end{cases}$$

where the $u_k$ are any sequence of elements from $\mathcal{U}_0'$. Put

$$\mathcal{U} = \bigcup_{0 \le \tau \le T} L_\tau\, \mathcal{U}_1.$$

Obviously $\mathcal{U} \supset \mathcal{U}_0$ and $\mathcal{U} \subset \mathcal{L}_0^{(p)}$. $\mathcal{U}$ has property (P) since $\mathcal{U}_1$ does and translation does not affect the property. $\mathcal{U}$ is invariant under time shift. By the way it is constructed, $\mathcal{U}_1$ is invariant under shifts by integer multiples of $T$, and since any shift can

be decomposed into a shift by $NT$ for some integer $N$, and a shift $L_\tau$, $0 \le \tau < T$, $\mathcal{U}$ is invariant for any shift.

$\mathcal{U}$ is T-compact. It is sufficient to prove that $(P_T - P_0)\mathcal{U}$ is compact. Let $\{z_n\}$ be an infinite sequence contained in $(P_T - P_0)\mathcal{U}$. By the construction given, each $z_n$ must be of the form

$$z_n = (I - P_0)L_{\tau_n}u_n + P_T R_{T-\tau_n}\tilde{u}_n$$

where $u_n, \tilde{u}_n \in \mathcal{U}_0'$, and $0 \le \tau_n \le T$. Let $n_i$ be a subsequence such that $\|u_{n_i} - u\|_p \to 0$, $\|\tilde{u}_{n_i} - \tilde{u}\|_p \to 0$ and $\tau_{n_i} \to \tau$, where $u$ and $\tilde{u}$ belong to $\mathcal{U}_0'$ and $0 \le \tau \le T$. There is such a subsequence because of the compactness of $\mathcal{U}_0'$ (and of the interval $[0,T]$). Then

$$\lim_{i\to\infty} z_{n_i} = (I - P_0)L_\tau u + P_T R_{T-\tau}\tilde{u} = L_\tau(I - P_\tau)u + R_{T-\tau}P_\tau\tilde{u}.$$

In fact, since $\|L_a u - u\|_p \to 0$ as $a \to 0$, it follows that $L_{\tau_{n_i}}u_{n_i} \to L_\tau u$ and $R_{T-\tau_{n_i}}\tilde{u}_{n_i} \to R_{T-\tau}\tilde{u}$ by the triangle inequality. The limit element belongs to $(P_T - P_0)\mathcal{U}$ by the definition of $\mathcal{U}$. ∎

T-compactness cannot be replaced by compactness here. In fact it is trivially verifiable that if $\mathcal{U}_0$ has even two distinct elements, then any $\mathcal{U}$ satisfying 1) and 4) in the theorem is not compact, whether it satisfies 2) or not. Indeed, $\mathcal{U}$ need only be invariant under shifts by integer multiples of $T$ to make compactness impossible: with no loss of generality let $\mathcal{U}_0$ consist of the functions $f_0(t) = 0$, $0 < t \le T$, and $f_1(t) = 1$, $0 < t \le T$. Then it is sufficient to observe that the sequence

$\{u_n\}$, $u_n$ defined by $u_n(t) = 1$, $nT < t \le (n+1)T$, $u_n(t) = 0$ for all other $t \in R^1$, has no limit point.

The construction given in the proof of Proposition 2.4 depends on $T$ and happens to give a class $\mathcal{U}$ that includes all the periodic functions of period $T$ generated by $\mathcal{U}_0$. However, we remark that if $\mathcal{U}$ is a subset of $\mathcal{L}_0^{(p)}$ which is $T_1$-compact, shift-invariant and has property (P), then it is $T_2$-compact and, of course, still shift-invariant with property (P). Thus, in modelling a system, an input space $\mathcal{U}$ can be chosen with the desirable properties listed without any consideration being given to the value of $T$ to be used.

Clearly the bounded continuous functions on $R^1$ (denoted $\mathcal{B}_c$*) are contained in all $\mathcal{L}_0^{(p)}$, and convergence in the uniform norm of $\mathcal{B}_c$ implies convergence in $\mathcal{L}_T^{(p)}$ for any $p$ and any $T$. It is easy to see also that the functions of $\mathcal{B}_c$ are not dense in any $\mathcal{L}_T^{(p)}$. We give a characterization of the closed subspace of $\mathcal{L}_T^{(p)}$ which is generated by $\mathcal{B}_c$.

Proposition 2.5 For any $y \in \mathcal{L}_0^{(p)}$, let $y^{(k)} \in L_p[0,T]$ be defined by: $y^{(k)}(t) = y(t + kT)$, $0 \le t \le T$, for all integers $k$. Then, a necessary and sufficient condition that $y$ can be approximated in $\mathcal{L}_T^{(p)}$ norm by functions from $\mathcal{B}_c$ is that the functions $|y^{(k)}(t)|^p$ be uniformly integrable.

Proof: (Sufficiency). Let $B_k(b) = \{t \in [0,T) : |y^{(k)}(t)| > b\}$. The condition that the $|y^{(k)}|^p$ are uniformly integrable is that, given any $\eta > 0$, there exists $b > 0$ such that

$$\int_{B_k(b)} |y^{(k)}(t)|^p\,dt < \eta \quad \text{for all integers } k.$$

* To be consistent, what is here denoted $\mathcal{B}_c$ should be $\mathcal{C}(R^1, R^N)$, but it seems less confusing to introduce a new symbol for this special case.

Let E > 0 be given, and put 1 = ). Let Yb (t) = y (t) whenever y(k) (t)| < b P and equal to zero otherwise. For each k, there is a function f) defined on [0,T] which is continuous on [0,T], and satisfies If(k)(t) bl/P and Ib(k) f(k)llp Then (k) (k); _ i y(k) (k)1 + Ilybk) f(k)11 ly p - p s [ ( y(k) (t) P dtl/p + lP+ i k when E < 2. Now if f E is the function formed by piecing together the f(k) - f = sup (Pt - P) ( - f)lJp T t t+T t p < sup II(P -P )(y-f)Jj kP (k+2)T kT p < 2. (Necessity). Suppose that Ily - f l ( -P 0 as n - oo, where the f E. It follows that Iy(k) f (k) - 0 uniformly in k. Suppose further that the n p [y(k)]P are not uniformly integrable; we shall obtain a contradiction. Then, for some E > 0, E < -, and for every real number b > 0, no matter how large, there is an integer k' = k (b,E) such that S Iy(kl) (t)I dt>3E Bk,(b) For the same E, let n be a fixed integer so large that y(k) - f (k)|| < n p for all k. We have, jy(k) (k) (k) (t) f (k) () Pdt] Bk (b) n 12

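for any $b$, for all $k$. Put $K_n = \sup_t |f_n(t)|$, and $b = 2K_n$. Then, for $k = k'(b,\varepsilon)$,

$$|y^{(k')}(t) - f_n^{(k')}(t)| \ge |y^{(k')}(t)| - |f_n^{(k')}(t)| \ge |y^{(k')}(t)| - K_n \quad \text{for } t \in B_{k'}(b).$$

Hence,

$$\|y^{(k')} - f_n^{(k')}\|_p \ge \left[\int_{B_{k'}(b)} \left(|y^{(k')}(t)| - K_n\right)^p dt\right]^{1/p} \ge \left[\int_{B_{k'}(b)} |y^{(k')}(t)|^p\,dt\right]^{1/p} - \left[\int_{B_{k'}(b)} K_n^p\,dt\right]^{1/p} > (3\varepsilon)^{1/p} - K_n\left[\mu(B_{k'}(b))\right]^{1/p}$$

where $\mu(B)$ is the Lebesgue measure of $B$. But since

$$\varepsilon > \|y^{(k')} - f_n^{(k')}\|_p \ge \left[\int_{B_{k'}(b)} (2K_n - K_n)^p\,dt\right]^{1/p} = K_n\left[\mu(B_{k'}(b))\right]^{1/p}$$

and since $(3\varepsilon)^{1/p} \ge 3\varepsilon$ (because $3\varepsilon < 1$), we have

$$\|y^{(k')} - f_n^{(k')}\|_p > 3\varepsilon - \varepsilon = 2\varepsilon$$

which yields a contradiction. ∎

Clearly, if $y \in \mathcal{L}_0^{(p)}$ satisfies the condition of Proposition 2.5 for $T > 0$ it also satisfies the condition for any other $T' > 0$. Since the value of $T$ is immaterial, we can denote the class of all $y \in \mathcal{L}_0^{(p)}$ which satisfy the condition by $\mathcal{L}_{0u}^{(p)}$. The functions belonging to $\mathcal{L}_{0u}^{(p)}$, or more properly the usual equivalence classes of such functions, belong to $\mathcal{L}_T^{(p)}$ for any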

$T > 0$, and as a subset of $\mathcal{L}_T^{(p)}$ this class is denoted $\mathcal{L}_{Tu}^{(p)}$. An immediate corollary of Proposition 2.5 is:

Proposition 2.6 $\mathcal{L}_{Tu}^{(p)}$ is a closed linear subspace of $\mathcal{L}_T^{(p)}$ and is the smallest closed linear subspace containing $\mathcal{B}_c$.

Proof: Obvious. ∎

With reference to Proposition 2.4 it may be noted that since the set $\mathcal{U}_0$ is a compact subset of $L_p[0,T]$, the functions belonging to $\mathcal{U}_0$ are uniformly integrable in the sense of Proposition 2.5. Then the construction given for $\mathcal{U}$ guarantees that the functions belonging to $\mathcal{U}$ satisfy the hypothesis of Proposition 2.5. Hence $\mathcal{U} \subset \mathcal{L}_{0u}^{(p)}$.
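Two elementary facts of this Section lend themselves to a quick numerical check: the translation identity (3) and the norm equivalence of Proposition 2.2(a). The following is only a minimal sketch under stated assumptions. Signals are ordinary Python functions, the norm (2) is approximated by a Riemann sum over a finite sampled record (so the supremum runs over that record only, not over all of $R^1$), and every name in it (`window`, `shift_left`, `identity_gap`, `local_norm`) is hypothetical and introduced purely for illustration.

```python
import math
from typing import Callable, List, Sequence

Signal = Callable[[float], float]


def window(a: float, b: float, y: Signal) -> Signal:
    """(P_b - P_a) y: keep y on the half-open interval (a, b], zero elsewhere."""
    return lambda s: y(s) if a < s <= b else 0.0


def shift_left(c: float, y: Signal) -> Signal:
    """(L_c y)(t) = y(t + c)."""
    return lambda t: y(t + c)


def identity_gap(u: Signal, a: float, b: float, c: float,
                 points: Sequence[float]) -> float:
    """Largest pointwise gap in L_c (P_b - P_a) u = (P_{b-c} - P_{a-c}) L_c u."""
    lhs = shift_left(c, window(a, b, u))
    rhs = window(a - c, b - c, shift_left(c, u))
    return max(abs(lhs(s) - rhs(s)) for s in points)


def local_norm(samples: List[float], h: float, T: float, p: float = 2.0) -> float:
    """Riemann-sum estimate of sup_t ||(P_{t+T} - P_t) y||_p on a finite record.

    Assumption: samples[i] = y(i * h) and T is (close to) a multiple of h.
    """
    n = round(T / h)  # number of samples in one window of length T
    powers = [abs(v) ** p for v in samples]
    best = max(sum(powers[i:i + n]) for i in range(len(powers) - n + 1))
    return (h * best) ** (1.0 / p)


# A bounded signal that does not decay: it is in no L_p(R^1), yet its
# local norm is finite.
u = lambda t: 2.0 + math.sin(t)

# Dyadic sample points and parameters keep the interval tests exact in floats.
points = [k / 4 for k in range(-40, 41)]
gap = identity_gap(u, a=-1.0, b=2.5, c=0.75, points=points)

h = 0.01
record = [u(k * h) for k in range(2000)]
norm_T1 = local_norm(record, h, T=1.0)   # T1 = 1
norm_T2 = local_norm(record, h, T=2.5)   # T2 = 2.5, so m = 3 gives m*T1 > T2
```

With $T_1 = 1$, $T_2 = 2.5$ and $m = 3$, the two estimates should satisfy $\|y\|_{T_1} \le \|y\|_{T_2} \le m\|y\|_{T_1}$, and `gap` should vanish because both sides of (3) select the same samples of `u`.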

II. PRELIMINARIES ON CAUSAL AND BOUNDED MEMORY TRANSFORMATIONS

Let $\mathcal{U}$ be a metric space whose elements are either functions of a real variable $t$ (time) or are equivalence classes of such functions that are equal a.e. Lebesgue. Correspondingly, let $\mathcal{Y}$ be a Banach space of functions or equivalence classes of functions of $t$. If both $\mathcal{U}$ and $\mathcal{Y}$ have property (P), the properties of causality and bounded memory for a mapping $F$ from $\mathcal{U}$ into $\mathcal{Y}$ can be defined, and in the usual way: $F$ is causal if $P_t F(u) = P_t F P_t(u)$ for all $t$ and all $u \in \mathcal{U}$; $F$ has bounded memory (d) if $(I - P_t)F(u) = (I - P_t)F(I - P_{t-d})(u)$ for all $t$ and all $u \in \mathcal{U}$. Note that the same symbol, $P_t$, is being used to denote the linear projection on the past in both $\mathcal{U}$ and $\mathcal{Y}$, but this should cause no confusion.

Proposition 2.7 If $F$ is a mapping from $\mathcal{U}$ into $\mathcal{Y}$ that is causal and has bounded memory (d), then for every $T > 0$,

$$(P_{t+T} - P_t)F(u) = (P_{t+T} - P_t)F(P_{t+T} - P_{t-d})(u) \tag{4}$$

for all $t$ and all $u \in \mathcal{U}$. Conversely, if equation (4) is satisfied for some $T > 0$ and all $t$ and all $u \in \mathcal{U}$, then $F$ is causal and has bounded memory (d).

Proof: The assertions appear to be obvious. However, a proof is given in Appendix A, where the algebraic properties of the $P_t$ are isolated and are used precisely. ∎
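The definitions of causality and bounded memory, and equation (4), can be exercised on a toy example. The sketch below is an illustration only, under assumptions that are not in the text: time is taken to be discrete (integer-valued), the checks run over a finite grid rather than all $t$, and the systems `sliding_energy` and `anticipate`, like all helper names, are hypothetical.

```python
import math
from typing import Callable, Iterable

Signal = Callable[[int], float]
System = Callable[[Signal], Signal]


def P(t: int, u: Signal) -> Signal:
    """Projection on the past: keep u(s) for s <= t."""
    return lambda s: u(s) if s <= t else 0.0


def tail(t: int, u: Signal) -> Signal:
    """(I - P_t) u: keep u(s) for s > t."""
    return lambda s: u(s) if s > t else 0.0


def window(a: int, b: int, u: Signal) -> Signal:
    """(P_b - P_a) u: keep u on (a, b]."""
    return lambda s: u(s) if a < s <= b else 0.0


def same(x: Signal, y: Signal, grid: Iterable[int]) -> bool:
    return all(abs(x(s) - y(s)) < 1e-12 for s in grid)


def is_causal(F: System, u: Signal, times, grid) -> bool:
    """Check P_t F(u) = P_t F P_t(u) for each t in `times`."""
    return all(same(P(t, F(u)), P(t, F(P(t, u))), grid) for t in times)


def has_bounded_memory(F: System, u: Signal, d: int, times, grid) -> bool:
    """Check (I - P_t) F(u) = (I - P_t) F (I - P_{t-d})(u)."""
    return all(same(tail(t, F(u)), tail(t, F(tail(t - d, u))), grid)
               for t in times)


def satisfies_eq4(F: System, u: Signal, d: int, T: int, times, grid) -> bool:
    """Check equation (4) on the finite grid."""
    return all(same(window(t, t + T, F(u)),
                    window(t, t + T, F(window(t - d, t + T, u))), grid)
               for t in times)


def sliding_energy(taps: int) -> System:
    """Nonlinear system: [F u](s) = sum of u(s - j)^2 for j = 0..taps-1."""
    return lambda u: (lambda s: sum(u(s - j) ** 2 for j in range(taps)))


def anticipate(u: Signal) -> Signal:
    """A non-causal system: the output looks one step into the future."""
    return lambda s: u(s + 1)


u = lambda s: 1.5 + math.sin(0.7 * s)   # never zero, so truncation is visible
grid, times = range(-10, 11), range(-5, 6)

F3, F5 = sliding_energy(3), sliding_energy(5)
causal_ok = is_causal(F3, u, times, grid)
memory_ok = has_bounded_memory(F3, u, 3, times, grid)
eq4_ok = satisfies_eq4(F3, u, 3, 4, times, grid)
memory_too_long = has_bounded_memory(F5, u, 3, times, grid)
anticipating_is_causal = is_causal(anticipate, u, times, grid)
```

The three-tap system passes all three checks with $d = 3$, the five-tap system fails the bounded-memory check for the same $d$, and the anticipating system fails the causality check, which is the behaviour the definitions lead one to expect.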

In the class of bounded continuous mappings $\mathcal{C} = \mathcal{C}(\mathcal{U}, \mathcal{Y})$, let $\mathcal{C}^0$ denote the subclass of causal mappings, and let $\mathcal{C}^0_d$ denote the subclass of causal mappings with bounded memory (d). Henceforth we only consider metric function spaces $\mathcal{U}$ that have property (P), and $\mathcal{Y}$ can always be chosen to be either one of the $\mathcal{L}_T^{(p)}$ or $\mathcal{B}$, both of which have property (P). Hence $\mathcal{C}^0$ and $\mathcal{C}^0_d$ are defined. In some instances, however, where $\mathcal{B}$ could be used for $\mathcal{Y}$ it may be convenient to take $\mathcal{Y}$ to be a subspace of $\mathcal{B}$ that does not possess property (P), e.g., the subspace of bounded continuous functions $\mathcal{B}_c$. This is all right, because in this situation, where the elements of $\mathcal{Y}$ are functions (not equivalence classes of functions), the definition of causality may be replaced by: $F$ is causal if, for all $t$ and all $u$,

$$[Fu](s) = [FP_t u](s) \quad \text{for all } s \le t.$$

An equivalent condition is the apparently weaker statement:

$$[Fu](s) = [FP_s u](s) \quad \text{for all } s \text{ and all } u.$$

In fact, suppose the second condition holds. Since $\mathcal{U}$ has property (P), $P_t u \in \mathcal{U}$ for all $u \in \mathcal{U}$. Take $t \ge s$; then

$$[F(P_t u)](s) = [FP_s(P_t u)](s) = [FP_s u](s)$$

and $[Fu](s) = [FP_s u](s)$. Hence, $[Fu](s) = [FP_t u](s)$ for all $s \le t$. Analogous statements hold for the case of bounded memory.

Proposition 2.8 $\mathcal{C}^0$ and $\mathcal{C}^0_d$ are closed linear subspaces of $\mathcal{C}$.

Proof: $\mathcal{C}^0_d$ is obviously linear; we need to prove it is closed. First, let

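us note the following. If $y \in \mathcal{L}_T^{(p)}$, then by definition

$$\|y\| = \sup_t \|(P_{t+T} - P_t)y\|_p.$$

On the other hand, if $y \in \mathcal{B}$, then

$$\|y\| = \sup_t |y(t)| = \sup_t \|(P_{t+T} - P_t)y\| \quad \text{where} \quad \|(P_{t+T} - P_t)y\| = \sup_{t < s \le t+T} |y(s)|$$

is the norm in $\mathcal{B}$ of the truncation of $y$ to $(t, t+T]$. Thus in either case, $\|y\| = \sup_t \|(P_{t+T} - P_t)y\|$, where the norm on the right side is appropriately interpreted.

Now suppose that $F_n \in \mathcal{C}^0_d$ and $\lim_n F_n = F$ where $F \notin \mathcal{C}^0_d$. Put $\Delta_t = P_{t+T} - P_t$ and $\Delta_t' = P_{t+T} - P_{t-d}$. Then, since $\Delta_t F_n u = \Delta_t F_n \Delta_t' u$ by Proposition 2.7,

$$\|F_n - F\| = \sup_u \sup_t \|\Delta_t(F_n u - Fu)\| \ge \sup_u \sup_t \Big|\, \|\Delta_t F_n \Delta_t' u - \Delta_t F \Delta_t' u\| - \|\Delta_t F \Delta_t' u - \Delta_t F u\| \,\Big|.$$

For some $t_0$ and $u_0$, $\|\Delta_{t_0} F u_0 - \Delta_{t_0} F \Delta_{t_0}' u_0\| = \alpha$, $\alpha > 0$, whereas for sufficiently large $n$,

$$\|\Delta_{t_0} F_n(\Delta_{t_0}' u_0) - \Delta_{t_0} F(\Delta_{t_0}' u_0)\| \le \|F_n(\Delta_{t_0}' u_0) - F(\Delta_{t_0}' u_0)\| < \alpha/2, \quad n > n_1.$$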

Hence $\|F_n - F\| > \alpha/2$, $n > n_1$, which is a contradiction. The proof for $\mathcal{C}^0$ is similar. ∎

If $F$ is any mapping from $\mathcal{U}$ into $\mathcal{B}$, then one can reasonably define the causal part of $F$, denoted $F^0$, and the causal and bounded-memory (d) part of $F$, denoted $F^0_d$, by

$$[F^0 u](t) = [FP_t u](t), \quad \text{for all } t, \text{ all } u \in \mathcal{U}$$
$$[F^0_d u](t) = [F(P_t - P_{t-d})u](t), \quad \text{for all } t, \text{ all } u \in \mathcal{U}.$$

For the rest of this Section, we assume $\mathcal{Y} = \mathcal{B}$.

Proposition 2.9 Let $\mathcal{U}$ have the property that for any $u, u' \in \mathcal{U}$ and any $s, t$, $d[(P_t - P_s)u, (P_t - P_s)u'] \le d[u, u']$. Let $F \in \mathcal{C}$. Then a sufficient condition that $F^0 \in \mathcal{C}^0$ and $F^0_d \in \mathcal{C}^0_d$ is that $F$ is uniformly continuous on $\mathcal{U}$.

Proof: That $F^0_d$ is causal and has bounded memory (d) is shown by a simple verification. $F^0_d$ is a mapping into $\mathcal{B}$ and is bounded; in fact, for any $u$ and $t$,

$$|[F^0_d u](t)| = |[F(P_t - P_{t-d})u](t)| \le \|F[(P_t - P_{t-d})u]\| \le \|F\|.$$

To show that $F^0_d$ is continuous, choose $\varepsilon > 0$ arbitrarily. Let $\delta > 0$ be small enough that if $u', u$ satisfy $d[u', u] < \delta$, then $\|Fu' - Fu\| < \varepsilon/2$. Take any such pair $u', u$; then there is a $t_0$ such that

$$\|F^0_d u' - F^0_d u\| < |[F^0_d u'](t_0) - [F^0_d u](t_0)| + \varepsilon/2$$
$$= |[F(P_{t_0} - P_{t_0-d})u'](t_0) - [F(P_{t_0} - P_{t_0-d})u](t_0)| + \varepsilon/2$$
$$\le \|F[(P_{t_0} - P_{t_0-d})u'] - F[(P_{t_0} - P_{t_0-d})u]\| + \varepsilon/2 < \varepsilon/2 + \varepsilon/2 = \varepsilon$$

by the uniform continuity. Hence Fd is continuous, and indeed uniformly continuous. The same sort of argument shows that F E.t It is to be noted that F E 9 does not by itself necessarily imply that F~ 0 < Oj or Fd~ E d; i. e., a continuous mapping from into where a and ' satisfy the conditions of Proposition 2. 9, does not necessarily have a continuous causal part, nor a continuous causal and bounded memory part. The condition of uniform continuity is perhaps the most obvious condition that guarantees the continuity of F and F The d following is an example where Fd is not continuous, even though F e. Consider the set E of real-valued functions on R1 described as follows. Each uE 6 is of the form, for some T1, T2, -00o T1 < T2 < 0o, u(t) = 0, t ' T1 u (T2), T1 < t T2 0, T2 < t where -1 < u (T2) < 1, and where the convention is made that if T1 = -00, u(t) = u(Tz), t < TZ; and correspondingly, if T2 = +00, u(t) has a constant value for all t >Tl. Thus the constant functions, and the functions that are constant except for a single step up from zero or down to zero are included. Obviously g C g, has property (P) and is T-compact. Let F, a mapping from C. into, be defined as follows: [Fu] (t) = 0, t T = 0 (u (T), 2), T1 < t <2 = 0, T<t, 19

where 0 (a,T) = a |, if IT | < 1 IT. |al + 1- T1, if 1 < T1 <1 7 = 0, if IT > I ~l It will be noted that F carries all the constant functions in ~, and in fact all the functions in ' with T2 = +oo, into zero. It is readily verified that F is a bounded continuous mapping from ~ (regarded as a metric subspace of ) into >. Actually, the range of F is contained in. Now, let u(t) = 1 and u (t) 1 - -, n = 1,2,.. The functions n n u and un E, and u - u in 6. Consider F u, [F~u] (t) = [FPtu] (t) = u(t) = 1, it < 1 It u(t) + l-tl=, Itl < = oo i.e., [F u] (t): 1. On the other hand, [Fou ] (t) = [FPu ] (t) n tn = 1, It < 1 = It (1 - )+ -l it - 1 l< |t <n n n 0, Itl >n Thus F~u does not converge to F u in, although of course rn 0 [F u ] (t) -[F~u] (t) for each t. Hence F0 is not a continuous mapping. n For these particular u andu, F~u = F~u, and F1u = F~u, so it n n n follows that F1 is not continuous either. 20

The following very simple result gives some justification for introducing the concepts of causal part and of bounded-memory causal part of mappings, at least when the intended use of these mappings is for approximation.

Proposition 2.10 Let $F$ and $F^0_d \in \mathcal{C}$. If for some $a > 0$ there is a $G \in \mathcal{C}^0_d$ such that $\|F - G\| < a$, then $\|F - F^0_d\| \le 2a$. The corresponding statement is true for $F^0$ and $G \in \mathcal{C}^0$.

Proof: For any $\varepsilon > 0$ there is a $u \in \mathcal{U}$ and a $t_0$ such that

$$\|G - F^0_d\| < |[Gu](t_0) - [F(P_{t_0} - P_{t_0-d})u](t_0)| + \varepsilon/2$$
$$= |[G(P_{t_0} - P_{t_0-d})u](t_0) - [F(P_{t_0} - P_{t_0-d})u](t_0)| + \varepsilon/2$$
$$\le \|G - F\| + \varepsilon/2.$$

Hence $\|G - F^0_d\| \le \|G - F\|$, and $\|F - F^0_d\| \le 2a$. ∎
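The final step of this proof is a triangle-inequality argument. Written out as a sketch, with the shorthand $\Delta_{t_0} = P_{t_0} - P_{t_0-d}$ introduced here only for brevity (an assumption of this note, not notation used at this point in the text):

```latex
\begin{align*}
[Gu](t_0) &= [G\,\Delta_{t_0}u](t_0)
  && \text{since } G \text{ is causal with bounded memory } (d), \\
\bigl|[G\,\Delta_{t_0}u](t_0) - [F\,\Delta_{t_0}u](t_0)\bigr|
  &\le \|G(\Delta_{t_0}u) - F(\Delta_{t_0}u)\| \le \|G - F\|
  && \text{since } \Delta_{t_0}u \in \mathcal{U} \text{ by property (P)}, \\
\|F - F^0_d\| &\le \|F - G\| + \|G - F^0_d\| \le 2\,\|F - G\| \le 2a .
\end{align*}
```

The first line is where the hypothesis on $G$ enters. Without it, $[Gu](t_0)$ could depend on values of $u$ outside $(t_0 - d, t_0]$ and the comparison with $F^0_d$ would fail.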

III. FINITE-TIME-INTERVAL PROJECTIONS OF SYSTEMS AND THEIR TRAJECTORIES

The general situation to be discussed next is the following. The kind of system in question consists of an input space $\mathcal{U}$ of functions of time, an output space $\mathcal{Y}$ of functions of time, and a continuous bounded mapping $F$ from $\mathcal{U}$ into $\mathcal{Y}$. The mapping $F$ may or may not be causal and of finite memory, but there is some emphasis on the case where it is. Such a system operates for infinite time. We want to look at pieces of the system corresponding to finite observation intervals for both input and output, and at the relations among such pieces and between them and the entire system. Each real number $t$ can be taken to be the epoch of an observation interval. If the observation intervals are of fixed duration, then as $t$ changes, a trajectory of comparable finite-time systems is generated by the original system. The elementary properties of these trajectories are investigated in this Section.

It is assumed for the remainder of the paper that $\mathcal{U}$ is a subset of $\mathcal{L}_0^{(p)}$ for some fixed $p$, $1 \le p < \infty$, and that it is T-compact, shift-invariant and has property (P). $\mathcal{U}$ is to be regarded as a metric subspace of $\mathcal{L}_{T+d}^{(p)}$ for some $T$ and $d > 0$ as given. Since all $\mathcal{L}_T^{(p)}$ spaces with the same $p$ are topologically equivalent, changing $T$ and $d$ changes only the metric on $\mathcal{U}$; it does not affect the T-compactness. We have then always

$$\|u\| = \sup_t \|(P_{t+T} - P_{t-d})u\|_p, \quad u \in \mathcal{U}.$$

$\mathcal{Y}$ is always either an $\mathcal{L}_T^{(p)}$ space or $\mathcal{B}$, the Banach space of bounded functions on $R^1$ with the uniform norm, or a closed linear subspace of one of these. In the propositions of this Section, whenever $\mathcal{Y}$ is to be one of some particular class of spaces that fact is stated; otherwise it may be any of the spaces just indicated. One slight technical annoyance is that sometimes it is desirable to take $\mathcal{Y}$ to be $\mathcal{B}_c$, the bounded continuous functions regarded as a subspace of $\mathcal{B}$, but this space quite obviously does not have property (P), which is usually needed. It is not always satisfactory just to replace $\mathcal{B}_c$ with $\mathcal{B}$ in every statement, but it will be clear that when necessary, $\mathcal{B}_c$ can be imbedded in $\mathcal{B}$ in order to make the calculations meaningful.

$\mathcal{C} = \mathcal{C}(\mathcal{U}, \mathcal{Y})$ is the family of bounded continuous mappings from $\mathcal{U}$ into $\mathcal{Y}$ made into a Banach space with the sup norm, as before. Thus,

$$\|F\| = \sup_{u \in \mathcal{U}} \sup_t \|(P_{t+T} - P_t)F(u)\|, \quad F \in \mathcal{C},$$

in all cases, where the norm on the right is the $L_p$ norm or the uniform norm as appropriate.

We now introduce notations for the finite-time pieces of a system. Let $T > 0$ and $d > 0$ be given. Put

$$\mathcal{U}_{t,T}^{d} = (P_{t+T} - P_{t-d})\,\mathcal{U} \tag{5}$$

$$\tilde{F}_{t,T}\,u' = (P_{t+T} - P_t)Fu', \quad u' \in \mathcal{U}_{t,T}^{d}. \tag{6}$$

Equation (6) does define a mapping on $\mathcal{U}_{t,T}^{d}$ since $u'$ belongs to the domain of $F$ by property (P). Further, because of shift invariance we can write

$L_t\,\mathcal{U}_{t,T}^{d} = \mathcal{U}_T^{d}$ for all $t$. Define $F_{t,T}$ by

$$F_{t,T}\,z = L_t \tilde{F}_{t,T} R_t z = L_t(P_{t+T} - P_t)FR_t z = (P_T - P_0)L_t F R_t z, \quad z \in \mathcal{U}_T^{d}, \tag{7}$$

where $\tilde{F}_{t,T}$ is the mapping of equation (6). If $\mathcal{Y}$ has property (P), then $F_{t,T}$ is a mapping from $\mathcal{U}_T^{d}$ into $\mathcal{Y}$, and clearly it is bounded and continuous. But $F_{t,T}$ can also always be regarded as a mapping into a smaller space, denoted $\mathcal{Y}_T$. If $\mathcal{Y} = \mathcal{L}_T^{(p)}$, then $F_{t,T}$ is a bounded continuous mapping into $\mathcal{Y}_T = L_p(0,T]$, and with the same norm as if its range space is taken to be $\mathcal{Y}$. Similarly, if $\mathcal{Y} = \mathcal{B}$, $F_{t,T}$ is a bounded continuous mapping into $\mathcal{Y}_T = \mathcal{B}(0,T]$ with the same norm; even if $\mathcal{Y} = \mathcal{B}_c$, $F_{t,T}$ is a bounded continuous mapping into $\mathcal{Y}_T = \mathcal{B}_c(0,T]$, although it is not a mapping into $\mathcal{Y}$.

If $T$ is fixed throughout a calculation we write simply $F_t$ for $F_{t,T}$. It often avoids confusion to write

$$F_t = (P_T - P_0)L_t F R_t (P_T - P_{-d})$$

even though the projection on the right is redundant. When we are dealing with mappings $F$ with finite memory, $d$ is usually chosen to be the duration of the memory; however, the above definitions are to be applied in the general case, whether $F$ is causal with finite memory or not. Causality and finite memory are not to be assumed in what follows unless explicitly stipulated.
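The truncation $F_t = (P_T - P_0) L_t F R_t$ of equation (7) can be tried out on a toy example. The sketch below is illustrative only and rests on assumptions not made in the text: time is discrete (integers), and the two systems `ramp_gain` (time-varying) and `unit_delay_square` (time-invariant), along with every helper name, are hypothetical.

```python
from typing import Callable

Signal = Callable[[int], float]
System = Callable[[Signal], Signal]


def window(a: int, b: int, y: Signal) -> Signal:
    """(P_b - P_a) y: keep y on (a, b], zero elsewhere."""
    return lambda s: y(s) if a < s <= b else 0.0


def shift_left(c: int, y: Signal) -> Signal:
    """(L_c y)(t) = y(t + c)."""
    return lambda t: y(t + c)


def shift_right(c: int, y: Signal) -> Signal:
    """(R_c y)(t) = y(t - c)."""
    return lambda t: y(t - c)


def truncation(F: System, t: int, T: int) -> System:
    """Return F_t = (P_T - P_0) L_t F R_t, acting on inputs placed on (-d, T]."""
    def F_t(z: Signal) -> Signal:
        moved_in = shift_right(t, z)          # place z on (t - d, t + T]
        response = F(moved_in)                # run the full system
        moved_back = shift_left(t, response)  # bring the output back to (0, T]
        return window(0, T, moved_back)       # observe only on (0, T]
    return F_t


def ramp_gain(u: Signal) -> Signal:
    """Time-varying, nonlinear: [F u](s) = (1 + s) * u(s - 1)^2."""
    return lambda s: (1 + s) * u(s - 1) ** 2


def unit_delay_square(u: Signal) -> Signal:
    """Time-invariant, nonlinear: [F u](s) = u(s - 1)^2."""
    return lambda s: u(s - 1) ** 2


T, d = 4, 1
z = window(-d, T, lambda s: 2.0)   # an input piece supported on (-d, T]

# Output at local time s = 1 for two different epochs t.
varying = [truncation(ramp_gain, t, T)(z)(1) for t in (0, 3)]
invariant = [truncation(unit_delay_square, t, T)(z)(1) for t in (0, 3)]
```

For the time-varying system the truncations at the two epochs differ, so $F_t$ really does move as $t$ changes. For the time-invariant system they coincide. In both cases the output is zero outside $(0, T]$, because of the projection $(P_T - P_0)$.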

Let $\pi_t : \mathcal{C}(\mathcal{U}, \mathcal{Y}) \to \mathcal{C}(\mathcal{U}_T^{d}, \mathcal{Y}_T)$ be the mapping that carries $F$ into $F_t$ according to equation (7).

Proposition 2.11 The mapping $\pi_t$ is linear and continuous; in fact $\|\pi_t F\| \le \|F\|$.

Proof: The linearity is obvious. Also

$$\|\pi_t F\| = \|F_t\| = \sup_{u \in \mathcal{U}_T^{d}} \|(P_T - P_0)L_t F R_t u\| \le \sup_{u \in \mathcal{U}_T^{d}} \|FR_t u\| \le \sup_{z \in \mathcal{U}} \|Fz\| = \|F\|$$

where the norm on the left is for the space $\mathcal{C}(\mathcal{U}_T^{d}, \mathcal{Y}_T)$. ∎

For each $t$, $F_t = \pi_t F$ is an element of $\mathcal{C}(\mathcal{U}_T^{d}, \mathcal{Y}_T)$, so as $t$ runs through $R^1$ a trajectory is generated in $\mathcal{C}(\mathcal{U}_T^{d}, \mathcal{Y}_T)$ corresponding to $F$. If $F$ is a time-invariant mapping this "trajectory" reduces to a single point, of course, but we are interested in time-invariant systems only as a special case. Since in general for time-varying systems these trajectories describe the evolution of the systems, we wish to investigate their properties. Note that the trajectories depend on $T$; however, for now, we keep $T$ fixed arbitrarily.

Proposition 2.12 Let $\mathcal{U} \subset \mathcal{L}_0^{(q)}$, $1 \le q < \infty$, and $\mathcal{Y}$ be $\mathcal{L}_T^{(p)}$, $1 \le p < \infty$, or $\mathcal{B}_c$. Then, for any $F \in \mathcal{C}(\mathcal{U}, \mathcal{Y})$ the trajectories $F_t = \pi_t F$ with values in $\mathcal{C}(\mathcal{U}_T^{d}, \mathcal{Y}_T)$ are continuous in $t$. Furthermore, if $\mathcal{A}$ is a compact subset of $\mathcal{C}(\mathcal{U}, \mathcal{Y})$, then the trajectories $F_t = \pi_t F$, $F \in \mathcal{A}$, are equicontinuous functions of $t$.

Proof: Suppose = o, then I[Ft -Ft+h sup II (Ft - Ft+h) ullp uE ~T sup T F u (PT - P) F Rt u- (P - Lt+h F Rth lip < sup I (P - P ) L F Rt u - (P - Po) Lt F R ull T o t t T o t+h t p 'T + SUP (tP h FR u- o Lt+h Rt h p + sup II(PT - P) Lt+h F Rt u (PT P) Lth F Rt+h llp %u-p Denote the first term on the right-hand side of the inequality by I, the second by II. Then, I < sup Lt (P - Pt) F Rt u - Lt+h (P - Pt) F Rt ull + sup TRt U Lt+h + - (PT+t+h Pt+h) F Rt u1p Denote the two terms on the right-hand side of this inequality by I, Ib, respectively. Then Lb -sup ILt+h [P P + P - P FRtull b, t+h [T+t T+t+h t+h t t p + sup I (PT+t Pt+h) F R u lp T + sup th- P t) F Rt Pull Now, F(Rt f. T) is a compact subset of / since T is compact in L (by the T-compactness of L. ) and FR is continuous. Let yi, i = 1, *, N, 26

be a set of points in U such that the balls of radius ~ about the Yi cover F (Rt 'LL). Then the first term in the expression dominating Ib is in turn dominated by sup min (PT+t - P +t+h) (F Rt - yi) | '[2T i=l,..,N T+t t+h t +II (T+t PT+t+h) Yi lp Let h be sufficiently small that II(PT+t- PT+t+h) Yi lp for all i = 1, - *, N. Then for any such h the above expression has a value 2 g. The second term in the expression dominating Lb can be treated in the same way, so we have that for some h1 >0, Lb 5 4 whenever Ih| < h The term I can be written as a sup (L - Lt+h) (Pt+T - Pt) F Rt ulp Since (Pt+T - Pt)FRt UT is a compact subset of L one can choose a 1 p about the z. cover (P - P ) F Rt 'A Then, since for h sufficiently 1 T+t t t T small II (Lt L t+h) zi lp < ~ for all i = 1,, M, we have, very much as above, that for some h2 > 0, I < 2 9 whenever h | < h2. a To bound II, we have II = sup t+h (PT+t+h - Pt+h) [F Rtu - FRt+h ] I s Sup II(PT t PT) [F Rt - F Rt Rh u] lp 7-T 27

when $|h| < T$. Now $\mathcal{U}' = \bigcup_{|h| \le T} R_h \mathcal{U}_T$ is a compact subset of $L_q$ (as in Proposition 2.4) and $(P_{t+2T} - P_{t-T}) F R_t$ is a uniformly continuous mapping from $\mathcal{U}'$ into $L_p$. Hence, there is an $\eta > 0$ so that

$$\|(P_{t+2T} - P_{t-T}) F R_t u' - (P_{t+2T} - P_{t-T}) F R_t u''\|_p < \varepsilon$$

whenever $\|u' - u''\|_q < \eta$. Let $\{w_i\}$, $i = 1, \ldots, K$, be the centers of balls of radius $\eta/2$ that cover $\mathcal{U}'$. Then

$$\mathrm{II} \le \sup_{u \in \mathcal{U}_T} \left\{ \|(P_{t+2T} - P_{t-T}) F R_t u - (P_{t+2T} - P_{t-T}) F R_t w_i\|_p + \|(P_{t+2T} - P_{t-T}) F R_t w_i - (P_{t+2T} - P_{t-T}) F R_t R_h u\|_p \right\} .$$

Let $h_3 > 0$ be small enough that $\|R_h w_i - w_i\|_q < \eta/2$ for all $i = 1, \ldots, K$ whenever $|h| < h_3$, and temporarily fix such an $h$. With this fixed value of $h$, there is a $u_0$ so that the supremum in the inequality above is realized to within $\varepsilon$ by $u = u_0$. This gives

$$\mathrm{II} \le \|(P_{t+2T} - P_{t-T}) F R_t u_0 - (P_{t+2T} - P_{t-T}) F R_t w_i\|_p + \|(P_{t+2T} - P_{t-T}) F R_t w_i - (P_{t+2T} - P_{t-T}) F R_t R_h u_0\|_p + \varepsilon$$

for all $i = 1, \ldots, K$. There is at least one $w_i$ so that $\|u_0 - w_i\|_q < \eta/2$; choose such a $w_i$. Then the first term on the right is $< \varepsilon$. With this particular $w_i$,

$$\|R_h u_0 - w_i\|_q \le \|R_h u_0 - R_h w_i\|_q + \|R_h w_i - w_i\|_q = \|u_0 - w_i\|_q + \|R_h w_i - w_i\|_q < \eta/2 + \eta/2 = \eta$$

so the second term is also $< \varepsilon$. Thus for $|h| < h_3$, $\mathrm{II} \le 3\varepsilon$. Combining these estimates gives the result that if $|h| < \min(h_1, h_2, h_3)$ then

$$\|F_t - F_{t+h}\| \le \mathrm{I}_a + \mathrm{I}_b + \mathrm{II} \le 2\varepsilon + 4\varepsilon + 3\varepsilon = 9\varepsilon .$$

This proves the assertion for a single trajectory with $\mathcal{Y} = \mathcal{L}^{(p)}_T$. An inspection of the proof will show that if $F \in \mathcal{K}$, $\mathcal{K}$ a compact subset of $\mathcal{C}(\mathcal{U},\mathcal{Y})$, then the compact sets chosen above can each be replaced by compact sets chosen independently of $F$ in $\mathcal{K}$. For example, the compact set $F(R_t \mathcal{U}_T)$ is replaced by $\mathcal{K}(R_t \mathcal{U}_T)$, which is a compact subset of $\mathcal{Y}$ since $\mathcal{K}$ restricted to $R_t \mathcal{U}_T$ is a compact set of bounded continuous mappings, and $R_t \mathcal{U}_T$ is a compact subset of $L_q$. Also the mappings $(P_{2T+t} - P_{t-T}) F R_t$ restricted to $\mathcal{U}'$, $F \in \mathcal{K}$, are equicontinuous by Ascoli's theorem. These facts yield the assertion that the $F_t$ are equicontinuous. The proof of the assertions for the case in which $\mathcal{Y}$ is the uniform-norm space is similar, although obviously some modifications are required. The details are not given.

Two consistency relations are introduced for the trajectories $F_t$. The second of these will also be used as an interpolation formula. Conditions under which they hold are given in the Proposition to follow.

$$(P_{T-\eta} - P_0) L_\eta F_t R_\eta (P_{T-\eta} - P_{-d}) = (P_{T-\eta} - P_0) F_{t+\eta} (P_{T-\eta} - P_{-d}), \quad 0 \le \eta \le T \qquad (8)$$

$$F_{t+\eta} = (P_{T-\eta} - P_0) L_\eta F_t R_\eta (P_{T-\eta} - P_{-d}) + (P_T - P_{T-\eta}) L_{\eta-T} F_{t+T} R_{\eta-T} (P_T - P_{T-\eta-d}), \quad 0 \le \eta \le T \qquad (9)$$

Proposition 2.13 i) If $F_t = \pi_t F$, $F \in \mathcal{C}(\mathcal{U},\mathcal{Y})$, then $F_t$ satisfies equation (8) for all $t$. ii) If $F \in \mathcal{C}_d(\mathcal{U},\mathcal{Y})$, then $F_t$ satisfies equation (9) for all $t$. iii) If $H_t$, $H_{t+T}$, $H_{t+\eta}$ are any mappings from $\mathcal{U}_T$ into $\mathcal{Y}_T$ that satisfy equation (9), then they satisfy equation (8).

Proof: The proof of i) is given by the calculation:

$$(P_{T-\eta} - P_0) L_\eta [(P_T - P_0) L_t F R_t (P_T - P_{-d})] R_\eta (P_{T-\eta} - P_{-d})$$

$$= (P_{T-\eta} - P_0)(P_{T-\eta} - P_{-\eta}) L_{\eta+t} F R_{\eta+t} (P_{T-\eta} - P_{-d-\eta})(P_{T-\eta} - P_{-d})$$

$$= (P_{T-\eta} - P_0) L_{\eta+t} F R_{t+\eta} (P_{T-\eta} - P_{-d})$$

$$= (P_{T-\eta} - P_0) [(P_T - P_0) L_{t+\eta} F R_{t+\eta} (P_T - P_{-d})] (P_{T-\eta} - P_{-d})$$

$$= (P_{T-\eta} - P_0) F_{t+\eta} (P_{T-\eta} - P_{-d}) .$$

To prove ii) we use i) for the first term on the right side of equation (9) and make an analogous calculation for the second term. Then the right-hand side of equation (9) becomes

$$(P_{T-\eta} - P_0) F_{t+\eta} (P_{T-\eta} - P_{-d}) + (P_T - P_{T-\eta}) F_{t+\eta} (P_T - P_{T-\eta-d}) \qquad (10)$$

If $F$ is causal with bounded memory (d), then so are all the $F_t$, and this expression reduces to

$$(P_{T-\eta} - P_0) F_{t+\eta} + (P_T - P_{T-\eta}) F_{t+\eta} = F_{t+\eta}$$

which proves ii). iii) can be verified immediately by substituting $F_{t+\eta}$ from equation (9) into the right-hand side of equation (8).

The consistency condition (9), if required to hold for all $t$ and all $\eta$, $0 \le \eta \le T$, is not quite enough to guarantee that $F$ is causal with bounded memory (d). It does guarantee something a little weaker, and to state this we use the definition: $F$ is weakly causal and of bounded memory (d) if for every $\Delta > 0$, and for all $t$,

(1) $(P_{t+\Delta} - P_t) F (P_{t+\Delta} - P_{t-d}) = (P_{t+\Delta} - P_t) F (P_a - P_{t-d})$ whenever $t + \Delta \le a$, and

(2) $(P_{t+\Delta} - P_t) F (P_{t+\Delta} - P_{t-d}) = (P_{t+\Delta} - P_t) F (P_{t+\Delta} - P_b)$ whenever $b \le t - d$.

This definition rules out non-causality and non-bounded-memory (d) that depend on interactions between past and future.

Proposition 2.14 If $F \in \mathcal{C}(\mathcal{U},\mathcal{Y})$, then equation (9) is satisfied for all $T > 0$, all $t$, and all $\eta$, $0 \le \eta \le T$, iff $F$ is weakly causal and of bounded memory (d).

Proof: The right-hand side of equation (9) is given in different form in (10); consider the first term of (10). Since (9) is satisfied, we must have that

$$(P_{T-\eta} - P_0) F_{t+\eta} (P_{T-\eta} - P_{-d}) = (P_{T-\eta} - P_0) F_{t+\eta} (P_T - P_{-d}) .$$

This may be rewritten,

$$L_{t+\eta} (P_{T+t} - P_{t+\eta}) F (P_{T+t} - P_{t+\eta-d}) R_{t+\eta} = L_{t+\eta} (P_{T+t} - P_{t+\eta}) F (P_{T+t+\eta} - P_{t+\eta-d}) R_{t+\eta}$$

which is equivalent to

$$(P_{T+t} - P_{t+\eta}) F (P_{T+t} - P_{t+\eta-d}) = (P_{T+t} - P_{t+\eta}) F (P_{T+t+\eta} - P_{t+\eta-d}) .$$

Put $a = T + t + \eta$, $s = t + \eta$ and $\Delta = T - \eta$. Then this is in the form of condition (1) for weak causality and bounded memory (d), with $s$ in the role of $t$, and $a$, $s$ and $\Delta$ can be given arbitrarily by choosing $\eta > 0$, $t$ and $T > \eta$. An analogous argument applied to the second term of (10) yields condition (2). The converse follows immediately from equation (10) and the definitions.

From a family of mappings carrying $\mathcal{U}_T$ into $\mathcal{Y}_T$ it is possible under certain circumstances to synthesize a mapping from $\mathcal{U}$ into $\mathcal{Y}$. We want to be able to do this, because we want to be able to go from trajectories $\{F_t\}$ back to an overall system mapping $F$. The transformations $\rho_s$ to be defined below accomplish this. If $\pi$ denotes the transformation carrying $F$ into a trajectory $\{F_t\}$, then the $\rho_s$ are roughly inverse to $\pi$. However the situation is a little complicated in general, and $\rho_s$ and $\pi$ are inverse to each other only when $F$ is causal with bounded memory of sufficiently short duration. These comments are made precise in what follows. We use the notations: $\Delta_{n,t} = (P_{t-(n-1)T} - P_{t-nT})$,

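$\Delta'_{n,t} = (P_{t-(n-1)T} - P_{t-nT-d})$, where $T > 0$, $d > 0$ are fixed. Let $\mathcal{G}$ be a bounded subset of $\mathcal{C}(\mathcal{U}_T,\mathcal{Y}_T)$ with the additional property that the $G \in \mathcal{G}$ are equicontinuous. Let $\{G_n\}$ be a sequence from $\mathcal{G}$. For any real number $t$, define

$$G^t = \sum_{n=-\infty}^{\infty} \Delta_{n,t} R_{t-nT} G_n L_{t-nT} \Delta'_{n,t} \qquad (11)$$

It is clear that $G^t$ is a mapping from $\mathcal{U}$ into $\mathcal{Y}$ if $\mathcal{Y}$ is $\mathcal{L}^{(p)}_T$ or the uniform-norm space; however, see Appendix A for a formal justification of the infinite sum.

Proposition 2.15 $G^t$ as defined by equation (11) is an element of $\mathcal{C}(\mathcal{U},\mathcal{Y})$, where $\mathcal{Y}$ is either $\mathcal{L}^{(p)}_T$ or the uniform-norm space.

Proof: Take $\mathcal{Y} = \mathcal{L}^{(p)}_T$. Given $\varepsilon > 0$, let $\delta > 0$ be such that $\|z_1 - z_2\| < \delta$, $z_1, z_2 \in \mathcal{U}_T$, implies $\|G_n(z_1) - G_n(z_2)\| < \varepsilon$ for all $n = 1, 2, \ldots$, as is possible from the hypothesis on $\mathcal{G}$. For $u_1, u_2 \in \mathcal{U}$, $\|u_1 - u_2\| < \delta$, one has

$$\|G^t(u_1) - G^t(u_2)\| = \sup_s \|(P_{s+T} - P_s) [G^t(u_1) - G^t(u_2)]\|_p$$

$$\le \|\Delta_{k+1,t} R_{t-(k+1)T} G_{k+1} L_{t-(k+1)T} \Delta'_{k+1,t}(u_1) + \Delta_{k,t} R_{t-kT} G_k L_{t-kT} \Delta'_{k,t}(u_1) - \Delta_{k,t} R_{t-kT} G_k L_{t-kT} \Delta'_{k,t}(u_2) - \Delta_{k+1,t} R_{t-(k+1)T} G_{k+1} L_{t-(k+1)T} \Delta'_{k+1,t}(u_2)\|_p + \varepsilon$$

for some $k$, and this is $\le 2\,\|\Delta_{n,t} R_{t-nT} G_n L_{t-nT} \Delta'_{n,t}(u_1)$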

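$-\ \Delta_{n,t} R_{t-nT} G_n L_{t-nT} \Delta'_{n,t}(u_2)\|_p + \varepsilon = 2\,\|(P_T - P_0) [G_n L_{t-nT} \Delta'_{n,t}(u_1) - G_n L_{t-nT} \Delta'_{n,t}(u_2)]\|_p + \varepsilon$ for some $n$. But

$$\|L_{t-nT} \Delta'_{n,t} u_1 - L_{t-nT} \Delta'_{n,t} u_2\| \le \|u_1 - u_2\| < \delta,$$

so that

$$\|G^t(u_1) - G^t(u_2)\| \le 2\varepsilon + \varepsilon = 3\varepsilon .$$

The boundedness of $G^t$ follows similarly. The same proof holds when $\mathcal{Y}$ is the uniform-norm space, if the $L_p$ norms are changed to uniform norms.

The transformation that carries the sequence $\{G_n\}$ of equicontinuous mappings into $G^t$ is denoted $\rho_t$. Note that the equicontinuity condition is natural, since, when we go the other way, we have that the $\pi_t F$, $F \in \mathcal{C}(\mathcal{U},\mathcal{Y})$, are equicontinuous with respect to $t$.

Proposition 2.16 The transformation $\rho_t$ is continuous in the following sense: if there are two sequences of equicontinuous mappings from $\mathcal{U}_T$ to $\mathcal{Y}_T$, $\{G_n\}$ and $\{\tilde{G}_n\}$, and $\|G_n - \tilde{G}_n\| < \delta$ for all integers $n$ for some $\delta = \delta(\varepsilon)$, then $\|\rho_t(\{G_n\}) - \rho_t(\{\tilde{G}_n\})\| < \varepsilon$.

Proof: Again take $\mathcal{Y} = \mathcal{L}^{(p)}_T$. Then,

$$\|\rho_t(\{G_n\}) - \rho_t(\{\tilde{G}_n\})\| \le 2 \sup_{u \in \mathcal{U}} \|\Delta_{n,t} R_{t-nT} [G_n L_{t-nT} \Delta'_{n,t}(u) - \tilde{G}_n L_{t-nT} \Delta'_{n,t}(u)]\|_p + \varepsilon$$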

for some $n$ by a calculation very similar to that in the previous proof. But the right side of this inequality can be rewritten as

$$2 \sup_{u \in \mathcal{U}} \|(G_n - \tilde{G}_n)(L_{t-nT} \Delta'_{n,t} u)\|_p + \varepsilon = 2 \sup_{z \in \mathcal{U}_T} \|(G_n - \tilde{G}_n) z\|_p + \varepsilon$$

since, by the shift invariance of $\mathcal{U}$, any $z \in \mathcal{U}_T$ can be obtained by truncating some $u$ by $(P_{t-(n-1)T} - P_{t-nT-d})$ for arbitrary $t$, $n$. Hence, if $\|G_n - \tilde{G}_n\|$ is sufficiently small for all $n$,

$$\|\rho_t(\{G_n\}) - \rho_t(\{\tilde{G}_n\})\| \le 2\varepsilon + \varepsilon = 3\varepsilon .$$

Again, the same proof holds when $\mathcal{Y}$ is the uniform-norm space if the $L_p$ norms are changed to sup norms.

Proposition 2.17 If $F \in \mathcal{C}_d(\mathcal{U},\mathcal{Y})$, then for any $t$, $\{\pi_{t-nT} F\}$ is a family of equicontinuous causal mappings from $\mathcal{U}_T$ to $\mathcal{Y}_T$ with bounded memory (d), and

$$F = \rho_t(\{\pi_{t-nT} F\}) .$$

Conversely, if $\{G_n\}$ is a sequence of equicontinuous causal mappings from $\mathcal{U}_T$ to $\mathcal{Y}_T$ with bounded memory (d), then $\rho_t(\{G_n\}) \in \mathcal{C}_d(\mathcal{U},\mathcal{Y})$ and

$$G_k = \pi_{t-kT} \circ \rho_t(\{G_n\}) .$$

Proof: The assertion that the $\pi_{t-nT} F$ are causal with bounded memory (d) is obvious; indeed all the $\pi_s F$ are causal with bounded memory (d). Further,

$$\rho_t(\{\pi_{t-nT} F\}) = \sum_{n=-\infty}^{\infty} \Delta_{n,t} R_{t-nT} [(P_T - P_0) L_{t-nT} F R_{t-nT} (P_T - P_{-d})] L_{t-nT} \Delta'_{n,t} = \sum_{n=-\infty}^{\infty} \Delta_{n,t} F \Delta'_{n,t} = F .$$
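The first inversion identity of Proposition 2.17 can be checked numerically in a discrete-time setting. The sketch below is an illustration under stated assumptions rather than part of the original argument: time is integer-valued, signals are Python callables, `F_short` and `F_long` are hypothetical example systems, and the names `rho` and `reconstruction_error` are introduced here. It assembles $\rho_t(\{\pi_{t-nT}F\})$ from finitely many terms of the sum (11) and compares the result with $F$ on a finite grid.

```python
import math

def window(a, b):
    """Operator P_b - P_a: keep f(s) for a < s <= b, zero elsewhere."""
    return lambda f: (lambda s: f(s) if a < s <= b else 0.0)

def L(t):
    """Left shift: (L_t f)(s) = f(s + t)."""
    return lambda f: (lambda s: f(s + t))

def R(t):
    """Right shift: (R_t f)(s) = f(s - t)."""
    return lambda f: (lambda s: f(s - t))

def truncate(F, t, T, d):
    """F_t = (P_T - P_0) L_t F R_t (P_T - P_{-d})."""
    return lambda u: window(0, T)(L(t)(F(R(t)(window(-d, T)(u)))))

def rho(G, t, T, d, n_values):
    """Finite part of (11): sum over n of Delta_{n,t} R G_n L Delta'_{n,t}."""
    def G_t(u):
        pieces = []
        for n in n_values:
            a = t - n * T                        # left end of the n-th window
            u_n = L(a)(window(a - d, a + T)(u))  # L_{t-nT} Delta'_{n,t} u
            pieces.append(window(a, a + T)(R(a)(G(n)(u_n))))
        return lambda s: sum(piece(s) for piece in pieces)
    return G_t

def F_short(u):
    """Hypothetical causal system whose memory fits inside d = 2."""
    return lambda s: math.cos(0.3 * s) * math.tanh(u(s) + 0.5 * u(s - 1))

def F_long(u):
    """Hypothetical causal system whose memory exceeds d = 2."""
    return lambda s: math.tanh(u(s) + u(s - 5))

def reconstruction_error(F, t=1, T=4, d=2):
    """Max over a grid of |rho_t({pi_{t-nT} F}) u - F u| for one test input."""
    u = lambda s: math.sin(0.7 * s)
    G = lambda n: truncate(F, t - n * T, T, d)
    rebuilt = rho(G, t, T, d, range(-10, 11))(u)
    exact = F(u)
    return max(abs(rebuilt(s) - exact(s)) for s in range(-20, 21))
```

For `F_short` the error vanishes, in line with the Proposition. For `F_long` it does not, because the truncations discard the part of the input lying more than $d$ in the past; this is the loss of information referred to when $F$ is not causal with bounded memory (d).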

It is also obvious that Pt ({G }) is causal with bounded memory (d). The second inversion identity is given by the calculation: rt-kT 0 Pt((Gn) T o t-kT V n, t tnT n t- n,t R t-kT (PT -d -oo Lt-kT k,t Rt-kT Gk Lt-kT k,t t-kT (T -d (PT - P )G (PT - P = Gk o k -dT d When F is not causal with bounded memory, the operations Tr and p obviously cannot be inverse to each other because some information about t F is lost in the truncations given by the Irt which cannot be restored. The sense in which they are approximately inverse to each other is given in the next Proposition. Proposition 2.18 Let {F}, - oo < t < oo be a family of mappings in ^ ( T' / T) which is bounded and in which the Fk are equicontinuous. For any fixed s consider {F }, n = **, -2, -1, 0, 1, 2, **. Put d H - (P - P)L F R (P -P s-nT+1l T-r o r1 s-nT r T-rj -d + (P - P )LI F R - P ) + (PT ' )-T s-(n-l)T RT (PT T-rl-d 0 < r < T. (12) This defines Ht for all t, and H T = F Further, define H ( Ps ({Hs-nT}) H() d t o os ({H }) = t H(1) t t rrt~s -Iss-nT t 36

(2) d (1) H =P5 ({1T-nT H ) ({H s-nT s s-nT Then, i) H = H for all t (2) (1) ii) H - H iii) If F satisfies equation (9), then Ht = F (1) and H() = F Proof: By the definitions, H is given by -00 H (PT P ) Lt 00 R (P -P t T t n,s s-nT s-nT s-nT n,s t T -d At most two terms from the infinite sum can contribute anything, by virtue of the projection (PT- P ). Let k be that integer such that T O s - kT < t <s - (k- 1)T and let, = t - (s - kT) Then, since F = H (1) Then, since F T= H, the expression for H) above reduces to s n- s-nT ' t the expression for Ht = H given by equation (12). Thus H = H thes e fo -nT+~T| t L (a) (1) for all t, and H = H follows from this equality and from the definitions. The assertion iii) is obvious, since quation (12) becomes equation (9) if H is replaced by F. We conclude this Section with a simple error bound on the interpolated Ht+] as given by equation (9) when Ht and Ht+T are in error. Proposition 2. 19 Let I|| F Ft|| < ~ and ||Ft+ - F || < g. Then, if Ft+a and Ft+. are each given by equation (9) in terms of Ft, Ft+T and 37

t' t+T' respectively, IF t+ri - Ft+l I 2.~ Proof: Expressing Ft+, Ft in terms of F. Ft+T and Ft. Ft+T from --— _ t0r| t+T|t t+~T t t+T equation (9) yields ~F -F II td) IIF R (P -P ) R P P t~'t+rI t+ll tsu T t T-rq -dTT sup t+T T (PT T —d Ft+T R - (P - PT )1 (13) t ri-T T T —d -T UT Now, by the properties of., o T R (PT- P c T ( 0<_l<T n T - -d T T Hence, suP JIFt Rl (P- P) u - Ft R (P - P tsP i T-,] -d t I T-i -d /U T sup IIFtu - Ftu|| ' e T The second term in the inequality (13) is also dominated by ~, by essentially the same argument.t 38

IV. TRAJECTORIES OF THE FINITE-TIME PROJECTIONS FOR CLASSES OF SYSTEMS We now consider a class of systems p = ( /, f,,V) in its natural representation form, J = ( $, g,9, ), where ' is a shiftinvariant, T-compact subset of P+d with property (P), and where ' +d is T or 4. v is, of course, a subset of (t,> ); further hypotheses on 49 will be made as needed. Each F Ejwill generate a trajectory {Ft} E /VT' whether F is causal with bounded memory less than or equal to d, or not. If /C Id (f, ), then each of these trajectories will yield the corresponding F through the mapping p. We investigate some basic properties of these families of trajectories. Temporarily take T >0 to be fixed. Let 2? be the closed linear subspace of the Banach space ( (, ~ ) generated by "/, and let 2nt = t ~ t is a linear subset of ( CT, /T); its closure, 5t, is the closed linear subspace of ( T', ~T) generated by mt / We define (or ) to be a linearly predictable class of systems with respect to T if each mapping Tr is 1:1 from #' onto 't, teR1. When Se is a linearly predictable class a prediction mapping 0(t,s) carrying 0 H into H, t < s can be defined by t s 0(t,s) = Tr o, -o <t, s < o. s t For each t, s, 0(t, s) is obviously a linear transformation with domain ft and range;s. 39

The intuitive meaning of et being a linearly predictable class o is that no two trajectories associated with the FE a corresponding to,<| can cross or touch and be at the common point at the same time. o Two trajectories can cross or touch provided the time of arrival at the common point is different for the two. A class of systems consisting of a single system ( '4 has only one element) is always predictable in the sense of this definition. We further define a stationarily predictable class of systems with respect to T to be a class xf with the property that whenever iT F = Tr G, t s F and GE ~et, then tt+a F = Tr G for all real numbers a. Intuitively, t+a s+a this implies that the systems F and G have trajectories which as geometrical entities are identical. Furthermore, no individual trajectory can cross itself. If the definition is weakened to read: rt F = i G, F and GE L, t s implies tt+a F = Tr G for all a > 0, we call the class 4 a future-time t+a s+a o (f.t. ) stationarily predictable class with respect to T. If either of F or G is not causal with bounded memory (d) it is obviously possible that Tr F = Tw G for all a without F and G being the same. a a In this case J can be stationarily predictable without being linearly 0 predictable. A fortiori, 4 can be f.t. stationarily predictable without being linearly predictable. However, if the iv associated with 3 is o a subset of 5d ( (, i ), so is 4. Then if for some t, rt F = wtG, d t t it follows from stationary predictability that tr F = Tr G for all a, and a a hence by Proposition 2.17 that F = G. Thus, in this situation stationary predictability implies linear predictability. Under the same condition that 40

C d ( ' ( - ), if e 'is only f. t. stationarily predictable, the d (j o situation is complicated a little, but (can be interpreted in much the same way as will be seen below. In case jg is linearly and stationarily predictable the prediction 0 mapping 0(t, s) can be written as a function of the difference s-t only, once the domain has been defined properly. In fact, suppose to start with that F' E 94 and also F' E t+a Then F = rt F for some FE e~; and also t t+a t F' E7a G for some GE t. t+a Thus, 0(t,s)F' = ir o Tt '(1t F) = rr F and 0(t+a, s +a) F' 5it t w o Tr - (t+a G) = r G. By the definition of a stationarily predictable s +a t+a s+a ' class, rr F = Tr G; hence 0(t,s) Ft = 0(t+a, s+a)F'. Now (with a s s+a slight abuse of notation) let O(T)F' = (t, s)F', s = t + T, for all F' such that for some t, F' E 't. This definition is meaningful, because if more than one pair (t, s) satisfy the conditions they all yield the same 0(t, s)F'. The domain of 0(T), for any T, will now include U x t; extend this by linearity to tER1 i = linear span { U '), }. The family {((T)}, T E' R, is now a tE R1 one-parameter group of linear transformations on 2. We note that A C j ( tT', ~T). In fact, the elements of 7 are of the form N F'= E an(PT - P) Lt F Rt (P P T 0 ( E nt n n T -d where {tl, * *, tN} is an arbitrary finite set of real numbers, as is also 41

{al,, * N* * } and F,..., FN are each elements of 3 ( i, ). N N Since a Lt Fn Rt is also a bounded, continuous mapping, we n=l n n can denote it by F E Z( ', y ). Then, F =(PT-P) F (P - P_) E ( ( T) In case. is a linearly and f. t. stationarily predictable class we can, similarly, for any T > 0, put O(T) F1 = O(t,s) Fl for all F1 such that for some pair (t,s) with t > 0 s - t = T, it holds that F1E tL. The domain of O(T), T > 0, can now be extended by linearity to /L = linear span { U t}. The family {0(T)}, T > 0 is now a one-parameter t>0 semigroup of linear transformations on 1, which is also contained in 3~( T' yT) If soZ is f. t. stationarily (but not linearly) predictable and 4LtC /Id0 ( * ' ), a semigroup can be established in essentially the same way. Suppose F' = r F = t G. Then the fact that irt+ F = r G, t t tiT t+T T > 0, implies that F and G restricted to (I - Pt) L are the same mapping. We now redefine tr1 as the set function: rt1 (F') = {F:rtF = F'}. Then O(t,s) can again be defined as wr o rtl, but only, of course, for t < s. O(t,s) S t is again linear on Zt' and the development that follows for the semigroup case can be repeated exactly. In what follows we restrict attention to the semigroups of linear transformations, as being of more immediate interest than groups in modelling for system identification. The usual linear operator norm, when it exists, of the linear transformation O(T) is given by 42

IF'E ( IIF "C V T where the symbol [ | has been used to provide a reminder that this is a different kind of norm than has been used for the other mappings that have appeared. From the definition of A+ it follows that 0(T) is a bounded operator if and only if there is a number B > 0 such that su jJ(P - P) ( Lt- F R \ ) (P -P ) ujJ N uet (PT o 0 n Ln+T n tn+T/ ( T -d) 1U:<B sup (P P) a( Lt F Rt (PT P )u[ (14) uUE'~\ n n n T d n=l T-d for any positive integer N, any set of points tl, ' ', tN all greater than or equal to zero, any set of scalars al,, ', ac and any F1, *.-, FN belonging to 7L. This is a regularity condition on the time behavior of the mappings F. Note that, unfortunately, it is not sufficient to consider just those F e 2/, but rather all finite linear combinations of these and of their translations. If /2 is itself a subset of ( ZC,. ) that is invariant under time shift, then all the /2t are the same and the sums in the t condition (14) collapse to single terms. Using the definitions established, we can now state a basic fact, which is really a corollary to Proposition 2. 12. Proposition 2.20 Let. be such that,$C /d (C, /), let it be f.t. o d 0 stationarily predictable with respect to T, and let the 0(T), T > 0, be bounded operators. Then {0(T)}, T > 0, is a strongly continuous semigroup 43

of bounded linear operators on the Banach space +, the closure of 9+ in j(T, /T). Proof: It is supposed of course that the 0(T) are extended by continuity to +. All that has to be shown is that O1(T)F' - F' -| 0 as T - 0, for any F' E. Since C ( and since any F' E t ( AT' yT) can be written F' = (PT - Po)F (PT - d) it follows that F' is the image under or of itself, regarded as an element o of ~ (,, t). We write, F' = -r F = F. Then ffI0 o 0 110(T) F' - FII = | F - Fl || as T 0 by Proposition 2.12. Clearly the hypothesis that / C d(, ) can be replaced d by the hypothesis that vf is linearly predictable, and then with the other 0 hypotheses in force the conclusion still follows. For convenience we shall refer to an f that satisfies either the conditions of Proposition 2.20 or o the modified conditions just given as a linear dynamical class of systems with respect to T. This terminology is introduced with some apology since dynamical is such a widely used term; however, it seems reasonably appropriate. There is no inference, of course, that the individual systems in the linear dynamical class are linear. Thus far, T, the length of the interval of observation of the output, has remained fixed. We now look at how the properties of the special classes of systems introduced in this Section are affected by changes in T. When T is changed, so is the norm on the input space, which is always 44

(P) (P) assumed to be a subset of X P. In fact ||UlT, w < T. T+d T' T However, as has been pointed out earlier, the membership of U does not depend on the value of T, nor does the topology on Z-, nor do the properties of T-compactness and shift invariance. Similar statements can be made for if CZ= if. If y =, then not even the norm on / is changed. In any event, the class of mappings; ( (1, ) is not affected. Proposition 2.21 If J is a linearly predictable class of systems with respect to T', then it is also linearly predictable with respect to any T > T' Proof: Suppose the linear mapping Trt(T) given by t (T)F = (P - P) L F Rt (P -P t T t t T -d is singular. Then for some F 0, rt (T)F = 0; and for T' S T (PT' - Po) [(PT - P ) Lt F R (PT - P)u] 0 o T o t t T-d for all u E. Since (PT, - P d) u e d for all u E, (PT' P) LtF Rt (PT' -Pd) u = O for all uE U. Hence irt(T') is singular, and the assertion is proved by contradiction. Proposition 2.22 If r is a class of systems with the property that 0;-C _d (, 9 ), and if o is stationarily predictable with respect to T', then it is stationarily predictable with respect to any T > T'. Stationary predictability can be replaced simultaneously in hypothesis and conclusion by future-time stationary predictability. Proof: Suppose to start with that T' < T < 2T. We need to show that the 45

condition Tr (T)F = rT (T)G, where F, GE it C ) implies s ' d'( ).implies rt+ (T)F = Tr (T) G for all a. We note that the condition at (T) F = (T) G t+a s+a t s can be written (P - PO) (L FR - L GRs) (P - P) u= 0 T o t t s s T -d for all uE L. Since (PT - P d) u E X for all uE L, it follows that (PT- P) (Lt FRt - L GR) (P- Pd) u 0 for alluE u.;i.e., tr (T')F = i (T')G. By hypothesis, it follows that t S t+a (T')F = ws (T')G, or, t+a ' s+a (PT- P ) L (LtFRt - L GRs) R (PT - Pd) (15) for all uE L_, and any real number a. Now (PzT - PT)L (LtFRt - L GRS) Ra (P - PT-) ZT' Ts'a t t (P2TIs s Z T'-d = (P p )L (LtFR L GR )R uP )R 0 E-TI (PT' - ^Po) LT. (F - ts s a+T' (PT-' -d -T' for allu E tC, since R, (u)E V for all u E U, and we can replace the a of equation (15) by a + T'. Since the mappings F and G are causal with bounded memory (d), (PZT' - P ) L (LtFRt - L GR )R (P - P )u 2T' - PT' )L (Lt FRt- Ls GR)Ra (P2T PT-d u + (P - P ) L (FR - L GR )Ra (PT - P u o a t t s sd which equals zero by the calculations above. It follows then by a now familiar argument that 46

(P - P ) L (LtFR - L GR )R (P -P )u= 0 T a t t s s a T -d for all uE, which is what needs to be shown. The extension to arbitrary T > T' follows by induction. The proof for future-time stationary predictability is the same with a restricted to be > 0. Proposition 2.23 If,o is a linear dynamical class of systems with respect to T', and if it further has the property that / C )d (0U ) then is a dynamical class with respect to any T T'. Proof: In view of the preceding Proposition, all that needs to be proved is that if 0T (T) is a bounded operator for all T ' 0, then 0 (T) is a bounded operator for all T > 0 whenever T > T'. The meaning of the subscripts T and T' on O(T) is obvious. In what follows it is necessary to go back and forth between norms in [T and in nTp so a subscript T or T' is used. The facts that, by an obvious identification of elements, /T' can be thought of as a subset of /T' T' < T, and that then IIYIIT' = IyIjIT when yE ET' are used without comment. Again assume to start with that T < 2T'. We have, II T (T)FIIT sup II(P - P) ( Ltn+T R+T) n=l (PT - Pd)UllT T 0, FE X, for some F E C d ( d, i ), and some scalars a. Now n T (T F n N || )| s sup ||TP 1 ( Tn +T n +T_ ~U 'n=l 4 47

* (PT P ) u T + suP (P P) a L F R +) ~T P-d TU"T T'n t +T n t +T ^=1 n nn=l n * (PT- Pd)ullT = sup IIA(u)IIT + sup IIB(u) IT (16) where the A and B are defined implicitly. Because the F 3d ( U, ), n 3 d N A =(PTP (Z LtT F R T' o ( n tn+T Fn\nt +T n o =l n tn + n=1 Since A(u) is different from zero only on [O,T'], and since OT,(r) is bounded, sup IA(u)[T sup |IA(u) II= IO (T)FII < to (T) ~ II[[FI N |T(T)| ' sup 1 (P -P ) ( tn Fn Rt u 11 ( leT' *^ * MT * (17) = I OT' ( * F 1Tt n7 to Tn n Using the fact that T - T' ' T', and also using again the fact that the F are causal with bounded memory (d) yields nN B(u)T, (T) p) (P an Lt+T Fn R )t +T T n n n T n=l (PT - PT-T'-d)ulIT = I|IC(U)IIT where C is defined implicitly. 48

Now, L C R T T-T' T-T' N = T' 0) a n T-T'+t +T n T-T'+t +T) T' n=l n so, by the fact that 0T, (T - T' + T) is a bounded operator, T' sup IlL CR_(u)|J < 0,(T - T' + T) P T-T' C T- T' (u) T' I T (T-T +) N sup I(P - P) a L F R) T -(P P gn T' - -dn nT' n=1 n n- 1 = |T, (T - T' + T)1 I|IFIIT, But, sup LTT, C RTT, (u) T, sup IC(u)l Thus, sup |IB(u)IlT < sup IIC(u) lT | T, (T - T' + T)| * |IFIIT T',,(T - T' + T)| IFIIT (18) Combining the inequalities (16), (17) and (18) yields (T) FII < (I' (T)| + IT' (T - T' + T)). IIFIIT T T T T for all F e +, which establishes the result when T < TT'. This can be extended to all T > T' by induction. t If now 4 is a class of systems with 3SC od (, ) and is o d d dynamical with respect to some T' > 0, one can put T equal to the infimum 0 of all such T' and know that - is dynamical with respect to any T > T. It is to be noted that the hypothesis that 2/ C ~,d ( "U,/ ) cannot be dropped in this assertion. In fact, it is not very difficult to give an example 49

where Proposition 2.22 is violated if the mappings F are not causal with bounded memory (d); thus the semigroup property is not preserved. If J is a linear dynamical class with respect to T and with o 2X C 9d (, rj ), then it is clearly possible to deal with the discrete parameter semigroup {n = 0 (nT)}, n = 0, 1, 2,, and still completely describe the future of the system by virtue of the interpolation formula (9). Under certain conditions when = ( g, g,,f,C ) is a linear dynamical class, the discrete parameter semigroup {0 } can be used to induce a "corresponding" semigroup {0n } of linear operators on the linear space spanned by the system parameter space X1 of an E-representation of. We describe a situation in which this can be done and construct the. The construction is not unique, as will be seen, but any {n } so devised approximates {0n} in the sense to be indicated. Let it be assumed that 9 is T;. Write = ( g - T n n [/nT' _ LT)' n = 0,, 2,..., for the classes of truncated systems, where 9nT is the set of all 1T F, FE E. By the assumption on nT nT ' T- L 2 e Since o is a linear dynamical class, T = 9(nT) o = 00 0nr-. Let it further be required that U VnT is a compact subset n=0 nT of r( T' U T)' and for convenience denote nU i0 by T /'T n= nT T Then each An is a subclass of = ( VT g, S T T) Since T and ST are compact and T = Lz, 2 has a standard E-representation (, 0 ), i = ( ' fi, ', T) as given by Proposition 1.7 of Part 1, and 0l is linear. -1 = 01 "T is a subset of a finite-dimensional Euclidean space; let R be the Euclidean space 50

generated by i. The representation mapping 0l as given by Proposition 1.7 is actually defined as a continuous linear map from the closed linear span of ST onto R. Obviously the closed linear span of JT is contained in +, the domain of the O(T). Let {bi, *, bK} be elements of j1 which form a basis for R, and denote the coordinate functionals {b,*, *, b }, so that any element x E R can be written 'KK K x = E b* (x) b. i=l The idea of the construction of ' is that 0 should be the composition of the mappings 4i, 0, 0 in that order. However, this will not quite do, because @(x), xE -t,is not necessarily contained in 6T' and hence is not necessarily in 2, the domain of 0(T). To correct this, we construct a linear mapping 4i which does satisfy the condition @(x) E T 9 x e and which is close to i. Consider the continuous linear functionals on the closed linear span of &T given by b.*o 0, i = 1, *, K. Let E be the null space of b. o 0. First choose an element HI belonging to S that T does not belong to 6; this is possible by the definitions of J1 and b. Then bl* o C (H1) = al i 0. Next, choose Hz E CT' not in 62 and linearly independent of H1. This can be done by virtue of the linear independence of the b., and yields b zo 0 (HZ) = cz 0. Continue this procedure to obtain a linearly independent set {Hi, * *, HK} HiE T satisfying b*o 0 (H) = JK 1 T I a. 0. Define another basis for R with elements in Xi by K C = E [bio S(Hj)] b. i= J 51

since each 0 (H.) E ~ I, it is clear that the c. do belong to Define 0, a linear mapping from the linear span of {Hi,, H onto R by K 0 (H.) = ci i= 1, **., K, and extending linearly. 0 is 1:1, so we can K define - = 1 a linear mapping from R onto the linear span of {H1,,H}. If H belongs to the linear span of {H1, * * *, HK} and also belongs to ~T' then we have /K \ K 0 (H)=0 Yi Hi) =E YiCE i 1t 1=1 i=l thus i carries any element in Xi into 7T. As was already mentioned, 4 is not uniquely defined, except in certain cases of finite-dimensional AL T' since the choice of H, * *, HK is not unique and the resulting linear space K spanned by them is not unique. It now follows that if H E T. then | H - o 40 (H) | 2. In fact, since ( 61, 0l) is an E-representation of -, 1H - 1 o 01 (H)|| < E. But 4 o 01.(H) is an element of ST and it has the same representing element as H, i.e., 0 o o 01 (H)= 0i (H). Hence, || b o 01 (H) - 1, o 01 (H) | = | [ (H)] - o o [ o (H)] - o o< E, from which the assertion follows. K K Proposition 2.24 The mapping 0 from R into R given by 0 = 0i o 0 o is well-defined and linear. If H E, then o T lie n H - ' o o 01 (Ho)l < 2E [1 + lel +... + le"n] (19) where lel denotes the norm of 0 = 0(T). 52

Proof: It has already been ascertained that the range of 4 is contained in the linear span of CT, which in turn is contained in N+. So 0 o is defined. By definition, AT is invariant with respect to 0; since 0 is linear, the linear span of ST is carried into itself by 0. Thus the range of 9 o q is contained in the domain of 01, and 01 o 0 o i is defined as a linear K transformation from R into itself. If H E T H = L o01 (H) E T' and |H H 1 < 2E, as already shown. Then O H E ||le -o 0H I< 1 |H1 - IHo Ho<Z t0 and l|0 H - to 1 0o o L o 01(H)|| < ZE + 2 | 0e The inequality (19) follows by induction. Only linear predictability and associated ideas have been considered in this Section. However, it probably should be noted, although the fact is obvious, that a class of systems could be described as predictable in a wider sense. Indeed, if {T }, n = 1,2, *., is any sequence of mappings n from 5j~ T ( UT, i) into T ( T' T) so that the images under these mappings satisfy the conditions of Proposition 2.15, then the class is "predictable" in an obvious sense. 53

V. REMARKS

It will be noticed that, for what has been labeled a linear dynamical class of systems, a structure has been described that is analogous to the usual state-variable formulation of a linear system. In fact, we can write either

$$F_t = \Theta(t) F_0, \qquad y_t = F_t u_t$$

or

$$F_{nT} = \Theta(T) F_{(n-1)T}, \qquad y_{nT} = F_{nT} u_{nT}$$

where $u_t = (P_{t+T} - P_{t-d}) u$, $y_t = (P_{t+T} - P_t) y$. The first equation in either case corresponds to the state equation for a linear, time-invariant unforced system, and the second to a time-varying observation equation; this is actually a linear observation equation, since $F_t(u_t)$ for fixed $u_t$ defines a linear mapping from $\mathcal{C}(\mathcal{U}_T,\mathcal{Y}_T)$ into $\mathcal{Y}_T$. It follows that the identification problem, when there is noise added, is thereby analogous to the problem of estimating state in a linear system when there is additive noise. A study of identification of $F$ along the lines of this analogy will be made in a future report.

A practical difficulty is, of course, that in modelling many real problems involving rapid time variation the transformations $\Theta(\tau)$ cannot be known; but this is simply to say that a rapidly time-varying system is not identifiable if there is no information about the future time variation.

The characterization of system trajectories in terms of strongly continuous semigroups of linear operators obviously suggests the application of some of the elaborate theory of such semigroups to further study of the structure of these classes of systems, but this is a matter for future work.

APPENDIX

PROJECTIONS ON PAST AND FUTURE

The projections $P_t$ used in this paper are defined by

$$[P_t f](s) = \begin{cases} f(s), & s \le t \\ 0, & s > t \end{cases} \tag{A1}$$

where $f$ is a function on $R^1$. This definition is still meaningful if $f$ is an element of a space whose elements are equivalence classes of functions equal a.e. (Lebesgue), for then it is applied to each representative of the equivalence class. Most of the operations involving these projections are intuitively clear from the definition. Here and there, however, one may want a formal proof of an identity involving these projections. If one is going to the trouble of providing such proofs, the properties that are used might as well be axiomatized, particularly since this does not involve much effort. Generalizations are then at least possible.

There is nothing new in thus generalizing the notions of past and future, of course; see, e.g., [3], [4], and [5]. However, it is not the intent of this paper to pursue any notion of generalized time, so we do not build on the theory established in the references cited, but merely develop some simple results ad hoc. These results are more than sufficient for what is needed here.

For the remainder of this Appendix, the operators $P_t$ are not to be taken as defined in Section I unless such an interpretation is specifically

indicated, but are to be considered abstractly as operators belonging to a family according to the following definition.

Definition A1. Let $\mathcal{X}$ be a linear space. Let $\{P_t\}$, $-\infty \le t \le \infty$, be a parametrized family of operators on $\mathcal{X}$ (that is, mappings from $\mathcal{X}$ into $\mathcal{X}$) such that the following conditions are satisfied:

1) $P_{-\infty} = 0$ (the zero operator); $P_{+\infty} = I$ (the identity operator)
2) $P_t P_s = P_s P_t$ for all $t, s$
3) If $t \le s$, $P_t P_s = P_t$
4) $P_t$ is linear on $\mathcal{X}$
5) If $(P_b - P_a) y = (P_b - P_a) z$ for arbitrarily large positive numbers $b$ and arbitrarily large negative numbers $a$, where $y$ and $z$ are elements of $\mathcal{X}$, then $y = z$.

Then $\{P_t\}$ will be called a family of generalized-time projections on $\mathcal{X}$ (g.t. projections).

Proposition A1. Let $\mathcal{X}$ be any r-0 P space, or any $L^p(R^1)$ space, or A, or any closed linear subspace of one of these. Let $\{P_t\}$ be the family of projection operators defined by equation (1), or the extension of (1) to equivalence classes of functions. Then $\{P_t\}$ is a family of g.t. projections on the space in question.

Proof: Obvious verifications.

The projection property (P) as defined in Section I is still a meaningful concept when applied to g.t. projections on a subset of $\mathcal{X}$. Let $\mathcal{X}_1$ and $\mathcal{X}_2$ be linear spaces with families of g.t. projections $\{P_t\}$ and $\{Q_t\}$,

respectively. Let $\mathcal{U}$ be a subset of $\mathcal{X}_1$ with property (P), and let $F$ be a mapping from $\mathcal{U}$ into $\mathcal{X}_2$. As in the special case, $F$ is said to be causal if

$$Q_t F(u) = Q_t F P_t (u)$$

for all $t$ and all $u \in \mathcal{U}$; $F$ has bounded memory (d) if

$$(Q_\infty - Q_t) F(u) = (Q_\infty - Q_t) F (P_\infty - P_{t-d})(u)$$

for all $t$ and all $u \in \mathcal{U}$.

Proposition A2 (Proposition 2.7). If $F$ is a mapping from $\mathcal{U}$ into $\mathcal{X}_2$ that is causal and has bounded memory (d), then for every $T > 0$,

$$(Q_{t+T} - Q_t) F(u) = (Q_{t+T} - Q_t) F (P_{t+T} - P_{t-d})(u) \tag{A4}$$

for all $t$ and all $u \in \mathcal{U}$. Conversely, if equation (A4) is satisfied for some $T > 0$ and all $t$ and all $u \in \mathcal{U}$, then $F$ is causal and has bounded memory (d).

Proof: We prove first that causality and bounded memory (d) imply the property (A4). For any $u \in \mathcal{U}$, any real number $t$, and any $T > 0$,

$$\begin{aligned}
(Q_{t+T} - Q_t) F u &= (Q_{t+T} - Q_t) Q_{t+T} F u \\
&= (Q_{t+T} - Q_t) Q_{t+T} F P_{t+T} u \\
&= (Q_{t+T} - Q_t) F P_{t+T} u \\
&= Q_{t+T} (Q_\infty - Q_t) F (P_{t+T} u) \\
&= Q_{t+T} (Q_\infty - Q_t) F \big( (P_\infty - P_{t-d}) P_{t+T} u \big) \\
&= (Q_{t+T} - Q_t) F (P_{t+T} - P_{t-d}) u .
\end{aligned}$$

Only conditions 1), 2), 3) and 4) of Definition A1 and the properties of causality and bounded memory have been used.
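The first half of Proposition A2 can be checked numerically in a discrete-time toy setting. The following is a minimal sketch under assumptions that are not in the report: signals are sampled on a finite integer window `TIMES` standing in for the real line, the memory length `D` is chosen arbitrarily, and `system` and `anticipating` are hypothetical example mappings. The functions `proj` and `diff` play the roles of the ordinary truncations $P_t$ and the differences $P_b - P_a$.

```python
import math

TIMES = range(-20, 21)   # finite window standing in for the real line
D = 3                    # memory length d (an assumed value)

def diff(a, b, f):
    """(P_b - P_a) f: keep f(s) for a < s <= b, zero elsewhere."""
    return {s: (f[s] if a < s <= b else 0.0) for s in TIMES}

def proj(t, f):
    """P_t f: keep f(s) for s <= t, zero for s > t."""
    return diff(-math.inf, t, f)

def system(u):
    """Hypothetical nonlinear, time-varying map: y(s) uses u(s), ..., u(s-D+1)."""
    return {s: (1.0 + 0.05 * s) * math.tanh(sum(u.get(s - j, 0.0) for j in range(D)))
            for s in TIMES}

def anticipating(u):
    """Hypothetical non-causal map: y(s) = u(s+1)."""
    return {s: u.get(s + 1, 0.0) for s in TIMES}

def is_causal(F, u, t):
    """Check Q_t F(u) == Q_t F(P_t u) for one input and one t."""
    return proj(t, F(u)) == proj(t, F(proj(t, u)))

def has_bounded_memory(F, u, t, d=D):
    """Check (Q_inf - Q_t) F(u) == (Q_inf - Q_t) F((P_inf - P_{t-d}) u)."""
    return diff(t, math.inf, F(u)) == diff(t, math.inf, F(diff(t - d, math.inf, u)))

def satisfies_a4(F, u, t, T, d=D):
    """Check equation (A4) for one input, one t and one T."""
    lhs = diff(t, t + T, F(u))
    rhs = diff(t, t + T, F(diff(t - d, t + T, u)))
    return lhs == rhs

# An arbitrary test input on the window.
u = {s: math.sin(0.7 * s) + 0.3 for s in TIMES}
```

For the input `u` above, `is_causal(system, u, t)`, `has_bounded_memory(system, u, t)` and `satisfies_a4(system, u, t, T)` hold for the values of `t` and `T` tried, while `is_causal(anticipating, u, 0)` fails. A check on one input is of course only a spot test, not a proof of the property for all inputs.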

Now suppose that (A4) is satisfied. We prove causality. Let $b > t$ be positive and $a < t$ be negative. Then,

$$(Q_b - Q_a) Q_t F u = (Q_b - Q_a) \sum_{k=0}^{K} (Q_{t-kT} - Q_{t-(k+1)T}) F u$$

for any $K$ such that $t - (K+1)T < a$. By (A4) this is equal to

$$\begin{aligned}
&(Q_b - Q_a) \sum_{k=0}^{K} (Q_{t-kT} - Q_{t-(k+1)T}) F (P_{t-kT} - P_{t-(k+1)T-d}) u \\
&\quad = (Q_b - Q_a) \sum_{k=0}^{K} (Q_{t-kT} - Q_{t-(k+1)T}) F (P_{t-kT} - P_{t-(k+1)T-d}) P_t u \\
&\quad = (Q_b - Q_a) \sum_{k=0}^{K} (Q_{t-kT} - Q_{t-(k+1)T}) F P_t u \\
&\quad = (Q_b - Q_a) Q_t F P_t u .
\end{aligned}$$

Hence, by condition 5) of the definition, $Q_t F u = Q_t F P_t u$. The proof that $F$ has bounded memory is completely analogous.

Let $\{z_k\}$ be an arbitrary sequence of elements belonging to $\mathcal{X}$, and let $\{\Delta_k = P_{t_k} - P_{t_{k-1}}\}$ be a sequence of differences of g.t. projections, where the $\{t_k\}$, $k = \ldots, -2, -1, 0, 1, 2, \ldots$, satisfy $t_k < t_{k+1}$, $\lim_{k \to \infty} t_k = \infty$, and $\lim_{k \to -\infty} t_k = -\infty$. In Section III infinite sums of the form

$$\sum_{k=-\infty}^{\infty} \Delta_k z_k$$

are used. These sums have no meaning as far as the structure given by Definition A1 is concerned, and some further condition is necessary. It is sufficient to require:

6) Corresponding to every $\{z_k\}$, $z_k \in \mathcal{X}$, and $\{\Delta_k\}$, $k = \ldots, -1, 0, 1, 2, \ldots$, where the $\Delta_k$ are as defined above, there exists a $z \in \mathcal{X}$ with the property

$$(P_b - P_a) z = (P_b - P_a) \sum_{k=-K_1(a)}^{K_2(b)} \Delta_k z_k$$

for all $b > a$, where $K_1$ and $K_2$ are any integers large enough that the interval $(a, b]$ is contained in the interval $(t_{-K_1}, t_{K_2}]$.

If condition 6) holds, then, for example,

$$z = \sum_{k=-\infty}^{\infty} \Delta_k z_k \qquad \text{and} \qquad P_t z = \sum_{k=0}^{\infty} (P_{t-kT} - P_{t-(k+1)T}) z .$$

Also, expressions of the kind

$$\sum_{k=-\infty}^{\infty} (Q_{t-kT} - Q_{t-(k+1)T}) F_k (P_{t-kT} - P_{t-(k+1)T-d}) u$$

are defined, where each $F_k$ is a mapping from $\mathcal{U}$ into $\mathcal{X}_2$ as above.

It is clear that if $\mathcal{X}$ is any O T space, or 0 (but not, of course, $L^p$), and the $\{P_t\}$ are ordinary time projections, then condition 6) holds.
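For ordinary time truncations, the axioms of Definition A1 and the gluing property in condition 6) can be illustrated on sampled signals. The sketch below makes assumptions that are not in the report: a finite integer window `TIMES` replaces the real line (so all sums are finite), and the breakpoints and pieces in the usage note are arbitrary. The helper `glue` forms the sum of the pieces $\Delta_k z_k$, and `telescope` forms the sum of $(P_{t-kT} - P_{t-(k+1)T}) z$ over $k \ge 0$.

```python
TIMES = range(-12, 13)          # finite stand-in for the real line
ZERO = {s: 0.0 for s in TIMES}  # the zero signal

def proj(t, f):
    """P_t f: keep f(s) for s <= t, zero for s > t."""
    return {s: (f[s] if s <= t else 0.0) for s in TIMES}

def delta(a, b, f):
    """(P_b - P_a) f: keep f(s) for a < s <= b, zero elsewhere."""
    return {s: (f[s] if a < s <= b else 0.0) for s in TIMES}

def add(f, g):
    """Pointwise sum of two signals on the window."""
    return {s: f[s] + g[s] for s in TIMES}

def glue(breaks, pieces):
    """Sum of Delta_k z_k over consecutive breakpoints t_{k-1} < t_k."""
    z = dict(ZERO)
    for (a, b), zk in zip(zip(breaks, breaks[1:]), pieces):
        z = add(z, delta(a, b, zk))
    return z

def telescope(t, T, z):
    """Sum over k >= 0 of (P_{t-kT} - P_{t-(k+1)T}) z, until the window is exhausted."""
    total, k = dict(ZERO), 0
    while t - k * T >= TIMES[0]:
        total = add(total, delta(t - (k + 1) * T, t - k * T, z))
        k += 1
    return total
```

As a usage example, with `breaks = [-13, -5, 0, 4, 12]` and four arbitrary signals as `pieces`, the glued signal `z = glue(breaks, pieces)` agrees with the k-th piece on each interval, that is, `delta(a, b, z) == delta(a, b, zk)`, and `telescope(t, T, z) == proj(t, z)` for any `t` in the window and integer `T > 0`. On a finite window this is immediate; the content of condition 6) is that such a `z` exists in the space when infinitely many pieces are glued.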

REFERENCES

1. Root, W. L. Approximate representations of causal systems with bounded memory. In Techniques of Optimization, A. V. Balakrishnan, ed., Academic Press, pp. 51-64 (1972). (Proceedings of the Fourth IFIP Colloquium on Optimization Techniques, Los Angeles, 1971.)
2. Root, W. L. On the modelling of systems for identification. Part 1. ε-Representations of classes of systems. Technical Report, Univ. of Michigan, 1972.
3. Porter, W. A. Some circuit theory concepts revisited. Internat. J. of Control 12 (1970), pp. 433-488.
4. DeSantis, R. M. and Porter, W. A. On time-related properties of nonlinear systems. SIAM J. Appl. Math., Vol. 24, No. 2 (1972), pp. 188-206.
5. Saeks, R. Resolution Space, Operators and Systems. Lecture Notes in Economics and Mathematical Systems, No. 82. Springer-Verlag.