THE UNIVERSITY OF MICHIGAN INDUSTRY PROGRAM OF THE COLLEGE OF ENGINEERING

LINEAR SYSTEM APPROXIMATION BY MEAN SQUARE ERROR MINIMIZATION IN THE TIME DOMAIN

Elmer G. Gilbert

This dissertation was submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the University of Michigan, 1956.

January, 1957
IP-201


ACKNOWLEDGEMENTS The author wishes to express his special gratitude to the Cochairmen of his committee, Professors L. L. Rauch and R. M. Howe, for their interest, assistance, and supervision of the work. Thanks also go to the members of the committee, Professors W. Kaplan, A. B. Macnee, N. R. Scott, and C. B. Sharp, and to Dr. J. Otterman for their helpful suggestions. The encouragement and contributions given by E. O. Gilbert throughout the course of the work are especially appreciated. Last of all, the author is indebted to the Industry Program of the College of Engineering for the reproduction of the dissertation.

TABLE OF CONTENTS

LIST OF TABLES ... vii
LIST OF FIGURES ... ix
LIST OF SYMBOLS ... xii

CHAPTER I. INTRODUCTION ... 1

PART ONE - BASIC THEORY

CHAPTER II. NOTATION AND PRELIMINARY ASSUMPTIONS ... 7
    Linear System Theory ... 7
    The Approximation Problem ... 8
    Physical and Mathematical Limitations ... 12
    Preliminary Simplifications of the Prescribed Response ... 15

CHAPTER III. THE WEIGHTED MEAN SQUARE ERROR CRITERION ... 17
    The WME Criterion ... 17
    Frequency Domain Errors ... 20
    System Function Approximation Errors ... 22

CHAPTER IV. THE MINIMIZATION PROBLEM ... 26
    The Unconstrained Approximation ... 26
    The Constrained Approximation ... 31
    Approximation by Constrained Functions ... 39

CHAPTER V. ORTHONORMAL FUNCTIONS ... 43
    Orthogonalization of Arbitrary Functions ... 43
    Orthogonalization of Exponential Functions ... 46

CHAPTER VI. CHOICE OF APPROXIMATING FUNCTIONS ... 59
    Optimum Choice of Approximating Functions ... 59
    Approximate Choice of Approximating Functions ... 61

PART TWO - APPLICATIONS

CHAPTER VII. IMPULSE RESPONSE APPROXIMATION ... 69
    Laplace Transform Approximation Procedures ... 69
    Analog Computer Approximation Procedures ... 76
    Examples ... 89

CHAPTER VIII. THE ARBITRARY INPUT PROBLEM ... 108
    The Equivalent Exponential Input Problem ... 109
    Computer Evaluation of Coefficient Integrals ... 112
    Experimental Determination of System Characteristics ... 116
    Correlation Function Computation ... 123

CHAPTER IX. CONCLUSION ... 127

APPENDIX. INVERSE MATRIX TABLES ... 131

BIBLIOGRAPHY ... 142

LIST OF TABLES Table Page I Example Information................ 90 II Approximation Data................ 92 III Approximation Coefficients............ 93 vii

LIST OF FIGURES

Figure                                                              Page
2.1  Linear System Notation ... 7
2.2  Approximation Problem Notation ... 9
2.3  Block Diagram Representation of Series Expansion of H*(s) ... 11
3.1  Vector Diagram of Fo, Fo*, and Fo - Fo* ... 21
3.2  Equivalent Block Diagram of Approximation Problem with Input Replaced by Fi(jω) and uo(t) ... 24
3.3  Equivalent Block Diagram of Approximation Problem Showing the Weighting of eh by Fi(jω) ... 24
5.1  The Co Contour in the s Plane ... 49
7.1  Physical System for Computation of the Integral dn ... 77
7.2  Analog Computer Circuits for Non-orthogonal Functions ... 79
7.3  Cascade Connection of All-Pass Elements for Realization of Orthonormal Functions ... 80
7.4  Analog Computer Circuits for Orthonormal Functions ... 80
7.5  Computer Block Diagram for Pole Optimization ... 82
7.6  Physical System for Computation of Exponentially Weighted Integral ... 83
7.7  Circuit Modifications for Changing s into (s - sk) ... 84
7.8  Unconstrained Analytic Approximation of Fourth Power Pulse ... 94
7.9  Constrained Function Approximation of Fourth Power Pulse ... 95
7.10 Constrained Approximation of Fourth Power Pulse ... 96
7.11 Unconstrained Computer Approximation of Fourth Power Pulse ... 97
7.12 Unconstrained Approximation of Delayed Pulse ... 98
7.13 Orthonormal Function Approximation of Trapezoid Pulse ... 99
7.14 Weighted and Unweighted Approximation of Trapezoid Pulse ... 100
7.15 Orthonormal Function Approximation of Matched Filter Response ... 101
7.16 Signal Source for Detection Problem ... 106
8.1  Prescribed Input and Response for Example ... 111
8.2  Diagrams for Direct Computation of dn and cnm ... 113
8.3  Diagrams for Computation of dn and cnm from Correlation Functions ... 115
8.4  Diagram for Computation of cnm from Correlation Function ... 115
8.5  Diagram for Experimental Determination of dn ... 118
8.6  Diagram for Realization of Single Input, Multiple Output System ... 121
8.7  Optimum Filter Problem ... 122
8.8  Physical Systems for Computing Correlation Function Coefficients dn ... 124

LIST OF SYMBOLS

a - lower limit of x interval
an - nth coefficient in approximating series
a] - column of coefficients a1, ... aN
A - gain constant of system function H*(s)
An - coefficients in partial fraction expansion of H*(s)
b - upper limit of x interval
bn - nth coefficient in constrained approximating series
b] - column of coefficients b1, ... bN
Bm - coefficients in differential equation for Prony method
cnm - constant defined by weighted integral of θn(t)θm(t)
c^-1nm - elements of matrix C^-1
C - N by N matrix with elements cnm
C^-1 - inverse C matrix
dn - constant defined by weighted integral of fo(t)θn(t)
d] - column of coefficients d1, ... dN
Δdn - computer error in dn
Δd] - column of coefficients Δd1, ... ΔdN
e(t), E(s) - error between fo(t) and fo*(t)
e*(t) - approximation of e(t)
e*max - estimate of peak approximation error
eh(t), Eh(s) - error between h(t) and h*(t)
ep(t) - Prony method error
E - relative mean square error in d1, ... dN
f(t) - solution of Prony method differential equation
fi(t), Fi(s) - prescribed input
fo(t), Fo(s) - prescribed response
f1(t) - first derivative of fo(t)
fo(M)(t) - Mth derivative of fo(t)
fo*(t), Fo*(s) - approximation of prescribed response
fkj - weighted integral of fo(k)(t)fo(j)(t)
g(t) - x transformation function; function used in computer approximation
g] - column used in derivation of Ic
h(t), H(s) - prescribed impulse response, prescribed system function
h*(t), H*(s) - approximation of prescribed impulse response and system function
I - identity matrix
I - weighted mean square error
ΔI - increase in I as a result of Δd]
Ic - weighted mean square error for constrained approximation
Imax - weighted integral of fo²(t)
IM, IP, IR, II - mean square error in the magnitude, angle, real part, and imaginary part of E(jω)
Io - weighted mean square error for Prony method
Irel - relative weighted mean square error I/Imax
ΔIrel - increase in Irel as a result of Δd]
j - the imaginary unit
k, kq - value of constraint condition
k1, kN - smallest and largest eigenvalues of C^-1
kqn - Kq{θn}
k] - column of values k1, ... kQ
K - Q by N matrix with elements kqn
Kn - normalization constant for θn
KT - transposed K matrix
K{ }, Kq{ } - operator which measures constraint condition
Ln(x) - Laguerre polynomial of order n
M - number of poles in H*(s)
n(t) - noise signal in optimum filter problem
ni(t) - undesired component of input
no(t) - response component dependent on ni(t)
N - number of approximating functions
P - number of zeros in H*(s)
Pn(x) - functions orthogonal in (a, b)
Q - number of constraints
r - sensitivity of ΔIrel to E
rnm - elements of the matrix R
R - matrix which defines one set of approximating functions in terms of another set of approximating functions
s - complex variable of Laplace transform
sn - pole of H*(s)
sp - zero of H*(s)
Sk - exponential constant in weight factor series
sk - exponential constant in predistortion series
snm - pole in nth approximating function
s'nm - zero in nth approximating function
S(t) - signal in detection problem
t - time variable
T - constant used in time scaling or translation of time function
T1, T2 - limits on time interval of approximation
T1', T2' - interval for determination of e*max
u0 - unit impulse
u1, u2, ... - derivatives of unit impulse
u-1, u-2, ... - integrals of unit impulse
Wk - coefficients in exponential weight factor series
wk - coefficients in predistortion series for W^(-1/2)(t)
W(t) - weight factor
Wo(t) - weight factor for Prony method
Wp(x) - weight factor for functions Pn(x)
Wm^-1 - maximum value of W^-1(t)
x - argument of orthogonal function Pn(x)
x(t), X(s) - response of computer circuits
y(t), Y(s) - input to computer circuits
αn - negative real part of nth pole
βn - distance of nth pole from origin
θn(t), Θn(s) - nth approximating function of fo(t)
λn - Lagrange multiplier
Λn(s) - all-pass function with poles of the nth orthonormal function
- exponential constant of weight factor
π - 3.1416...
σ - real part of s
σn - real part of sn
τ - variable of integration; argument of correlation functions
φn(t), Φn(s) - nth approximating function of h(t)
ψii(τ) - auto-correlation function of fi(t)
ψoi(τ) - cross-correlation function of fo(t) and fi(t)
ψfg(τ), Ψfg(ω) - cross-correlation function of f(t) and g(t)
ω - imaginary part of s
ωn - imaginary part of sn

I. INTRODUCTION

The synthesis of linear systems to have prescribed transient response has become increasingly important in recent years. Present applications in automatic control, electronic computation, data transmission, noise filtering, and measurement of linear-system characteristics are concerned with input functions which are not periodic in time. It is therefore understandable that synthesis procedures which are stated and carried out in terms of time response, like the ones presented here, are of considerable interest.

The synthesis problem is rarely solved without error. Limitations such as those imposed by noise and linear-system physical realizability assure this. Consequently, synthesis really involves the solution of two problems: (1) the approximation problem, the determination of an approximate system function which matches closely a prescribed system function; (2) the realization problem, the construction of a physical linear system which possesses the approximate system function. This investigation is aimed primarily at solving the first problem. The approximate system function is expanded in a series of realizable approximating functions, and the coefficients in this series are chosen to minimize a time-weighted average of the squared response error. The theoretical development and practical application of this weighted mean square error approximation method are the purpose of this dissertation.

Before discussing the advances made in mean square approximation it is relevant to review briefly the present state of linear-system approximation for prescribed transient response. The greater part of past work is recent and has been restricted primarily to the determination of system functions suitable for realization in the form of finite, lumped-element electrical networks with fixed parameters.1 A number of useful methods2 have been proposed which produce time-response errors tending toward zero with increased network complexity. No optimum is obtained, however, in the sense that a measure of the error is truly minimized. Furthermore, the procedures are only practicable when the prescribed input is the impulse function and the prescribed response is an analytic expression.

The mean square error approach has been investigated by relatively few workers [1,2,17,18,22,31]. By far the most general and complete treatment is the report by Kautz [17], which is devoted primarily to the approximation of impulse responses with known Laplace transforms. Certain fairly broad classes of exponential time functions, orthonormal in the semi-infinite interval, are developed and employed in an approximating series. Although Kautz and others mention the approximation problem when the input function is arbitrary, they do not present wholly satisfactory solutions.

Thus, present methods of approximation for prescribed transient response are limited mainly to the following areas: (1) approximation by systems which are finite, lumped, and fixed in time; (2) approximation of impulse responses expressible in analytic or Laplace transform form; (3) mean square approximation by certain classes of orthonormal exponential functions. It is felt that the contributions of this investigation largely overcome these limitations. A summary of the more important results includes: (1) a general theory of constrained and unconstrained mean square approximation by linearly independent but not necessarily orthogonal functions; (2) a practical solution of the arbitrary input problem; (3) procedures for generating wider classes of orthonormal approximating functions, especially orthonormal exponential functions; (4) the application of analog computer techniques to the mean square error approximation problem and the realization problem; (5) methods for experimentally measuring linear-system characteristics, processing experimental data, and experimentally synthesizing optimum filters.

In order to best develop these results the text has been divided into two parts. The first part discusses the basic theory involved in the weighted mean square approximation of time-invariant linear systems. It includes chapters on notation and preliminary assumptions, the meaning of the weighted mean square error criterion, constrained and unconstrained approximation by linearly independent approximating functions, the orthogonalization of linearly independent functions with special emphasis on complex exponential functions, and the choice of suitable approximating functions. In the second part, the theory of part one is applied to practical synthesis problems. In this part there are two chapters, the first on the impulse response approximation problem, and the second on the arbitrary input problem and the handling of experimental data. An appendix tabulates various families of exponential approximating functions and inverse matrices useful in analytic approximations.

1 Winkler [41] gives an excellent review of the approximation problem as applied to electrical networks for both time and frequency response. An extensive bibliography is also included.
2 The following methods and corresponding references are noteworthy: (1) matching of time moments [15,34,37]; (2) numerical calculation by time series [3,24,25]; (3) Padé approximants and continued fraction expansions [17,26,29,36]; (4) Prony's method [6,28]; (5) rational fraction approximation along a contour in the complex plane [7,14]; (6) Fourier series [35].

PART I BASIC THEORY

II. NOTATION AND PRELIMINARY ASSUMPTIONS

A résumé of notation and preliminary assumptions is required before the detailed discussion of basic theory can begin. This chapter will consider theory pertaining to linear-system description, a precise statement of the approximation problem and the mean square error approach, physical and mathematical limitations involved, and preliminary simplifications of the prescribed response.

Linear System Theory1

The linear-system notation used is shown in Figure 2.1. The input is fi(t), and the response is fo(t).

[Figure 2.1. Linear System Notation: a linear system with impulse response h(t) and system function H(s); input fi(t), Fi(s); response fo(t), Fo(s).]

Laplace transforms of lower-case-letter time functions are given by corresponding upper-case-letter functions of the complex variable s = σ + jω. Thus

F(s) = \int_0^\infty e^{-st} f(t) \, dt .    (2.1)

In what follows all such transforms are presumed to exist. To avoid omission of the time functions for t < 0, the time origin is chosen so that the time functions are zero for t < 0.

1 A detailed discussion of linear-system theory is given by Gardner and Barnes [10].

The synthesis problem is essentially an input-response problem. It is therefore assumed that the linear system is initially at rest with the total response dependent entirely on the input. In this case,

Fo(s) = H(s) Fi(s) .    (2.2)

The system function H(s) may be defined either in terms of a transformed input and response by equation (2.2) or in terms of the transform of the impulse response h(t),

H(s) = \int_0^\infty e^{-st} h(t) \, dt .    (2.3)

The impulse response or weighting function is important because it allows the response to an arbitrary input to be expressed by means of the superposition integrals,

fo(t) = \int_{-\infty}^{\infty} fi(τ) h(t - τ) \, dτ    (2.4)

and

fo(t) = \int_{-\infty}^{\infty} fi(t - τ) h(τ) \, dτ .    (2.5)

If the linear system is realizable it must not be a predictor, and h(t) = 0 for t < 0. This causes the upper limit in equation (2.4) to become t and the lower limit in equation (2.5) to become zero. The Laplace transform equation (2.2) and the superposition integrals will be used frequently in future sections.

The Approximation Problem

The approximation problem involves a prescribed system function and an approximation to it. Problem notation is shown in Figure 2.2.

[Figure 2.2. Approximation Problem Notation: the prescribed system h(t), H(s) and the approximate system h*(t), H*(s) are driven by the same input fi(t), Fi(s); the difference between the responses fo(t), Fo(s) and fo*(t), Fo*(s) is the error e(t), E(s).]

The prescribed system function is defined in terms of a given input fi(t) and response fo(t) and is not necessarily physically realizable. In the special case where fi(t) is the unit impulse, the prescribed response fo(t) is simply the impulse response h(t). The approximate system function (note that approximate functions are distinguished from their exact counterparts by an asterisk) is subject to conditions of realizability and is chosen to make the approximation error e(t) (or E(s)) small.

Two methods of error evaluation may be considered. In a frequency domain approximation the prescribed input and response are stated in Laplace or Fourier transform notation so that a function of E(s) (or E(jω)) measures the approximation tolerance. In a time domain approximation the prescribed input and response are stated as time functions so that a function of e(t) measures the approximation tolerance. When accurate duplication of transient response is of primary importance the second procedure is more direct and allows better control of approximation error.

The mean square error approach to the time domain approximation depends on two things: (1) a series expansion of the approximate weighting function, and (2) the minimization of the weighted mean square error in time response. The series expansion includes N predetermined approximating functions and is written as

h*(t) = \sum_{n=1}^{N} a_n φ_n(t) .    (2.6)

Consequently, the approximate system function is given by the Laplace transform of equation (2.6),

H*(s) = \sum_{n=1}^{N} a_n Φ_n(s) .    (2.7)

Conditions which φn(t) and Φn(s) must satisfy if H*(s) is to be realizable are considered in the next section. Factors controlling the choice of the functions themselves are examined in Chapter VI.

The weighted mean square error is defined by

I = \int_{T_1}^{T_2} W(t) e^2(t) \, dt = \int_{T_1}^{T_2} W(t) [fo(t) - fo*(t)]^2 \, dt ,    (2.8)

where T1 ≤ t ≤ T2 is the interval of approximation and W(t) is a positive, bounded weight function.1 When I is minimized with respect to the coefficients a1, ... aN, the approximation problem is considered solved. Before I can be minimized it must be expressed in terms of the coefficients.

1 Occasionally, it may be desirable to let W(t) be unbounded. This is permissible if the limitations of the next section are satisfied.

This is possible through application of the superposition integral,

fo*(t) = \int_0^\infty fi(t - τ) h*(τ) \, dτ = \sum_{n=1}^{N} a_n \int_0^\infty fi(t - τ) φ_n(τ) \, dτ .    (2.9)

Notation is simplified by letting

θ_n(t) = \int_0^\infty fi(t - τ) φ_n(τ) \, dτ ;    (2.10)

then

fo*(t) = \sum_{n=1}^{N} a_n θ_n(t)    (2.11)

and

I = \int_{T_1}^{T_2} W(t) [fo(t) - \sum_{n=1}^{N} a_n θ_n(t)]^2 \, dt .    (2.12)

Detailed procedures for minimizing I are taken up in Chapter IV; a numerical sketch of equations (2.10)-(2.12) is given below.
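To make equations (2.10)-(2.12) concrete, here is a minimal numerical sketch (mine, not the dissertation's; the sampled functions, step size, and helper names are illustrative assumptions) that forms the functions θn by discrete convolution and evaluates I for a trial set of coefficients.

```python
import numpy as np

def theta_functions(f_i, phi, dt):
    """theta_n(t) = integral of f_i(t - tau) phi_n(tau) dtau, equation (2.10),
    approximated by a discrete convolution on a uniform time grid."""
    # phi is a list of sampled approximating functions phi_n(t), t >= 0
    return [np.convolve(f_i, p)[: len(f_i)] * dt for p in phi]

def wme(f_o, theta, a, W, dt):
    """Weighted mean square error, equation (2.12)."""
    f_o_star = sum(a_n * th for a_n, th in zip(a, theta))  # equation (2.11)
    return np.sum(W * (f_o - f_o_star) ** 2) * dt

# Illustrative use: two exponential system functions driven by a step input.
dt = 0.01
t = np.arange(0.0, 10.0, dt)
f_i = np.ones_like(t)                      # prescribed input
f_o = 1.0 - np.exp(-2.0 * t)               # prescribed response
phi = [np.exp(-t), np.exp(-3.0 * t)]       # phi_1, phi_2
theta = theta_functions(f_i, phi, dt)
I = wme(f_o, theta, [0.5, 0.5], np.ones_like(t), dt)
```

Minimizing I over the coefficients an leads to the linear equations developed in Chapter IV.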

The equations of the previous paragraph can be understood more clearly by reference to Figure 2.3.

[Figure 2.3. Block Diagram Representation of Series Expansion of H*(s): the input Fi(s) feeds N parallel paths Φ1(s), ... ΦN(s) whose outputs are weighted by a1, ... aN and summed to give Fo*(s).]

The approximate linear system is divided into N parallel paths with respective system functions Φn(s), gains an, and outputs anθn(t) which are summed to form the approximate response fo*(t). Such a representation is schematic; it does not mean that H*(s) must be realized in the same way.

Physical and Mathematical Limitations

If the approximate linear system is to be realizable it must be non-predicting and stable.1 Or equivalently, the following conditions hold: (1) h*(t) = 0 for t < 0; (2) \int_0^\infty |h*(t)| \, dt is bounded, or H*(s) is bounded and analytic for σ > 0 (s = σ + jω) and has an integrable |dH*/ds| on the entire jω axis.2 Clearly, the same two conditions must apply also to the functions of the series expansion, φn(t) and Φn(s). Condition (2) means, among other things, that H*(s) is bounded at the point at infinity; i.e., the approximate system has finite gain at infinite frequency. This excludes certain ideal devices such as perfect differentiators. Actually, most practical systems have the even greater restriction of zero gain at infinite frequency.

In the majority of useful approximation problems it is further specified that the approximate system be lumped and finite. Unless otherwise noted this assumption will be made in the following work. Mathematically, the lumped and finite condition demands that: (1) H*(s) be a real, rational function of s with a finite number of poles; (2) h*(t) be a real, finite sum of complex exponential functions with a possible impulse added at t = 0.

1 A stable system is defined as one producing bounded responses for bounded inputs.
2 These conditions are discussed more fully by James, Nichols, and Phillips [16].

-13M ((S-1)(S —2)..(ss-p) m H (s) =A = Ao+ (ss 1)(s s2) --— (-SM) m=l m (2.13) and M h*(t) = A u (t) + A m 1,mt 0 _ M 1,2 m=l (2.14) wherep,ss m' and Am, occur in conjugate pairs when complex, and where am < (sm = am + jam) for m = 1, ---— M. If H*(s) has zero gain at infinite frequency, P < M and Ao = 0. The series expansion of the approximate system function given in equation (2.7) must contain the same poles as H*(s). Since the approximating functions in the series are predetermined, this means that the poles of the final approximation are predetermined. The zeros of H*(s) depend, of course on the coefficients al, —--- aN In addition to the above system limitations, solution of the minimization problem requires I to be bounded (and hence, in this case continuous) for finite a1, —-— aN. From equation (2.12) it is evident that this is assured if and only if all the integrals T2 T2 T2 2 2 2 W(t)fo2 (t) dt, W(t)12(t) dt, —- W(t)O2(t)dt T T T 1 1 1 (2.15) are bounded. The first integral depends on the prescribed response. If it is unbounded a preliminary simplification of the prescribed 1 An nth order pole in H*(s) will cause equation (2.13) to have the terms Am Am + 1 Am+n-l and equation (2.14) to have the')smt Ald _ )2 S-Sm (S-Si (S S)n 2 uo(t) is the unit impulse function.

Such simplifications are described in the next section. The remaining integrals depend on the approximating functions φn(t) and the prescribed input fi(t). For realizable, finite, lumped-element systems the integrals will always converge provided the energy in fi(t) between T1 and T2 is finite, i.e.

\int_{T_1}^{T_2} fi^2(t) \, dt < \infty .    (2.16)

For a bounded input and a finite interval (T1, T2), equation (2.16) is obviously satisfied. In an infinite interval the entire input must have finite energy. Thus, the infinite interval may be used for pulse-like inputs but not for ever-present random inputs.1 Useful unbounded inputs are the unit impulse u0(t) and its successive derivatives u1(t), u2(t), .... If they occur in (T1, T2) the integrals will be unbounded unless H*(s) falls off at a sufficient rate as s → ∞. The exact condition required is M - P ≥ n + 1, where n is the subscript on the highest order impulse appearing in (T1, T2); that is, H*(s) must have at least an (n+1)th order zero at s = ∞.

Last of all, the approximating series fo*(t) must not contain any redundant terms which can be expressed as linear combinations of the other terms of the series. More precisely, the functions θ1(t), ... θN(t) must be linearly independent, or fo* = \sum_{n=1}^{N} a_n θ_n(t) must be zero only when all the coefficients a1, ... aN are zero. But fo*(t) = \int_0^\infty h*(τ) fi(t - τ) \, dτ is identically zero only when h*(t) is zero (excluding the trivial case where fi = 0). Thus, the linear independence of the system approximating functions φ1(t), ... φN(t) is a sufficient condition for the linear independence of θ1(t), ... θN(t). Therefore, it will be assumed that φ1(t), ... φN(t) are always linearly independent.

1 The input energy may even be infinite if W(t) goes to zero at a sufficient rate as |t| → ∞.

Since M poles can generate only M independent functions, this further requires that N ≤ M.

Preliminary Simplifications of the Prescribed Response

In order to satisfy the previously mentioned integral restriction on fo(t) and to reduce approximation complexity and error, it is desirable to simplify the prescribed response fo(t) before the actual approximation process begins. Possible simplifications include:

1. Change of time scale. The prescribed input and response are replaced by fi(t/T) and fo(t/T) and the approximation completed. The resulting system function is H*(Ts). Through a logical choice of T, notation and numerical work are much simplified.

2. Extraction. This technique is feasible when h(t) is approximated directly (fo = h) and is expressed in analytic form. Terms which are realizable (the impulse, real and complex exponentials) or better handled by direct approximation (the derivatives or integrals of the impulse) are subtracted from h(t) and realized separately. The remaining part of h(t) is then approximated by the mean square error method.

3. Delay removal. The prescribed response is replaced by fo(t+T) and the approximation completed. The resulting system function is H*(s)e^{sT}, expressed as a rational function of s. If a shift in the time origin of the response is unimportant the result may be used directly. Otherwise, H*(s) is realized by following H*(s)e^{sT} with an ideal delay of T seconds. Delay removal allows more accurate approximation when the approximated system possesses a delay-like character.

4. Integration removal. The prescribed response is replaced by its derivative fo'(t) and the approximation completed. The resulting system function is sH*(s), and multiplication by 1/s achieves the desired approximation. Integration removal is important in approximating prescribed systems known to have an integrating behavior. The multiplication by 1/s tends to produce an fo*(t) which is smoothed and has less approximation error ripple.

5. Differentiation removal. The procedure is similar to 4. except that \int_0^t fo(t) \, dt is approximated and the multiplication is by s. The error in fo*(t) has increased approximation ripple.

One or more of the above operations will usually make \int_{T_1}^{T_2} W fo^2 \, dt bounded; if not, the problem statement should be inspected again. Perhaps an equivalent but more suitable prescribed input and response can be chosen. Subsequent chapters assume that the above conditions are satisfied and that the preliminary simplifications are completed. A short check of the time-scaling rule used in simplification 1 is given below.
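The time-scale rule of simplification 1 follows in one line from the scaling property of the Laplace transform; the derivation (added here for completeness, under the conventions of equation (2.1)) runs as follows:

\mathcal{L}\{ f(t/T) \} = \int_0^\infty e^{-st} f(t/T) \, dt = T \int_0^\infty e^{-(Ts)u} f(u) \, du = T F(Ts) ,

so the scaled problem relates the transforms T Fi(Ts) and T Fo(Ts), whose quotient is Fo(Ts)/Fi(Ts) = H(Ts). The approximation obtained is therefore an approximation H*(Ts) to H(Ts), as stated, and the original system function is recovered by replacing s with s/T.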

III. THE WEIGHTED MEAN SQUARE ERROR CRITERION

As seen in Chapter II the weighted mean square error criterion (abbreviated WME criterion) is fundamental to the approximation process. Therefore, it is important to investigate the criterion and the reasons for its choice. In addition to doing this, the following sections will examine the resulting errors in the frequency domain and in system function approximation.

The WME Criterion

In order to solve the minimization problem it is necessary to define a suitable measure of the approximation error. Such a measure should meet the following requirements: (1) be zero for zero error, (2) be positive for non-zero error, (3) decrease with decreasing error, (4) permit mathematical solution of the minimization problem. While the first three conditions are easily satisfied, it is the last condition which really forces the choice of the WME criterion. All other proposed criteria fail on this count and lead invariably to trial and error solutions of the minimization problem [13].

The WME criterion has a number of other important characteristics. A most important characteristic is the square weighting of error amplitude, which tends to reduce strongly large errors at the expense of increased small errors. The error magnitude criterion,

I = \int_{T_1}^{T_2} W(t) |e(t)| \, dt ,    (3.1)

is better in this respect but does not allow mathematical solution of the minimization problem.

Fortunately, the deficiency is not severe in most applications where the prescribed response is fairly smooth. The square weighting may even be preferred when large peak errors are particularly undesirable.

Because of the heavy weight placed on large errors, the unweighted mean square error criterion (W(t) = 1) has a tendency to cause approximation errors which oscillate symmetrically about zero with relatively constant peak amplitude. This property seems characteristic of most mean square approximations, especially when the approximated functions are continuous. The Fourier series expansion of a triangular wave is a good example, the peak error deviations being almost equal with slight increases at points of slope discontinuity. An exception occurs when the prescribed response and approximating functions are both small for any length of time, since then the error amplitude must also be small. But in time regions where both the prescribed response and the approximating functions have appreciable value, the error does tend to oscillate with approximately constant peak deviations.

The WME criterion permits the peak deviations to be varied in a prescribed way as a function of time and is therefore superior to the simpler, more commonly used, unweighted mean square error criterion.1 The variation is achieved by making W(t) large in the time regions where the error is to be made small. In the weighted error integral, W^{1/2}fo - W^{1/2}fo* plays the role of fo - fo* in the unweighted error integral and consequently has approximately constant oscillation peaks in regions where W^{1/2}fo and W^{1/2}fo* have appreciable value. Thus, the envelope of the oscillating error is approximately proportional to W^{-1/2}(t). While W may be any bounded, positive function there are certain more desirable choices, discussed in part two, which reduce numerical and analytic work in impulse response approximation.

1 A good example of WME criterion application is given by Westcott [40], where W(t) = t.

Although it is impossible to calculate the error e(t) without evaluating the approximating series, it is possible to make a rough estimate of its maximum value from the quantity I. From the discussion of the previous paragraph the following somewhat crude approximation to e(t) seems reasonable:

e(t) ≈ e*(t) = A W^{-1/2}(t) \sin ω_0 t , T1' < t < T2'    (3.2)
            = 0 , t < T1' , t > T2' .

The interval (T1', T2') is the region in which W^{1/2}fo(t) and W^{1/2}fo*(t) have appreciable value, and sin ω0t represents approximately the rather oscillatory nature of e(t) within the envelope W^{-1/2}(t). Substituting equation (3.2) in the WME integral gives

I ≈ \int_{T_1'}^{T_2'} W(t) [A W^{-1/2}(t) \sin ω_0 t]^2 \, dt = A^2 \int_{T_1'}^{T_2'} \sin^2 ω_0 t \, dt .    (3.3)

Assuming an integral number of periods of sin² ω0t in (T1', T2'),

I ≈ \frac{A^2 (T_2' - T_1')}{2} .    (3.4)

Finally, eliminating A between equation (3.4) and equation (3.2) yields the estimate of the maximum error deviation,

e_{max} ≈ e*_{max} = \sqrt{ \frac{2 I W_m^{-1}}{T_2' - T_1'} } ,    (3.5)

where Wm^{-1} is the maximum value of W^{-1}(t) in (T1', T2').
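As a hedged numerical illustration (the numbers are invented for this example, not taken from the dissertation): with unity weight (Wm^{-1} = 1), an interval T2' - T1' = 4 seconds, and a computed WME of I = 0.002, equation (3.5) gives e*max = sqrt(2 × 0.002 / 4) ≈ 0.032. A small helper makes the arithmetic explicit.

```python
from math import sqrt

def peak_error_estimate(I, W_inv_max, T1p, T2p):
    """Estimate of the peak approximation error, equation (3.5):
    e*_max = sqrt(2 * I * Wm^-1 / (T2' - T1'))."""
    return sqrt(2.0 * I * W_inv_max / (T2p - T1p))

print(peak_error_estimate(0.002, 1.0, 0.0, 4.0))  # ~0.0316
```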

Despite the rather gross assumptions made in deriving equation (3.5), it has given in many cases accuracies of better than 50%. Application of equation (3.5), including the choice of (T1', T2'), is demonstrated in part two, Chapter VII.

Frequency Domain Errors

Minimization of the WME integral brings about an approximation in the frequency domain as well as in the time domain. To see the relation between frequency domain errors and time domain errors, consider first the infinite interval, unweighted mean square error integral,

I = \int_{-\infty}^{\infty} [fo - fo*]^2 \, dt .    (3.6)

By means of the complex convolution integral1 of the Laplace transform, I may be written as

I = \frac{1}{2πj} \int_{c-j\infty}^{c+j\infty} [Fo(-s) - Fo*(-s)][Fo(s) - Fo*(s)] \, ds .    (3.7)

Evaluating equation (3.7) on the imaginary axis (real frequency axis),

I = \frac{1}{2π} \int_{-\infty}^{\infty} |Fo(jω) - Fo*(jω)|^2 \, dω .    (3.8)

Thus, the unweighted mean square approximation in the time domain is a mean square error-magnitude approximation in the frequency domain. But frequency domain approximation errors are more usually measured in terms of magnitude error, phase error, real part error, and imaginary part error. The mean square integrals of these errors are respectively:

I_M = \frac{1}{2π} \int_{-\infty}^{\infty} [ |Fo| - |Fo*| ]^2 \, dω ,    (3.9)

1 See Gardner and Barnes [10], page 275.

I_P = \frac{1}{2π} \int_{-\infty}^{\infty} [ \text{ang} Fo - \text{ang} Fo* ]^2 \, dω ,    (3.10)

I_R = \frac{1}{2π} \int_{-\infty}^{\infty} [ \text{Re} Fo - \text{Re} Fo* ]^2 \, dω ,    (3.11)

I_I = \frac{1}{2π} \int_{-\infty}^{\infty} [ \text{Im} Fo - \text{Im} Fo* ]^2 \, dω .    (3.12)

Expansion of equation (3.8) into real and imaginary parts of Fo and Fo* gives

I = I_R + I_I ,    (3.13)

so that I ≥ I_R and I ≥ I_I. Reference to Figure 3.1 shows that |Fo - Fo*| ≥ | |Fo| - |Fo*| |, so that I ≥ I_M.

[Figure 3.1. Vector Diagram of Fo, Fo*, and Fo - Fo*.]

It is seen that I_M, I_R, and I_I are not minimized by minimizing I. They do converge toward zero, however, as I converges toward zero. No similar bound on the mean square phase error I_P is possible. When |Fo - Fo*|, |Fo|, and |Fo*| are all small, the angular error between Fo and Fo* may be large even though |Fo - Fo*| is small. Unfortunately, all three quantities often become small together as ω → ∞, and the minimization of I (a measure of |Fo - Fo*|) allows large phase errors at high frequencies.

This is what happens in practice. The phase of Fo*(jω) as ω → ∞ is dependent on the approximating functions used and not on the phase of Fo(jω) as ω → ∞. This limitation of the mean square error approximation can be eliminated by constrained approximations of the type described in the next chapter. Good phase approximation, as well as good magnitude, real part, and imaginary part approximation, is then possible.

The introduction of the weight factor W(t) and the finite interval (T1, T2) makes the above relations much more complicated. One interpretation of the WME integral is to consider I as a weighted average of frequency domain errors arising from time errors which exist in small time intervals. Thus,

I ≈ \sum_{n=1}^{N} W(T_1 + nΔt) \frac{1}{2π} \int_{-\infty}^{\infty} |F_{on} - F_{on}*|^2 \, dω ,    (3.14)

where NΔt = T2 - T1 and

F_{on}(s) = \int_{T_1+(n-1)Δt}^{T_1+nΔt} e^{-st} fo(t) \, dt .    (3.15)

Though this representation is admittedly strained, it does allow the previously developed results to be extended to the WME criterion.

System Function Approximation Errors

It is useful to know how the prescribed impulse response function h(t) is approximated. If fi(t) is the unit impulse, the WME criterion applies directly to the impulse response error eh(t) = h(t) - h*(t). If fi(t) is an arbitrary input the approximation criterion which applies to eh(t) is not as simple.

Again, the initial development will be restricted to the infinite interval, unweighted mean square error criterion. Substituting Fo = HFi and Fo* = H*Fi in equation (3.8) yields

I = \frac{1}{2π} \int_{-\infty}^{\infty} |HF_i - H*F_i|^2 \, dω = \frac{1}{2π} \int_{-\infty}^{\infty} |F_i|^2 |H - H*|^2 \, dω .    (3.16)

The effect of an arbitrary input is now clear. The mean square error in the h(t) approximation is weighted in the frequency domain by |Fi|². The result seems reasonable, for the best approximation would be expected in the frequency range where the input amplitudes are greatest. It should be noted that the weighting depends only on |Fi| and not on the phase of Fi. Thus, when the unweighted mean square error criterion is used, the input need not be the prescribed input fi(t) but may be any time function which possesses the same spectral magnitude. For example, a random input with spectral energy |Fi|² = 1/(1 + ω²) may be replaced by the exponential pulse

fi(t) = e^{-t} , t > 0
      = 0 , t < 0 ,

which has the same |Fi|, since its transform is 1/(1 + jω). The approximation independence of input phase does not indicate, though, that phase characteristics of the prescribed system function H(s) are ignored. A change in the phase of H(jω) does change I as given by equation (3.16).

The extension of these relations to the WME integral requires techniques similar to those used in the previous section. A preferable alternative is to consider two block diagrams which are equivalent to the diagram of Figure 2.2. Figure 3.2 shows the first modification of Figure 2.2. The input is replaced by a unit impulse u0(t) followed by a system function Fi(jω) (not necessarily realizable).

[Figure 3.2. Equivalent Block Diagram of Approximation Problem with Input Replaced by Fi(jω) and u0(t).]

Since the system is linear from point A to point B, the order of the operations is immaterial, and Fi(jω) may be shifted from A to B as in Figure 3.3.

[Figure 3.3. Equivalent Block Diagram of Approximation Problem Showing the Weighting of eh by Fi(jω).]

Figure 3.3 shows that the error eh in impulse response approximation is first frequency weighted by Fi(jω) and then squared and time weighted by W(t).

Since the phase of Fi(jω) affects the time distribution of the filtered error e(t), it must also affect the value of I. Thus, contrary to the unweighted approximation, the weighted approximation is sensitive to forcing function phase.

In summary, the following statements concerning the WME approximation can be made:

1. The WME criterion is the only practical error criterion permitting mathematical solution of the minimization problem.
2. Large approximation errors are weighted heavily compared with small approximation errors.
3. The approximation error tends to oscillate about zero and, when W^{1/2}fo and W^{1/2}fo* are reasonably large, has an envelope roughly proportional to W^{-1/2}(t).
4. Good time domain approximation assures good frequency domain approximation of the magnitude, the real part, and the imaginary part of the response Fo(jω).
5. Good phase approximation of Fo*(jω) for small Fo(jω) generally demands a constrained WME approximation.
6. For arbitrary inputs the impulse response error eh is weighted frequency-wise by Fi(jω).

IV. THE MINIMIZATION PROBLEM

The preceding sections have described in detail the approximation problem, mathematical and realizability restrictions, and properties of the WME approximation. This chapter will develop the general theory of WME approximation and derive the equations necessary to solve the minimization problem for simple and constrained approximations. Practical applications of the results are delayed until part two.

Before beginning it is advisable to review briefly the conditions specified in Chapter II. They are: (1) H*(s) must be physically realizable; (2) the weighted square integrals of fo, θ1, ... θN must be bounded (to assure the continuity of I for finite a1, ... aN); (3) the system approximating functions φ1, ... φN must be linearly independent (to prevent θ1, ... θN from being linearly dependent). The first section that follows is devoted to the simple or unconstrained WME approximation.

The Unconstrained Approximation

The WME integral is given by

I = \int_{T_1}^{T_2} W [fo - fo*]^2 \, dt = \int_{T_1}^{T_2} W [fo - \sum_{n=1}^{N} a_n θ_n]^2 \, dt
  = \int_{T_1}^{T_2} W fo^2 \, dt - 2 \sum_{n=1}^{N} a_n \int_{T_1}^{T_2} W fo θ_n \, dt + \sum_{n=1}^{N} \sum_{m=1}^{N} a_n a_m \int_{T_1}^{T_2} W θ_n θ_m \, dt .    (4.1)

Since the integrals of equation (4.1) appear frequently it is desirable to introduce the following notation:

d_n = \int_{T_1}^{T_2} W fo θ_n \, dt , n = 1, ... N ;    (4.2)

c_{nm} = \int_{T_1}^{T_2} W θ_n θ_m \, dt = c_{mn} , n = 1, ... N , m = 1, ... N .    (4.3)

Substituting these expressions into equation (4.1) yields

I = \int_{T_1}^{T_2} W fo^2 \, dt - 2 \sum_{n=1}^{N} a_n d_n + \sum_{n=1}^{N} \sum_{m=1}^{N} a_n a_m c_{nm} .    (4.4)

To obtain the WME approximation I must be minimized with respect to a1, ... aN. Since I is continuous for finite a1, ... aN, it possesses continuous derivatives with respect to the an, which may be set equal to zero to find the stationary point. Thus,

-2 d_n + 2 \sum_{m=1}^{N} a_m c_{nm} = 0 , n = 1, ... N .    (4.5)

These N equations defining the stationary point coefficients may be expressed more simply in matrix notation by

d] = C a] .    (4.6)

If the matrix C is non-singular (det C ≠ 0) a unique solution of equation (4.6) results. But det C is the Gram1 determinant of the functions W^{1/2}θn, which are linearly independent because the functions θn are linearly independent.

1 See Courant and Hilbert [8], page 61.

The determinant is therefore non-zero, and equation (4.6) does define a unique stationary point. Furthermore, the quadratic form

\sum_{n=1}^{N} \sum_{m=1}^{N} a_n a_m c_{nm}

is positive definite.1 Hence I can be made arbitrarily large by increasing |a1|, ... |aN| indefinitely.2 Since the stationary point is unique it must therefore be a minimum. The value of the minimum may be calculated by expressing equation (4.4) in matrix notation and substituting the known value of d]. Thus

I = \int_{T_1}^{T_2} W fo^2 \, dt - 2 ⌊a d] + ⌊a C a] = \int_{T_1}^{T_2} W fo^2 \, dt - ⌊a d] ,    (4.7)

where ⌊a denotes the row matrix corresponding to the column a]. Other equally good expressions are

I = \int_{T_1}^{T_2} W fo^2 \, dt - ⌊a C a] ,    (4.8)

I = \int_{T_1}^{T_2} W fo^2 \, dt - ⌊d C^{-1} d] .    (4.9)

1 Ibid.
2 For large |an| the quadratic form in equation (4.4) increases much more rapidly than \sum a_n d_n.

Because the quadratic form ⌊a C a] is positive definite, equation (4.8) shows that

I ≤ \int_{T_1}^{T_2} W fo^2 \, dt .    (4.10)

It is therefore logical to define the worst possible WME as

I_{max} = \int_{T_1}^{T_2} W fo^2 \, dt ,    (4.11)

which exists when there are no terms in the approximating series. The relative error I/Imax is less than or equal to one and is particularly useful because it provides a normalized indication of approximation accuracy which is independent of the amplitude or time scaling of the approximation problem.

To summarize, the solution of the minimization problem is unique and involves the following steps: (1) the evaluation of the ½N(N+3) integrals (equations 4.2 and 4.3) which are the elements of the matrices d] and C; (2) the inversion of the matrix C and the calculation of the column of approximating coefficients from a] = C^{-1} d]; (3) the calculation of I from the known a] and d] by equation (4.7). A numerical sketch of these steps is given below.
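The three steps are compactly stated numerically. The following sketch is illustrative only (the grids, sample functions, and names are my assumptions, not the dissertation's): the elements of d] and C are formed by weighted quadrature, the system d] = C a] is solved, and I follows from equation (4.7).

```python
import numpy as np

def unconstrained_wme(f_o, theta, W, dt):
    """Steps (1)-(3): build d] and C, solve d] = C a], evaluate I by (4.7)."""
    N = len(theta)
    d = np.array([np.sum(W * f_o * th) * dt for th in theta])          # (4.2)
    C = np.array([[np.sum(W * theta[n] * theta[m]) * dt
                   for m in range(N)] for n in range(N)])              # (4.3)
    a = np.linalg.solve(C, d)                                          # (4.6)
    I_max = np.sum(W * f_o**2) * dt                                    # (4.11)
    I = I_max - a @ d                                                  # (4.7)
    return a, I, I / I_max                                             # I_rel

# Illustrative use: approximate f_o(t) = t e^{-t} by two exponentials.
dt = 0.001
t = np.arange(0.0, 20.0, dt)
theta = [np.exp(-t), np.exp(-2.0 * t)]
a, I, I_rel = unconstrained_wme(t * np.exp(-t), theta, np.ones_like(t), dt)
```

Because C depends only on the approximating functions, the same C (or its inverse) serves for any number of prescribed responses, as the text notes next.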

Usually, the inversion of C entails the greatest amount of work, especially when C is large. It should be noted, however, that C depends only on the approximating functions θ1, ... θN and not on the prescribed response fo. Thus, a single inverse matrix suffices for the approximation of any number of prescribed responses.

It would be particularly fortunate if

c_{nm} = \int_{T_1}^{T_2} W θ_n θ_m \, dt = 1 , n = m    (4.13)
       = 0 , n ≠ m ,

for then the coefficients would be given simply by

a_n = d_n = \int_{T_1}^{T_2} W fo θ_n \, dt , n = 1, ... N ,    (4.14)

and the WME by

I = I_{max} - ⌊a a] = I_{max} - \sum_{n=1}^{N} a_n^2 .    (4.15)

While such "orthonormal" function approximations are valuable they are not necessarily easier to implement, since the generation of orthonormal families of functions may be as difficult as matrix inversion. Orthogonalization procedures are discussed fully in the next chapter. Relative merits of orthogonal and non-orthogonal approximation depend on application and are treated in part two.

Since the generation of families of orthonormal functions depends on the linear combination of linearly independent functions, it is important to investigate the approximation properties of approximating functions which are linear combinations of θ1, ... θN. The most general set of such functions θ̂1, ... θ̂N is given by the matrix equation

θ̂] = R θ] ,    (4.16)

where the matrix R is any matrix whose determinant is non-zero. The latter requirement is necessary to assure the linear independence of the new functions θ̂]. By substituting θ̂] in the previously developed equations it is readily shown that approximation by θ̂] is exactly equivalent to approximation by θ]. Hence, it makes no difference in the final approximation whether or not the approximating functions are first orthonormalized.

The Constrained Approximation

Frequently it is necessary for the approximate response to satisfy exactly certain specific conditions which stem from important properties of the prescribed response or from the requirements of a particular application. Unfortunately, such specific conditions can only be approached, not equaled, by the finite, unconstrained approximation of the previous section. It is therefore desirable to formulate a WME approximation which is constrained.

Typical conditions which can be enforced by constrained approximation are: (1) the values of fo*(t) or its derivatives at specified instants of time; (2) the values of Fo*(s) or its derivatives at specified points in the complex plane; (3) the asymptotic behavior of Fo*(s) at s = ∞. The last condition is especially important when the asymptotic phase is to be exact, or when \lim_{s→∞} s^{M-P} H*(s) = const is fixed by practical realizability limitations. In any event, none of the above conditions is normally obtained with a simple WME approximation. They must be forced on fo*(t) by restricting the way in which the approximating functions are combined.

Symbolically, a constraint condition may be written as

k = K{fo*(t)} ,    (4.17)

where K{ } is the operator which measures the desired condition, and k is its specified value. For a particular set of approximating functions, equation (4.17) is a relation between the approximation coefficients. To distinguish these constrained approximation coefficients from the unconstrained coefficients, they will be denoted by b], i.e.,

fo* = \sum_{n=1}^{N} b_n θ_n(t) .    (4.18)

The following examples show how constraints restrict the way in which the coefficients can vary:

1. The value of fo*(t) equals the value of fo(t) at t = t1:

fo(t_1) = k = fo*(t_1) = \sum_{n=1}^{N} b_n θ_n(t_1) .

2. The area under fo*(t) equals one:

1 = k = \int_{T_1}^{T_2} fo* \, dt = \sum_{n=1}^{N} b_n \int_{T_1}^{T_2} θ_n(t) \, dt .

3. \lim_{s→∞} s^3 Fo*(s) = const. This condition is restated as fo*(0) = 0 and fo'*(0) = 0:

0 = k_1 = fo*(0) = \sum_{n=1}^{N} b_n θ_n(0) ,

0 = k_2 = fo'*(0) = \sum_{n=1}^{N} b_n θ_n'(0) .

4. The mean square value of fo* equals one:

1 = k = \int_{T_1}^{T_2} fo*^2 \, dt = \sum_{n=1}^{N} \sum_{m=1}^{N} b_n b_m \int_{T_1}^{T_2} θ_n θ_m \, dt .

The first three conditions might be termed linear constraints in that

k = K{fo*(t)} = \sum_{n=1}^{N} b_n K{θ_n(t)} .    (4.19)

Linear constraints are by far the most common and permit a simple theoretical solution of the constrained WME approximation problem. The fourth constraint is nonlinear in the approximation coefficients and makes practical solution of the minimization problem very difficult. Constraints which are nonlinear are not considered further in this investigation.

In general, there may be more than one constraint equation. If there are Q of them, they may be ordered by subscripts as follows:

k_1 = K_1{fo*} = \sum_{n=1}^{N} b_n K_1{θ_n} = \sum_{n=1}^{N} b_n k_{1n}

k_2 = K_2{fo*} = \sum_{n=1}^{N} b_n K_2{θ_n} = \sum_{n=1}^{N} b_n k_{2n}

. . .

k_Q = K_Q{fo*} = \sum_{n=1}^{N} b_n K_Q{θ_n} = \sum_{n=1}^{N} b_n k_{Qn} ,    (4.20)

where

k_{qn} = K_q{θ_n} .    (4.21)

Equations (4.20) are written more compactly as

k] = K b] .    (4.22)

The matrix K has elements kqn given by (4.21) and has Q rows and N columns. All the constraint equations in the set (4.20) must be consistent. If the approximating functions are such that kq1, kq2, ... kqN are all zero, then the specified value kq must also be zero. In this case no limitation is imposed on b] and the whole equation is superfluous. Similarly, the rows of K must not be linearly dependent. If they are, the same constraint condition holds at least twice, perhaps with different specified values. A short sketch of assembling K and k] appears below.

The solution of the constrained approximation problem requires the minimization of the constrained WME integral

I_c = \int_{T_1}^{T_2} W(t) [fo - \sum_{n=1}^{N} b_n θ_n]^2 \, dt ,    (4.23)

subject to the condition that the coefficients b] always obey the matrix constraint equation.
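For concreteness, here is a small sketch (my own illustration; the sampling grid and the particular constraints chosen are assumptions) of how the rows of K and the column k] are assembled for examples 1 and 2 above, given sampled functions θn.

```python
import numpy as np

def constraint_matrices(theta, dt, t1, f_o_t1):
    """Build K (Q by N) and k] for two linear constraints:
    row 1: f_o*(t1) = f_o(t1)   (example 1),
    row 2: area under f_o* = 1  (example 2)."""
    i1 = int(round(t1 / dt))                      # sample index of t = t1
    row1 = [th[i1] for th in theta]               # K_1{theta_n} = theta_n(t1)
    row2 = [np.sum(th) * dt for th in theta]      # K_2{theta_n} = area of theta_n
    K = np.array([row1, row2])
    k = np.array([f_o_t1, 1.0])
    return K, k
```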

There are several ways of approaching the minimization problem. The most direct method is to use equation (4.22) to eliminate Q of the N coefficients in (4.23) and to vary the remaining N-Q coefficients to produce the minimum.1 A more satisfactory procedure, which makes better use of the equations already developed for unconstrained approximation, is the method of Lagrange multipliers.2 The multipliers λ1, ... λQ are introduced and the quantity

I_c = \int_{T_1}^{T_2} W [fo - fo*]^2 \, dt + \sum_{q=1}^{Q} λ_q [ K_q{fo*} - k_q ]    (4.24)

is minimized with respect to b1, ... bN and λ1, ... λQ. This is equivalent to minimizing equation (4.23) subject to equation (4.22). Setting

∂I_c/∂λ_q = 0 , q = 1, ... Q ; ∂I_c/∂b_n = 0 , n = 1, ... N

yields equation (4.22) and

0 = -2 d_n + 2 \sum_{m=1}^{N} b_m c_{nm} + \sum_{q=1}^{Q} λ_q k_{qn} , n = 1, ... N ,    (4.25)

where dn and cnm are given by the same integral formulas as before. In matrix notation equation (4.25) is simply

d] = C b] + ½ K_T λ] .    (4.26)

1 Note that N must be greater than Q.
2 Courant and Hilbert [8], page 164.

K_T has N rows and Q columns and is obtained by transposing the rows and columns of K. Since C has an inverse, equation (4.26) may be solved for b] to give

b] = C^{-1} d] - ½ C^{-1} K_T λ] = a] - ½ C^{-1} K_T λ] ,    (4.27)

where a] is the column of unconstrained approximation coefficients. To find b] the column λ] must be obtained. This is done by substituting b] from equation (4.27) into equation (4.22),

k] = K b] = K a] - ½ K C^{-1} K_T λ] ,    (4.28)

and solving equation (4.28) for λ],

λ] = 2 { K C^{-1} K_T }^{-1} { K a] - k] } .    (4.29)

The matrix K C^{-1} K_T has Q rows and columns and has an inverse if the rows of K are linearly independent, a condition which has already been assumed. Equation (4.27) and equation (4.29) define a unique stationary point which, by the argument used earlier, must be a minimum.

Before calculating the value of the minimum it is worthwhile to compare the solution steps for unconstrained and constrained approximations. The unconstrained approximation requires: (1) the evaluation of the integrals for dn and cnm; (2) the inversion of C and the calculation of a]. In addition to these steps the constrained approximation necessitates: (3) the evaluation of the coefficients kqn from equation (4.21); (4) the calculation and inversion of K C^{-1} K_T ;

and (5) the solution of the following equation for b] (obtained by substituting λ] from equation (4.29) in equation (4.27)):

b] = a] - C^{-1} K_T { K C^{-1} K_T }^{-1} { K a] - k] } .    (4.30)

None of these additional equations is particularly complicated. The inversion of K C^{-1} K_T is usually not too difficult since the number of constraints Q is generally small. Where different prescribed responses are approximated with the same approximating functions and constraints, only the column a] in equation (4.30) is changed. Thus, the constrained approximation is a relatively simple and straightforward extension of the unconstrained approximation.

The minimum value of Ic is given by

I_c = I_{max} - 2 ⌊b d] + ⌊b C b] .    (4.31)

To simplify the manipulations let

g] = ½ K_T λ] ;    (4.32)

then C b] = d] - g] and equation (4.31) becomes

I_c = I_{max} - 2 ⌊b d] + ⌊b { d] - g] } = I_{max} - ⌊b { d] + g] }
    = I_{max} - { ⌊d - ⌊g } C^{-1} { d] + g] }
    = I_{max} - ⌊d C^{-1} d] + ⌊g C^{-1} g] ,

or

I_c = I + ⌊g C^{-1} g]    (4.33)

I_c = I + ¼ ⌊λ K C^{-1} K_T λ]    (4.34)

I_c = I + ½ ⌊λ { K a] - k] } ,    (4.35)

where I is the WME for the unconstrained approximation. Either equation (4.34) or equation (4.35) may be used to calculate Ic, the latter probably being the easiest since both λ] and {K a] - k]} are encountered in the application of equation (4.30). Since the quadratic form ⌊g C^{-1} g] is positive definite, the constrained error Ic must be greater than or equal to the unconstrained error I. The number ¼ ⌊λ K C^{-1} K_T λ] (or ½ ⌊λ {K a] - k]}) measures the difference between Ic and I and thus the severity of the imposed constraints. If the prescribed response and constraints are incompatible, Ic may be intolerably large even when N is arbitrarily large.

As in the unconstrained approximation, the functions θ̂], which are defined by equation (4.16), produce the same approximation as θ]. Hence the functions θ] may be orthogonalized and still produce the same final approximation. Orthonormal function approximation causes little change in the derived formulas. The matrices C and C^{-1} are simply replaced by the identity matrix.
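The constrained solution is only a few lines of linear algebra on top of the unconstrained one. The sketch below is illustrative (it assumes a], C, d], K, and k] have already been computed, for instance by the hypothetical helpers in the earlier sketches) and implements equations (4.29), (4.30), and (4.35).

```python
import numpy as np

def constrained_wme(a, C, d, K, k, I, I_max):
    """Constrained coefficients b] by equation (4.30), error Ic by (4.35)."""
    Ci_KT = np.linalg.solve(C, K.T)                 # C^-1 K_T
    S = K @ Ci_KT                                   # K C^-1 K_T  (Q by Q)
    resid = K @ a - k                               # K a] - k]
    lam = 2.0 * np.linalg.solve(S, resid)           # equation (4.29)
    b = a - 0.5 * Ci_KT @ lam                       # equations (4.27), (4.30)
    I_c = I + 0.5 * lam @ resid                     # equation (4.35)
    return b, I_c

# The constraints are then satisfied exactly: K @ b equals k to rounding
# error, and I_c >= I, the excess measuring the severity of the constraints.
```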

Approximation by Constrained Functions

The method of constrained approximation just presented is important but does involve additional work over the unconstrained approximation. It would be useful, if possible, to work instead with a series of approximating functions, each constrained so that equation (4.30) would reduce to

b] = a] .    (4.36)

Such a set of functions would make constrained approximation more practical. Inspection of equation (4.30) shows that equation (4.36) holds if K = 0. This places a restriction on the type of constraints that can be used. They must be homogeneous; i.e., k] must equal zero. Fortunately, homogeneous constraints are relatively common. Examples include: (1) zeros of fo*(t); (2) zeros of Fo*(s); (3) asymptotic behavior of Fo*(s) at s = ∞.

It will now be shown that with homogeneous constraints it is always possible to construct a set of constrained functions. The development depends on an alternative solution of the constrained WME approximation problem. The approximate response is given by

fo* = \sum_{n=1}^{N} b_n θ_n(t) = ⌊θ b] ,    (4.37)

where

K b] = 0 .    (4.38)

By partitioning the Q by N matrix K it is possible to eliminate Q of the N coefficients in equation (4.37). Thus equation (4.38) becomes

0 = K b] = [ K_{QQ} K_{Q(N-Q)} ] \begin{bmatrix} b]_Q \\ b]_{(N-Q)} \end{bmatrix} = K_{QQ} b]_Q + K_{Q(N-Q)} b]_{(N-Q)} .    (4.39)

The subscripts indicate the number of elements in the matrices. For example,

K_{Q,(N-Q)} = \begin{bmatrix} k_{1(Q+1)} & \cdots & k_{1N} \\ \vdots & & \vdots \\ k_{Q(Q+1)} & \cdots & k_{QN} \end{bmatrix} , b]_{(N-Q)} = \begin{bmatrix} b_{Q+1} \\ \vdots \\ b_N \end{bmatrix} .

The particular choice of partitioning in equation (4.38) makes no real difference since the ordering of the original functions θ] is arbitrary. Since the rows of K are assumed to be linearly independent, the ordering may be arranged so that det K_{QQ} ≠ 0 and

b]_Q = -K_{QQ}^{-1} K_{Q(N-Q)} b]_{(N-Q)} .    (4.40)

Substitution of equation (4.40) in equation (4.37) yields the result

fo* = ⌊θ b] = ⌊θ_Q b]_Q + ⌊θ_{(N-Q)} b]_{(N-Q)} = { ⌊θ_{(N-Q)} - ⌊θ_Q K_{QQ}^{-1} K_{Q(N-Q)} } b]_{(N-Q)} .    (4.41)

The minimization problem is solved by varying the remaining N-Q coefficients of b]_{(N-Q)} to make the WME integral a minimum. In terms of the N-Q functions θ̂1, ... θ̂_{N-Q}, defined by

⌊θ̂ = ⌊θ_{(N-Q)} - ⌊θ_Q K_{QQ}^{-1} K_{Q(N-Q)} ,    (4.42)

and the N-Q coefficients b̂1, ... b̂_{N-Q},

b̂] = b]_{(N-Q)} ,    (4.43)

the approximation is an unconstrained one yielding

b̂] = Ĉ^{-1} d̂] = â] .

Thus the equivalent unconstrained functions θ̂ are a linearly independent1 set satisfying the condition K = 0. Often, the family of constrained functions may be found more simply by inspection. Suppose

Θ_n(s) = \frac{1}{s - s_n} , n = 1, ... N ,

and the condition s^3 Fo*(s) → const as s → ∞ is required. This causes the two constraints fo*(0) = 0, fo'*(0) = 0. An obvious set of N-2 constrained, linearly independent functions is simply

Θ̂_1 = \frac{1}{(s - s_1)(s - s_2)(s - s_3)} , Θ̂_2 = \frac{1}{(s - s_2)(s - s_3)(s - s_4)} , \cdots Θ̂_{N-2} = \frac{1}{(s - s_{N-2})(s - s_{N-1})(s - s_N)} .

Each of these functions behaves as 1/s³ for large s, so each satisfies the two constraints by itself. This particular type of constrained function is very useful and its application in a practical problem will be demonstrated later. While limited to homogeneous constraints, the constrained function approach is also valuable when additional non-homogeneous conditions are specified, since the number of Lagrange multipliers is then only the number of non-homogeneous conditions. Further convenience in the application of constrained functions would be obtained by their orthogonalization. A numerical sketch of the partitioned construction in equations (4.40)-(4.42) is given below.

1 They are linearly independent because each function contains separately one of the independent functions of ⌊θ_{(N-Q)}.
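The partitioned construction lends itself to a direct numerical statement. The sketch below (my illustration, not the author's) builds the transformation of equation (4.42) as a matrix T such that b] = T b̂] automatically satisfies the homogeneous constraints K b] = 0.

```python
import numpy as np

def constrained_function_map(K):
    """Given K (Q by N, full row rank, det K_QQ != 0), return T (N by N-Q)
    with K @ T = 0, so b = T @ b_hat obeys the homogeneous constraints and
    theta_hat = theta @ T realizes equation (4.42)."""
    Q, N = K.shape
    K_QQ, K_rest = K[:, :Q], K[:, Q:]
    top = -np.linalg.solve(K_QQ, K_rest)     # -K_QQ^-1 K_Q(N-Q), equation (4.40)
    return np.vstack([top, np.eye(N - Q)])

# Any unconstrained solve in the reduced coordinates b_hat is then an
# exactly constrained solve in the original coordinates b = T @ b_hat.
```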

To review, WME approximation may be either constrained or unconstrained. Unconstrained approximations result in simple coefficient equations but do not allow approximation conditions to be specified precisely. Constrained approximations are more difficult to obtain but permit arbitrary linear constraints to be imposed in a straightforward manner. And finally, homogeneous linear constraints may often be implemented with less work by means of suitably constrained approximating functions.

V. ORTHONORMAL FUNCTIONS

The simplification in coefficient evaluation which results from orthonormal function approximation is especially important when many different approximations are to be made with the same set of approximating functions. It is therefore appropriate to discuss various sets of orthonormal functions and the procedures for obtaining them by orthogonalization of linearly independent functions.

Two classes of orthogonalization procedures are considered in this chapter. The first is applicable to any arbitrary set of linearly independent functions. The second is restricted to linearly independent exponential functions1 of the type which occur in impulse response approximation. In this class simple formulas are available to define the orthonormal functions.

Orthogonalization of Arbitrary Functions

Various orthogonalization procedures which are applicable to any set of linearly independent approximating functions may be derived, all producing different families of orthonormal functions. The Schmidt process, which is developed in this section, is as practical as any, even though it demands considerable numerical effort. Although the details of the process are well known [8], they are repeated here for convenience.

The desired orthonormal functions θ̂1, ... θ̂N are expressed by linear combinations of the given linearly independent functions θ1, ... θN. To avoid complete repetition of the orthogonalization procedure when the number of functions N is changed, it is assumed that θ̂1 depends only on θ1, θ̂2 depends only on θ1 and θ2, and so on.

1 The exponential functions considered include real exponential functions and exponentially damped cosine and sine functions.

procedure when the number of functions N is changed, it is assumed that θ̄_1 depends only on θ_1, θ̄_2 depends only on θ_1 and θ_2, and so on. Thus

    θ̄_1 = r_11 θ_1
    θ̄_2 = r_21 θ_1 + r_22 θ_2
    θ̄_3 = r_31 θ_1 + r_32 θ_2 + r_33 θ_3
    . . .
    θ̄_N = Σ_{n=1}^{N} r_Nn θ_n    (5.1)

The transformation coefficients r_nm must be chosen so the orthonormal condition holds; i.e., so

    ∫_{T_1}^{T_2} W(t) θ̄_n(t) θ̄_m(t) dt = (θ̄_n, θ̄_m) = 1, n = m;  = 0, n ≠ m    (5.2)

Applying equation (5.2) with n = m = 1 yields

    r_11² (θ_1, θ_1) = r_11² c_11 = 1    (5.3)

and

    θ̄_1 = θ_1/√c_11    (5.4)

Considering next θ̄_2, equation (5.2) must hold for n = 2 and m = 1, 2. If θ̄_2 is written as a linear combination of θ̄_1 and θ_2, it is apparent that (θ̄_2, θ̄_1) = 0 requires θ̄_2 = r_22[θ_2 - (θ_2, θ̄_1) θ̄_1]. Choosing r_22 to make (θ̄_2, θ̄_2) = 1 gives

    θ̄_2 = [θ_2 - (θ_2, θ̄_1) θ̄_1] / √[(θ_2, θ_2) - (θ_2, θ̄_1)²]    (5.5)

In terms of the constants c_nm the normalization gives

    r_22 = 1/√(c_22 - r_11² c_21²)    (5.6)

    r_21 = -r_22 r_11² c_21    (5.7)

In a similar manner,

    θ̄_3 = [θ_3 - (θ_3, θ̄_1) θ̄_1 - (θ_3, θ̄_2) θ̄_2] / √[(θ_3, θ_3) - (θ_3, θ̄_1)² - (θ_3, θ̄_2)²],
    with (θ_3, θ̄_1) = r_11 c_31,  (θ_3, θ̄_2) = r_21 c_31 + r_22 c_32    (5.8)

and in general

    θ̄_N = [θ_N - Σ_{n=1}^{N-1} (θ_N, θ̄_n) θ̄_n] / √[(θ_N, θ_N) - Σ_{n=1}^{N-1} (θ_N, θ̄_n)²],
    with (θ_N, θ̄_n) = Σ_{m=1}^{n} r_nm c_Nm    (5.9)

from which r_31, r_32, r_33 and r_N1, r_N2, ..., r_NN may be calculated. All indicated divisions are possible, since the numerator functions and the square roots of their norms are non-zero.¹ Furthermore, r_11 depends on c_11; r_21, r_22 depend on r_11, c_21, c_22; r_31, r_32, r_33 depend on r_11, r_21, r_22, c_31, c_32, c_33; and so on. Thus the orthogonalization of any set of linearly independent approximating functions θ_1, ..., θ_N is always possible in terms of the constants c_nm of the set.

1 This is a direct result of the linear independence of θ_1, ..., θ_N.

It is apparent from inspection of the above equations that application of the Schmidt process is not simple. Numerical work grows very rapidly with the number of functions, making machine calculation almost mandatory. Because of this and other considerations, which are discussed later, it is sometimes preferable to invert C.
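Since the triangular array r_nm satisfies R C R^T = I, it may be computed in one step as the inverse of the Cholesky factor of C. The following numerical sketch carries out the Schmidt process this way; the functions θ_n = t^n e^{-t} and the unit weight are arbitrary illustrative choices, not taken from the text.

    import numpy as np
    from scipy.integrate import quad

    # example functions theta_n and weight W(t) (assumptions for illustration)
    theta = [lambda t, k=k: t**k * np.exp(-t) for k in range(4)]
    W = lambda t: 1.0

    # constants c_nm of the set (the Gram matrix C)
    C = np.array([[quad(lambda t: W(t)*f(t)*g(t), 0, np.inf)[0]
                   for g in theta] for f in theta])

    # R C R^T = I with R lower triangular: R is the inverse Cholesky factor,
    # and its rows reproduce the Schmidt coefficients of equations (5.4)-(5.9)
    R = np.linalg.inv(np.linalg.cholesky(C))

    print(np.round(R @ C @ R.T, 10))   # identity: the theta-bar are orthonormal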

As expected, the poles of these functions are real (at s = α) or complex conjugate pairs (at s = α ± jω) and lie in the left half plane. When p ≠ 0 they are multiple. Last of all, the functions θ_1, ..., θ_N (or θ̄_1, ..., θ̄_N) must be linearly independent. This requires the total number of poles (counting multiple poles according to their order) to be greater than or equal to N.

Orthogonalization by Transformation.—Limited classes of orthonormal exponential functions can be generated from known sets of orthogonal functions by transformation of variables. For example, suppose the functions P_n(x) are known to be orthonormal with respect to W_p(x) in the interval (a, b). Then,

    ∫_a^b W_p(x) P_n(x) P_m(x) dx = 1, m = n;  = 0, m ≠ n    (5.10)

By letting

    x = g(t)    (5.11)

such that

    a = g(0)    (5.12)
    b = g(∞)    (5.13)

equation (5.10) becomes

    ∫_0^∞ W_p[g(t)] P_n[g(t)] P_m[g(t)] g'(t) dt = 1, m = n;  = 0, m ≠ n    (5.14)

Thus, the functions θ̄_n(t) = P_n[g(t)] are orthonormal with respect to W(t) = W_p(g) g' in the interval (0, ∞). Or, if a unity weight is desired, θ̄_n(t) = P_n(g) √(W_p(g) g').
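As a numerical check of equation (5.14) — anticipating the transformation g(t) = 1 - 2e^{-2t} attributed to Kautz below — the following sketch forms unity-weight functions from normalized Legendre polynomials (for which W_p = 1) and integrates their products over (0, ∞). The normalization √((2n+1)/2) is the standard Legendre one, not a value taken from the text.

    import numpy as np
    from numpy.polynomial.legendre import Legendre
    from scipy.integrate import quad

    # x = g(t) = 1 - 2 e^{-2t} maps (0, infinity) onto (-1, 1); g'(t) = 4 e^{-2t}
    g  = lambda t: 1.0 - 2.0*np.exp(-2.0*t)
    gp = lambda t: 4.0*np.exp(-2.0*t)

    def theta_bar(n):
        # unity-weight form theta_n(t) = P_n(g) sqrt(g'), P_n normalized Legendre
        P = Legendre.basis(n)
        return lambda t: np.sqrt((2*n + 1)/2.0) * P(g(t)) * np.sqrt(gp(t))

    for n in range(3):
        for m in range(3):
            val, _ = quad(lambda t: theta_bar(n)(t)*theta_bar(m)(t), 0, np.inf)
            print(n, m, round(val, 6))   # ~1 when n = m, ~0 otherwise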

Relatively few transformations of the above type are workable. P_n(x) and g(t) must be chosen so that the θ̄_n(t) are of the proper exponential form. Kautz [17] has used g(t) = 1 - 2e^{-2t} to transform the x interval (-1, 1). Of the various known functions orthogonal in (-1, 1) he took the Legendre and Tchebycheff polynomials. The resulting θ̄_n functions had poles at s = -(2n-1). Other possible transformations would include e^{-t} = cos x (orthonormal cosine functions in (0, π/2)) and e^{-t} = x (orthogonal even Legendre functions in (0, 1)). Again, these transformations produce θ̄_n functions with poles spaced evenly along the negative real axis.¹

1 Transformations such as x = e^{-t} cos t generate complex poles but are rather inefficient because relatively few orthonormal functions require many poles.

An alternative approach to the WME approximation is proposed by Papoulis [31]. Instead of transforming the approximating functions to the variable t, he transforms the prescribed response to the variable x. Although the method is different, the final approximations are much the same, since identical P_n functions and transformations yield identical poles in F_o*(s).

Further elaboration on transformation procedures is possible; other types of orthogonal functions can be transformed. However, the results of the remaining part of the chapter are sufficiently general to make such investigation of little value.

Residue Statement of the Orthonormal Condition.—The orthogonalization of exponential functions by application of the theory of residues has many advantages. To develop this method the Laplace transform equivalent

of the orthonormal condition must be considered. In what follows W(t) = 1. Hence,

    c_nm = ∫_0^∞ θ̄_n(t) θ̄_m(t) dt = 1, m = n;  = 0, m ≠ n    (5.15)

By means of the complex convolution integral of the Laplace transform,¹ equation (5.15) may be expressed by²

    c_nm = (1/2πj) ∫_{c-j∞}^{c+j∞} θ̄_n(s) θ̄_m(-s) ds = (1/2πj) ∫_{c-j∞}^{c+j∞} θ̄_n(-s) θ̄_m(s) ds    (5.16)

where c must be taken as zero if the integrand is always to be analytic on the path of integration. The assumptions concerning θ̄_n(s) assert that θ̄_n(s) → const./s^p as |s| → ∞, where p ≥ 1. Therefore, θ̄_n(s) θ̄_m(-s) must go to zero at least as rapidly as const./s² as |s| → ∞. Hence, equation (5.16) may be evaluated along the closed contour which traverses the jω axis and encircles the entire right half plane at infinity. For the contour C_0 shown in Figure 5.1 (R → ∞), c_nm becomes simply

    c_nm = (1/2πj) ∮_{C_0} θ̄_n(s) θ̄_m(-s) ds    (5.17)

[Figure 5.1 The C_0 Contour in the s Plane]

1 See Gardner and Barnes [10], page 275.
2 Note that ∫_0^∞ e^{-st} θ̄_n(t) θ̄_m(t) dt = c_nm at s = 0.

since the contribution to the integral along the infinite semi-circle is zero. Finally, because θ̄_n(s) θ̄_m(-s) is analytic along C_0, the residue theorem may be applied to yield the desired result

    c_nm = -Σ residues of θ̄_n(s) θ̄_m(-s) in the right half plane = 1, m = n;  = 0, m ≠ n    (5.18)

Orthogonalization by Means of the Residue Condition.—The steps taken in the orthogonalization of exponential functions by application of the residue condition are similar to those of the Schmidt process. They are: (1) the specification of the linearly independent functions θ_1, ..., θ_N, i.e., the specification of the poles of θ̄_1, ..., θ̄_N; (2) the linear transformation of these functions into the orthonormal set. As before, complete repetition of the orthogonalization process for different N is avoided if θ̄_1 depends only on θ_1, θ̄_2 depends only on θ_1 and θ_2, and so on. Or, in terms of the frequency domain functions, θ̄_1 should contain only the first pole, θ̄_2 should contain only the first and second poles, ..., and θ̄_N should contain all the poles.

To begin, assume that the given poles all lie on the negative real axis at s = -a_1, -a_2, ..., -a_N (the a_n are positive). The first function is then

    θ̄_1 = K_1/(s + a_1)    (5.19)

where K_1 is an arbitrary constant. Application of equation (5.18) for n = m = 1 gives

    -Σ residues of K_1²/[(s + a_1)(-s + a_1)] in the right half plane = K_1²/2a_1 = 1    (5.20)

Thus,

    θ̄_1 = √(2a_1)/(s + a_1)    (5.21)

The higher order functions require that equation (5.18) hold when m ≠ n (really m < n, since c_nm = c_mn). This is assured if θ̄_n(s) is constructed so that θ̄_n(s) θ̄_m(-s) has no poles in the right half plane for all m less than n. The obvious step is to choose the zeros of θ̄_n(s) so they cancel the right half plane poles of θ̄_m(-s). But these poles occur at s = a_1, ..., a_{n-1}. Thus equation (5.18) is satisfied for m ≠ n when the zeros of θ̄_n(s) are taken at s = a_1, a_2, ..., a_{n-1}. Applying this result to θ̄_2 yields

    θ̄_2 = K_2(s - a_1)/[(s + a_1)(s + a_2)]    (5.22)

where K_2 is given by

    -Σ residues of θ̄_2(s) θ̄_2(-s) in the right half plane = K_2²/2a_2 = 1    (5.23)

Therefore,

    θ̄_2 = √(2a_2) (s - a_1)/[(s + a_1)(s + a_2)]    (5.24)

Extension of the process to higher order functions is clear, leading to the formula

    θ̄_n = √(2a_n) (s - a_1)(s - a_2) ··· (s - a_{n-1}) / [(s + a_1)(s + a_2) ··· (s + a_n)]    (5.25)

When both real and complex poles are specified, the same orthogonalization scheme is valid, although the formulas are somewhat more complicated. If the real poles are given by s = -a_n and the complex poles by s = -a_n ± j√(β_n² - a_n²) (-a_n is the real part, β_n is the distance from the origin), the orthonormal functions will be proven to be¹

    θ̄_n = A_{n-1}(s) √(2a_n)/(s + a_n)    (5.26)

    θ̄_n = A_{n-1}(s) √(2a_n) (s + β_n)/(s² + 2a_n s + β_n²)    (5.27)

    θ̄_{n+1} = A_{n-1}(s) √(2a_n) (s - β_n)/(s² + 2a_n s + β_n²)    (5.28)

Equation (5.26) holds if the nth pole is real, and equations (5.27) and (5.28) hold if the nth and (n+1)th poles are complex conjugates.² A_n(s) is the all-pass function of magnitude one which has as its poles the first n poles of the specified set, i.e.,

    A_n(s) = (s + s_1)(s + s_2) ··· (s + s_n) / [(s - s_1)(s - s_2) ··· (s - s_n)]    (5.29)

For example, the real pole functions of the above paragraph yield

    A_n(s) = (s - a_1)(s - a_2) ··· (s - a_n) / [(s + a_1)(s + a_2) ··· (s + a_n)]

To show that these functions all satisfy equation (5.18), it is only necessary to note the following: (1) A_n(s) A_n(-s) = 1; (2) the sum of the residues of 2a_n/[(s + a_n)(-s + a_n)] or of 2a_n(s + β_n)(-s + β_n)/[(s² + 2a_n s + β_n²)(s² - 2a_n s + β_n²)] in the right half plane is -1; (3) A_{n-1}(s) θ̄_m(-s) has no poles in the right half plane for m < n; (4) the sum of the residues of 2a_n(s - β_n)(-s + β_n)/[(s² + 2a_n s + β_n²)(s² - 2a_n s + β_n²)] in the right half plane is zero.

1 A somewhat more restricted version of these equations is given by Kautz [17].
2 Two equations are needed for complex poles if the corresponding time functions θ̄_n and θ̄_{n+1} are to be real.
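The residue condition (5.18) is easy to verify symbolically. The sketch below (pole values invented for illustration) builds the real-pole functions of equation (5.25) and evaluates c_nm as minus the sum of right-half-plane residues.

    import sympy as sp

    s = sp.symbols('s')
    a = [sp.Integer(1), sp.Integer(2), sp.Integer(3)]   # poles at s = -1, -2, -3

    def theta_bar(n):
        # equation (5.25)
        num = sp.sqrt(2*a[n]) * sp.Mul(*[(s - a[k]) for k in range(n)])
        den = sp.Mul(*[(s + a[k]) for k in range(n + 1)])
        return num/den

    def c(n, m):
        # equation (5.18): minus the sum of RHP residues of theta_n(s) theta_m(-s)
        f = sp.cancel(theta_bar(n) * theta_bar(m).subs(s, -s))
        den = sp.fraction(f)[1]
        rhp = [p for p in sp.roots(sp.Poly(den, s)) if sp.re(p) > 0]
        return -sum(sp.residue(f, s, p) for p in rhp)

    print(sp.Matrix(3, 3, lambda n, m: sp.simplify(c(n, m))))   # identity matrix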

Unfortunately, the determination of the corresponding time functions θ̄_1, ..., θ̄_N is not as simple. Each Laplace function θ̄_n(s) must be expanded in a partial fraction series of n terms, where each term has as an inverse transform an exponential time function. Thus, for first order real poles¹

    θ̄_1 = r_11 e^{-a_1 t}
    θ̄_2 = r_21 e^{-a_1 t} + r_22 e^{-a_2 t}
    . . .
    θ̄_N = Σ_{n=1}^{N} r_Nn e^{-a_n t}    (5.30)

and for first order complex poles

    θ̄_1 = r_11 e^{(-a_1 + jω_1)t} + r_12 e^{(-a_1 - jω_1)t}
    θ̄_2 = r_21 e^{(-a_1 + jω_1)t} + r_22 e^{(-a_1 - jω_1)t}
    θ̄_3 = r_31 e^{(-a_1 + jω_1)t} + r_32 e^{(-a_1 - jω_1)t} + r_33 e^{(-a_3 + jω_3)t} + r_34 e^{(-a_3 - jω_3)t}    (5.31)

With multiple poles the equations are the same, except that some of the exponential functions include multiplicative integral powers of t. The orthonormal functions produced by multiple poles at s = -1 are of particular interest. In this case,

1 The constants r_mn are tabulated by Kautz [17] for several real and complex pole sets.

    θ̄_n(s) = √2 (s - 1)^{n-1}/(s + 1)^n    (5.32)

and the corresponding time functions are the well known orthonormal Laguerre functions

    θ̄_n(t) = √2 e^{-t} L_{n-1}(2t)    (5.33)

The polynomials L_n(x) are the Laguerre polynomials of order n given by

    L_n(x) = (e^x/n!) dⁿ/dxⁿ (xⁿ e^{-x})    (5.34)

Historically, Laguerre functions were used by Lee [22] in the first published report on mean square approximation of linear system response. They were also discussed and applied by Wiener and, subsequently, by many others.

It should be noted that the functions θ̄_n are limited in their characteristics. They all act as const./s for large s. Hence, prescribed fall-off rates on F_o*(s) different from const./s must be obtained by constrained approximation. An alternative approach is discussed in the next subsection, where formulas are given for constrained orthonormal exponential functions.

One modification of the orthonormal function formulas is easily made at this time. The complex pole equations (5.27) and (5.28) may be replaced by

    θ̄_n = A_{n-1}(s) 2β_n√(a_n)/(s² + 2a_n s + β_n²)    (5.35)

    θ̄_{n+1} = A_{n-1}(s) 2√(a_n) s/(s² + 2a_n s + β_n²)    (5.36)

a substitution which is easily verified through equation (5.18).
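A direct numerical check of the Laguerre functions of equation (5.33) (a small sketch; the first three functions are an arbitrary choice):

    import numpy as np
    from numpy.polynomial.laguerre import Laguerre
    from scipy.integrate import quad

    def theta_bar(n):
        # equation (5.33): theta_n(t) = sqrt(2) e^{-t} L_{n-1}(2t)
        L = Laguerre.basis(n - 1)
        return lambda t: np.sqrt(2.0) * np.exp(-t) * L(2.0 * t)

    for n in range(1, 4):
        for m in range(1, 4):
            val, _ = quad(lambda t: theta_bar(n)(t) * theta_bar(m)(t), 0, np.inf)
            print(n, m, round(val, 6))   # ~1 when n = m, ~0 otherwise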

The simplicity of these expressions is useful when the functions must be realized by a physical system. Note that θ̄_n in equation (5.35) differs from the members of the earlier set in that s²θ̄_n → const. as |s| → ∞.

Orthogonalization of Constrained Functions by Means of the Residue Condition.—Constrained exponential functions can be orthogonalized by techniques which are similar to those just developed. As a beginning, consider the functions which act as const./s^K as |s| → ∞ (K-1 constraints). Applying the condition that θ̄_n(s) θ̄_m(-s) have no poles in the right half plane for m < n yields

    θ̄_n = A_{n-1}(s) K_n / [(s - s_n1)(s - s_n2) ··· (s - s_nK)]    (5.37)

where A_{n-1} is again an all-pass function containing the poles of the first n-1 functions. The normalization constant depends on the type and number of poles. If K = 2, real and complex poles give respectively

    θ̄_n = A_{n-1}(s) √[2a_n1 a_n2 (a_n1 + a_n2)] / [(s + a_n1)(s + a_n2)]    (5.38)

and

    θ̄_n = A_{n-1}(s) 2β_n√(a_n) / (s² + 2a_n s + β_n²)    (5.39)

Since θ̄_n(s) must contain as many zeros as θ̄_{n-1}(s) has poles, it is clear that θ̄_n has K more poles than θ̄_{n-1}. Thus, a set of N constrained orthonormal functions has a total of KN poles. Chapter IV indicates, however, that N + K - 1 poles (i.e., one additional independent function for each constraint) are sufficient. Consequently, equation (5.37) is rather extravagant in the use of poles. It is shown later that this is a price which must be paid if the orthonormal constrained functions are to be defined by a simple formula.

Arbitrary zeros are generated in the orthonormal functions by letting

    θ̄_n = A_{n-1}(s) K_n (s - z_n1)(s - z_n2) ··· (s - z_nJ) / [(s - s_n1)(s - s_n2) ··· (s - s_nK)]    (5.40)

where J < K so that θ̄_n(s) → 0 as |s| → ∞. Each zero in equation (5.40) is chosen to achieve a desired homogeneous constraint condition, such as θ̄_n(t_1) = 0 or θ̄_n(s_1) = 0. To illustrate, consider the case where θ̄_n(1) = 0. Then,

    θ̄_n = A_{n-1}(s) K_n (s - 1) / [(s - s_n1)(s - s_n2)]    (5.41)

If the poles are real and located at s = -a_n1, -a_n2,

    θ̄_n = A_{n-1}(s) √[2a_n1 a_n2 (a_n1 + a_n2)/(1 + a_n1 a_n2)] (s - 1) / [(s + a_n1)(s + a_n2)]    (5.42)

Although the normalization constants for the constrained functions appear complicated, they are readily derived and present no real difficulty. As before, the equations are wasteful of poles, since N functions and K-1 constraints require KN poles.

The inefficiency of pole utilization in the above functions is a direct result of the simplified orthogonalization procedure employed. The functions are constructed so that no poles of θ̄_n(s) θ̄_m(-s) lie in the right half plane for m < n. This condition is a good deal less general than that stated by equation (5.18), which applies to the sum of residues in the right half plane. To expand these remarks, consider the residue orthogonalization

of the functions 1/(s+1)², 1/(s+1)³, etc.; that is, an orthogonalization under the constraint that lim_{|s|→∞} s θ̄_n(s) = 0. According to the theory of Chapter IV, N orthogonal functions should require N+1 poles. The first function is simply

    θ̄_1 = K_1/(s + 1)²    (5.43)

The second function has three poles and one zero s_21, which must be chosen so that

    K_1 K_2 (s - s_21) / [(-s + 1)²(s + 1)³]    (5.44)

has a zero sum of residues in the right half plane. Thus

    θ̄_2 = K_2 (s - 1/3)/(s + 1)³    (5.45)

Continuing, θ̄_3 has four poles and two zeros s_31 and s_32, and must satisfy equation (5.18) with n = 3 and m = 1, 2. Solution of simultaneous algebraic equations in s_31 and s_32 gives

    θ̄_3 = K_3 (s² - (2/3)s + 1/3)/(s + 1)⁴    (5.46)

Derivation of the higher order functions proceeds in a like manner; however, the numerical work multiplies rapidly, forcing a practical end to the calculations. Certainly, no simple formula is available for expressing the Nth function. The orthogonalization of constrained functions which have a minimum number of poles must depend, therefore, on a general orthogonalization method like the Schmidt process.
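The zero location in equation (5.45) can be found symbolically from the residue condition, as the following sketch shows (the unknown zero is called z here; normalization constants are omitted since they do not affect the residue sum's vanishing).

    import sympy as sp

    s, z = sp.symbols('s z')
    th1 = 1/(s + 1)**2                    # theta_1 up to normalization (5.43)
    th2 = (s - z)/(s + 1)**3              # theta_2 with unknown zero z
    cross = th2 * th1.subs(s, -s)         # theta_2(s) theta_1(-s)

    # the only right-half-plane pole is the double pole at s = 1
    res = sp.residue(cross, s, 1)
    print(sp.solve(res, z))               # -> [1/3], the zero of equation (5.45)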

Summary.—In conclusion, the available methods for orthogonalizing linearly independent exponential functions possess considerable variety, simplicity, and generality. Orthogonalization by transformation generates various exponential sets with different weight factors. The orthonormal residue condition allows arbitrary pole functions of both the unconstrained and constrained types to be written in simple Laplace transform formulas. All of these factors do much to make approximation by exponential functions practicable. On the negative side, the transformation method is limited by the number of known, applicable transformations and orthonormal function sets. Furthermore, the residue theory fails when applied to constrained functions having a minimum number of poles or to weight functions W(t) which are not constant. Difficulties of this sort are to be expected in a process as complicated as orthogonalization. Part two will discuss various ways of minimizing some of these deficiencies in practical applications.

VI. CHOICE OF APPROXIMATING FUNCTIONS

Earlier chapters have assumed the predetermination of the approximating functions θ_1, ..., θ_N. It is the purpose of this chapter to investigate the factors influencing the choice of these functions, or more explicitly, the choice of φ_1, ..., φ_N or the poles of H*(s). Although an optimum solution of this pole location problem exists, it is numerically impractical and is usually replaced by less exact and more feasible techniques. The following sections discuss, therefore, not only the pole optimization equations and their solution, but also the criticalness of pole locations and several of the more important approximate methods for pole determination.

Optimum Choice of Approximating Functions

Two sets of parameters characterize the approximate linear system and affect the weighted mean square error I. They are the approximation coefficients a_1, ..., a_N and the poles s_1, ..., s_M.¹ Chapter IV describes the minimization of I with respect to a_1, ..., a_N. If optimum (i.e., least WME) pole positions are also to be found, additional partial derivatives of I must be set equal to zero. Repeating equation (4.7),

    I = I_max - 2⌊a d] + ⌊a C a]    (6.1)

and setting ∂I/∂a_n = 0 for n = 1, ..., N yields the N equations obtained before,

    d] = C a]    (6.2)

1 Note M ≥ N. See Chapter II.

But I also depends on d] and C, which in turn depend on θ_1, ..., θ_N and hence on s_1, ..., s_M. Thus setting ∂I/∂s_m = 0 for m = 1, ..., M yields the further equations

    ∂I/∂s_m = -2⌊a ∂d]/∂s_m + ⌊a (∂C/∂s_m) a] = 0    (6.3)

Simultaneous solution of equations (6.2) and (6.3) for a_1, ..., a_N and s_1, ..., s_M gives the desired optimum.¹

Difficulty in obtaining an actual numerical solution of these equations arises because d_n and c_nm are complicated transcendental functions² of s_1, ..., s_M. Derivation and trial and error solution of such equations for N > 2 becomes a formidable task, even when automatic computing machines are available. An added complication is the possibility of multiple stationary points, each of which must be determined if the true optimum is to be established.

Several alternative procedures for pole optimization exist. An iterative numerical method described by Aigrain and Williams [2] makes use of an intermediate approximation by a large number of Laguerre functions. It is, however, only applicable to the approximation of impulsive responses having known Laplace transforms. Another approach is trial and error approximation by orthonormal exponential functions. Here I = I_max - Σ_{n=1}^{N} a_n² is calculated repeatedly for different combinations of pole locations until the optimum solution is found.

1 A similar set of equations written in Laplace transform variables was derived by Aigrain and Williams [1] for limited types of input functions.
2 An exception occurs when W(t) and f_o(t) are exponential and T_1 = 0, T_2 = ∞. Then d_n and c_nm are rational functions of s_1, ..., s_M. The equations, however, are still difficult to solve.

Later on, it will be shown how this method can be practically implemented with a repetitive analog computer.

Approximate Choice of Approximating Functions

Since none of the above solutions is particularly straightforward, it is important to question the criticalness of pole location. Does an approximate solution of the pole optimization problem appreciably alter approximation accuracy? For numerous practical examples the answer is fortunately no.¹ Pole positions can be changed rather drastically in certain directions with little effect on system response (provided, of course, that the approximation coefficients are always chosen for minimum error). However, in some cases the tolerance to pole position shifts may be poor, or the approximate poles may be poorly chosen. In these cases, the additional error can always be corrected by the addition of more terms (i.e., more poles) in the approximating series.

1 See the examples given by Kautz [17].

The Prony Method.—The Prony method [6,28] is one of the better techniques for approximate choice of approximating functions. Though limited to exponential functions, it has considerable generality and good accuracy. It is based on the determination of a linear differential equation with constant coefficients of order M which best fits the prescribed response. The M linearly independent exponential functions which are solutions of the equation are used as source functions for θ_1, ..., θ_N. "Best fit" means that the coefficients B_1, ..., B_M in

    B_M f_o^{(M)}(t) + ··· + B_1 f_o^{(1)}(t) + f_o(t) = e_o(t)    (6.4)

are chosen so that the WME

    I_o = ∫_{T_1}^{T_2} W_o(t) e_o²(t) dt = ∫_{T_1}^{T_2} W_o [Σ_{m=1}^{M} B_m f_o^{(m)}(t) + f_o(t)]² dt    (6.5)

is a minimum. The exponential functions are the M linearly independent solutions of the equation

    B_M f^{(M)}(t) + ··· + B_1 f^{(1)}(t) + f(t) = 0    (6.6)

Hence, the poles of the approximation F_o*(s) are the roots of the equation

    B_M s^M + ··· + B_1 s + 1 = 0    (6.7)

It should be noted that the poles produced by the Prony method are in no way related to the poles defined by the optimization equations of the previous section. Nothing is said relative to the coefficients of each exponential function; only the exponential decay constants are considered. Of course, the poles determined will be exact if f_o(t) is itself a sum of M exponential functions. The weight factor W_o(t) has the same function as previously discussed weight factors: it emphasizes the undesirability of error as a function of time.

The constants B_1, ..., B_M are found by setting ∂I_o/∂B_m = 0 for m = 1, ..., M. Defining

    f_kj = ∫_{T_1}^{T_2} W_o f_o^{(k)} f_o^{(j)} dt    (6.8)

equation (6.5) becomes

    I_o = Σ_{j=1}^{M} Σ_{k=1}^{M} B_j B_k f_jk + 2 Σ_{k=1}^{M} B_k f_k0 + f_00    (6.9)

and

    ∂I_o/∂B_m = 0 = 2B_1 f_m1 + 2B_2 f_m2 + ··· + 2B_M f_mM + 2f_m0,    m = 1, ..., M    (6.10)

Or, in matrix notation, equations (6.10) are simply

    [f_11 ··· f_1M] [B_1]      [f_10]
    [ ⋮        ⋮  ] [ ⋮ ]  = - [ ⋮  ]    (6.11)
    [f_M1 ··· f_MM] [B_M]      [f_M0]

After the constants B_1, ..., B_M are obtained by solution of equation (6.11),¹ they are substituted in equation (6.7), which is then solved for the M roots which are the approximation poles.

1 Equation (6.11) has a solution if the determinant of the matrix is non-zero. If the determinant of the matrix is zero, then the functions W_o^{1/2} f_o^{(M)}, ..., W_o^{1/2} f_o^{(1)} are not linearly independent. This means that f_o satisfies a linear differential equation with constant coefficients of order less than M. A lower order matrix will thus yield the coefficients B_m which define the exponential functions of which f_o must be composed.
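Equations (6.8) through (6.11) translate directly into a short numerical routine. The sketch below is an assumption-laden illustration (numerical differentiation via np.gradient stands in for the analytic derivatives, and the test response is invented); a practical version would use the integrated form of observation 4 below when derivatives are noisy.

    import numpy as np

    def prony_poles(t, f, M, w=None):
        # Prony pole estimate per equations (6.8)-(6.11); w is W_o(t) (default 1)
        w = np.ones_like(f) if w is None else w
        derivs = [f]                              # f^(0), f^(1), ..., f^(M)
        for _ in range(M):
            derivs.append(np.gradient(derivs[-1], t))
        # f_kj = integral of W_o f^(k) f^(j) dt, equation (6.8)
        fkj = np.array([[np.trapz(w * derivs[k] * derivs[j], t)
                         for j in range(M + 1)] for k in range(M + 1)])
        B = np.linalg.solve(fkj[1:, 1:], -fkj[1:, 0])     # equation (6.11)
        # equation (6.7): poles are roots of B_M s^M + ... + B_1 s + 1 = 0
        return np.roots(np.concatenate((B[::-1], [1.0])))

    # check on a known two-exponential response: poles should be near -1 and -3
    t = np.linspace(0, 10, 4001)
    print(prony_poles(t, np.exp(-t) - 0.5*np.exp(-3*t), 2))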

A number of observations concerning the Prony method can be made:

1. It requires complicated calculations (the inversion of an Mth order matrix and the solution of an Mth degree algebraic equation). The calculations are, however, easily made by a high speed digital computer.

2. The poles obtained are not necessarily in the left half plane. If unstable poles are near the imaginary axis, they may be shifted into the left half plane with little added error. If they are deep inside the right half plane, the entire procedure must be repeated with a more favorable W_o or (T_1, T_2).

3. The constants f_kj can be computed even if f_o is not expressed in analytic form. This means the Prony method is applicable when f_o is given as experimental data.

4. Equation (6.4) may be integrated an arbitrary number of times before minimization of I_o. This weights more heavily the undesirability of approximation errors in the low frequency region and allows some or all of the derivatives of f_o to be replaced by integrals of f_o. The latter change is particularly important when the derivatives of f_o are difficult to obtain. This may be the case when f_o is obtained from experimental measurements.

The Prony method has been discussed in some detail because of its close relation to WME approximation and its generality. One other method, which is perhaps less exact but simpler to apply, will be described now.

The Preliminary Approximation Method.—In this method a preliminary approximation is made by an approximation procedure different from the WME approach, and the poles of the preliminary approximation are used as the poles of the WME approximation. While many types of

preliminary approximations may be used,¹ those which have reasonable simplicity are preferred. Depending on accuracy and computational complexity, the approximation may require varying degrees of effort.

Kautz [17], for example, discusses an accurate preliminary approximation which employs a continued fraction expansion. In his method F_o(s) is expanded in a power series about a point² in the complex plane, and the leading terms of this power series are then duplicated by a continued fraction expansion. The poles of the terminated expansion become, finally, the poles of the WME approximation. The overall work is somewhat less than that required by the Prony method and is better suited to hand calculations. While continued fraction expansions have certain advantages (among them control of approximation error in the time domain), other kinds of preliminary approximations of similar complexity are undoubtedly as good. Their study is, however, beyond the scope of this investigation.

1 See the references given in Chapter I.
2 The point is chosen to make approximation errors small in certain time regions.

A less precise method of preliminary approximation is to take the poles of well known approximations which have the same characteristics as the prescribed response. For example, a WME low-pass filter might use the poles of a Butterworth or Tchebycheff low-pass filter. Or a prescribed weighting function h(t) which is similar to an exponentially damped periodic function might be approximated by a sum of exponentially damped harmonic sine and cosine functions. Though empirical, this approach is simple and often very successful. To aid its application, several sets of functions with different approximation

qualities are tabulated in the appendix. Their use is illustrated in the next chapter. From the preceding discussion it is clear that the method of preliminary approximation does not lend itself to a brief and specific description. Further elaboration is therefore limited to the examples of part two. To review, the optimum determination of approximating functions is very difficult and must usually be replaced by approximate methods. The Prony method and preliminary approximation method often offer satisfactory solutions when fo is to be approximated by a sum of exponential functions. In other cases little can be done to solve what is an extremely complicated problem, and good approximation accuracy must be achieved by the expedient of including many terms in the approximating series.

PART TWO APPLICATIONS


VII. IMPULSE RESPONSE APPROXIMATION

Since a large fraction of approximation problems are stated by prescribing the impulse response, it is desirable to consider separately and in detail the factors relevant to WME approximation of impulse responses. As noted earlier, the impulse response approximation problem is a special case of the more general problem in which f_i(t) = u_o(t), f_o(t) = h(t), and θ_n(t) = φ_n(t) (n = 1, ..., N), and in which the approximating functions are linear combinations of exponential functions. This restriction on the form of the approximating functions permits not only the development of the orthonormal function formulas of Chapter V, but also the approximation procedures which follow. The description of these procedures is the purpose of this chapter and divides conveniently into two sections: (1) the analytic approximation of h(t) (f_o(t)) from its Laplace transform H(s) (F_o(s)), and (2) the analog computer approximation and simulation of h(t). To illustrate further and to apply the results of these sections, a concluding section presents several examples.

Laplace Transform Approximation Procedures

Since the approximating functions are linear combinations of exponential functions, it is possible to write

    θ_n = Σ_{m=1}^{M} ρ_nm e^{s_m t}    or equivalently    θ] = R e^{st}]    (7.1)

where e^{st}] is the column matrix of the exponentials e^{s_1 t}, ..., e^{s_M t} and R is the N × M matrix of the coefficients ρ_nm.

Here s_1, ..., s_M are the poles of the approximation. As indicated before, the determination of the approximation coefficients a] requires the calculation of the matrices C and d], and the inversion of C. Expressing C in terms of equation (7.1) gives (assuming for the time being that W(t) = 1)

    C = ∫_0^∞ θ] ⌊θ dt = R { ∫_0^∞ e^{st}] ⌊e^{st} dt } R^T = R [-1/(s_n + s_m)] R^T    (7.2)

where [-1/(s_n + s_m)] denotes the M × M matrix whose element in row n and column m is -1/(s_n + s_m). Usually, the above formula is simplified by taking R = I. If orthonormal functions are employed, equation (7.2) is not required, since C = I. In a similar manner,

    d] = ∫_0^∞ f_o(t) θ] dt = R ∫_0^∞ f_o(t) e^{st}] dt    (7.3)

But ∫_0^∞ f_o(t) e^{s_m t} dt = F_o(-s_m), the Laplace transform of the prescribed response at s = -s_m. Hence,

    d] = R F_o(-s_m)]    (7.4)

where F_o(-s_m)] is the column of the numbers F_o(-s_1), ..., F_o(-s_M).
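The whole procedure with R = I fits in a few lines of numerical code. The sketch below is an illustration under invented data: the poles and the prescribed response f_o(t) = t e^{-t} (so F_o(s) = 1/(s+1)²) are arbitrary choices, not an example from the text.

    import numpy as np

    s = np.array([-1.0, -2.0, -3.0])         # poles of the approximation
    Fo = lambda p: 1.0/(p + 1.0)**2          # F_o(s) for f_o(t) = t e^{-t}

    # with R = I: c_nm = -1/(s_n + s_m) and d_n = F_o(-s_n), per (7.2) and (7.4)
    C = -1.0/(s[:, None] + s[None, :])
    d = Fo(-s)
    a = np.linalg.solve(C, d)                # a] = C^{-1} d]

    # compare f_o*(t) = sum a_n e^{s_n t} with f_o(t) at a few instants
    t = np.array([0.5, 1.0, 2.0, 4.0])
    print(np.exp(t[:, None]*s[None, :]) @ a) # approximation
    print(t*np.exp(-t))                      # prescribed response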

Because F_o(s) is known to be regular in the right half plane, the numbers F_o(-s_m) exist and yield by equation (7.4) the coefficients d_1, ..., d_N.

From the preceding paragraphs it is clear that there are two alternative approaches for analytic approximation of f_o(t) from F_o(s):

1. The orthonormal function approach. By applying the techniques of Chapter V, C = I, and the calculation and inversion of C is avoided. The determination of d] = a] must be preceded, however, by the evaluation of R, which in turn requires a partial fraction expansion of each orthonormal function θ̄_1, ..., θ̄_N.

2. The exponential function approach.¹ The matrix R is made equal to the identity matrix I, so that d] is simply the column of the numbers F_o(-s_1), ..., F_o(-s_M), and a] = C^{-1} d]. The necessary calculation and inversion of C complicates this procedure.²

1 This is substantially the procedure proposed by Kautz [17].
2 For a given set of poles the calculation of C^{-1} in approach 2 can be made from R in approach 1. The orthonormal functions give f_o* = ⌊a θ̄] = ⌊d θ̄] = ⌊F_o(-s_1) ··· F_o(-s_M) R^T R e^{st}], while the exponential functions yield f_o* = ⌊a e^{st}] = ⌊F_o(-s_1) ··· F_o(-s_M) C^{-1} e^{st}]. By comparing these equations it is seen that C^{-1} = R^T R.

Little difference in overall work exists between the two approaches if all the approximation details are worked out from the beginning. However, when the inverse matrix C^{-1} is known prior to approximation, the second approach is especially convenient to use. To make such a technique possible, a tabulation of inverse matrices for certain pole combinations is given in the appendix. Slight variations in the above formulas are made to avoid the introduction of the complex time functions e^{s_m t} and to simplify the form of approximations. A set of constrained approximating functions is also included. For these and other details see the appendix and the examples at the end of this chapter.

Although the above description of mean square approximation by means of the Laplace transform has been limited to simple approximations, the constraint equations and constrained functions of Chapters IV and V are also applicable. The only difference lies in the evaluation of the coefficients d_1, ..., d_N by means of equation (7.4).

When the weight factor W(t) is not a constant the theory is not as simply extended, since the integral

    ∫_0^∞ W(t) f_o(t) θ_n(t) dt

cannot, in general, be interpreted in terms of F_o(s). Furthermore, as mentioned in Chapter V, the residue orthogonalization of exponential functions with respect to an arbitrary W(t) is not practical. However, one particular type of weight factor does permit the application of the Laplace transform procedure to non-orthogonal approximating functions. It is defined by the sum of exponential functions

    W(t) = Σ_{k=1}^{K} w_k e^{s_k t}    (7.5)

With this weight factor¹

    C = Σ_{k=1}^{K} w_k ∫_0^∞ e^{s_k t} θ] ⌊θ dt = Σ_{k=1}^{K} w_k R [-1/(s_n + s_m + s_k)] R^T    (7.6)

and

    d] = Σ_{k=1}^{K} w_k ∫_0^∞ e^{s_k t} f_o(t) θ] dt = Σ_{k=1}^{K} w_k R F_o(-s_m - s_k)]    (7.7)

Though the computational effort is evidently increased, the weight W(t) given by equation (7.5) is quite flexible and can be made to cover a wide range of practical situations with relatively few terms. Again, much of the detailed work would be eliminated by a tabulation of inverse matrices for various pole combinations and weight factors.

1 The terms of equation (7.5) may also contain multiplicative integral powers of t. Equation (7.6) would then contain terms such as (s_n + s_m + s_k)^{-2}, and equation (7.7) the derivatives of F_o(s).

Predistortion² of the prescribed response is another method for handling a somewhat more restricted type of weight factor. It has the advantage that it may be applied when the approximating functions are unweighted orthonormal functions. To begin, the WME integral

2 The predistortion method is discussed by Kautz. However, some of the weight factors proposed by him are unbounded and lead to divergent coefficient integrals.

    I = ∫_0^∞ W [f_o - f_o*]² dt

is written as an unweighted mean square error integral

    I = ∫_0^∞ [W^{1/2} f_o - W^{1/2} f_o*]² dt

in which the predistorted prescribed response W^{1/2} f_o is approximated by W^{1/2} f_o*, a sum of unweighted approximating functions ⌊a θ̄*]. It is clear, then, that

    d] = ∫_0^∞ W^{1/2} f_o θ̄*] dt    (7.8)

and

    W^{1/2} f_o* = ⌊a θ̄*],    f_o* = W^{-1/2} ⌊a θ̄*]    (7.9)

If the impulse response h*(t) = f_o*(t) in equation (7.9) is to be a sum of complex exponentials (⌊a θ̄*] is already such a sum), then W^{-1/2} must also be a sum of complex exponentials, say

    W^{-1/2} = Σ_{k=1}^{K} w_k* e^{s_k* t}    (7.10)

Substituting equation (7.10) in equation (7.9) and taking the Laplace transform of the result yields the final approximation

    F_o*(s) = Σ_{k=1}^{K} w_k* Σ_{n=1}^{N} a_n θ̄_n*(s - s_k*)    (7.11)

Several disadvantages of the predistortion method are obvious: (1) the approximation has KM poles rather than M poles;¹ (2) the evaluation of the integrals

    d_n = ∫_0^∞ f_o(t) θ̄_n*(t) / [Σ_{k=1}^{K} w_k* e^{s_k* t}] dt

in equation (7.8) becomes difficult, eliminating one of the main advantages of the Laplace transform procedure; (3) the class of allowed weight factors is restricted. Consequently, W^{-1/2} is usually limited to one or two real exponential functions. When W^{-1/2} = e^{-μt/2}, the formulas become particularly simple:

    W = e^{μt}    (7.12)

    d] = R F_o(-s_m - μ/2)]    (7.13)

    F_o*(s) = Σ_{n=1}^{N} a_n θ̄_n*(s + μ/2)    (7.14)

It is this special case of predistortion that has the most practical value. More complicated weight factors are treated better with non-orthogonal functions and equations (7.6) and (7.7).

1 A judicious choice of poles in θ̄_1*, ..., θ̄_N* and of s_1*, ..., s_K* can reduce considerably this number of poles.

In summary, the unweighted mean square error approximation of impulse responses having known Laplace transforms is straightforward,

requiring only the values of F_o(s) at s = -s_1, ..., -s_M. Either orthonormal or independent exponential approximating functions can be used, and they involve similar amounts of computational effort except when tables of inverse matrices are available; then the independent exponential functions are preferred. Constrained approximations can be made simply, but the introduction of arbitrary weight factors is more difficult and must be limited primarily to non-orthogonal approximating functions and the weight factor defined by equation (7.5).

Analog Computer Approximation Procedures

The Laplace transform procedures of the previous section have two primary disadvantages: (1) they are limited to prescribed response functions having known Laplace transforms, and (2) they require lengthy calculations to obtain the approximate function f_o*(t) and the error e = f_o - f_o*. Conventional digital or analog computing techniques offer the most obvious and straightforward solution to these problems. An alternative computer method, more closely related to the realization aspects of linear system synthesis, is proposed in the present section. It is based on the physical realization of the functions φ_1, ..., φ_N (i.e., θ_1, ..., θ_N) which make up H*(s) and depends on these physical realizations to compute the approximation coefficients and to simulate the approximate system. The following discussion of this "analog" computer approximation procedure includes the mathematical basis for analog computation of the approximation coefficients, the electronic differential analyzer circuits for realizing non-orthogonal and orthonormal approximating functions, a computer method for trial and error solution of the pole optimization problem, computer evaluation of exponentially weighted coefficient integrals, and the effects of computer inaccuracy.

Analog Computation of Approximation Coefficients.—Figure 7.1 shows the coefficient computation process as applied to the integral

    d_n = ∫_0^∞ θ_n(t) f_o(t) dt

A physical system which is synthesized to have a transfer function θ_n(s) receives as an input a forcing function which is made equal to f_o(-t). In practice f_o(-t) would be obtained from f_o(t) by some kind of function generator, perhaps a tape recorder run backwards.

[Figure 7.1 Physical System for Computation of the Integral d_n]

By means of the superposition integral, the output of the system is

    output = ∫_0^∞ θ_n(τ) f_o(τ - t) dτ    (7.15)

and at t = 0 is seen to be equal to the desired coefficient d_n. If need be, the coefficients c_nm can be obtained in the same manner by replacing f_o(-t) with θ_m(-t).

If the physical systems θ_1, ..., θ_N were able to compute just the coefficients and nothing else, the above scheme would have little advantage over a direct analog computer integration of the formulas for d_1, ..., d_N.¹ However, the physical systems can also be combined as in Figure 2.3 to simulate H*(s) and to measure the approximation error e = f_o - f_o* for arbitrary input functions. When the complexity of the simulation is not too great, it may even be used directly as the final approximate system, bypassing the usual realization step of the synthesis problem.

Differential Analyzer Circuits for Realization of Approximating Functions.—Various physical embodiments of the functions θ_1, ..., θ_N are possible, but those which employ passive or active electrical networks are favored because of their simplicity and inherent accuracy. The physical systems which follow are examples of lumped² active networks and make use of electronic differential analyzer computing elements.³

1 The use of the convolution property to compute integrals is well known (see, for example, Laning and Battin [21]). McCool [27] discusses the application of a second order linear system with no damping to the determination of Fourier series coefficients, and Bose [5] utilizes the orthonormal Laguerre functions to find the coefficients a_1, ..., a_N as functions of time in a non-linear system representation. Only the last two references are concerned with WME approximation.
2 Nothing precludes the use of linear systems which are not lumped.
3 Korn and Korn [19] discuss electronic differential analyzers and explain the standard circuit symbology that follows.
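The convolution trick of Figure 7.1 is easy to reproduce in simulation. The following sketch (an illustration under assumed data: the first-order system θ_n(s) = 1/(s+2) and the response f_o(t) = t e^{-t} are invented, and a finite start time stands in for t = -∞) drives the system with the time-reversed response and reads the output at t = 0.

    import numpy as np
    from scipy.signal import lti, lsim
    from scipy.integrate import quad

    sys = lti([1.0], [1.0, 2.0])            # theta_n(s) = 1/(s + 2)
    fo = lambda t: t*np.exp(-t)             # prescribed response
    T = 30.0                                # effectively t = -infinity start
    t = np.linspace(0.0, T, 20001)

    # input f_o(-t): run the clock from t = -T up to t = 0
    _, y, _ = lsim(sys, U=fo(T - t), T=t)

    print(y[-1])                                               # output at t = 0
    print(quad(lambda u: np.exp(-2*u)*fo(u), 0, np.inf)[0])    # d_n directly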

Simpler passive networks can also be derived but lack convenience and flexibility. Figure 7.2 shows the circuits for obtaining the commonly used functions

    1/(s + α),    1/(s² + 2αs + β²),    s/(s² + 2αs + β²)

[Figure 7.2 Analog Computer Circuits for Non-orthogonal Functions]

The pole positions are changed with the coefficient potentiometers, which have the settings α and β. As in Chapter V, α is the negative real part of the pole, and β is the distance of the pole from the origin.¹

When the orthonormal functions of Chapter V are to be realized, the circuits become more complex, and a cascade connection of individual elements, as in Figure 7.3, is preferred. For the functions defined by equations (5.35) and (5.36) the ratio A_n/A_{n-1} assumes either

1 Otterman [30] employs circuits of this type for realization of Laguerre function approximations, but he computes the coefficients analytically.

the form (s - α_n)/(s + α_n) or the form (s² - 2α_n s + β_n²)/(s² + 2α_n s + β_n²). Figure 7.4 gives the circuits for realizing these functions along with taps for obtaining θ̄_n(s).

[Figure 7.3 Cascade Connection of All-Pass Elements for Realization of Orthonormal Functions]

[Figure 7.4 Analog Computer Circuits for Orthonormal Functions. (a) Real Pole. (b) Complex Poles.]

When the individual elements are connected in cascade, as in Figure 7.3, most of the summing operations that take place in the end amplifiers of Figures 7.4a and 7.4b can be performed at integrator inputs, so that only one summing amplifier and N integrating amplifiers are needed for a total of N orthonormal functions.

It should be noted that the orthonormal functions generated by the preceding circuits require no preliminary mathematical calculations. They are produced entirely by the physical arrangement of the circuit components and remain orthonormal even when the pole positions are varied by changing the computer settings of α_n and β_n. A set of such circuits could be connected permanently in a practical, special purpose computer for orthonormal function approximation and simulation with arbitrary pole positions.

A further advantage of orthonormal function approximation is the simplicity of evaluating the approximation coefficients a] = d]. The matrix R is not required as in the Laplace transform procedure, since a_n = d_n is obtained directly from the system function θ̄_n(s) and the forcing function f_o(-t). This eliminates one of the main disadvantages of the orthonormal function expansions given in Chapter V. It also suggests a simplified computer method for optimum pole determination.

Computer Method for Pole Optimization.—The computer block diagram for pole optimization is shown in Figure 7.5. A series of squaring devices and a summing amplifier are used in conjunction with the orthonormal function circuits to compute

    Σ_{n=1}^{N} a_n²

and consequently the mean square approximation error

    I = I_max - Σ_{n=1}^{N} a_n²

Because the approximation error is so easily obtained, it is practical to choose the optimum pole positions through trial and error adjustment of the N computer settings α_n and β_n.¹ Suitable convergence schemes, such as the method of steepest descent, would speed the process, and the computer could be run repetitively at high speed so that changes in I could be observed almost instantaneously.

[Figure 7.5 Computer Block Diagram for Pole Optimization]

The major disadvantage of the method is the high computer accuracy which is demanded, because I becomes small as the square of the approximation error. For example, an average approximation error of 1 percent might require a computer error of .01 percent. Practically, then, computer optimization must be limited to a reasonable number of poles, perhaps four or five. Fortunately, it is the simple approximation with relatively few poles where optimum pole determination is most important.

1 Trial and error procedures for system approximation have been proposed by many authors [9, 42]. As far as is known, none have used orthonormal functions to automatically compute the zeros of H*(s) and thus halve the number of unknowns.
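The trial and error search is also easy to mimic digitally. The sketch below (assumptions: the two-exponential response and the starting point are invented, and a derivative-free search stands in for the manual adjustment; a practical version must guard against coincident poles, which make C singular) minimizes I = I_max - ⌊d C^{-1} d] over real pole positions.

    import numpy as np
    from scipy.integrate import quad
    from scipy.optimize import minimize

    fo = lambda t: np.exp(-t) + np.exp(-4*t)     # prescribed impulse response
    Imax = quad(lambda t: fo(t)**2, 0, np.inf)[0]

    def wme(a):
        # best-coefficient WME for real poles s_n = -a_n
        s = -np.abs(np.asarray(a))
        C = -1.0/(s[:, None] + s[None, :])
        d = np.array([quad(lambda t, p=p: fo(t)*np.exp(p*t), 0, np.inf)[0]
                      for p in s])
        return Imax - d @ np.linalg.solve(C, d)

    best = minimize(wme, x0=np.array([0.5, 3.0]), method='Nelder-Mead')
    print(best.x, best.fun)    # converges toward a_n = 1, 4 with I near zero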

Computation of Weighted Integrals.—Little elaboration on the previous remarks concerning the weight factor W(t) and constrained approximations is needed here. The introduction of the weight factor W(t) again offers the major difficulty. If the integral

    ∫_0^∞ e^{s_k t} f_o(t) θ_n(t) dt

is needed, as in equation (7.7), it can be evaluated from the physical system shown in Figure 7.6 by noting that the impulse response e^{s_k t} θ_n(t) corresponds to a system function

    ∫_0^∞ e^{-st} e^{s_k t} θ_n(t) dt = θ_n(s - s_k)    (7.16)

[Figure 7.6 Physical System for Computation of Exponentially Weighted Integral]

and that at t = 0 the convolution integral gives the indicated output.

When the physical system is an electrical network, the change from θ_n(s) to θ_n(s - s_k) is easily made if s_k is real. All the dynamic elements of the system are modified as shown in Figure 7.7. Note that the change in impedance for the inductor or the capacitor always produces a negative s_k. Positive values of s_k must be obtained with active elements, as the example in Figure 7.7d demonstrates.

[Figure 7.7 Circuit Modifications for Changing s into (s - s_k). (a) Capacitive Impedance. (b) Inductive Impedance. (c) and (d) Integrator System Function.]

If the function W(t) is not a sum of real exponentials (this happens in equations (7.6) and (7.7) when the s_k are complex), the computation of the weighted integrals must assume the more direct approach discussed earlier. The integral

    ∫_0^∞ W f_o θ_n dt

is calculated by forcing the system θ_n(s) with W(-t) f_o(-t) and measuring the output at t = 0. The main complication, if it can be considered a complication, is the generation of the function W(-t) f_o(-t).

Error Analysis.—Last of all, it is important to consider the effects of computer errors. As in most computational problems, digital or analog, complete error analysis is very difficult and must be replaced in part by a comparison of computer solutions with known analytical solutions. This has been done in several examples in the next section, with results that appear quite satisfactory for average approximation problems. In a theoretical sense, it is also possible to say something about the relations between errors in the approximation coefficients d] and the WME of the resulting approximation. While such relations are important and are developed in the following paragraphs, it must be remembered that their final application depends on knowledge of the size of the errors in d] and the dependence of these errors on computer errors.

To discover the effect of errors in d], it is convenient to write the computed coefficients as d] + Δd], where the elements of Δd] represent the computer errors in d_1, ..., d_N.¹ Because the coefficients are no longer the optimum coefficients, the WME will be increased from I by an amount ΔI defined by

    I + ΔI = ∫_{T_1}^{T_2} W [f_o - ⌊θ C^{-1} (d] + Δd])]² dt    (7.17)

1 The errors are defined on the basis of d_n = ∫_{T_1}^{T_2} W f_o θ_n dt.

Expansion of equation (7.17) gives

    I + ΔI = ∫_{T_1}^{T_2} W [(f_o - f_o*) - ⌊θ C^{-1} Δd]]² dt
           = I - 2 ∫_{T_1}^{T_2} W f_o ⌊θ dt C^{-1} Δd] + 2 ⌊d C^{-1} {∫_{T_1}^{T_2} W θ] ⌊θ dt} C^{-1} Δd]
             + ⌊Δd C^{-1} {∫_{T_1}^{T_2} W θ] ⌊θ dt} C^{-1} Δd]
           = I - 2 ⌊d C^{-1} Δd] + 2 ⌊d C^{-1} Δd] + ⌊Δd C^{-1} Δd]

and

    ΔI = ⌊Δd C^{-1} Δd]    (7.18)

Thus the increase in the WME is readily determined when Δd] is known. If it is found that ΔI is small compared to I, then the computer errors in d] can be considered negligible. From the standpoint of ultimate approximation error, formula (7.18) is the logical one to apply. A less valid approach is to find the errors in the coefficients a] by

    Δa] = C^{-1} Δd]    (7.19)

In many cases (especially when the signs of the terms in

    Δa_n = Σ_{m=1}^{N} (C^{-1})_{nm} Δd_m

alternate) ΔI is small even when Δa_1, ..., Δa_N are very large. Such results are possible because the effect of a large error in one coefficient may be compensated by corrective changes in the remaining coefficients.

While equation (7.18) is applicable when Δd] is known, it does not say anything about the sensitivity of ΔI to Δd] when Δd] is not known. In this respect it is helpful to consider the relationship between

    E = Σ_{n=1}^{N} Δd_n² / Σ_{n=1}^{N} d_n²    (7.20)

the relative mean square error in d_1, ..., d_N, and

    ΔI_rel = ΔI/I_max = ΔI / ∫_{T_1}^{T_2} W f_o² dt    (7.21)

the relative mean square approximation error due to Δd_1, ..., Δd_N. If k_1 is the smallest eigenvalue of C^{-1} and k_N is the largest eigenvalue of C^{-1}, then [8]

    k_1 Σ_{n=1}^{N} Δd_n² ≤ ⌊Δd C^{-1} Δd] ≤ k_N Σ_{n=1}^{N} Δd_n²    (7.22)

and

    k_1 Σ_{n=1}^{N} d_n² ≤ ⌊d C^{-1} d] ≤ k_N Σ_{n=1}^{N} d_n²    (7.23)

But I_max ≈ ⌊d C^{-1} d] for I << I_max, so

    ΔI_rel = ⌊Δd C^{-1} Δd]/I_max ≈ ⌊Δd C^{-1} Δd]/⌊d C^{-1} d]    (7.24)

Hence,

    ΔI_rel ≈ r E,    I << I_max    (7.25)

where

    k_1/k_N ≤ r ≤ k_N/k_1    (7.26)

Equation (7.26) puts bounds on r, the sensitivity of the relative mean square approximation error to relative mean square coefficient error. When k_N/k_1 is large, the sensitivity may be very large or very small, depending on the particular values of d_1, ..., d_N and Δd_1, ..., Δd_N.¹ From an uncertainty point of view this is undesirable, and it would be better if k_N/k_1 = 1. This is the case when θ_1, ..., θ_N are orthonormal. Thus, orthonormal approximating functions have a decided advantage over non-orthonormal functions for analog computation of approximation coefficients. If non-orthonormal functions must be employed, some check on final approximation accuracy should be made. Usually, this involves a computer measurement of e = f_o - f_o*; the calculation of k_N/k_1 is more difficult and less conclusive.

1 This is especially true for high order matrices, where k_N/k_1 tends to be large.

To review, it has been shown that WME approximation by analog computation is practical and has many advantages. Among them are the simple computation of weighted or unweighted approximation coefficient integrals, convenient realization and simulation of H*(s), rapid error calculation, and a computer method of pole optimization. Although computer errors are an added problem, they are small in most practical problems and are minimized by the use of orthonormal approximating functions.
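The sensitivity bound (7.26) can be checked numerically. In the sketch below (an illustration under invented data: closely spaced poles make C ill-conditioned, and the nominal d] and errors Δd] are arbitrary), the observed ratio r = ΔI_rel/E always falls between the eigenvalue bounds.

    import numpy as np

    rng = np.random.default_rng(0)

    # Gram matrix (7.2) for three closely spaced real poles: ill-conditioned C
    s = np.array([-1.0, -1.1, -1.2])
    C = -1.0/(s[:, None] + s[None, :])
    Cinv = np.linalg.inv(C)
    k = np.linalg.eigvalsh(Cinv)             # k[0] = k_1, k[-1] = k_N

    d = np.array([0.9, 0.8, 0.7])            # nominal coefficient integrals
    dd = 1e-3*rng.standard_normal(3)         # coefficient errors Delta d

    E = dd @ dd / (d @ d)                        # equation (7.20)
    dI_rel = dd @ Cinv @ dd / (d @ Cinv @ d)     # equations (7.18) and (7.24)
    print(k[0]/k[-1], dI_rel/E, k[-1]/k[0])      # r lies between the bounds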

-89integrals, convenient realization and simulation of H*(s), rapid error calculation, and a computer method of pole optimization. Although computer errors are an added problem, they are small in most practical problems and are minimized by the use of orthonormal approximating functions. Examples The examples of WME impulse response approximation given in this section are presented to illustrate the conclusions of the preceding sections and to show the practicality of the approximation procedures. They are summarized with pertinent approximation information in Table I and are described in detail in the following paragraphs. Tables II and III contain supporting numerical results while Figures 7.8 through 7.15 show plots of fo and fo* The computing components employed in examples four, six, and eight through eleven were part of a general purpose, direct current electronic differential analyzer. All summing amplifiers, integrating amplifiers, and coefficient potentiometers were calibrated within 0.1 percent and were connected according to the diagrams of the previous section. A relay control device was incorporated to disconnect all integrator inputs at t = O. This held the output voltages so they could be measured statically with an accuracy comparable to component accuracy. The first example in Table I involves the analytic approximation of a fourth power pulse by six non-orthogonal Laguerre functions1 whose poles are at s = -1. The pulse is defined by 1See the Appendix, section 1.

TABLE I. EXAMPLE INFORMATION

Example  Prescribed Response       Type of Approximation    Approximating Functions*        Pole Positions                      N   W(t)   Figure
1        fourth power pulse,       unconstrained analytic   non-orthogonal Laguerre, (1)    α_n = 1, β_n = 0                    6   1      7.8
         equation (7.27)
2        fourth power pulse,       unconstrained analytic   non-orthogonal constrained      α_n = 1, β_n = 0                    5   1      7.9
         equation (7.27)                                    Laguerre, (2)
3        fourth power pulse,       constrained analytic,    non-orthogonal constrained      α_n = 1, β_n = 0                    5   1      7.10
         equation (7.27)           f_o*'(0) = 0             Laguerre, (2)
4        fourth power pulse,       unconstrained computer   non-orthogonal Laguerre, (1)    α_n = 1, β_n = 0                    6   1      7.11
         equation (7.27)
5        fourth power pulse,       unconstrained analytic   orthonormal Laguerre            α_n = 1, β_n = 0                    6   1      --
         equation (7.27)
6        fourth power pulse,       unconstrained computer   orthonormal Laguerre            α_n = 1, β_n = 0                    6   1      --
         equation (7.27)
7        delayed pulse,            unconstrained analytic   Butterworth, (6)                α_1 = 1, α_2 = .809, α_3 = .309,    5   1      7.12
         equation (7.28)                                                                    β_n = 1.044

* Numbers refer to Appendix sections.

TABLE I. (CONT'D)

Example  Prescribed Response       Type of Approximation    Approximating Functions*   Pole Positions                          N   W(t)       Figure
8        trapezoid pulse,          unconstrained computer   orthonormal                α_1 = 1, α_2 = .75, α_3 = .5,           7   1          7.13
         equation (7.29), T = 7                                                        α_4 = .25, β_n = 1
9        trapezoid pulse,          unconstrained computer   orthonormal                α_1 = 1, α_2 = 2/3, α_3 = 1/3,          5   1          7.14
         equation (7.29), T = 4                                                        β_n = 1
10       trapezoid pulse,          unconstrained computer   orthonormal**              α_1 = .75, α_2 = 5/12, α_3 = 1/12,      5   e^{0.5t}   7.14
         equation (7.29), T = 4                                                        β_2 = .856, β_3 = .946
11       matched filter            unconstrained computer   orthonormal                α_1 = 1, α_2 = .809, α_3 = .309,        5   1          7.15
         response                                                                      β_n = 1

* Numbers refer to Appendix sections.
** Obtained by predistortion. Poles of final approximation are those of example 9.

TABLE II. APPROXIMATION DATA

Example Number   I_max     I_rel*     ΔI_rel         [remaining columns: headings illegible in source]
1                3.2507    .00651     --             .114    .065    10
2                3.2507    .00684     --             .092    .067    10
3                3.2507    .00845     --             .095    .074    10
4                3.2507    .0102      .000365        --
5                3.2507    .00651     --             --
6                3.2507    .00652     5.92 × 10⁻⁶    --
7                2.9556    .0999      --             .810    .243    10
8                4.6667    .0066      --             .054    .083    9
9                2.6667    .0098      --             .133    .102    5
10               5.8100    .0220      --             .178    .179    8
11               --        --         --             --

* In computer approximations I is obtained from the computer coefficients.

TABLE III. APPROXIMATION COEFFICIENTS

Example                       Coefficient Number
Number          1           2           3           4           5           6           7
1     d_n   0.218425    0.464987    0.633805    0.689099    0.645712    0.554782
      a_n   0.1136     -1.4244      7.7915    -17.9408     26.9405    -11.5127
2     d_n   0.464987    0.633805    0.689099    0.645712    0.534782
      a_n  -0.8565      6.2768    -15.6691     25.1229    -10.7069
3     d_n   0.464987    0.633805    0.689099    0.645712    0.554782
      b_n   0.0000      2.8509     -9.5025     19.6415     -8.7493
4     d_n   0.220       0.467       0.634       0.690       0.644       0.55
      a_n  -0.132       0.696       0.400      -4.832      15.168      -7.040
5     a_n   0.50890    -1.00628     1.26388    -0.67687    -0.05930     0.24999
6     a_n   0.512      -1.008       1.265      -0.678      -0.059       0.252
7     d_n   0.0183156   0.0552155   0.0551941   0.6062901   0.0170925
      a_n  14.5785    -20.2825      2.8474      4.9605     -0.0038
8     a_n   1.599      -0.204       1.543      -0.595      -0.505      -0.202      -0.085
9     a_n   1.529      -0.599      -0.706       0.150       0.004
10    a_n   1.885      -1.167      -0.857       0.043       0.196
11    a_n   0.338      -0.4621     -1.075       0.385       0.164

[Figure 7.8 Unconstrained Analytic Approximation of Fourth Power Pulse]

[Figure 7.9 Constrained Function Approximation of Fourth Power Pulse]

[Figure 7.10 Constrained Approximation of Fourth Power Pulse]

[Figure 7.11 Unconstrained Computer Approximation of Fourth Power Pulse]

[Figure 7.12 Unconstrained Approximation of Delayed Pulse]

[Figure 7.13 Orthonormal Function Approximation of Trapezoid Pulse]

[Figure 7.14 Weighted and Unweighted Approximation of Trapezoid Pulse]

[Figure 7.15 Orthonormal Function Approximation of Matched Filter Response]

    f_o(t) = t⁴/256 - t³/16 + t²/4,    0 ≤ t ≤ 8    (7.27)
           = 0,    t < 0, t > 8

is symmetric about t = 4, has a maximum at t = 4 of f_o(4) = 1, and has zero value and slope at t = 0 and t = 8. Its length is taken to match approximately the length of the longest approximating function, θ_6, which is proportional to t⁵e^{-t} and has its maximum at t = 5. The coefficients d_1, ..., d_6 and a_1, ..., a_6 are determined from F_o(s) and its derivatives by the formulas of the Appendix and are listed in Table III. Figure 7.8 compares the approximation f_o* with the prescribed response f_o; the error is small, with the greatest deviations occurring at t = 0 and t = 8, where f_o has properties which are not characteristic of the approximating functions.

As the first step in improving the approximation at t = 0, the constrained Laguerre functions of Appendix section 2 are employed. These functions have the same poles as the unconstrained functions but satisfy in addition the condition θ_n(0) = 0. For a six pole approximation five functions are required. Again, the coefficients are listed in Table III, in this case under example two. The plot of f_o* is shown in Figure 7.9 and is seen to differ little from the previous plot except at t = 0. The leniency of the imposed constraint is confirmed by the small increase in relative mean square error from 0.00651 to 0.00684. As the final step in matching the approximation at t = 0, example three utilizes the constrained Laguerre functions in a

constrained approximation so that f_o*(0) = f_o*'(0) = 0. The constrained coefficients b_1, ..., b_5 are obtained from equations (4.30), and the approximation is plotted in Figure 7.10. A noticeable change in the error magnitude and distribution is observed. The severity of the constraint is indicated by the increase in I_rel from 0.00684 to 0.00845.

In order to determine the accuracy of computer approximation, example four repeats the unconstrained pulse approximation of example one. Table III shows that the computer errors in d_1, ..., d_6 are very small, all less than 0.002. Even so, the errors in the calculated approximation coefficients a_1, ..., a_6 are extremely large. The apparent discrepancy is due to the sign alternations of the terms in the sum

    Δa_n = Σ_{m=1}^{6} (C^{-1})_{nm} Δd_m

At first inspection it would appear that the a_n are worthless, but the approximation plotted in Figure 7.11 shows otherwise. Although e has increased, it is still tolerably small. This result is confirmed by the small increase in relative mean square error, ΔI_rel = 0.000365.¹ These facts all agree with the remarks of the previous section.

While ΔI_rel is quite small, the ratio ΔI_rel/E = r = 89.7 is large. Hence it would seem advantageous to obtain the approximation by means of orthonormal functions. This has been done analytically in

1 The value I_rel = 1 - ⌊d C^{-1} d]/I_max given in Table II has questionable significance in computer approximations, because d] is not precise. See the footnote in the error analysis section. A more satisfactory computer value could be obtained by direct integration of ∫ W [f_o - f_o*]² dt.

Again the computer errors in the coefficients are small, less than 0.003, but the increase in relative mean square error ΔI_rel = 5.92 × 10^{−6} is negligibly small. This graphically illustrates the superiority of orthonormal functions for computer approximation.

Example seven is concerned with the analytic approximation of the delayed exponential pulse

    f_o(t) = 0,  t < 5                                               (7.28)
           = e^6 (t − 5) e^{−t},  t > 5

The pulse reaches its maximum of one at t = 6 and decays slowly. It is particularly difficult to approximate because it is zero for the first five seconds.¹ The Butterworth functions of Appendix section 6 are used in the approximation because their poles assume a configuration in the s plane which is similar to the configuration obtained by a Pade approximation of an ideal delay. As in example one the length of the prescribed response is chosen to approximate the length of the longest approximating function. The approximation coefficients obtained from F_o(s) are shown in Table III; f_o*(t) is plotted in Figure 7.12. Obviously five approximating functions are insufficient in this problem. This is not surprising since all finite representations of e^{−Ts} with good transient response have many terms.

In example eight a computer approximation of the trapezoid pulse

    f_o(t) = 0,  t < 0                                               (7.29)
           = 1,  0 ≤ t < T/2
           = 2(1 − t/T),  T/2 ≤ t < T
           = 0,  t > T

is made.

¹ Delay removal (see Chapter II) would permit f_o to be approximated with no error.

The seven approximating functions are orthonormal with poles on the unit circle at s = −1, −3/4 ± j√(1 − (3/4)²), −1/2 ± j√(1 − (1/2)²), −1/4 ± j√(1 − (1/4)²). Different pulse lengths yield the following computer mean square errors:¹ T = 10, I_rel = 0.0188; T = 7, I_rel = 0.00658; T = 5, I_rel = 0.00654; T = 4, I_rel = 0.0121. Consequently, it appears that the length T = 6 is about optimum for the above approximating functions. This figure agrees closely with the length of the longest approximating functions. Figure 7.13 shows f_o* for T = 7 with seven, five, and three of the above poles. For N = 5, the last two poles are omitted; for N = 3 the last four poles are omitted. Omission of the last two poles, which contribute mainly to f_o* for t > 7, causes I_rel to increase from 0.00658 to 0.0139.

Example nine repeats the problem of example eight with only five poles, s = −1, −2/3 ± j√(1 − (2/3)²), −1/3 ± j√(1 − (1/3)²). The pulse length is reduced to T = 4 to compensate for the change in approximating functions. As would be expected, Figure 7.14 indicates an increase in approximation error over the error in example eight, N = 7; the same conclusion is not valid in example eight, N = 5. Hence, the present poles are superior for a five pole approximation.

Example ten considers the same approximation with W(t) = e^{0.5t}. The method of predistortion is used (equations (7.8) through (7.11)) in conjunction with the pole shift procedures of Figures 7.6 and 7.7.

¹ The computer mean square error I_rel, although differing somewhat from the true mean square error, should be a good comparative measure of the different approximation errors.
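The pole shift that underlies the predistortion method can be checked numerically: a weighted fit with W(t) = e^{0.5t} is an ordinary fit of W^{1/2}f_o in which every basis pole is moved to the right by 0.25. The sketch below (numpy assumed; illustrative only) verifies this equivalence for the trapezoid pulse and the five poles of examples nine and ten.

```python
import numpy as np

T = 4.0
t = np.linspace(0.0, 25.0, 25001)
dt = t[1] - t[0]
f_o = np.where(t < T/2, 1.0, np.where(t < T, 2*(1 - t/T), 0.0))   # trapezoid pulse

def basis(t, shift=0.0):
    # poles -1, -2/3 +- j w2, -1/3 +- j w3, each real part shifted by +shift
    w2, w3 = np.sqrt(1 - (2/3)**2), np.sqrt(1 - (1/3)**2)
    return np.array([np.exp(-(1.0 - shift)*t),
                     np.exp(-(2/3 - shift)*t)*np.cos(w2*t),
                     np.exp(-(2/3 - shift)*t)*np.sin(w2*t),
                     np.exp(-(1/3 - shift)*t)*np.cos(w3*t),
                     np.exp(-(1/3 - shift)*t)*np.sin(w3*t)])

W = np.exp(0.5*t)                            # weight factor of example ten
Th = basis(t)
a = np.linalg.solve((Th*W) @ Th.T * dt, (Th*W) @ f_o * dt)

Th_s = basis(t, shift=0.25)                  # predistorted (unweighted) problem
a_s = np.linalg.solve(Th_s @ Th_s.T * dt, Th_s @ (np.sqrt(W)*f_o) * dt)
print(np.allclose(a, a_s))                   # True: the coefficient sets agree
```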

To fix the poles of the final approximation at s = −1, −2/3 ± j√(1 − (2/3)²), −1/3 ± j√(1 − (1/3)²), the poles of the orthonormal functions are taken at s = −3/4, −5/12 ± j√(1 − (2/3)²), −1/12 ± j√(1 − (1/3)²). The weighted approximation is plotted with the unweighted approximation in Figure 7.14. As predicted by theory, errors at large t are reduced at the expense of increased errors at small t.

Example eleven concerns the approximation of a filter for detecting a signal in white noise. If the detection condition specifies that the ratio of peak signal to rms noise be a maximum, then the matched-filter criterion of Van Vleck and Middleton [39] holds and states that h(t) = S(−t), where S(t) is the signal to be detected. In the present problem S(t) is generated from a square pulse by a transmitter which acts like a linear system of second order. Figure 7.16 shows the signal source and defines the signal parameters. Since h(t) = S(−t) = 0 for t > 0, it cannot be approximated directly. It must first be delayed. By letting h(t) = S(−t + 4) for t > 0 and h(t) = 0 for t < 0, most of the signal is considered. Figure 7.15 shows f_o(t) = h(t) = S(−t + 4) and the corresponding computer approximation f_o*(t). The five approximating functions have Butterworth poles at s = e^{jπ}, e^{j(π ± π/5)}, and e^{j(π ± 2π/5)}.

Figure 7.16 Signal Source for Detection Problem

Considering the nature of f_o, the five term approximation is fairly good. A time delay greater than four seconds in the function S(−t) would reduce the error due to omission of S(t) for t > 4, but it would also increase the mean square approximation error. In this respect the approximation of the matched filter impulse response is more difficult than the previous approximations.

The last three columns of Table II are included to demonstrate the application and accuracy of equation (3.5) for the estimation of the peak approximation error. e_max is the peak error obtained from e = f_o − f_o*; e*_max is the estimated peak error from equation (3.5); T_2 − T_1 is the time interval in which W^{1/2}f_o and W^{1/2}f_o* have appreciable value. Because the approximating functions have approximately the same length as f_o, T_2 − T_1 is taken to be slightly greater than the length of f_o. An exception is example ten, where W^{1/2}f_o is somewhat shorter than θ_4 and θ_5. Agreement between e_max and e*_max is quite good. Examples seven through ten should be discounted to a certain degree because of the computer errors in I_rel.

VIII. THE ARBITRARY INPUT PROBLEM

Many practical applications of linear system approximation, such as the determination of linear system characteristics from experimentally obtained response data or the synthesis of amplifiers with pulse inputs, lead naturally to prescribed inputs which are arbitrary functions of time. In these applications the approximation problem becomes quite complicated. No longer are the response functions θ_1(t), ..., θ_N(t) sums of exponential functions. Consequently, most of the practical procedures of Chapter VII (for example, the simple orthogonalization of approximating functions and the computer evaluation of approximation coefficients) are not applicable. Furthermore, error questions make doubtful the common practice of replacing the arbitrary input problem by the corresponding impulse response problem. Not only may there be an error in the determination of h(t) from f_o(t) and f_i(t), but as Chapter III indicates, direct approximation of h(t) and f_o(t) yield different error distributions in the frequency domain.

This chapter attempts to resolve some of the difficulties of the WME arbitrary input problem by presenting simplified analytic and computer techniques. Since the theory of WME approximation, as it pertains to the arbitrary input problem, has been discussed fully in Chapter IV, the following sections will be concerned primarily with modifications of the theory and with detailed methods of coefficient evaluation. The first section describes the reduction of the arbitrary input problem to the easily solved exponential input problem. After several methods for analog computation of approximation coefficients are developed, various procedures for handling experimental data are discussed. The chapter concludes with a WME method for correlation function computation.

The Equivalent Exponential Input Problem

Chapter III showed that the effect of an arbitrary input is to weight the error ε_h = H − H* by F_i(jω). As a result, minor changes in f_i(t) (more precisely, in F_i(jω)) do little to change H*. Thus f_i(t) may be replaced by an exponential series approximation, f̄_i(t), with little change in the final approximation. Since the exponential input yields exponential approximation functions, the new approximation problem is easier to solve.

Statement of the modified approximation problem is straightforward. f_i(t) is replaced by f̄_i(t) and f_o(t) is replaced by

    f̄_o(t) = ∫_0^∞ h(τ) f̄_i(t − τ) dτ;  i.e.,  F̄_o = H F̄_i = (F_o/F_i) F̄_i

The approximate system function H̄* = F̄_o*/F̄_i which is obtained from the approximation to f̄_o corresponds closely to H* of the original problem. It is important to note that f̄_o*(t) is made to approximate f̄_o(t) and not f_o(t). Approximation of f_o(t) would cause the difference between H* and H̄* to be highly dependent on the accuracy of the approximate input f̄_i(t).

The major difficulty in formulating the equivalent exponential input problem is the determination of f̄_o(t). This is especially true when h(t) is given implicitly by specification of rather arbitrary f_o(t) and f_i(t). On the other hand, if f_o(t) and f_i(t) have known Laplace transforms, no problem exists since F̄_o(s) = [F_o(s)/F_i(s)] F̄_i(s). Many methods, such as those of the previous chapter, are suitable for obtaining the exponential series approximation f̄_i(t). When W(t) = 1 the freedom in choice of f̄_i(t) is greater since only the magnitudes of F_i(jω) and F̄_i(jω)

must be matched.¹ This suggests various frequency domain approximation methods [41].

Once the preceding steps have been completed, actual solution of the approximation problem is simple. Since F̄_o(s) = H(s)F̄_i(s) is a rational function of s, all the terms of f̄_o(t) are exponential functions. Hence, the approximation procedures of Chapter VII are all valid. The system function is given by H̄* = F̄_o*/F̄_i. If H̄* is not to contain, as poles and zeros, the zeros and poles of F̄_i, the approximation F̄_o* must be constrained to possess the factor F̄_i. For example, if F̄_i = 1/(s + 1) and it is specified that lim_{s→∞} sH̄* = const. and that H̄* does not have a zero at s = −1, then F̄_o* must contain at least one pole at s = −1 and must be constrained so that lim_{s→∞} s²F̄_o* = const. While increasing the likelihood of more constraints, the above solution of the arbitrary input problem is little more complicated than the solution of the impulse response problem.

Analytic solutions are perfectly feasible when f_o(t) and f_i(t) have known Laplace transforms. The steps are enumerated as follows (a numerical sketch follows the list):

1. Approximate f_i(t) with as few exponential functions as considered necessary, to obtain f̄_i(t).

2. Calculate f̄_o(t) (or its transform F̄_o(s)), the response of H = F_o/F_i to f̄_i(t).

3. Approximate f̄_o(t) by the constrained exponential series f̄_o* so that prescribed constraint conditions on h̄*(t) (or H̄* = F̄_o*/F̄_i) are met.

¹ See Chapter III, page 25.
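As a small illustration of steps 2 and 3, the following sketch (scipy assumed; the particular H and f̄_i are hypothetical, chosen only to show the mechanics) forms F̄_o = HF̄_i as a rational function and recovers f̄_o(t) as a sum of exponentials by partial fractions:

```python
import numpy as np
from scipy import signal

# hypothetical data: H(s) = 1/(s+2); f_i(t) = t e^{-t} has been replaced in
# step 1 by the one-term exponential series fbar_i(t) = e^{-t}, Fbar_i = 1/(s+1)
H      = ([1.0], [1.0, 2.0])
Fbar_i = ([1.0], [1.0, 1.0])

num = np.polymul(H[0], Fbar_i[0])            # Fbar_o = H * Fbar_i
den = np.polymul(H[1], Fbar_i[1])
r, p, _ = signal.residue(num, den)           # fbar_o(t) = sum_k r_k e^{p_k t}
print("residues:", r)
print("poles:   ", p)

t = np.linspace(0.0, 10.0, 1001)
fbar_o = sum(rk*np.exp(pk*t) for rk, pk in zip(r, p)).real
# step 3: fit fbar_o with a constrained exponential series, as in Chapter VII
```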

Some of the simpler arbitrary inputs do not require an exponential function representation. Suppose f_i(t) = u_{−1}(t), the unit step function, and

    f_o(t) = 2t²/T²,  0 ≤ t < T/2                                    (8.1)
           = 1 − 2(t − T)²/T²,  T/2 ≤ t < T
           = 1,  T ≤ t

as in Figure 8.1. Since ∫_0^∞ f_o² dt = ∞, the problem cannot be solved directly.

Figure 8.1 Prescribed Input and Response for Example

By taking¹

    f̄_o = u_{−1}(t) − f_o(t) = 0,  t < 0
                              = 1 − 2t²/T²,  0 ≤ t < T/2             (8.2)
                              = 2(t − T)²/T²,  T/2 ≤ t < T
                              = 0,  T ≤ t

the difficulty is removed. Once f̄_o* has been obtained (see examples eight, nine, and ten of the previous chapter),

    f_o* = u_{−1} − f̄_o*  and  H* = F_o*/F_i = s[1/s − F̄_o*] = 1 − sF̄_o*    (8.3)

Unless F̄_o* is a constrained approximation, lim_{s→∞} H* = const. ≠ 0. Approximation for prescribed step function response has assured the condition H(0) = H*(0) = 1.

¹ This is really the preliminary simplification of extraction. See page 15.
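A brief numerical rendering of equations (8.2) and (8.3) (numpy assumed; the real exponential set of Appendix section 3 stands in for the orthonormal functions of examples eight through ten) exhibits both properties just noted: H*(0) = 1 holds automatically, while lim_{s→∞} H*(s) is a nonzero constant because the fit is unconstrained.

```python
import numpy as np

T = 2.0
t = np.linspace(0.0, 20.0, 20001); dt = t[1] - t[0]
# fbar_o of equation (8.2)
fbar_o = np.where(t < T/2, 1 - 2*t**2/T**2,
                  np.where(t < T, 2*(t - T)**2/T**2, 0.0))

N = 5                                        # theta_n = e^{-n t}, Appendix section 3
theta = np.array([np.exp(-n*t) for n in range(1, N+1)])
C = np.array([[1.0/(n+m) for m in range(1, N+1)] for n in range(1, N+1)])
a = np.linalg.solve(C, theta @ fbar_o * dt)

# equation (8.3): H*(s) = 1 - s Fbar_o*(s), with Fbar_o*(s) = sum a_n/(s+n)
Hstar = lambda s: 1 - s*sum(an/(s + n) for n, an in enumerate(a, start=1))
print(Hstar(0.0))                            # exactly 1, as the text asserts
print(1 - a.sum())                           # lim H*(s), s -> oo: not zero here
```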

Computer Evaluation of Coefficient Integrals

When it is impossible or impractical to reduce the arbitrary input problem to a form where the approximating functions are exponential functions, direct calculation of the integrals for c_nm and d_n is necessary. Usually, the functions to be integrated are not elementary and some computer technique must be employed. This section presents practical computer diagrams for this purpose. Although the discussion is limited to the integrals for c_nm and d_n, similar procedures may also be applied to other integrals, such as those which often appear in constraint equations. For convenience, the equations defining c_nm and d_n are repeated here.

    d_n = ∫_{T_1}^{T_2} W(t) f_o(t) θ_n(t) dt
        = ∫_{T_1}^{T_2} W(t) f_o(t) [∫_0^∞ f_i(t − τ_1) φ_n(τ_1) dτ_1] dt          (8.4)

    c_nm = ∫_{T_1}^{T_2} W(t) θ_n(t) θ_m(t) dt
         = ∫_{T_1}^{T_2} W(t) [∫_0^∞ f_i(t − τ_1) φ_n(τ_1) dτ_1][∫_0^∞ f_i(t − τ_2) φ_m(τ_2) dτ_2] dt    (8.5)

The physical systems Φ_n(s) which make up the final approximation H̄* are used, as shown in Figure 8.2, to obtain the required functions θ_n(t).

Figure 8.2 Diagrams for Direct Computation of d_n and c_nm

As expected, the circuit for the computation of c_nm requires only the prescribed input f_i(t) and not f_o(t). If all the integrals are computed simultaneously, N setups of the indicated type are required. By storing the functions f_o and f_i, the integrations may be performed consecutively with only one setup, changing as required the system functions Φ_n(s).
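In digital form the diagrams of Figure 8.2 reduce to a few lines: pass f_i through each Φ_n to obtain θ_n, then evaluate the integrals of equations (8.4) and (8.5) by quadrature. The sketch below (scipy assumed; the input, the stand-in response, and the Φ_n are all illustrative) treats the case W(t) = 1.

```python
import numpy as np
from scipy import signal

t = np.linspace(0.0, 20.0, 20001); dt = t[1] - t[0]
f_i = np.where(t < 1.0, 1.0, 0.0)                 # arbitrary input: square pulse
f_o = signal.lsim(([1.0], [1.0, 2.0]), f_i, t)[1] # stand-in for the measured response

# Phi_n = 1/(s+n): the systems that make up the final approximation
Phis = [([1.0], [1.0, float(n)]) for n in range(1, 6)]
thetas = [signal.lsim(P, f_i, t)[1] for P in Phis]    # theta_n = response to f_i

d = np.array([np.sum(f_o*th)*dt for th in thetas])                    # equation (8.4)
C = np.array([[np.sum(ta*tb)*dt for tb in thetas] for ta in thetas])  # equation (8.5)
a = np.linalg.solve(C, d)
```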

Sometimes the prescribed input and response functions are specified in terms of the auto- and cross-correlation functions

    φ_ii(τ) = [1/(T_2 − T_1)] ∫_{T_1}^{T_2} f_i(t) f_i(t + τ) dt                  (8.6)

    φ_oi(τ) = [1/(T_2 − T_1)] ∫_{T_1}^{T_2} f_o(t) f_i(t + τ) dt                  (8.7)

This is especially true when f_i(t) and f_o(t) are the results of experimental measurements.¹ The interval (T_1, T_2) is assumed to be long enough so that changes in T_1 and T_2 which are as large as the maximum considered |τ| values produce little change in φ_ii or φ_oi. Under these conditions

    φ_oi(−τ_1) ≈ [1/(T_2 − T_1)] ∫_{T_1}^{T_2} f_o(t) f_i(t − τ_1) dt             (8.8)

    φ_ii(τ_2 − τ_1) ≈ [1/(T_2 − T_1)] ∫_{T_1}^{T_2} f_i(t − τ_2) f_i(t − τ_1) dt   (8.9)

Upon changing the order of integration in equations (8.4) and (8.5) it is seen that

    d_n = (T_2 − T_1) ∫_0^∞ φ_oi(−τ_1) φ_n(τ_1) dτ_1                              (8.10)

¹ See James, et al. [16]. Normally T_1 and T_2 go to −∞ and ∞. Since practically they remain finite, the limit will not be taken here.

    c_nm = (T_2 − T_1) ∫_0^∞ ∫_0^∞ φ_n(τ_2) φ_ii(τ_2 − τ_1) φ_m(τ_1) dτ_1 dτ_2    (8.11)

when W(t) = 1. Figure 8.3 gives the diagrams for evaluating c_nm and d_n when the known correlation functions are generated as time functions. As in Chapter VII, the linear systems Φ_n(s) perform the desired integrations by convolution. A different method for computing c_nm is shown in Figure 8.4.

Figure 8.3 Diagrams for Computation of d_n and c_nm from Correlation Functions

Figure 8.4 Diagram for Computation of c_nm from Correlation Function
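When only the correlation functions are available, equations (8.10) and (8.11) are ordinary quadratures over a τ grid. The following sketch (numpy assumed; the grid, record length, sample correlation functions, and φ_n family are illustrative) computes d_n and c_nm from sampled φ_oi(−τ) and φ_ii(τ).

```python
import numpy as np

tau = np.linspace(0.0, 20.0, 2001); dtau = tau[1] - tau[0]
Ttot = 100.0                                      # T2 - T1
phi = np.array([np.exp(-n*tau) for n in range(1, 6)])   # phi_n(tau) = e^{-n tau}

# illustrative measured data: phi_oi(-tau) and phi_ii(tau) sampled on the grid
phi_oi_neg = 0.5*np.exp(-tau)*(1 + tau)
phi_ii = np.exp(-2*tau)

d = Ttot * (phi @ phi_oi_neg) * dtau                             # equation (8.10)

# phi_ii(tau2 - tau1) over the grid; phi_ii is even, so use |tau2 - tau1|
K = np.interp(np.abs(tau[:, None] - tau[None, :]), tau, phi_ii)
C = Ttot * dtau**2 * (phi @ K @ phi.T)                           # equation (8.11)
a = np.linalg.solve(C, d)
```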

In the arrangement of Figure 8.4, the time function g(−t) must be reproduced from g(t) and used as a forcing function for the second system.¹

The above integration schemes are, of course, subject to computer errors. Analysis of the errors as they influence f_o* is more complicated than in Chapter VII because both c_nm and d_n are inaccurate. Without going into detail, the least difficulty is again encountered when the eigenvalues of C⁻¹ are approximately equal. Unfortunately, the functions θ_n(t) depend on both f_i(t) and Φ_n(s), so orthogonalization of θ_1, ..., θ_N is not attractive. The only thing that can be done is to choose Φ_1, ..., Φ_N in such a manner as to make θ_1, ..., θ_N approximately orthonormal. This approach is discussed further in the next section.

Experimental Determination of System Characteristics

The problem of determining the system function H of an unknown physical system from input and response data (f_i and f_o) is an important one which has been investigated by many authors.³ Although many methods have been proposed, most of them rely either on numerical analysis [24,32,33,42] or correlation function measurement² [12,23]. The purpose of this section is to show some of the relations between WME approximation and experimental system function determination. No claim of originality is made for all of the material.

¹ Laning and Battin [21] (page 218) describe a similar integral evaluation for a different application.
² Truxal [38] gives an excellent review of the problems involved in determining correlation functions from experimental data.
³ Booton [4] describes a general mean square error method for measurement and representation of system characteristics.

Review of the Problem — Most practical measurement problems are complicated by the fact that the input to the physical system under test consists of two components: (1) the known test input f_i(t), and (2) an undesired and usually unknown input n_i(t). The function n_i(t) may be an irremovable noise input, or it may be an input required for continued operation of the system under test. Correspondingly, the measured response consists of the two components, f_o(t) and n_o(t), resulting separately from each of the two inputs. From measurement of the operating records, f_i(t) and f_o(t) + n_o(t), the unknown system function must be approximated as closely as possible.

Auto- and cross-correlation functions offer one solution of the problem. The functions

    φ_ii(τ) = [1/(T_2 − T_1)] ∫_{T_1}^{T_2} f_i(t) f_i(t + τ) dt

    [1/(T_2 − T_1)] ∫_{T_1}^{T_2} [f_o(t) + n_o(t)] f_i(t + τ) dt
        = φ_oi(τ) + [1/(T_2 − T_1)] ∫_{T_1}^{T_2} n_o(t) f_i(t + τ) dt

are computed by one of the conventional methods. If the time interval (T_2 − T_1) is sufficiently long, and if n_o(t) and f_i(t) are uncorrelated, the second integral of the second equation vanishes so that φ_ii(τ) and φ_oi(τ) are known. Because φ_ii(τ) and φ_oi(τ) are related by

    φ_oi(−t) = ∫_0^∞ h(τ) φ_ii(t − τ) dτ                                          (8.12)

the unknown weighting function h(t) can also be found.¹ Various methods for solving the integral equation have been employed. Most of them are standard linear system approximation procedures because φ_ii(t) and φ_oi(−t) are related as inputs and outputs of the system H(s) are related.

¹ See Truxal [38], page 437.

The WME Approximation — The most obvious method of WME approximation is to reduce the aforementioned correlation function problem to an equivalent arbitrary input-response problem, i.e., let f̄_i(t) = φ_ii(t) and f̄_o(t) = φ_oi(−t). This method has the deficiency that the new input f̄_i does not produce the same frequency weighting of approximation error that f_i produces. A more direct and satisfactory procedure for approximating h(t) is the coefficient evaluation scheme of equations (8.10) and (8.11). In this approach, the original test input f_i(t) is retained even though c_nm and d_n are calculated from φ_ii and φ_oi.

A still more direct approach does not require the added work and error of preliminary calculation of the correlation functions. The coefficients c_nm and d_n are obtained from f_i(t), f_o(t), and the physical systems Φ_n(s) as in Figure 8.2. The components for computation of d_n (W(t) = 1) are shown in Figure 8.5 as they would appear in an actual test situation; the components for computation of c_nm remain as in Figure 8.2. Presence of the undesired output n_o(t) causes little error in d_n if the averaging interval (T_2 − T_1) is sufficiently long.

Figure 8.5 Diagram for Experimental Determination of d_n

If all coefficients are computed simultaneously, a great number of integrating

and multiplying components is necessary. Consecutive computation of c_nm and d_n is possible if f_i(t) and f_o(t) + n_o(t) are stored or if f_i is stationary so that any (T_1, T_2) gives the same result for large (T_2 − T_1). Long test periods are not mandatory when n_i(t) is small or zero. A period several times longer than the period for which h(t) has appreciable value should be adequate.

White Noise Testing and Function Orthogonality — All of the above methods become greatly simplified when W(t) = 1, (T_2 − T_1) is large, and f_i(t) is a white noise. Since φ_ii(τ) = P u_0(τ),¹ the coefficients c_nm defined by equation (8.11) become simply

    c_nm = P(T_2 − T_1) ∫_0^∞ φ_n(τ_1) φ_m(τ_1) dτ_1                              (8.13)

But the functions φ_n(t) are known sums of exponential functions, so the c_nm can be calculated and do not need to be measured experimentally. In fact, when the φ_n are orthonormal

    c_nm^{−1} = 1/[P(T_2 − T_1)],  n = m;  = 0,  n ≠ m                            (8.14)

If the functions φ_n are not orthonormal but are chosen from the appendix, the tabulated inverse matrices yield C⁻¹ upon multiplication by 1/[P(T_2 − T_1)]. The constants d_n must still be obtained experimentally. Arguments concerning the eigenvalues of C and sensitivity of the approximation error to coefficient errors hold equally well when applied to white noise testing or impulse response approximation. Therefore, the use of orthonormal φ_n functions is again indicated.

¹ P is the power per unit bandwidth. In practice F_i(jω) is band limited and φ_ii is not an exact impulse function. If the spectral density is flat well beyond the response frequencies of the φ_n, then the representation is satisfactory.
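A digital sketch of white noise testing (scipy assumed; the "unknown" system, the record length, and the discrete white noise scaling are assumptions of the sketch) illustrates equations (8.13) and (8.14): with orthonormal φ_n only the d_n need be measured, and a_n = d_n/[P(T_2 − T_1)].

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(0)
dt = 1e-3; Ttot = 500.0; P = 1.0
t = np.arange(0.0, Ttot, dt)
f_i = rng.standard_normal(t.size)*np.sqrt(P/dt)   # approximates white noise of density P

H_unknown = ([1.0], [1.0, 1.2, 1.0])              # system under test
f_o = signal.lsim(H_unknown, f_i, t)[1]

# orthonormal Laguerre pair: phi_1 = sqrt(2)/(s+1), phi_2 = sqrt(2)(s-1)/(s+1)^2
Phis = [([np.sqrt(2.0)], [1.0, 1.0]),
        ([np.sqrt(2.0), -np.sqrt(2.0)], [1.0, 2.0, 1.0])]
a = [np.sum(f_o*signal.lsim(Pn, f_i, t)[1])*dt/(P*Ttot) for Pn in Phis]

# exact projections of h on phi_1 and phi_2, for comparison
h = signal.impulse(H_unknown, T=t)[1]
print(a)
print([np.sum(h*np.sqrt(2)*np.exp(-t))*dt,
       np.sum(h*np.sqrt(2)*np.exp(-t)*(1 - 2*t))*dt])
```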

If f_i(t) is not white noise, approximate orthogonality of θ_1, ..., θ_N can be had by letting Φ_n = GΨ_n, where Ψ_1, ..., Ψ_N are orthonormal and G is a rational system function which changes f_i(t) into a function which approximates white noise.¹ The complexity of the functions Φ_1, ..., Φ_N should not be too great since G can usually be kept rather simple. Because θ_1, ..., θ_N are not exactly orthonormal, C must be inverted; however, the approximate orthogonality assures good accuracy.

Choice of Approximating Functions — The choice of the approximating functions Φ_1, ..., Φ_N remains a difficulty in the experimental approximation problem. Generally, little is known about the exact pole positions of the unknown system function H(s). It is even possible that H(s) does not possess a finite number of poles. Sometimes, the physical system does give information about the form and approximate pole positions of H(s). This is often true of the linear equations which describe the small motion response of an airplane; known airplane data enable a reasonably good estimate of pole positions. In such systems, the estimated poles plus a few added poles should provide an excellent basis for choice of Φ_1, ..., Φ_N. In other cases, the poles may often be obtained by examination of the transient response of the unforced system. Graphical determination of decay rates and oscillation frequencies is one such approach. Another is physical mechanization of the Prony method. If no success is had in pole determination, accuracy of the final approximation must rest primarily on a sufficiently large number of fairly arbitrary approximating functions.

¹ The determination of G is straightforward. See Goldman [11], page 264.

Multiple Input and Output Systems — Multiple input and output systems are approximated by simple extension of the above principles. A separate approximation is found for each of the input-output pairs. Although the calculations are no more complicated than those already discussed, the number of calculations mounts rapidly with multiplicity of inputs and outputs. When the same approximating functions are used for all approximations in a single system, some duplication in approximation and realization is avoided. For example, single input, multiple output systems lead to the realization of Figure 8.6. Only one set of approximating systems is required. By introducing the coefficient gains and summing operations at the inputs of the Φ_n(s), a similar reduction in complexity is obtained for single output, multiple input systems.

Figure 8.6 Diagram for Realization of Single Input, Multiple Output System

If multiple input, multiple output systems are to be realized, use of the same schemes is not possible without duplication of each of the approximating systems. However, the common poles in all approximations do not permit the realization of a linear system with only M dynamic

elements (M is the number of poles in the approximating functions).

Optimum Filter Design¹ — Procedures which are valid for experimental determination of system characteristics are also valid for experimental design of optimum filters. The unknown system is simply replaced by an ideal system which performs exactly the prescribed filtering task. Even though such an ideal system is not mathematically attainable, the input and response functions which are required for mean square approximation of the system are available.

The problem is illustrated in Figure 8.7. The weighted mean square error I is minimized in the usual way through the approximation coefficients a_n defined by c_nm and d_n. To obtain f_o(t) it is presumed that f_i(t) and n(t) can be separated during the experimental design measurements. Typical optimum filter operations which might be specified include: (1) smoothing, f_o(t) = f_i(t); (2) delayed smoothing, f_o(t) = f_i(t − T); (3) prediction, f_o(t) = f_i(t + T);² (4) differentiation, f_o(t) = f_i′(t).

Figure 8.7 Optimum Filter Problem

¹ Bose [5] has recently proposed methods similar to the ones of this section.
² The time advance f_o(t) = f_i(t + T) would be obtained by working with delayed input and output functions, f_o(t − T) = f_i(t) and f_i(t − T) + n(t − T).
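Figure 8.7 translates into a short digital experiment (scipy assumed; the signal and noise models are hypothetical). For the smoothing task f_o(t) = f_i(t), the θ_n are computed from the noisy input while the desired response is the noise free signal, exactly as the diagram prescribes.

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(1)
dt = 1e-3; Ttot = 200.0
t = np.arange(0.0, Ttot, dt)

# signal: white noise shaped by 1/(s+0.3); noise: wideband, lower density
f_i = signal.lsim(([1.0], [1.0, 0.3]), rng.standard_normal(t.size)/np.sqrt(dt), t)[1]
n = 0.5*rng.standard_normal(t.size)/np.sqrt(dt)
x = f_i + n                                       # available input plus noise
f_o = f_i                                         # desired response: smoothing

Phis = [([1.0], [1.0, float(k)]) for k in (1, 2, 3)]      # simple Phi_n = 1/(s+k)
thetas = [signal.lsim(Pk, x, t)[1] for Pk in Phis]        # responses to input + noise
d = np.array([np.sum(f_o*th)*dt for th in thetas])
C = np.array([[np.sum(ta*tb)*dt for tb in thetas] for ta in thetas])
a = np.linalg.solve(C, d)          # optimum filter within phi_1, ..., phi_N
```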

Since the approximate system is time invariant, the statistical properties of f_i(t) and n(t) should also be time invariant, i.e., stationary. Moreover, the time averaging period (T_2 − T_1) must be sufficiently large to reduce the statistical deviations in c_nm and d_n to a reasonable point.

The optimum filter design of the preceding paragraph is only optimum in terms of the approximating functions Φ_1, ..., Φ_N. With an infinite set of complete approximating functions the approximation error would be decreased even further. While this "finiteness" error is decreased by increasing N, there is no reason for going beyond the N where the finiteness error becomes small compared to the filtering error.

Experimental design of optimum filters has only been touched upon here. Certain classes of non-stationary inputs could be treated by constrained approximation [43]. Introduction of distributed element approximating functions would permit design of finite memory filters [43]. Many other questions, such as practicality, accuracy, and computational complexity, can be settled only by completing some trial problems.

Correlation Function Computation

Although the subject of correlation function computation cannot be classified under linear system approximation, the following method is so closely related to the material of the above sections that it merits mention.¹ Conventional methods of computing the cross-correlation function of f(t) and g(t)

¹ The same method of correlation function computation was discovered earlier by Lampard [20]. However, his work is restricted entirely to orthogonal Laguerre functions.

    ψ_fg(τ) = [1/(T_2 − T_1)] ∫_{T_1}^{T_2} f(t) g(t + τ) dt
            = [1/(T_2 − T_1)] ∫_{T_1}^{T_2} f(t − τ) g(t) dt                       (8.15)

require the operations of delay, multiplication, and integration. In the present section ψ_fg is represented by a mean square approximation of exponential functions, either orthonormal or linearly independent. For τ > 0 (τ < 0 will be considered later), let ψ_fg be approximated by

    ψ_fg*(τ) = Σ_{n=1}^{N} a_n φ_n(τ),  τ > 0                                     (8.16)

As in previous approximations, a] = C⁻¹d], where C is known from φ_1, ..., φ_N and d] is defined by equation (8.17) below.

Figure 8.8 Physical Systems for Computing Correlation Function Coefficients d_n

    d_n = ∫_0^∞ ψ_fg(τ) φ_n(τ) dτ
        = [1/(T_2 − T_1)] ∫_{T_1}^{T_2} g(t) ∫_0^∞ f(t − τ) φ_n(τ) dτ dt
        = [1/(T_2 − T_1)] ∫_{T_1}^{T_2} g(t) θ_n(t) dt                            (8.17)

The function θ_n(t) is the response of the system Φ_n to the input f(t). Thus, the circuit of Figure 8.8 gives d_1, ..., d_N. In order to approximate ψ_fg for τ < 0 it is necessary to note that ψ_gf(τ) = ψ_fg(−τ). Hence the d_n′, which hold for the negative τ approximation, are computed by reversing the positions of f(t) and g(t). The complete approximation is

    ψ_fg* = Σ_{n=1}^{N} [a_n φ_n(τ) + a_n′ φ_n(−τ)]                               (8.18)

The cross-spectral density is the Fourier transform of equation (8.18).

    Ψ_fg*(ω) = Σ_{n=1}^{N} [a_n Φ_n(jω) + a_n′ conj Φ_n(jω)]                      (8.19)

When g = f the auto-correlation function is obtained. Since ψ_ff(τ) = ψ_ff(−τ), equations (8.18) and (8.19) simplify to

    ψ_ff* = Σ_{n=1}^{N} a_n [φ_n(τ) + φ_n(−τ)]                                    (8.20)

    Ψ_ff*(ω) = 2 Σ_{n=1}^{N} a_n Re Φ_n(jω)                                       (8.21)
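A numerical counterpart of Figure 8.8 (scipy assumed; the record and the φ_n are illustrative) computes the d_n of equation (8.17) by passing f through each Φ_n and averaging the product with g. Here g = f, so the result approximates the autocorrelation ψ_ff(τ) = (1/2)e^{−|τ|} of the chosen f.

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(2)
dt = 1e-3; Ttot = 300.0
t = np.arange(0.0, Ttot, dt)
f = signal.lsim(([1.0], [1.0, 1.0]), rng.standard_normal(t.size)/np.sqrt(dt), t)[1]
g = f                                             # autocorrelation case

ks = (1, 2, 3)                                    # phi_n(tau) = e^{-n tau} (set 3)
C = np.array([[1.0/(n + m) for m in ks] for n in ks])
thetas = [signal.lsim(([1.0], [1.0, float(n)]), f, t)[1] for n in ks]
d = np.array([np.sum(g*th)*dt/Ttot for th in thetas])       # equation (8.17)
a = np.linalg.solve(C, d)

tau = np.linspace(0.0, 5.0, 501)
psi = sum(an*np.exp(-n*tau) for an, n in zip(a, ks))        # tau > 0 branch, (8.16)
print(psi[0], 0.5)                                # compare with psi_ff(0) = 1/2
```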

Mean square error computation of correlation functions has the following advantages: (1) no delay device is required; (2) f(t) and g(t) need not be stored for repeated calculations since the d_n may be computed simultaneously; (3) functional representations are obtained for the correlation functions and spectral densities. Disadvantages are: (1) lack of a satisfactory method for choosing approximating functions; (2) inability to calculate I_max = ∫_{−∞}^{∞} ψ_fg² dτ and thus the mean square approximation error. The last difficulty can be eased by checking the approximation error in ψ_fg* or ψ_ff* at τ = 0 or the value of Ψ_ff*(ω) at several frequencies.

IX. CONCLUSION

The primary goal of the preceding chapters has been to extend the theory and practicality of linear system approximation by mean square error minimization in the time domain. Success in achieving this goal can be attributed to two considerations. First, emphasis has been placed on the general theory of mean square approximation rather than on a specialized aspect of it, such as approximation by restricted classes of orthogonal functions. Second, recognition has been made of the difficulty of practical computations by supplementing available analytic procedures with matrix tables and computer methods.

Since the effort to present a comprehensive exposition has necessarily required the inclusion of supporting material, it is proper to summarize the major results which are believed to be new. They are as follows:

1. A detailed investigation of the mean square error criterion, its meaning in the frequency domain, and its relation to the arbitrary input problem. A simple formula for estimating the peak approximation error from the mean square error.

2. A general statement and solution of the mean square error approximation problem for non-orthogonal approximating functions and an arbitrary input. The extension of constrained approximations to the general mean square approximation problem and the introduction of constrained approximating functions.

3. The generation of constrained orthonormal exponential functions and of more general unconstrained orthonormal exponential functions.

4. The approximation of prescribed responses having known Laplace transforms by means of non-orthogonal exponential functions and exponentially weighted non-orthogonal exponential functions.

5. Tables of inverse matrices to simplify approximation calculations for various families of exponential approximating functions.

6. The analog computation of approximation coefficients through linear system realization of approximating functions, including system simulation and approximation error evaluation, orthonormal function realization, pole optimization through orthonormal function computation, computation of exponentially weighted integrals, and error analysis.

7. A simplified solution of the arbitrary input problem through the use of an equivalent exponential input.

8. The computer evaluation of approximation coefficients for arbitrary or experimentally measured input and response functions. The experimental design of optimum linear filters.

9. The application of the general WME approximation theory to correlation function determination.

Although the above results are encouraging, the mean square approximation of linear system response has several important limitations. Most seriously, the pole determination problem has no completely satisfactory solution. The more exact methods of Chapter VI are lengthy and

offer no real assurance that the poles chosen are close to optimum. When the prescribed system is lumped and of relatively low order (a situation which is only of significance in experimental determination of unknown system characteristics) and is excited by a simple input, the Prony method or variations of it provide good pole determination. For more complicated conditions the poles must be estimated as well as possible and a sufficient number of them included to meet the prescribed accuracy requirements. In this respect the WME approximation procedure is no more deficient than any other presently known approximation procedure. Finally, the calculations tend to become excessively lengthy when the number of approximating functions is large, when the weight factor is not constant, when several constraints are applied, or when the input is approximated by more than a few exponential functions. This difficulty should not be considered so much a fault of the WME method as an indication of the complexity of the problem being solved. In many cases the length of calculations could be reduced by a more complete tabulation of inverse matrices for unconstrained, constrained, and weighted approximating functions. A still more general approach would be the programming of the approximation problem on a large scale digital computer.

The following possibilities for further work in mean square approximation of linear systems exist: (1) approximation by distributed element linear systems, (2) approximation of time-variant linear systems by time-variant approximating systems, (3) investigation of the feasibility of experimental approximation methods. Preliminary study has shown that (2) gives promise of useful results.

APPENDIX

Inverse Matrix Tables

The following families of non-orthogonal approximating functions are defined in the interval (0, ∞) and are unweighted. In addition to the tabulation of the inverse C matrices, formulas are given for the calculation of d_1, ..., d_N. To aid in the choice of time scaling, an indication of the length of the longest approximating function in each family is also included.

1. Unconstrained Laguerre Set

Poles: all at s = −1.

System functions: Φ_n = 1/(s + 1)^n

Time functions: θ_n = t^{n−1} e^{−t}/(n − 1)!

Time of maximum value for longest function: N − 1 seconds

Coefficients:

    c_nm = (n + m − 2)!/[(n − 1)!(m − 1)! 2^{n+m−1}]

    d_n = ∫_0^∞ θ_n f_o dt = [(−1)^{n−1}/(n − 1)!] d^{n−1}F_o/ds^{n−1} |_{s=1}

Inverse Matrices:

N = 4:
      8
    −24     112
     32    −176     320
    −16      96    −192     128

N = 5:
     10
    −40     240
     80    −560   1,472
    −80     608  −1,728   2,176
     32    −256     768  −1,024     512

N = 6:
     12
    −60     440
    160  −1,360   4,672
   −240   2,208  −8,128  14,976
    192  −1,856   7,168 −13,824  13,312
    −64     640  −2,560   5,120  −5,120   2,048

2. Constrained Laguerre Set

Poles: all at s = −1.

System functions: Φ_n = 1/(s + 1)^{n+1}

Time functions: θ_n = t^n e^{−t}/n!

Time of maximum value for longest function: N seconds

Coefficients:

    c_nm = (n + m)!/[n! m! 2^{n+m+1}]

    d_n = ∫_0^∞ θ_n f_o dt = ∫_0^∞ (t^n/n!) e^{−t} f_o(t) dt = [(−1)^n/n!] d^n F_o/ds^n |_{s=1}

Inverse Matrices:

N = 2:
     16
    −16   21 1/3

N = 3:
     40
    −80     192
     48    −128      96

N = 4:
     80
    −240     832
     288  −1,088   1,536
    −128     512    −768   409 3/5

N = 5:
     140
    −560   2,538 2/3
   1,008  −4,928   10,176
    −896   4,608   −9,984   10,240
     320  −1,706 2/3   3,840  −4,096   1,706 2/3

3. Real Exponential Set

Poles: at s = −n.

System functions: Φ_n = 1/(s + n)

Time functions: θ_n = e^{−nt}

Time constant of longest function: 1 second

Coefficients:

    c_nm = 1/(n + m)

    d_n = ∫_0^∞ θ_n f_o dt = ∫_0^∞ e^{−nt} f_o(t) dt = F_o(n)

Inverse Matrices:

N = 2:
     18
    −24      36

N = 3:
     72
    −240     900
     180    −720     600

N = 4:
     200
   −1,200   8,100
    2,100 −15,120  29,400
   −1,120   8,400 −16,800   9,800

N = 5:
     450
   −4,200    44,100
   12,600  −141,120   470,400
  −15,120   176,400  −604,800   793,800
    6,300   −75,600   264,600  −352,800   158,760

N = 6:
      882
   −11,760     176,400
    52,920    −846,720   4,233,600
  −105,840   1,764,000  −9,072,000   19,845,000
    97,020  −1,663,200   8,731,800  −19,404,000   19,209,960
   −33,264     582,120  −3,104,640    6,985,440   −6,985,440   2,561,328
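Since c_nm = 1/(n + m) is a Cauchy matrix, its inverse is integer valued, and the tabulated entries can be checked directly (numpy assumed):

```python
import numpy as np

N = 4
C = np.array([[1.0/(n + m) for m in range(1, N+1)] for n in range(1, N+1)])
print(np.round(np.linalg.inv(C)))
# [[   200.  -1200.   2100.  -1120.]
#  [ -1200.   8100. -15120.   8400.]
#  [  2100. -15120.  29400. -16800.]
#  [ -1120.   8400. -16800.   9800.]]
```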

4. Fourier Set (α = 1)

Poles: at s = −1, −1 ± j, −1 ± j2, −1 ± j3.

System functions:

    Φ_1 = 1/(s + 1)
    Φ_n = (s + 1)/(s² + 2s + 1 + .25n²),  n even
    Φ_n = .5(n − 1)/(s² + 2s + 1 + .25(n − 1)²),  n odd

Time functions:

    θ_1 = e^{−t}
    θ_n = e^{−t} cos .5nt,  n even
    θ_n = e^{−t} sin .5(n − 1)t,  n odd

Time constant of longest function: 1 second

Coefficients:

    d_1 = ∫_0^∞ θ_1 f_o dt = ∫_0^∞ e^{−t} f_o(t) dt = F_o(1)
    d_n = ∫_0^∞ f_o(t) e^{−t} cos .5nt dt = Re F_o(1 − j.5n),  n even
    d_n = ∫_0^∞ f_o(t) e^{−t} sin .5(n − 1)t dt = Im F_o(1 − j.5(n − 1)),  n odd

Inverse Matrices:

N = 3:
     50
    −40      36
    −40      28      44

N = 5:
     200
     −80     100
    −293 1/3    77 7/9   477 7/9
    −100     −13 1/3    182 2/9    96 2/3
      66 2/3   −48 8/9  −102 2/9   −21 1/9    47 7/9

-137+417.277 -- + 38.5090 +227.768 -693.318 -127.759 +1216.63 -414.060 -164.428 + 764.415 +557.085 - 57.7696 -217 769 +1o06.650 +126.430 +280.658 - 1208379 - 75.1836 + 9.99718 - 3.26160 +116.442 +74.3945 +154o067 + 96.6591 -281.097 -218.654 -108.437 -22.7395 +111.771

5. Fourier Set (α = 2)

Poles: at s = −2, −2 ± j, −2 ± j2, −2 ± j3.

System functions:

    Φ_1 = 1/(s + 2)
    Φ_n = (s + 2)/(s² + 4s + 4 + .25n²),  n even
    Φ_n = .5(n − 1)/(s² + 4s + 4 + .25(n − 1)²),  n odd

Time functions:

    θ_1 = e^{−2t}
    θ_n = e^{−2t} cos .5nt,  n even
    θ_n = e^{−2t} sin .5(n − 1)t,  n odd

Time constant of longest function: .5 second

Coefficients:

    d_1 = ∫_0^∞ θ_1 f_o dt = ∫_0^∞ e^{−2t} f_o(t) dt = F_o(2)
    d_n = ∫_0^∞ f_o(t) e^{−2t} cos .5nt dt = Re F_o(2 − j.5n),  n even
    d_n = ∫_0^∞ f_o(t) e^{−2t} sin .5(n − 1)t dt = Im F_o(2 − j.5(n − 1)),  n odd

Inverse Matrices:

N = 3:
    1155.93
   −1087.94   1031.94
   − 543.971   495.973   327.986

N = 5:
   −114.216, 63.4051, −2.5357, 209.899, −137.873, 147.812, 57.2765, −56.6722, −59.6550, −0.1193, −91.6699, 41.9000, −167.979, 25.4848, 178.428

-139744.520 42.927 -.0o46 -- - 307.778 -84.290 1309.553 49.614 -56.645 - 58.250 -.o39 -1453.43 17.557 -517.647 42.064 3950.60 - 746.282 14.258 376.995 6.934 1283.85 654.807 647.493 12.743 190.339 -7.974 -1826.77 -612.515 885.057

6. Butterworth Set

Poles: approximately at s = e^{jπ(2k+N−1)/(2N)}, k = 1, ..., N.

System functions:

    Φ_n = (s + α_n)/(s² + 2α_n s + α_n² + β_n²)
    Φ_{n+1} = β_n/(s² + 2α_n s + α_n² + β_n²)

Time functions:

    θ_n = e^{−α_n t} cos β_n t
    θ_{n+1} = e^{−α_n t} sin β_n t

Time constant of longest function: 1/α_N seconds

Coefficients:

    d_n = ∫_0^∞ θ_n f_o dt = ∫_0^∞ f_o e^{−α_n t} cos β_n t dt = Re F_o(α_n − jβ_n)
    d_{n+1} = ∫_0^∞ θ_{n+1} f_o dt = ∫_0^∞ f_o e^{−α_n t} sin β_n t dt = Im F_o(α_n − jβ_n)

Inverse Matrices and Pole Positions:

N = 2 (α_1 = .7, β_1 = .7):
     2.8
    −2.8     8.4

N = 3 (α_1 = 1., β_1 = 0; α_2 = .5, β_2 = 0.9):
    −4.19002, 1.79293, 1.23279, 4.45686, −3.01823, −1.50614

N = 4 (α_1 = 0.9, β_1 = 0.4; α_2 = 0.4, β_2 = 0.9):
    24.3357
   −54.7549   270.734
   −14.9758    33.6953   10.8159
     6.65580  −55.3273  − 4.80699   15.0888

N = 5 (α_1 = 1., β_1 = 0; α_2 = 0.8, β_2 = 0.6; α_3 = 0.3, β_3 = 1.):
    527.998
   −545.360   578.690
   −198.029   184.086   138.573
     49.8568  −59.8808  − 7.09480   10.4793
     50.0106  −49.5157  −34.1496     3.03990   10.6929

BIBLIOGRAPHY

1. Aigrain, P. R., and Williams, E. M., "Synthesis of n-Reactance Networks for Desired Transient Response," Journal of Applied Physics, Vol. 20, pp. 597-600; June, 1949.

2. Aigrain, P. R., and Williams, E. M., "Design of Optimum Transient Response Amplifiers," Proceedings of the Institute of Radio Engineers, Vol. 37, pp. 873-879; August, 1949.

3. Ba Hli, F., "A General Method for Time Domain Synthesis," Transactions of the Institute of Radio Engineers, Vol. CT-1, pp. 21-29; September, 1954.

4. Booton, R. C., Jr., "The Measurement and Representation of Nonlinear Systems," Transactions of the Institute of Radio Engineers, Vol. CT-1, pp. 32-35; December, 1954.

5. Bose, A. G., "A Theory for the Experimental Determination of Optimum Non-linear Systems," The Institute of Radio Engineers Convention Record, Part 4, pp. 21-31; 1956.

6. Carr, J. W., III, "An Analytic Investigation of Transient Synthesis by Exponentials," S. M. Thesis, Dept. of Elect. Engineering, Mass. Institute of Tech.; 1949.

7. Cerrillo, M. V., and Guillemin, E. A., "Rational Fraction Expansion for Network Functions," Proceedings of the Symposium on Modern Network Synthesis, pp. 84-127, New York; April, 1952.

8. Courant, R., and Hilbert, D., Methods of Mathematical Physics, Interscience Publishers, Inc., New York; 1953.

9. Gabor, D., "Communication Theory and Cybernetics," Transactions of the Institute of Radio Engineers, Vol. CT-1, pp. 19-32; December, 1954.

10. Gardner, M. F., and Barnes, J. L., Transients in Linear Systems, John Wiley and Sons, New York; 1942.

11. Goldman, Stanford, Information Theory, Prentice-Hall, Inc., New York; 1953.

12. Goodman, T. P., and Reswick, J. B., "Determination of System Characteristics from Normal Operating Records," ASME-IRD Conference on Automatic Control, Ann Arbor, Michigan; April, 1955.

13. Graham, D., and Lathrop, R. C., "The Synthesis of Optimum Transient Response," Transactions of the American Institute of Electrical Engineers, Vol. 72, Part II, p. 273; November, 1953.

14. Guillemin, E., "What is Network Synthesis," Transactions of the Institute of Radio Engineers, Vol. PGCT-1, pp. 4-19; December, 1952.

15. Huggins, W. H., "Network Approximation in the Time Domain," Report E5048A, Air Force Research Laboratories, Cambridge, Mass.; October, 1949.

16. James, H. M., Nichols, N. B., and Phillips, R. S., Theory of Servomechanisms, McGraw-Hill Book Company; 1947.

17. Kautz, W. H., "Network Synthesis for Specified Transient Response," Technical Report No. 209, Research Laboratory of Electronics, Mass. Institute of Tech.; April, 1952.

18. Kautz, W. H., "Transient Synthesis in the Time Domain," Transactions of the Institute of Radio Engineers, Vol. CT-1, pp. 29-39; September, 1954.

19. Korn, G. A., and Korn, T. M., Electronic Analog Computers, 2nd edition, McGraw-Hill Book Company, New York; 1956.

20. Lampard, D. G., "A Method of Determining Correlation Functions of Stationary Time Series," Proceedings of the Institution of Electrical Engineers, Part C, No. 1, p. 35; March, 1955.

21. Laning, J. H., Jr., and Battin, R. H., Random Processes in Automatic Control, McGraw-Hill Book Company, New York; 1956.

22. Lee, Y. W., "Synthesis of Electrical Networks by Means of the Fourier Transforms of Laguerre's Functions," Journal of Mathematics and Physics, Vol. 11, pp. 83-113; June, 1932.

23. Lee, Y. W., "Application of Statistical Methods to Communications Problems," Technical Report No. 181, Research Laboratory of Electronics, Mass. Institute of Tech.; September, 1950.

24. Lewis, N. W., "Waveform Computations by the Time Series Method," Proceedings of the Institution of Electrical Engineers, Vol. 99, Part III, pp. 109-110; September, 1952.

25. Linvill, W. K., "Use of Sampled Functions for Time Domain Synthesis," Proceedings of the National Electronics Conference, Vol. 9, pp. 533-542; 1953.

26. Mathers, G. W., "The Synthesis of Lumped-Element Circuits for Optimum Transient Response," Technical Report No. 28, Electronics Research Laboratories, Stanford University; November, 1951.

27. McCool, W. A., Paper 14, Project Cyclone Symposium I; March, 1951.

28. Moore, A. D., "An Application of Prony's Method to Time Domain Synthesis," Technical Report No. 42, Electronics Research Laboratories, Stanford University; February, 1952.

29. Nadler, M., "The Synthesis of Electrical Networks According to Prescribed Transient Conditions," Proceedings of the Institute of Radio Engineers, Vol. 37, pp. 627-629; June, 1949.

30. Otterman, J., "Time Domain Network Synthesis for an Analog Computer Setup," Proceedings of the National Simulation Conference, pp. 24.1-24.5, Dallas, Texas; January, 1956.

31. Papoulis, A., "Network Response in Terms of Behavior at Imaginary Frequencies," Proceedings of the Symposium on Modern Network Synthesis, New York, N. Y.; April, 1955.

32. Shinbrot, Marvin, "A Least Squares Curve Fitting Method with Applications to the Calculation of Stability Coefficients from Transient Response Data," Technical Note 2341, National Advisory Committee for Aeronautics; April, 1951.

33. Shinbrot, Marvin, "On the Analysis of Linear and Nonlinear Dynamical Systems from Transient Response Data," Technical Note 3288, National Advisory Committee for Aeronautics; December, 1954.

34. Spencer, R. C., "Network Synthesis and the Moment Problem," Transactions of the Institute of Radio Engineers, Vol. CT-1, pp. 32-33; June, 1954.

35. Strieby, M., "A Fourier Method for Time Domain Synthesis," Proceedings of the Symposium on Modern Network Synthesis, pp. 197-211, New York; April, 1955.

36. Teasdale, R. D., "Time Domain Approximation by Use of Pade Approximants," The Institute of Radio Engineers Convention Record, Part 5, pp. 89-94; March, 1953.

37. Thomson, W. E., "The Synthesis of a Network to Have a Sine-Squared Impulse Response," Proceedings of the Institution of Electrical Engineers, Vol. 99, Part III, pp. 373-376; November, 1952.

38. Truxal, J. G., Automatic Feedback Control System Synthesis, McGraw-Hill Book Company, Inc., New York; 1955.

39. Van Vleck, J. H., and Middleton, D., "A Theoretical Comparison of the Visual, Aural, and Meter Reception of Pulsed Signals in the Presence of Noise," Journal of Applied Physics, Vol. 17, pp. 940-971; November, 1946.

40. Westcott, J. H., "The Introduction of Constraints into Feedback System Designs," Transactions of the Institute of Radio Engineers, Vol. CT-1, pp. 39-49; September, 1954.

41. Winkler, S., "The Approximation Problem of Network Synthesis," Transactions of the Institute of Radio Engineers, Vol. CT-1, pp. 5-21; September, 1954.

42. Zabusky, N. J., "A Numerical Method for Determining a System Impulse Response from the Transient Response to Arbitrary Inputs," Transactions of the Institute of Radio Engineers, Vol. PGAC-1; May, 1956.

43. Zadeh, L. A., and Ragazzini, J. R., "An Extension of Wiener's Theory of Prediction," Journal of Applied Physics, Vol. 21, pp. 645-655; July, 1950.