A SOLUTION OF THE FILTERING AND SMOOTHING PROBLEMS FOR UNCERTAIN-STOCHASTIC DIFFERENTIAL SYSTEMS

ANDREI V. BORISOV
Department of Industrial & Operations Engineering
University of Michigan
Ann Arbor, MI 48109-2117

ALEXEI R. PANKOV
Department of Applied Mathematics
Moscow Aviation Institute
Moscow, 127080, Russia

Technical Report 91-33
November 1991

(Revised on November 14, 1991)

Abstract. In this paper we present new filtering algorithms for uncertain-stochastic dynamic systems which are optimal in the sense of a minimax-stochastic criterion. These algorithms make it possible to estimate the state vector of a dynamic system when a priori information about the system characteristics is lacking, using observations of the state and input signals. The algorithms also allow us to construct an optimal two-filter smoothing algorithm for pure uncertain dynamic systems. These algorithms are tested numerically, and the results are compared with the results of Kalman filtering and smoothing in the case of complete information about the input signal characteristics.

1 Introduction.

The Kalman filter (see Kalman and Bucy, 1961) is known as the best mean-square estimator of the state vector of a dynamic observation system. At first, the use of filtering algorithms was limited by strict conditions such as linearity of the system and observations, complete information about the system and input signal characteristics, and Gaussianity of all noises. Subsequently, many attempts were made to extend the class of observation systems for which filtering algorithms may be

used efficiently. General equations were obtained for nonlinear filtering and for filtering of conditionally Gaussian processes (see Liptser and Shiryayev, 1977), for filtering of diffusion processes with Poisson-type observations (Pardoux, 1979), for filtering of Poisson processes in Gaussian noise (Hero, 1991), and for filtering of Ito-Volterra processes (Kleptsina and Veretennikov, 1985). Necessary and sufficient conditions for finite dimensionality of recursive filters were also presented in (Brockett and Clark, 1980) and (Tam et al., 1990). All these papers considered the filtering problem under the assumption of complete information about the system and input signal characteristics. In many practical situations a priori information about the system and its inputs is incomplete. To overcome this serious obstacle one may use adaptive (Goodwin and Sin, 1984), (Fomin, 1984), (Eweda, 1991), H∞ (Nagpal and Khargonekar, 1991), robust (Kassam and Poor, 1985), (Barton and Poor, 1990), or minimax (Kurzhansky, 1977), (Malyshev and Kibzun, 1987) approaches. All these algorithms use a limited amount of a priori information. In many applied cases it is possible to use both observations of the state vector and observations of some input signal components to obtain high-precision estimates even when nothing is known about the input characteristics. In this paper we consider new recursive filtering algorithms for uncertain-stochastic systems given state and input signal observations. These algorithms were presented briefly in (Borisov and Pankov, 1991). Here we prove the optimality of a filtering algorithm for continuous-time observation systems. This algorithm makes it possible to design an optimal fixed-interval smoother for pure uncertain systems. The analog of this smoother for pure stochastic systems was considered in (Mayne, 1966) and (Wall et al., 1981). We also present the results of some numerical experiments.
The accuracy of the new estimates in the absence of a priori information about the inputs is compared with the accuracy of the Kalman filter estimates obtained with completely known input signal characteristics. It turns out that these accuracies are close to each other. Hence, we can use a numerically efficient filtering approach for a wide class of uncertain-stochastic

systems and obtain acceptable estimates. From these experiments it also follows that the smoothing estimates are significantly more accurate than the corresponding filtering ones. Smoothing estimates are therefore preferable in the case of post-experimental data processing.

2 Model description.

Consider the following dynamic system

    dx_t = a(t) x_t dt + dw_t, t > 0; x_0 = v,   (1)

where x_t is the state vector, v is an uncertain initial condition, x_t, v ∈ R^p, w_t is an input signal, and a(t) is a known matrix-valued function. Assume that w_t is given by

    dw_t = b(t) u_t dt + dξ_t,   (2)

where u_t is an uncertain vector, u_t ∈ R^q; {ξ_t} is a p-dimensional zero-mean Wiener process with differential covariance matrix C(t); b(t) is a known matrix-valued function. Information concerning {x_t}, {u_t} and v is obtained from the observation processes {y_t}, {z_t}, which are given by

    y_0 = φ_0 v + ω_0,
    dy_t = φ(t) u_t dt + dω_t,   (3)
    dz_t = ψ(t) x_t dt + dη_t,

where {ω_t} and {η_t} are zero-mean Wiener processes with differential covariance matrices Q(t) and P(t) respectively, ω_t ∈ R^m, η_t ∈ R^n; ω_0 is a zero-mean random vector with covariance matrix Q_0. We assume for simplicity that ω_0, {ξ_t}, {ω_t} and {η_t} are independent. The matrices a(t), b(t), C(t), φ(t), ψ(t), Q(t), P(t) have piecewise-continuous elements. Let U_t be the class of Lebesgue integrable functions on [0, t]; then for every v and {u_τ} ∈ U_t equation (1) defines the unique second-order process {x_τ}.
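The continuous-time model (1)-(3) above lends itself to a simple Euler-Maruyama simulation. The sketch below treats the scalar case; the particular values of a, b, C and the test input u_t = sin(t) are illustrative assumptions of ours, not values from the report:

```python
import numpy as np

# Euler-Maruyama simulation of the scalar case of model (1)-(2):
# dx_t = a x_t dt + dw_t, with input signal dw_t = b u_t dt + d xi_t.
# a, b, C, and u_t below are illustrative choices, not from the report.
rng = np.random.default_rng(0)
a, b, C = -0.5, 1.0, 0.04
h, T = 0.01, 10.0
n = int(T / h)

x = np.empty(n + 1)
x[0] = 1.0                       # initial condition v, fixed for the simulation
for i in range(n):
    t = i * h
    u = np.sin(t)                # the "uncertain" input, here a known test signal
    dxi = np.sqrt(C * h) * rng.standard_normal()   # Wiener increment of xi_t
    x[i + 1] = x[i] + (a * x[i] + b * u) * h + dxi
```

With a < 0 the simulated trajectory stays bounded, tracking the forced response b u_t / |a| up to the noise level.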

Now let us consider the linear discrete-time dynamic system

    x_t = a_t x_{t-h} + w_t, x_0 = v, t = h, 2h, ...   (4)

where x_t is the state vector, v is an uncertain initial condition, a_t is a known matrix, h > 0 is the time increment, and w_t is an input signal given by

    w_t = b_t u_t + ξ_t.   (5)

Here u_t is an uncertain vector, b_t is a known matrix, and {ξ_t} is a zero-mean discrete random process with cov(ξ_t, ξ_s) = C_t δ_{ts}. The corresponding observation model is given by

    y_0 = φ_0 v + ω_0,
    y_t = φ_t u_t + ω_t,   (6)
    z_t = ψ_t x_t + η_t,

where {ω_t} and {η_t} are zero-mean discrete random processes with cov(ω_t, ω_s) = Q_t δ_{ts} and cov(η_t, η_s) = P_t δ_{ts}; ω_0 is a zero-mean random vector with covariance matrix Q_0. We assume for simplicity that ω_0, {ξ_t}, {ω_t} and {η_t} are independent.

3 Filtering problem.

The linear filtering problem for the system (1)-(3) is to calculate the best linear estimate x̂_t given the observations {y_τ, z_τ, 0 ≤ τ ≤ t}. We consider the best linear estimate in the sense of minimizing the following criterion

    J_t = sup_{v ∈ R^p, {u_τ} ∈ U_t} E{(x_t − x̂_t)^T Σ_t (x_t − x̂_t)},   (7)

where Σ_t > 0 is a known weight matrix.

Proposition 1. Assume that {x_t}, {y_t} and {z_t} are given by (1)-(3). Let
i) Q_0 > 0, Q(τ) > 0, P(τ) > 0 for all τ ∈ (0, t];
ii) φ_0^T Q_0^{-1} φ_0 > 0, φ^T(τ) Q^{-1}(τ) φ(τ) > 0 for all τ ∈ (0, t].

Then the J_t-optimal linear estimate x̂_t is unbiased. x̂_t and its error covariance matrix k(t) are given by the following equations

    dx̂_t = a(t) x̂_t dt + b(t) [φ^T(t) Q^{-1}(t) φ(t)]^{-1} φ^T(t) Q^{-1}(t) dy_t + k(t) ψ^T(t) P^{-1}(t) [dz_t − ψ(t) x̂_t dt],
    k̇(t) = a(t) k(t) + k(t) a^T(t) − k(t) ψ^T(t) P^{-1}(t) ψ(t) k(t) + b(t) [φ^T(t) Q^{-1}(t) φ(t)]^{-1} b^T(t) + C(t),   (8)

with initial conditions

    x̂_0 = [φ_0^T Q_0^{-1} φ_0]^{-1} φ_0^T Q_0^{-1} y_0; k(0) = [φ_0^T Q_0^{-1} φ_0]^{-1}.   (9)

The proof of Proposition 1 is given in Appendix A.

Now let us consider the discrete-time model (4)-(6). Let U_t = (u_h^T, ..., u_t^T, v^T)^T be the M-dimensional block vector, M = (t/h)q + p. The linear filtering problem for the system (4)-(6) is similar to the filtering problem for the system (1)-(3), but the optimality criterion differs slightly from (7):

    J_t* = sup_{U_t ∈ R^M} E{(x_t − x_t*)^T Σ_t (x_t − x_t*)}.   (10)

Proposition 2. Assume that {x_t}, {y_t} and {z_t} are given by (4)-(6). Let
i) Q_τ > 0 for all τ ∈ [0, t], P_σ > 0 for all σ ∈ [h, t];
ii) φ_τ^T Q_τ^{-1} φ_τ > 0 for all τ ∈ [0, t];
iii) a_τ is nonsingular for all τ ∈ [h, t].

Then the J_t*-optimal linear estimate x_t* is unbiased. x_t* and its error covariance matrix k_t are given by the following equations

    x̃_t = a_t x*_{t−h} + b_t [φ_t^T Q_t^{-1} φ_t]^{-1} φ_t^T Q_t^{-1} y_t,
    k̃_t = a_t k_{t−h} a_t^T + b_t [φ_t^T Q_t^{-1} φ_t]^{-1} b_t^T + C_t;   (11)

    x_t* = x̃_t + k̃_t ψ_t^T (ψ_t k̃_t ψ_t^T + P_t)^{-1} (z_t − ψ_t x̃_t),
    k_t = k̃_t − k̃_t ψ_t^T (ψ_t k̃_t ψ_t^T + P_t)^{-1} ψ_t k̃_t,   (12)

with initial conditions

    x_0* = [φ_0^T Q_0^{-1} φ_0]^{-1} φ_0^T Q_0^{-1} y_0; k_0 = [φ_0^T Q_0^{-1} φ_0]^{-1}.   (13)

The proof of Proposition 2 is given in (Borisov, Pankov and Sotsky, 1991).

4 Fixed-interval smoothing problem.

The linear fixed-interval smoothing problem for the system (1)-(3) is to calculate for all t ∈ [0, T] the best linear estimate x̂_t^s given the observations {y_τ, z_τ, 0 ≤ τ ≤ T}. We consider the best linear estimate in the sense of minimizing the following criterion

    J_t^s = sup_{v ∈ R^p, {u_τ} ∈ U_T} E{(x_t − x̂_t^s)^T Σ_t (x_t − x̂_t^s)},   (14)

where Σ_t > 0 is a known weight matrix. The dynamic systems under investigation are purely uncertain, i.e. C(t) ≡ 0. The smoothing algorithm is based on the idea of two-filter smoothing, similar to the one for pure stochastic systems (Mayne, 1966), (Wall et al., 1981). Let us consider the reversed-time system

    dx_t^r = −a(t) x_t^r dt − dw_t^r, x_T^r = v^r.   (15)

If v^r = x_T and u_t^r = u_t for all t ∈ [0, T], then systems (1) and (15) define pathwise-equal processes. We assume the existence of an observation of the state vector at the final moment T:

    y_T = φ_T x_T + ω_T,   (16)

where ω_T is a zero-mean random vector with cov(ω_T, ω_T) = Q_T. Proposition 1 applied to the model (15) makes it possible to obtain the backward estimate x̂_t^r given the observations {y_τ, z_τ, t ≤ τ ≤ T},

i.e. the best linear estimate of x_t given the "future" observations. This estimate is unbiased with error covariance matrix k_t^r.

Proposition 3. Assume that {x_t}, {y_t} and {z_t} are given by (1), (3), (16). Let the conditions of Proposition 1 hold and
i) the system (1), (2) is pure uncertain;
ii) φ_T^T φ_T > 0, Q_T > 0.

Then the J_t^s-optimal unbiased smoothing estimate x̂_t^s and its error covariance matrix k_t^s are given by

    x̂_t^s = k_t^s [k_t^{-1} x̂_t + (k_t^r)^{-1} x̂_t^r]; k_t^s = [k_t^{-1} + (k_t^r)^{-1}]^{-1}.   (17)

The proof of Proposition 3 is given in Appendix B.

5 Numerical examples.

1. Let us consider the control system given by

    ẍ − ẋ + 0.25x = u_1 + 0.5u̇_1 + u_2 + ξ̇, x(0) = x_0, ẋ(0) = ẋ_0,   (18)

where {u_1}, {u_2} are unknown input signals, x_0, ẋ_0 are unknown initial conditions, and {ξ_t} is a Wiener process with differential covariance C = 2500. Equation (18) may be rewritten as the first-order differential system

    ẋ_1 = x_2 + 0.5u_1,
    ẋ_2 = −0.25x_1 + x_2 + 1.5u_1 + u_2 + ξ̇,   (19)

where x(t) = x_1(t). The observations are given by

    z_0 = x_1(0) + η_0, z_T = x_1(10) + η_T, ż = x_1 + η̇,   (20)
    y_1 = u_1 + u_2 + ω̇_1, y_2 = u_1 − u_2 + ω̇_2,

where {η} and {(ω_1, ω_2)^T} are Wiener processes with differential covariances cov(η, η) = 900, cov(ω_1, ω_1) = Δ, cov(ω_2, ω_2) = Δ, cov(ω_1, ω_2) = Δ/2, respectively. The known parameter Δ characterizes the accuracy of the observations; cov(η_0, η_0) = cov(η_T, η_T) = 100000. The filtering problem is to calculate recursively the J_t-optimal estimate of the state x given the observations (20). We use the algorithm (8), (9). Table 1 gives the estimation results: σ_1, σ_2, σ_3 are the estimation error standard deviations obtained with Δ = 100, Δ = 1000, Δ = 5000 respectively, and σ_K is the estimation error standard deviation obtained by the Kalman filter in the ideal situation when complete information about u_1, u_2 is available. From Table 1 it follows that the estimates given by the algorithm (8), (9) are close enough to those of the Kalman filter and may be used in the case of unknown but observable input signals, when the Kalman filter is useless.

    time    σ_K      σ_1      σ_2      σ_3
    0       316.23   316.23   316.23   316.23
    0.2     74.02    74.03    74.15    74.66
    1       66.37    66.40    66.63    67.67
    2       54.51    54.60    55.45    58.61
    3       51.93    52.10    53.48    57.78
    5       51.68    51.87    53.37    57.77
    6       51.68    51.87    53.37    57.77

Table 1. Estimation error standard deviations (σ_1, σ_2, σ_3) for the filter (8), (9) and the corresponding values σ_K for the Kalman filter.

2. Let us consider the observation system (18)-(20) in the case C = 0. The fixed-interval smoothing problem is to calculate the J_t^s-optimal estimate for all t ∈ [0, 10]. We use the algorithm given by (17). Table 2 gives the estimation results: σ_1^f, σ_2^f, σ_3^f are the filter estimation error standard deviations obtained with Δ = 100, Δ = 1000, Δ = 5000, and σ_1^s, σ_2^s, σ_3^s are the corresponding smoothing estimation error standard deviations. From Table 2 it follows that the smoothing

estimate is significantly more accurate than the corresponding filtering estimate.

    time    σ_1^f    σ_1^s    σ_2^f    σ_2^s    σ_3^f    σ_3^s
    0       316.23   12.13    316.23   24.93    316.23   36.57
    0.2     74.01    11.98    74.13    23.71    74.64    33.02
    1       66.17    11.95    66.42    23.14    67.46    32.32
    2       53.01    11.89    54.00    22.70    57.57    31.02
    3       47.98    11.98    50.28    22.40    56.23    30.80
    5       44.75    13.53    49.26    22.32    56.18    30.80
    7       44.19    19.82    49.26    23.59    56.18    30.84
    8       44.15    25.46    49.26    27.35    56.18    31.43
    9       44.13    32.55    49.26    34.57    56.18    37.38
    9.8     44.13    37.41    49.26    40.39    56.18    44.01
    10.0    44.13    43.71    49.26    48.67    56.18    56.31

Table 2. Standard deviations of the filtering estimate error (σ_1^f, σ_2^f, σ_3^f) and the corresponding values σ_1^s, σ_2^s, σ_3^s for the smoothing.

6 Appendix A.

Proof of Proposition 1. An arbitrary linear estimate x̂_t is given by

    x̂_t = α(t) y_0 + ∫_0^t β(t, τ) dy_τ + ∫_0^t γ(t, τ) dz_τ,   (21)

where the optimal functions α, β, γ must be calculated. The error Δ_t = x_t − x̂_t can be decomposed as Δ_t = m_t + Δ_t^0, where m_t = E{x_t − x̂_t} is the bias of the estimate and Δ_t^0 = Δ_t − m_t is the random error. Then J_t = J_t^1 + J_t^2, where

    J_t^1 = sup_{v ∈ R^p, {u_τ} ∈ U_t} E{m_t^T Σ_t m_t}, J_t^2 = E{(Δ_t^0)^T Σ_t Δ_t^0},

    m_t = [Φ(t, 0) − α(t) φ_0 − ∫_0^t γ(t, τ) ψ(τ) Φ(τ, 0) dτ] v + ∫_0^t [Φ(t, τ) b(τ) − β(t, τ) φ(τ) − ∫_τ^t γ(t, σ) ψ(σ) Φ(σ, τ) dσ b(τ)] u_τ dτ,   (22)

where Φ(t, τ) is the solution of the matrix differential equation

    ∂Φ(t, τ)/∂t = a(t) Φ(t, τ), Φ(τ, τ) = I.

Since v ∈ R^p and {u_t} ∈ U_t are arbitrary,

    J_t^1 = 0 if Σ_t [Φ(t, 0) − α(t) φ_0 − ∫_0^t γ(t, τ) ψ(τ) Φ(τ, 0) dτ] = 0 and Σ_t [Φ(t, τ) b(τ) − β(t, τ) φ(τ) − ∫_τ^t γ(t, σ) ψ(σ) Φ(σ, τ) dσ b(τ)] = 0, and J_t^1 = ∞ otherwise.

By using the matrix Schwarz inequality (Fomin, 1984) it may be shown that the optimal α(t), β(t, τ) are

    α(t) = [Φ(t, 0) − ∫_0^t γ(t, τ) ψ(τ) Φ(τ, 0) dτ] φ_0^+,
    β(t, τ) = [Φ(t, τ) − ∫_τ^t γ(t, σ) ψ(σ) Φ(σ, τ) dσ] b(τ) φ^+(τ),   (23)

where φ_0^+ = (φ_0^T Q_0^{-1} φ_0)^{-1} φ_0^T Q_0^{-1} and φ^+(τ) = (φ^T(τ) Q^{-1}(τ) φ(τ))^{-1} φ^T(τ) Q^{-1}(τ) are pseudoinverses of φ_0 and φ(τ). From (22) x̂_t is unbiased and hence Δ_t = Δ_t^0. Hence

    x̂_t = Φ(t, 0) φ_0^+ y_0 + ∫_0^t Φ(t, τ) b(τ) φ^+(τ) dy_τ + ∫_0^t γ(t, τ) {dz_τ − ψ(τ) [Φ(τ, 0) φ_0^+ y_0 + ∫_0^τ Φ(τ, σ) b(σ) φ^+(σ) dy_σ] dτ}.   (24)

Then the error Δ_t satisfies the equation

    Δ_t = ∫_0^t γ(t, τ) dg_τ − Δ̄_t,

where {Δ̄_t}, {g_t} are defined by

    dΔ̄_t = a(t) Δ̄_t dt + b(t) φ^+(t) dω_t + dξ_t, Δ̄_0 = φ_0^+ ω_0,   (25)
    dg_t = ψ(t) Δ̄_t dt − dη_t, g_0 = 0.   (26)

The problem is to find a weight function γ(t, τ) that minimizes J_t^2, or equivalently to calculate the J_t^2-optimal estimate Δ̂_t of Δ̄_t given the observations {g_τ}. The solution of this problem is given by the Kalman filter

    dΔ̂_t = a(t) Δ̂_t dt + k(t) ψ^T(t) P^{-1}(t) [dg_t − ψ(t) Δ̂_t dt],
    k̇(t) = a(t) k(t) + k(t) a^T(t) − k(t) ψ^T(t) P^{-1}(t) ψ(t) k(t) + b(t) [φ^T(t) Q^{-1}(t) φ(t)]^{-1} b^T(t) + C(t),
    Δ̂_0 = 0, k(0) = (φ_0^T Q_0^{-1} φ_0)^{-1}.   (27)

Here k(t) = cov(Δ̄_t − Δ̂_t, Δ̄_t − Δ̂_t) = cov(x_t − x̂_t, x_t − x̂_t), and

    Δ̂_t = ∫_0^t Θ(t, τ) k(τ) ψ^T(τ) P^{-1}(τ) dg_τ,   (28)

where Θ(t, τ) is given by

    ∂Θ(t, τ)/∂t = [a(t) − k(t) ψ^T(t) P^{-1}(t) ψ(t)] Θ(t, τ), Θ(τ, τ) = I.   (29)

From (28) it follows that

    γ(t, τ) = Θ(t, τ) k(τ) ψ^T(τ) P^{-1}(τ).   (30)

Now we can obtain the first equation of (8) by substituting (30) into (24) and differentiating both sides of (24) with respect to t.

7 Appendix B.

Proof of Proposition 3. First we state that the optimal estimate x̂_t^s is a linear combination of x̂_t and x̂_t^r:

    x̂_t^s = A(t) x̂_t + B(t) x̂_t^r.   (31)

Then from the Gauss-Markov theorem we have

    A(t) = k^s(t) k^{-1}(t), B(t) = k^s(t) [k^r(t)]^{-1}, [k^s(t)]^{-1} = k^{-1}(t) + [k^r(t)]^{-1}.   (32)
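The combination (31)-(32) is a precision-weighted (information-form) fusion of the forward and backward estimates. A minimal numerical sketch of this rule follows; the function name `fuse_two_filter` is ours, not from the report:

```python
import numpy as np

def fuse_two_filter(x_f, k_f, x_b, k_b):
    """Two-filter smoothing combination per (31)-(32):
    k_s = (k_f^{-1} + k_b^{-1})^{-1},  x_s = k_s (k_f^{-1} x_f + k_b^{-1} x_b)."""
    kf_inv = np.linalg.inv(k_f)          # forward information matrix
    kb_inv = np.linalg.inv(k_b)          # backward information matrix
    k_s = np.linalg.inv(kf_inv + kb_inv)
    x_s = k_s @ (kf_inv @ x_f + kb_inv @ x_b)
    return x_s, k_s
```

For scalar estimates with equal unit covariances (k = k^r = 1), the rule returns the average of the two estimates with the variance halved, as expected from (32).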

As it follows from the proof of Proposition 1, the smoothing estimate is given by

    x̂_t^s = ε(t) {Φ(t, 0) φ_0^+ y_0 + ∫_0^t Φ(t, τ) b(τ) φ^+(τ) dy_τ} + ∫_0^t γ(t, τ) {dz_τ − ψ(τ) [Φ(τ, 0) φ_0^+ y_0 + ∫_0^τ Φ(τ, σ) b(σ) φ^+(σ) dy_σ] dτ} + [I − ε(t)] {Φ^r(t, T) φ_T^+ y_T + ∫_t^T Φ^r(t, τ) b(τ) φ^+(τ) dy_τ} + ∫_t^T γ^r(t, τ) {dz_τ − ψ(τ) [Φ^r(τ, T) φ_T^+ y_T + ∫_τ^T Φ^r(τ, σ) b(σ) φ^+(σ) dy_σ] dτ},   (33)

where the optimal coefficients γ(t, τ), γ^r(t, τ), ε(t) must be calculated. The unbiasedness conditions yield E{Δ_t^s} = E{x_t − x̂_t^s} = 0, i.e. x̂_t^s is an unbiased estimate of x_t. Let us define the stochastic processes Δ̄_t, Δ̄_t^r, g_t, g_t^r by

    dΔ̄_t = a(t) Δ̄_t dt + b(t) φ^+(t) dω_t, Δ̄_0 = φ_0^+ ω_0,
    dg_t = ψ(t) Δ̄_t dt − dη_t, g_0 = 0,   (34)

    dΔ̄_t^r = −a(t) Δ̄_t^r dt − b(t) φ^+(t) dω_t, Δ̄_T^r = φ_T^+ ω_T,
    dg_t^r = ψ(t) Δ̄_t^r dt − dη_t, g_T^r = 0.   (35)

The smoothing estimate error Δ_t^s can be decomposed as

    Δ_t^s = ∫_0^t γ(t, τ) dg_τ − ε(t) Δ̄_t + ∫_t^T γ^r(t, τ) dg_τ^r − [I − ε(t)] Δ̄_t^r.

The problem is to find weight functions γ(t, τ), γ^r(t, τ) that minimize J_t^s, or equivalently to calculate the optimal estimate of the linear combination ε(t) Δ̄_t + [I − ε(t)] Δ̄_t^r given the observations {g_τ, 0 ≤ τ ≤ t} and {g_τ^r, t ≤ τ ≤ T}. The solution of this problem is given by the Kalman filter for the system (34), (35). As in the proof of Proposition 1 we have

    γ(t, τ) = ε(t) Θ(t, τ) k(τ) ψ^T(τ) P^{-1}(τ),
    γ^r(t, τ) = [I − ε(t)] Θ^r(t, τ) k^r(τ) ψ^T(τ) P^{-1}(τ),   (36)

where Θ(t, τ) and Θ^r(t, τ) are given by

    ∂Θ(t, τ)/∂t = [a(t) − k(t) ψ^T(t) P^{-1}(t) ψ(t)] Θ(t, τ), t ≥ τ, Θ(τ, τ) = I,
    ∂Θ^r(t, τ)/∂t = [−a(t) − k^r(t) ψ^T(t) P^{-1}(t) ψ(t)] Θ^r(t, τ), t ≤ τ, Θ^r(τ, τ) = I.

Now we can substitute (36) into (33) and obtain (31) in the form x̂_t^s = ε(t) x̂_t + [I − ε(t)] x̂_t^r.
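As a closing numerical illustration, one step of the discrete-time recursion (11)-(12) of Proposition 2 can be sketched as follows. The helper name `uncertain_filter_step` and the use of explicit matrix inverses (rather than a numerically safer solver) are our choices for clarity:

```python
import numpy as np

def uncertain_filter_step(x_prev, k_prev, a, b, C, phi, Q, psi, P, y, z):
    # Time update (11): propagate through the dynamics, replacing the
    # unobserved input u_t by its least-squares reconstruction from y_t.
    M = np.linalg.inv(phi.T @ np.linalg.inv(Q) @ phi)
    x_pred = a @ x_prev + b @ M @ phi.T @ np.linalg.inv(Q) @ y
    k_pred = a @ k_prev @ a.T + b @ M @ b.T + C
    # Measurement update (12): standard Kalman correction from z_t.
    S = psi @ k_pred @ psi.T + P
    G = k_pred @ psi.T @ np.linalg.inv(S)
    x_new = x_pred + G @ (z - psi @ x_pred)
    k_new = k_pred - G @ psi @ k_pred
    return x_new, k_new
```

Iterating this step from the initial conditions (13) reproduces the recursion of Proposition 2 for any choice of the system matrices satisfying conditions i)-iii).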

References

Barton, R.J., and Poor, H.V., 1990. Robust estimation and signal detection. IEEE Transactions on Information Theory, 36, 485-501.

Borisov, A.V., Pankov, A.R., and Sotsky, N.M., 1991. Filtering and smoothing in uncertain-stochastic systems with partially observable input signals. Automation and Remote Control, 3, 85-95 (in Russian).

Borisov, A.V., and Pankov, A.R., 1991. Optimal signal processing for uncertain-stochastic systems. Proceedings of the 30th Conference on Decision and Control.

Brockett, R.W., and Clark, J.M.C., 1980. The geometry of the conditional density functions. Analysis and Optimization of Stochastic Systems, O.L.R. Jacobs et al., eds., Academic Press, 299-309.

Eweda, E., 1991. Convergence of the sign algorithm for adaptive filtering with correlated data. IEEE Transactions on Information Theory, 37, 1450-1457.

Fomin, V.N., 1984. Recursive estimation and adaptive filtering. Moscow, Nauka (in Russian).

Goodwin, G.C., and Sin, K.S., 1984. Adaptive filtering, prediction and control. New Jersey, Prentice Hall Inc.

Hero, A.O., 1991. Timing estimation for filtered Poisson processes in Gaussian noise. IEEE Transactions on Information Theory, 37, 92-106.

Kalman, R., and Bucy, R.S., 1961. New results in linear filtering and prediction theory. Transactions of ASME, 83, 95-108.

Kassam, S.A., and Poor, H.V., 1985. Robust techniques for signal processing: a survey. Proceedings of the IEEE, 73, 433-481.

Kleptsina, M.L., and Veretennikov, A.Yu., 1985. On filtering and properties of conditional laws of Ito-Volterra processes. Statistics and Control of Stochastic Processes, Steklov Seminar, N.V. Krylov et al., eds., Optimization Software Inc.

Kurzhansky, A.B., 1977. Control and observation under uncertainty. Moscow, Nauka (in Russian).

Liptser, R.S., and Shiryayev, A.N., 1977. Statistics of random processes. New York, Springer-Verlag.

Malyshev, V.V., and Kibzun, A.I., 1987. Analysis and synthesis of high-accuracy aircraft control. Moscow, Mashinostroenie (in Russian).

Mayne, D.Q., 1966. A solution of the smoothing problem for linear dynamic systems. Automatica, 4, 73-92.

Nagpal, K.M., and Khargonekar, P.P., 1991. Filtering and smoothing in an H∞ setting. IEEE Transactions on Automatic Control, 36, 152-166.

Pardoux, E., 1979. Filtering of a diffusion process with Poisson-type observation. Stochastic Control Theory and Stochastic Differential Systems, M. Kohlman and W. Vogel, eds., Lecture Notes in Control and Information Sciences, 16, New York, Springer-Verlag.

Tam, L.-M., Wong, W.S., and Yau, S.S.-T., 1990. On necessary and sufficient conditions for finite dimensionality of estimation algebras. SIAM Journal on Control and Optimization, 28, 173-185.

Wall, J.E., Willsky, A.S., and Sandell, N.R., 1981. On the fixed-interval smoothing problem. Stochastics, 5, 1-42.