THE UNIVERSITY OF MICHIGAN COMPUTING RESEARCH LABORATORY1 A UNIFIED METHOD FOR EVALUATING REAL-TIME COMPUTER CONTROLLERS: A CASE STUDY K.G. Shin, C.M. Krishna and Y.H. Lee CRL-TR-23-83 JUNE 1983 Room 1079, East Engineering Building Ann Arbor, Michigan 48109 USA Tel: (313) 763-8000 1This research has been supported in part by NASA Grant No. NAG 1-296. Any opinions, findings, and conclusions or recommendations expressed in this publication are those of the authors and do not necessarily reflect the views of the funding agency.

A UNIFIED METHOD FOR EVALUATING REAL-TIME COMPUTER CONTROLLERS: A CASE STUDY1 K.G. Shin, Senior Member, IEEE, C.M. Krishna, Student Member, IEEE, and Y.H. Lee, Student Member, IEEE Computing Research Laboratory Department of Electrical and Computer Engineering The University of Michigan Ann Arbor, Michigan 48109 ABSTRACT A real-time control system consists of a synergistic pair, that is, a controlled process and a controller computer. We have defined new performance measures for real-time controller computers on the basis of the nature of this synergistic pair. In this report we present a case study of a typical critical controlled process in the context of new performance measures that express the performance of both controlled processes and real-time controllers (taken as a unit) on the basis of a single variable: controller response time. Controller response time is a function of current system state, system failure rate, electrical and/or magnetic interference, etc., and is therefore a random variable. Control overhead is expressed as a monotonically non-decreasing function of the response time, and the system suffers catastrophic failure, or dynamic failure, if the response time for a control task exceeds the corresponding system hard deadline, if any. A rigorous probabilistic approach is used to estimate the performance measures. The controlled process chosen for study is an aircraft in the final stages of descent, just prior to landing. Control constraints are particularly severe during this period, and great care must be taken in the design of controllers that handle this process. First, the performance measures for the controller are presented.
Second, control algorithms for solving the landing problem are discussed, and finally the impact of our performance measures on the problem is analyzed, showing that the performance measures and the associated estimation method have great potential use for designing and/or evaluating real-time controllers and controlled processes. Also, one application to the design of controller computers, presented in detail, is checkpointing for enhanced reliability. Index Terms -- Controlled process(es), controller computers, hard deadlines, response time, performance measures, allowed state space, aircraft landing, checkpointing. 1This work has been supported in part by NASA Grant No. NAG 1-296. Any opinions, findings, and conclusions or recommendations expressed in this report are those of the authors and do not necessarily reflect the views of the funding agency.

1. INTRODUCTION Any real-time system can be regarded as a composite of controlled subsystems (henceforth called controlled processes) and controller subsystem(s). Traditionally, the performance of real-time control computers has been analyzed separately from that of the corresponding controlled processes. For example, the response delay caused by the controller is neither studied rigorously nor reflected carefully in the design of control algorithms for the controlled processes. The design of the controller is frequently based on ad hoc requirements imposed by control designers. While this yields acceptable results in the control of non-critical processes, such an approach needs to be improved in the design of controllers for critical processes, e.g., aircraft. What is called for is a procedure for specifying and evaluating controller performance, enabling systematic application and providing objective results that lend themselves to formal validation. The use of computers as real-time controllers is becoming increasingly attractive due to continuing advances in the development of inexpensive, powerful microprocessors and memories. However, performance measures presently used to characterize real-time computer systems are adapted versions of those employed for more conventional computers. There is a considerable mismatch between the requirements of real-time applications and what is provided by these measures. To solve this problem, several contortions of the conventional measures have been proposed. Generally, these involve representing real-time computer performance as a vector p ∈ R^m, made up of such traditional indices as (conventional) reliability, throughput, survivability, availability, etc. However, it is impossible to compare two performance vectors (and therefore the corresponding computer systems) without an associated metric. One straightforward approach is to use a linear metric (i.e.,
inner product) to map the vector into a scalar that is then claimed to represent the performance of the system. For example, the mapping can be carried out by assigning weights to the various components of the performance vector and adding them to

produce the scalar. That is, if the weight vector is w = (w_1, ..., w_m)^T, then the mapping is f: R^m → R with f(p) = w^T p. This process of ascribing weights is largely subjective and is therefore inaccurate to begin with. Even if the weight ascription were completely objective, serious practical difficulties would remain. For example, since the components of the performance vector are mutually dependent (sometimes in a very complex manner), the weights (that are supposed to define the sensitivity of the scalar to the respective vector components) must be modified by (often very complex) correction factors to account for this coupling.2 Furthermore, relating the resulting scalar to "real-world" performance parameters (such as operating cost, etc.) is difficult. The performance measures we introduced in [1] are designed to get around these difficulties by expressing the performance objectively in terms of the response time of the computer-controller. From the point of view of the controlled process, the computer controlling it is a black box whose behavior is exemplified by its response time and reliability. It is well known that controller delay has a detrimental effect on process behavior; our measures take the form of a quantification of this effect. The performance measures are considered in Section 2 in some detail prior to the presentation of an idealized case-study of their application. The case-study is that of a real-time computer in charge of an aircraft in its final phase of flight, just prior to touchdown. There are stringent control constraints that must be met. These consist of limits on the speed of touchdown (both horizontal and vertical), the angle of attack, α, and the pitch angle, θ. For a definition of these angles, see Figure 1. These constraints are variously intended to safeguard against running out of runway, undercarriage collapse, stalling, and landing on either the aircraft nose or tail.
Insofar as this is a control problem with severe constraints, the 2This is because the weights are supposed to represent the total derivatives of the mapped scalar with respect to their respective components. However, since the components of p are not orthogonal, this is not true for the unmodified weights: they represent the partial derivatives, which in this case are not equal to the respective total derivatives.
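As a concrete (and purely illustrative) instance of the linear metric criticized above, with made-up component values and weights:

```python
# Sketch: collapsing a performance vector into a scalar via f(p) = w^T p.
# The components and weights below are hypothetical, for illustration only.
import numpy as np

# p = (reliability, throughput, availability) -- illustrative values
p = np.array([0.999, 0.85, 0.97])
# subjectively chosen weights -- the very step the text objects to
w = np.array([0.5, 0.2, 0.3])

f_p = float(w @ p)   # the scalar f(p) = w^T p
```

The sketch makes the objection tangible: nothing in the computation constrains w, and the coupling between components of p is silently ignored.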

problem is typical of many other critical applications, such as the control of nuclear reactors, the generation and distribution of electrical power, life-support systems, etc. Since our objective here is to illustrate the use of our performance measures and not to solve a control problem, the aircraft system is somewhat idealized in this report. Figure 2 shows the block diagram of a typical control system. The inputs to the controller are from sensors that provide data about the controlled process, and from the environment. This is typically fed to the computer-controller at regular intervals. Data rates are usually low: generally fewer than 20 words per second for each sensor. Central to the operation of the system is the trigger generator. In most systems, this is physically part of the controller itself, but we separate them here for purposes of clarity. It is the function of the trigger generator to initiate execution of a controller job (defined later). Triggers can be classed into three categories. (1) Time-generated trigger: These are generated at regular intervals, and lead to the corresponding controller job being initiated at regular intervals. In control-theoretic terms, these are open-loop triggers. (2) State-generated trigger: These are closed-loop triggers, generated whenever the system is in a particular set of states. For practicality, it might be necessary to space these triggers by more than a specified minimum duration. If time is to be regarded as an implicit state variable, the time-generated trigger is a special case of the state-generated trigger. One can also have combinations of the two. (3) Operator-generated trigger: The operator can generally override the automatic systems, generating and cancelling triggers at will. The output of the controller is fed to the actuators and/or the display panel(s). Since the actuators are mechanical devices and the displays are meant as a human

interface, the data rates here are usually very low. Indeed, as we have pointed out elsewhere [13], a computer control system exhibits a fundamental dichotomy, with the I/O being carried out at rather low rates and the computations having to be carried out at very high rates owing to real-time constraints on control. The controller in our case-study is a real-time computer. It executes predefined control jobs. There is a certain number of control jobs in any control system that are executed repeatedly. A control system executes "missions." These are periods of operation between successive periods of maintenance. In the case of aircraft, a mission is usually a single flight. The operating interval can sometimes be divided into consecutive sections that can be distinguished from each other. These sections are called phases. For example, Meyer et al. [6] define the following four distinct phases in the mission lifetime of a civilian aircraft: (a) Takeoff/cruise until VHF Omnirange (VOR)/Distance Measuring Equipment (DME) out of range. (b) Cruise until VOR/DME in range again. (c) Cruise until landing is to be initiated. (d) Landing. The phase to be considered here is landing; it takes about 20 seconds. The controller job that we shall treat is the control of the aircraft elevator deflection during landing.3 The specific system employed is assumed to be organized as shown in Figure 3. Sensors report on the four key parameters: altitude, descent rate, pitch angle, and 3The output of the controller is assumed to be fed into a peripheral processor that is dedicated to controlling the actuator -- in this case, the elevator.

pitch angle rate every 60 milli-seconds.4 We have a time-generated trigger, with a time period of 60 milli-seconds. Every 60 milli-seconds, the controller computes the optimal setting for the elevator, which is the only actuator used in the landing phase.5 The execution time for the computation is nominally 20 milli-seconds, although this can vary in practice due to failures. Since the aircraft is a dynamic system, the effects of controller delay are considerable -- as we shall see in this report. Since the process being controlled is critical (i.e., one in which some failures can lead to catastrophic consequences), variations of controller delay and other abnormal behavior by the controller must be explicitly considered. For simplicity, we do not allow job pipelining in the controller; in other words, a controller job must be completed or abandoned before its successor can be initiated. The following controller abnormalities can occur: (i) The controller orders an incorrect output to the actuator. (ii) The controller takes substantially more than 20 milli-seconds (the nominal execution time) but less than the inter-trigger interval of 60 milli-seconds to complete executing. (iii) The controller takes more than 60 milli-seconds to complete executing. In such a case, the abnormal job is abandoned and the new one initiated. We say that a control trigger is "missed" when this happens. An analysis of controller performance during the landing phase must take each of the above abnormalities into account. 4The sensors and actuators are assumed to have their own dedicated processors for I/O purposes. When we speak of "controller delay," we also include the delay in these processors. Also, the period of 60 milli-seconds is arbitrary, and the choice of this period does not alter the method developed here. 5There are other actuators used aboard the aircraft for purposes of stability, horizontal speed control, etc.
We do not, however, consider them here, concentrating exclusively on the control of the elevator.
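The timing-related abnormalities above can be sketched as a simple classifier over measured response times, using the report's nominal figures (20 ms execution, 60 ms trigger period); the thresholds are the only inputs taken from the text:

```python
# Sketch: classify a controller job's outcome from its response time.
NOMINAL_MS = 20.0   # nominal execution time
PERIOD_MS = 60.0    # inter-trigger interval

def classify(response_ms: float) -> str:
    """Map a measured response time to abnormality classes (ii)-(iii)."""
    if response_ms > PERIOD_MS:
        return "missed trigger"   # case (iii): job abandoned, trigger missed
    if response_ms > NOMINAL_MS:
        return "delayed"          # case (ii): slow but within the period
    return "normal"
```

Note that case (i), an incorrect output with normal timing, cannot be detected from response time alone.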

This report is organized as follows. In Section 2 we present the performance measures that will be used, and Section 3 contains a description of the controlled system. In Section 4, we derive the measures associated with the controlled process (the aircraft), and in Section 5 we consider one example of their application to the design of real-time controllers. The report concludes with Section 6. 2. PERFORMANCE MEASURES 2.1. Review of the Performance Measures For completeness, we review briefly in this section the performance measures to be used, which were introduced by us in [1]. The measures are all based on a single attribute: the computer controller response time distribution. A real-time computer controller in general exhibits stochastic behavior.6 Real-time computer controllers repeatedly execute predefined control jobs which are initiated either by environmental stimuli or internally. Central to our performance measures are the concepts of dynamic failure and allowed or admissible state-space. Every critical process must operate within a state-space circumscribed by given constraints. This is the allowed state-space. Leaving this state-space constitutes dynamic failure.7 In the example we treat here, the states are the altitude, the vertical speed, the pitch angle, and the pitch angle rate. Each of these has a constraint. For example, the aircraft must not touch down with too great a downward velocity or the undercarriage will collapse. The performance of the controlled process naturally depends on the speed of the controller. If the controller takes longer than a certain duration to formulate the control, dynamic failure becomes possible. This duration is the hard deadline. 6This is partially because failures are assumed to occur randomly over the operating interval. The failure law for the components of the computer is assumed to be known.
Furthermore, the execution of control tasks is stochastic due to blocking at shared resources, conditional branches in task code, etc. 7Dynamic failure can occur as a result of the controller not responding fast enough to the environment. The notion expresses the fact that slowness of the controller can be a cause of catastrophic failure.

We define a cost function C_a(ξ) associated with controller response time ξ for controller job a. The cost function takes the following form:

    C_a(ξ) = g_a(ξ)   if 0 ≤ ξ ≤ τ_da
           = ∞        if ξ > τ_da                                        (1)
           = 0        otherwise

where g_a(·) is a suitable continuous non-decreasing function of ξ and τ_da is the hard deadline associated with the job a. Clearly, since the environment influences the quality of system performance, the cost function is implicitly a function of the system state. Also, if τ_da is a finite quantity in some region of the state-space, the job is critical in that region. The determination of the hard deadline is treated in detail in Sections 2.2 and 2.3. For controller response times less than the hard deadline, the cost function in (1) above is continuous, monotonically non-decreasing, and therefore always bounded for finite response time. For consistency, it is assumed that the costs accrue as the execution proceeds. The functions g_a (called the finite cost functions) can be obtained using techniques known to control theory, and express the consumed energy, fuel, time, or some other physical parameter associated with the trajectory of the system as it travels from its initial to its final state. See, for example, [2,3], for details. The cost of running the controlled process over, say, an interval of time [t_0, t_f] is usually expressed by:

    Θ = ∫_{t_0}^{t_f} E[ f_0(x(t), u(t), t) | y(τ), t_0 ≤ τ ≤ t ] dt      (2)

where E[·|·] represents conditional expectation, f_0 is the instantaneous index of performance at time t, and x(t) ∈ R^n, u(t) ∈ R^l and y(t) ∈ R^m represent the state, input, and measurement vectors, respectively. A good representation for g_a(ξ) is given by:

    g_a(ξ) = Φ_a(ξ) − Φ_a(0)   for 0 ≤ ξ ≤ τ_da                          (3)
           = 0                  otherwise

where Φ_a(η) is the expected contribution of u_ca to Θ if the response time of that particular execution of job a is η, and u_ca is the control input subvector associated with job a. Note that the input vector u for job a consists of the control input subvector, u_ca, as well as the environment (random) input subvector, u_ea. A version is an instance of the execution of a task. Versions are numbered in sequence of initiation: successive versions of task i are denoted by V_{i1}, V_{i2}, .... The response time associated with a version V_{ij} is denoted by RESP(i,j). Let q_i(t) represent the number of versions executed for task i over the interval [0, t), and r the number of distinct tasks. Then define

    S(t) = Σ_{i=1}^{r} Γ_i(t),   where   Γ_i(t) = Σ_{j=1}^{q_i(t)} h_i(RESP(i,j))

and

    h_i(t) = g_i(t)      if 0 ≤ t ≤ t_di
           = g_i(t_di)   if t > t_di

For an illustration, see Figure 4. Clearly, h_i is the cost function C_i "hard-limited" at g_i(t_di). Γ_i is essentially the finite operating cost associated with task i. Remark: It might legitimately be argued that to associate a contribution to finite cost after the hard deadline has been missed is inconsistent with the notion of hard deadlines being "absolute" in the sense that missing a hard deadline, by definition, has catastrophic consequences (e.g., an airplane crash). By this argument, h_i(t) = 0 for all t > t_di. However, such an assignment would, while pacifying the purists, lead to unpleasant anomalies, not the least of which is that a very poor system which almost always misses deadlines would exhibit a smaller finite operating cost than a counterpart that almost always fulfills them. Also, assume that the computer system is modelled as a Markov process. This is clearly possible. The number of states8 depends on the extent to which the system is capable of graceful degradation. Let B be the set of states where the probability of failure is unity.
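The hard-limiting of the cost function and the accumulation of Γ_i can be sketched directly; the quadratic g and the response times below are arbitrary illustrations, not values from the report:

```python
# Sketch: the hard-limited cost h_i and the accumulated finite cost
# Gamma_i, with an arbitrary quadratic finite cost function g.
def make_h(g, t_d):
    """Hard-limit the finite cost function g at the deadline t_d."""
    def h(t):
        return g(min(t, t_d))
    return h

g = lambda t: t * t          # illustrative finite cost function
h = make_h(g, t_d=0.06)      # hypothetical 60 ms hard deadline

# Gamma_i: sum h over the observed response times RESP(i, j) of task i
responses = [0.02, 0.03, 0.09]   # hypothetical response times (seconds)
gamma_i = sum(h(r) for r in responses)
```

The third response (90 ms) contributes only g(t_d), illustrating why a chronically late system still accrues finite cost rather than appearing artificially cheap.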
These states represent the states when the extent of 8The word "state" used here has a different connotation from the "state" discussed in the preceding sections. The latter has a control-theoretic meaning. On the other hand, the state of controller computers usually means the number of functioning processors, buses, memories, jobs, etc. However, both forms of usage conform to the essential concept of "state" as a codification of relevant system condition. For this reason, the same word has been used for the two different purposes, following the usual practice. Its interpretation should be made from the context.

hardware/software collapse is so great that there is a zero probability of successful execution of any task in finite time. Let L(t) denote the probability distribution of the operating interval duration between two successive service stages. Our performance measures are then:9 Probability of Dynamic Failure, p_dyn: This is the probability that, over the operating interval, at least one hard deadline is missed for whatever reason. This probability incorporates within it the probability of static failure, which is the probability that so massive a hardware failure has occurred that the system utilization is greater than unity. Static failure probability has erroneously been treated in most of the literature as expressing the total probability of failure. This is most decidedly not the case in real-time systems. Mean Modified Cost, M:

    M = ∫_0^∞ E[ S(t) | system never enters state set B ] dL(t)

It can be shown that, for physical systems, this integral always exists, since the lifetime is always finite with probability 1. The performance measures can be used to rank rival computer systems and to help design improvements to existing systems in the context of the control application. Typically, the probability of dynamic failure is used as a pass/fail test for candidate controllers. This test can be very severe: for example, 10^-9 is the specification for failure probability adopted by NASA for computer controllers of the next decade handling a 10-hour civilian flight. The mean cost is then employed to rank controllers that have passed the dynamic failure criterion. For fuller details, see [1]. Note that all parameters associated with these performance measures can either be definitively estimated or objectively measured. Also, the measures specifically incorporate the controlled process into a determination of the controller's 9There are other performance measures developed in [1], but not considered here. For our purposes, the measures listed here are sufficient.
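The probability of dynamic failure can be estimated by simulation when no closed form is available. The sketch below assumes, purely for illustration, exponentially distributed controller response times; none of the distributional choices come from the report:

```python
# Sketch: Monte Carlo estimate of the probability of dynamic failure,
# under a hypothetical exponential response-time distribution.
import random

def p_dyn(mean_resp, deadline, n_jobs, trials=20000, seed=1):
    """Estimate the fraction of missions in which at least one of the
    n_jobs controller responses exceeds the hard deadline."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        if any(rng.expovariate(1.0 / mean_resp) > deadline
               for _ in range(n_jobs)):
            failures += 1
    return failures / trials
```

With roughly 330 control computations per landing, a sketch like this makes clear why the tail of the response-time distribution, not its mean, dominates p_dyn.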

capabilities. This is, as far as we know, a novel approach, which ensures that the performance measures are not generically, but instead specifically, indicative of the controller performance in a given application. For this reason, these measures are intrinsically more reliable than others in use. 2.2. Hard Deadlines Roughly speaking, hard deadlines are deadlines that must be met if catastrophic failure is to be avoided by a critical process. In other words, the hard deadline is the deadline that, if met by the controller in (correctly) formulating its response, ensures that the system remains in its allowed state-space. Traditionally, it has been assumed by computer engineers that the hard deadlines for each critical controller job are somehow "given." Unfortunately, this presupposes a precise definition of the hard deadline and a means for obtaining it. Neither seems to exist in the literature. At first glance, it might seem that the hard deadlines can be obtained relatively easily from an analysis based on the state equations of the controlled process. This is the case when individual controller actions are decoupled from each other and the process is simple. For an example in which this is the case, see [1]. However, when there is considerable coupling between individual controller jobs, i.e., when two or more controller jobs mutually affect each other, or when no closed-form solutions are available for the process state equations, obtaining a hard deadline for each can be difficult. For example, in the aircraft landing problem, the controller has, over the twenty seconds or so that it takes to complete the landing, to compute the elevator deflection a number of times (in our example, about 330 times). The constraints are on the final values (except for the angle of attack), i.e., as long as the aircraft touches down on the runway without over- or under-shoot, with an acceptable velocity and at a proper pitch angle, dynamic failure has not occurred.
The problem here is that it is not just a single controller action that determines whether catastrophic failure will occur; it is the cumulative effect of, in this case, 330 or so distinct

controller actions. How then is one to allocate deadlines to the individual actions? It is clear that we need a more carefully defined framework to handle these problems in a feasible manner. For this it is convenient to represent the controlled process by a state-space model. Let the state of the system at time t be denoted by x(t). State transitions are characterized by a mapping ψ: T × T × X × U → X, where T ⊂ R represents the time region, X ⊂ R^n the state space, and U ⊂ R^l the input space; that is,

    x(t_1) = ψ(t_1, t_0, x(t_0), u)                                      (4)

where u ∈ U represents the controls (inputs) applied to the process in the interval [t_0, t_1). Let D ⊂ U be the admissible input space (i.e., the range of inputs that it is possible to apply), and X_A ⊂ X the allowed state-space. Then, the hard deadline associated with controller job a, triggered at t_0 when the system is in state x(t_0), is given by

    τ_da(x(t_0)) = inf_{u ∈ D} sup{ τ | ψ(t_0 + τ, t_0, x(t_0), u) ∈ X_A }        (5)

Thus, for every point in the state space, we have for each critical controller job a corresponding hard deadline.10 It should be noted that the calculation in (5) is performed over the entire admissible state and input space and is thus difficult to achieve. One might wish to perform the calculation over only a subset of the admissible input or state space. To allow for this, the notion of conditional hard deadlines can be employed. Let us assume that the sets ω ⊂ D and σ ⊂ X_A are specified, and also that x(t_0) ∈ σ. 10Notice in this context that it is not possible for the hard deadline as defined above to be negative unless this is the first instance during the current mission that the controller job is being executed. This is because, if this were the case, failure would already have occurred on the previous execution of the controller job, by definition. Also, the deadline on the first instance of execution of the control function cannot be negative, since that would mean that the controller-process system had been improperly designed.
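To make the definition concrete, consider a deliberately simple scalar plant ẋ = ax + u (a > 0) with |u| ≤ u_max and allowed region |x| ≤ x_max — a toy stand-in, not the aircraft model. Withholding control (u = 0) lets the state grow as x_0 e^{at}, and recovery remains possible only while |x| ≤ min(x_max, u_max/a), so this plant's hard deadline is available in closed form:

```python
# Sketch: hard deadline for the toy scalar plant xdot = a*x + u,
# |u| <= u_max, allowed region |x| <= x_max. Illustrative only.
import math

def hard_deadline(x0, a, u_max, x_max):
    """Longest time the control may be withheld while recovery stays
    possible: with u = 0, x(t) = x0 * exp(a*t), and the control can
    still hold the state only while |x| <= min(x_max, u_max / a)."""
    x_rec = min(x_max, u_max / a)
    if abs(x0) >= x_rec:
        return 0.0
    return math.log(x_rec / abs(x0)) / a
```

The deadline shrinks as the initial state nears the recoverable boundary, mirroring the state dependence of τ_da in (5).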

The conditional hard deadline11 of job a, denoted by τ_da|ω,σ, is defined as

    τ_da|ω,σ(x(t_0)) = inf_{u ∈ ω} sup{ τ | ψ(t_0 + τ, t_0, x(t_0), u) ∈ σ }      (5a)

For the purposes of the case-study in this report, however, we shall restrict ourselves to the unconditioned hard deadlines. As it stands, the expression for the hard deadline (and for the conditional deadlines) is not easy to obtain for the entire state-space. In fact, it is almost impossible to obtain in closed form in all but the simpler control systems. We shall later see how to obtain a good approximate expression for τ_da. Also, if the environment is stochastic in character and not deterministic, the hard deadline is a random variable. Assuming that the environment is stochastically stationary leads to the existence of the distribution function of the hard deadline. 2.3. Allowed State-Space and Its Decomposition As we said above, it is difficult to determine the hard deadline and the finite cost function as a function of the state over the entire state space. To take our present example of an aircraft, the solution of the state equations is not obtainable in closed form when controller delay is considered. To obtain the functional dependence of the hard deadlines or the finite cost function of each controller job on the current state vector is therefore impossible to do analytically, and prohibitively expensive to do numerically for a large number of sample states. To get around this problem, we divide the allowed state-space into subspaces. Subspaces are aggregates of states in which the system exhibits roughly the same behavior.12 In each subspace, each critical controller job has a unique hard 11Unlike the unconditioned hard deadline, it is possible for the conditional hard deadline to be negative, since no specific relationship is required between the subsets ω and σ.
12Even if there do not exist clear boundaries for these subspaces, one can always force the admissible state space to be divided into subspaces so that a sufficient safety margin can be provided. This is a designer's choice for approximation.

deadline. Remarks: In some subspaces, a job described in general as "critical" might not be critical, in the sense that even if the execution delay associated with it is infinite, catastrophic failure does not occur. That is, the associated hard deadline may be infinite for a particular subspace. What does usually happen in these circumstances is that the system moves into a new subspace -- or at the least toward the subspace boundary -- in which the dangers of catastrophic failure are greater. In this subspace, the requirements on controller delay are more stringent, and there might well be a hard deadline, representing a critical task. Thus a "critical" job need not be truly critical in every subspace; it only has to map into a critical task -- defined in the sequel -- in at least one subspace. Also, subspaces are job-related, i.e., the same allowed state space can divide into a different set of subspaces for each control job. For convenience, a controller "task" is defined as follows. Definition: A controller task, often abbreviated to "task," is defined as a controller job operating within a designated subspace of the allowed state space. Let S_i for i = 0, 1, ..., s be disjoint subspaces of X_A with X_A = ∪_{i=0}^{s} S_i, and let J denote a controller job. Then, we need the projection: (J, X_A) → ((T_0, S_0), (T_1, S_1), ..., (T_s, S_s)), where T_i is the controller task generated by executing J in S_i. With each controller task, we may now define a hard deadline without the coupling problem mentioned above. We denote it by t_di^J for critical task T_i (for convenience, however, the superscript J will be omitted in the sequel). We will see that a critical job can possibly map into a non-critical task for one or more allowed subspaces; it only needs to map into a critical task in at least one such subspace to be considered critical. A. Allowed State-Space The admissible state-space is the set of states that the system must not leave if catastrophic failure is not to occur.
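The projection of a job onto per-subspace tasks can be represented as a simple table of deadlines; the subspace names and deadline values below are purely hypothetical, with an infinite deadline marking a non-critical task:

```python
# Sketch: a job J as a family of (subspace, hard deadline) pairs.
# math.inf marks a subspace in which the job maps to a non-critical task.
import math

# Hypothetical deadlines (seconds) for one job across three subspaces
subspace_deadlines = {"S0": math.inf, "S1": 0.12, "S2": 0.06}

def is_critical_job(deadlines):
    """A job is critical iff it maps into a critical task (finite hard
    deadline) in at least one subspace."""
    return any(math.isfinite(d) for d in deadlines.values())
```

This mirrors the definition in the text: criticality is a property of the job's worst subspace, not of every subspace.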
Consider the two sets of states X_A^1 and X_A^2, defined as follows. (i) X_A^1 is the set of states that the system must reside in if catastrophic failure is not to occur immediately. For example, in the aircraft landing problem we may define a situation in which the aircraft flies upside down as unacceptable to the passengers and as constituting failure. Notice that terminal constraints are not taken into consideration here unless the task in question is executed just

prior to mission termination. (ii) X_A^2 is the set of acceptable states given the terminal constraints, i.e., it is the set of states from which, given the constraints on the control, it is possible to satisfy the terminal constraints. Note that leaving X_A^1 means that, no matter how good our subsequent control, failure has occurred.13 On the other hand, changing the control available can affect the set X_A^2. The admissible state space is then defined as X_A = X_A^1 ∩ X_A^2. Obtaining the state-space X_A^2 can be difficult in practice. The curse of dimensionality ensures that even systems with four or five state variables make unacceptable demands on computation resources for the accurate determination of the allowed state-space. However, while it can be very difficult to obtain the entire allowed state-space, it is somewhat easier to obtain a reasonably large subset, X_A' ⊂ X_A. By defining this subset as the actual allowed state-space (i.e., by artificially restricting the range of allowed states), we make a conservative estimate of the allowed state-space. Note that by making a conservative approximation, we err on the side of safety. Also, the information we need about X_A may be determined to as much precision as we are willing to invest in computing resources. In what follows, to avoid needless pedantry, we shall refer to the artificially restricted allowed state-space, X_A', simply as the "allowed state-space." B. On Obtaining the Subspaces While the methods used to isolate the subspaces for each particular control application will probably be different, the basic approach is much the same in all cases. Let N(x) represent a neighborhood of x ∈ X_A, and let C_i^x(ξ) and t_di(x) denote respectively the cost function and the hard deadline associated with T_i, where ξ represents 13Strictly speaking, of course, there can be no subsequent control, since by leaving X_A^1 the system has failed catastrophically before the next control could be implemented.

the controller response time, and let d: X×X → R be a metric function. Then the subspaces S0, S1, ..., Sn of X_A can be obtained by the following steps.

S1. Choose a set of l points p_i ∈ X_A, i = 1, 2, ..., l.

S2. Construct a neighborhood N(p_i) around each of these points, such that
(i) all the N(p_i) are disjoint and X_A ⊆ ∪_i N(p_i);
(ii) |t_d(x) − t_d(p_i)| < K1 for all x ∈ N(p_i) and some K1 > 0;
(iii) d(C^x(ξ), C^{p_i}(ξ)) < K2 f_i(ξ) for all x ∈ N(p_i), some K2 > 0, and some monotonically non-decreasing positive function f_i(·).

S3. If d(C^{p_i}, C^{p_{i+1}}) < K3 f'(ξ) for i = 1, 2, ..., (l−1), some K3 > 0, and some monotonically non-decreasing positive function f'(ξ), then merge N(p_i) and N(p_{i+1}), forming the subspaces S = {S0, S1, ..., Sn}.

S4. If S satisfies all the requirements of the application jobs, the decomposition is successful; otherwise choose a different set of points and go to S2.

The job of dividing X_A into S = {S0, S1, ..., Sn} is sometimes made easy by the existence of natural cleavages in the state-space, when the latter is viewed as an influence on system behavior. In most cases, however, such conveniences do not exist, and artificial means must be found. The problem then becomes one of finding discrete subdivisions of a continuum. The method we employ is to quantize the state continuum in much the same way as analog signals are quantized into digital ones. Intervals of hard deadlines and of expected operating cost (i.e., the mean of the cost function conditioned on the controller delay time, using the distribution of the latter) are defined. Then, points are allocated to subspaces corresponding to these intervals. To take a concrete example, consider a state-space X ⊂ R^n that is to be subdivided on the basis of the

hard deadlines. The first step is to define a quantization step Δ for the hard deadlines. Then, define subspace S_i as containing all states in which the hard deadline lies in the interval [(i−1)Δ, iΔ). Alternatively, one might define a sequence of numbers Δ1, Δ2, ... such that the subspaces are defined by intervals with the Δ's as their end-points; this corresponds to quantizing with variable step sizes. The subspace in which the job under consideration maps into a non-critical task is a special case and is denoted by S0. Subspaces can also be defined based on a quantization of the expected operating cost, or on both the operating cost and the hard deadlines. We provide an example of subdivision by hard deadlines in Section 4.

The size of each subspace will depend on the process state equations, the environment, and how much computing effort it is judged to be worth spending on obtaining the subspaces. Naturally, all other things being equal, the smaller a subspace the greater the accuracy of the inherent approximation. (The error that ensues as a result of quantization of the state space can be estimated in the same way that quantization error is estimated in signal processing theory.)

In the rest of the report, to illustrate the derivation of the performance measures, we carry out their evaluation when the controlled process is an aircraft in the phase of landing. Also, optimal checkpointing is considered for the design of a reliable controller.

3. THE CONTROLLED PROCESS

The controlled process is an aircraft in the phase of landing. The model and the optimal control solution used are due to Ellert and Merriam [4]. The aircraft dynamics are characterized by the equations:

    x1'(t) = b11 x1(t) + b12 x2(t) + b13 x3(t) + c21 m1(t, ξ)    (6a)
    x2'(t) = x1(t)                                               (6b)
    x3'(t) = b32 x2(t) + b33 x3(t)                               (6c)
    x4'(t) = x3(t)                                               (6d)

where x2 is the pitch angle, x1 the pitch angle rate, x3 the altitude rate, and x4 the altitude. m1 denotes the elevator deflection, which is the sole control employed. The constants b_ij and c21 are given in Table 1. Recall that ξ denotes controller response time.

The phase of landing takes about 20 seconds. Initially, the aircraft is at an altitude of 100 feet, travelling at a horizontal speed of 256 feet/sec. This latter velocity is assumed to be held constant over the entire landing interval. The rate of ascent at the beginning of this phase is -20 feet/sec. The pitch angle is ideally to be held constant at 2°. Also, the motion of the elevator is restricted by mechanical stops: it is constrained to lie between -35° and +15°. For linear operation, the elevator may not operate against the elevator stops for nonzero periods of time during this phase. Saturation effects are not considered, and neither are wind gusts and other random environmental effects.

The constraints are as follows. The pitch angle must lie between 0° and 10° to avoid landing on the nose-wheel or on the tail, and the angle of attack (see Figure 1) must be held to less than 18° to avoid stalling. The vertical speed with which the aircraft touches down must be less than about 2 feet/sec so that the undercarriage can withstand the force of landing.

The desired altitude trajectory is given by

    h_d(t) = 100 e^(-t/5),   0 <= t <= 15
           = 20 - t,         15 <= t <= 20        (7)

while the desired rate of ascent is

    h_d'(t) = -20 e^(-t/5),  0 <= t < 15
            = -1,            15 <= t <= 20        (8)

The desired pitch angle is 2° and the desired pitch angle rate is 0° per sec.
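The desired trajectories of Eqs. (7) and (8) are easy to tabulate; a minimal sketch (the function names are ours) that also confirms the two altitude pieces meet at t = 15 s to within a few hundredths of a foot:

```python
import math

def h_d(t):
    """Desired altitude in feet, Eq. (7): exponential flare to t = 15 s,
    then a linear segment down to touchdown at t = 20 s."""
    return 100.0 * math.exp(-t / 5.0) if t <= 15.0 else 20.0 - t

def hdot_d(t):
    """Desired rate of ascent in feet/sec, Eq. (8)."""
    return -20.0 * math.exp(-t / 5.0) if t < 15.0 else -1.0

print(h_d(0.0))              # 100.0 : initial altitude
print(round(h_d(15.0), 3))   # 4.979 : close to the linear piece's 5.0
print(hdot_d(20.0))          # -1.0  : sink rate within the 2 ft/sec limit
```

Note that h_d'(t) is the exact derivative of h_d(t) on each piece, so the pair is internally consistent.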

The performance index (for the aircraft) chosen by Ellert and Merriam, suitably adapted here to take account of the nonzero controller response time ξ, is given by

    Φ(ξ) = ∫_{t0}^{tf} e_m(t, ξ) dt        (9)

where t represents time, [t0, tf] is the interval under consideration, and

    e_m(t, ξ) = φ_h(t)[h_d(t) - x4(t)]^2 + φ_hdot(t)[h_d'(t) - x3(t)]^2 + φ_θ(t)[x2d(t) - x2(t)]^2
                + φ_θdot(t)[x1d(t) - x1(t)]^2 + [m1(t, ξ)]^2

where the d-subscripts denote the desired (i.e., ideal) trajectory. To ensure that the touch-down conditions are met, the weights φ must be impulse weighted. Thus we define:

    φ_h(t)    = φ4(t) + φ4,tf δ(20 - t)     (10a)
    φ_hdot(t) = φ3(t) + φ3,tf δ(20 - t)     (10b)
    φ_θ(t)    = φ2,tf δ(20 - t)             (10c)
    φ_θdot(t) = φ1(t)                       (10d)

where the functions φ must be given suitable values, and δ denotes the Dirac delta function. The values of the φ are chosen on the basis of a study of the trajectory that results; the chosen values are listed in Table 2.

The control law for the elevator deflection is given by:

    m1(t, ξ) = c21 Ks Ts [k1(t-ξ) - k11(t-ξ)x1(t-ξ) - k12(t-ξ)x2(t-ξ) - k13(t-ξ)x3(t-ξ) - k14(t-ξ)x4(t-ξ)]    (11)

where the aircraft parameters are given by Ks = -0.95 sec^-1, Ts = 2.5 sec, c = 1 radian sec^-1, and the constants k1j are the feedback parameters derived (as shown in [4]) by solving the Riccati differential equations that result upon minimizing the process performance index. For these differential equations we refer the reader to [4].
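A numerical rendering of Eqs. (9) and (10) is straightforward once the impulse (Dirac-delta) weights are folded into discrete terminal penalties, since the integral of w·δ(20−t)·e(t)^2 is simply w·e(20)^2. The sketch below is illustrative only: the helper names and unit weights are ours, not Table 2's values, and the trajectory is supplied by the caller.

```python
import math

def h_d(t):       # desired altitude, Eq. (7)
    return 100.0 * math.exp(-t / 5.0) if t <= 15.0 else 20.0 - t

def hdot_d(t):    # desired ascent rate, Eq. (8)
    return -20.0 * math.exp(-t / 5.0) if t < 15.0 else -1.0

THETA_D = math.radians(2.0)   # desired pitch angle (2 degrees, in radians)

def performance_index(x, m, phi, dt=0.01, t_end=20.0):
    """Approximate Phi of Eq. (9) by a rectangle rule.  The pitch-angle
    weight (10c) is impulse-only, so it has no running term; the impulse
    parts of (10a)-(10c) become discrete penalties at t = t_end.
    x(t) -> (x1, x2, x3, x4); m(t) -> elevator deflection; phi -> weights."""
    J = 0.0
    steps = int(round(t_end / dt))
    for i in range(steps):
        t = i * dt
        x1, x2, x3, x4 = x(t)
        J += dt * (phi['h'] * (h_d(t) - x4) ** 2
                   + phi['hdot'] * (hdot_d(t) - x3) ** 2
                   + phi['thetadot'] * (0.0 - x1) ** 2
                   + m(t) ** 2)
    x1, x2, x3, x4 = x(t_end)     # impulse-weighted terminal penalties
    J += (phi['h_tf'] * (h_d(t_end) - x4) ** 2
          + phi['hdot_tf'] * (hdot_d(t_end) - x3) ** 2
          + phi['theta_tf'] * (THETA_D - x2) ** 2)
    return J

UNIT_WEIGHTS = {'h': 1, 'hdot': 1, 'thetadot': 1,
                'h_tf': 1, 'hdot_tf': 1, 'theta_tf': 1}

# Sanity check: a trajectory that tracks the ideal exactly, with zero
# control effort, incurs zero cost.
ideal = lambda t: (0.0, THETA_D, hdot_d(t), h_d(t))
print(performance_index(ideal, lambda t: 0.0, UNIT_WEIGHTS))   # 0.0
```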

4. DERIVATION OF PERFORMANCE MEASURES

We consider here only one controller task: that of computing the elevator deflection so as to follow the desired landing trajectory. The inputs for the controller are the sensed values of the four states. We seek the following information. As the controller delay increases, how much extra overhead is added to the performance index? Also, it is intuitively obvious that too great a delay will lead to a violation of the terminal (landing) conditions, thus resulting in a plane crash. This corresponds to dynamic failure, and we are naturally interested in determining the range of controller delays that permit a safe landing.

Consider first a formal treatment of the problem. The control problem is of the linear feedback form. The state equations can be expressed as:

    x'(t) = A x(t) + B u(t)

where the symbols have their traditional meanings. Denote the feedback matrix by K(t). Then, clearly,

    u(t) = K(t-ξ) x(t-ξ)

For a small controller delay (i.e., a small ξ), the above can be expanded in a Taylor series and the terms of second order and higher discarded for a linear approximation. Carrying out the obvious mathematical steps, we arrive at the equation:

    x'(t) = E(t, ξ) x(t) + f(t)

as representing the behavior of the system (assuming the given initial conditions). For further details, see Figure 5. Given a closed-form expression for the k1j(t) that appear in E(t, ξ), we could then proceed to study the characteristics of the system as a function of the matrix E. However, in the absence of such closed formulations for the k1j, we must take recourse to the less elegant medium of numerical solution.

The procedure we follow for obtaining the numerical solution is as follows. First, the feedback values are computed by solving the feedback differential

equations that define the k1j; these are not affected by the magnitude of the controller delay. Then, the state equations are solved as simultaneous differential equations. These are used to check that the terminal constraints have been satisfied, and in the event that they are, the performance functional is evaluated. This procedure must be repeated for each new subspace.

Since the environment is deterministic in this case (no wind gusts or other random disturbances are permitted in the model), the hard deadline associated with each process subspace is a constant and not a random variable. The trajectory followed by the aircraft when the delay is less than about 60 milli-seconds follows the optimal trajectory closely, although the elevator deflections required would intuitively be expected to increase as the delay increases. Also, the susceptibility of the process to failure in the presence of incorrect or no input is expected to rise with the introduction of random environmental effects.

The control that is required for various values of controller delay is shown in Figure 6. Due to the absence of any random effects, elevator deflections for all the delays considered tend to the same value as the end of the landing phase (20 seconds) is approached, although much larger controls are needed initially. In the presence of random effects, the divergence between the controls needed at low and at high values of controller delay is even more marked. We present an example of this in Figure 7. The random effect considered here is the elevator being stuck at -35° for 60 milli-seconds, 8 seconds into the landing phase, due to a faulty controller order. The controlled process in Figure 7 is assumed to be in the subspace in which the landing job maps into a non-critical process (defined in the sequel as S0). The diagrams speak for themselves. We shall show later that this demand on control is fully represented by the nature of the derived cost function.
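The first-order treatment of the delay (Figure 5) can be checked on a scalar analogue. The sketch below is not the aircraft model: it uses a hypothetical unstable plant x' = x + u with a constant stabilizing gain (rather than Riccati-derived gains), and compares the exactly simulated delayed feedback against the delay-free system predicted by the substitution x(t-ξ) ≈ x(t) - ξ x'(t).

```python
import math

def simulate_exact(a, b, k, xi, x0=1.0, dt=1e-4, T=1.0):
    """Euler-integrate the delay equation x'(t) = a x(t) + b k x(t - xi),
    holding x at x0 for t < 0; a fixed-length history buffer supplies
    the delayed value."""
    lag = int(round(xi / dt))
    hist = [x0] * (lag + 1)
    x = x0
    for _ in range(int(round(T / dt))):
        x_delayed = hist[0]
        x += dt * (a * x + b * k * x_delayed)
        hist.append(x)
        hist.pop(0)
    return x

def simulate_approx(a, b, k, xi, x0=1.0, T=1.0):
    """First-order model: x(t - xi) ~ x(t) - xi x'(t) turns the delay
    equation into the delay-free system x' = (a + b k)/(1 + b k xi) x."""
    return x0 * math.exp((a + b * k) / (1 + b * k * xi) * T)

# hypothetical plant: a = 1 (unstable), stabilizing gain k = -2,
# controller delay xi = 50 ms
exact = simulate_exact(1.0, 1.0, -2.0, 0.05)
approx = simulate_approx(1.0, 1.0, -2.0, 0.05)
print(exact, approx)   # the two agree closely for this small delay
```

For delays small relative to the closed-loop time constant the two responses differ by well under a percent here; as the text notes, above a threshold delay the true delayed system destabilizes, which the first-order model can only approximate.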
Also, above a certain threshold value of controller delay, we would expect the system to become unstable. This is indeed the case in the present problem, although this point occurs beyond a delay of 60 milli-seconds for all points in the allowed state space (obtained

in the next section), and therefore cannot by definition occur here.

4.1. Allowed State Space

In this subsection, we derive the allowed state space of the aircraft system. To do so, note that in Ellert and Merriam's model, X_A does not exist. The reason is that the state equations do not take into account the angle of attack. In the idealized model we are considering, it is implicitly assumed that the constraint on the angle of attack is always honored, so that the only constraints to be considered are the terminal constraints.

The terminal constraints have been given earlier but are repeated here for convenience. The touchdown speed must be less than 2 feet/sec in the vertical direction, and the pitch angle at touchdown must lie between 0° and 10°. To avoid overshooting the runway, touchdown must occur at between 4864 and 5120 feet in the horizontal direction from the point at which the landing phase begins. The horizontal velocity is assumed to be kept constant throughout the landing phase at 256 feet/sec. (We do not consider here how that is to be done; in practice it will constitute a second controller job.) Thus, touchdown should occur between 19 and 20 seconds after the descent phase begins. (This makes time an "implicit" state variable.) The only control is the elevator deflection, which must be kept between -35° and +15°.

The set of allowed states is generally found by solving the differential equations for the system backwards from the point of landing. However, this can be computationally expensive, so we follow a cheaper alternative. The initial conditions of the process as it enters the landing stage are known. Also known is that the controller is triggered every 60 milli-seconds, and it is assumed that the computations take a minimum of 20 milli-seconds to complete. Using these data, it becomes possible to determine that portion of the allowed state-space that the controlled process is ever

likely to enter, to a good approximation. In Figure 8, we plot the range of allowed state values so obtained. As indeed it should be, the allowed state-space is a function of time.

4.2. Designation of Subspaces

We subdivide the allowed state-space found above using the method described in Section 2. The criterion used is the hard deadline, since the finite cost function (derived in the next subsection) is found not to vary greatly within the whole of the admissible state-space. The value of Δ chosen is 60 milli-seconds; in other words, we wish to consider only the case where a trigger is "missed." The allowed state-space in Figure 8 is subdivided into two subspaces, S1 and S0, corresponding to the deadline intervals [60, 120) and [120, ∞) respectively. S0 is the non-critical region corresponding to the [120, ∞) interval: here, even if the controller exhibits any of the abnormalities considered in the Introduction, the airplane will not crash. In other words, if the controller orders an incorrect output, exhibits an abnormal execution delay, or simply provides no output at all before the following trigger, the process will still survive at the end of the current inter-trigger interval if, at the beginning of that interval, it was in S0. On the other hand, if the process is in S1 at the beginning of an inter-trigger interval, it may safely endure a delay in controller response; however, if the controller behaves abnormally, either by providing no output at all for the current trigger cycle or by ordering an incorrect output, there is a positive probability of an air crash.

Notice that we explicitly consider only missing a single trigger, not the case when two or more triggers are missed in sequence. This is because dynamic failure is treated here as a function of the state at the moment of triggering. If two successive triggers are missed, for example, we have to consider two distinct states, namely the states the process is in at the moment of those respective

triggers. To speak of deadline intervals beyond 120 milli-seconds is therefore meaningless in this case, since the triggers occur once every 60 milli-seconds. This is why the second deadline interval considered is [120, ∞), not [120, 180). The hard deadline may conservatively be assumed to be 60 milli-seconds in S1; by definition it is infinity in S0.

4.3. Finite Cost Functions

As indicated in the preceding section, the finite cost does not vary greatly within the entire allowed state-space. It is therefore sufficient to find a single cost function covering both S0 and S1. The determination of the cost function is carried out as a direct application of its definition: the process differential equations are solved with varying values of ξ. The value of ξ cannot be greater than the inter-trigger interval of 60 milli-seconds since, by assumption, no job pipelining is allowed and the controller terminates any execution in progress upon receiving a trigger. The finite cost function is defined as

    C(ξ) = Φ(ξ) - Φ(0)        (12)

(Recall that Φ(ξ) represents the contribution to the performance functional by a version that takes ξ units of time to compute; since there is only one controller job under consideration, the subscript on Φ has been suppressed.) This function is found by computation to be approximately the same over the entire allowed state-space as defined in Figure 8. In Figure 9, the finite cost function is plotted; the costs are in arbitrary units.

Bear in mind that these measures are the result of an idealized model. We have, for example, ignored the effects of wind gusts and other random effects of the environment. When these are taken into account, the demands on controller speed become even greater, i.e., the costs increase.
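Equation (12) reduces the measure to a subtraction once Φ(ξ) has been computed on a grid of delays. A minimal sketch, with hypothetical Φ values in arbitrary units (not the data behind Figure 9):

```python
def finite_cost(phi_samples):
    """Eq. (12): C(xi) = Phi(xi) - Phi(0), for Phi sampled on a grid of
    controller delays xi (milliseconds).  phi_samples: dict xi -> Phi(xi),
    which must include the zero-delay baseline xi = 0."""
    phi0 = phi_samples[0]
    return {xi: phi - phi0 for xi, phi in sorted(phi_samples.items())}

# hypothetical Phi values for xi = 0..60 ms (arbitrary units)
phi = {0: 4.10, 10: 4.12, 20: 4.17, 30: 4.26, 40: 4.41, 50: 4.66, 60: 5.05}
C = finite_cost(phi)
print(C[0])   # 0.0 by construction
```

A computed cost function of this kind should come out monotonically non-decreasing in ξ, as the definition in Section 2 requires.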

The reader should compare the nature of the cost function with the plots showing elevator deflection in Figure 6, and notice the correlation between the marginal increase in cost with increased execution delay and the marginal increase in control needed, also as a function of the execution delay.

5. APPLICATION EXAMPLE FOR CONTROLLER DESIGN: CHECKPOINTING

With stringent requirements on the reliability of any computer controlling a highly critical system, it becomes necessary to provide mechanisms to identify and correct controller errors. Any such mechanism must exhibit two characteristics: a high probability that existing errors are caught in time, and a fast means of recovering from the error. In this section, we deal with the latter.

One common recovery method is the use of the recovery block or recovery region, which establishes checkpoints and saves the current job states during normal execution. When an error is detected, the system rolls back to the state saved at the previous checkpoint and the affected task-version is resumed. Clearly, checkpoints can enhance the reliability of execution and reduce the recovery overhead. They can also, however, lead to increased controller overhead, since the insertion of checkpoints increases controller delay. It is therefore important to check carefully during design whether any overall benefit accrues from the installation of checkpoints, and not to include them anyway through an ad-hoc design procedure. Checkpoints should be regarded as useful supplementary devices that enhance reliability in certain cases, not as a panacea for reliability problems.

It is the purpose of this section to demonstrate the use of the performance measures described above in a study of the effectiveness of checkpointing. Specifically, we shall consider (a) whether checkpointing is indicated in our aircraft landing problem, and (b) if so, what the optimal number of checkpoints is.

Several methods for analyzing rollback recovery systems have been proposed [8]-[12]. In general, they compute the optimum inter-checkpoint interval for minimum total execution time. In [12], a complete expression for the characteristic function of the total execution time is given, which takes into account imperfections in the checkpoints, the occurrence of errors during recovery, and multi-step rollback. However, when the mean time between failures (MTBF) is many orders of magnitude larger than both the nominal task execution time (ξ) and the duration of the phase (tp), the model for obtaining the probability distribution of the task execution time and the probability of dynamic failure with checkpoints can be simplified.

Let MTBF = 10^4 hours. This is a reasonable assumption given contemporary processors ([7], page 161). Then, in the landing phase, the ratio of ξ to MTBF is of the order of 10^-10. It is therefore an acceptable approximation to assume, in computing the execution time, that no further errors occur during error-recovery. Let the occurrence of errors be a Poisson process with rate λ = 1/MTBF. Let t_rb, t_rs, and t_cp denote the times needed to set up a rollback, to restart, and to checkpoint, respectively. Also, we assume that the saved state may be contaminated with probability p_s, which means that the system can be recovered using rollback with probability p_rb = 1 - p_s, and has to restart with probability p_s. Thus, the total execution time of one version is

    t_e = ξ + n t_cp + t_rec        (13)

where n is the number of checkpoints inserted and t_rec is the time overhead used for recovery. t_rec is a random variable which depends on the probabilities p_rb and p_s:

    t_rec = 0                  if no error occurs
          = t_rb + t_roll      if an error occurs and the version is recovered by rollback        (14)
          = t_rs + t_rst       if an error occurs and the version restarts

where t_roll and t_rst are the computation undone because of rollback and restart,

respectively. Let t_int, the interval between checkpoints, equal ξ/(n+1). The density functions of t_roll and t_rst are given by f_roll(t) = λe^(-λt) / (1 - e^(-λ t_int)) for t ∈ [0, t_int] and f_rst(t) = λe^(-λt) / (1 - e^(-λξ)) for t ∈ [0, ξ], respectively. Let the density of the total execution time obtained from the above equations be f_j(t). Then the mean execution cost and the probability of dynamic failure are given by

    COST = Σ_{j=1}^{q_i} ∫_0^∞ f_j(t) h_ij(t) dt                                    (15)

    p_dyn = 1.0 - Π_{j=1}^{q_i} ( 1.0 - ∫_{t_dij}^∞ f_j(t) dt - p_j^st )            (16)

where q_i is the number of versions executed for T_i during a phase, h_ij the cost function for executing the j-th version of T_i, t_dij the deadline associated with the j-th version of T_i, and p_j^st the probability of static failure for the j-th version of T_i, which can be regarded as the probability of resource exhaustion, unsuccessful recovery, successive failures during recovery, etc.

Using the cost function and hard deadlines given in the preceding section, and taking p_r to be the probability of having an error during recovery, the improvement in the probability of dynamic failure, p_dyn, upon insertion of checkpoints is shown in Table 3. The probability of dynamic failure does indeed decrease as more checkpoints are inserted. Unfortunately, the mean execution cost increases as this is done. Through the cost functions it is possible to express the precise extent of this cost increase, and decisions about tradeoffs can be made. When the nominal execution time is 20 milli-seconds, as assumed in the rest of this report, all that checkpoints do is increase the overhead, i.e., the mean finite cost: no discernible drop occurs in the probability of dynamic failure when checkpoints are added. The marginal gain in reliability for any payment in finite operating cost is therefore about zero.
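Under the single-error approximation above, the expectation of Eq. (13) can be sketched directly, using the means of the truncated-exponential densities of t_roll and t_rst. The parameter names below are ours and the sample error rates are illustrative:

```python
import math

def mean_exec_time(xi, n, lam, t_cp, t_rb, t_rs, p_s):
    """Expected total execution time of one version, per Eqs. (13)-(14),
    assuming at most one error per version (valid when 1/lam >> xi).

    xi  : nominal execution time (ms)     n    : number of checkpoints
    lam : error rate = 1/MTBF (per ms)    t_cp : checkpointing overhead (ms)
    t_rb: rollback set-up time (ms)       t_rs : restart set-up time (ms)
    p_s : probability the saved state is contaminated (forces a restart)
    """
    t_int = xi / (n + 1)                  # inter-checkpoint interval

    def mean_undone(L):
        # mean of the truncated exponential lam*e^(-lam*t)/(1-e^(-lam*L))
        # on [0, L]; expm1 avoids cancellation when lam*L is tiny
        return 1.0 / lam - L * math.exp(-lam * L) / (-math.expm1(-lam * L))

    p_err = -math.expm1(-lam * xi)        # P(an error hits this version)
    mean_rec = p_err * ((1.0 - p_s) * (t_rb + mean_undone(t_int))
                        + p_s * (t_rs + mean_undone(xi)))
    return xi + n * t_cp + mean_rec

# rare errors (MTBF = 10^4 hours): checkpoints only add overhead
lam = 1.0 / (1e4 * 3600.0 * 1e3)          # per millisecond
print(mean_exec_time(20.0, 0, lam, 0.1, 2.0, 2.0, 0.1))
print(mean_exec_time(20.0, 5, lam, 0.1, 2.0, 2.0, 0.1))
```

With an artificially high error rate the rollback saving outweighs the checkpointing overhead, which is the regime in which checkpoints begin to pay off; with the landing phase's actual MTBF the checkpoint overhead dominates, consistent with the Table 3 results discussed above.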
This was only to be expected, since the nominal execution time is one-third of the hard deadline and the probability of dynamic failure without checkpoints was vanishingly small. (Since the landing job is non-critical in S0, the issue of checkpointing does not arise there; all remarks in this section are therefore concerned with the characteristics of the process in S1.) However, as the nominal execution time increases (the hard deadline being assumed to be its S1 value of 60 milli-seconds), the benefits of checkpointing emerge. This is made clear by the fourth column in Table 3, where we present the marginal tradeoff ratio between the benefit gained, in the form of improved reliability, and the loss, in the form of increased controller overhead. As is evident from this, for a 20 milli-second nominal execution time the optimal number of checkpoints is zero. For 30 milli-seconds, there is something to be gained in reliability by putting in one checkpoint; for 40 milli-seconds, there is a gain to be had by adding up to two checkpoints, although the marginal gain falls off sharply after the first. For a nominal execution time of 50 milli-seconds, the benefits continue rather steadily until four checkpoints have been added; the fifth checkpoint provides some noticeable improvement in reliability, although the marginal gain is distinctly smaller than for the first four. The recommendations for design are now rather clear: use no checkpoints if the nominal execution time is 20 milli-seconds, and use Table 3 to decide on the optimal number of checkpoints in the other cases.

6. DISCUSSION

In this report, we have presented a case-study of the determination of the performance measures introduced in [1], and considered an important application in controller design. Central to our report is the idea that it is possible to objectively quantify the performance of a controller. Owing to this objectivity, there are many possible extensions to this work. One extension, presently under study, is the issue of distributed control and the cost of transmitting global status information to all the local controllers. For guaranteed reliability, the local controllers require a complete knowledge of

the global state of the system. This, however, has a cost in terms of the extra response time exhibited by the overall controller. If the local controllers have less than complete information, their actions cannot be optimal and might even be incorrect. However, the response time of the controller could be significantly reduced, with errors occurring rarely enough to make this an improvement. Readers will recognize this formulation as an example of the application of Markov decision theory with costly information, with the cost functions for the controller jobs now providing the cost of status information. Other applications include the quasi-optimal allocation and reallocation of control jobs to different processors in a multiprocessor controller, the dynamic control of queues in controllers, and the objective ranking of rival computer systems as controllers of any specific process.

ACKNOWLEDGMENT

The authors are indebted to Rick Butler and Milton Holt at NASA Langley Research Center for their technical and financial assistance.

REFERENCES

[1] C. M. Krishna and K. G. Shin, "Performance Measures for Multiprocessor Controllers," Performance '83: Ninth Int'l Symp. Comp. Perf., Meas., and Eval., pp. 229-260.

[2] D. E. Kirk, Optimal Control Theory, Prentice-Hall, Englewood Cliffs, NJ, 1970.

[3] A. P. Sage, Optimum Systems Control, Prentice-Hall, Englewood Cliffs, NJ, 1970.

[4] F. J. Ellert and C. W. Merriam, "Synthesis of Feedback Controls Using Optimization Theory -- An Example," IEEE Trans. Auto. Control, Vol. AC-8, No. 4, April 1963, pp. 89-103.

[5] L. T. Wu, "Models for Evaluating the Performability of Degradable Computing Systems," Computing Research Laboratory Report CRL-TR-7-82, The University of Michigan, Ann Arbor, June 1982.

[6] J. F. Meyer, et al., "Performability Evaluation of the SIFT Computer," IEEE Trans. Comput., Vol. C-29, No. 6, June 1980, pp. 501-509.

[7] J. H. Wensley, et al., "Design Study of Software-Implemented Fault-Tolerance (SIFT) Computer," NASA Contractor Report 3011, 1982.

[8] K. M. Chandy, J. C. Browne, C. W. Dissly and W. R. Uhrig, "Analytic Models for Rollback and Recovery Strategies in Data Base Systems," IEEE Trans. Softw. Engg., Vol. SE-1, No. 1, March 1975, pp. 100-110.

[9] K. M. Chandy and C. V. Ramamoorthy, "Rollback and Recovery Strategies for Computer Programs," IEEE Trans. Comput., Vol. C-21, No. 6, June 1972, pp. 546-556.

[10] E. Gelenbe and D. Derochette, "Performance of Rollback Recovery Systems under Intermittent Failures," Comm. of the ACM, Vol. 21, No. 6, June 1978, pp. 493-499.

[11] J. W. Young, "A First Order Approximation to the Optimum Checkpoint Interval," Comm. of the ACM, Vol. 17, No. 9, Sept. 1974, pp. 530-531.

[12] Y.-H. Lee and K. G. Shin, "Design and Evaluation of a Fault-Tolerant Multiprocessor Using Hardware Recovery Blocks," Computing Research Laboratory Report CRL-TR-6-82, The University of Michigan, Ann Arbor, August 1982.

[13] K. G. Shin and C. M. Krishna, "A Distributed Microprocessor System for Controlling and Managing Military Aircraft," Proc. Distributed Data Acquisition, Computing and Control Symp., Miami, FL, December 1980, pp. 156-1 6.

[Figure 1. Definition of aircraft angles.]

[Figure 2. A typical real-time control system: environment, controlled process, sensors, state estimator, job list, trigger generator, controller, actuators, display, and operator.]

[Figure 3. Aircraft control system schematic: sensors with A/D conversion and I/O processors, control processors, D/A conversion and actuators, ground communication, peripherals, and pilot display.]

[Figure 4. Illustration of cost functions: execution timelines against successive triggers, with response times ξ_j = RESP(i, j) for j = 1, 2, 3, 4.]

[Figure 5. The approximate state equations: x'(t) = E(t, ξ)x(t) + f(t), where the entries a_1j of E(t, ξ) are expressed in terms of the constants b_ij and c21 and the feedback gains k_1j(t), and the forcing vector has a single nonzero entry involving c21 and k1(t).]

[Figure 6. Elevator deflection versus time for controller delays ξ = 0, 40, 50, and 60 msec (panels a-d).]

[Figure 7. Elevator deflection (rad) versus time in the presence of the random effect described in the text (four panels).]

[Figure 8(a). Allowed state space: altitude, with the desired trajectory shown.]

[Figure 8(b). Allowed state space: descent rate, with the desired descent rate shown.]

[Figure 8(c). Allowed state space: pitch angle, with the desired pitch angle shown.]

[Figure 8(d). Allowed state space: pitch angle rate, with the desired pitch angle rate shown.]

[Figure 9. Landing job cost function: cost (arbitrary units) versus controller delay, 0-60 msec.]

Feedback Term    Value
b11              -0.600
b12              -0.760
b13               0.003
b32             102.4
b33              -0.4
c21              -2.374

Table 1. Feedback Values

Weighting Factor              Value
φ1(t)                         99.0
φ2,tf                         20.0
φ3(t)  (0 <= t < 15)           0.0
φ3(t)  (15 <= t <= 20)         0.0001
φ3,tf                          1.000
φ4(t)                          0.00005
φ4,tf                          0.001

Table 2. Weighting Factors

n   MEAN COST   p_dyn         Tradeoff x 10^7
0   0.12848     0.3086E-15    -
1   0.12909     0.3086E-15    0.0
2   0.12971     0.3086E-15    0.0
3   0.13033     0.3086E-15    0.0
4   0.13095     0.3086E-15    0.0
5   0.13157     0.3086E-15    0.0

(a) Nominal execution time 20 msec.

n   MEAN COST   p_dyn          Tradeoff x 10^7
0   0.26156     0.37037E-07    -
1   0.26431     0.37037E-08    121.5
2   0.26709     0.37037E-08    0.0
3   0.26991     0.37037E-08    0.0
4   0.27272     0.37037E-08    0.0
5   0.27567     0.37037E-08    0.0

(b) Nominal execution time 30 msec.

n   MEAN COST   p_dyn          Tradeoff x 10^7
0   0.55352     0.30555E-06    -
1   0.55472     0.43055E-07    2177.0
2   0.55586     0.30555E-07    109.8
3   0.55694     0.30555E-07    0.0
4   0.55795     0.30555E-07    0.0
5   0.55891     0.30555E-07    0.0

(c) Nominal execution time 40 msec.

Table 3. Checkpoints

n   MEAN COST   p_dyn          Tradeoff x 10^7
0   0.89694     0.46666E-06    -
1   0.90846     0.35666E-06    95.6
2   0.92025     0.26166E-06    80.6
3   0.93231     0.16666E-06    78.7
4   0.94466     0.71666E-07    76.9
5   0.95730     0.46666E-07    19.8

(d) Nominal execution time 50 msec.

p_rb = 0.9, MTBF = 10^4 hours, t_cp = 0.1 msec, t_rb = 2.0 msec, t_rs = 2.0 msec.

Tradeoff ratio for n checkpoints (n >= 1) =
    (p_dyn with n-1 checkpoints - p_dyn with n checkpoints) / (COST with n checkpoints - COST with n-1 checkpoints)

Table 3. (Cont.) Checkpoints
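The tradeoff column can be reproduced, to within rounding of the published figures, as the marginal reliability gain per unit of added mean cost, scaled by 10^7. A sketch using the Table 3(d) values:

```python
def tradeoff_ratios(costs, p_dyn, scale=1e7):
    """Marginal tradeoff ratio for n = 1..len-1 checkpoints:
    (p_dyn[n-1] - p_dyn[n]) / (cost[n] - cost[n-1]), scaled by 1e7."""
    return [scale * (p_dyn[n - 1] - p_dyn[n]) / (costs[n] - costs[n - 1])
            for n in range(1, len(costs))]

# Table 3(d): nominal execution time 50 msec
costs = [0.89694, 0.90846, 0.92025, 0.93231, 0.94466, 0.95730]
p_dyn = [0.46666e-6, 0.35666e-6, 0.26166e-6, 0.16666e-6,
         0.71666e-7, 0.46666e-7]
print([round(r, 1) for r in tradeoff_ratios(costs, p_dyn)])
# close to the tabulated 95.6, 80.6, 78.7, 76.9, 19.8
```

The small residual differences from the tabulated column presumably reflect rounding of the published COST and p_dyn entries.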