COOLEY ELECTRONICS LABORATORY
The University of Michigan

Errata Sheet for Cooley Electronics Laboratory TR 136, 3674-4-T

Page 20: In the sixth line from the bottom of the page, replace "a + d/2" with "at + d/2."
Page 22: In the second line from the bottom of the page, replace "7k(L)" with "k(L)."
Page 28: In the fifth line above Q.E.D., add ")" to P("A"|SN).
Page 30: In the twelfth line, add the words "equals the loss due to a false alarm" after the word "miss."
Page 35: In the first line below Eq. 2.7, replace "1 x (- f)" with "(1/√2π) exp[-t²/2]."
Page 35: In Eq. 2.9a, the first term on the right-hand side of the equation is negative.
Page 37: In Eq. 2.16, the term e^(v²/2) is in the numerator.
Page 37: In Eq. 2.11, the inequality sign is missing. Refer to the same equation on page 36.
Page 39: In the first line below Eq. 2.24, replace "Eq. 2.14" with "Eq. 2.24."
Page 57: In the third line, replace "interactive" with "iterative."
Page 60: In Eq. 3.8, the first term on the right-hand side should be [e^L/(1 + e^L)] f(y|SN).
Page 70: In Section 3.3, L is non-negative.
Page 88: In the second line above Eq. 4.1, replace "n" with "k."
Page 88: In Eq. 4.1, the right-hand side should read ∫ F_(k-1)(L + ln[ℓ(y)]) f(y) dy + C.
Page 110: In the two lines above Eq. 4.32, replace "A0" with "a."
Page 115: The word "Determind" on the lower graph should be replaced with "Determined."
Page 117: In the bottom equation, omit "= -ln(p)."
Page 135: All "~" should be replaced with "Φ."
Page 135: In Eq. E-3, the right-hand side should read "= a(.02 i - ∫ x Φ(x) dx)."
Page 135: The equation in the third line below Eq. E-3 should have a negative sign on one side of the equation.
Page 135: The equation in the fourth line below Eq. E-3 should read |error|_i = a[i/50 - Φ(x_i) + Φ(x_(i-1))].
Page 144: In the upper right-hand block of Figure B-3, POSNA(I) = ( + D)P 2 DP.
Page 159: In the second and third blocks, replace "comuted" with "computed." In the fourth and fifth blocks, replace "paint" with "print."
Page 162: In the first block, replace "paint" with "print." In the third block, replace "resolt" with "result."

Technical Report No. 136
3674-4-T

THEORY OF SIGNAL DETECTABILITY: OBSERVATION-DECISION PROCEDURES

by
T. G. Birdsall
R. A. Roberts

Approved by: B. F. Barton

COOLEY ELECTRONICS LABORATORY
Department of Electrical Engineering
The University of Michigan
Ann Arbor

Contract No. Nonr-1224(36)
Office of Naval Research
Department of the Navy
Washington 25, D. C.

January 1964


TABLE OF CONTENTS

List of Figures  v
List of Tables  ix
List of Symbols  x
Abstract  xiii
1. Introduction  1
   1.1. Introduction to Applications  1
   1.2. Historical Background  9
   1.3. Outline of This Report  9
   1.4. Introduction to Notation and the Mathematical Treatment of Observation-Decision Procedures  11
2. Nonsequential Observation-Decision Procedures  24
   2.1. Preliminary Remarks  24
   2.2. Fixed Observation-Decision Procedure  24
   2.3. The Predetermined Nonsequential Observation-Decision Procedure  29
      2.3.1. The Optimum Nonsequential Observation-Decision Procedure for Normal Observation Statistics with Continuous Observation  29
      2.3.2. An Example of an Optimum Nonsequential Procedure with Discrete Observation  42
      2.3.3. An Example of an Optimum Nonsequential Procedure with Continuous Observation  46
   2.4. Bounds on Deferred Decision Parameters by Use of Optimum Nonsequential Procedures  48
   2.5. Summary of Nonsequential Observation-Decision Procedures  52
3. Sequential Observation-Decision Procedures  56
   3.1. Preliminary Remarks  56
   3.2. Method of Solution and an Illustrative Example  56
   3.3. Deferred-Decision, Continuous Observation Case  70
   3.4. Deferred-Decision, Discrete Observation Case  83
4. Numerical Results and Comparison of Observation-Decision Procedures  88
   4.1. Numerical Results for Deferred-Decision Procedures  88
   4.2. Approximate Methods for the Determination of the Asymptotic Decision Boundary  105
   4.3. Inner and Outer Bounds for the Deferred-Decision Boundary  112
   4.4. Comparison of Sequential and Nonsequential Observation-Decision Procedures  118

5. Future Studies, Summary, and Conclusions  126
   5.1. Future Studies and Generalizations of Present Solution for Deferred Decision  126
   5.2. Summary and Conclusions of Observation-Decision Procedures  127
References  129
Appendix A. The Computer Program for the Determination of Deferred-Decision Boundaries  130
Appendix B. The Computer Program for the Determination of the ROC and Average Number of Observation Functions  131
Appendix C. The Computer Program for the Rapid Probe  132
Appendix D. The Computer Program for the Determination of the Average Excess over the Decision Boundary  132
Appendix E. The Error Analysis of the Computer Program for the Determination of the Deferred-Decision Boundaries  133
Appendix F. The Error Analysis Computer Program  137
Appendix G. The Computer Program for the Determination of the Value Contours for the Optimum Nonsequential Decision Procedure  137
Distribution List  163

FIGURES

1. A Heuristic Physical Model of an Observation-Decision Problem  4
2. Output Display for a Fixed Observation-Decision Procedure  6
3. Output Display for a Predetermined Observation-Decision Procedure  7
4. Output Display for a Deferred-Decision Procedure  8
5. Output Display for a Wald Sequential Procedure  10
6. The Terminal Loss Function, T(L), for WM > WFA  15
7. Three Normal ROC Curves on Linear Paper with Parameter d  17
8. Three Normal ROC Curves on Normal Coordinates with Parameter √d  19
9. The Distribution of the Logarithm of the Likelihood Ratio for a Normal ROC with Parameter d in N and in SN  21
10. The Expected Loss Function and ROC Curve for a Fixed Observation Procedure with the Cost of Observation Zero  25
11. The Expected Loss Function and ROC Curve for a Fixed Observation Procedure with the Cost of Observation Greater Than Zero  27
12. The Contour Graph of the Value of Observation for the Continuous Observation Case and W/c = 30  31
13. The Contour Graph of the Value of Observation for the Continuous Observation Case and W/c = 100  32
14. The ROC Curves for the Optimum Nonsequential Procedure for the Continuous Observation Case As a Function of the W/c Ratio  40
15. The Normalized Expected Loss Function for an Optimum Nonsequential Procedure with Normal Observation Statistics and W/c = 30., A0 = 0., and d = 1.0  43
16. The ROC Curve for an Optimum Nonsequential Procedure with Normal Observation Statistics, Discrete Observation Case, with W/c = 30., A0 = 0., and d = 1.0  44
17. The Normalized Expected Loss Function for an Optimum Nonsequential Procedure with Normal Observation Statistics and W/c = 30., A0 = 0., and d = .25  47
18. The Expected Loss Function for an Optimum Nonsequential Procedure with Normal Observation Statistics  49
19. The Expected Loss Function for an Optimum Nonsequential Procedure with Normal Observations  50
20. The Optimum Nonsequential Decision Boundaries for Large Available Quality, D, and Continuous Normal Observations As a Function of W/c, for A0 = 0 and -1.0986  51

21. The Optimum Nonsequential Decision Boundaries for Large Available Quality, D, and Continuous Normal Observations As a Function of the Available Quality, D, for A0 = 0 and -1.0986  53
22. The Difference Between the Decision Boundaries for the Optimum Nonsequential Procedure with Continuous Normal Observations As Obtained By Using the Correct Log-Odds-Ratio and a Log-Odds-Ratio of Zero Plotted As a Function of W/c  54
23. The Optimum Expected Loss Function As a Function of the Log-Odds-Ratio, L, for a Deferred Decision Procedure in Which WM > WFA  59
24. The Expected Loss Functions As a Function of L for the Illustrative Example Depicting the Solutions for nmax = 1, 2, 3, and ∞  71
25. The Deferred Decision Boundary Points for the Illustrative Example As a Function of the Available Quality, D = nmax d  72
26. The Average Number of Observations for the Illustrative Example As a Function of L Depicting the Solutions for nmax = 1, 2, 3, and ∞  73
27. The ROC Curves for the Illustrative Example Plotted on Linear Paper  74
28. The ROC Curves for the Illustrative Example Plotted on Normal Coordinates  75
29. A Heuristic Physical Model of a Deferred-Decision Problem  77
30. The Average Excess over the Decision Boundary for Constant Boundaries in L As a Function of the Available Quality for Four Initial L Values  86
31. The Average Excess over the Deferred Decision Boundary for Two Initial L Values  87
32. The Normalized Expected Risk Functions for a Deferred-Decision Procedure As a Function of L with Parameters W/c = 30., A0 = 0., d = .25, and D = .25, .50, 1.0, 2.0, 4.0, 8.0, and 10.0  90
33. The Normalized Expected Risk Functions for a Deferred-Decision Procedure As a Function of L with Parameters W/c = 30., A0 = 0., d = 1.0, and D = 1.0, 2.0, 4.0, 8.0, and 10.0  91
34. The Normalized Expected Risk Functions for a Deferred-Decision Procedure As a Function of L with Parameters W/c = 100., A0 = 0., d = .25, and D = .25, .50, 1.0, 2.0, 4.0, 8.0, and 10.0  92
35. The Normalized Expected Risk Functions for a Deferred-Decision Procedure As a Function of L with Parameters W/c = 100., A0 = 0., d = 1.0, and D = 1.0, 2.0, 4.0, 8.0, and 10.0  93
36. The Deferred Decision Boundaries As a Function of the Available Quality, D = nmax d, for W/c = 100., A0 = 0., and d = .25 and 1.0  94
37. The Deferred Decision Boundaries As a Function of the Available Quality, D = nmax d, for W/c = 30., A0 = 0., and d = .04, .25, and 1.0  95
38. The Deferred Decision Boundaries As a Function of the Available Quality, D = nmax d, for W/c = 30., 100., and 500. with d = .25  96

39. The Conditional Average Number of Observations for a Deferred-Decision Procedure for the Condition SN As a Function of L with W/c = 30., A0 = 0., and d = 1.0  99
40. The Conditional Average Number of Observations for a Deferred-Decision Procedure for the Condition N As a Function of L with W/c = 30., A0 = 0., and d = 1.0  100
41. The Average Number of Observations for a Deferred-Decision Procedure As a Function of L with W/c = 30., A0 = 0., and d = 1.0  101
42. The ROC Curve for a Deferred-Decision Procedure on Normal Coordinates with Parameters W/c = 30., A0 = 0., and d = 1.0  102
43. The ROC Curve for a Deferred-Decision Procedure for the Signal 6 db Below the Expected Size Signal  103
44. The Average Number of Observations for the Signal 6 db Below the Expected Size Signal As a Function of L  104
45. The Wald Sequential Procedure Expected Risk Function As a Function of L with Parameters W/c = 300. and A0 = 0. for the Continuous Observation Case  111
46. The Difference Function, D(L), Between the Wald Sequential Expected Risk Function and the Terminal Loss Function As a Function of L with Parameters W/c = 300. and A0 = 0. for the Continuous Observation Case  113
47. The Inner and Outer Bounds for the Asymptotic Deferred-Decision Boundary As Determined by the Optimum Nonsequential Procedure and the Wald Sequential Procedure, Respectively, As a Function of W/c  115
48. The Normalized Expected Risk Function for the Optimum Nonsequential Procedure and the Wald Sequential Procedure As a Function of L with Parameters W/c = 300. and A0 = 0. for the Continuous Observation Case  116
49. The Asymptotic Decision Boundary As Determined by the Rapid Probe Simulation, the Optimum Nonsequential Procedure, and the Continuous Observation Analytic Solution As a Function of L  119
50. The Average Observed Quality Plotted Against the Available Quality for a Fixed Observation Procedure, An Optimum Nonsequential Procedure, a Truncated Wald Sequential Procedure, and a Deferred-Decision Procedure with Parameters W/c = 30., d = .25, and L = 0  122
51. The Probability of a Correct Terminal Decision Plotted Against the Available Quality for a Fixed Observation Procedure, An Optimum Nonsequential Procedure, a Truncated Wald Sequential Procedure, and a Deferred-Decision Procedure with Parameters W/c = 30., d = .25, and L = 0  123
52. A Comparison of the Expected Risk Functions for Various Truncated Wald Procedures and Deferred Decision with Parameters W/c = 30., d = .25, and L = 0  125

A.1. The General Block Diagram for Computation of the Deferred-Decision Boundary Points  138
A.2. Data Read In and Computation of T(L)  139
A.3. Computation of Mean Motions, Likelihood Ratios, and Interpolation Constants  140
A.4. Integration of Risk Functions  141
A.5. Computation of Decision Boundaries, Print Out, Convergence Check, and Substitution of "New" Risk Functions  141
B.1. The General Block Diagram for Computation of the ROC and Average Number of Observation Functions for the Deferred-Decision Procedure  142
B.2. Data Read In and Computation of T(L)  143
B.3. Computation of the ROC and Average Number of Observation Functions for n = 1 and Mean Motions  144
B.4. Integration of the ROC and Average Number of Observation Functions  145
B.5. Computation of the Expected Risk  146
B.6. Print Out, Convergence Check, and Substitution of "New" Functions for "Old" Functions  147
C.1. The General Block Diagram for the Binomial Approximation to the Normal Deferred-Decision Procedure  148
C.2. Computation of T(L), ROC, and Average Number of Observation Functions  149
C.3. Computation of "New" Functions from "Old" Functions and of Risk Functions  150
C.4. Computation of the Decision Boundary, Print Out, Convergence Check, and Substitution of "New" Functions for "Old" Functions  150
D.1. The General Block Diagram of the Monte Carlo Simulation for Determination of the Average Excess over the Decision Boundary  151
D.2. Data Read In, Preliminary Calculations, and Initialization  152
D.3. Selection of Random Number, Gaussian Motion, and Truncation Check  152
D.4. Convergence Check and Print-Out  152
E.1. A Graph of T(L) - G35(L) for W/c = 30 Depicting a 10 Percent Change in the Cost of a Single Observation  153
F.1. The General Block Diagram of the Error Analysis Program  154
F.2. Computation of T(L) and the Mean Motions of the Likelihood Ratios  155
F.3. Computation of the Likelihood Ratios, Interpolation Constants, and the Risk Functions  156
F.4. Computation of Decision Boundary and Print Out of Risk Function and Decision Boundary  157

F.5. Increase Observation Cost by 10% at n = 24 for 10 Trials and Substitute "New" Risk Function for "Old" Risk Function  158
G.1. The General Block Diagram for the Calculation of Constant Value Contours for the Optimum Nonsequential Procedure  159
G.2. Data Read In and Preliminary Calculations  159
G.3. Computation of the Value of Observation  160
G.4. Computation of Optimum d Boundaries and Print-Out of P("A"|SN) and P("A"|N)  161
G.5. Complete Print-Out  162

TABLES

I. The Normalized Expected Loss Function for an Optimum Nonsequential Procedure with Normal Observation Statistics and W/c = 30., A0 = 0., and d = .25  45
II. The Expected Risk Decomposed into the Terminal Error Loss and the Observation Cost for the Optimum Nonsequential Procedure and the Wald Sequential Procedure for W/c = 300 and A0 = 0  114
III. The Comparison of Deferred Decision (Optimum Sequential Procedure) and the Optimum Nonsequential Procedure for W/c = 30., A0 = 0, d = 1.0, and L = 0  120

LIST OF SYMBOLS

"A"    The response, "the condition at the input of the receiver is signal-plus-noise." This response is made when it is profitable to do so.
"B"    The response, "the condition at the input of the receiver is noise alone." This response is made when it is profitable to do so.
SN    A symbol used to represent the phrase "signal-plus-noise."
N    A symbol used to represent the phrase "noise alone."
WM    The loss due to responding "B" when the actual condition of the receiver input is SN.
WFA    The loss due to responding "A" when the actual condition of the receiver input is N.
P(SN), P(N)    The a priori probabilities of SN and N, respectively.
f(y|SN), f(y|N)    The conditional probability density functions of the observed variate, y, under the conditions SN and N, respectively.
L    The log-odds-ratio, defined as L ≜ ln[P(SN)/(1 - P(SN))].
ℓ(y)    The likelihood ratio of the observed variate, ℓ(y) = f(y|SN)/f(y|N).
A0    The decision point for no possible observations permitted; A0 = ln(WFA/WM).
W    A symbol defined by the equation 2/W = 1/WM + 1/WFA. W and A0 are introduced for reasons of symmetry in the resulting equations.
T(L)    The terminal decision expected loss function. This is the average loss that results when a terminal decision is made.
P("A"|SN)    The probability of responding "A" when the cause of the input to the receiver is SN. This is called the probability of detection.
P("A"|N)    The probability of responding "A" when the cause of the input to the receiver is N. This is called the probability of false alarm.
d    The normal index of detectability for a single normal observation.
D    The normal index of detectability if the total available signal is observed.
D0    The normal index of detectability for the actual observed signal.
cd    The cost of an observation of quality d.
Gk(L)    The expected risk function for a deferred-decision procedure at the kth stage, i.e., there are k available deferrals remaining.
γk(L)    The average number of observations for a deferred-decision procedure at the kth stage.
C    The observation cost for a nonsequential observation-decision procedure.

Γ, Λ    The decision points in log likelihood ratio upon which a decision of whether to terminate the observation or not is based.
RE(L)    The expected loss or risk function due to a terminal decision error.
R(L, D0)    The expected loss or risk associated with an observation of nonzero quality, D0, for a nonsequential observation-decision procedure.
V(L, D0)    The expected value associated with an observation of nonzero quality, D0.
R*(L, D0)    The expected loss or risk function normalized by dividing by the loss due to a terminal decision error.
Fk(L)    The optimum deferred-decision expected loss or risk function at stage k. Fk(L) = minimum [Gk(L), T(L)].
Yk(L)    The probability of detection for a deferred-decision procedure at stage k.
Xk(L)    The probability of false alarm for a deferred-decision procedure at stage k.
γk(L|SN)    The average number of observations conditional to the cause of the input being SN for a deferred-decision procedure at stage k.
γk(L|N)    The average number of observations conditional to the cause of the input being N for a deferred-decision procedure at stage k.
S(D0, W/c)    The standard form for the decision boundary. This is the decision boundary in log likelihood ratio that results from a continuous observation.
ε(d)    The average excess over the decision boundary which results because of discrete observations.
(S/N)o    The signal-to-noise ratio at the output of a receiver where the output is some monotone function of the likelihood ratio of the input to the receiver.
E( )    A symbol used to denote the mean or expected value of the quantity in parentheses.

ABSTRACT

The theory of signal detectability is extended to include observation-decision procedures other than the familiar fixed observation-decision procedure. The theory of signal detectability usually partitions detection devices into two cascaded sections. The first section processes the physical waveform and, in the optimum decision device, has as an output the likelihood ratio of the input waveform. The second section operates on the output of the first and its output is the actual decision "signal present" or "signal absent." In much of the work in the literature the emphasis is placed on studying the first section with the second section a simple threshold device. In this work we study the second section. The observation-decision procedures studied include predetermined nonsequential procedures and the optimum (Bayes') sequential procedure, deferred decision. Included are examples with normal observation statistics. Approximations and bounds are derived for deferred-decision parameters. Comparisons are given between the various observation-decision procedures. The comparison of the optimum nonsequential procedure and deferred decision is roughly the following: if the available output signal-to-noise ratio which one would obtain is small (on the order of +4 db in 2E/N0), then sequential procedures consume about 60% as much time as nonsequential procedures and the resultant error probabilities are approximately the same; if the available output signal-to-noise ratio is on the order of +10 db, then deferred decision and the optimum nonsequential procedure consume about the same amount of time, with sequential procedures making fewer terminal decision errors.

THEORY OF SIGNAL DETECTABILITY: OBSERVATION DECISION PROCEDURES 1. INTRODUCTION The objective of the work reported herein is to study techniques and methods for the improvement of special receivers known as detection devices. The detection devices are simple receivers whose only objective is to determine the presence or absence of a signal in a background of interference. The same theory and techniques, of course, apply directly to any mechanism for deciding between two or a handful of simple causes which give rise to a physical observation. The treatment of detection devices usually partitions the receiver into two cascaded sections. The first section processes the physical waveforms, i.e., amplifies, heterodynes, filters, crosscorrelates, etc. Ideally, the output of this first section is the "likelihood ratio" of the input waveform. The study and design of such equipments constitute the major portion of the literature in detection theory. The second section operates on the output of the first, and its output is the actual decision, "signal present" or "signal absent." In order for the detection device to be optimum, these two sections must form an optimum combination. In much of the work in the literature, the detection situations studied are those for which the second section is a simple "threshold," or voltage comparitor, and thus the first section can be studied considering the possible threshold values as a parameter to be determined later. In this report, the second section of the receiver is pursued; several progressively better decision procedures are studied that might be used if the detection allows and warrants such increase in complexity. 1.1 INTRODUCTION TO APPLICATIONS Deferred decision is a formal abstraction of a procedure for observation and decision which is quite common in human experience. Specifically, it is assumed that something is being observed, e.g., the output of an electronic device called the receiver-front-end, and that the input to the receiver is either the background, i.e, noise, or a signal in the background. In this report it is assumed that the signal is of a steady-state nature, that is, the longer the receiver output is observed the more information can be obtained as to whether the signal is present or absent. In other words the condition at the input to the receiver, which is either the background or the background and signal, does not change during the observation. The decision that is to be made is whether the input to the receiver was caused by a signal or by

background conditions alone. Such a decision is referred to as a terminal decision because the observation process terminates when such a decision has been reached. This terminal decision may be made soon after the observation has begun or it may be deferred in order that the observation can continue. A characteristic of this observation-decision procedure is that a terminal decision cannot be postponed indefinitely. One goal of an observation-decision procedure might be to make terminal decisions which are as correct as possible. When this is the only goal then the optimum procedure is to observe the receiver input for as long as permitted. At the conclusion of the observation a terminal decision is made. This observation-decision procedure is called a fixed observationdecision procedure or a fixed time observation-decision procedure. In many observation-decision procedures there is a competing goal. This other consideration is that of reaching a terminal decision as quickly as possible commensurate with the resulting terminal decision errors. In this procedure the observer balances the losses due to an erroneous terminal decision with the cost of observing. The balance between the loss due to a terminal decision error and the cost of observing is accomplished by selecting the correct observation length before the observation begins. The observation length is predetermined and depends on the "quality of the observation," 1.the loss due to a terminal error, the cost of observing, and the initial information. Since the observation length may be selected short of the maximum length, it is the general nonsequential. observation-decision procedure. We call this a predetermined observation-decision procedure. A third type of observation-decision procedure is characterized by the continuous balancing of the potential terminal error losses with the cost of observing. This is accomplished by deciding at each moment whether to continue observing or to stop and make a terminal decision. In the previous two procedures the length of observation is known before the observation begins. In this third type of observation-decision procedure the decision of whether to continue or terminate depends on what has been observed. The observation length in this case is a random variable, dependent on the quality of the observation, the loss due to terminal decision errors, the cost of observing, initial information, and the observation itself. 'Quality of Observation" is a phrase used here to include such factors as signal-tonoise ratio, front end noise figure, and all similar factors. In each specific situation it will be quantified by an appropriate definition. 2

Another way to characterize these observation-decision procedures is to consider the procedures on the basis of two decisions. The first decision is a decision on what the observation time will be; the second decision is a terminal decision on whether to respond "A" or "B." In the fixed observation-decision procedure the observer necessarily does not account for the observation quality, the cost of observing, and the initial information. This results in the selection of the maximum allowable observation length. Thus the only nontrivial decision which the observer makes is a terminal decision. In the predetermined nonsequential procedure, the observer has two decisions to make: a decision on the observation length which is based on parameters previously discussed and a terminal decision at the end of the predetermined observation length. Note that in nonsequential observation-decision procedures, the decision on the observation length is made only once for each terminal decision. Sequential observation-decision procedures permit the observer to make the decision on what the observation length will be many times for each terminal decision. It is clear that in order to optimize a sequential procedure, the observer should make the decision on what the observation length is continuously, that is, the observer should decide at each instant if the observation should continue or be stopped at which time a terminal decision is made. Clearly, sequential observation-decision procedures include the predetermined and fixed nonsequential procedures as special cases. Thus, the word "sequential" refers to the fact that the observer can make the decision on what the observation length is more than once for each terminal decision. The advantage of a sequential procedure results from the fact that this decision can be made many times, ideally, continuously. The observation length in a sequential procedure is a random variable. The particular sequential procedure examined in this paper is an optimum sequential observation-decision procedure called "deferred decision." One example of a physical situation by which such observation-decision procedures might be put into practice is presented in the following diagrams and discussion. Consider the physical model illustrated in Figure 1. The receiving array or antenna input is due to one of two causes: the background or noise (N) or the signal-plus-noise (SN). Assume this input is processed by standard processing, i. e., heterodyning, filtering, detecting, etc., followed by an integrator. After a predetermined amount of time the integrator is sampled and this sampled output is then applied to a display scope. The observer makes the decision "A" (alarm, alert, attack, action, etc.) when he decides signal-plus-noise was the cause of the input. He makes the decision "B" when he decides noise alone was the cause of the input. 3

Figure 1. A heuristic physical model of an observation-decision problem. (Block labels: antenna input, y; ℓ(y), the likelihood ratio; integrator output.)

In a sequential test he has one more alternative; that of deciding whether to continue to the observation or not (assuming he has not reached the time where he must make a terminal decision, e.g., rotate the antenna). Depending on the observation-decision procedure and what the operator observes, a terminal decision is made or the observation continues after the output from the integrator is observed. Referring to Figure 2, a fixed observation procedure is implemented in the following manner. The antenna is fixed in position for a fixed time. The integrator is sampled at the end of this fixed time and presented to the observer. He then makes his terminal decision on whether the receiver input is due to SN or to N. If he decides the receiver input is due to SN he makes the decision "A," otherwise, he makes the decision "B." The observer makes his decision based on whether the integrator sampled output falls above or below a predetermined cut level. This comparison or cut level is a function of the ratio WFA/WM, where WFA and WM are defined as follows: WFAis the loss incurred in an erroneous terminal decision of "A," i.e., saying "the input is due to SN" when it is actually due to N. Similarly WM is the loss incurred in erronously saying "B" when the actual input is due to signal-plus-noise. The first error is called a false alarm, the second error is a miss. The predetermined nonsequential observation decision process is a fixed observation procedure in which the time of observation is a variable chosen by taking into account the quality of the observation, the cost of the observation, the cost of terminal decision errors, and initial information. Figure 3 depicts the observation decision process for the predetermined nonsequential procedure. Everything is as in the fixed observation procedure discussed previously except the observer has one other decision to make; that of choosing his observation length as any length less than maximum allowed observation length. To operate sequentially, the integrator output is sampled continuously. Based on what is observed the observer decides on whether to continue the observation or to stop the observation and make a terminal decision. The operator decides whether to continue or not by comparing the integrator output with two "decision boundaries." These "decision boundaries" are functions of the ratio WFA/WM and the available time remaining for the observation. Figure 4 depicts how the boundaries might look. The maximum available time corresponds to the maximum time the antenna is pointed in one direction. As the available time in which to make a terminal decision decreases, the "decision boundaries" become closer and closer together as one might intuitively suspect. If the output from the integrator crosses over the upper boundary, the decision "A" is made. If the sampled output crosses over the lower boundary, the decision "B" is made. A sample point that remains in the center portion between the decision boundaries indicates that the decision should be made to continue the observation. 5

Figure 2. Output display for a fixed observation-decision procedure. (Integrator output versus time, with available quality, initial L, and t = tmax; respond "A" above, respond "B" below.)

Figure 3. Output display for a predetermined observation-decision procedure. (Integrator output versus time; the predetermined observation length t0 lies between t = 0 and tmax; respond "A" above, respond "B" below.)

Figure 4. Output display for a deferred-decision procedure. (Integrator output versus time, with the Λ (upper, respond "A") and Γ (lower, respond "B") decision boundaries.)

Note that the familiar sequential analysis of A. Wald (Reference 3) can be considered a deferred-decision procedure in which the available time to make a terminal decision is unbounded. In this sequential procedure the decision boundaries are constant with respect to time as shown in Figure 5. A more rigorous and complete explanation of the ideas and concepts presented in this brief introduction constitutes the major portion of this report. The use of the physical model shown in Figure 1 is, of course, only one of many physical models that could have been chosen to present the concepts of the observation-decision procedures we have discussed. 1.2 HISTORICAL BACKGROUND Historically, the idea of sequential observation-decision procedures, i.e., an observationdecision procedure where the length of the observation is not predetermined but depends on the observations, dates back to at least 1929. H. F. Dodge and H. G. Roming (Reference 1) used an observation-decision procedure where the decision to take another observation depended on the outcome of the first observation. A terminal decision was made either after the first observation or, if another observation was taken, after the second observation. Dodge and Roming's procedure allowed for only two samples. Others recognized (Reference 2) that multiple decisions on whether or not to continue the observation would reduce the average observation time (or alternately the average number of samples) needed to reach a terminal decision. A particular method of a sequential observation-decision procedure, the probability ratio test, was developed by A. Wald and published in 1943 (Reference 3). This book, "Sequential Analysis," has been a major reason for the interest in sequential observation-decision procedures. 1.3 OUTLINE OF THIS REPORT Chapter 1 contains general introduction and explanation of the composition of an observation decision procedure. Three types of observation decision procedures are introduced and the notation and basic assumptions of the mathematical framework are given. Chapter 2 is devoted to a study of nonsequential observation-decision procedures. Such a study is contained in this work for two reasons. Nonsequential procedures form a nonoptimum subclass of the more general sequential procedures and as such yield valuable bounds on various parameters of interest in sequential procedures. The nonoptimum sequential procedures are used as bounds because these problems can be solved analytically which is not true of most deferred-decision problems (except for academic problems). Secondly, in many practical prob lems, a sequential procedure could not be implemented, because of the very nature of the problem or because the added complexity of a sequential procedure would not be warranted. 9

Figure 5. Output display for a Wald sequential procedure. (Integrator output versus time; the Λ and Γ boundaries are constant in time.)

Chapter 3, entitled Sequential Observation-Decision Procedures, constitutes the main topic of interest. A simple example is worked to acquaint the reader with the deferred-decision procedure. The decision boundaries for deferred decision are approximated and analytic results are obtained for the continuous observation case, i..e., when the decision on whether or not to continue to observe can be made continuously. Chapter 4 contains the numerical results of the theory given in Chapter 3. These results were obtained using a digital computer. The computer programs are to be found in the appendixes. A basis of comparison among the observation-decision procedures is presented. The observation-decision procedures are then compared. A summary and conclusions are presented in Chapter 5. We also discuss briefly some future studies. This is done so that the reader can better judge the position of the work of this present report in the continuing development of the theory of signal detectability. 1.4 INTRODUCTION TO NOTATION AND THE MATHEMATICAL TREATMENT OF OBSERVATION-DECISION PROCEDURES Much of the material presented in this section has been previously published in Cooley Electronics Laboratory Technical Report No. 123, "Deferred Decision Theory" by the late H. H. Goode, July 1961. It is presented here again in the interest of completeness and continuity. The sequential observation-decision procedure that we will examine will not be a continuous procedure, that is,, the decision on whether to terminate or continue the observation will not be made continuously but rather in discrete steps. The continuous case is not feasible mathematically or computationally within our present mathematical framework. The discrete nature of the observation process, of course, does not imply that the actual observation is discrete, but only that the decision on whether to continue or terminate is made in discrete steps. The available number of decisions that can be made on whether to continue or terminate the observation will be denoted by n. The word observation will now pertain to either one discrete observation step or the total observation, the distinction made clear by context. Consider an experiment in which the possible alternative causes are designated signalplus-noise (SN) or noise alone (N). If the observer decides the cause of the observation is SN he responds "A," otherwise his response is "B." The possible alternative causes of the experimental observation, SN and N, have a priori probabilities denoted P(SN) and 1 - P(SN), respectively. The observer of the experiment observes a random variable y whose probability distributions are: under the condition SN, f(y ISN), and under the condition N, f(y IN). The 11

observer can make two types of errors when a terminal decision is made. He can respond "A" when N is the true condition or he can respond "B" when SN is the true condition. The first error is called a false alarm (FA), the second error is called a miss (M). The losses associated with making these two errors are known to the observer and are WFA(n) and WM(n), respectively. The quality of a single observation in a sequential procedure we designate as d. The reason for this notation will be obvious later. The cost of an observation of quality d is denoted cd. In general, the density functions of the observed variate, the losses due to errors, the quality of a single observation, and the cost of a single observation may all vary with n. Under the above conditions the observer is to follow an observation-decision procedure which will maximize his expected value over the total observation, or alternately, minimize his expected loss (in those cases where the two are equivalent). The fact that so many parameters are allowed to vary in a fairly arbitrary (but known) manner during a sequence of observations is a striking generalization drawn from standard sequential observation-decision procedures where these parameters all remain constant. This generalization is possible because the number of observations is bounded above. For each specific sequential problem, the general method of solution is that of successive iterations: a solution is obtained for a single allowed deferral (i.e., one observation allowed), then two deferrals allowed, then three and so forth. Thus each set of parameter values is absorbed into the solution one step at a time. This paper will be explicitly limited to the stationary case, i.e., where the parameters of the observation-decision problem are independent of n. This is done so that the new work may be compared to the present literature on sequential procedures based on Wald's sequential analysis. In conventional sequential analysis the number of allowable deferrals is either infinite or so large it has no essential effect. If a large allowable number of deferrals is to have "no essential effect" then all essential variables must converge as the number of deferrals approaches infinity. This convergence has indeed been shown for the stationary case (Reference 4). The question remains as to how quickly the process converges. Hopefully, the process converges quickly enough so that it is economically feasible to do the computations involved by high speed digital computers. This has been the case for the parameters we have chosen. A transformation of P(SN), the a priori probability of SN, which allows one to obtain more insight and intuitive feeling for deferred decision is the "log-odds-ratio" L,

L ≜ ln[P(SN)/(1 - P(SN))]

12

The probabilities of the causes, SN and N, before and after a single observation, are related by the familiar equation often called "Bayes' Theorem." Let P(SN|y) be the a posteriori (after observation, y) probability. Then Bayes' Theorem is

P(SN|y) = P(SN) f(y|SN) / {P(SN) f(y|SN) + [1 - P(SN)] f(y|N)}

This takes on a singularly simple form when expressed in log-odds-ratio.

P(SN|y)/[1 - P(SN|y)] = {P(SN)/[1 - P(SN)]} [f(y|SN)/f(y|N)]

L (based on y) = L + ln[f(y|SN)/f(y|N)]    (1.3)

This last term is the natural logarithm of the "likelihood ratio"

ℓ(y) = f(y|SN)/f(y|N)    (1.4)

This transformation is used so frequently in this work that it is singled out and stated as Lemma 1.

Lemma 1. Let n be the allowable number of deferrals. If an observation y is taken and ℓ(y) denotes its likelihood ratio, then

Ln = L + ln ℓ(y)    (1.5)

This report concentrates on a simple, normal, stationary detection case of deferred decision. "Simple and normal" means that the logarithm of the likelihood ratio of the observations is normally distributed with known parameters under both noise and signal-plus-noise. Further, the observation is discrete and the individual discrete observations are statistically independent and of constant quality. The losses due to a terminal decision error are assumed constant throughout time and the cost of an observation is positive and also constant in time. The criterion for the optimum solution is "maximize a linear utility" or "minimize an expected loss." The expected loss of a terminal decision can be derived as follows: If the value of the log-odds-ratio at the time of the terminal decision is L, then the corresponding cause probabilities are

P(SN) = e^L/(1 + e^L)    (1.6)

P(N) = 1/(1 + e^L)    (1.7)

If the decision "B" is made, the probability of error is the probability that SN was the cause. If the decision "A" is made the probability of error is the probability that N was the cause. Thus we have

13

expected loss for "A" decision: WFA/(1 + e^L)

expected loss for "B" decision: WM e^L/(1 + e^L)

To minimize the expected loss one chooses the minimum of WFA/(1 + e^L) and WM e^L/(1 + e^L). This minimum is the loss due to the better terminal decision and is denoted T(L). In order to obtain a more symmetric form for the terminal loss function let W and A0 be defined by

2/W = 1/WM + 1/WFA    (1.8)

A0 = ln(WFA/WM)    (1.9)

Then

WFA = (W/2)(1 + e^A0)    (1.10)

and

WM = (W/2)(1 + e^(-A0))    (1.11)

Thus the risk for a terminal decision is

T(L) = min[(W/2)(1 + e^A0)/(1 + e^L), (W/2)(1 + e^(-A0)) e^L/(1 + e^L)]    (1.12)

Clearly, the terminal decision "A" is made for L > A0 and the terminal decision is "B" for L < A0. The possibility L = A0 may be disposed of as the reader sees fit. Either an "A" or "B" decision or any random mixture of the two will result in the same loss, W/2.

T(L) = (W/2)(1 + e^(-A0)) e^L/(1 + e^L),  L ≤ A0
T(L) = (W/2)(1 + e^A0)/(1 + e^L),  L ≥ A0    (1.13)

A plot of the terminal loss curve is shown in Figure 6 for WM > WFA. For WM = WFA, T(L) is symmetric about L = 0.

14
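The relations above lend themselves to a brief numerical illustration. The following sketch is not part of the original report: it updates the log-odds-ratio by ln ℓ(y) as in Lemma 1, checks the result against a direct application of Bayes' Theorem, and evaluates the terminal loss T(L) of Eqs. 1.12-1.13 at the updated value. The prior, the observed likelihoods, and the loss values WM and WFA are arbitrary illustrative choices.

```python
# Minimal sketch of the log-odds update and the terminal loss T(L).
# All numerical values below are illustrative, not taken from the report.
import math

def log_odds(p):
    return math.log(p / (1.0 - p))                    # L = ln[P(SN)/(1 - P(SN))]

def terminal_loss(L, w_m, w_fa):
    loss_a = w_fa / (1.0 + math.exp(L))               # expected loss for an "A" decision
    loss_b = w_m * math.exp(L) / (1.0 + math.exp(L))  # expected loss for a "B" decision
    return min(loss_a, loss_b)                        # T(L), Eq. 1.12

p_sn = 0.20                                           # a priori P(SN)
f_y_sn, f_y_n = 0.30, 0.10                            # f(y|SN), f(y|N) for some observed y
w_m, w_fa = 3.0, 1.0                                  # WM > WFA, as in Figure 6

L_prior = log_odds(p_sn)
L_post = L_prior + math.log(f_y_sn / f_y_n)           # Lemma 1: add ln l(y)

# the same posterior obtained directly from Bayes' Theorem
p_post = p_sn * f_y_sn / (p_sn * f_y_sn + (1.0 - p_sn) * f_y_n)
assert abs(L_post - log_odds(p_post)) < 1e-12

W = 2.0 / (1.0 / w_m + 1.0 / w_fa)                    # Eq. 1.8
A0 = math.log(w_fa / w_m)                             # respond "A" for L > A0
print(L_post, terminal_loss(L_post, w_m, w_fa))
print(terminal_loss(A0, w_m, w_fa), W / 2.0)          # the peak of T(L) equals W/2 at A0
```

Running the sketch confirms the two routes to the posterior agree and that the terminal loss peaks at the value W/2 when L = A0, as stated above.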

Figure 6. The terminal loss function, T(L), for WM > WFA. (Terminal expected loss versus L; the peak value W/2 occurs at L = A0.)

In order to evaluate "how good the detection is" in a detection situation involving signal, noise, and a receiver, the receiver operating characteristic (ROC) is used (Reference 5). The ROC is a graph of the relation between the probability of detection, P("A"|SN), and the probability of false alarm, P("A"|N). The parameter along the curve is the log-odds-ratio L. A single curve applies to a fixed physical situation: signal-noise-receiver. An ROC is called "normal" if the curve can be parameterized by the normal probability distribution as follows:

P("A"|SN) = Φ(λ + √d), when P("A"|N) = Φ(λ)

where

Φ(t) = (1/√2π) ∫ e^(-x²/2) dx, the integral taken from -∞ to t    (1.14)

Three normal ROC curves are plotted in Figure 7. Normal ROC curves arise from the "normal case" in which the logarithm of the likelihood ratio is normally distributed under one of the causes, i.e., SN or N. This is a one-parameter class of problems as demonstrated below. Let y be the observation and let z = ln[ℓ(y)], the natural logarithm of the likelihood ratio of y. The case we are discussing is where z is normally distributed with mean m and variance σ², say under condition N.

f(z|N) = (1/√(2πσ²)) exp[-(z - m)²/(2σ²)]    (1.15)

Since z is a transformation of y, the distribution of z is derived from that of y by direct substitution, being careful to account for the change in the differential size. The variable z is one dimensional. If y is one dimensional, then

f(z|SN) = f[y(z)|SN] (dy/dz)    (1.16)

If y is multidimensional, then

f(z|SN) = f[y(z)|SN] J    (1.17)

where J is the Jacobian of the transformation. In a similar manner we find that

f(z|N) = f[y(z)|N] J    (1.18)

16
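As a quick check on the normal ROC relation quoted above, the sketch below (not from the report) draws the log-likelihood-ratio z under N and SN, thresholds it at an arbitrary cut, and compares the empirical detection and false-alarm rates with Φ(λ + √d) and Φ(λ). It anticipates the result completed on the next page, namely that z is normal with means -d/2 and +d/2 and common variance d; the quality d, the cut, and the sample size are arbitrary choices.

```python
# Monte Carlo check of P("A"|SN) = Phi(lambda + sqrt(d)) when P("A"|N) = Phi(lambda).
# The quality d, the cut on z, and the sample size are illustrative only.
import math
import random

def phi(t):                                    # standard normal CDF, Eq. 1.14
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

d = 1.0                                        # quality of a single observation
cut = 0.4                                      # respond "A" when z = ln l(y) exceeds this cut
lam = -(cut + d / 2.0) / math.sqrt(d)          # the lambda for which P("A"|N) = Phi(lambda)

random.seed(1)
n = 100000
p_det = sum(random.gauss(+d / 2.0, math.sqrt(d)) > cut for _ in range(n)) / n
p_fa = sum(random.gauss(-d / 2.0, math.sqrt(d)) > cut for _ in range(n)) / n

print(p_det, phi(lam + math.sqrt(d)))          # empirical vs. predicted P("A"|SN)
print(p_fa, phi(lam))                          # empirical vs. predicted P("A"|N)
```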

Figure 7. Three normal ROC curves on linear paper with parameter d. (P("A"|SN) plotted against P("A"|N).)

Combining Equations 1.15 and 1.16 we have

f(z|SN)/f(z|N) = f[y(z)|SN]/f[y(z)|N] = ℓ[y(z)] = e^z

f(z|SN) = e^z f(z|N) = (1/√(2πσ²)) exp[-(z - m)²/(2σ²) + z] = (1/√(2πσ²)) exp[-(z - m - σ²)²/(2σ²)] exp[m + σ²/2]    (1.19)

This is a probability density function. Its complete integral is unity.

1 = ∫ f(z|SN) dz = exp[m + σ²/2]

This implies that m = -σ²/2. Collecting all this together we note that the normality of z under one condition forces it to be normal under the other condition. Both normal distributions have a common variance, denoted by d. For the normal case, in noise N the logarithm of the likelihood ratio is normal with mean -.5d and variance d with d > 0. In signal-plus-noise (SN), the expression ln[ℓ(y)] is normal with mean +.5d and variance d. The equations for the ROC for a single normal observation are

P("A"|SN) = Φ(λ + √d), when P("A"|N) = Φ(λ).

Figure 8 shows a family of such curves on special paper¹ designed to simplify the presentation of these curves. The quantity d is identified as the quality of a single observation and serves to index the ROC. Let (x1, x2, ..., xn) be a series of n normal observations in an observation-decision process. Let the quality of xi be di and assume the xi are independent. Then the joint observation (x1, x2, ..., xn) has a log-likelihood-ratio which is distributed normally and the ROC of the joint observation is also normal. The detectability of the joint observation is the sum of the di, i = 1 to n. When the di are all equal, then obviously the sum of the di is nd, where n is the number of observations and d is the quality of a single observation. In general n ≤ nmax, where nmax is the greatest number of allowed deferrals. We will use the notation D = nmax d. In a given problem the available quality, D, is specified. For discrete problems the quality of a single observation, d, is also specified. The observed quality, denoted Do, is determined by the solution of the observation-decision process. Using these ideas the three types of observation-decision procedures can be characterized as follows:

¹No. 42,453 of Codex Book Company, Norwood, Massachusetts.

18

Figure 8. Three normal ROC curves on normal coordinates with parameter √d. (P("A"|SN) plotted against P("A"|N) on normal probability scales.)

(i) For a fixed observation procedure the observed Do is a constant, i.e., Do = D. (ii) For a predetermined observation-decision procedure Do is a variable. For the optimum procedures Do will be a function of L, W, A0, d, c, and D; where cd is the cost of an observation of quality d. (iii) For a sequential procedure Do is a random variable. For the optimum procedures (and most others, too) the distribution of Do will be a function of L, W, A0, d, c, and D.

A physical meaning for the quality of an observation, d, can be given for the simple detection problem of a signal known exactly in added white Gaussian noise. This detection problem, sometimes called the first problem (Reference 6), has been extensively studied in the literature. For this problem d = 2E/N0, where E is the energy of the signal and N0 is the noise power density measured in watts per cycle per second. Although this report deals exclusively with the normal case, the results are useful in situations where, strictly speaking, not all of the normal assumptions are met. In physical problems we may never actually have the normal case. However, in many physical applications the ROC is normal over the region of interest. If one restricts himself to these regions then we may use the results of this report as an approximation. The approximation depends on how close the physical problem approaches the normal case. Physically, for a normal ROC, we can relate the parameter d to the output signal-to-noise ratio of our receiver, (S/N)o. The available D will then correspond to a total or integrated (S/N)o. As shown previously a normal ROC implies that some monotone function of the observed statistics under either condition, SN or N, is normally distributed. We have parameterized this situation by a variable d. The normalization is as shown in Figure 9. The mean of the noise is at -d/2, the mean of the signal-plus-noise is at +d/2. The variance of both is d, d > 0. What does a signal-to-noise measurement of the output of a receiver mean in terms of d? To measure (S/N)o we could perform the following series of experiments. The noise power of the noise would be measured in the absence of the signal. This would merely be the variance of our Gaussian noise distribution, or d. Then we could look at a scope or meter with the signal and noise present. We would observe the average amount that the signal's presence increases

20

Figure 9. The distribution of the logarithm of the likelihood ratio for a normal ROC with parameter d in N and SN. (Horizontal axis: ln[ℓ(y)]; the N and SN densities have means -d/2 and +d/2 and common variance d, with d = 2E/N0.)

the scope or meter reading. This would be the separation of their means, or again, d. The signal power then would be this voltage squared, or d². Thus our measurement would give

(S/N)o = d²/d = d

From this we can conclude that for a normal ROC an available D = nmax d = 1 is a (S/N)o of 0 db, provided we observe the entire observation. An available D = nmax d = 100 is a (S/N)o of 20 db, assuming a normal ROC. The sequential procedure is a trade between (S/N)o and a shorter observation time. The increase in risk due to terminal decision errors which is due to a smaller (S/N)o is balanced against a savings in risk due to a smaller observation cost.

As an aside, another viewpoint of an observation-decision procedure can be obtained by considering what the "state" of the observation-decision procedure is. The state of a physical system can be defined as the specification of a minimum set of variables needed to predict the future behavior of the system. Thus the state of an observation-decision procedure is

U = U(n; L; f(x|SN), f(x|N), C, WM, WFA)

The functions f(x|SN), f(x|N), C, WM, and WFA, for the stationary case, do not change for a given total observation and so can be grouped under "boundary conditions." Thus the state description is given by L and n.

In addition to the ROC there are other measures useful in evaluating and describing an observation-decision procedure. We will be interested in the risk functions, Gk(L), and the average number of observations, γk(L). The risk function, Gk(L), is the expected loss due to terminal decision errors and the observation costs. It is the combination of the information given by the ROC (error probabilities) and the average number of observations, each weighted by their respective costs. These risk functions change with the available number of deferrals (n = 1 for nonsequential procedures). The risk function for stage k is obtained by averaging the risk function for stage k - 1 over all possible observations, y. The average number of observations we designate as γk(L). For nonsequential procedures, k = 1. For a fixed observation procedure γ1(L) is a constant determined outside of the observer's control. For the optimum nonsequential procedure γ1(L) is a variable chosen before the observation-decision process by the observer. For a sequential procedure the number of observations is a random variable. In these processes we will be interested in the expected or average number of observations, γk(L). The average number of observations is related to the conditional average number of observations by

22

γk(L) = [e^L/(1 + e^L)] γk(L|SN) + [1/(1 + e^L)] γk(L|N)    (1.20)

These functions, γk(L|SN) and γk(L|N), will be obtained iteratively in the same manner that the risk functions and ROC functions will be obtained. This iterative procedure is explained in detail in Section 3.2. The introduction given here is not meant to serve as a complete explanation of the concepts involved in observation-decision procedures. It is meant to introduce the reader to the notation and to present a brief outline of the remainder of the report. The physical model used to explain some of the ideas involved is one of many that might have been used. We hope that it gives the reader some notion of how this theory might be implemented in an actual physical problem and possibly some intuitive feeling for the overall problem.

23
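To make the stage-by-stage iteration just described concrete, the following sketch (it is not the report's computer program of Appendix A) carries out the backward recursion for the stationary normal case: the look-ahead risk Gk(L) is the observation cost plus the average of the stage-(k-1) optimum risk over the next log-odds value L + ln ℓ(y), and Fk(L) = min[Gk(L), T(L)]. The grid spacings, grid limits, and parameter values (d, cost, W, A0, nmax) are arbitrary choices made only for illustration.

```python
# Schematic backward iteration for deferred decision (stationary normal case).
# This is an illustrative sketch, not the report's program; grids and parameters
# are arbitrary.
import math

def norm_pdf(z, mean, var):
    return math.exp(-(z - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def terminal_loss(L, W, A0):                      # T(L), Eq. 1.13
    if L <= A0:
        return 0.5 * W * (1.0 + math.exp(-A0)) * math.exp(L) / (1.0 + math.exp(L))
    return 0.5 * W * (1.0 + math.exp(A0)) / (1.0 + math.exp(L))

def interp(Ls, F, x):                             # linear interpolation, clamped at the ends
    dL = Ls[1] - Ls[0]
    if x <= Ls[0]:
        return F[0]
    if x >= Ls[-1]:
        return F[-1]
    i = int((x - Ls[0]) / dL)
    t = (x - Ls[i]) / dL
    return (1.0 - t) * F[i] + t * F[i + 1]

def deferred_risks(d=1.0, cost=0.02, W=2.0, A0=0.0, n_max=10):
    dL = 0.1
    Ls = [i * dL for i in range(-80, 81)]         # grid in the log-odds-ratio L
    zs = [i * dL for i in range(-60, 61)]         # grid for the increment z = ln l(y)
    F = [terminal_loss(L, W, A0) for L in Ls]     # stage 0: a terminal decision must be made
    for _ in range(n_max):
        G = []
        for L in Ls:
            p_sn = math.exp(L) / (1.0 + math.exp(L))
            avg = sum((p_sn * norm_pdf(z, +d / 2.0, d) +
                       (1.0 - p_sn) * norm_pdf(z, -d / 2.0, d)) *
                      interp(Ls, F, L + z) for z in zs) * dL
            G.append(cost + avg)                  # look-ahead risk G_k(L)
        F = [min(g, terminal_loss(L, W, A0)) for g, L in zip(G, Ls)]   # F_k = min(G_k, T)
    return Ls, F

Ls, F = deferred_risks()
print(F[len(F) // 2])                             # optimum expected risk at L = 0
```

The deferral region at stage k is the set of L for which Gk(L) falls below T(L); its endpoints are the decision boundary points whose computation occupies the later chapters.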

2. NONSEQUENTIAL OBSERVATION-DECISION PROCEDURES

2.1 PRELIMINARY REMARKS

A nonsequential observation-decision procedure can be included in the class of sequential and deferred decision procedures as a nonoptimum procedure. Nonsequential processes can also be considered as optimum processes resulting from some restrictions or side conditions. Whenever the cost of observation is zero, then the optimum procedure for stationary terminal losses is to observe as long as possible. This is a fixed observation-decision procedure. A somewhat similar action results from demanding that observation costs be "prepaid" with no refunding for unused observation time. The standard approach of classical statistics can be viewed as a general nonsequential observation-decision procedure in which the observation length or quality is chosen independent of a priori odds. This constitutes a nonoptimum observation-decision procedure under our definition of optimum (i.e., minimize the average loss). The optimum procedure is to choose the length of the observation based on A0, W, D, L, and c. This is the predetermined nonsequential procedure. These forms of nonsequential processes have the advantage of being solvable analytically. In addition to being of interest in themselves, they can serve as valuable sources of bounds on corresponding functions for deferred decision. In many physical situations nonsequential procedures are the only procedures that can be used in reaching a terminal decision. For example, if a decision is to be based on results that can only be obtained by the purchase of equipment and man-hours beforehand, then we are dealing with a nonsequential problem. For these reasons the mathematical discussion of observation-decision procedures begins with nonsequential procedures.

2.2 FIXED OBSERVATION-DECISION PROCEDURE

As stated previously, in a fixed observation process, the quality of the actual observation, Do, is equal to a fixed D. The value of D is fixed by the problem and is not under the control of the observer. The object of an observation-decision process is to reach a terminal decision. The making of a terminal decision may result in an error. The expected loss due to these errors is represented mathematically by T(L). Assume the complete observation costs an amount C. Consider the relationship between T(L) and the "look ahead" loss, i.e., the expected loss if the observation is taken. If the cost of observation is zero then the expected loss and terminal loss are qualitatively shown in Figure 10. The ROC is also shown. Since the cost of observation is zero it always pays to take the observation. The expected loss, if an observation is taken, is everywhere less than the terminal loss function. The ROC is a continuous curve extending from point (0, 0) to point (1, 1).

24
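The comparison just described can be written out for the normal case. In the sketch below, which is an illustration and not taken from the report, an observation of quality D moves the log-odds-ratio to L + z, where z is normal with mean +D/2 or -D/2 and variance D under SN or N; the best terminal rule then responds "A" when the posterior log-odds exceeds A0, so that P("A"|SN) = Φ((L + D/2 - A0)/√D) and P("A"|N) = Φ((L - D/2 - A0)/√D). The look-ahead loss is the resulting terminal error risk RE(L) plus C, and the sketch compares it with T(L). The values of D, C, WM, and WFA are arbitrary.

```python
# Illustrative comparison (not from the report) of the terminal loss T(L) with
# the "look ahead" loss RE(L) + C for a single normal observation of quality D.
# Parameter values are arbitrary.
import math

def phi(t):
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def terminal_loss(L, w_m, w_fa):
    return min(w_fa / (1.0 + math.exp(L)), w_m * math.exp(L) / (1.0 + math.exp(L)))

def look_ahead_loss(L, D, C, w_m, w_fa):
    A0 = math.log(w_fa / w_m)
    p_sn = math.exp(L) / (1.0 + math.exp(L))
    p_det = phi((L + D / 2.0 - A0) / math.sqrt(D))   # P("A"|SN) after the observation
    p_fa = phi((L - D / 2.0 - A0) / math.sqrt(D))    # P("A"|N) after the observation
    risk_errors = w_m * p_sn * (1.0 - p_det) + w_fa * (1.0 - p_sn) * p_fa   # RE(L)
    return risk_errors + C

w_m = w_fa = 1.0
for L in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(L,
          terminal_loss(L, w_m, w_fa),
          look_ahead_loss(L, D=2.0, C=0.0, w_m=w_m, w_fa=w_fa),   # stays below T(L)
          look_ahead_loss(L, D=2.0, C=0.1, w_m=w_m, w_fa=w_fa))   # raised everywhere by C
```

With C = 0 the look-ahead loss lies below T(L) for every L, as Figure 10 indicates; with C > 0 the whole curve is raised by C, which is the situation taken up next.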

Figure 10. The expected loss function and ROC curve for a fixed observation procedure with the cost of observation zero.

Suppose that the cost of observation C is greater than zero. The "look ahead" loss curve is everywhere raised by an amount C. In other words, the expected loss is now the cost of the observation plus the loss that occurs because of errors in terminal decision. Referring to Figure 11 it is clear that for certain a priori opinions (represented by the initial log-odds-ratio, L) the observation is not advantageous. This set of L's occurs for L < Γ and L > Λ. For these L's the observer's prior opinion is sufficiently strong that an observation strong enough to correct it, if wrong, would cost more than is saved by the reduction of errors. Thus the observation is never taken for this set of L's. The ROC is an arc and the points (0, 0) and (1, 1). The extent of the arc is determined by the decision boundary points, Γ and Λ. From the preceding discussion, the following two theorems occur as a logical consequence.

Theorem I: For C/W > .5, if one is allowed the option of taking the observation or not taking the observation, the observation is not taken. This is true for all possible distributions of observations.

Proof: The maximum of T(L) is T(Λ₀) = .5W. Let RE(L) denote the risk due to terminal decision errors. No matter how good an observation is available, if C > .5W the total risk, RE(L) + C, is not less than C. Thus RE(L) + C > T(L). This means that the total expected loss if the observation is taken is greater than if no observation is taken, independent of the quality, D, of the observation. Thus if the observer has the option of observing or not, then for C/W > 0.5 the observation is not taken. Q.E.D.

Theorem II: Given any C/W < 0.5, there is a minimum quality of normal observation necessary to warrant observation.

Proof: For an observation to be advantageous, the risk based on the observation must be less than the risk incurred by a decision based on prior information only. In symbols, RE(L) + C < T(L), or T(L) − RE(L) > C. If we write the above equation in terms of normal observation statistics we have

T(L) − [RE(L) + C] = (W/2)·[(1 + e^Λ₀)/(1 + e^L)]·[e^(L−Λ₀)·P("A"|SN) − P("A"|N)] − C,   L ≤ Λ₀ 26

Figure 11. The expected loss function and ROC curve for a fixed observation procedure with the cost of observation greater than zero.

T(L) − [RE(L) + C] = (W/2)·[(1 + e^Λ₀)/(1 + e^L)]·[(1 − P("A"|N)) − e^(L−Λ₀)·(1 − P("A"|SN))] − C,   L ≥ Λ₀

By routine calculation one finds that this quantity has a positive derivative for L < Λ₀ and a negative derivative for L > Λ₀ for any ROC point except (0, 0) and (1, 1). Hence, there is a single maximum at L = Λ₀. In order to show that there is a minimum quality needed before an observation is taken, given that C/W < 0.5, we need to show that when we take the maximum difference between T(L) and RE(L) + C, this difference depends on the quality of the observation. Thus we write

max over L and over the ROC of [T(L) − (RE(L) + C)] = max over the ROC of [(W/2)·(P("A"|SN) − P("A"|N)) − C]

The quantities P("A"|SN) and P("A"|N) are the probabilities of detection and false alarm respectively. They depend on the quality of the observation, D. Thus no observation is warranted unless P("A"|SN) − P("A"|N) ≥ 2C/W for some ROC point. For decisions based on likelihood ratio the ROC is convex; for normal observation, as well as for any other symmetric observation statistics, this means the maximum of P("A"|SN) − P("A"|N) occurs on the negative diagonal, P("A"|SN) + P("A"|N) = 1. To specialize this proof for the normal case,

max over L and over the ROC of [T(L) − (RE(L) + C)] > 0  is equivalent to  Φ(.5·√D) > .5 + C/W

Thus the minimum quality is given by Φ(.5·√D_min) = .5 + C/W. Q.E.D.

In words, Theorem I states that if the cost of the observation is too great then, roughly speaking, no matter how much "information" is obtained from this observation, this "information" cannot make up, by a decrease in errors, the cost of the observation. Theorem II states that for a given (C/W) < .5 ratio, there is a minimum amount of "information" needed in order to make it profitable to sample. The quantity (C/W) will turn out to be a convenient parameter for the description of observation-decision procedures, as might be suspected from this theorem. Both theorems apply to optimum nonsequential procedures and to deferred decision. 28
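The minimum quality of Theorem II is easy to evaluate numerically. The following Python sketch (the function name d_min and the bisection settings are ours, not the report's) solves Φ(√D/2) = .5 + C/W by bisection; with C taken as the cost of a quality-one observation, C/W = 1/30 and 1/100 give minimum qualities of about .028 and .0025, the values quoted later in Section 2.3.1.

import math

def Phi(x):
    """Standard normal CDF."""
    return 0.5*(1.0 + math.erf(x/math.sqrt(2.0)))

def d_min(C_over_W):
    """Solve Phi(sqrt(D)/2) = 0.5 + C/W for the minimum worthwhile quality D (Theorem II)."""
    target = 0.5 + C_over_W
    lo, hi = 0.0, 100.0
    for _ in range(60):                 # bisection on a monotone function of D
        mid = 0.5*(lo + hi)
        if Phi(math.sqrt(mid)/2.0) < target:
            lo = mid
        else:
            hi = mid
    return 0.5*(lo + hi)

print(d_min(1.0/30.0), d_min(1.0/100.0))   # approximately 0.028 and 0.0025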

2.3 THE PREDETERMINED NONSEQUENTIAL OBSERVATION-DECISION PROCEDURE

The predetermined nonsequential procedure is a generalization of the fixed observation procedure. In the predetermined nonsequential procedure, in contrast with a fixed procedure, the observer chooses his observation length prior to the start of the observation. To optimize, in the sense of minimum average loss, his observation-decision procedure he chooses the observation length based on all the information he has available before the start of the observation. This information includes the loss due to terminal decision errors, the quality of an observation, the cost of the observation, and the a priori probability of the occurrence of a signal. Mathematically, we represent these parameters by W and Λ₀, D₀, cD₀, and L, respectively. Thus basically we wish to determine the observation length, or alternately the quality of observation, D₀, in order to minimize the average loss for a total observation. We wish to determine the quality of observation for minimum average loss as a function of W, Λ₀, c, and L. In Section 2.3.1 we discuss the optimum predetermined procedure, called the optimum nonsequential procedure. The specific optimum nonsequential procedure examined is that of normal observation statistics with "continuous observation." By continuous observation we mean that the observation length is a continuous variable, or alternately the observed quality is a continuous variable. This is to be contrasted with the discrete observation case in which the observed quality is discrete; the observation quality can be chosen only in chunks or discrete steps. The numerical calculations are done for the normal case. The specific choice of distributions for the observed variate under noise and signal-plus-noise is made so as to obtain concrete numerical results. The logic used in any specific problem is common to all such problems. The differences in numerical answers occur because of the specific distributions chosen for the observed variate.

2.3.1 THE OPTIMUM NONSEQUENTIAL OBSERVATION-DECISION PROCEDURE FOR NORMAL OBSERVATION STATISTICS WITH CONTINUOUS OBSERVATION. The basic problem we have in optimizing the predetermined observation-decision procedure is the correct choice of the observed quality, D₀. We say we have an optimum observation quality when the total expected loss of the observation-decision procedure is minimized. This average loss is composed of two parts: (1) the cost of the observation of quality D₀, which is equal to cD₀, and (2) the loss due to terminal errors. For a specific problem involving W, Λ₀, c, D₀, and L, let us determine the "contour graph of the value of observation." This is a graph of D₀, the quality of an observation, vs. L 29

(L represents our a priori opinion of the presence of a signal). The level curves or contours of the graph are contours of constant value of observation, that is, the additional value (or reduced loss) that will occur if the initial log-odds-ratio is L and observations of quality D₀ are taken, compared to no observation taken. Each set of contours corresponds to a fixed W/c ratio. In Figures 12 and 13 are shown two such graphs. Figure 12 is for the case W/c = 30 and Figure 13 is for the case W/c = 100. The contour graph of the value of observation can be used to explain the aspects of the optimum nonsequential procedure. The analytic derivation will be given later in this section. The graphs are obtained by use of a computer program (see Appendix G). Referring to Figure 12 let us examine in detail the predetermined observation-decision procedure. Assume we have determined the cost, c, and the losses due to terminal errors, W. For simplicity we will assume that Λ₀ = 0, i.e., the loss due to a miss equals the loss due to a false alarm. Suppose that the available quality is very large. That is, we can, if we want, choose a very large observation quality, D. For a specific a priori log-odds-ratio L, say 1.0, we note the following. For a very small observation quality, say D₀ < .5, the value of observation is less than zero. This means that our a priori opinion, represented by L = 1.0, is sufficiently strong that it does not pay to buy the small amount of "observation information" represented by the observation quality D₀ < 0.5. In other words, the cost for the amount of "information" we received from the observation is too great. The decrease in terminal error loss is not great enough to warrant taking the observation. This is shown on the contour graph of the value of observation by falling outside of the zero value contour. If we now increase the observed quality to 2.0, with L = 1.0, the value of observation is approximately 10. This is for W/c = 30. As we let the observation quality increase we note that we again fall outside the zero value contour. We are again paying too much for observation in relation to the amount we gain by a decrease in terminal decision error loss. At some point along each L value there exists an optimum observation quality which maximizes the value of the observation-decision procedure. This is represented by the dotted line in Figures 12 and 13. This dotted line gives the optimum observed quality for a given L value and a given W/c ratio. We note also that there exists a set of L values for which one never intersects the zero value contour as the observation quality is increased. Roughly speaking, we can say that the observer's prior opinion of the cause of the observation overrides any "information" he can economically obtain by taking the observation. 30

Figure 13. The contour graph of the value of observation for the continuous observation case and W/c = 100.

Figure 12. The contour graph of the value of observation for the continuous observation case and W/c = 30.

Notice that by Theorem II the zero value contour should not reach the origin of the contour graph. The minimum D for a W/c = 30 is .028 and for a W/c = 100 this minimum D is .0025. This minimum D is so small that it can't be seen in the plots of Figures 12 and 13. We are interested in the intersection of the dotted curve with the zero value contour. These two points in L determine when we should take an observation if given the option of observing or not. Let Γ₁ and Λ₁ be the L values corresponding to the intersection of the dotted line with the zero value contour. Further, let Γ₁ correspond to the smaller L value and Λ₁ the larger L value. Then, as a function of W/c, we have that if our prior opinion, represented by L, is such that L < Γ₁ or L > Λ₁ we will not observe if we have the option of observing or not. If we do observe we know that the value of the observation will be less than zero and, in fact, less than if we did not observe at all. As we decrease the ratio of the loss due to a terminal decision error to the cost of an observation of quality one, the contour graph for the value of an observation "shrinks." Clearly, this agrees with one's intuition. The decrease in the W/c ratio can be viewed as an increase in the cost of observing. Thus the cost of the "information" we receive from an observation on which to base a terminal decision is increasing. This means that the balance between the cost of "information" received and the decrease in loss due to terminal decision errors is such that prior opinions tend to become more important. Thus if one is fairly certain before the observation what the cause of the observation is, the observation will not be profitable. Mathematically, we see this as a decrease in the interval between Γ₁ and Λ₁ as the W/c ratio decreases. This implies that the value contours "shrink" as the W/c ratio decreases. The preceding discussion is a heuristic explanation of the optimum nonsequential procedure. The analytic development of the problem, given below, will place these general ideas on a rigorous mathematical foundation. Repeating, our basic problem is to determine the right observation quality, D₀, such that the value of the observation is maximized. This is equivalent to minimizing the average loss. Thus the procedure to determine the optimum observed quality is clear. We merely express the value of the observation in terms of losses due to terminal errors, the cost of the observation, the quality of the observation, and the representation of our a priori probability of the presence of a signal. Having this expression we find the maximum of the value of the observation, considering the observed quality as a continuous variable. The solution for the observed quality in this equation is the desired answer. 33

The minimum average risk for an immediate terminal decision, i.e., the observation is not taken, is T(L).

T(L) = min {P(SN)·W_M, P(N)·W_FA}   (2.1)

In terms of W, Λ₀, and L this average risk is

T(L) = (W/2)·[(1 + e^Λ₀)/(1 + e^L)]·min {e^(L−Λ₀), 1}   (2.2)

Equation 2.2 is the average risk for D₀ = 0, i.e., an immediate terminal decision. The average risk associated with an observation of nonzero quality D₀ is easily seen to be

R(L, D₀) = P(SN)·W_M·P("B"|SN) + P(N)·W_FA·P("A"|N) + c·D₀   (2.3)

The average value, V, of observation is naturally defined as the amount to be gained by observing.

V(L, D₀) = T(L) − R(L, D₀)   (2.4)

In Equations 2.1 through 2.4 we have indicated that the value function, the average risk function for D₀ > 0, and the terminal risk function T(L) are functions of only L and D₀ (T(L) being a trivial function of D₀). We have chosen to suppress the functional dependence of these functions on the losses incurred in terminal decisions and observing costs. We do this because these quantities are usually fixed in any given problem. The usual problem is how to operate given these constraints, i.e., W, Λ₀, and c. Continuing, we see from Equation 2.4 that there result two functional forms for the value function. There is a different functional form for V(L, D₀) depending on whether L ≤ Λ₀ or L ≥ Λ₀. For L ≤ Λ₀ we have for the value function

V(L, D₀) = T(L) − R(L, D₀)
         = P(SN)·W_M − {P(SN)·W_M·P("B"|SN) + P(N)·W_FA·P("A"|N) + c·D₀}
         = P(SN)·W_M·P("A"|SN) − P(N)·W_FA·P("A"|N) − c·D₀
         = [e^L/(1 + e^L)]·W_M·P("A"|SN) − [1/(1 + e^L)]·W_FA·P("A"|N) − c·D₀   (2.5) 34

In like manner, for L ≥ Λ₀ the value function can be written

V(L, D₀) = [1/(1 + e^L)]·W_FA·P("B"|N) − [e^L/(1 + e^L)]·W_M·P("B"|SN) − c·D₀   (2.6)

The conditional probabilities P("A"|SN), P("A"|N), P("B"|SN), and P("B"|N) are uniquely defined by the ROC (see Section 1.4). In this report we restrict ourselves to the so-called "normal ROC." This means we can write the above conditional probabilities as follows.

P("A"|SN) = Φ(v) and P("B"|SN) = Φ(−v)   (2.7)
P("A"|N) = Φ(u) and P("B"|N) = Φ(−u)

where Φ(u) = ∫ from −∞ to u of φ(t) dt, with φ(t) = (1/√(2π))·exp[−t²/2].

For any fixed set {L, D₀, W, Λ₀, c} one can manipulate u and v to maximize the value function, V. This is a fixed-observation procedure. This has been studied at length (see, for example, Reference 6). The well known results are that

v − u = √D₀ > 0   (2.8a)
v² − u² = 2(L − Λ₀)   (2.8b)

Solving for u and v we have

u = −√D₀/2 + (L − Λ₀)/√D₀   (2.9a)
v = +√D₀/2 + (L − Λ₀)/√D₀   (2.9b)

Equations 2.9a and 2.9b give the values of u and v which maximize the value of observation for a fixed set {L, D₀, W, Λ₀, c}. This is merely the solution of the familiar fixed-observation procedure expressed in terms of performance parameters. Equations 2.7, 2.8, and 2.9 stem directly from the fact that a normal observation quality of D₀ implies that the logarithm of the likelihood ratio is normally distributed with mean +D₀/2 and variance D₀ in signal-plus-noise and with mean −D₀/2 and variance D₀ in noise alone. Let us now consider the problem at hand: the determination of the optimum observation quality. This observation quality results in the maximum value for observation and, as discussed previously, will depend on L, our opinion prior to the start of the observation of the cause of the observation. 35
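As a small worked illustration of Eqs. 2.7 through 2.9, the sketch below (Python; the function name roc_point is ours) computes the likelihood-ratio cut parameters u and v for a fixed observation of quality D₀ and the resulting detection and false-alarm probabilities Φ(v) and Φ(u).

import math

def Phi(x):
    return 0.5*(1.0 + math.erf(x/math.sqrt(2.0)))

def roc_point(L, D0, lam0=0.0):
    """Eqs. 2.8 and 2.9: u, v for a fixed observation of quality D0 at log-odds L,
    and the corresponding P("A"|SN) = Phi(v) and P("A"|N) = Phi(u) of Eq. 2.7."""
    u = -math.sqrt(D0)/2.0 + (L - lam0)/math.sqrt(D0)
    v =  math.sqrt(D0)/2.0 + (L - lam0)/math.sqrt(D0)
    return u, v, Phi(v), Phi(u)

print(roc_point(L=1.0, D0=2.0))   # symmetric losses (Lambda_0 = 0)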

We can express Equations 2.5 and 2.6 in normal parameters. Consider Equation 2.5, the value of observation for L ≤ Λ₀. (The same logic applies to Equation 2.6 for L ≥ Λ₀.)

V(L, D₀) = P(SN)·W_M·P("A"|SN) − P(N)·W_FA·P("A"|N) − c·D₀,   L ≤ Λ₀   (2.5)

Expressing Equation 2.5, repeated above, in terms of normal parameters and W, Λ₀, and L, we have, for L ≤ Λ₀, the following:

V(L, D₀) = (W/2)·[(1 + e^Λ₀)/(1 + e^L)]·[e^(L−Λ₀)·Φ(v) − Φ(u)] − c·D₀   (2.10)

The observation is profitable whenever the value of observation is greater than zero, i.e., V(L, D₀) ≥ 0. This condition can be written

(2cD₀/W)·[(1 + e^L)/(1 + e^Λ₀)] ≤ e^(L−Λ₀)·Φ(v) − Φ(u)   (2.11)

The optimization in a predetermined observation-decision procedure occurs in the selection of the proper observed quality, D₀, so as to maximize the value function, V(L, D₀). We therefore look for a relative maximum of Equation 2.10 as a function of D₀ (or equivalently √D₀) with W, Λ₀, and L fixed. We also restrict the observed quality to be greater than zero. From Equations 2.9a and 2.9b we first evaluate

∂v/∂√D₀ = 1/2 − (L − Λ₀)/D₀ = −u/√D₀   (2.12)
∂u/∂√D₀ = −1/2 − (L − Λ₀)/D₀ = −v/√D₀   (2.13)

Using the results given in Equations 2.12 and 2.13 we now differentiate Equation 2.10 with respect to √D₀, which results in Equation 2.14.

∂V(L, D₀)/∂√D₀ = (W/2)·[(1 + e^Λ₀)/(1 + e^L)]·[−e^(L−Λ₀)·φ(v)·u + φ(u)·v]/√D₀ − 2c·√D₀   (2.14)

To find the relative maximum we set this expression equal to zero. This is equivalent to

(4cD₀/W)·[(1 + e^L)/(1 + e^Λ₀)] = −e^(L−Λ₀)·φ(v)·u + φ(u)·v,   L ≤ Λ₀   (2.15) 36
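Before carrying the differentiation further, it may help to see the maximization done numerically. The sketch below evaluates the value function directly as V = T − R (Eqs. 2.3 and 2.4, which agrees with Eq. 2.10 on L ≤ Λ₀ and with its counterpart on L ≥ Λ₀) and scans D₀ for the maximizing quality; this traces the dotted curve of Figures 12 and 13. The function names and scan grid are ours, and the printed values depend on the normalization assumed here (W = 30, c = 1, Λ₀ = 0).

import math

def Phi(x): return 0.5*(1.0 + math.erf(x/math.sqrt(2.0)))

def value(L, D0, W=30.0, c=1.0, lam0=0.0):
    """V(L, D0) = T(L) - R(L, D0) for normal observation statistics."""
    pSN  = math.exp(L)/(1.0 + math.exp(L))
    W_M  = W*(1.0 + math.exp(lam0))/(2.0*math.exp(lam0))
    W_FA = W*(1.0 + math.exp(lam0))/2.0
    T = min(pSN*W_M, (1.0 - pSN)*W_FA)                      # Eq. 2.2
    u = -math.sqrt(D0)/2.0 + (L - lam0)/math.sqrt(D0)       # Eq. 2.9a
    v =  math.sqrt(D0)/2.0 + (L - lam0)/math.sqrt(D0)       # Eq. 2.9b
    R = pSN*W_M*Phi(-v) + (1.0 - pSN)*W_FA*Phi(u) + c*D0    # Eq. 2.3
    return T - R

def optimum_quality(L, W=30.0, c=1.0, lam0=0.0, Dmax=10.0, step=0.01):
    """Scan D0 > 0 for the relative maximum of the value function (Eqs. 2.14-2.15)."""
    return max((value(L, k*step, W, c, lam0), k*step)
               for k in range(1, int(Dmax/step) + 1))

for L in (0.0, 0.5, 1.0):
    best_value, best_D0 = optimum_quality(L)
    print(f"L = {L:.1f}: optimum D0 = {best_D0:.2f}, value = {best_value:.2f}")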

By Equation 2.8b and from the definition of φ(v) we have

e^(L−Λ₀)·φ(v) = e^((v²−u²)/2)·φ(v) = φ(u)   (2.16)

Hence Equation 2.15 for the relative maximum of V(L, D₀) becomes

(2cD₀/W)·[(1 + e^L)/(1 + e^Λ₀)] = φ(u)·(v − u)/2   (2.17)

We thus summarize the situation for L ≤ Λ₀. The value of observation is positive when

(2cD₀/W)·[(1 + e^L)/(1 + e^Λ₀)] ≤ e^(L−Λ₀)·Φ(v) − Φ(u)   (2.11)

And the value of observation has a relative maximum when

(2cD₀/W)·[(1 + e^L)/(1 + e^Λ₀)] = φ(u)·(v − u)/2   (2.17)

The boundary value of L for L ≤ Λ₀, which we denote by Γ₁, for which the optimum choice of the observation quality D₀ would just break even with no observation and an immediate terminal decision, is given by the simultaneous solution of Equation 2.17 and the equality of Equation 2.11. This operation results in Equation 2.18. This simultaneous solution is, in terms of our value contour graph, the intersection of the zero value contour and the dotted line. The dotted line represents the optimum observation quality. The analytic formula is given by Equation 2.17 in implicit form.

e^(L−Λ₀)·Φ(v) − Φ(u) = φ(u)·(v − u)/2   (2.18)
or
Φ(v)/φ(v) − v/2 = Φ(u)/φ(u) − u/2   (2.18)

and from Equation 2.8b we have v² − u² = 2(Γ₁ − Λ₀). The calculations for L ≥ Λ₀ have been rigorously carried through by the authors. These calculations serve as a check on the calculations for L ≤ Λ₀ since symmetry conditions imply 37

that the same results should be obtained for L ≥ Λ₀ as for L ≤ Λ₀ except for a change in the signs of u and v. These mechanical calculations do not contribute to the reader's understanding and so we omit them. The results for L ≥ Λ₀ are given below. For L ≥ Λ₀ the value of the observation is positive when

(2cD₀/W)·[(1 + e^L)/(1 + e^Λ₀)] ≤ Φ(−u) − e^(L−Λ₀)·Φ(−v)   (2.19)

The value of the observation has a relative maximum when

(2cD₀/W)·[(1 + e^L)/(1 + e^Λ₀)] = φ(u)·(v − u)/2   (2.20)

The boundary value of L for L ≥ Λ₀, denoted by Λ₁, for which the optimum choice of the observation quality D₀ would just break even with no observation, is the simultaneous solution of Equations 2.19 and 2.20. Note that Equation 2.20 is identical with Equation 2.17. This results in Equation 2.21 below.

Φ(−u) − e^(L−Λ₀)·Φ(−v) = φ(u)·(v − u)/2
or
Φ(−u)/φ(u) + u/2 = Φ(−v)/φ(v) + v/2   (2.21)

and from Equation 2.8b we have v² − u² = 2(Λ₁ − Λ₀). A comparison of Equations 2.21 and 2.18 shows that the two equations are similar except for all signs. Thus if a pair (u₁, v₁), v₁ > u₁, satisfies Equation 2.18 then (u₂ = −v₁, v₂ = −u₁) satisfies Equation 2.21 and v₂ > u₂. The analytic determination of the contour graph of the value of the observation is, in theory, completed. Equations 2.11 and 2.17 are all that are needed to determine the contour graphs. The actual graphs in Figures 12 and 13 were determined by use of a digital computer (see Appendix G). 38

Consider now the evaluation problem of determining how "good" the decisions are for the optimum nonsequential procedure. This evaluation is accomplished by determining the ROC curve for the procedure. The analytic derivation follows. In any predetermined nonsequential procedure there always exists a side condition on the observation length. This side condition is that the maximum allowable quality, D, is finite. The observed quality, D₀, which we choose, in general may depend on the available quality D. We consider first the situation in which the available D is so large that this side condition is eliminated. That is, D₀ may be chosen without reference to the available quality. Assume further that our initial opinion of the cause of the input is such that we take the observation. Mathematically this means that our initial L value is such that Γ₁ < L < Λ₁. The ROC for the optimum nonsequential procedure in parametric form is found as follows. Equation 2.17 gives the relationship between the various parameters for the optimum nonsequential procedure. Using this equation as a starting point, take the reciprocal of both sides. This operation results in Equation 2.22.

[W/(4cD₀)]·[(1 + e^Λ₀)/(1 + e^L)] = 1/[(v − u)·φ(u)]   (2.22)

By Equation 2.8a we have an expression for D₀ in terms of u and v. Combining this with Equation 2.16 we can write

2D₀·(1 + e^L) = 2(v − u)²·[1 + e^Λ₀·φ(u)/φ(v)]   (2.23)

If we now multiply Equation 2.22 by Equation 2.23 we obtain the parametric representation of the ROC. This is Equation 2.24.

[W/(2c)]·(1 + e^Λ₀) = 2(v − u)·[1/φ(u) + e^Λ₀/φ(v)]   (2.24)

For any fixed set of costs and values Equation 2.24 is the parametric form of the ROC. Values of u and v are restricted to a range given by the solutions of Equations 2.18 and 2.21. Equation 2.24 allows one to evaluate how good the terminal decisions are in an optimum nonsequential procedure. The actual "operating point" on the ROC, i.e., the coordinates u and v, depends on L. Figure 14 depicts the ROC for the optimum nonsequential procedure as a function of the W/c ratio. Notice that the ROC is constrained to an arc and the points (0, 0) and (1, 1). Figure 14 is plotted on normal paper (see Section 1.4). This allows us to determine easily how close our optimum nonsequential ROC approximates a normal ROC. (A normal ROC plots as a straight line with a slope of one.) 39
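Rather than solving Eq. 2.24 directly, the operating points of this ROC can also be generated by sweeping L and letting the quality be chosen optimally at each L, since Eqs. 2.16 and 2.17 hold at the optimum. The Python sketch below does this by brute force; it is an alternative numerical route to the same curve, and the function names, grids, and normalization (W = 30, c = 1, Λ₀ = 0) are ours.

import math

def Phi(x): return 0.5*(1.0 + math.erf(x/math.sqrt(2.0)))

def operating_point(L, W=30.0, c=1.0, lam0=0.0, Dmax=12.0, step=0.01):
    """Choose D0 to maximize the value at log-odds L, then report
    (P("A"|N), P("A"|SN)) = (Phi(u), Phi(v)) and the maximum value."""
    pSN  = math.exp(L)/(1.0 + math.exp(L))
    W_M  = W*(1.0 + math.exp(lam0))/(2.0*math.exp(lam0))
    W_FA = W*(1.0 + math.exp(lam0))/2.0
    T = min(pSN*W_M, (1.0 - pSN)*W_FA)
    best = (-1e30, (0.0, 0.0))
    for k in range(1, int(Dmax/step) + 1):
        D = k*step
        u = -math.sqrt(D)/2.0 + (L - lam0)/math.sqrt(D)
        v =  math.sqrt(D)/2.0 + (L - lam0)/math.sqrt(D)
        V = T - (pSN*W_M*Phi(-v) + (1.0 - pSN)*W_FA*Phi(u) + c*D)
        if V > best[0]:
            best = (V, (Phi(u), Phi(v)))
    return best

for L in (-1.0, -0.5, 0.0, 0.5, 1.0):      # points lie on the arc only where the value >= 0
    V, (x, y) = operating_point(L)
    print(f"L = {L:+.1f}:  P(A|N) = {x:.3f}  P(A|SN) = {y:.3f}  value = {V:.2f}")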

Figure 14. The ROC curves for the optimum nonsequential procedure for the continuous observation case as a function of the W/c ratio.

The boundary equations for L for which the observation is profitable can be found by using Equations 2.18 and 2.21. Let (u₁, v₁) be on the lower ROC boundary, i.e., (u₁, v₁) satisfies Equation 2.18. The corresponding L value we denote by Γ₁.

Γ₁ = Λ₀ + (v₁² − u₁²)/2   (2.25)

where (u₁, v₁) satisfy

[1/(1 + e^Λ₀)]·[1/φ(u₁) + e^Λ₀/φ(v₁)] = W/[4c·(v₁ − u₁)]   (2.26)

In like manner, on the upper ROC boundary (u₂, v₂) satisfies Equation 2.21, with the L value denoted by Λ₁.

Λ₁ = Λ₀ + (v₂² − u₂²)/2   (2.27)

where (u₂, v₂) satisfy Equation 2.26. This completes the discussion of the continuous observation predetermined nonsequential procedure with large available quality. Perhaps the best summary of the predetermined procedure is the contour graph of the value of the observation shown in Figures 12 and 13. There are certain other aspects of a predetermined observation procedure which we have not as yet discussed. There remains the problem of a discrete observation problem in which the observation quality cannot be considered as a continuous parameter. And there is the situation in which the condition of a finite allowable D affects the selection of D₀. Let us consider the latter problem first. This situation can be explained most easily by reference to Figures 12 and 13. The contour graph of value shows clearly how the condition of small available quality affects the selection of D₀. If the available D, for a given L value, falls below the dotted curve for the optimum D₀, then the observer chooses D₀ equal to the available D. Otherwise he chooses the optimum D₀. This procedure maximizes his expected value for the observation. The discrete observation problem and the complications that arise from the discrete nature of the observation quality are best illustrated by means of an example. 41
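The boundary values Γ₁ and Λ₁ can also be located numerically without solving Eqs. 2.25 through 2.27, by bisecting on the L at which the best attainable value of observation crosses zero. The sketch below is a brute-force check of the boundary curves of Figure 20; the function names, grids, and tolerances are ours.

import math

def Phi(x): return 0.5*(1.0 + math.erf(x/math.sqrt(2.0)))

def max_value(L, W, c, lam0, Dmax=25.0, step=0.02):
    """Best value of observation over D0 > 0 at log-odds L."""
    pSN  = math.exp(L)/(1.0 + math.exp(L))
    W_M  = W*(1.0 + math.exp(lam0))/(2.0*math.exp(lam0))
    W_FA = W*(1.0 + math.exp(lam0))/2.0
    T = min(pSN*W_M, (1.0 - pSN)*W_FA)
    best = -1e30
    for k in range(1, int(Dmax/step) + 1):
        D = k*step
        u = -math.sqrt(D)/2.0 + (L - lam0)/math.sqrt(D)
        v =  math.sqrt(D)/2.0 + (L - lam0)/math.sqrt(D)
        best = max(best, T - (pSN*W_M*Phi(-v) + (1.0 - pSN)*W_FA*Phi(u) + c*D))
    return best

def break_even(W, c, lam0, lo, hi, steps=40):
    """Bisect for the L at which max_value changes sign between lo and hi."""
    positive_at_lo = max_value(lo, W, c, lam0) > 0.0
    for _ in range(steps):
        mid = 0.5*(lo + hi)
        if (max_value(mid, W, c, lam0) > 0.0) == positive_at_lo:
            lo = mid
        else:
            hi = mid
    return 0.5*(lo + hi)

# Symmetric losses, W/c = 30: the two break-even points bracket Lambda_0 = 0.
print("Lambda_1 ~", break_even(30.0, 1.0, 0.0, 0.0, 4.0))
print("Gamma_1  ~", break_even(30.0, 1.0, 0.0, 0.0, -4.0))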

2.3.2 AN EXAMPLE OF AN OPTIMUM NONSEQUENTIAL PROCEDURE WITH DISCRETE OBSERVATION. Consider the following example of a predetermined nonsequential procedure with a W/c ratio of 30 and Λ₀ = 0, i.e., the losses due to terminal errors are equal. Let the quantization in the quality of observation be one. In other words, the quality of an observation may be chosen as D₀ = 1, 2, 3, .... The average risk function for an observation quality D₀ is found by combining the loss due to terminal decision errors with the cost of observing. In this example, D₀ = nd = n, n = 1, 2, 3, .... Assuming normal observation statistics with the density function of the input in noise N(−D₀/2, D₀) and the density function of the input in signal-plus-noise N(+D₀/2, D₀), we have for the average risk

R*(L, D₀) = D₀/30 + [e^L·Φ(−√D₀/2 − L/√D₀) + Φ(−√D₀/2 + L/√D₀)]/(1 + e^L)   (2.28)

R*(L, D₀) is a normalized risk function. Equation 2.28 is plotted in Figure 15 for D₀ = n, n = 0, 1, 2, 3. Observe that the observed quality, D₀, for the optimum nonsequential procedure will never exceed three. From the risk functions shown in Figure 15 it is also clear that if a D₀ of at least three is allowed, one will never use a D₀ of one. The observation quality of one does not decrease the errors in a terminal decision enough to make up for the cost of the observation. This result is due, of course, to the quantization of the observation quality. The ROC for this procedure is shown in Figure 16. The ROC consists of two segments corresponding to a D₀ of two and of three. The quantization of D₀ results in some L values having a D₀ which is not unique. The ROC is discontinuous and P("A"|SN) is not a single-valued function of the false alarm probability, P("A"|N). For these L's we have from Equations 2.8a and 2.8b

v₁ − u₁ = √(nd),   2L = v₁² − u₁²   (2.29)
v₂ − u₂ = √(nd + d),   2L = v₂² − u₂²   (2.30)

Thus

v₂ + u₂ = √(n/(n + 1))·(v₁ + u₁)   (2.31)

On normal-normal coordinates there is a doubling back of the ROC at each discontinuity. For the specific example of W/c = 30, P("A"|SN) is not a single-valued function at L = 1.048. If Equation 2.28 is calculated for W/c = 30, d = .25, and D₀ = nd for n = 0, 1, 2, ..., 15, Table I is obtained. This is the same type of solution as obtained for W/c = 30 42
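The sketch below evaluates the normalized risk of Eq. 2.28 on a grid of quantized qualities and reports the minimizing number of steps; with d = .25 it gives, for example, R*(0, .25) ≈ .4096, which matches the leading R₁ entry of Table I. Function names are ours.

import math

def Phi(x): return 0.5*(1.0 + math.erf(x/math.sqrt(2.0)))

def R_star(L, D, W_over_c=30.0):
    """Normalized risk of Eq. 2.28 (symmetric losses, Lambda_0 = 0); D = 0 gives T(L)/W."""
    if D == 0.0:
        return min(math.exp(L), 1.0)/(1.0 + math.exp(L))
    v =  math.sqrt(D)/2.0 + L/math.sqrt(D)
    u = -math.sqrt(D)/2.0 + L/math.sqrt(D)
    err = (math.exp(L)*Phi(-v) + Phi(u))/(1.0 + math.exp(L))
    return D/W_over_c + err

print(round(R_star(0.0, 0.25), 5))          # about .40962, as in Table I
for d in (1.0, 0.25):
    for L in (0.0, 0.5, 1.0, 1.5, 2.0):
        risks = {n: R_star(L, n*d) for n in range(0, 16)}
        n_best = min(risks, key=risks.get)
        print(f"d = {d}: L = {L:.1f} -> best n = {n_best}, risk = {risks[n_best]:.4f}")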

Figure 15. The normalized expected loss function for an optimum nonsequential procedure with normal observation statistics and W/c = 30., Λ₀ = 0., and d = 1.0.

Figure 16. The ROC curve for an optimum nonsequential procedure with normal observation statistics, discrete observation case, with W/c = 30., Λ₀ = 0., and d = 1.0.

Table I. The normalized expected loss function for an optimum nonsequential procedure with normal observation statistics and W/c = 30., Λ₀ = 0., and d = .25. (Columns: L, T(L), and R₁ through R₁₅, the normalized risks of Eq. 2.28 for D₀ = nd, n = 1, ..., 15, tabulated for L from 0 to 2.2.)

and d = 1. The largest D₀ will never exceed fourteen steps. If one is allowed a D₀ of 9 steps or more, one will never take a D₀ of 8 or less. The optimum risk function for these parameters is shown in Figure 17. The ROC is not included for this case but consists of a number of straight lines, as before, where the discontinuities in L can be found by looking at the intersection of the risk function with T(L) for D₀ = nd, n = 0, 1, 2, ..., 15. From the numerical examples presented it is clear what effect quantizing D₀ produces. The ROC is no longer continuous and single valued. This implies that for a given L the optimum D₀ may not be unique. The optimum observation quality is no longer unique for every L value.

2.3.3 AN EXAMPLE OF AN OPTIMUM NONSEQUENTIAL PROCEDURE WITH CONTINUOUS OBSERVATION. In illustration of the numerical results obtained in an optimum nonsequential procedure with continuous observations, the following example is considered. Assume the available D is large so that the choice of D₀ can be made independent of the available D. Let the loss due to a false alarm, W_FA, be 20 and the loss due to a miss, W_M, be 60. Further, let the cost of observation per unit change in D₀ be one. In terms of W and Λ₀ we have

W = 2·W_FA·W_M/(W_FA + W_M) = 30   (2.32)
Λ₀ = ln(W_FA/W_M) = −1.0986   (2.33)

The nonobserving terminal loss function, T(L), is given by Equation 1.3, as derived in Section 1.4.

T(L) = (W/2)·(1 + e^(−Λ₀))·e^L/(1 + e^L),   L ≤ Λ₀   (1.3)
     = (W/2)·(1 + e^Λ₀)/(1 + e^L),   L ≥ Λ₀

The risk associated with an observation of quality D₀ can be found using Equation 2.3.

R(L, D₀) = P(SN)·W_M·P("B"|SN) + P(N)·W_FA·P("A"|N) + c·D₀   (2.3)

Rewriting Equation 2.3 in terms of Λ₀ and W and normal observations we have

R(L, D₀) = (W/2)·[(1 + e^(−Λ₀))·e^L·Φ(−v) + (1 + e^Λ₀)·Φ(u)]/(1 + e^L) + c·D₀   (2.34) 46
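The ingredients of Figures 18 and 19 can be reproduced with a few lines. The sketch below evaluates Eq. 2.34, picks the quality that minimizes it at each L, and separates the total into the error risk RE(L) and the observation cost cD₀; the near-constancy of RE(L) discussed in connection with Figures 18 and 19 then shows up in the printed columns. Function names and the search grid are ours.

import math

W_FA, W_M, c = 20.0, 60.0, 1.0
LAM0 = math.log(W_FA/W_M)                    # Eq. 2.33

def Phi(x): return 0.5*(1.0 + math.erf(x/math.sqrt(2.0)))

def split_risk(L, D):
    """Return (error risk, total risk) of Eq. 2.34 for quality D at log-odds L."""
    pSN = math.exp(L)/(1.0 + math.exp(L))
    v =  math.sqrt(D)/2.0 + (L - LAM0)/math.sqrt(D)
    u = -math.sqrt(D)/2.0 + (L - LAM0)/math.sqrt(D)
    err = pSN*W_M*Phi(-v) + (1.0 - pSN)*W_FA*Phi(u)
    return err, err + c*D

for L in (-2.4, -2.0, -1.5, LAM0, -0.5, 0.0):
    total, D_opt = min((split_risk(L, 0.05*k)[1], 0.05*k) for k in range(1, 400))
    err = split_risk(L, D_opt)[0]
    print(f"L = {L:+.2f}:  D_opt = {D_opt:5.2f}  RE = {err:5.2f}  cD0 = {c*D_opt:5.2f}")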

Figure 17. The normalized expected loss function for an optimum nonsequential procedure with normal observation statistics and W/c = 30., Λ₀ = 0., and d = .25.

In Figure 18, Equations 1.3 and 2.34 are plotted for W = 30, Λ₀ = −1.0986, and c = 1. In addition, the risks due to terminal errors and the observation cost are plotted separately. This shows how the total risk is composed of its two component parts. Notice that the risk due to error is approximately constant as a function of L. This is also true for the symmetric case, as shown in Figure 19, where the risk due to terminal errors and the observation cost are again plotted separately. The same general features of the risk function hold for any optimum nonsequential procedure, as indicated in Figures 18 and 19. The most striking feature is that the terminal error loss for a fixed W/c ratio is practically constant as a function of L.

2.4 BOUNDS ON DEFERRED DECISION PARAMETERS BY USE OF OPTIMUM NONSEQUENTIAL PROCEDURES

Nonsequential procedures can be used to yield bounds on certain parameters of deferred-decision procedures. This is possible because they represent nonoptimum deferred-decision procedures. Since the expected risk for the optimum deferred-decision procedure is not greater than the risk for any nonoptimum procedure, the point (in L) at which it agrees with the terminal risk function is at least as great as the point at which the nonoptimum procedure agrees with the terminal risk curve. Thus nonsequential procedures yield upper bounds on risk and interior bounds on the decision boundary points. The importance of these bounds is that they can be calculated analytically. It is evident from the example presented in Section 2.3.2 that the minimum risk curve for the optimum nonsequential procedure consists of segments of the various "discrete observation risk functions," i.e., the risk functions corresponding to D₀ = nd, n = 1, 2, .... Due to the quantization of D₀, the greatest L value of the intersection of T(L) with the individual risk functions is not necessarily Λ₁, the intersection of the D₀ = 1 × d observation risk function and T(L). (This logic is valid for both L > Λ₀ and L ≤ Λ₀. We consider only L > Λ₀.) This is exactly equivalent to finding the greatest L value for which the value function is zero and D₀ = nd, n = 1, 2, .... This problem is solved graphically by use of the contour graph of the value of the observation. Merely note on the D₀ axis which values of D₀ are possible and then find the corresponding L value on the zero level curve of the value function. The solution is to choose the largest L value so obtained. We repeat that this L value may not correspond to D₀ = 1 × d. This is due to the quantization of D₀. These L values serve as an upper bound for the general deferred decision boundary. For a continuous normal observation with W_M = 3·W_FA the graph of these L values, (Λ, Γ), as a function of W/c is shown in Figure 20. Also shown in Figure 20 are the decision boundaries for the symmetric loss case. Note that the asymmetric decision boundaries are not a 48

Figure 18. The expected loss function for an optimum nonsequential procedure with normal observation statistics. The total expected loss is decomposed into the terminal decision error loss, RE(L), and the observation cost, cD₀. The parameters are W/c = 30., Λ₀ = −1.0986, and d = 1.0.

Figure 19. The expected loss function for an optimum nonsequential procedure with normal observations. The total expected loss is decomposed into the terminal decision error loss, RE(L), and the observation cost, cD₀. The parameters are W/c = 30., Λ₀ = 0., and d = 1.0.


Figure 20. The optimum nonsequential decision boundaries for large available quality, D, and continuous normal observations as a function of W/c, for Λ₀ = 0. and −1.0986.

simple translation of the symmetric decision boundaries when they are plotted as a function of W/c. If, however, the decision boundaries are plotted as a function of D, then the asymmetric decision boundaries are a translation of the symmetric decision boundaries. This is shown in Figure 21. These results can also be obtained analytically. For continuous normal observations (L > Λ₀) we solve Equations 2.11 and 2.17 simultaneously for L.

(2cD₀/W)·[(1 + e^L)/(1 + e^Λ₀)] = φ(u)·(v − u)/2   (2.17)

This results in the same type of plots as obtained in Figures 20 and 21. If we are not permitted to choose D₀ on the basis of the a priori odds, L, then one possible solution is to use L = 0 and solve Equations 2.17 and 2.20 simultaneously. This results in simpler expressions. For example, consider the symmetric loss, continuous, normal observation procedure. If we plot the difference between Λ as obtained by using our initial odds information and Λ as obtained by setting L = 0 (call this Λ = Λ*) as a function of W/c, the two values are "close" to each other. The difference (Λ − Λ*) is plotted in Figure 22 as a function of W/c. The conclusion we can draw from this is that the effect of choosing D₀ on the basis of L is a "corner effect," i.e., the difference in the risk functions for the two procedures occurs near the intersection with the terminal loss function. If we compare a nonoptimum nonsequential procedure and an optimum nonsequential procedure, using the D₀ of the nonoptimum procedure equal to the D₀ of the optimum procedure, the differences in risk functions occur only in a small L interval near the terminal loss curve. As a means of furnishing an inner bound on the asymptotic deferred decision boundary, i.e., D → ∞, the nonsequential procedure optimum for L = 0 is, for many practical purposes, as good as the general optimum nonsequential procedure. The advantage in using the nonsequential procedure optimum for L = 0 is the less complicated equations which must be solved.

2.5 SUMMARY OF NONSEQUENTIAL OBSERVATION-DECISION PROCEDURES

The study of observation-decision procedures naturally begins with nonsequential observation-decision procedures. The logical extension of the familiar fixed observation-decision 52

Figure 21. The optimum nonsequential decision boundaries for large available quality, D, and continuous normal observations as a function of this available quality, D, for Λ₀ = 0. and −1.0986.

Figure 22. The difference between the decision boundaries for the optimum nonsequential procedure with continuous normal observations as obtained by using the correct log-odds-ratio and a log-odds-ratio of zero, plotted as a function of W/c.

procedure is a procedure which is still nonsequential but where the observer may choose his observation length before the start of the observation. This predetermined nonsequential procedure is valuable not only as a means of furnishing bounds on sequential procedure performance but also as a procedure previously not investigated (to the best of the authors' knowledge). In many physical problems a sequential observation-decision procedure cannot be implemented. A sequential procedure might be too complicated equipment-wise, or the very nature of the problem might be such as to rule out sequential methods of observation and decision. These situations are the natural environment in which to apply the optimum nonsequential procedure discussed in this section. The analytic derivation of the optimum nonsequential procedure has been given for the stationary normal observation case. And although the particular numbers involved depend on the distribution of the observed variate under noise and signal-plus-noise, the logic necessary to solve the problem does not. The results of the optimum nonsequential procedure are best summarized by means of the contour graph for the value of the observation (Figures 12 and 13). This graph, which plots the available quality vs. the log-odds-ratio, L, has as level curves the value of the observation. For a given available D the optimum observed quality, D₀, can readily be found from this graph. The corresponding decision points (Γ, Λ) can then be found. 55

3. SEQUENTIAL OBSERVATION-DECISION PROCEDURES

3.1 PRELIMINARY REMARKS

This section comprises the main part of this report. As explained in Section 1.1, one may view a sequential procedure as a procedure in which many intermediate decisions may be made before one terminal decision is reached. These intermediate decisions are: after each single observation, should one make a terminal decision or take another single observation? (Here we are assuming discrete observations.) The particular sequential procedure we examine is called deferred decision. This is an optimum procedure for the maximization of expected value (or, alternately, the minimization of expected loss, whenever the two are equivalent). The available D is given as finite. With this basic restriction and the other parameters of the observation-decision process given, the problem is to determine the optimum procedure to minimize expected loss. The basic restriction of finite available D can be viewed as occurring in many ways. One such implementation is to consider a cost function which increases with time. This has been studied by T. Curry, Reference 7. Our concern is not so much in justifying the restriction of a finite available D but to assume the condition because of its generality. Note that Wald's sequential procedure and all nonsequential procedures are included as special cases in deferred decision.

3.2 METHOD OF SOLUTION AND AN ILLUSTRATIVE EXAMPLE

The method of solution is based on the following fact. The expected loss at any stage of the observation-decision process is minimized by minimizing the expected loss considering the process to have started at the stage of the process where one is now. Readers familiar with the terms of dynamic programming (Reference 8) will recognize this as the principle of optimality. Still another way to view this is to use the language of state variables. At each stage of a discrete observation-decision process the state of the procedure is given by a specification of a certain minimum set of variables (see Section 1.4). The state of a system contains all the information available to predict the future of the system. The important fact about the specification of the state of a system or procedure is that future behavior depends only on the state now and not on past behavior. The fact that one can specify the state of the process is the basic idea in the method of solution of deferred decision. To optimize a deferred-decision process we minimize the expected risk. From the minimization we obtain decision points in L. The optimum procedure implies a unique set of decision points. Thus, basically we must find the 56

risk function at each stage n. Minimization of this function then determines our optimum procedure. Clearly, since we know T(L), the terminal loss function, we start the process of solution from T(L). We solve in an iterative manner the risk functions for n > 0. (We may start the solution at any value of n for which the complete solution for smaller n is known.) The parameters of the observation-decision process assumed known are: (1) f(y|SN) and f(y|N), (2) W_FA and W_M, i.e., W and Λ₀, (3) cd, the cost of an observation of quality d, and (4) the quality, D, of the total observation. In general these parameters are functions of the stage number, n, of the decision process. For n = 0 our expected loss is trivial since n = 0 means no observations may be made, and hence a terminal decision is called for. The expected loss, F₀(L), is equal to the terminal loss function T(L), since the cost of no samples is zero. Consider next the case where there is one allowable deferral, i.e., n = D_max/d = 1. For a given a priori probability of SN, P₁(SN), let the transformed value be L₁. If "y" is observed, the probability that SN is true is given by lemma 1, i.e., L₀ = L₁ + ln[ℓ(y)]. The observer is now in the n = 0 state. No more observations are possible. The loss for any "y" is F₀(L₀) = F₀(L₁ + ln[ℓ(y)]). This loss must be averaged over all "y" to obtain the expected loss for the observation. To this is added the cost of the observation. This is G₁(L), the expected loss of deferring one decision.

G₁(L) = ∫ F₀(L + ln[ℓ(y)])·f(y) dy + cd   (3.1)

f(y) is the probability density of the observation y, and cd is the cost of a single observation of quality d. f(y) may be written in terms of f(y|SN) and f(y|N) as

f(y) = [e^L/(1 + e^L)]·f(y|SN) + [1/(1 + e^L)]·f(y|N)   (3.2)

Thus the optimum expected loss for n = 1 is F₁(L).

F₁(L) = min [T(L), G₁(L)]   (3.3) 57

The intersections of T(L) and G₁(L) are the decision boundary points Γ₁ and Λ₁. Figure 23 indicates the alternatives that may be taken for W_M > W_FA and n_max = 1. The optimum procedure may thus be solved iteratively by the following equations.

L_(k−1) = L_k + ln[ℓ(y)]   (3.4)

G_k(L) = ∫ from −∞ to ∞ of F_(k−1)(L + ln[ℓ(y)])·f(y) dy + cd   (3.5)

F_k(L) = min [T(L), G_k(L)]   (3.6)

Equations 3.4, 3.5, and 3.6 define the iterative process that can be used to determine the risk functions, F_k(L), and the decision boundary points, (Γ_k, Λ_k), for the optimum procedure. Hand calculations are prohibitive except for academic problems. Numerical solutions must be obtained by use of high-speed digital computers. See Appendix I. As an illustration of the calculations involved, consider the following simple example. Assume the following parameters of the observation-decision process are given: (1) W_FA = W_M = 1; (2) the cost of a single observation is c; (3) f(y|SN) = 9/(1 + 8y)² for 0 < y < 1 and zero otherwise, and f(y|N) = 1 for 0 < y < 1 and zero otherwise; (4) the problem is stationary. The method of solution is to work the problem one stage at a time from the n = 0 stage. The observation-decision problem is to observe "y" and decide "A," "B," or "defer," for any a priori opinion, L, quality of observation, and available D = n_max·d. At stage n = 0, i.e., no possible deferrals, the expected loss is:

T(L) = e^L/(1 + e^L),   L ≤ Λ₀
     = 1/(1 + e^L),   L ≥ Λ₀ 58
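A minimal numerical sketch of the iteration of Eqs. 3.4 through 3.6 for this example, assuming the distributions given above (f(y|N) = 1 and likelihood ratio ℓ(y) = 9/(1 + 8y)²) and c = 0.1. The quadrature grid and function names are ours, and the naive nesting of quadratures grows exponentially with n, so only very small n are computed this way; the closed-form G₁ of Eq. 3.13 serves as a check.

import math

c = 0.1                                     # cost of a single observation (example value)

def T(L):                                   # terminal loss, W_FA = W_M = 1, Lambda_0 = 0
    return min(math.exp(L), 1.0)/(1.0 + math.exp(L))

def lik(y):                                 # likelihood ratio of the example
    return 9.0/(1.0 + 8.0*y)**2

def f_marg(y, L):                           # Eq. 3.2
    p = math.exp(L)/(1.0 + math.exp(L))
    return p*lik(y) + (1.0 - p)

def G(F_prev, L, m=1000):                   # Eq. 3.5 by midpoint quadrature over y in (0, 1)
    s = sum(F_prev(L + math.log(lik((i + 0.5)/m))) * f_marg((i + 0.5)/m, L)
            for i in range(m)) / m
    return s + c

def F(n):                                   # Eqs. 3.4-3.6: optimum loss with n deferrals
    Fk = T
    for _ in range(n):
        Fk = (lambda L, Fp=Fk: min(T(L), G(Fp, L)))
    return Fk

print(F(1)(0.0), (c - 0.125) + 0.75*0.5)    # quadrature vs. closed form (Eq. 3.13) at L = 0
print(F(2)(0.0))                            # two allowable deferrals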

Figure 23. The optimum expected loss function as a function of the log-odds-ratio, L, for a deferred-decision procedure in which W_M > W_FA.

where Λ₀ = ln(W_FA/W_M) = 0. For L > 0, the decision is "A" and for L < 0, the decision is "B." The expected loss is F₀(L) = T(L). At stage n = 1 we have the possibility of deciding "A," "B," or "defer," with the knowledge that we have one deferral. If we defer, we then apply the results of the n = 0 stage. At stage n = 1 we know L₁, which represents our opinion of the cause of the observation, and T(L). We wish to compare the cost of deferring our decision with the terminal loss function T(L). The expected loss if we defer our terminal decision is the average over all observations "y" of the "look ahead" expected loss function, T(L₀), plus the cost of a single observation, c.

G₁(L₁) = ∫ over all y of F₀(L₀)·f(y) dy + c = ∫ over all y of T(L₀)·f(y) dy + c   (3.7)

In the above equation L₀ represents the log-odds-ratio starting with a given log-odds-ratio L₁ and having taken an observation "y," that is, L₀ = L₁ + ln[ℓ(y)]. Further,

f(y) = [e^(L₁)/(1 + e^(L₁))]·f(y|SN) + [1/(1 + e^(L₁))]·f(y|N),   0 < y < 1   (3.8)

Thus

G₁(L₁) = ∫ from 0 to 1 of T(L₁ + ln[ℓ(y)])·f(y) dy + c   (3.9)

We note that

ln[ℓ(y)] = 2·ln[3/(1 + 8y)],   0 < y < 1   (3.10)

is between −ln 9 and +ln 9. If L₁ is less than −ln 9, the resultant L₀ value will necessarily be below zero and the decision will be "B." One result of the internal consistency in the definition of L is that

∫ over all y of [e^(L₀)/(1 + e^(L₀))]·f(y) dy = e^(L₁)/(1 + e^(L₁)) = T(L₁),   L₁ < 0   (3.11) 60

Similarly, for L₁ greater than ln 9, the resultant L₀ value will always yield an "A" decision, and the average of T(L₀) is T(L₁). Therefore, we partially conclude that for |L₁| > ln 9, G₁(L₁) = T(L₁) + c. For |L₁| < ln 9, L₀ may be of either sign and hence T(L₀) must be expressed by two equations. Since T(L) has a different functional form for L < 0 and L > 0, the limits of integration on y must be determined. For −∞ < L₀ < 0 we require −∞ < L₁ + ln[ℓ(y)] < 0; simplifying, this gives y > [3·e^(L₁/2) − 1]/8. And similarly, for +∞ > L₀ > 0 the limits of integration on y are 0 < y < [3·e^(L₁/2) − 1]/8. Thus

G₁(L₁) = ∫ from 0 to [3·e^(L₁/2) − 1]/8 of [1/(1 + e^(L₀))]·f(y) dy + ∫ from [3·e^(L₁/2) − 1]/8 to 1 of [e^(L₀)/(1 + e^(L₀))]·f(y) dy + c   (3.12)

Carrying out the integration, the above equation becomes

G₁(L₁) = (c − .125) + .75·e^(L₁/2)/(1 + e^(L₁))   (3.13)

The optimum expected loss for n = 1 is F₁(L) = min [T(L), G₁(L)]. To find the decision points for n = 1, set G₁(L) = T(L) and solve for L. Call the intersection points Γ₁ and Λ₁. Since the problem is symmetric about L = 0, we need solve only for the intersection of G₁(L) and T(L) for L > 0. Obviously Λ₁ < ln 9, and Γ₁ = −Λ₁. Therefore, 61

F₁(L) = e^L/(1 + e^L),   L ≤ −A₁
      = (c − .125) + .75·e^(L/2)/(1 + e^L),   −A₁ < L < A₁   (3.14)
      = 1/(1 + e^L),   L ≥ A₁

Without bothering the reader with details, we remark that the cost of observation, c, is restricted to the range 0 < c < .25; larger costs result in A₁ = 0 and the whole decision process collapses because observation is too expensive. The expected loss if we defer with the possibility of two deferrals is G₂(L). G₂(L) is the average over all observations of the expected loss function of the previous stage, F₁(L), plus the cost of a unit observation. For |L₂| > A₁ + ln 9, G₂(L₂) = T(L₂) + c, and we can foresee that A₂ < A₁ + ln 9. Hence we are most concerned with |L₂| < A₁ + ln 9. As before, the limits of integration for y must be determined, since F₁(L) has three functional forms. For −∞ < L₁ < −A₁:  1 ≥ y ≥ [3·e^(L₂/2)·e^(A₁/2) − 1]/8. For −A₁ < L₁ < A₁:  [3·e^(L₂/2)·e^(A₁/2) − 1]/8 ≥ y ≥ [3·e^(L₂/2)·e^(−A₁/2) − 1]/8. For +∞ > L₁ ≥ A₁:  [3·e^(L₂/2)·e^(−A₁/2) − 1]/8 ≥ y ≥ 0. Thus

G₂(L₂) = ∫ from 0 to b(L₂, A₁) of [1/(1 + e^(L₁))]·f(y) dy + ∫ from b(L₂, A₁) to a(L₂, A₁) of [(c − .125) + .75·e^(L₁/2)/(1 + e^(L₁))]·f(y) dy + ∫ from a(L₂, A₁) to 1 of [e^(L₁)/(1 + e^(L₁))]·f(y) dy + c   (3.15) 62

where

a(L_k, A_i) = [3·e^(L_k/2)·e^(A_i/2) − 1]/8   and   b(L_k, A_i) = [3·e^(L_k/2)·e^(−A_i/2) − 1]/8

For |L| and A restricted to (0, ln 3), the order of the limits of integration is proper; namely, 0 < b < a < 1. In this single example we shall assume c is sufficiently large so that A_n < ln 3. To evaluate G₂(L) and the other integrations that arise, the following integrals are needed.

(1) ∫ from a(L_k, A_(k−1)) to 1 of ℓ(y) dy = −1/8 + (3/8)·e^(−L_k/2)·e^(−A_(k−1)/2)   (3.16)
(2) ∫ from b(L_k, A_(k−1)) to a(L_k, A_(k−1)) of ℓ(y) dy = (3/4)·e^(−L_k/2)·sinh(A_(k−1)/2)   (3.17)
(3) ∫ from b(L_k, A_(k−1)) to a(L_k, A_(k−1)) of dy = (3/4)·e^(L_k/2)·sinh(A_(k−1)/2)   (3.18)
(4) ∫ from b(L_k, A_(k−1)) to a(L_k, A_(k−1)) of √ℓ(y) dy = (3/8)·A_(k−1)   (3.19)
(5) ∫ from 0 to b(L_k, A_(k−1)) of dy = [3·e^(L_k/2)·e^(−A_(k−1)/2) − 1]/8   (3.20)

Using the above integrals, G₂(L) is given by

G₂(L) = (c − .125) + .75·[e^(L/2)/(1 + e^L)]·[2(c − .125)·sinh(A₁/2) + e^(−A₁/2) + (3/8)·A₁]   (3.21)

and the optimum risk function F₂(L) is 63

F₂(L) = e^L/(1 + e^L),   L ≤ −A₂
      = (c − .125) + .75·[e^(L/2)/(1 + e^L)]·[2(c − .125)·sinh(A₁/2) + e^(−A₁/2) + (3/8)·A₁],   −A₂ < L < A₂   (3.22)
      = 1/(1 + e^L),   L ≥ A₂

A₂ is the solution of the equation G₂(A₂) − T(A₂) = 0. This process of solution can be generalized to the nth stage, since F_n(L) = min [T(L), G_n(L)] and G_n(L) can be written in closed form.

G_n(L) = (c − .125) + .75·K_n·e^(L/2)/(1 + e^L)   (3.23)

where

K_n = 2(c − .125)·sinh(A_(n−1)/2) + e^(−A_(n−1)/2) + (.75/2)·K_(n−1)·A_(n−1),   and   K₁ = 1   (3.24)

Thus

F_n(L) = e^L/(1 + e^L),   L ≤ −A_n
       = (c − .125) + .75·K_n·e^(L/2)/(1 + e^L),   −A_n < L < A_n   (3.25)
       = 1/(1 + e^L),   L ≥ A_n

The decision points, ±A_n, can be found in an iterative manner from Equation 3.26.

A_n = 2·ln{[3·K_n − √(9(K_n² − 1) + 16c(5 − 4c))]/(1 − 8c)}   (3.26)

Notice that to solve for F_n(L) one must solve for each previous G_k(L) and A_k, k = 1, 2, 3, ..., n − 1. For example, the solution for the n = 5 deferred-decision problem inherently contains the solution of the n = 0, 1, 2, 3, and 4 deferred-decision problems. (When we speak of the "n equal something" deferred-decision problem we are specifying the available D for the problem.) 64
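The recursion of Eqs. 3.24 and 3.26 is easy to run. The sketch below (the function name is ours) computes A₁, A₂, ... for c = 0.1; the first value is about 0.72 and the sequence increases toward the asymptotic boundary discussed at the end of this section. A_n is written here as the root of the quadratic obtained from G_n(A_n) = T(A_n), which is how Eq. 3.26 was read; c = 1/8 would make that quadratic degenerate and is excluded.

import math

def boundary_points(c, n_max):
    """A_n of the illustrative example from Eqs. 3.24 and 3.26 (0.083 < c < 0.25, c != 0.125)."""
    K, points = 1.0, []                                  # K_1 = 1
    for n in range(1, n_max + 1):
        disc = 9.0*(K*K - 1.0) + 16.0*c*(5.0 - 4.0*c)
        x = (3.0*K - math.sqrt(disc))/(1.0 - 8.0*c)      # x = exp(A_n / 2)
        A = 2.0*math.log(x)
        points.append(A)
        K = 2.0*(c - 0.125)*math.sinh(A/2.0) + math.exp(-A/2.0) + 0.375*K*A   # Eq. 3.24
    return points

print([round(A, 3) for A in boundary_points(0.1, 8)])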

The risk function, F_n(L), and the determination of A_n depend on the cost of a unit observation, c. In order to evaluate an observation-decision procedure we use the ROC and the average number of observations, ȳ_k(L). The method of solution for obtaining the ROC and the average number of observations is the same as for the risk functions. The logical basis is exactly the same. Thus to solve for the ROC at the nth stage we work the problem step by step from the n = 0 stage to the n = n stage. Stage n = 0: For L < 0, the decision is "B." This implies the ROC is the point (0, 0). Similarly, for L > 0, the ROC is the point (1, 1). For L = 0 either point is allowed. Stage n = 1: At stage n = 1 we know L₁ and A₁. For L ≤ −A₁, the decision is "B." This implies that the ROC is the point (0, 0). For L ≥ A₁ the ROC is the point (1, 1). For L in the open interval (−A₁, A₁) the ROC is the previous ROC averaged with respect to the density function of the observation, "y." Let P("A"|L, SN) at stage n = 1 be designated as y₁(L) and P("A"|L, N) at stage n = 1 be designated as x₁(L). Thus

y₁(L₁) = P(+∞ > L₀ > 0 | SN) = P(0 < y < [3·e^(L₁/2) − 1]/8 | SN) = ∫ from 0 to a(L₁, 0) of f(y|SN) dy = 9/8 − (3/8)·e^(−L₁/2)   (3.27)

In a similar manner,

x₁(L₁) = P(+∞ > L₀ > 0 | N) = P(0 < y < [3·e^(L₁/2) − 1]/8 | N) = ∫ from 0 to a(L₁, 0) of f(y|N) dy = −1/8 + (3/8)·e^(L₁/2)   (3.28)

The complete ROC for n = 1 is 65

(y₁(L), x₁(L)) = (0, 0),   L ≤ −A₁
y₁(L) = 9/8 − (3/8)·e^(−L/2),   x₁(L) = −1/8 + (3/8)·e^(L/2),   −A₁ < L < A₁   (3.29)
(y₁(L), x₁(L)) = (1, 1),   L ≥ A₁

We note that the curved part of the ROC is an arc of the hyperbola

[9 − 8·y₁(L)]·[1 + 8·x₁(L)] = 9   (3.30)

Stage n = 2: Using the same logic as used in stage n = 1 we can find the ROC for n = 2. We assume that L₂, A₂, and A₁ are known. Omitting the details of the integrations, the ROC for n = 2 is

(y₂(L), x₂(L)) = (0, 0),   L ≤ −A₂
y₂(L) = 9/8 − (3/8)·h₁·e^(−L/2),   x₂(L) = −1/8 + (3/8)·h₁·e^(L/2),   −A₂ < L < A₂   (3.31)
(y₂(L), x₂(L)) = (1, 1),   L ≥ A₂

where

h₁ = (3/8)·A₁ − (9/4)·sinh(A₁/2) + e^(A₁/2)   (3.32)

As before, the curved part of the ROC is an arc of the hyperbola

[9 − 8·y₂(L)]·[1 + 8·x₂(L)] = 9·h₁²   (3.33)

Stage n = n: Generalizing to the nth stage, in the same manner as above, we obtain the ROC for the nth stage as 66

(y_n(L), x_n(L)) = (0, 0),   L ≤ −A_n
y_n(L) = 9/8 − (3/8)·h_(n−1)·e^(−L/2),   x_n(L) = −1/8 + (3/8)·h_(n−1)·e^(L/2),   −A_n < L < A_n   (3.34)
(y_n(L), x_n(L)) = (1, 1),   L ≥ A_n

where

h_n = (3/8)·h_(n−1)·A_n − (9/4)·sinh(A_n/2) + e^(A_n/2),   and   h₀ = 1   (3.35)

The curved part of the ROC is a hyperbola given by

[9 − 8·y_n(L)]·[1 + 8·x_n(L)] = 9·h_(n−1)²   (3.36)

The average number of observations, ȳ_k(L), can be obtained using the same logic as applied to the determination of the risk functions and the ROC's. At stage n = 0, obviously, ȳ₀(L) is zero. Thus ȳ₀(L|SN) = 0 and ȳ₀(L|N) = 0. Stage n = 1: At stage n = 1 we know L₁ and A₁. Again the average number of observations is obvious.

ȳ₁(L|SN) = ȳ₁(L|N) = 1,   −A₁ < L < A₁   (3.37)
ȳ₁(L|SN) = ȳ₁(L|N) = 0,   otherwise.

Stage n = 2: At stage n = 2, we know L₂, A₂, and A₁. For |L| > A₂ the decision is made immediately, the average number of observations being zero. For −A₂ < L₂ < A₂ at least one observation will be made, the exact number being dependent on the observation. To compute the exact average number of observations, the ȳ₁(L|SN) and ȳ₁(L|N) of the preceding stage must be averaged over the observation "y."

ȳ₂(L₂|SN) = 1 + ∫ from b(L₂, A₁) to a(L₂, A₁) of ȳ₁(L₁|SN)·f(y|SN) dy = 1 + ∫ from b(L₂, A₁) to a(L₂, A₁) of ℓ(y) dy 67

= 1 + (3/4)·e^(−L₂/2)·sinh(A₁/2),   −A₂ < L₂ < A₂   (3.38)

ȳ₂(L₂|N) = 1 + ∫ from b(L₂, A₁) to a(L₂, A₁) of ȳ₁(L₁|N)·f(y|N) dy = 1 + ∫ from b(L₂, A₁) to a(L₂, A₁) of dy = 1 + (3/4)·e^(L₂/2)·sinh(A₁/2),   −A₂ < L₂ < A₂   (3.39)

ȳ₂(L|SN) = ȳ₂(L|N) = 0,   |L| > A₂

Stage n = n: The solution for the nth stage can now be obtained iteratively. The average number of observations is given by

ȳ_n(L|SN) = ȳ_n(L|N) = 0,   |L| > A_n
ȳ_n(L|SN) = 1 + (3/4)·e^(−L/2)·Q_(n−1),   |L| < A_n   (3.40)
ȳ_n(L|N) = 1 + (3/4)·e^(L/2)·Q_(n−1),   |L| < A_n

where

Q_n = (3/8)·Q_(n−1)·A_n + sinh(A_n/2),   and   Q₀ = 0   (3.41)

We have obtained closed form expressions for the risk function, the ROC, and the average number of observations at any stage n. This is, in general, not possible except for some nonstatistical problems. However, this simple example does serve to exhibit the type of calculations common to deferred-decision problems. The parameter that has not yet been specified is c, the cost of a single observation. 68
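All three recursions can be run together, and letting n grow large gives the asymptotic constants quoted below for c = 0.1. In the sketch (the function name is ours) the loop carries K_n, h_n, and Q_n along with A_n; after many iterations the coefficients .375·K_n, .375·h_n, and .75·Q_n approach the asymptotic constants of Eqs. 3.42, 3.43, and 3.45, and A_n approaches the quoted A_∞ ≈ .9.

import math

def example_recursions(c, n_max):
    """A_n (Eq. 3.26), K_n (Eq. 3.24), h_n (Eq. 3.35), Q_n (Eq. 3.41) of the example."""
    K, h, Q, rows = 1.0, 1.0, 0.0, []
    for n in range(1, n_max + 1):
        disc = 9.0*(K*K - 1.0) + 16.0*c*(5.0 - 4.0*c)
        A = 2.0*math.log((3.0*K - math.sqrt(disc))/(1.0 - 8.0*c))
        h_new = 0.375*h*A - 2.25*math.sinh(A/2.0) + math.exp(A/2.0)
        Q_new = 0.375*Q*A + math.sinh(A/2.0)
        rows.append((n, A, K, h_new, Q_new))
        K = 2.0*(c - 0.125)*math.sinh(A/2.0) + math.exp(-A/2.0) + 0.375*K*A
        h, Q = h_new, Q_new
    return rows

for n, A, K, h, Q in example_recursions(0.1, 40):
    if n in (1, 2, 3, 40):
        print(f"n = {n:2d}:  A = {A:.3f}  .375K = {0.375*K:.4f}  "
              f".375h = {0.375*h:.5f}  .75Q = {0.75*Q:.4f}")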

The logarithm of the likelihood ratio of this example is bounded between −ln 9 and +ln 9. The boundedness of ln[ℓ(y)] limits the possible decisions that may be made at any stage n > 1. For instance, if A_k = 2 and L_(k+1) is less than A_k − ln 9 = −.2, then a single observation might lead to a "B" decision (L_(k+1) + ln[ℓ(y)] < −2) or a continued decision, but could not lead to an "A" decision (L_(k+1) + ln[ℓ(y)] > 2). These anomalies in the possible decisions that can be made have not been discussed in this example. To take account of these anomalies the risk function would have to be broken into more functional forms, the number of which can be shown to be always less than 2n, where n is the stage number. It was possible to limit the functional form of the risk function to that derived in the example by choosing the parameters of the problem correctly. In particular, the cost of a single observation must be chosen large enough so that the decision boundaries always satisfy A < a/2, where "a" is the bound on ln[ℓ(y)]. The fact that anomalies occur in the decision space is a consequence only of the boundedness of ln[ℓ(y)]. In a parallel study (Reference 9) of the application of deferred-decision theory to the clipper crosscorrelator, these anomalies occur naturally in a physical problem. In the present example, if the cost of a single observation is such that .083 < c < .25, then the anomalies in decision space do not occur. For c in this range we can solve for A_n, F_n(L), the ROC, and the ȳ_n(L) functions using the formulas derived. Consider a cost of a single observation, c = .1. Let us find the risk, ROC, and average number of observation functions for n = 1, 2, 3, and infinity. The solutions for n = 1, 2, and 3 are obtained from the formulas previously derived. To obtain the asymptotic functions, i.e., n = ∞, we use the recursion formulas, set n = n − 1, and solve. For n = ∞ we have: (1) The risk function is

F_∞(L) = e^L/(1 + e^L),   L ≤ −A_∞
       = −.025 + .3477/cosh(L/2),   −A_∞ < L < A_∞   (3.42)
       = 1/(1 + e^L),   L ≥ A_∞

(2) The ROC function is 69

(0, 0), L< -A 9 L/2 P("A" ISN) = y o(L) = +8-.29506 eL/ - < L < A (3.43) P("A" IN) = x (L) = 8 +.29506 e+ (1, 1), L>A 00 The curved part of the ROC is a hyperbola given by [9 - 8y(L)] [1 + 8x(L)] = 5.56 (3.44) (3) The average of number of observations is L eL 1 - _ Y (L) = e (LISN) + L Y (LIN)= 1 +.5268 cosh(L/2) ILI <A = 0, otherwise (3.45) (4) The decision point A is 00 A =.900 The graphs in Figures 24, 25, 26, 27, and 28 depict the solutions for n = 1, 2, 3 andco. Figure 24 is a graph of the risk functions. Figure 25 depicts the decision boundary as a function of the available D, D = n d. Although we speak of the decision "boundary" and draw a smooth max curve through the points, only the points for n = 1, 2,... have meaning. Figure 26 is a graph of the average number of observations. Figures 27 and 28 are graphs of the ROC functions. Figure 27 is drawn with linear paper, Figure 28 is drawn on normal normal paper. Figure 28 shows how close the problem is to a normal observation problem. (A normal observation ROC would plot as a straight line with slope one on normal normal paper.) 3.3 DEFERRED-DECISION, CONTINUOUS OBSERVATION CASE The method of solution presented in Section 3.1 can be implemented into a computing algorithm (see Section 4.1). For any given set of parameters one can use this computing algorithm to obtain the deferred-decision procedure. In this sense the algorithm solves any discrete observation-decision procedure. More analysis is possible if we assume a continuous observation and certain symmetries. These assumptions allow the analysis to become more specific. However, it is fairly obvious that the basic logic used is not tied to the assumptions of symmetry and continuity of observations. These assumptions are needed to obtain numerical answers and more detailed analysis. The solving of the more restricted problem is important because many aspects of the solution are common to less restrictive problems. 70

T(L) I O) U) 0.4 -I O w Fw (-) a_ x 05 N cr0 z n=2.3:~3 n-%0 - I I I I I I I I I I -1.0 -.8 -.6 -.4 -.2 0,2 4.6.8 L.0 Figure 24. The expected loss functions as a function of L for the illustrative example depicting the solutions for n = 1, 2, 3, and oc. max

1.0 * A DECISION BOUNDARY.8.6 -.4 2. t.. -2 -.3 -6 8 - -I.0 -r DECISION BOUNDARY I I II I I 12 II 10 9 8 7 -- n A4- Am A3 A2 Ao r4 r3 r2 6 5 4 3 2 0 Figure 25. The deferred decision boundary points for the illustrative example as a function of the available quality, D = n d. max

I.6 1.5 1.4,) z 0 cr aL U. 0 LLI w z (C Lli 1.3 i L n =3 n= 2 1.2 I 1.1 n=l 1.0 I I I i I I I I I -1.0 -.8 -.6 -.4 -.2 0.2.4.6.8 1.0 L Figure 26. The average number of observations for the illustrative example as a function of L depicting the solutions for n = 1, 2, 3, and cc. max 73

1.0.9 N:3 N:2 N:I.8.6.5 a..2.1 0.1.2.3.4.5.6.7.8.9 LO P (XIN) — Figure 27. The ROC curves for the illustrative example plotted on linear paper. 74

o." 4 /Q I cp"q z.50.40.30.20.10.05 04 03 02 01 I I I.01.02.03.04.05.10.20.30.40.50.60.70.8 0.95. P (','1 N) Figure 28. The ROC curves for the illustrative example plotted on normal coordinates. 96.97.98.99 75

Consider the heuristic physical model introduced in Section 1.1 shown again in Figure 29. Assume the input to the receiver is SN. The mechanism for making a decision consists of observing the output display and observing when the sampled output crosses from the continue portion of the scope into a terminal region. The observer receives information at times depending on the sampling clock. Suppose we increase the sampling clock's rate of sampling. The decision points become closer and closer and finally merge into a continuous curve as the sampling clock's interval approaches zero, i.e., as D becomes continuous. This decision boundary that results from continuous observation we call the standard form, S(Do, W/c). Using this concept of continuous observation one can derive analytic expressions for S(Do, W/c) and the risk function for the available D unbounded and d asymptotically small. The assumption of mirror symmetry of the observation statistics is needed for the analysis. The following two lemmas and one theorem are the basis of the analytic results. Let L(x) be defined as the mean value of (x) and P(x) the probability of (x). Let P(x, y) be the joint probability and P(xly) be the conditional probability of x given y. Lemma 2: The expected time to go from O to ~L when p1(n[f(y)]) = u is L tanh Proof: The expected time to go from 0 to ~L is "distance" E[tl{for 0-L] = E '"mean value of motion"] E[tlfor O - ~L] = - P(mean = +u, at L) + - P(mean = -u, at L) -L -L + -P(mean = +u, at -L) + - P(mean = -u, at -L) U -U Now L P(mean = +ul at L) e P(mean = -ulat L) And P(at +L) =.5, by symmetry of the origin and mirror symmetry of observation statistics. Since probability of error at L boundary is L we have 1 +e P(mean= +u, at L)= L 1 +e 76

ANTENNA INPUT,y.Q(y),THE LIKELIHOOD RATIO INTEGRATOR OUTPUT DECISION BOUNDARY OUTPUT DISPLAY Figure 29. A heuristic physical model of a deferred-decision problem. 77

11 P(mean= -u, at L)= 2 1 1 +e Similarly, L 1 e P(mean = -u, at -L) 2 1 L 1 +e P(mean = u, at -L) 1 L 1 +e Hence, L L EUtoeL L L Le -1 Ltanh(2)L (3.46) E[t0-L] = u L- u tanh (3.46) Q.E.D. Lemma 3: If the expected time to go from 0 to ~L is given by t(L), then the expected time to go from L1 to ~L2, where L1 < L2, is t(L2) - t(L1). Proof: Let E[t 0 - ~L2] = t(L2). Consider the following routes from 0 to +L2 (1) 0 - +L1 - +L2 (2) 0 - +L1 - -L2 (3) 0 - -L1 -+L2 (4) 0 - -L1 — L2 The expected value for the first leg of these routes is t(L1) so long as it is a first passage and we consider a path from 0 to -L1, and then to +L1 as a path from 0 to -L1 only. Similarly, any route from 0 to -L1 which reaches +L1 first is a route to +L1 and not -L1. The time for the second leg of routes 1 and 2 together and 3 and 4 together, are equal and are the desired time. Hence E[tI -L1 - ~L2] + E[tl +L1 - ~L2] t(L2) = t(L1) + 2 Since symmetry of motion means, E[t +L1 - ~L2] = E[t -L1 - ~L2] we have 78

E[t L1 - ~L2] = t(L2) - t(L1) (3.47) Q.E.D. Theorem 3: The asymptotic decision boundary for an optimum sequential procedure with symmetric losses, mirror symmetric observation statistics, independent observations, and continuous observation, satisfies the equation1 A* + sinh A* Wu (3.48) cc oc 2c where W = loss due to an erroneous decision. u = AI(kn[f(y)] ) for a single observation. c = cost of a single observation. 1 Proof: The probability of error corresponding to L = ~A is. By lemma 2 and lemma 3, 1 +e the average duration of observation is n = t(A) - t(L) where t(x) = x/u tanh (x/2) Thus the risk for the continue region, with boundaries A > 0 and initial log-odds-ratio of L is G(L; A)= W + [A tanh (A/2) - L tanh (L/2)] (3.49) 1 +e We note that W G(L; L) risk for immediate 1 + e termination. Thus for ILI < A, G(L, A) < G(L, L) and one continues the observation. To minimize G(L; A) we note that d -We c 2Ae A e - G(L;A)= -we + 2e + e - dA e A)2 u A)2 A (1~e ) (1+e ) e +1 1In our discussion the * refers to continuous observation and the "sub co" refers to the available D being unbounded. 79

WeA 2 (A + sinh A) - 1 (3.50) 2c The expression 2uW (a + sinh A) - 1 runs a course of <0 0, 0, > O so there is a single relative minimum for G(L; A) at A = A* where 00 A* + sinh A* uW (3.51) oo o 2c Q.E.D. Corollary: For normal observations statistics d Wd u =and A* + sinh A* = (3.52) 2 cc 00 4c Observe that A* is chosen at a relative minimum of the risk function. For A near A* oo o0 we can write aa ~(A- A- ) G(L; A) = G(L; A* ) + a G(L; A* ) 1 +. (3.53) o00 This can be useful in finding the sensitivity of G(L; A) to an erroneous choice of A* 0 Theorem 3 can be used to obtain an analytic expression for the risk function under the conditions, of course, stated there. This expression is used in the proof of Theorem 3 and is G(L; A) + u Atanh -2 L tanh (3.54) The first term of G(L; a) represents the risk due to terminal decision errors. The second term is due to the observation cost. Equation 3.54 is the analytic expression for the asymptotic risk function for continuous observation, mirror symmetric observation statistics, independent observations, and symmetric losses. For normal sampling the risk function reduces to that given by Eq. 3.55. G(L;A) W 2cA +d tanh - L tanh (3.55) 1+e This expression will be used in a comparison of the determination of A obtained from the computer algorithm. See Section 4.2. The analysis of the continuous observation case given here is essentially static in nature. We have assumed that the available D is unbounded. This allows one to use a method of solution 80

which cannot be used in problems where the available D is finite. The utility of these asymptotic solutions is that they furnish bounds on the optimum nonstatic problem and they serve to establish, in a practical sense, what available D constitutes an "unbounded available D." That is, a D of 10 is for many practical purposes, equivalent to a D of infinity. The detailed analysis and numerical examples have assumed certain symmetries. Three symmetries are involved. The risks WM and WFA are assumed equal. The observation statistics are normal and hence the distributions of fn[f(y)] are mirror images of each other about fn[f(y)] = 0, and the resultant optimum decision boundaries (in L) are symmetric, i.e., mirror images of each other about L = 0. For nonsequential observation-decision procedures the inequality of risk poses no special problem, and is equivalent, boundary-wise, to a shift in the initial L (as a function of D). Specifically, the boundary between the two terminal decision points is at L = AO, where o= Cn(WFA/WM) The resultant performance probabilities are related to the probabilities of L + kn[f(y)] exceeding A or not, which, of course, is equivalent to fn[f(y)] exceeding (A~ - L) or not. One might conjecture that in sequential procedures with mirror symmetry on the fn[f(y)] distributions, one would obtain mirror symmetric optimum boundaries about L = A. Theorem 4 shows this does not occur by showing that the conjecture fails for n = 1. Theorem 4: For nondegenerate sequential observations, the optimum decision boundary points Fl and A1 are equidistant from A if and only if A = 0. Proof: At n = 1, the risk function based on observing is We w LN W G1(L) 1 eL {nL(] <A -LISN' + L P fn[1(y)] > A - LIN + C (3.56) l+e I+e at A1, the terminal decision is "yes," or "A," and the terminal risk is WFA FA T(A1 WA (3.57) 1 +e at rl, the terminal decision is "no," or "B," and the terminal risk is 1 rl10AO WMe WFAe T(rl) = r = (3.58) 1 +e 1+e 81

At the boundaries, G and T are equal. Setting G1(A1) = T(A1) yields A -A WFA W FA PeISN] A1 A1~ P[1 n [(Y)] <A - AISN] 1 +e 1+e WFA + FA P[in[Q(y)] > A - A N] +C (3.59) 1 +e This can be written as C = i (t - e A1 1 WFA A1 0 +- e ______n_ Fr/ i [An~e(y)] - A S- )A (3.59) Similarly, from G1(rl) = T(rF1) we have rl-A rF-A o o AFA rF P[ n[Q(y)] >A - r ISI r l r0 1 +e 1+e + r P[ n[(y)] > A - r IN] + C (3.60) Rewriting we have e r (I - e P[ ] > IN -P n[f(y)] > - rl - n[(y) ] < A - rN WFA r1 0 L < 1 + e (3.60) Mirror symmetry on the fn[f(y)]distribution means that Equation 3.60. can be written e1-o -O A N eC en(y)] < r1 - s - epO Pn[(y)] > r1A SN - Pn[() > A IN] (3.61) FA e 1 +e Comparing 3.59 and 3.61, it is obvious that for A = 0, if A is a solution for Equation 3.59, r1 = -A is a solution for 3.61. Conversely, if r1 and A1 are symmetric about Ao, i.e., A0 - r1 = A1 - Ao then the bracketed qualities in these equations are equal. In order that both hold simultaneously the multipliers of the -conditional probability in SN must be equal 82

e +e 1 + e (3.62) From the assumed symmetry, AO =.5(A1 + r), so.5A -5r\ A e +e =+e 2 cosh (.5r1) = 2 cosh (.5A1) (3.63) Thus if A1 and Fl are symmetric about A0, either A1 = Fl and no observation occurs or A = -rF and A =0. 1 o Q.E.D. 3.4 DEFERRED-DECISION, DISCRETE-OBSERVATION CASE For any specific set of parameters the discrete-observation,deferred-decision procedure is solvable by use of the computing algorithm given in Appendix I. Using the algorithm it is possible to find the deferred-decision boundaries and the risk functions assuming normal observations, stationarity, and symmetric costs. Assuming a continuous observation the deferred-decision procedure can be found analytically. This analytic formulation also assumes an unbounded available D. We would like to answer the question of how the discrete observation problem and the continuous observation problem relate to each other for large available D. We would like to present a nonrigorous argument that admittedly has some weak parts, but which has helped us materially in understanding optimum decision boundaries. The mathematical formulation of all the observation-decision procedures we have discussed is from the Bayesian point of view. We base all our logical construction on utility functions. In particular we are interested in minimizing the expected loss of the observation-decision procedure. Whether we have a discrete observation or a continuous observation it is clear that the balance between observation cost and improved performance is the same for a given set of parameters. This implies that the L value where we terminate the observation should be the same whether we have a discrete observation case or a continuous observation case. In other words, the mathematical formulation specifies a certain balance between observation cost and terminal decision error probability in terms of average loss. In order for this balance to be realized the decision boundary is adjusted accordingly so that the terminate L value is the same for the discrete and continuous cases. Consider the decision boundary that results for a continuous deferred-decision problem. We call this decision boundary the standard form. What effect does the discrete nature of the 83

observation have on this standard form for the decision boundary assuming all other parameters are equal? The discreteness of the observation means mathematically that the mean value of the logarithm of the likelihood ratios in noise and in signal-plus-noise are finite, i.e., K(eu[f(y)]JN) and z(en[f(y) 'SN) are finite. We expect as the "mean motions," /i(fn[k(y)] | SN) and j(fQn[f(y)] IN), become small,the results of the discrete case would approach those of the continuous observation case. By the arguments presented previously we expect that all deferred-decision procedures W with the same - ratio will, on the average, possess the same terminate L value. If the observac tion is continuous we expect to be able to obtain this terminate L value exactly by placing our decision boundary at the terminate L value. However, if we have a discrete observation then the discreteness of the observation causes the observation procedure not to terminate on the decision boundary itself but rather on a L value greater than the decision boundary. In order to terminate on the same L value as for a continuous observation case we have to place the decision boundary inside the standard form. The amount inside depends on the "mean motions," i.e., the discreteness of the observation. Consider the state of an observation-decision procedure being just inside, say, the upper decision boundary. Assume the condition is signal-plus-noise. On the average, the next observation will drive the L value of the previous stage over the upper boundary. The average amount that the L variate exceeds the decision boundary we denote as ~, "the average excess over." The average excess over the decision boundary is a function of the mean motion only. It depends on the coarseness of the observation. The average excess over is the connecting link between the analysis of the continuous and the numerical results of the discrete case. The logical extension of these ideas implies the following relationship. A(Do, d, W/c) = S(Do, W/c) - (d) (3.64) A(Do, d, W/c) is the actual decision boundary, S(Do, W/c) is the standard form decision boundary 5(d) is the average excess over. This heuristic explanation of the connection between a continuous and discrete observation can be "experimentally verified" using a Monte Carlo technique on a high speed digital computer. The method is to sample randomly a normal distribution with a known mean and standard deviation. The sampled values are then added until they exceed a preset cut level. The amount 84

of excess over the cut level is directly related to the average excess over the boundary. The precise definition of the average excess over the boundary is mt)) 5(d) nEet)) where A is the actual decision boundary, Lt is the terminate L value. From our previous discussion the average excess over should be a function of the mean motion only. For normal observations jpuln[f(y)] ISN)-I= (n[f(y)] I=. (d) should be independent of the initial log-odds-ratio, L, and the value of the decision boundary, A. Computer results verify these conclusions. The results of the Monte Carlo experiment are summarized in Figures 30 and 31. Figure 30 is a plot of 5(d) vs. Vd for different initial L values. This graph shows that 5(d) is a function of d only and does not depend on the initial L value (provided the initial L value is a standard deviation or more removed from the decision boundary). This graph assumes a constant decision boundary as a function of the available D. Figure 31 shows the same relation between S(Do, W/c) and A(Do, d, W/c) holds true for the nonconstant decision boundary, i.e., the deferred decision boundary for small available D. In Figure 30 the assumption is that the truncation of observation procedure does not affect the decision boundaries. In Figure 31 the deferred decision boundaries for a d =.25, W/c = 30, and n = 14 were read into the computer. The Monte Carlo simulation was then max run. The results indicate that 5(d) for the static boundaries is approximately the same as for the changing boundaries for the same mean motions. Equation 3.64 can be verified in another way. Consider two different mean motions for the same W/c ratio. Then we can write d D 2 An (1 n Ad = 2n(dl)- (d2 (3.65) max/ man, In Equation 3.65 we can determine all the quantities numerically by use of a computer. The results serve as a further verification of Equation 3.64. The average excess over the boundary is the connecting link between the analytical formulating of the continuous observation case and the numerical results of the discrete observation case. In practice, we are generally interested in the discrete observation case. Thus 5(d) allows us to use the analytic formulation of the continuous case to determine the decision boundaries for the discrete without resorting to a numerical solution. 85

10 5 z 0 z 1.0 O0 en Lo z.0 O 0 c, (n. cr.05 I 0 x INITIAL L VALUE LO= A-d Lo-= -2d,/-3d.01L.01 I I I I I I I I I I I I I I I I I I I I I I I I I 1.05.1.5 1.0 5 10 I d2 — s Figure 30. The average excess over the decision boundary for constant boundaries in L as a function of the available quality for four initial L values. 86

0 0 0 2.0 - 0 AVERAGE TERMINAL VALUE T ~~~~~o I~ a 1.0 - DEFERRED DECISION BOUNDARY-" 2.5 -"-O 0 INITIAL L VALUE o Lo= 1.1 O O 0 cr' C) -.50 / 0 o,5 -I. I I I I I I. 14 12 10 8 6 4 2 0 STAGE NUMBER,n Figure 31. The average excess over the deferred decision boundary for two initial L values. Also shown is the deferred decision boundary for W/c = 30., A =2 O., d =.25 up to an including n 14. w~~~~~~~ ~0max O ~~~~~~max 87

4 NUMERICAL RESULTS AND COMPARISON OF OBSERVATION-DECISION PROCEDURES 4.1 NUMERICAL RESULTS FOR DEFERRED-DECISION PROCEDURES The value of (W/c) for a deferred-decision procedure is the parameter by which we compare different deferred-decision problems. For the numerical solutions we assume a stationary, symmetric, and independent observations. Using W/c as a parameter numerical results were obtained for a limited number of deferred-decision problems. The numerical solutions were obtained by using a high speed digital computer programmed to solve the iterative equations derived in Section 3.1. Let us review the basic operational equations used to find the deferred-decision boundaries, the average number of observations, and the ROC. The operational equations used to find the decision boundaries are basically the iterative equations needed to find the risk functions. At any stage of decision, with n possible deferrals remaining, and a future procedure known it is possible to compute the risk function, Fk(L). Fk(L) = T(L) L< rk or L A, Lk o k (4.1) = f FI[L+fn[e(y)];(y)dy+c, F k<L<k< k-1 i ] ' k k (rk,Ak) are the decision points in L at stage n =k. The optimum decision boundary is found by solving for the intersection of Fk(L) and T(L). To evaluate the performance of a decision procedure the average number of observations and ROC functions are used. The measure that leads to a detectability measure is the probability of a terminal "A" decision, given that with n deferrals possible, log-odds-ratio is Ln The specific functions of L which describes the ROC at stage k we designate as yk(L) and Xk(L). Define: Yk(L) = P(terminal decision is "A" ILk=L, condition is SN) xk(L) = P(terminal decision is "A" ILk=L, condition is N) The basic iterative equations for the ROC are Yk(L) = L< rk (4.2) = i Yk-1[L + Pn[f(y) f(ySN) dy, k < L< Ak 88

(Equation 4.2 continued.) = 1, L > Ak (4.2) xk(L) = 0, L< rk = fxkl [L + en[Q(y)l f(y IN)dy, F k L<Ak (4 3) =1, L> Ak For the average number of observations we have the following operational equations. Yk(LISN) = average number of observations under the condition SN, given that there are k deferrals possible and the a priori log odds ratio is Lk. -k(LIN) = average number of observations under the condition N, given that there are k deferrals possible and the a priori log-odds-ratio is Lk. Thus we have Yk(LISN) =0, L<rk or L >k (4.4) = 1 + ifk 1[L+tnL[(y)ISN] f(yISN)dy, rk < L< k ).k(LIN) =0, L<kor L>Ak (4.5) 1 + f k1 L + n [(y)IIN1 f(yIN)dy, rk< L< < Ak The computing algorithms which solve these iterative equations for the assumptions stated previously are given in Appendix B. The risk functions and optimum decision boundaries for deferred-decision procedures for two W/c ratios are shown in Figures 32 through 38. These were obtained by use of the computer program given in Appendix A. Figures 32 and 33 are plots of the deferred-decision normalized risk functions for W/c = 30, A = 0, and d = 1.0,.25. The normalization is to set the loss due to errors equal to one, i.e., 0 W = 1. The risk curves for d = 1.0 and d =.25 are only slightly different if one examines the risk function for the same D in the continue region one finds that the risk functions for d = 1.0 are slightly greater than for d =.25. This, of course, is due to the greater quantization in D. Or in other words, the risk for the procedure in which one can decide more often whether to make a terminal decision or not is clearly no greater than the risk for the procedure where one is not able to make this decision as often. 89

.5 T(L) a: W/c= 30.0, Ao=0,d,25 ao~~~ /1~~~~~ 0=~D-.25.4 n= 4D HL aX 0 w N -.3 -400.0 -2.0 -2.0 0 1.0.=4 3.0 L.Figure 32. The normalized expected risk functions for a deferred-decision procedure as a function of L with parameters W/c = 30., AO = 0., d =.25 and D =.25.,.50, 1.0, 2.0, 4.0, 8.0, and 10.0. 4.0

.5 - W/C=30,o= O, d=1.0 W n=D w.4A a. w D=I N.2.1D -4.0 -3.0 -2.0 -1.0 0 1.0 2.0 3.0 L -- Figure 33. The normalized expected risk functions for a deferred-decision procedure as a function of L with parameters W/c = 30., A0 = O., d = 1.0., and D = 1.0., 2.0, 4.0, 8.0, and 10.0. 0 L.0

X O3_ m: T(L) W/c I00, IAoO, d =.25 0 W D:.25 n=4D W w a..50 X W w o ~~~~~~~~~~~~~~~~~~D=I N O~~~~~~~~~~~~~~: z D=4.2.100~~~ ~D=10 -4.0 -3.0 -2.0 -1.0 0 1.0 2.0 3.0 LFigure 34. The normalized expected risk functions for a deferred-decision procedure as a function of L with parameters W/c = 100 A = 0., d =.25, and D =.25,~50, 1.0, 2.0, 4.0, 8.0, and 10.0. 0 4.0

Y.5 c) a.4 0 w J ICD cr.3 Z T(L) W/c = 100, A0O, d 1.O n=D.1 o0 I I -4.0 -3.0 -2.0 -1.0 o 1.0 - 4.0 L Figure 35. The normalized expected risk functions for a deferred-decision procedure as A0 = O., d = 1.0, and D = 1.0, 2.0, 4.0, 8.0, and 10.0. 0 a function of L with parameters W/c = 100.,

W/c = 100, AO=O d=.25 3 2 I -Jo c4 -I -2 -3 d= I 10 9 8 7 6 5 4 D- nd Figure 36. The deferred-decision boundaries as a function of the available quality, D = n 3 2 1 0 d for W/c = 100., A0 = O., and d =.25 and 1.0. max

W/c = 30,Ao=o I. t Jo0 C, -2. -- 10 9 8 7 6 5 4 3 2 d== nd 01 Figure 37. The deferred-decision boundaries as a function of the available quli D fond d =.04, and 1.0. The d =.04 is an approximnation (see text). 0 5,

5.0 - d =.25 L 0 -I.0 -10 9 8 7 6 5 4 3 2 D=nd Figure 38. The deferred-decision boundaries as a function of the available quality, D = ilnax d for W c = 30., 100., and 500. with d =.25. I

Figures 34 and 35 are plots of the normalized risk functions for W/c = 100, OA = 0, and d = 1.0,.25. A comparison with W/c = 30 and the same d indicates how the smaller cost of a single observation affects the decision boundary. For W normalized to unity a W/c ratio of 30 is a larger unit cost than a W/c ratio of 100. The effect of this smaller single observation cost is clearly evident. The risk, at the same D, is smaller. Intuitively, this is what we expect. The comparison for different d's at a W/c ratio of 100 is as before for a W/c ratio of 30. The risk for the larger unit d is slightly higher in the continue region. Other characteristics of the deferred-decision risk functions are evident from these graphs. Notice that the intersection of the various stage risk function, Gk(L), with T(L) intersect each other at a very shallow angle. This means that for any given stage of decision the specification of the decision boundary points are not critical. The risk near the boundary is a smooth function of the log-odds-ratio, L. From the risk function we obtain the decision boundary points by solving Gk(L) = T(L) for k = 1,2,3,.... The decision boundary, which is really a set of discrete points, is plotted as a function of the available D with the available D increasing to the left. It is often helpful to interpret the same axis, as a time axis increasing to the right. However, it should be noted that the observation-decision process does not necessarily have to run uniformly in time. The deferred-decision boundaries in Figure 36 are shown for a W/c = 100, OA = 0, and a d = 1.0 and.25. For purposes of discussion consider W normalized to one. Thus the cost per sample for W/c:= 100 is.01 d. This graph of the decision boundaries shows the effect of the quantization in d we discussed in Section 3.4 on the average excess over the boundary. The standard form for the decision boundary is the outer bound on the discrete-observation deferred-decision problem. As we decrease the value of d we expect to approach the standard form. This is shown in Figure 36. The boundary for a d =.25 is outside that of the boundary for d = 1. The separation between the two decision boundaries is approximately constant. This, of course, follows from the fact that the average excess over the boundary, t (d), is a function only of d. Figure 37 depicts the deferred-decision boundaries for a W/c = 30, A = 0, and d=1,.25, and.04. The decision boundary labeled d =.04 was not obtained using normal observation e statistics, Instead a binomial distribution was used to simulate the normal deferred-decision problem. The probabilities of the binomial problem were picked to simulate a W/c = 30 for the normal problem. This simulation is discussed further in Section 4.3. Note again the characteristics discussed previously. The boundaries "move out" in L as d becomes smaller and the separation between decision boundaries for different d's remains fairly constant for all L. 97

In Figure 38 we plot the decision boundaries for d =.25 with W/c = 500, 100, and 30. We note that as W/c increases the decision boundaries "move out" in L. This can be viewed in two ways. Consider first that our losses, W, are normalized to one. Then larger and larger W/c ratios imply a smaller single observation cost. Thus as we increase our single observation cost we expect our boundaries to "move in" in L toward L = 0. Alternately, suppose we normalize the single observation cost c to unity. Increasing W/c ratios signify increasing error losses. Increasing error losses indicate the necessity for better quality decisions which are accomplished by "moving out" the decision boundaries in L. For normal observation statistics an increase in available D can be related directly to the output signal-to-noise ratio of the receiver. A doubling of available D is an increase of 3 db in (S/N)o. (See Section 1.4.) The larger the available D, the larger (S/N)o provided the observer uses the entire D available. In a sequential process the entire available D is not always used. Instead one balances the greater increase in errors due to erroneous terminal decisions against a savings in observation cost. The (S/N)o actually used is equal to D for a normal ROC. (The detection procedure assumes the condition, either SN or N, is stationary throughout the observation.) Physically, an available D = nmax d of 4, for a normal ROC, is a (S/N)o of 6 db. Thus the plots of the decision boundary could be read as the decision boundary plotted against the integrated (S/N)o in db, again, strictly speaking, only for a normal ROC. Referring to the risk function plots consider the shape of the risk function as the available D becomes larger. The risk curves tend to flatten out, as the available D increases. When the decision boundary also flattens out the risk due to terminals errors will be fairly constant as a function of L. The bow in the risk curve is thus mainly due to the cost of the average number of observations. Figures 39 and 40 are plots of the conditional average number of observations for W/c = 30, A = 0, and d = 1. For symmetric losses and mirror symmetric observation statistics the 0 conditional average number of observations, 7k(L/SN) and k(L/N), are mirror symmetric about L = 0. Figure 41 is a plot of the average number of observations, i.e., L eL 1 - k(L) =+e Yk(LISN) + L k(LIN) (3.63) Vk(L)~l 1+ek +e The ROC for this case is given in Figure 42 on normal-normal paper. The use of the average number of observations and the ROC can be viewed as an alternate way of presenting the risk associated with a deferred-decision procedure. The ROC represents errors due to 98

4.0 t W/c =30Q, o =0, d = 1.0 La. o z Cf z 0.-. Co Z Z 0 03 2 LLi (3 4 n" I.U 3.0 t 2.0 4 n = I 1.0 0 -2 -I 0 2 L - Figure 39. The conditional average number of observations for a deferred-decision procedure for the condition SN as a function of L with W/c = 30., A0= 0., and d = 1.0. 99

00I '0']:= P P' '*0 V '0 = 3/A qjI!aM o uoi ounJ O sl N uoij -Ipuooo at; oj aannpao~ood uoisopapp-paaJap up.OJ SISsuoBAasqo jo JOaq quuiu OB.19AV WUOW!PU0 o E) OL 0o7 axn.iLj ~2 IO~ 1~0l-;f ' "t.....I 'I ~~~ ' ~0 0.1 I=u rl G'3 M m In z cm 0 -0'2 ~=u C,, 0 -I z Ur= z o g=u u~ m 1:> r 9=u 0 Z m L 0'1 =P '0=~V' 0~ = /M ' 0 'f

4.04 W/c =30, Ao=O,d = 1.0 n=6 z 0 n a0 LL. 0 z (fL ui 3.0+ 2.0+ n=2 n = I 1.0 - 0 -I 0 I 2 L Figure 41. The average number of observations for a deferred-decision procedure as a function of L with W/c = 30., A0=:0., and d = 1.0. 101

99 98 97 96 95 90- ' 80 70w I~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ / 3:50.0 z 60 50 CL 40 30 20 10 5 4 3 2 23 4 5 10 20 30 40 50 60 70 80 90 9596 97 P (/N) --- Figure 42. The ROC curve for a deferred-decision procedure for the signal 6 db below the expected size signal. The expected signal quality is d = 1 and the parameters are W/c = 30. and AO = 0. )8 99 102

99 98 97 - 96 95 - 90 - 80 70 60 z U) -. 50 w _c ='=30.0, o= 0 DESIGN d=I.0 40 RECEIVED d=.25 30 - 20 - 10 5 4 3 2 2 3 4 5 10 20 30 40 50 60 70 80 90 95 9697 P ("AN) —.Figure 43. The ROC curve for a deferred-decision procedure on normal coordinates with parameters W/c= 30. A = 0., andd= 1.0. 103

4.0 C,) z C, o0 LLI CD Z w cr 3.0 I n = 8 W/c=30, 1o=0 DESIGN d = 1.0 RECEIVED d=.25 j 2.0 I n= 2 n=l 1.0 0 -I 0 I 2 L - Figure 44. The average number of observations for the signal 6 db below the expected size signal as a function of L. The expected signal quality is d = 1 with W/c = 30. and A0 = 0. 104

wrong terminal decisions. If we multiply the average occurrence of error by the loss associated with these errors we obtain the risk due to errors. In like manner if we multiply the average number of observations by the cost of a single observation we obtain the cost due to the observation. The sum of these two costs or risks is just the total risk we obtain by taking expected values as discussed previously (see Section 3.1). The ROC and average number of observations are useful in that they break up the total risk into its component parts; the part due to terminal, decision error and the part due to observation cost. The evaluation of any decision process can thus be determined by finding the ROC and the average number of observations for the particular procedure in question and comparing these ROC's and average number of observations with the optimum procedure. For example, suppose we design our receiver to work on decision boundaries for W/c = 30 and a d = 1. The ROC and average number of observations are given in Figures 41 and 42 for the optimum receiver. However, suppose instead of receiving a d = 1 on each single observation we received only a d =.25. The performance for this "6 dbbelow expected signal" (one quarter in power), is given by the ROC in Figure 43 and yTk(L) function in Figure 44. It is possible to determine the total risk as explained before. By examination of the ROC and k(L) functions we can evaluate the performance of this receiver compared to a receiver designed for a d =.25 and W/c = 30. The various risk functions, optimum boundary plots, ROC functions, and average number of observations are presented to exhibit what the "typical" properties are of the various functions discussed. The general features discussed here are representative of the same functions with different parameters. 4.2 APPROXIMATE METHODS FOR THE DETERMINATION OF THE ASYMPTOTIC DFCTSION BOUNDARY, A A.Wald (Reference 3) was able to develop formulae for the determination of the asymptotic decision boundary, A0, given that ip [n ( (y)) 3 was small and the error probabilities, P("B" ISN) and P("A" IN). Basically Wald chose a point of operation on the ROC and from this determined A by minimization of the average number of observations. His point of view was to specify the error probabilities P("B" SN) and P("A" IN) and then obtain uA, irrespective of error costs, observation costs, and a priori probabilities, by minimizing the average number of observations. The point of view taken in this paper is Bayesian. Our procedure is to assume knowledge of the error costs, the observation costs, and a priori probabilities and from these determine 105

the ROC, the average number of observations, and decision points which minimize the expected risk. In a physical problem the determination of the error costs, the observation costs, and the a priori probabilities must be obtained by physical measurements, prior experience of others familiar with the problem, etc. For the same physical problem the costs and probabilities assigned by one designer may differ from those of a different designer. This is perfectly acceptable from the Bayesian point of view. Each receiver designer is trying to minimize his expected risk based on his own opinion of what the costs and probabilities are. Although for some readers this may be an unacceptable point of view, for analysis to be applicable to the physical world it is a reasonable philosophy. This viewpoint is basically different from that of Wald. It is the point of view taken by an "internal" operator who is trying to maximize a utility function. Internal measures are subjective and are computed by averaging external measures using subjective probabilities. The words "external" and "internal" refer to the evaluation that are used in the decision process. An internal evaluation is a comparison with the procedure which maximizes some utility function since maximizing a utility function is, by definition, optimum internally. An external evaluation is an evaluation, irrespective of costs and subjective measures, determining only the error probabilities, P("B" ISN) and P("A" IN). In practice, we operate internally and evaluate externally. The first approximate method for determination of Ax has been presented in Theorem 3. This is given by A* + sinh* uW (3.48) co oo 2c For normal observation statistics u = d/2 and Equation 3.48 becomes A * + sinh * = Wd (3.52) 0o co 4c The assumptions inherent in Equation 3.52 are continuous observation, symmetric losses, independent observations, stationarity, and unbounded available D. An alternate derivation of the above formula can be made using Wald's approximations for the average number of observations, or in Wald's terminology, the average sample number, ASN. This derivation also obtains the general form for the ROC for any unbounded sequential procedure. The expected loss of a sequential decision process with the assumptions of Equation 3.52 can be written as 106

E[ loss] = P("B" ISN) + P("A" N) W Ll+e 1+eL +LeL Y(L ISN) + L (LIN) c (4.6) 1+eL 1+e L where: P("B" ISN) is the probability of a miss P("A" IN) is the probability of a false alarm W is the loss due to an error c is the cost of a single observation. The two conditional average number of observations, j7(LISN) and;(LIN), can be written approximately, (Reference 3), as y(L ISN) I= I P(B I1SN)Bn 1ISN)]Qn P(A (4.7) d -i J-PQTAT' IP("N)j P [ 1 -P N 2 T I_-P("_A__IN) PQ'A" ) INN y(LIN) = [ 1-P( A" IN)] en [ 1P("A BN) + n (4.8) d P(?TBBt ISN) ~ 1-P("TBT' I) Let us simplify the notation by defining a and '3 as a = P("A" IN) (4.9) 3= P("B"T SN) (4.10) Using the above substitutions the expression for expected loss is E[ loss] = W +- fn + (1-3)(n )] 1 +e + -LWa +[(1-a) in~ —) +a log (4.11) l +e To operate in an optimum manner internally, we wish to minimize the expected loss. Therefore differentiate Equation 4.11 with respect to a and f and set the resulting two equations equal to zero. aac L L d 1+e l+e l+e aE e e 2c (- ( 13 1- d 1 a a — LW e L nI - - nn.. = 0o (4.13) 1+e 1 +e L 1 +e 107

Let a (4.14) a and b 1-a (4.15) Equations 4.12 and 4.13 can be written Wd Wd = Qn a - n b + e -a (4.16) Wd - en a - n b + e+L(a-b) (4.17) 2c L Subtracting the above equations and eliminating the e terms we obtain Equation 4.18. Wd =n() + a (4.18) 2c b b a For a single valued solution, since Wd/2c is fixed, the left-hand side of Equation 4.18 is a constant. This implies that a/b is constant. Let a/b = a2 (4.19) 0 Then it follows from the definitions of a and b that a = (4.20) Simplifying the above we obtain (a 2-1)a3+a+ = 1 (4.21) Equation 4.21 is the equation for the ROC for the sequential decision procedure in which the available D is unbounded, i.e., a sequential procedure with constant decision boundaries. a is the solution of Equation 4.22. Sc=fna2)+a --- (4.22) 2c O o a Since o L aa eL o (4.23) Then.n a = n() +L (4.24) O 108

But by definition the left-hand side of Equation 4.24 is A. Thus, substituting in Equation 4.22 co we have A -A Wd cc cc = 2A + e - e (4.25) 2c 00 Simplifying, we obtain Wd 4- =A + sinh(A (3.52) Summarizing, if one is given symmetric values and costs and independent normal observations of quality d, then to run a sequential observation-decision procedure, one derives (by assuming small spill over) the asymptotic boundary values from A + sinhA = W-4 (3.52) cc co 4c The asympotic ROC for this sequential procedure is approximated by the hyperbolas [(eci) a~+~l [(e i) +1i] =P1 e (4.26) For initial logarithmic odds L, and ILI < A, the operating point on the ROC is O~ oO P("A" IN) = a = e -1 (4.27) 2A A -L o0 oc P("A" ISN) = 1- e= e (4.28) e c-1 If ILI > A then no observations should be taken and a terminal decision is made immediately. 00 The total probability of an error (for ILI < oo) is L e 1 1 1 P(error) = L A A+ (4.29) l+e 1+e e +1 e +1 Notice that this expression is independent of L. The average number of observations is - eL 1 2 e -1 (4.30) y(L) =-Ly(LISN) + -LY(LIN) - _ -+e l+e oo+1 e +1 e +1 109

(Equation 4.30 continued.) =- A otanh (-) - L tanh ( The expected loss or risk is found by adding Equations 4.29 and 4.30 multiplied by the appropriate-cost. This risk is given in Equation 4.31. W e 1 -1 R(L) +1 = L ]A +2c R) Ad + a L L+l (4.31) e C0+1 e +1 This risk is extremely "flat" as a function of L. For example assume W/c = 300. By Equation 4.22 we find that A = 140 which in turn implies that A = en A = 4.925. Thus, the risk is given by Equation 4.31 as F150 139 e L_ 1 R(L) = L-4 + (4.925) - - L L 2c e +1 - 5.919 - L e 2c (4.32) eL+1 I 2e 2c T(L) = 751- L (4.33) 1+e i Equations 4.32 and 4.33 are plotted in Figure 45. As a further example consider a comparison of the standard form asymptotic boundary, A*, obtained by using Equation 3.55 G(L;A) )W 2 I - Ltanh (3.55) G(L; +eA) + 2c tanh() Ltanh() and a determination of A obtained from the computer algorithm (see Appendix A). G(L;A) 00 given by Equation 3.55 is the risk as a function L for a sequential observation-decision procedure with constant decision boundaries at -Aand ILI < A. Consider the terminal risk function for A> 0. T(L) = W (4.34) l+e The difference between G(L;A) and T(L) we denote as D(L) and is given by D(L) = G(L;A) - T(L) = ( L + 2c L tanh( - L2c Atanh (4.35) 110

.15 U) 05 w.10 I 0 N -i 0 z.05 0 T(L) W — =300 1 AO = 0 CONTINUOUS OBSERVATION CASE -6.0 -5.0 -4.0 -3.0 -2.0 - 1.0 0 1.0 2.0 3.0 4.0 L 5.0 6.0 Figure 45. The Wald sequential procedure expected risk function as a function of L with parameters W/c = 300. and A = 0. for the continuous observation case.

Let W/c = 30. D(L) evaluated for these parameters is the following D(L) = 1 l5+L (eL-) - 3.247j (4.36) e + If D(L) is plotted as a function of L, the graph of Figure 46 is obtained. The L value for which D(L) is zero is A * for this set of parameters. From Figure 46 this occurs for A * = 2.33. 00 00 The average excess over the boundary for a d =.25 is approximately.27 (see Figure 30). Thus the "actual" boundary, A, we expect to obtain from a numerical solution on the computer is A = - 5 (d=.25) = 2.06. The computer results for W/c = 30 and d =.25 give a A of 2.03 which compares favorably with the analytic results obtained by use of Theorem 3 and o(d). The computer algorithm obviously does not obtain A o since the program must be terminated for economic reasons. The value of A obtained from the computer can be improved by using the previous A 's to predict A. One method of doing this is by a sequence-to-sequence transformation (Reference 9). This is a method by which a limit to a slowly convergent sequence may be obtained by assuming a type of convergence. For example, if we assume a logarithmic type of convergence we use a sequence to sequence transformation given by the following formula An -(An-1 )(1n+ B = (4.37) B1,n 2A - -A n n-1 n+1 A value of A = 2.05 is obtained as a possible better value of A. Note that the answer ob00 c0 tained by a sequence to sequence transformation is not necessarily the limit of the sequence in the ordinary sense of the word. 4.3 INNER AND OUTER BOUNDS FOR THE DEFERRED-DECISION BOUNDARY The decision procedure for deferred decision is based on knowing the decision boundary for the specific parameters of interest. We have not been able to obtain an analytic expression for the decision boundary. Our method of solution is to use a computing algorithm to solve the problem iteratively. Although we cannot obtain an analytic expression for the entire deferred-decision boundary we have obtained an analytic expression for the deferred decision boundary for large n and d small. It is also possible to obtain fairly good inner and outer bounds on the deferred-decision boundary. 112

.3 W/c =300, o =0 D(L) AG =2.33 L -- Figure 46. The difference function, D(L), between the Wald sequential expected risk function and the terminal loss function as a functio of L with parameters W/c = 300. and A0 = 0. for the continuous observation case. L.5

The optimum nonsequential risk function and the asymptotic1 sequential risk function constitute upper and lower bounds on the deferred-decision risk function. From the risk function one can easily determine the bounds on A as a function of W/c; + An being the decision points n n at stage n for the deferred-decision procedure. The equations for the decision boundaries as determined by the optimum nonsequential procedure and the Wald sequential procedure have been previously derived. As shown in Section 2.3.1 the intersection of Equation 2.17 and 2.20 serves an inner bound on A n. The Wald sequential approximation gives Equation 3.52 as an outer bound on An This again assumes symmetric losses, normal and independent observations, and stationarity. Figure 47 is a plot of the inner and outer bounds on An as a function of W/c. As an example of how well the risk functions of the optimum sequential procedure and the Wald sequential procedure bound the deferred-decision risk function consider the following parameter values for the observation-decision process. Let W/c = 300, AO = 0, and assume normal observation statistics. For these parameters we obtain the plot given in Figure 48. This is a plot of the two risk functions. The deferred-decision risk functions lie somewhere between the two risk functions plotted in Figure 48. Table II, below breaks the total risk into its two component parts; that due to errors and that due to observation costs. Note that for the Wald sequential procedure the risk due to errors is almost constant as a function of L, as mentioned previously. Optimum Nonsequential Procedure Wald Sequential Procedure Risk Due to- Total Risk Due to Total 1 y(L) Errors Risk L Errors Risk 0 16.0 6.66 22.66 0 9.7 2.14 11.84 ~1.00 12.2 8.25 20.45 ~.94 8.8 2.12 10.92 ~2.33 11.0 7.29 18.29 ~1.99 6.7 2.13 8.80 ~2.80 9.0 7.19 16.19 ~2.20 6.2 2.14 8.34 ~2.94 4.4 2.14 6.54 ~4.60 0.7 2.12 2.84 Table II. The expected risk decomposed into the terminal error loss and the observation cost for the optimum nonsequential procedure and the Wald sequential procedure for W/c = 300. and A0 = 0. l"Asymptotic sequential" procedure means the sequential procedure with constant decision boundaries, i.e., the Wald procedure. 114

7 6 OUTER BOUND AS DETERMINED BY ~~z I A + SINH A () z a- 4 z 0 O -. o 3 -l 0/ C) 2 INNER BOUND AS DETERMIND BY THE OPTIMUM NONSEQUENTIAL PROCEDURE 100 W C Figure 47. The inner and outer bounds for the asymptotic deferred-decision boundary as determined by the optimum nonsequentia procedure and the Wald sequential procedure, respectively, as a function of W/c. 1000 Ll

.15 (n W OPTIMUM NONSEQUENTIAL c300, A0 0 wr N W CONTINUOUS OBSERVATION H ~~~~~~~~~~~~~~~~~~~CASE U W LU a..10 X W.IO 0.05 --6.0 -5.0 -4.0 -3.0 -?_.0 -I.0 0 1.0 2.0 3.0 4.0 5.0 L Figure 48. The normalized exected risk function for the optimum nonsequential procedure and the Wald sequential procedure as function of L with parameters W/c = 300. and A0= 0. for the continuous observation case. 6.0 L

The use of Equation 3.52 as an upper bound on the deferred-decision boundary assumes an unbounded available D. To investigate the asymptotic behavior of the deferred-decision boundaries we would like to allow n = D /d to be "large." The determination of what a large 0 n is dependent upon the results. We would like to know how large n must be to be considered effectively infinity, practically speaking. Economically it is prohibitive to study the asymptotic behavior by use of the computer programs developed for the normal deferred decision problem. However, if one studies a problem in which [A n[ (y)],i.e., the "mean motion," takes on only two possible values (one under each condition) the computing costs are reduced by orders of magnitudes. Physically this problem is that of a symmetric clipper crosscorrelator. We call this the "rapid probe" approximation to the normal deferred-decision problem. Its use is in "probing" the asymptotic behavior of normal deferred-decision procedures economically. The justification for using a distribution other than normal to approximate the asymptotic normal problem has been given by Wald (Reference 3) and Blackwell and Girschick (Reference 11). The formulas developed by Wald and Blackwell and Girschick for the average number of observations have one thing in common. The only quantity that is explicitly connected with the sampled distribution of log-likelihood-ratio is the mean motion, pu [4nf (y)J. Thus it seems reasonable that if one has any type of distribution of fn[f(y)] where the mean motions are small compared to the separation of the boundaries that, by the central limit theorem, for large number of observations, the distribution of Cn[ f(y)] at termination will approximate a normal distribution of fn [(y)]. Thus, for starting a priori odds, L, away from the boundaries one would expect to obtain a good approximation to a normal distribution at termination for small mean motion. For a priori odds close to the boundaries, the justification for using various distributions of Qn [Q (y)] is not as clear. For the rapid probe we have, from Blackwell and Girschick (Reference 11), the solution for the asymptotic boundaries of a binomial distribution. This solution assumes that the mean motions under each condition are equal, as in our case. If y = 1 with probability p in SN and 1-p in N, while y = O with probability 1-p in SN and p in N, then -n[!(y=l)] = n hn [ (y=O) = 2n(p) =- 2n(I) hence 117

/p[en[!(y)]lSN] = p-(l-p) in_ ) /[{n [(y)]ISN] = (2p-1) in (4.38) while / [ln[l(y)]IN] = -(2p-l1) n( p)(4.9) We say the '"equivalent normal quality" is de, found by equating mean log-likelihood-ratios de = 2(2p-1) n(jLp) Using a rapid probe to study the asymptotic behavior of a normal deferred-decision procedure we obtain the asymptotic decision boundary shown in Figure 49 as a function of W/c. The plot of the asymptotic boundary as obtained analytically from Blackwell and Girschick is shown in Figure 49. This serves as a verification for the computer program in the rapid probe approximation. - The computer program used in the rapid probe is given in Appendix C. Although we do not know the deferred-decision boundary analytically we have presented bounds on the boundary. These bounds are summarized in Figure 49 along with two computer determined "asymptotic" boundaries. The computer determined asymptotic boundaries were the boundaries determined at D = 10. 4.4 COMPARISON OF SEQUENTIAL AND NONSEQUENTIAL OBSERVATION-DECISION PROCEDURES We wish to make a comparison between the sequential and nonsequential observation-decision procedures we have discussed in this report. The nonsequential procedures discussed were the fixed observation procedure and the optimum nonsequential observation-decision procedure. The sequential procedures examined were the optimum sequential procedure (deferred decision), and the Wald sequential procedure with an abrupt termination. We have basically four different observation-decision procedures we can compare. There exists many combinations of possible comparisons. We wish to find the most meaningful of these combinations and also a meaningful basis for comparison. The standard approach in the past has been to compare the fixed observation procedure and the Wald sequential procedure. The basis of comparison, in our opinion, was not completely fair to the fixed observer. The comparison was made by assuming first that both the fixed observer and the Wald sequential observer operate at the same ROC point, that is, the 118

7 6 A + SINH A W 5~~~~~~~~~i~~~~~~~- 4c z 4 (/, C 0 2 O * ANALYTICAL RESULTS FOR BINOMIAL OBSERVATION X COMPUTED RESULTS FOR DEFERRED DECISION 10 100 w C Figure 49. The asymptotic decision boundary as determined by the rapid probe simulation, the optimum nonsequential procedure, an the continuous observation analytic solution as a function of L. Also shown are numerical results obtained by use of a digital comput 1000 der;er.

error probabilities for a terminal decision are equal. With this assumption the average number of observations for the two processes are compared. The usual rule of thumb is that the sequential procedure saves about 50 percent in the average number of observations taken. However, if we take the point of view that we will compare the total expected risk of the fixed observation-decision procedure and the Wald sequential procedure we have some interesting results. The most striking result is that the fixed observer does not operate at the same ROC point as the sequential observer. In other words, starting with the same set of parameters W/c, L, and D, and using the Bayesian philosophy of minimizing an expected risk function, the fixed observer and the sequential observer do not have the same error probabilities. If the fixed observer is forced to operate with the same error probabilities as the sequential observer, he is being unduly penalized. An illustration by use of a numerical example will point out the characteristics discussed above of a comparison between sequential and nonsequential observation-decision procedures. Consider the optimum nonsequential and optimum sequential observation-decision procedure for W/c = 30, D = 10, and d =.25. For L = 0 we obtain the figures shown in Table III. The third column in Table III labeled "matched ROC, nonsequential procedure," is the risk if we force the nonsequential observer to work at the same error probabilities as that of the optimum sequential observer. Note that the average number of observations for the nonsequential observer is approximately double that of the optimum sequential observer in this case. Suppose that instead of forcing the nonsequential observer to operate at the same ROC point we allow him to operate in an optimum manner in accordance with observation-decision parameters given, Optimum Sequential Optimum Nonsequential Matched ROC, Procedure Procedure Nonsequential (Deferred Decision) Procedure Normalized risk due to errors.1025.1707.1025 Normalized risk due to obser..1200.1207.2150;7(d) 3.60 3.62 6.45 Normalized Expected risk.2225.2914.3175 ROC Point (.1025,.8975) (.1707,.8293) (.1025,.8975) Table III. The comparison of deferred-decison (optimum sequential procedure) and the optimum nonsequential procedure for W/c = 30., AO = O, d =.25, D = 10, and L = 0. 120

i.e., W/c = 30, D = 10, d =.25, and L = 0. In this situation we obtain the figures presented in column two labeled "optimum nonsequential procedure." These figures represent the various risks or expected losses the nonsequential observer incurs in operating in an optimum manner. If we compare the average number of observations for this procedure and that of the optimum sequential procedure we see that they are approximately equal. The error probabilities, however, for the optimum nonsequential observer are greater, i.e., the optimum procedure for the nonsequential observer to follow is not to operate at the same ROC point but to incur larger error probabilities while keeping the average number of observations about equal to that of the optimum sequential observer. Thus to make a meaningful comparison between sequential and nonsequential observationdecision procedures we will compare the optimum sequential procedure (deferred-decision) and the optimum nonsequential procedure which we discussed in Chapter 2. Obviously a comparison of deferred-decision and a fixed observation procedure is not meaningful since the fixed procedure can be made as poor, in the sense of expected risk, as we wish by choosing the available D large. The general aspects of a comparison between deferred-decision and the optimum nonsequential procedure are readily evident by use of a numerical example. Assume the observation decision procedure parameters are W/c = 30, Ao = 0, d =.25 and L = 0. If we plot the average number of observations and the probability of a correct decision as a function of the available D we obtain Figures 50 and 51. Shown in Figure 50 are the average number of observations for four observation-decision procedures-the fixed, the optimum nonsequential, the Wald sequential with abrupt truncation, and deferred-decision procedure. The same four observation-decision procedures are plotted in Figure 51. The Wald sequential procedure with abrupt truncation used decision boundaries that are the asymptotic boundaries of the deferred decision procedure. The following generalities can be made by referring to Figures 50 and 51. For small available D the error probabilities for all observation-decision procedures are approximately equal. The differences in the expected loss for the various procedures for small available D occurs in the risk due to the average number of observations. However, at large available D just the opposite occurs. The average number of observations for everything except the fixed procedure are approximately equal. The savings for deferred decision occurs because of better decisions, i.e., less terminal decision errors. Thus, we see that one cannot make a simple statement like "you save 25 percent in the average number of observations and 20 percent in better decisions using deferred decision." The savings in expected loss and where it occurs, i.e., in better decisions or fewer observations, depends on the available D. 121

[Figure 50 is a graph: average observed quality (vertical axis) versus available quality D from 0 to 10 (horizontal axis), with curves for the fixed observation procedure, the optimum nonsequential procedure, the truncated Wald procedure with boundary at A = 2.0, and deferred decision; W/c = 30, A0 = 0, d = .25, L = 0.]

Figure 50. The average observed quality plotted against the available quality for a fixed observation procedure, an optimum nonsequential procedure, a truncated Wald sequential procedure, and a deferred-decision procedure with parameters W/c = 30, d = .25, and L = 0.

122

[Figure 51 is a graph: probability of a correct terminal decision from 0.5 to 0.9 (vertical axis) versus available quality D from 0 to 10 (horizontal axis), with curves for the fixed observation procedure, the truncated Wald procedure with boundary at A = 2.0, the optimum nonsequential procedure, and deferred decision; W/c = 30, A0 = 0, d = .25, L = 0.]

Figure 51. The probability of a correct terminal decision plotted against the available quality for a fixed observation procedure, an optimum nonsequential procedure, a truncated Wald sequential procedure, and a deferred-decision procedure with parameters W/c = 30, d = .25, and L = 0.

123

The same graphs shown in Figures 50 and 51 for L ≠ 0 exhibit the same general aspects discussed above.

Referring again to Figures 50 and 51, we notice that the performance of the Wald sequential procedure with abrupt truncation is very close to that of deferred decision. Deferred decision is optimum; the Wald sequential procedure with abrupt truncation will never be better than deferred decision. This can be seen in Figure 52, where the risk of the abrupt truncation procedure is shown for different values of the decision boundary along with the risk for deferred decision. For different available D the optimum abrupt truncation procedure has different decision boundaries. If one could obtain an expression relating the decision boundaries to the available D for the abrupt truncation procedure, one would have a sequential procedure more easily implemented than deferred decision with a performance close to that of deferred decision. Even without this expression the different abrupt truncation procedures are so close to each other that one can come close to the optimum procedure by an educated guess at the value of the decision boundary.

In summary we have the following rules of thumb in comparing sequential and nonsequential observation-decision procedures. For small available D all procedures are practically the same with respect to error; the differences in risk occur because the average number of observations for deferred decision is smaller. For large available D the average numbers of observations for all procedures (except fixed observation, of course) are practically the same; the differences in risk occur because deferred decision makes better decisions.

124

[Figure 52 is a graph: normalized expected risk from about 0.2 to 0.5 (vertical axis) versus D = n_max d (horizontal axis), comparing several truncated Wald procedures with deferred decision.]

Figure 52. A comparison of the expected risk function for various truncated Wald procedures and deferred decision with parameters W/c = 30, d = .25, and L = 0.

5. FUTURE STUDIES, SUMMARY, AND CONCLUSIONS

5.1. FUTURE STUDIES AND GENERALIZATIONS OF PRESENT SOLUTION FOR DEFERRED DECISION

The observation-decision procedure we have examined, called deferred decision, has been shown to be the optimum sequential observation-decision procedure. This observation-decision procedure has implicitly assumed a simple signal hypothesis. In a physical problem this is a simplification of the actual situation. The received signal, in general, will be from a composite signal hypothesis: there will be, at best, slight changes in one or more of the observed parameters, e.g., the phase of the signal might be a random variable. Mathematically this is a change in the probability density functions of one or more of the observed variables with each succeeding observation. This, of course, means that the likelihood ratio of the observation is a function of the observed parameters and the number of observations that have been taken.

The above discussion indicates the course of our future studies. We wish to apply the ideas of deferred decision to the composite signal hypothesis case. This has been completed for the case of a signal known except for phase and a signal known except for amplitude, where the initial opinion of the amplitude is distributed according to

    f(s) = A s^k e^{-cs} e^{-.5 b s^2}                                            (5.1)

This distribution includes the Gaussian, truncated Gaussian, and Rayleigh distributions, and the Pearson Type III, as special cases.

The basic idea which allows one to solve the composite signal problem economically is the following: the distributions of the parameters of which one has only statistical knowledge are closed, i.e., the distribution of the signal parameter after an observation is of the same type as the distribution before the observation was taken. We assume we can use an infinite memory if need be. The fact that the distributions of one or more of the signal parameters are closed allows one to remember only a finite set of variables with the same state of knowledge as an infinite memory allows. We have been able to reduce the dimensionality of the problem without any loss of information.

In the process of making an optimum decision for a composite signal hypothesis, re-evaluations of the probability distribution functions for the signal parameters are made. This continual "updating" of the probability distribution functions is not the primary goal of the observation-decision process. The primary goal is to make optimum decisions about the nature of the physical cause of the input to our receiver. The re-evaluation of the probability distribution functions is sometimes termed "adaptation" or "learning."

126
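The special cases quoted for Eq. 5.1 correspond to particular parameter choices. The identifications below are our own reading of the family, given only as an illustration (all densities on s ≥ 0, with A the normalizing constant):

    f(s) = A\,s^{k}\,e^{-cs}\,e^{-\frac{1}{2}bs^{2}}, \qquad s \ge 0

        k = 0,\ b > 0 :\quad \text{truncated Gaussian},\quad f(s) \propto e^{-cs-\frac{1}{2}bs^{2}}
        k = 1,\ c = 0 :\quad \text{Rayleigh},\quad f(s) \propto s\,e^{-\frac{1}{2}bs^{2}}
        b = 0,\ c > 0 :\quad \text{Pearson Type III (gamma)},\quad f(s) \propto s^{k}e^{-cs}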

5.2. SUMMARY AND CONCLUSIONS OF OBSERVATION-DECISION PROCEDURES

In this report we have examined various observation-decision procedures. The usual observation-decision procedure discussed in the literature, in connection with detection theory, is the fixed observation procedure. In a fixed observation procedure the observer observes for a given quality, after which a terminal decision is made. This procedure is the simplest process for making a terminal decision. However, the expected risk associated with this procedure may be very large in comparison with other types of decision procedures. The general nonsequential observation-decision procedure is a fixed procedure with the observation quality a variable chosen by the observer before the observation begins. This procedure is optimized by choosing the correct quality to observe.

The optimum observation-decision procedure, in the sense of minimum expected risk, is the process called deferred decision. This process includes nonsequential procedures and nonoptimum sequential procedures, e.g., the familiar Wald sequential procedure, as special cases. The quality of the observation in a sequential process is a random variable. The optimization of the procedure is accomplished by determining the correct decision boundary points at which a terminal decision is made.

Although the approach in this report has been from the Bayesian viewpoint, the standard statistical approach may be considered within our present framework. Standard or objective statistical tests are usually nonsequential observation-decision procedures. The length or quality of the observation is determined independently of L, the a priori log-odds-ratio. Thus these tests are, in general, nonoptimum procedures.

The study of optimum nonsequential observation-decision procedures is included in this report for two reasons. First, these procedures are of interest in themselves. In many practical situations they are the only observation-decision procedures that are possible; also, the added complexity of a sequential process may not be warranted. Second, these processes are useful as bounds for deferred decision. The optimum nonsequential procedure is completely analytically determined, whereas deferred decision is not (except for some academic problems). The results of the optimum nonsequential procedure are best summarized by means of Figure 13. This is a plot of the available quality versus the a priori log-odds-ratio with the value of a decision as a parameter. This graph depicts how to choose the observation quality for a given set {L, W, c} so as to optimize the observation-decision procedure.

The greater part of this report has dealt with deferred decision. The method of solution was not restricted to any special distributions on the observed variate or to the assumptions

127

of stationarity and symmetry. These assumptions were made to obtain more specific results. The basis for the method of solution is the so-called "principle of optimality" (Reference 8). The determination of the standard form for the deferred-decision boundary points is found analytically in the asymptotic portion of the decision boundary and assumes a continuous observation. The discreteness of the observation causes an average excess to "spill over" the decision boundary. This average excess over the decision boundary connects the computer results used in determining the decision boundary and the analytic results obtained by assuming a continuous observation in the asymptotic portion of the decision boundary. The same relationship which connects the standard form and the computer results for large available D is also valid for small available D.

The comparisons among the various observation-decision procedures are given in Section 4.4. The result of the comparisons is that one should compare on the basis of the total expected risk. The standard approach of forcing the nonsequential observer to operate with the same error probabilities as the sequential observer gives the sequential observer an unfair advantage. Deferred decision is the optimum procedure. For small available D all procedures are essentially the same with regard to the risk due to errors; the differences in risk occur because of the average number of observations. For large available D the average numbers of observations for all procedures (except the fixed observation procedure) are practically the same; the difference in risk occurs because deferred decision, on the average, makes better decisions.

128

REFERENCES

1. H. F. Dodge and H. G. Romig, "A Method of Sampling Inspection," The Bell System Technical Journal, Vol. 8, pp. 613-631, 1929.

2. Walter Bartky, "Multiple Sampling with Constant Probability," The Annals of Mathematical Statistics, Vol. 14, pp. 363-377, 1943.

3. Abraham Wald, Sequential Analysis, New York, John Wiley and Sons, Inc., Chapman and Hall Ltd., London, 1947.

4. H. H. Goode, Deferred Decision Theory, Cooley Electronics Laboratory Technical Report No. 123, The University of Michigan, Ann Arbor, Michigan, July 1961.

5. T. G. Birdsall and M. P. Ristenbatt, "The ROC: Receiver Operating Characteristic," Internal Memorandum 24, Cooley Electronics Laboratory, The University of Michigan, Ann Arbor, Michigan, April 1957.

6. W. W. Peterson, T. G. Birdsall and W. C. Fox, "The Theory of Signal Detectability," Trans. of the IRE Professional Group on Information Theory, Vol. 4, September 1954. The material in this article may also be found in: W. W. Peterson and T. G. Birdsall, The Theory of Signal Detectability, Cooley Electronics Laboratory Technical Report No. 13, The University of Michigan, Ann Arbor, Michigan, July 1953.

7. Thomas Curry, Radar Scan Theory Using a Bayes Sequential Observer, Ph.D. Thesis, Carnegie Institute of Technology, Pittsburgh 13, Pennsylvania, June 2, 1959.

8. Richard Bellman, Dynamic Programming, Princeton University Press, Princeton, New Jersey, 1957.

9. P. Cota, Applications of Deferred Decision Theory to the Clipper Cross Correlator, Cooley Electronics Laboratory, The University of Michigan, Ann Arbor, Michigan (in preparation).

10. D. Shanks, "An Analogy Between Transients and Mathematical Sequences and Some Nonlinear Sequence-to-Sequence Transforms Suggested by It, Part I," Naval Ordnance Laboratory Memorandum 9994, Naval Ordnance Laboratory, Silver Spring, Maryland, July 1949.

11. D. Blackwell and M. A. Girshick, Theory of Games and Statistical Decisions, New York, John Wiley and Sons, Inc., London, Chapman and Hall, Ltd., 1954.

129

Appendix A

THE COMPUTER PROGRAM FOR THE DETERMINATION OF DEFERRED-DECISION BOUNDARIES

Included in this appendix is the computer program used to obtain the deferred-decision boundary points. The computer programs were all written in an algebraic source language developed at The University of Michigan called the Michigan Algorithm Decoder (MAD). Anyone familiar with any of the various computer source languages should be able to follow the programs included here. A general block diagram of each computer program is presented, followed by a detailed block diagram for those interested in the details of the programs.

The computer program for the deferred-decision boundary points assumes normal observations, stationarity, equal error losses, and independence of observations. The program is written in such a manner as to take advantage of these assumptions and cannot be easily generalized to less restrictive problems. This was done for economic reasons.

The following are notes to help explain the various symbols that are used in the block diagrams.

(1) The A(1) to A(50) are used in all Stieltjes integrations of functions with respect to the normal distribution function. The value of A(k) is given by A(k) = E{x | .02k - .02 < Φ(x) < .02k}.

(2) FO is the "old expected risk function." FNL is the "new expected risk function." The "new expected risk function" is computed from the "old expected risk function" by averaging with respect to the normal distribution function.

(3) LTSN and LTN are the new values of L in SN and N, respectively. The latter are found by taking the old value of L and temporarily assuming the observation had the Rth value (normal, with mean .5D and standard deviation DP). (KSN, KN) and (CSN, CN) are the integer and fractional parts of (LTSN, LTN) used for interpolation in the computation of the risk functions. They are computed once and stored. D1 is a linear subscript which tells the computer where to store these constants (for faster operation).

(4) The computation of the expected risk functions (integrals) is the main part of this program. The two conditional risk functions in SN and N are combined to form G. This risk function is then compared with the terminal risk function to find the decision boundary. For the risk function in N the integrand is called DGN and the sum GN. For the risk function in SN the integrand is called DGSN and the sum GSN.

130
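The iteration described in notes (2) through (4) can be illustrated with a short sketch. The following Python fragment is not the MAD program reproduced in the figures; it is our own simplified, continuous-observation analogue, and the grid limits, variable names, and use of scipy are assumptions made only for illustration. One backward-induction step replaces the old risk function by the new one, averaging over the 50-point discrete approximation to the normal observation and comparing with the terminal risk; the decision boundary is where the two coincide.

    import numpy as np
    from scipy.stats import norm

    # Illustrative parameters matching the report's running example.
    W_over_c = 30.0                  # loss-to-cost ratio
    d = 0.25                         # observation quality per observation
    cost = 1.0 / W_over_c            # normalized cost of one observation
    dp = np.sqrt(d)                  # standard deviation of the log-likelihood increment

    # L grid in .1 steps (the report stores its tables in .1 steps of L).
    L = np.arange(-6.0, 6.0 + 1e-9, 0.1)

    # Terminal risk T(L) for equal (normalized) error losses and threshold at L = 0.
    T = np.minimum(1.0 / (1.0 + np.exp(L)), np.exp(L) / (1.0 + np.exp(L)))

    # A(k): mean of a standard normal variate on the slice .02(k-1) < Phi(x) < .02k.
    edges = norm.ppf(np.linspace(0.0, 1.0, 51))
    A = (norm.pdf(edges[:-1]) - norm.pdf(edges[1:])) / 0.02

    def one_step(F_old):
        """One deferred-decision iteration: new risk = min(T, cost + E[old risk])."""
        G = np.empty_like(F_old)
        for i, Li in enumerate(L):
            Lsn = Li + A * dp + 0.5 * d             # new log-odds under SN (mean motion +.5d)
            Ln = Li + A * dp - 0.5 * d              # new log-odds under N  (mean motion -.5d)
            Gsn = np.interp(Lsn, L, F_old).mean()   # Stieltjes sum, .02 weight per slice
            Gn = np.interp(Ln, L, F_old).mean()
            p_sn = np.exp(Li) / (1.0 + np.exp(Li))  # probability of SN implied by Li
            G[i] = cost + p_sn * Gsn + (1.0 - p_sn) * Gn
        return np.minimum(T, G)                     # decide now wherever that is cheaper

    F = T.copy()
    for _ in range(35):
        F = one_step(F)
    continue_region = F < T
    print("approximate upper boundary:", L[continue_region].max())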

Appendix B

THE COMPUTER PROGRAM FOR THE DETERMINATION OF THE ROC AND AVERAGE NUMBER OF OBSERVATION FUNCTIONS

This appendix presents the program used to determine the ROC and average number of observations for the deferred-decision procedure. A general block diagram is presented, followed by a detailed block diagram of the computer program. As in the computer program for the deferred-decision boundary points, use is made of the assumptions of symmetry and stationarity to speed up the program. The program cannot be easily generalized to less restrictive problems.

The following are notes that help to explain the various quantities that are indicated in the block diagrams.

(1) The A(1) to A(50) are used in all Stieltjes integrations of functions with respect to the normal distribution function. The A's are given by A(k) = E{x | .02k - .02 < Φ(x) < .02k}.

(2) TEND is an integer corresponding to ten times D.

(3) POSNA is "probability, old, given SN, of 'A'."
    PONA is "probability, old, given N, of 'A'."
    NOSN is "(average) number (of observations), old, given SN."
    NON is "(average) number (of observations), old, given N."
The functions being computed have the second letter N for "new." This iteration technique computes the four "new" functions from the values of the four "old" functions.

(4) LTSN is the new value of L if one had started at the old value L(I) and (temporarily) assumed that the observation had the Rth value (normal, with mean .5D and standard deviation DP). TENLT is ten times LTSN, separately identified because it is used several times. This will be used for table look-up and interpolation, all tables being stored in .1 steps in L. The K and C values are the integer and fractional parts of TENLT, TENLT = K + C, 0 ≤ C < 1. These interpolation constants are stored and used repeatedly in the computation of the ROC and average number of observations.

(5) The computation of the expected values (integrals) involves computation of the integrand, summation, and normalization of the final sum. The key is:

    Name of Function    Name of Sum    Name of Integrand
    PNSNA               ROCSN          DROCSN
    PNNA                ROCN           DROCN
    NNSN                ASNSN          DASNSN
    NNN                 ASNN           DASNN

131
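The table look-up in note (4) splits ten times the new log-odds into an integer part K and a fractional part C and interpolates linearly between adjacent stored entries, exactly as in the detailed block diagrams (e.g., DROCN = C·PONA(KN+1) + (1−C)·PONA(KN)). The sketch below is our own Python rendering of that step; the offset used to index negative values of L is an assumption, since the original tables exploit symmetry instead.

    import numpy as np

    STEP = 0.1                 # tables are stored in .1 steps of L
    L_MIN = -6.0               # assumed lower edge of the stored table

    def lookup(table, l_new):
        """Linearly interpolate a tabulated function of L at the value l_new."""
        tenlt = (l_new - L_MIN) / STEP        # "ten times L", shifted to a nonnegative index
        k = int(np.floor(tenlt))              # integer part (K)
        c = tenlt - k                         # fractional part (C), 0 <= C < 1
        k = min(max(k, 0), len(table) - 2)    # stay inside the stored table
        return c * table[k + 1] + (1.0 - c) * table[k]

    L = np.arange(L_MIN, 6.0 + 1e-9, STEP)
    PONA = 1.0 / (1.0 + np.exp(-L))           # any stored function of L serves as an example
    print(lookup(PONA, 0.537))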

Appendix C

THE COMPUTER PROGRAM FOR THE RAPID PROBE

The computer program for the binomial approximation to the normal case, called the rapid probe, combines the computer programs presented in Appendixes A and B into a single, very efficient program. The reason for the increased efficiency is that the integrations of the risk functions, the ROC functions, and the average number of observation functions are replaced by exact formulae. The exact value of the decision boundary is quantized, depending on the initial a priori odds. For convenience we interpolate to obtain a smooth value for the decision boundary; in this sense, the decision boundary is not an exact value. All other formulae calculated by this program are exact. The numbers used in the formulae for obtaining the new functions from the old functions depend on the W/c ratio we wish to simulate by the binomial approximation.

The following are notes that help to explain the various symbols that are used in the block diagrams.

(1) S1 and S2 are the expected risk functions in SN and N, respectively. S3 and S4 are the average numbers of observations in SN and N, respectively.

(2) The computations of the risk functions and average numbers of observations are made for L > A0 only, since these functions are symmetric about A0.

(3) I4 and I5 are the values of the likelihood ratio in SN and N, respectively. The functions T1, T2, T3, T4 are the temporary functions corresponding to S1, S2, S3, and S4, respectively.

(4) Delta is the "upper" decision boundary. The "lower" decision boundary is the negative of Delta.

Appendix D

THE COMPUTER PROGRAM FOR THE DETERMINATION OF THE AVERAGE EXCESS OVER THE DECISION BOUNDARY

The Monte Carlo simulation was performed using a high-speed digital computer (IBM 704). The Monte Carlo simulation presented here can be used for any boundary points, since these boundary points are read in as data. The computer program is given in block diagram form, followed by a detailed block diagram. The following are notes to help explain the various symbols that are used in the block diagrams.

132

(1) TRUNC is the truncation value on the number of samples; it terminates any specific run if the sum of the random samples has not exceeded the preset cut level (boundary) after this number of samples.

(2) DC and MC are the sample counters for the upper and lower boundary. D(I) and M(I) store the density function for the sample values.

(3) XO and XU are the accumulated excess over the preset cut level, and T is the random variable that is obtained from a normal distribution.

Appendix E

ERROR ANALYSIS OF THE COMPUTER PROGRAM FOR THE DETERMINATION OF THE DEFERRED-DECISION BOUNDARIES

There are two basic sources of error in the computer programs used for finding the deferred-decision boundary points. One source of error is in the approximation of the probability density functions of N and SN by discrete density functions. The other main source of error arises in the interpolation of the various functions that must be integrated. The interpolation is necessary because the various continuous functions must be approximated by a finite set of numbers. There are, in addition, other random computer errors.

The analysis here is given in two parts. The first section is not, strictly speaking, an analysis but only presents evidence of the size of the errors and their comparative value; an actual analysis is too complicated. The second part is an analysis of the interpolation error. The error we examine is the error in the risk function, since the risk function is the basic function from which the decision boundaries are calculated.

In order to determine the nature of an acceptable error, a normalization is made on the error found in a risk function. If the error in a risk function is so large that it is comparable to the cost of observing, i.e., 1/W, then the risk function is meaningless. The normalization is to compare the error to the cost of observation for a single observation. We feel that an error of 10 percent or less of the cost for a single observation is an acceptable error. This is because a 10 percent change in the cost of observation affects the decision boundary points by 2 to 3 percent. In view of the fact that the a priori estimates of SN and N and the determination of W are themselves estimates, this seems to be a reasonable error. Figure E.1 is a graph of T(L) - G35(L) for W/c = 30 depicting a 10 percent change in the cost of a single observation. Figure E.1 also shows what this means in terms of the decision boundary point.

133

Further evidence that an error of 10 percent of cost is acceptable can be seen by an examination of the risk functions. If at stage n = 24 the cost of a single observation is increased by 10 percent, the percentage difference in the risk functions after 11 stages is less than one percent. This is again true for W/c = 30. The greatest error in the risk function occurs away from the decision boundary, at L = 0. On the decision boundary the percentage difference in the risk functions after 11 stages is less than .2 percent.

Roughly speaking, the change in the risk function accumulates as n times the cost of a single observation around L = 0, where n is the allowable number of deferrals. This can be seen from the fact that if the decision boundary cannot be reached in n steps by the mean motion of the distribution, then the result is as if the boundary were not present, and the risk error accumulates for n stages. However, if the decision boundary can be "reached" by the mean motion of the distribution, then the amount of the cost of observing added to the risk function, for these L's, is less than n times the cost of a single observation. The boundary acts to wash out the effects of errors in the risk functions due to an apparent change in the cost of a single observation. This explains, roughly speaking, why the risk functions contain very little error, for L's near the boundary, due to an apparent change in the observation cost.

Since the analytic form for G1(L) is readily obtained, the difference T(L) - G1(L) can be found as

    T(L) - G1(L) = [1/(1 + e^L)] P("B"|N) - [e^L/(1 + e^L)] P("B"|SN) - cost

                 = [1/(1 + e^L)] Φ((-L + .5d)/√d) - [e^L/(1 + e^L)] Φ((-L - .5d)/√d) - cost        (E-1)

The quantity T(L) - G1(L) from the computer solution can be compared with the analytic solution to find the magnitude of the error in the risk function due to two causes: the integration technique, and the assumption that T(L) is linear over the increment in L, in this case .1 steps in L. The assumption that T(L) is linear over a .1 change in L introduces not more than a 1.5 percent error.

The first numerical results obtained for the risk function for deferred decision employed ordinary Riemann integration. In the interest of both computing time and integration error, all the Riemann integrations were converted to Stieltjes integrations in subsequent programs.

134

The normal distribution function was approximated by the use of 50 values, each value representing 2 percent probability. The error due to approximating the continuous distribution function by a discrete distribution of 50 values can be determined as follows. The integral we wish to consider is the average of a function with respect to a normal density function. This integral is approximated by a sum of 50 values:

    ∫_{-∞}^{+∞} G(L) f(y) dy = ∫_{-∞}^{+∞} G(L) dF(y) ≈ Σ_{i=1}^{50} G(ξ_i) × .02        (E-2)

The ξ_i are chosen to represent 2 percent of the area under the normal probability density function. If G(L) is assumed linear over each ΔL interval we have

    error per interval = .02(a ξ_i + b) - ∫_{x_i}^{x_{i+1}} (ax + b) φ(x) dx
                       = a[.02 ξ_i - ∫_{x_i}^{x_{i+1}} x φ(x) dx]        (E-3)

where

    φ(x) = [1/(√(2π) σ)] exp[-(x - m)²/(2σ²)]

Let us scale the x axis so that m = 0 and σ = 1. Now ∫ x φ(x) dx = -φ(x), so that

    error_i = a[.02 ξ_i - φ(x_i) + φ(x_{i+1})]

for the ith interval. We therefore chose

    ξ_i = 50[φ(x_i) - φ(x_{i+1})]

to eliminate the linear error term. These ξ_i are the "mean" values of the normal r.v. in each 2 percent interval, since

    ξ_i = ∫_{x_i}^{x_{i+1}} x φ(x) dx / ∫_{x_i}^{x_{i+1}} φ(x) dx

135
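As a numerical check of this choice of nodes (our own sketch, using scipy; it is not part of the original program), one can compute ξ_i = 50[φ(x_i) - φ(x_{i+1})] for the fifty 2-percent slices and verify that the resulting sum integrates any linear function exactly, which is precisely the condition used to derive Eq. E-3:

    import numpy as np
    from scipy.stats import norm

    # Slice boundaries x_i with Phi(x_i) = .02*i, i = 0..50 (x_0 = -inf, x_50 = +inf).
    x = norm.ppf(np.linspace(0.0, 1.0, 51))

    # xi_i = 50*[phi(x_i) - phi(x_{i+1})]: the mean of a standard normal variate
    # conditioned on falling in the i-th 2 percent slice.
    xi = 50.0 * (norm.pdf(x[:-1]) - norm.pdf(x[1:]))

    # The 50-point sum integrates a linear function a*x + b exactly.
    a, b = 3.0, -1.7
    approx = np.sum(0.02 * (a * xi + b))
    print(approx, b)     # E[a*x + b] = b for a standard normal; the two agree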

The error introduced by the Stieltjes integration is approximately 1 percent, as determined by comparing the computer solution for T(L) - G1(L) with the analytic solution for T(L) - G1(L). This error occurs at each stage. However, since the curvature of the risk function becomes less and less, this error will decrease as the stage number, n, increases.

The interpolation error is due to a linear interpolation of the risk function. The curvature of the risk function is not large, however, and this allows linear interpolation to be satisfactory. To obtain a bound on this error, consider the contribution due to the second difference (i.e., the quadratic term). If a function, f(x), is expanded by use of differences obtained from the tabular values we have

    f(x) = f(a) + k δ_1 + [k(k - 1)/2!] δ_2 + [k(k - 1)(k - 2)/3!] δ_3 + ...        (E-4)

where

    k = (x - a)/ΔL
    δ_k = kth difference
    "a" is the value of x at the tabulated points.

The greatest value of |k(k - 1)/2| is obtained at the midpoint of an interval, k = .5. Therefore, the maximum error due to neglecting the quadratic term is

    |error| ≤ |k(k - 1)/2| δ_2 ≤ .125 δ_2

For d = .25 and W/c = 30 this is an error of less than .000125, or 5 percent of the sampling cost. The sum of all the errors discussed is less than 10 percent of the sampling cost. Since the error values discussed were upper bounds, the actual error would be expected to be less than this.

The refinement of numerical procedures is a never-ending process. There are obvious ways of reducing the computing errors still further, e.g., the use of quadratic or cubic interpolation. However, there is a balance between meaningful results and time spent, and exact results and time spent.

136
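The bound derived from Eq. E-4 is easy to illustrate numerically. The sketch below is our own Python illustration (the grid and the test function are arbitrary stand-ins for the risk function tabulated in .1 steps of L): it tabulates a smooth function, linearly interpolates at the interval midpoints, and compares the worst error with one eighth of the largest second difference.

    import numpy as np

    dL = 0.1
    L = np.arange(-6.0, 6.0 + 1e-9, dL)
    f = 1.0 / (1.0 + np.exp(L))          # a smooth stand-in for the tabulated risk function

    # Worst linear-interpolation error, sampled at the interval midpoints (k = .5).
    mid = L[:-1] + dL / 2.0
    interp_mid = 0.5 * (f[:-1] + f[1:])
    exact_mid = 1.0 / (1.0 + np.exp(mid))
    max_err = np.max(np.abs(interp_mid - exact_mid))

    # Bound from Eq. E-4: |error| <= .125 * (largest second difference).
    delta2 = np.max(np.abs(np.diff(f, n=2)))
    print(max_err, 0.125 * delta2)       # the observed error matches the .125*delta_2 bound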

Appendix F

THE ERROR ANALYSIS COMPUTER PROGRAM

Included in this appendix is the error analysis computer program used to establish some of the conclusions reached in Appendix E. The program is highly specialized and was written for the express purpose of investigating the sources of errors in the program described in Appendix A.

Appendix G

THE COMPUTER PROGRAM FOR THE DETERMINATION OF THE VALUE CONTOURS FOR THE OPTIMUM NONSEQUENTIAL DECISION PROCEDURE

The contour graph of the value of the observation for a predetermined observation-decision procedure is found numerically using the computing algorithm diagrammed below. This graph presents the information of a predetermined procedure in a convenient form and serves to describe the properties of a predetermined procedure graphically.

137

[Figure A.1 is a block diagram with the following flow: READ IN DATA → COMPUTE AND STORE T(L) → COMPUTE MEAN MOTIONS → COMPUTE LIKELIHOOD RATIOS AND INTERPOLATION CONSTANTS → INTEGRATION OF RISK FUNCTIONS → COMPUTATION OF DECISION BOUNDARIES → PRINT OUT OF RISK FUNCTIONS AND DECISION BOUNDARIES → CONVERGENCE CHECK; if not converged, SUBSTITUTION OF NEW RISK FUNCTION and repeat.]

Figure A.1. The general block diagram for computation of the deferred-decision boundary points.

138

[Detailed block diagram; recoverable content: the old risk function is initialized to the terminal risk, FO(I) = T(I).]

Figure A.2. Data read in and computation of T(L).

[Detailed block diagram; recoverable assignments: MHS(R) = A(R)*DP; MSN(R) = MHS(R) + .5*D; MN(R) = MHS(R) - .5*D; LTSN = 10.*|L(I) + MSN(R)|; LTN = 10.*|L(I) + MN(R)|; KSN and KN are the integer parts of LTSN and LTN; CSN = LTSN - KSN; CN = LTN - KN; D1 = 56*I + R; and KSN, KN, CSN, CN, LTSN, LTN are stored indexed by D1.]

Figure A.3. Computation of mean motions, likelihood ratios, and interpolation constants.

[Detailed block diagram; recoverable content: the integrands are formed by linear interpolation of the old risk function, DGSN = CSN*FO(KSN+1) + (1.-CSN)*FO(KSN) and DGN = CN*FO(KN+1) + (1.-CN)*FO(KN), and accumulated as GSN = GSN + DGSN and GN = GN + DGN; when the new L falls beyond the cut, the terminal risk T is used in place of FO.]

Figure A.4. Integration of risk functions.

Figure A.5. Computation of decision boundaries, print out, convergence check, and substitution of "new" risk function.

[Figure B.1 is a block diagram with the following flow: READ IN DATA AND DO PRELIMINARY CALCULATIONS → COMPUTE AND STORE T(L) → COMPUTE AND STORE ROC AND AVERAGE NUMBER OF OBSERVATIONS FOR N = 1 → COMPUTE NEW L's FROM PREVIOUS L's AND MEAN MOTION → COMPUTE THE ROC AND AVERAGE NUMBER OF OBSERVATIONS FUNCTIONS → COMPUTE THE RISK FUNCTION (done for all L in the continue region) → PRINT OUT THE ROC, AVERAGE NUMBER OF OBSERVATIONS, AND RISK FUNCTIONS → SUBSTITUTE "NEW" ROC, ETC., FUNCTIONS FOR "OLD" ROC, ETC., FUNCTIONS → CONVERGENCE CHECK; if not converged, repeat.]

Figure B.1. The general block diagram for computation of the ROC and average number of observation functions for the deferred-decision procedure.

142

Figure B.2. Data read in and computation of T(L). 143

[Detailed block diagram; recoverable content: in the continue region POSNA(I) = Φ((L(I) + .5D)/DP), PONA(I) = Φ((L(I) - .5D)/DP), NOSN(I) = 1.0, NON(I) = 1.0; for L at or beyond the upper boundary POSNA(I) = PONA(I) = 1.0 and NOSN(I) = NON(I) = 0; by symmetry POSNA(-I) = 1. - PONA(I), PONA(-I) = 1. - POSNA(I), NOSN(-I) = NON(I), NON(-I) = NOSN(I).]

Figure B.3. Computation of the ROC and average number of observation functions for n = 1 and mean motions.

144

[Detailed block diagram; recoverable content: the new functions are formed from the accumulated sums as PNSNA(I) = .02*ROCSN, PNNA(I) = .02*ROCN, NNSN(I) = .02*ASNSN + 1., NNN(I) = .02*ASNN + 1.; the integrands are interpolated from the old tables, e.g., DROCN = C*PONA(KN+1) + (1.-C)*PONA(KN) and DASNN = C*NON(KN+1) + (1.-C)*NON(KN), and accumulated as ROCSN = ROCSN + DROCSN, ROCN = ROCN + DROCN, ASNSN = ASNSN + DASNSN, ASNN = ASNN + DASNN.]

Figure B.4. Integration of the ROC and average number of observation functions.

[Detailed block diagram; recoverable content: the expected risk is formed as G = T(-I)*[1. - PNSNA(I) + COST*NNSN(I)] + T(I)*[PNNA(I) + COST*NNN(I)], and RISK(I) = G.]

Figure B.5. Computation of the expected risk.

[Detailed block diagram; recoverable content: print out of DP, W, N, D, COST, A(N), and, for each I, of L(I), RISK(I), DIFF(I), POSNA(I), PONA(I), NOSN(I), NON(I); substitution of the "new" functions for the "old" functions, POSNA(I) = PNSNA(I), PONA(I) = PNNA(I), NOSN(I) = NNSN(I), NON(I) = NNN(I), with the symmetry relations POSNA(-I) = 1. - PONA(I), PONA(-I) = 1. - POSNA(I), NOSN(-I) = NON(I), NON(-I) = NOSN(I); the iteration stops when N - NMAX > 0.]

Figure B.6. Print-out, convergence check and substitution of "new" functions for "old" functions.

[Figure C.1 is a block diagram with the following flow: START PROB. → COMPUTE AND STORE TERMINAL LOSS FUNCTION → COMPUTE AND STORE ROC AND ASN FOR N = 0 → COMPUTE MOTIONS → COMPUTE "NEW" FUNCTIONS FROM "OLD" FUNCTIONS BY EXACT FORMULA (REPLACES INTEGRATIONS) → COMPUTE RISK DUE TO ERROR AND RISK DUE TO ASN → COMPUTE A(N) → SUBSTITUTE "NEW" FUNCTIONS FOR "OLD" FUNCTIONS → PRINT OUT ROC, ASN, RISK AND A(N) → CONVERGENCE CHECK; if not converged, repeat.]

Figure C.1. The general block diagram for the binomial approximation to the normal deferred-decision procedure.

148

[Detailed block diagram; recoverable content: DECLARE INTEGER I, K, N, NMAX, I4, I5, D1, NLO, IC; Z(I) = 1. - ZC(I); L(I) = .05*(I - 100); S3(I) = 0.; S4(I) = 0.; T(I) = 1./(1. + EXP.(L(I))).]

Figure C.2. Computation of T(L), ROC, and average number of observation functions.

[Detailed block diagrams; recoverable content: the temporary functions T1, T2, T3, T4 are computed from the old functions S1, S2, S3, S4 and then substituted back, S1(I) = T1(I), S2(I) = T2(I), S3(I) = T3(I), S4(I) = T4(I), using the symmetry index IC = 200 - I; print out of T1(I), T2(I), T3(I), T4(I), D(I), RISK(I).]

Figure C.3. Computation of "new" functions from "old" functions and of risk functions.

Figure C.4. Computation of the decision boundary, print out, convergence check and substitution of "new" functions for "old" functions.

[Figure D.1 is a block diagram with the following flow: READ IN DATA AND PRELIMINARY CALCULATIONS → READ INITIAL POSITION AND INITIALIZE RANDOM SUBROUTINE → SET TO INITIAL POSITION → OBTAIN ONE RANDOM NUMBER → SELECT GAUSSIAN MOTION AND ADD TO PRESENT POSITION → TEST WHETHER SAMPLE IS WITHIN BOUNDARIES; if out, STORE AVERAGE EXCESS OVER BOUNDARIES AND AVERAGE SAMPLE NUMBER; after 10^3 trials, PRINT OUT AXO AND ASN and READ IN NEW STARTING POSITION FOR RANDOM SAMPLE.]

Figure D.1. The general block diagram of the Monte Carlo simulation for determination of the average excess over the decision boundary.

151
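The flow in Figure D.1 can be rendered in a few lines. The Python sketch below is our own simplified version, not the original program: it assumes independent normal log-odds increments with mean +.5d (i.e., simulation under SN) and standard deviation √d, and a symmetric boundary supplied as a constant, whereas the original reads arbitrary boundary points in as data.

    import numpy as np

    rng = np.random.default_rng(0)

    d = 0.25            # observation quality per step
    a = 2.0             # symmetric decision boundary (read in as data in the original)
    L0 = 0.0            # initial log-odds position
    trunc = 200         # truncation on the number of samples per run
    n_trials = 1000     # the report uses 10**3 trials per starting position

    excess, samples = [], []
    for _ in range(n_trials):
        L = L0
        for n in range(1, trunc + 1):
            L += rng.normal(0.5 * d, np.sqrt(d))   # Gaussian motion of the log-odds under SN
            if abs(L) >= a:                        # the sample fell outside the boundaries
                excess.append(abs(L) - a)          # excess over the boundary that was crossed
                samples.append(n)
                break

    print("average excess over the boundary:", np.mean(excess))
    print("average sample number:           ", np.mean(samples))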

Figure D.2. Data read in, preliminary calculations, and initialization.

Figure D.3. Selection of random number, Gaussian motion, and truncation check.

Figure D.4. Convergence check and print-out.

[Figure E.1 is a graph of T - G against L for L from 1.0 to 2.6.]

Figure E.1. A graph of T(L) - G35(L) for W/c = 30 depicting a 10 percent change in the cost of a single observation.

153

[Figure F.1 is a block diagram with the following flow: READ IN DATA AND DO PRELIMINARY CALCULATIONS → COMPUTE AND STORE T(L) → COMPUTE AND STORE MEAN MOTIONS OF LIKELIHOOD RATIO → COMPUTE LIKELIHOOD RATIOS AND INTERPOLATION CONSTANTS → COMPUTE AND STORE RISK FUNCTION (determining, for all L, whether L is in the continue region) → CALCULATE A(N) → PRINT OUT A(N) AND RISK FUNCTION → at N = 24, INCREASE COST OF OBSERVATION BY 10% → SUBSTITUTE NEW RISK FUNCTION FOR OLD RISK FUNCTION and repeat until the final stage is reached.]

Figure F.1. The general block diagram of the error analysis program.

154

Figure F.2. Computation of T(L) and the mean motions of the likelihood ratios. 155

[Detailed block diagram; recoverable assignments: LTSN = 10.*[L(I) + MSN(R)]; LTN = 10.*[L(I) + MN(R)]; KSN and KN are the integer parts of LTSN and LTN; CSN = LTSN - KSN; CN = LTN - KN.]

Figure F.3. Computation of the likelihood ratios, interpolation constants, and the risk functions.

[Detailed block diagram; recoverable content: print out of DP, W, N, D, COST, A(N), and the risk function.]

Figure F.4. Computation of decision boundary and print out of risk function and decision boundary.

157

Figure F.5. Increase observation cost by 10% at n = 24 for 10 trials and substitute "new" risk function for "old" risk function.

[Figure G.1 is a block diagram with the following flow: MACHINE CLEARED, DATA READ IN AND SET-UP; TABLES OF Z, ZC, L COMPUTED; VALUE COMPUTED FROM 0 IN INCREASING L UNTIL VALUE < 0, TABLES OF PSNA, PNA, VALUE MADE; VALUE COMPUTED FROM 0 IN DECREASING L UNTIL VALUE < 0, TABLES OF PSNA, PNA, VALUE MADE; OPTIMUM d BOUNDARIES FOUND BY COMPARING STAGE VALUES WITH THE PRECEDING STAGE'S VALUES, BOUNDARY STORED; while VALUE(0) > 0 and D < DMAX, PRINT OUT L, PSNA, PNA, VALUE, and D IS INCREASED; when VALUE(0) < 0 or D > DMAX, PRINT OUT COMPLETE TABLE OF BOUNDARIES, OPTIMUM AND LIMITING.]

Figure G.1. The general block diagram for the calculation of constant value contours for the optimum nonsequential procedure.

[Detailed block diagram; recoverable content: EXECUTE ZERO.(OLDDIF(120), DIF(120), ODP(120), OLDVAL(120), OPTDP(120), LLOW(120), LHIGH(120)); machine cleared, data read in and set-up; tables of Z, ZC and L computed.]

Figure G.2. Data read in and preliminary calculations.

[Detailed block diagram; recoverable content: VE = .25*DE + (L(I) - LCUT)/DE; UE = VE - .5*DE; PSNA(I) = .5 + .5*ERF.(VE); PNA(I) = .5 + .5*ERF.(UE); PSNB = 1. - PSNA(I); PNB = 1. - PNA(I); on one branch VALUE(I) = WSN*Z(I)*PSNA(I) - WN*ZC(I)*PNA(I) - D, and on the other VALUE(I) = -WSN*Z(I)*PSNB + WN*ZC(I)*PNB - D; the value is computed from 0 in increasing and in decreasing L until VALUE < 0, and tables of PSNA, PNA, VALUE are made.]

Figure G.3. Computation of the value of observation.

[Detailed block diagram; recoverable content: OPTDP(I) is computed from SQRT.(D), DP, DIF(I), and ABS.(OLDDIF(I)); OLDVAL(I) = VALUE(I); the optimum d boundaries are found by comparing stage values with the preceding stage's values, and the boundary points are stored; print out of L(I), PSNA(I), PNA(I), VALUE(I); D is increased while VALUE(0) > 0 and D < DMAX.]

Figure G.4. Computation of optimum d boundaries and print out of P("A"|SN) and P("A"|N).

[Detailed block diagram; recoverable content: print format TITLE; print results WSN, WN, LCUT; print format RESULT: L(I), OPTDP(I), ODP(I), LLOW(I), LHIGH(I); print out of the complete table of boundaries, optimum and limiting.]

Figure G.5. Complete print-out.

DISTRIBUTION LIST Copy No. Addressee 1-2 Office of Naval Research (Code 468), Department of the Navy, Washington 25, D. C. 3 Office of Naval Research (Code 436), Department of the Navy, Washington 25, D. C. 4 Office of Naval Research (Code 437), Department of the Navy, Washington 25, D. C. 5-10 Director, U. S. Naval Research Laboratory, Technical Information Division, Washington 25, D. C. 11 Director, U. S. Naval Research Laboratory, Sound Division, Washington 25, D. C. 12 Commanding Officer, Office of Naval Research Branch Office, 230 N. Michigan Avenue, Chicago 1, Illinois 13-22 Commanding Officer, Office of Naval Research Branch Office, Box 39, Navy No. 100, FPO, New York, N. Y. 23-42 Defense Documentation Center, Cameron Station, Building No. 5, 5010 Duke St., Alexandria 4, Virginia 43-44 Commander, U. S. Naval Ordnance Laboratory, Acoustics Division, White Oak, Silver Spring, Maryland 45 Commanding Officer and Director, U. S. Navy Electronics Laboratory, San Diego 52, California 46 Director, U. S. Navy Underwater Sound Reference Laboratory, Office of Naval Research, P. O. Box 8337, Orlando, Florida 47-48 Commanding Officer and Director, U. S. Navy Underwater Sound Laboratory, Fort Trumbull, New London, Connecticut, ATTN: Mr. W. R. Schumacher, Mr. L. T. Einstein 49 Commander, U. S. Naval Air Development Center, Johnsville, Pennsylvania 50 Commanding Officer and Director, David Taylor Model Basin, Washington 7, D. C. 51 Office of Chief Signal Officer, Department of the Army, Pentagon, Washington 25, D. C. 52-53 Superintendent, U. S. Navy Postgraduate School, Monterey, California, ATTN: Prof. L. E. Kinsler, Prof. H. Medwin 54 Commanding Officer, U. S. Navy Mine Defense Laboratory, Panama City, Florida 55 U.S. Naval Academy, Annapolis, Maryland, ATTN: Library 56 Harvard University, Acoustics Laboratory, Division of Applied Science, Cambridge 38, Massachusetts 57 Brown University, Department of Physics, Providence 12, R. I. 163

Copy No. Addressee 58 Western Reserve University, Department of Chemistry, Cleveland, Ohio, ATTN: Dr. E. Yeager 59 University of California, Department of Physics, Los Angeles, California 60-61 University of California, Marine Physical Laboratory of the Scripps Institution of Oceanography, San Diego 52, California, ATTN: Dr. V. C. Anderson, Dr. Philip Rudnick 62 Dr. M. J. Jacobson, Department of Mathematics, Rensselaer Polytechnic Institute, Troy, New York 63 Director, Columbia University, Hudson Laboratories, 145 Palisade Street, Dobbs Ferry, N. Y. 64 Woods Hole Oceanographic Institution, Woods Hole, Massachusetts 65 Johns Hopkins University, Department of Electrical Engineering, Johns Hopkins University, Baltimore 18, Maryland, ATTN: Dr. W. H. Huggins 66 Director, University of Miami, The Marine Laboratory, #1 Rickenbacker Causeway, Miami 49, Florida, ATTN: Dr. J. C. Steinberg 67 Litton Industries, Advanced Development Laboratories, 221 Crescent St., Waltham, Massachusetts, ATTN: Dr. Albert H. Nuttall 68 Institute for Defense Analysis, Communications Research Division, von Neumann Hall, Princeton, New Jersey 69 Commander, U. S. Naval Ordnance Test Station, Pasadena Annex, 3202 E. Foothill Blvd., Pasadena 8, California 70 Chief, Bureau of Ships (Code 688), Department of the Navy, Washington 25, D. C. 71 Chief, Bureau of Naval Weapons (Code RU-222), Department of the Navy, Washington 25, D. C. 72 Cornell Aeronautical Laboratory, Inc., P. O. Box 235, Buffalo 21, New York, ATTN: Dr. J. G. Lawton 73 Autonetics, A Division of North American Aviation, Inc., 3370 East Anaheim Road, Anaheim, California, ATTN: Dr. N. Schalk 164