RSD-TR*84 QUANTITATIVE ANALYSIS OF A MOMENT BASED EDGE OPERATOR Paul H. Eichel Edward J. Delp Department of Electrical and Computer Engineering and Computer, Information, and Control Engineering Program The University of Michigan Ann Arbor, Michigan 48109-1109 April 1984 CENTER FOR ROBOTICS AND INTEGRATED MANUFACTURING Robot Systems Division fg hA, COLLEGE OF ENGINEERING IT THE UNIVERSITY OF MICHIGPN i G / ANN ARBOR, MICHIGAN 48109-1109 L?

v /0/ 7Z

RSD-TR.-84 TABLE OF CONTENTS 1. INTRODUCTION: EDGE DETECTION IN DIGITAL IMAGES..... 3 2. DEFINITION OF THE SAMPLE- VARIANCE OPERATOR.......... 3 3. IMAGE EDGE MODEL.................................................... 4 4. PROBABILITY DENSITY FUNCTION OF THE S.V. OPERATOR............. 5. APPROXIMATIONS TO THE DENSITY FUNCTION..................... 6. EDGE DETECTION...................... 7 7. PREDICTING THE PRATT FIGURE OF MERIT........................... 8 8. CONCLUSION........11 9. REFERENCES.................................................... 12 10. APPENDIX........................................ 13

RISD-TR-44 ABSTRACT2 Many mathematical operators capable of locating intensity edges in discrete images have been proposed in the literature. We introduce a new operator, based on the sample variance of a group of pixels, which exhibits two unique properties: freedom from a predefined shape and a rigorous stochastic formulation. We demonstrate the utility of the latter property in applying detection theoretic principles to the task of edge detection and in theoretical predictions of experimental performance measures.'This work was partially supported by the Center for Robotics sad Integrated Maufactnring, Robot Systems Division, College of Engineering, University of Michigan and the Air Force Office of Scientific Reeauch ander Ort F49620 82-C-0089. Quantitative Analysis of a Moment Bamed Edge Operator

RSD-TR-684 1. INTRODUCTION: EDGE DETECTION IN DIGITAL IMAGES Detection of intensity edges, that is of local intensity discontinuities at the boundaries of objects, is a low-level process of fundamental importance in image processing. An edge operator is a mathematical operator of small spatial extent which responds in some way to these discontinuities, usually classifying every image pixel as either belonging to an edge or not. The literature abounds with edge operators [1],[21. Generally speaking these operators tend to fall in one of three broad classifications: 1) gradientbased; 2) template matching, and 3) parametric models. All of these operators usually employ some global or adaptive threshold on their outputs as a decision device to finally decide whether a given location is on an edge or not. In this paper, we investigate the use of the sample~ variance of a local group of pixels as an edge operator. The use of the sample variance in image boundary detection is not new. It has been used for example to detect regions of similar texture, changes in which are defined to be edges[31. It has also been used as an intensity edge operator [41. We shall show that given an appropriate image model the Sample - Variance operator has a rather simple probability density function. The problem of edge detection can then be formulated using classical detection theory based on a composite likelihood ratio test. Along with another result which is presented in the Appendix, the optimum threshold for the Pratt Figure of Merit, a common performance measure for edge detectors, can be obtained. 2. DEFINITION OF TH SAMPLE- VARIANCE OPERATOR The Sample - Variance (SV) operator will be defined as the sample variance of a group of N pixels, i.e.: s5- ~ x,.(1) where: N= number of pixels in operator. X,= gray level value of the i pixel in pixel group. We note that an important feature of this definition is that no predefined operator shape is implied for the edge operator. In fact, the operator is a function of an arbitrary group of pixels. These pixels may be chosen in any manner in the spatial dimensions of a single image, through time in a sequence of images, or Quantitive Anlysi of a Moment red Edte OJtor

RSD-TR-6-84 across several spectral channels (e.g. Landsat imagery) of image data. 3. IMAGE EDGE MODEL The image model considered in this paper is a two-dimensional discrete parameter random field I51. Given a probability space (n,A,P), a measurable function, f,(w); w E f; E 12, is defined on this space as: w) + r,(w): n x I'.R (2) Here, #g is a deterministic discrete image consisting of two or more regions, each of which is of constant gray-level over its extent. The boundaries or edges between these regions have a ramp-like cross section as in Figure 1. The height, width, and direction of a ramp are parameters of a given edge. To this function go is added a zero mean i.i.d. Gaussian distributed random field n(w). For fixed r, the random variable f,(w) therefore has the density: P (r,z)- c'a (3) where: a2 = noise variance 4. PROBABILITY DENSITY FUNCTION OF THE S.V. OPERATOR Given the image model of Section 2, the problem then is to apply the SV operator and identify which pixels belong to an edge. We take a detectiontheoretic approach to this problem by fiit finding the probability density function of the SV operator (i.e. find the pdf of S2 in Equation 1). We can then use one of several well-known tests based on the likelihood ratio to obtain a decision threshold. Under the assumption of Section 2, each X, of the SV operator (Equation 1) is a random variable with density function as in Equation 3 i.e. the X,'s are i.i.d. Gaussian random variables with variance a and means: E(X,) i (4) 4 QuftItative Analysis of a Momen-BeDld Edse Operator

RSD-TR-5s84 where r- is the position vector of the i t pixel. Letting: x - xi Y, (5) we can rewrite Equation I as: NS2 N y where: E(Y,) } (E{j }-, } ) var(Yi) - NS2 Thus, the variable V2 of Equation 8 is the sum of squares of normal, unit variance random variables with unequal means. This is known to have a noncentral chi-square distribution of N-I degrees of freedom [6]. One degree of freedom is lost due to the fact that the sample mean is subtracted from each X, in Equation 7 instead of the actual mean. The non-centrality parameter, e, is given by: es (E(Y,}) (7) We will denote the density of Ns2 by Xqe) i.e. chi-square with v degrees of freedom and non-centrality parameter 0. 5. APPROXIMATIONS TO THE DENSITY FUNCTION The variable NS2 is distributed as a non-central chi-square. Unfortunately, there is no closed form for the probability density of X(*,e) [7J. We therefore investigate suitable approximations for the density of X,.e). (1) First, we approximate X(2,e) by a central chi-square variable, pXx), where the constants p and x are chosen so that X.e) and,PX) have the same mean and variance [81: Quantltatle Analysb of a Moment Based Edge Operator 5

RSD-TR-6-84 0 02 p 1-+ -. X = v+29 So: _X__.__.is approximately X2(8) (8) (2) Second, we approximate the newly formed X'2 variable as a normal variable. The standard approximation is [81: 12x' 1 N((2X-1),1) (o) Putting Equations (10) and (11) together, we have: or: N |( )1P2)2N | 2N 1 (10) where: p -+ 1+ e e-.(E(Y }) 2 V =v- N-i P+20 This approximate density for s is used for all calculations of error probabilities, etc. Numerical justification of this approximation has been tabulated by Patnaik[8], Wilson[9] and others. The approximation is fairly good at N=10 and converges rather rapidly as N increases. 6 Quantitative Analysis of a Momeat BIased Edge Operator

RSD-TR-5.84 6. EDGE DETECTION We now address the original problem of identifying edges in the image. We would like to use our knowledge of the operator's density function to obtain a classification threshold based on a statistical test. From Equation 10 we see that the SV operator has an approximately normal density function whose mean and standard deviation are a function of the noise power, a2, the non-centrality parameter, 0, and the number of pixels in the operator, N. e in turn depends on the proximity of an edge in the operator window. Examination of Equations 4, 6 and 7 reveals that e is a measure of the variance of the deterministic image elements,', in the operator window. Thus, when the operator window is located at a region of the image in which there is no edge, 0 will be small. On the other hand, when the window is centered on an edge, the parameter has a high value since both low and high element values, g; are present. Further, e varies smoothly between these two extremes as the ratio of high to low element values changes between one half and either zero or one, i.e. as the window goes from a centered position over the edge to one side or the other. This is illustrated in Figure 2 for a 3x3 window and a ramp edge model with height to width ratio of 2. The contours in Figure 2 are loci of edges with equal parameter 0, normalized to that of a vertical edge through the center of the window. That is, an edge whose midline is tangent to the 0.5 contour, for example, has a 0 value 0.5 times that of a vertical edge through the center pixel. Edges near the center of the window have a high 0 value and the parameter falls off with increasing distance from the center. The response is nearly isotropic and very similar to the square-root Sobel (3x3) operator whose magnitude response contours are shown in Figure 3 for comparison. The mean of the density function of the random variable s is a function of the parameter e. Referring to Equation 10, we see that a large value of 0 (edge present) gives rise to a density function with a large mean as compared to smaller values of o (no edge). Thus the density function of S conditioned on the presence of an edge in the operator window has a different mean from that of s conditioned on the absence of an edge (Figure 4). Our job is to chose a decision threshold, St, so as to minimize errors. The choice of st cannot be based on a simple likelihood ratio test because e can take on a range of values. Let the Ho hypothesis be the event, denoted e0, that no edge is present in the window and H1 be the event, e, that an edge is present. Under H1 then, e may lie in some range 0 E [a,b] where a is the maximum value of 0 (vertical centered edge) and b is appropriately chosen from Figure 2 (e.g. b - 0.5 * a). Under H0, E (b,0]. In other words the H1 hypothesis is that an edge falls inside the b/a contour of Figure 2, the Ho Quantitative Analyis of a Moment Basd Bdgs Opetstor 7

RSD-TR-5-84 hypothesis is that the edge falls outside that contour. A composite likelihood ratio test [101 on the variable s for deciding whether or not an edge is present is then: f p(sleo,eo) p(0o) deo do A p(rlel,~l) p(E,) d >, >( where: l- l[a,bl =o (b,01O dl decide el is true do = decide eo is true In the case p(e) is not known a priori, one simplifying assumption is that edges are uniformly distributed i.e. no particular edge location or orientation is any more likely than any other. One may then calculate p(9) for this case from Figure 2 by means of differential area. That is, p(*) is proportional to the area bounded by the contours corresponding to 0 and 9 + de. Equation 11 can then be evaluated numerically to obtain a decision threshold, St, for.. If T - 1, the resulting test is maximum likelihood. Other possibilities include MAP (T = P(edge) ), Neyman-Pearson, Bayes Risk, and Minimax tests. The probability of error can also be calculated. An example of this procedure will be given in the next section. 7. PREDICTING THE PRATT FIGURE OF MERIT The Pratt Figure of Merit [111 for edge detectors has been used in the literature as a means for making performance comparisons among edge detectors. It is an experimental tool to quantify the performance of edge detectors in an effort to make such comparisons less subjective. It' penalizes for both Type I and Type II errors as well as for displacements of detected edge points from true location. In use, an artificial image is constructed with a single vertical ramp edge of height AI, width two pixels, and with added ud noise of variance 4. The edge detector's output from this image is then used to form the figure of merit defined by: 8 Quanstitative Anslysis of a Momte Bsed Edge Operator

RSD*TR-4.84. I;o 1 F -S (12) where: I = max (ID,I,) I =_ number of ideal edge points ID= number of detected edge, points _, = displacement of the is detected edge point from the ideal edge. a - scaling parameter In the past, the merit value for an edge detector was found by experimentally changing the magnitude threshold for the detector until the best merit value was obtained. This experimental optimization had to be performed with a new artificial image for each value of signal-to-noise ratio of interest. IIowever, since the SV Edge Operator has a known probability density, we can establish apriori what the Pratt Figure of Merit is as a function of signal-to-noise ratio, and what the optimum threshold on S should be in order to attain that merit value. The ability to do this is crucially dependent on the following Theorem which is proved in the Appendix. Theorem: When p(slel) and p(sleO) are normally distributed, then the threshold for s that maximizes F is the threshold St such that: P(doe 1) - P(d1,co) (13) The corresponding optimum figure of merit is: F (d) (14) P(d,) We will use this Theorem along with the material developed in the previous section to predict the Pratt Figure of Merit for a 3x3 SV operator. We first expand the joint probabilities of Equation 13: O0 P(do,e,) = P(e1) f f p(ule,,U,) p(U,) de, d, s, 1E0ln Quantitative Aniyis of ar Moment Based Edge Opeutor 9

RSD-TR-6-84 P(dl,eo) = P(e o)f P(f o dIo do (15) -oo #oE N Now since the edge is vertical of width two pixels and the operator is a 3x3 square, only three values of 0 are possible. The first corresponds to the window centered on the edge:,1 E l,- { (3h/2)2 } The other two possibilities of e correspond to: 1) the window being centered one pixel to either side of the edge midline; and 2) the edge is completely outside the window: 0o E no = {(h/2a)2,0} he2 where a is defined to be the signal to noise ratio. Furthermore, if the test image is square of size MxM, these three values of 0 have a priori probabilities in the ratios' 1:2:M-2. Consequently, Equations 15 become: P(do, e) - p(l e1,0-(3h/2f) d 2St M-2S P(d1,eo) - M f p(leeo(h/2)2 + M p(l ) de (16) 2 Qi 2St + -2Q... St t M O29 e ] + M 0'2M a5j, where il,a1 are calculated from Equation 10 for l-=(3h/2r)2, etc. Substituting Equation 16 into Equation 13, we have: Q[St ] - 2Q [' ] + (M ) Q 2,J (17) 10 Qasntitatlve Analysis of a Miomemt Bsed Edge Operator

RSD-TR-6.84 which may be solved numerically for the threshold, St. Equation 14 is then used to find the resulting Figure of Merit. Figures 5 and 6 illustrate these results. They compare the theoretical figure of merit and optimum threshold verses snr (solid lines) to the experimental data (stars). The experimental points were obtained by iteratively changing the threshold until the highest figure of merit was obtained, a procedure universal to the literature. The accuracy of the probability model for the SV operator and the above Theorem clearly allows us to predict the outcome with assurance. 8. CONCLUSION In this paper we have shown that the Sample - Variance operator is useful for detecting intensity edges in discrete images. Under the hypothesis of additive, Gaussian distributed i.i.d. noise, the operator's probability density function may be very accurately approximated by a normal density. This allows the use of a composite likelihood ratio test to determine a decision threshold for the operator. In this way we can implement a variety of classical detection schemes such as Maximum Likelihood and MAP. We have also proved a theorem concerning predicting the optimum decision threshold for the Pratt Figure of Merit. The predicted optimum threshold for a 3x3 SV operator was found to be virtually identical to the actual value obtained by the usual experimental method. This demonstrates the power of the detection theory approach to finding edges in images. Quantitative Analysis of a Moment Baed Edge Opeat.r 11

RSD-TR-5-84 9. REFERENCES [1] Kunt, M., "Edge detection: a tutorial review," Proc. of ICASSP 82. IEEE Inter. Conf. on Acou. Speech 8 Sig. Proc. vol.2, 1982. [2] Davis, L., "A survey of edge detection techniques," Computer Graphics and Image Processing." vol.4, 1975, pp248-270. [3] Weszka, J., C. Dyer, and A. Rosenfeld, " A comparative study of texture measures for terrain classification," IEEE Trans. on Sys. Man 8 Cyber. vol.6, no.4, April 1976, pp269-285. [4] Hale, J., "Detection of elementary features in a picture by non-linear local numerical processing," Proc. Third Int. Joint Conf. on Pattern Recognition. 1976, pp764-768. [5] Wong, E., Stochastic Processes in Information and Dynamic Systems McGraw Hill, 1971. (6J Hogg, R. and A. Craig, Introduction to Mathemtical Statistics Macmillan Co., 1971, pp.316-317. [71 Tiku, M., "Laguerre series forms of noncentral chi-square and F distributions," Biometrika vol.52, 1965, p415. [8J Patnaik, P., "The noncentral chisquare and F distributions and their applications," Biometrika vol.36, 1949, p202. [9] Wilson, E. and M. Hilferty, "The distribution of chi-square," Proc. Nat. Acad. Sci. vol.17, no.694, 1931. [10] Melsa, J. and D. Cohen, Decision and Estimation Theory McGraw-Hill, 1978, pp.21-53. [11] Pratt, W. and I. Abdou, "Quantitative design and evaluation of enhancement/thresholding edge detectors,' Proc. of the IEEE vo.67, no.5, May 1979, pp.753-763. 1 Quantltative Analysis of a Moet Bad Edge Operator

RSD-TR-6-84 10. APPENDIX Theorem: When p(sleo) and p(Ieli) are normally distributed, then the threshold for s that maximizes F is the threshold s, such that: P(do,e) - P(dl,~o) This Theorem will be proved in several parts, and only for F > 0.5. For F less than this, only a weaker statement can be made. (i) Lemma (1): If st is chosen such that P(doel) - P(dl,eo), Then:,m = l = ID. (See Equation 12) proof: Let IT = total number of image points. Then: 4- IT P(e) IP{(doelc) + P(d Il, (18) - IT{P(dvieo) + P(difei)} i rP(d,) D Io *u I maxl,1 )I= 1 = ID Let FTr F resulting from St- T chosen as above. Let Ft - F resulting from threshold st - t (ii) Lemma(2): Ft<Fr;t > T proof: ID 5 I P(dl) and P(dl) is a monotonically nonn-increasing function of i. Furthermore:'t. T ID 1- (Lemma (1)) *Io <I; t> T * t* -t; >_T Thus: Quantitative Analysi of a Moment Baed Edge Oeao 13

RSD-TR-5-84 Ft = 7; t>T it - + ali 2 is a monotonically non-increasing function of t, since I1 is a constant. The lemma follows. (iii) Lemma (3): Ft _Fr; <t T if Fr > 0.5 First, we show that:, P(d1,e ) Ft (d ); t < T (19) proof: M - ID; < T (see above). Therefore: 1 1o T Ft _ A -; t < T Now, if a detected edge point is a real edge point,, 1 = o. 1. The detected edge points that are not real edge points are randomly scattered throughout the image (uid noise). So 4 is large on the average. Therefore: Y1 << E fase eds ptoit + al, re. edt peits 1 al,2 - ITP(dl,e I) Therefore: Ft m I, f eg ete 1 + tID (, epe psnts + aI + fdse edg p1.tJ I + at, = ~1 {IrP(dl,el)) Ir P(d l)) P(dj) Next we prove the lemma by showing that: 14 Qmntiti tmate Anbelys of a Ment Baed Edge Operator

ISD-TR-6$84 dF > 0 for f < T if Fr > 0.5 (20) proof: Using Equation 19: dFt d f P(d1,ei) 1 dt ~ d1if P(d 1) i - [i { {P (d,e )P(d1) - PI (di)P(d f,,) So: >dF o <P> o, < P(d,) (21) dfF P' (d) P(d<) since P' (d,) P(d,) < 0 PI (d,el) di P(d l -.d p(*,e)d - p(,e 1) And similarly: P' (dl) - p(t,el) + p(, eo) dFt p(>,e1) P(die1) d > p(u,e1) + p(t,eo) P(d1, e) + P(d1,o) 22) (22) P(u,eo) P(d1,eo) p(,' 1)" P(Il,, 1) Now, from Equation 19: P(d 1, 1e) P(dl,eo 1) + P(d1to) P(d1,eo) F - 1 Qualiltttiv.e Anaiayb ot a Momenst Eaed Edgl Opeu.tor 16

RSD-TR-6-84 P(d1,e1) Examining Equation 22 in light of Equation 23, we see that Equation 20 and the Lemma will be proved if we can but show: p(,e) > 1 <;t T (24) p(u,e 1) But by definition, T is such that: P(do,el) - P(d1,eO) -0 T ie. f p(*pe j)dt f p(,,eo)d* 2 r or: P(Me) fp(I'e)d - P(eo)fp(uleo)d (25) or:. P(e)Qt l, " - P(e0) Q T0 Now since a1 > ao always for this operator (see Section 5 and Equation 10), Equation 25 therefore implies: P(6l) p(-Trlc) < P(eo)p(,-Tlco) (26) - P(o)p(,- 2l1co) P(e I)p(,- Tl e ) p( - T, eo) 1 < p(, Te) (27) ==) 1 < p(,,e); g < T This proves Lemma 3. Lemmas 2 and 3 together prove the Theorem when FT > 0.5. Equation 19 gives the resulting Figure of Merit. If F,, < o.s, only Lemma 2 holds; i.e. the optimum threshold occurs at T or less. 16 Quuantittive: An1aly of *a Momnt BDuod Edp Operator

Figure 1: A ramp edge in cross section. Here, the height is h and the width is two pixels. of.0. T)DL srl I a.e t 1 Figure 2: Contours of equal parameter e for 3 3x3 SV operator. The displacement is in pixels.

60. t j -I* *.s *.. -.s -...o. *'.8 1Figure 3:' Contours of equal magnitude for the 3.x3 Sobel operator. re) pl) S St S Figure 4: The density function of S conditioned on the presence or absence of an edge in the operator window.

c1 2.cS, o t.1OO~rn tt. -n \o F 1o Figure o f Merit. ~1.0 0 10 0 0.t a U 2 2 W So1r-&l tLO't~tsP r*sCtO Figure 6: Predicted (curve) verses actual (stars) Pratt Figure of Merit.

3 9015 02i63 06lw