OPTIMAL SAMPLING FOR COORDINATE MEASUREMENT: ITS DEFINITION AND ALGORITHM

Tony C. Woo and Ren Liang
Department of Industrial & Operations Engineering
University of Michigan
Ann Arbor, MI 48109-2117

Technical Report 92-29
April 1992

Optimal Sampling for Coordinate Measurement: Its Definition and Algorithm*

Tony C. Woo and Ren Liang
Industrial and Operations Engineering, University of Michigan, 1205 Beal Avenue, Ann Arbor, MI 48109-2117, USA

Abstract

Accuracy in discrete sampling with coordinate measuring machines is first defined as the discrepancy of a finite point set, the lower bound for which was established by a Fields medalist. Then, from number theory, a particular sequence of numbers is used to compute the sampling locations -- resulting in a nearly quadratic reduction in the sample size, over the uniform or random distributions, at the same level of accuracy. Finally, experimental errors in the measurement of machined surfaces modeled as random processes are compiled, validating the number-theoretic predictions.

* Invited talk at Quality through Engineering Design, an Indo-U.S.-Japan Conference, Bangalore, India, January 11-14, 1993. This work was supported in part by the Fulbright Foundation, grants DMC-8903029 and DMC-9113791 from the National Science Foundation, and a grant from the University Research Fund, Scientific Research Laboratory, Ford Motor Company.

1. INTRODUCTION

Quality has many connotations and implications, such as customer satisfaction, management style, and so on. In the engineering context, quality can be associated with precision in manufacturing. In the life cycle of a product, precision is expressed as tolerances in dimensioned drawings during the design stage. Viewing the ideal shape (with nominal dimensions) as the input to a manufacturing channel [1], its output (the realized shape) is corrupted by noise in the channel. As there are encoding and decoding schemes in signal communications, tolerance specification serves as a decoding scheme: inspection detects errors (but does not correct them). The difficulty in error correction is deeper than that which meets the eye. The signal in the manufacturing channel is not only complex in its dimensionality (as products are usually three-dimensional rather than one-dimensional bit streams); there are also semantics embedded in the signal, as the functionality of the product.

Inspection, as an error detection scheme, should be in itself error free. Confidence in the inspection process is, however, questionable. In 1988, R. Walker of Westinghouse issued an alert through a medium called GIDEP (Government Industry Data Exchange Program) [2]: the vendors of coordinate measuring machines (CMMs) used different algorithms in their machines -- resulting in disagreements as high as 60%! (An immediate conclusion one can draw is that good parts may have been rejected and bad parts may have been put in service.) This phenomenon was subsequently termed the divergence of methods [3]. There are actually two places where methods are invoked during inspection: for data gathering (or sampling) and for data interpretation (or fitting).

Section 2 of this paper reviews the methods for both and brings out an important finding -- the choice of a metric is subjective and dependent on the implicit function of the design; but within the same metric there exists an absolute (hence objective) bound on accuracy, leading to an optimal sampling strategy. Section 3 clarifies optimality as accuracy and time in coordinate measurement. Based on the work of K.F. Roth, the Fields medalist, accuracy in approximation is expressed by a mathematical notion called the discrepancy of a finite set of points, for which a lower bound exists [4]. For a finite set of N sample points, the time for sampling and for fitting can also be quantified in terms of N. Then, a sampling strategy based on the Hammersley sequence [5] is compared to uniform sampling. It shows a remarkable, nearly quadratic reduction in the number of samples, hence in time, while maintaining the same level of accuracy. (Loosely speaking, a quadratic reduction of 250,000 points yields 500.) Finally, implementation of the algorithm and simulation of inspection on machined surfaces [6] are conducted. The results confirm the theoretical prediction. This paper thus contributes to the convergence of sampling methods.

2. SOURCE OF DIVERGENCE

Because the notions of distance and metric are central to the source of divergence, it is useful to review them first. Then, divergence in fitting algorithms is given a context.

2.1 Distance and Metric

The basic idea in fitting is that of an elevation in dimensionality. Sample points are not only numerous, they are also zero-dimensional, topologically. By fitting a curve through points, or a surface through curves, hence elevating the dimensionality, two objectives are achieved. One, the size of data storage is reduced -- for later algorithmic processing. Two, a group behavior emerges -- for interpretation and decision-making. While there is abundant literature on approximation such as [7], distance is the basic element in computation.

Consider the familiar Euclidean distance between two points (x_1, y_1, z_1) and (x_2, y_2, z_2). For each of the three components of a vector, the difference, e.g. (x_1 - x_2), is first captured. Then, the differences are squared, summed, and a square root taken over the sum, hence the root-mean-square or RMS difference:

    d_2 = [(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2]^{1/2}    (1)

The metric applied in (1) is the so-called L_2 metric; it has the general form, under the L_p metric [8]:

    d_p = [|x_1 - x_2|^p + |y_1 - y_2|^p + |z_1 - z_2|^p]^{1/p}    (2)

Absolute values are taken so as to accommodate the odd and even nature of the exponent p. When p > 2, one finds a variety of distances, each having its own merit. In particular, when p = ∞, (2) becomes a mechanism for selecting the largest of the three components -- the supremum, or maximal element of the set:

    d_∞ = sup [|x_1 - x_2|, |y_1 - y_2|, |z_1 - z_2|]    (3)

The transition from (2) to (3) can be understood by assuming that one of the terms, say |y_1 - y_2|, is the largest of the three. Dividing the right-hand side of (2) by this term yields

    d_p / |y_1 - y_2| = [(|x_1 - x_2| / |y_1 - y_2|)^p + 1 + (|z_1 - z_2| / |y_1 - y_2|)^p]^{1/p}    (4)

The first and the third terms in (4) vanish when p becomes very large, hence effectively rendering the supremum as in (3). Figure 1 illustrates the various d_p.
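To make the metrics concrete, a minimal sketch in Python follows (illustrative only; the function names are ours and not part of any CMM vendor's software). It computes (1), (2) and (3), and shows d_p approaching d_∞ as p grows:

    def lp_distance(a, b, p):
        """L_p distance between two 3-D points, as in (2)."""
        return sum(abs(u - v) ** p for u, v in zip(a, b)) ** (1.0 / p)

    def sup_distance(a, b):
        """L_infinity distance: the largest component difference, as in (3)."""
        return max(abs(u - v) for u, v in zip(a, b))

    p1, p2 = (1.0, 2.0, 3.0), (4.0, 6.0, 3.0)
    print(lp_distance(p1, p2, 2))   # 5.0, the RMS (Euclidean) distance of (1)
    print(sup_distance(p1, p2))     # 4.0, the supremum of (3)
    print(lp_distance(p1, p2, 50))  # close to 4.0: d_p tends to d_infinity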

2.2 Fitting Algorithms

Having seen that distance can adopt infinitely many measures, it should not be surprising that there is a variety of algorithms for elevating the dimensionality of point data -- by fitting the points into curves and surfaces [9]. The word "fitting" implies a pre-conceived notion of some ideal geometry. Indeed, there must be some polynomial of a fixed degree into which the data fits; otherwise, fitting becomes arbitrary. The kinds of ideal geometries in computer-aided design (CAD) are: linear, quadratic, cubic, and higher order. Examples of linear geometry are lines and planes; examples of quadratic geometry are cylinders, cones, and spheres. When the geometry is cubic or higher, there are few familiar shapes, with the possible exception of a torus (which can be used to blend surfaces as fillets). In CAD, these higher-order surfaces are called free-form surfaces, implying the degrees of freedom that a designer has in expressing the desired shape through a set of control points. By virtue of the degree of local control the designer has on the shape, there are various interpolators [10, 11], such as Hermite, Bernstein and other polynomials, giving rise to the Coons, Bezier, B-spline, and non-uniform B-spline (NURB) curves and surfaces familiar to the CAD community.

Conceptually, there are three basic criteria for fitting data to these ideal geometries: INNER, OUTER, and THROUGH. (By the distinction of INNER and OUTER, there is an implied direction that indicates the material side of the workpiece under inspection.) Data can be fitted so that all points lie on the inside (the material side) of the ideal geometry, or conversely, on the outside. While one can take both extremes, INNER and OUTER together, giving rise to a "tolerance zone," fitting an ideal geometry THROUGH the data, such that some points lie on the inside and others on the outside, is a third alternative -- giving a sense of an "average." (It may be conjectured that the criterion for fitting data and the choice of a metric depend on the functionality of the component as implied by the intent of the design [12].)

To ensure that the ideal geometry satisfies any of the three criteria, a distance function must be invoked, which can be linear, as in (3), or non-linear, as in (1) or (2). Thus, fitting algorithms are seen in general as linear or non-linear optimization problems [13] whereby some form of (linear or non-linear) distance is minimized [14] as constrained by an ideal (linear or non-linear) geometry. When the ideal geometry is

simple, as in the case of lines, circles, planes, and cylinders, analytic solutions are possible [15].

After the data has been fitted, it is compared to the ideal geometry. Here, again, the notion of distance must be invoked. For speed, one may choose the largest component, as in (3), as an approximation to distance. For accuracy, the RMS distance, as in (1), is adopted. Recall that the objective of fitting point data is to elevate the dimensionality. Now that the zero-dimensional points have been raised to one-dimensional curves or two-dimensional surfaces, the distance between the fitted and the ideal curves and surfaces once again invokes that of points: to find the distance between two curves, for example, one computes the distance between a pair of closest points on the two curves and treats it as the distance between the curves. Adopting the paradigm of the L_p metric, one can take the square of the closest-point distance, the cube, and so on -- as distances. Without compounding the divergence in methods for assessing the distance between the elevated entities, the irony of elevation may be noted: the dimensionality of point data is elevated from zero to one (or two); yet the distance used is for zero-dimensional entities.

3. SAMPLING ALGORITHM

Discrete sampling is inherently an approximation process. If the number of samples is infinite, one may imagine that the error in approximation approaches zero. And when the sample size is finite, the error must be nonzero. This observation induces two practical questions: For any sampling sequence of size N, how small can the error be? (Section 3.1 answers this question.) And, in what way should the N samples be located in space? (This is answered in Section 3.2.) Together, these two sections address the accuracy and the time for coordinate measurement.

3.1 Sampling Error

It has been noted that once a metric is adopted, the variety of methods is less divergent than otherwise. But this still leaves open the question: Supposing one adopts the L_2 (or L_∞) metric, what is the best method for sampling? In the last section, the measure of "goodness" has been heuristic (or subjective): taking the least of the squares or minimizing the maximum. This section reports an objective criterion for assessing sampling errors. Such a criterion does not contradict the works of Gauss, Newton and Tschebychev [7, 9, 14]; it was discovered at least a century later by K.F. Roth of Imperial College, who received the equivalent of a Nobel prize in mathematics -- the Fields medal -- in 1958. Roth's contribution is in establishing a lower

bound on the discrepancy D between a finite set of N points and any function, in d dimensions [4]:

    D ≥ O(N^{-1} (log N)^{(d-1)/2})    (5)

The significance of (5) is that the lower bound is expressed only in terms of N and d; it is devoid of any constraint on the function which the samples approximate. Most important to coordinate measurement is that one can determine whether a finite set of samples of size N, giving rise to a discrepancy D, meets Roth's bound -- hence giving the least amount of error possible. For calibration of the sampling sequences in the next section, Roth's bound in low dimensions is:

    d = 2:  D ≥ O(√(log N) / N)    (6)

    d = 1:  D ≥ O(1/N)    (7)

As important as discrepancy is to coordinate measurement, the concept may be unfamiliar. The remainder of this section gives an outline of the basic idea along with definitions. It is convenient to begin in two dimensions with a unit square [0,1] x [0,1] in which a region of interest, bounded by some function, resides. The area of the region can be approximated with points p_1, p_2, ..., p_N by the fraction:

    area ≈ (1/N) Σ_{i=1}^{N} χ(p_i)    (8)

where χ(p_i) is the characteristic function indicating whether or not a point p_i ∈ [0,1]^2 is in the region:

    χ(p_i) = 1 if p_i is in the region, 0 otherwise

Generalizing (8) to d dimensions, whereby the unit square becomes a unit hypercube I^d = [0,1]^d bounding a function f(t), t ∈ I^d, one has:

    ∫_{I^d} f(t) dt ≈ (1/N) Σ_{i=1}^{N} f(p_i)    (9)
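The estimates (8) and (9) are easily exercised numerically. A minimal sketch follows (illustrative only; pseudo-random points are used here for convenience, whereas the point of this paper is precisely that better-designed sequences exist):

    import random

    def estimate_area(in_region, points):
        """Fraction of the sample points inside the region, as in (8)."""
        return sum(1 for p in points if in_region(p)) / len(points)

    def estimate_integral(f, points):
        """Average of f over the sample points, approximating (9)."""
        return sum(f(p) for p in points) / len(points)

    random.seed(0)
    pts = [(random.random(), random.random()) for _ in range(10000)]

    # Area of the quarter disk s^2 + t^2 <= 1 in the unit square (pi/4 ~ 0.7854).
    print(estimate_area(lambda p: p[0] ** 2 + p[1] ** 2 <= 1.0, pts))

    # Integral of f(s, t) = s * t over the unit square (exactly 1/4).
    print(estimate_integral(lambda p: p[0] * p[1], pts))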

Calling the difference between the two sides of (9) the residual:

    R(t, N, p_i) = ∫_{I^d} f(t) dt - (1/N) Σ_{i=1}^{N} f(p_i)    (10)

the discrepancy in the L_2 norm, or L_2 discrepancy, is defined as [16]:

    D_2 = { ∫_{I^d} [R(t, N, p_i)]^2 dt }^{1/2}    (11)

Note the square root, the mean (by taking the integral in the sense of (9)), and the squaring of R. Likewise, in the L_∞ norm, the L_∞ discrepancy is defined by:

    D_∞ = sup_{t ∈ I^d} |R(t, N, p_i)|    (12)

While there is a similarity between d_2 in (1) and D_2 in (11), and likewise between d_∞ in (3) and D_∞ in (12), it is noted that f(t) in (10) is a real-valued function that must be Riemann integrable. A variation of the residual (10) is to take the difference between the volume of a hypercube and its approximation [16]:

    R*(t, N, p_i) = Π_{j=1}^{d} t_j - (1/N) Σ_{i=1}^{N} χ(p_i)    (13)

where χ(p_i) indicates whether p_i lies in the box [0, t_1) x ... x [0, t_d) whose far corner is t. This gives rise to D_2* and D_∞*, respectively:

    D_2* = { ∫_{I^d} [R*(t, N, p_i)]^2 dt }^{1/2}    (14)

    D_∞* = sup_{t ∈ I^d} |R*(t, N, p_i)|    (15)

And Roth [4] established the lower bound:

    D_∞* ≥ D_2* ≥ k N^{-1} (log N)^{(d-1)/2}    (16)

where the constant k depends on the dimension d only. The existence of such a bound triggered a flurry of activity in the construction of sequences of numbers, one of which is the subject of the next section.
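The quantities (14) and (15) can themselves be estimated numerically. A minimal sketch follows (illustrative only): it evaluates the residual R* of (13) at randomly sampled corners t and accumulates crude estimates of D_2* and D_∞*; an exact computation would integrate over all t ∈ I^2 rather than sample.

    import random

    def star_residual(t, points):
        # R*(t, N, p_i) of (13): volume of the box [0, t1) x [0, t2)
        # minus the fraction of the points falling inside it.
        inside = sum(1 for p in points if p[0] < t[0] and p[1] < t[1])
        return t[0] * t[1] - inside / len(points)

    def estimate_discrepancies(points, trials=20000):
        # Crude estimates of D_2* (14) and D_inf* (15) by sampling corners t.
        random.seed(1)
        sq_sum, worst = 0.0, 0.0
        for _ in range(trials):
            r = star_residual((random.random(), random.random()), points)
            sq_sum += r * r
            worst = max(worst, abs(r))
        return (sq_sum / trials) ** 0.5, worst

    grid = [((i + 0.5) / 10, (j + 0.5) / 10) for i in range(10) for j in range(10)]
    print(estimate_discrepancies(grid))   # (D_2* estimate, D_inf* estimate)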

3.2 Sample Size and Locations

It may be intuitive that, the larger the sample size, the more accurate the measurement. Such an intuition is confirmed by (7), in that the minimal error of approximation is inversely proportional to the sample size N, in one dimension. But as the dimensionality of the sample points increases, the error is also directly proportional to some power of the logarithm of N, as in (6). The outstanding question is: are there sequences of numbers that approach or even reach Roth's bound, as in (5)? The answer turns out to be in the affirmative, from the works of researchers in number theory.

Before going into sequences, a word on the dimensionality d of the samples is in order. For a surface under inspection to be visible to an inspection machine, the surface can only be assumed to be single-valued. When the machine is a point sampling machine, it may be assumed that a command coordinate pair (x_i, y_i) is issued and a value z_i is read. When the machine is a line sampling machine, it may be assumed that a command coordinate (x_i) is issued and a pair (y_i, z_i) is read. The dimensionality of a sample, though always 3 after the data is read in, differs by the type of machinery used. In this work, sampling with the CMM is discussed; hence d = 2. To choose freely among the pairs (x_i, y_i), (x_i, z_i), and (y_i, z_i), the command coordinate (s_i, t_i) is identified with the term sample.

The sequence of numbers that Roth used to exhibit the tightness of his bound [17] is the Hammersley sequence [5]. Its coordinates (s_i, t_i) in two dimensions are defined by:

    s_i = i/N
    t_i = Σ_{j=0}^{k-1} b_j 2^{-j-1}    (17)

where

    i ∈ [0, N-1]
    N = total number of points
    k = ⌈log i⌉ = number of binary digits needed to represent the largest i
    b_j = the binary digits of i, i.e. i = Σ_{j=0}^{k-1} b_j 2^j

As the binary representation of integers is used, logarithms are taken to base 2. Suppose N = 10 Hammersley points are used for inspection. The procedure for computing the Hammersley coordinates is as follows:

1. Determine k: Since i ∈ [0, 9], k = ⌈log 9⌉ = 4.

2. Determine b_j: The four-bit representations (b_3 b_2 b_1 b_0) for i are: (0 0 0 0), (0 0 0 1), ..., (1 0 0 1).

3. Compute s_i: According to (17), the s_i coordinates for the N = 10 points are: 0/10, 1/10, 2/10, ..., 9/10.

4. Compute t_i: The binary representation for the t_i coordinate is obtained by taking the "mirror image" of the binary representation for i, about the binary point. For instance, for i = 1 = (0 0 0 1), or s_1 = 1/10, the mirror image is (.1 0 0 0), or t_1 = 8/16. For i = 7 = (0 1 1 1), or s_7 = 7/10, the mirror image is (.1 1 1 0), or t_7 = 14/16.
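The four steps above translate directly into a short program. A sketch in Python follows (illustrative only); the bit mirroring of step 4 is done arithmetically rather than by writing out binary strings:

    def hammersley(n):
        """The n two-dimensional Hammersley points of (17): s_i = i/n, and
        t_i = the binary digits of i mirrored about the binary point."""
        points = []
        k = max(1, (n - 1).bit_length())   # number of binary digits, step 1
        for i in range(n):
            t, bits = 0.0, i
            for j in range(k):             # mirror bit j to weight 2^-(j+1)
                t += (bits & 1) * 2.0 ** -(j + 1)
                bits >>= 1
            points.append((i / n, t))
        return points

    for s, t in hammersley(10):
        print(s, t)   # (0.1, 0.5) for i = 1 and (0.7, 0.875) for i = 7,
                      # reproducing t_1 = 8/16 and t_7 = 14/16 of step 4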

The N = 10 Hammersley points are illustrated in Figure 2. The 10x10 grid is to be interpreted as the locations for the 100 uniform samples.

In the construction of his sequence, Hammersley shows [5] that the discrepancy in d dimensions is of the order:

    D_H ~ O(N^{-1} (log N)^{d-1})    (18)

which is off by a factor of one-half power as compared to the absolute lower bound established by Roth in (5). While slightly suboptimal, the Hammersley sequence shows a nearly quadratic reduction in the number of points needed by uniform sampling -- for the same level of accuracy as measured by discrepancy. More precisely, it is known from Monte Carlo methods that, as the number of samples N approaches infinity, the error of a random sequence becomes indistinguishable from that of a uniform sequence [16]. In particular, the error for a uniform sequence is O(N^{-1/d}). To compare the number of uniform points N_U against the number of Hammersley points N_H, suppose that they yield the same discrepancy in two dimensions:

    1/√(N_U) = (log N_H) / N_H

It becomes immediately obvious that there are almost quadratically as many N_U as there are N_H:

    N_U = (N_H / log N_H)^2    (19)

A plot of the relation between N_U and N_H is given in Figure 3.
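Relation (19) can be evaluated directly. A short sketch follows (illustrative only; logarithms to base 2, as in Section 3.2). The logarithm in the denominator is what makes the reduction "nearly," rather than exactly, quadratic:

    import math

    def uniform_equivalent(nh):
        # N_U with the same two-dimensional discrepancy as N_H Hammersley
        # points, per (19): N_U = (N_H / log N_H)^2.
        return (nh / math.log2(nh)) ** 2

    for nh in (2 ** 7, 2 ** 10, 2 ** 15, 2 ** 20):
        print(nh, round(uniform_equivalent(nh)))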

4. IMPLEMENTATION

The confluence of mathematics and precision engineering has yet many bridges to be established. In his article [18], R. Hocken reports that there need to be characterizations of the errors in the kinematics and dynamics of the CMM, of the lobing errors in the probe of the CMM, and of the uncertainty in the manufacturing processes that create the workpiece. As characterization machine by machine, probe by probe, and workpiece by workpiece can be precise but laborious, the classical mathematization of manufacturing as random processes [19] once again seems attractive. Not only do random processes offer first-order (mean), second-order (variance) and higher-order statistics [20] that capture the ensemble behavior of manufactured surfaces, evidenced in a beautiful catalog of machined surfaces [21]; it turns out that random processes also provide a crucial link to Roth's bound and Hammersley's sequence [22, 23]. It is in this connection that the measurement of the roughness of a machined surface, modeled as a Wiener, Gaussian or any other process, permits the employment of discrepancy [24], hence the Hammersley and other sequences [25].

A number of experiments have been conducted to assess the accuracy of the Hammersley points for coordinate measurement. The numbers of samples range from 16 to 15,625 -- for two reasons. First, uniform distributions of N_U = 10x10, 25x25, 50x50 and 125x125 = 15,625 points are used as benchmarks. Secondly, the Hammersley points are chosen at N_H = 16, 32, 64 and 128 to approximate the reduction in (19). Machined surfaces are simulated with two models: Wiener and Gaussian. For each of the two surface models, the uniform and the Hammersley points are used as samples so that their respective accuracies can be assessed. To give meaning to the notion of discrepancy (11), (14), the roughness of a surface is measured, such that the average height h at a point (s, t) on the surface is approximated by N discrete measurements:

    ∫_0^1 ∫_0^1 |h(s,t)| ds dt ≈ (1/N) Σ_{i=1}^{N} |h(p_i^s, p_i^t)|    (20)

where p_i^s and p_i^t are the s- and t-coordinates of the ith point in a sequence of N points. (Absolute value is taken on h(s,t) in (20) so as to ensure that the asperities do not cancel out.) The similarity between (20) and (9) may be noted here. Calling the difference between the two sides of (20) the error:

    e(s, t, N, p_i) = ∫_0^1 ∫_0^1 |h(s,t)| ds dt - (1/N) Σ_{i=1}^{N} |h(p_i^s, p_i^t)|    (21)

the root-mean-square error in surface measurement, e_RMS, by invoking the L_2 metric, is simply:

    e_RMS = { ∫_0^1 ∫_0^1 [e(s, t, N, p_i)]^2 f(s,t) ds dt }^{1/2}    (22)

where f(s,t) is the probability density function (for the Wiener and the Gaussian distributions). Again, note the similarity between (22) and (11).
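To indicate how (20) and (21) are exercised, a minimal sketch follows (illustrative only). A simple deterministic test surface stands in for the Wiener and Gaussian models, which require their own simulators [6]; the integral on the left of (20) is approximated on a dense reference grid:

    import math

    def hammersley(n):
        # Two-dimensional Hammersley points, as in the Section 3.2 sketch.
        k = max(1, (n - 1).bit_length())
        return [(i / n, sum(((i >> j) & 1) * 2.0 ** -(j + 1) for j in range(k)))
                for i in range(n)]

    def estimated_roughness(h, points):
        # Right-hand side of (20): the average of |h| over the sample points.
        return sum(abs(h(s, t)) for s, t in points) / len(points)

    def reference_roughness(h, m=400):
        # Dense-grid stand-in for the integral on the left-hand side of (20).
        return sum(abs(h((i + 0.5) / m, (j + 0.5) / m))
                   for i in range(m) for j in range(m)) / (m * m)

    # A smooth periodic test surface standing in for a machined-surface model.
    def h(s, t):
        return math.sin(4 * math.pi * s) * math.cos(2 * math.pi * t)

    exact = reference_roughness(h)
    grid = [((i + 0.5) / 8, (j + 0.5) / 8) for i in range(8) for j in range(8)]
    for name, pts in (("uniform 8x8", grid), ("Hammersley 64", hammersley(64))):
        print(name, abs(estimated_roughness(h, pts) - exact))  # error e of (21)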

Figure 4 shows thirty-two Hammersley points superimposed on a Wiener surface and on a Gaussian surface. It may be seen that the Wiener surface is smooth and periodic, simulating single-point cutting with a rotary tool. The Gaussian surface is more textured, simulating such processes as electro-discharge machining or sand-blasting.

Twenty runs are made for each of the Wiener and the Gaussian surface models, and the RMS errors in using the Hammersley points versus the uniform points are calculated based on (22). Table 1 shows the experimental results on the RMS errors for the Wiener surfaces, with approximate correspondences in the equivalent numbers of sample points.

    uniform                Hammersley
    number    RMS error    number    RMS error
    50x50     0.01414      64        0.00678
    25x25     0.03295      32        0.01081
    10x10     0.09469      16        0.01407

    Table 1. RMS Error on Wiener Surfaces

Table 1 shows that, for approximately a quadratic reduction in the number of points, the Hammersley samples are more accurate than uniform sampling. Results on the twenty Gaussian surfaces are given in Table 2.

    uniform                Hammersley
    number    RMS error    number    RMS error
    125x125   0.02254      128       0.01766
    50x50     0.08092      64        0.04398
    25x25     0.06107      32        0.05453
    10x10     0.52517      16        0.08300

    Table 2. RMS Error on Gaussian Surfaces

For approximately a quadratic reduction in the number of points, the Hammersley points are of the same order in accuracy as the uniform points, as shown in Table 2, with the possible exception of the last row.

5. CONCLUSION

This work introduces the application of the Hammersley sequence to coordinate measurement, hence bringing some convergence to sampling methods. The basis upon which the Hammersley sequence applies is that it exhibits a discrepancy approaching the number-theoretic limit asserted by Roth. And discrepancy is related to the RMS error in measuring surface roughness. For point sampling, users of the CMM may wish to examine the reported strategy over random or uniform sampling. For line sampling, practitioners in computer vision may take heart that a scan of 500 x 500 points can now be dramatically compacted [26]. And likewise for plane sampling, such as in computed tomography, a similar saving in data storage is in hand.

6. REFERENCES

1. U. Roy, C.R. Liu and T.C. Woo, Review of Dimensioning and Tolerancing: Representation and Processing, Computer-Aided Design, 23(7), (1991), 466-483.
2. R. Walker, GIDEP Alert, X1-A-88-01, (1988), August 22.
3. R. Schreiber, The Methods Divergence Dilemma, Manufacturing Engineering, (1990).
4. K.F. Roth, On Irregularity of Distribution, Mathematika, 1, (1954), 73-79.
5. J.M. Hammersley, Monte Carlo Methods for Solving Multivariate Problems, Annals of the New York Academy of Sciences, 86, (1960), 844-874.

6. G.M. Zhang and S.G. Kapoor, Dynamic Generation of Machined Surfaces, Part 2: Construction of Surface Topography, ASME Trans., J. Engineering for Industry, 113, (1991), 145-153.
7. M.J.D. Powell, Approximation Theory and Methods, Cambridge University Press, (1981).
8. Y.A. Schreider, What is Distance?, University of Chicago Press, (1974).
9. P. Lancaster and K. Salkauskas, Curve and Surface Fitting, Academic Press, (1986).
10. M. Mortenson, Geometric Modeling, Wiley, (1985).
11. G. Farin, Curves and Surfaces for Computer Aided Geometric Design, Academic Press, (1988).
12. S.Y. Chou, T.C. Woo, and S.M. Pollock, On Assessing Circularity, Technical Report 92-16, Department of Industrial and Operations Engineering, University of Michigan, Ann Arbor, MI, (1992).
13. K.G. Murty, Linear Complementarity, Linear and Nonlinear Programming, Heldermann Verlag, Berlin, (1988).
14. W.H. Press, B.P. Flannery, S.A. Teukolsky, and W.T. Vetterling, Numerical Recipes: The Art of Scientific Computing, Cambridge University Press, (1989).
15. V.B. Le and D.T. Lee, Out-of-Roundness Problem Revisited, IEEE Trans. Pattern Analysis and Machine Intelligence, 13(3), (1991), 217-223.
16. L. Kuipers and H. Niederreiter, Uniform Distribution of Sequences, John Wiley and Sons, New York, (1974).
17. K.F. Roth, On Irregularity of Distribution, Acta Arithmetica, 37, (1980), 67-75.
18. G. Caskey, Y. Hari, R. Hocken, D. Palanvelu, J. Raja, R. Wilson, K. Chen, and J. Yang, Sampling Techniques for Coordinate Measuring Machines, Proc. NSF Design and Manufacturing Systems Conf., Atlanta, GA, (1992), 779-786.
19. J. Peklenik, Contribution to the Theory of Surface Characterization, Annals of the CIRP, 12, (1963), 173-176.
20. S. Ross, Stochastic Processes, Wiley, (1983).
21. K.J. Stout, E.J. Davis, and P.J. Sullivan, Atlas of Machined Surfaces, Chapman and Hall, London, (1990).
22. B.A. Cipra, The Breaking of a Mathematical Curse, Science, (1991), 165.
23. H. Wozniakowski, Average Case Complexity of Multivariate Integration, Bull. AMS, 24(1), (1991), 185-194.
24. R. Liang and T.C. Woo, Accuracy and Time in Measuring Surface Roughness, Part 1: Mathematical Foundations, Technical Report 92-17, Department of Industrial and Operations Engineering, University of Michigan, Ann Arbor, MI, (1992).
25. T.C. Woo and R. Liang, Accuracy and Time in Measuring Surface Roughness, Part 2: The Uniform, Hammersley and Zaremba Sequences, Technical Report 92-18, Department of Industrial and Operations Engineering, University of Michigan, Ann Arbor, MI, (1992).
26. T.C. Woo, G. Lang and R. Liang, Square-root Sampling of Image Arrays, Technical Report 92-19, Department of Industrial and Operations Engineering, University of Michigan, Ann Arbor, MI, (1992).

Figure 1. Relative importance of the individual terms in (4) for different p (axes: |x_1 - x_2| or |z_1 - z_2| against |x_1 - x_2|^p or |z_1 - z_2|^p).

Figure 2. The distribution of ten Hammersley points (in the unit square, both axes running from 0.0 to 1.0).

Figure 3. Hammersley points (N_H, horizontal axis, 0 to 200) vs. uniform points (N_U, vertical axis, 0 to 8000) under the same accuracy.

Figure 4. Hammersley points on Wiener and Gaussian surfaces. (a) Thirty-two Hammersley points on a modified Wiener surface. (b) Thirty-two Hammersley points on a Gaussian surface.