THE UNIVERSITY OF MICHIGAN
COMPUTING RESEARCH LABORATORY

THE USE OF THE DEANZA IP6400 IMAGE PROCESSOR FOR LOCAL WINDOW OPERATIONS

Edward J. Delp and Nirwan Ansari

CRL-TR-7-84
JANUARY 1984

Room 1079, East Engineering Building
Ann Arbor, Michigan 48109 USA
Tel: (313) 763-8000

Any opinions, findings, and conclusions or recommendations expressed in this publication are those of the authors and do not necessarily reflect the views of the funding agency.


THE USE OF THE DEANZA IP6400 IMAGE PROCESSOR FOR LOCAL WINDOW OPERATIONS

Edward J. Delp and Nirwan Ansari
The University of Michigan
Department of Electrical and Computer Engineering
Ann Arbor, Michigan 48109

ABSTRACT

This report discusses the use of the DeAnza IP6400 image processor for high speed local window operations. A system overview of the image processor and its important features is given first. Algorithms for linear and non-linear window operators are then presented.

1. Introduction

Most image processing operations take a large amount of memory space and CPU time. General purpose computers are not usually equipped with the features necessary for high speed image processing operations, yet high speed image processing is desirable in many industrial and medical applications. The DeAnza IP6400 is used for its special features that make real time processing feasible. This report first gives an overview of the IP6400 and its most important option, the digital video processor (DVP), which provides facilities for various processing operations. Algorithms using the IP6400 to perform general 3 X 3 local window operations, n X m local window operations, 3 X 3 median filtering, and n X m median filtering are then presented.

2. System Overview

2.1. The DeAnza IP6400

The IP6400 Series Image Array Processor contains many features which make it a powerful imaging system. It can be used solely as a display system which provides high resolution color, pseudo color, and multiple monochrome image display. Coupled with other options, real time image processing can be achieved using the IP6400. Among the options acquired in our system are a video signal digitizer and control, alphanumeric overlay generator, graphic overlay channel, trackball cursor generator and control, joystick cursor generator and control, external synchronous source input, video output controller, and, most of all, the digital video processor arithmetic unit which makes window operations feasible. Our system also contains four 512x512 8-bit memory planes and four video cards with 8-bit DACs. It is interfaced with a VAX 11/780 host computer.

[Figure 1. Overview of the DeAnza IP6400]

2.2. Digital Video Processor (DVP)

The importance of the DVP is due to its ability to handle operations on a large amount of image data in one video frame time (33 ms). The data paths of the DVP are shown in Figure 2. The DVP has two arithmetic/logic units (ALUs), which allow the conditional outputs from the test ALU to determine swapping of the two operation codes of the operational ALU. In other words, the results of a test on an individual pixel can be used to determine the operation on that pixel.

[Figure 2. Data Path of the DVP]

channels directly or through the Intensity Transformation Tables (ITT) under control of the Memory Port Control register (MEMPC). The INPDP also allows data acquisition from the camera A/D, and constant values from the constant register or auxiliary input. Users can also make use of it to achieve compound operations such as 16-bit arithmetic operations. Under the control of the Output Data Paths and Shift Control register (ODPSH), output data from the ALUs can be shifted or rotated as either two independent 8-bit data values or a combined 16-bit data value to each image memory channel, including the graphic overlay. The Bit Plane Mask register (BPM) controls the output writing to any single bit or combination of bits of each image memory channel. The Destination registers (DEST) may be used to scroll data output from the DVP to the target memories in multiples of 16 pixels. This is convenient for arithmetic operations on two memory channels. For arbitrary offsets, not restricted to multiples of 16 pixels, there are scroll registers associated with each memory to scroll, zoom, and selectively inhibit data from the source memories. Thus, arithmetic operations can readily be done by appropriate manipulation of these registers.

The rest of the paper discusses the philosophy of using the IP6400 to perform local window operations, including median filtering. The mnemonics, virtual addresses, and functions of the DVP registers and other registers which are used for window operations are found in Appendix A.

3. Local Window Operations

The simplest image operation is the local window operation. These operations take a great deal of time if done on a general purpose computer because the operations are usually performed

A DVP operation is a simple arithmetic or logic operation on an image through the DVP. A series of programs has been written to perform operations using the DVP. These programs mainly set up correct communication between the host computer and the IP6400 by loading and synchronizing the registers and operations of the IP6400. Manual entries for the various programs can be found in Appendices B to G.

3.1. 3 X 3 Window Operation

A 3 X 3 local linear window operation on an image can be thought of as a convolution of a 3 X 3 window with the image. Consider the following weighted window:

    w1  w2  w3
    w4  w5  w6
    w7  w8  w9

As the window moves across the image, the pixel value in the center of the window at that instant is replaced by the sum of the products of the window coefficients and the corresponding pixel values of the image. In a global sense, the operation can be perceived as the sum of nine image planes, in which each plane is the scrolled image multiplied by one of the window coefficients, i.e.

      w1 X (original image scrolled one pixel left and up)
    + w2 X (original image scrolled one pixel up)
    + w3 X (original image scrolled one pixel right and up)
    + w4 X (original image scrolled one pixel left)
    + w5 X (original image)
    + w6 X (original image scrolled one pixel right)
    + w7 X (original image scrolled one pixel left and down)
    + w8 X (original image scrolled one pixel down)
    + w9 X (original image scrolled one pixel right and down).

The versatility of the DVP comes into play for such an operation. Each multiplication of one coefficient with the whole scrolled image takes only one video frame time (one DVP operation). The overall operation would take only nine video frame times if no modification is needed after the nine mentioned operations

does not include interface timing between the IP6400 and the host computer, which is usually small. To be exact, timing for communication between the IP6400 and the host computer should also be considered. This timing depends on the speed of the host computer and the number of tasks being performed by the host computer.

Another aspect that should be considered in using the DVP is that the two ALUs are unsigned 8-bit ALUs which are not capable of performing multiplication or division directly. To get around this, the Intensity Transformation Table (ITT) associated with each memory plane must be employed as a look-up table for multiplication. Algorithms that require logarithmic or linear tables can also be implemented. The logarithmic table has the advantage of taking care of both positive and negative values, but it loses one bit of accuracy because both positive and negative values are represented by 8-bit data. The linear table is likely to be overloaded if the window coefficient is large. Hence, a scaling factor is necessary to avoid overloading the Intensity Transformation Table. The linear table is preferred in this operation since it can be implemented more easily, without the complication of the sign.

To multiply a memory plane by a coefficient, the ITT associated with that plane is loaded with 256 bytes of data corresponding to the products of the absolute value of the coefficient and the gray levels ranging from 0 to 255. The ITT is then indexed by the pixel intensity. For the 3 X 3 window operation, the result can be accumulated in one memory plane by summing each of the nine scrolled images (scrolled according to each window position) through its associated ITT, loaded with these products. If a coefficient is negative, the ALU performs subtraction.

Synchronization is crucial for programs that involve DVP operations. Any necessary calculations for the next DVP operation should be done in the host computer as soon as the APGO bit (initiating the DVP operation) of the Control and Status register (CSR, see Appendix A) has been set. Only when the APR bit (ready bit) of the Control and Status register becomes set (indicating the completion of a DVP operation), should the required IP6400 registers
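The look-up-table multiplication described on this page can be sketched in a few lines. This is a minimal illustration and not the report's C software: the function names `build_itt` and `multiply_plane`, the truncation of products to integers, and the saturation of overloaded entries at 255 are assumptions made for the example.

```python
def build_itt(coefficient, scale=1.0):
    """Return a 256-entry table mapping gray level g to |coefficient| / scale * g.

    Assumptions for this sketch: products are truncated to integers and
    entries that would not fit in 8 bits are saturated at 255.
    """
    factor = abs(coefficient) / scale
    return [min(255, int(factor * g)) for g in range(256)]


def multiply_plane(plane, itt):
    """Pass every pixel of a memory plane through the table (one look-up per pixel)."""
    return [[itt[pixel] for pixel in row] for row in plane]


# A coefficient of -2 with scaling factor 2 gives the identity table; the sign
# is handled separately by selecting subtraction in the ALU.
table = build_itt(-2, scale=2)
scaled = multiply_plane([[0, 100], [200, 255]], table)
```

Without a scaling factor, a coefficient of 2 overloads the table for every gray level above 127, which is the situation the scaling factor is meant to avoid.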

The 3 X 3 window operation software is written in the C programming language using a VAX 11/780 host computer running UNIX. Three memory planes of the IP6400 are used for the operations. The original image to be processed is assumed to be in memory plane 0, and the result is placed in memory plane 1. Memory plane 2 is used to hold the information on overflows and underflows. Both memory planes 1 and 2 should be cleared before the operation. The algorithm consists of the following steps:

(1) read in nine window coefficients.
(2) do step 3 to step 5 for each window coefficient (i.e., 9 times).
(3) load itt0 with indices corresponding to products of the scaled coefficient (each coefficient is divided by the scaling factor) and image gray levels ranging from 0 to 255.
(4) set memory 2 as the input to the operational ALU, and memories 1 and 0 as inputs to the test ALU. Note that data from memory 0 is scrolled accordingly before the input to the test ALU is set.
(5) if a coefficient is negative, perform subtraction in the test ALU (memory 1 - memory 0), and place the result in memory 1 (accumulating the result in memory 1). At the same time, decrement the input of the operational ALU (decrement memory 2, which stores underflow and overflow information), and place the result in memory 2. Otherwise, perform addition in the test ALU and increment the input of the operational ALU.

Successive results are therefore accumulated in memory 1, while the information about overflows and underflows is accumulated in memory 2. The result in memory 1 would be correct if there are no underflows or overflows and the scaling factor mentioned before is 1. If a pixel in memory 2 indicates an underflow, the corresponding pixel in memory 1 (accumulated result) is actually a negative number. But the ALUs of the DVP are unsigned and image gray level values cannot be negative. It is logical to set to zero, or take the absolute value of, those pixels which are underflows. If a pixel is found to be an overflowed value, that pixel is set to 255. From 2's complement arithmetic, an underflow occurs when a borrow is needed and an overflow when there is a carry bit. Pixels of memory 2

which holds the overflow and underflow information should have a value of zero if there are no overflows or underflows. They should have values close to ff (hex) if there are underflows and values from 1 to 8 if there are overflows. From this information, overflows and underflows can be corrected using the swapping ability of the DVP's ALUs. The following steps are used for correcting the underflows and overflows:

(6) do step 7 to step 8 twice.
(7) place memory 1 as input to the operational ALU, and memory 2 as input to the test ALU.
(8) perform addition (memory 2 + constant) in the test ALU.
    (a) zero underflows: let the constant be 80 (hex). If there is a carry bit from the test ALU, set the output of the operational ALU to zero (80 hex plus an underflow indicator of value close to ff hex would produce a carry bit); otherwise, pass the input of the operational ALU as the output (pixels not underflowed are passed unchanged). Place the output of the test ALU back into memory 2, which now holds the sum of the previous values and the constant. Place the output of the operational ALU in memory 1, which now holds the result with underflows corrected.
        set overflows: let the constant be 7f (hex). If there is a carry bit from the test ALU, set the output of the operational ALU to ff (hex) (7f hex plus the sum of an overflow indicator with value from 1 to 8 and 80 hex (previously added) would produce a carry); otherwise, pass the input of the operational ALU as the output. Place the output of the test ALU back in memory 2, and the output of the operational ALU in memory 1, which now holds the correct result with both underflows and overflows taken care of.

If complementing underflows (taking absolute values) is desired, step (8b) should be taken instead of (8a):

    (b) underflows: let the constant be 01 (hex). Perform complementation of the input of the operational ALU if there is a carry bit from the test ALU (this will complement those

taken care of at the same time in the next step). Place the output from the test ALU in memory 2, and the output from the operational ALU in memory 1 (which now holds the result with the underflows corrected).
        overflows: excessive underflows and overflows can be corrected as in step (8a), except that the constant becomes fe (hex). Set the output of the operational ALU to 255 if there is a carry bit from the test ALU; otherwise pass the input unchanged (summing fe (hex), 01 (hex) (previous operation), and the overflow indicator would not produce a carry bit if and only if the indicator is zero, in other words, there are neither overflows nor underflows). Place the output from the operational ALU in memory 1.

Finally, the result in memory 1 should be scaled (multiplied) by the scaling factor. As mentioned earlier, the scaling factor is used to avoid overloading the ITT. Since partial results are accumulated in memory 1, severe truncation inaccuracy will result if the ITT is overloaded when it is loaded as a multiplication table. This truncation error would be accumulated and thus is undesirable.

[Figure 3. Scroll register positions for the unscrolled image, the image scrolled one pixel left and up, and the image scrolled one pixel right and down]

As a rule of thumb, the scaling factor, which is determined by the user, is usually chosen as the absolute value of the largest window coefficient, provided that the magnitude of the chosen coefficient is greater than one. It would be one if the magnitudes of all the window coefficients

are less than one. Since the result is scaled by the scaling factor, the window coefficients are never affected by this scaling factor. The final step is:

(9) scale the result in memory 1 by the same scaling factor through the ITT of memory 1. Place the result back into memory 1.

The final result is then found in memory 1.

The video pixels are organized as a 512x512 array. Columns are numbered in the x-direction (from left to right) from 0 to 511. Rows are numbered in the y-direction (from bottom to top) from 0 to 511. Pixel location (0, 511) is then the upper left hand corner of the video display. The 3 X 3 window operation requires 9 different scrolling positions for the convolution. For example, the scrolling position for w1 (one pixel left and up) is scrx=1ff (hex) and scry=0 (hex), where scrx and scry are the scroll registers corresponding to the x and y directions (see Figure 3). For 3 X 3 window operations the scrolling positions corresponding to the window positions w1 to w9 are as follows:

    window position   scrx (hex)   scry (hex)
    w1                1ff          0
    w2                0            0
    w3                1            0
    w4                1ff          1ff
    w5                0            1ff
    w6                1            1ff
    w7                1ff          1fe
    w8                0            1fe
    w9                1            1fe

The overall 3 X 3 window operation takes n+3 DVP operations, where n is the number of non-zero window coefficients. There is some inaccuracy in using the IP6400, due to the roundoff errors caused by the Intensity Transformation Table and the fact that the ALUs of the DVP are only 8-bit ALUs. There is, however, a tremendous gain in speed. There is little visible difference between most linear window operations done on the IP6400 and window operations done on a general purpose computer.
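Steps (1) to (9) can be simulated in software to see how the accumulation and the carry-based correction interact. The sketch below is an illustration under stated assumptions, not the report's C program: planes are small Python lists with row 0 at the top, `scroll` moves the image content with wrap-around (the direction convention is an assumption), table products are truncated and saturated at 255, and pixels are visited in a loop where the DVP would process a whole plane per operation.

```python
# (dx, dy) content shifts for w1..w9; negative dx is left, negative dy is up.
OFFSETS = [(-1, -1), (0, -1), (1, -1),
           (-1, 0), (0, 0), (1, 0),
           (-1, 1), (0, 1), (1, 1)]


def scroll(plane, dx, dy):
    """Move the plane content dx pixels right and dy pixels down, with wrap-around."""
    h, w = len(plane), len(plane[0])
    return [[plane[(r - dy) % h][(c - dx) % w] for c in range(w)] for r in range(h)]


def window_3x3(image, coeffs, scale=1.0):
    h, w = len(image), len(image[0])
    mem1 = [[0] * w for _ in range(h)]  # accumulated 8-bit result
    mem2 = [[0] * w for _ in range(h)]  # overflow/underflow indicator
    for coeff, (dx, dy) in zip(coeffs, OFFSETS):
        if coeff == 0:
            continue  # zero coefficients need no DVP operation
        # Step 3: ITT holding |coeff| / scale * gray level (truncated, saturated).
        itt = [min(255, int(abs(coeff) / scale * g)) for g in range(256)]
        plane = scroll(image, dx, dy)
        for r in range(h):
            for c in range(w):
                term = itt[plane[r][c]]
                total = mem1[r][c] + term if coeff > 0 else mem1[r][c] - term
                if total > 0xFF:      # carry: count an overflow
                    mem2[r][c] = (mem2[r][c] + 1) & 0xFF
                elif total < 0:       # borrow: count an underflow
                    mem2[r][c] = (mem2[r][c] - 1) & 0xFF
                mem1[r][c] = total & 0xFF
    for r in range(h):
        for c in range(w):
            # Step 8a, first pass: adding 80 hex carries only for underflows.
            test = mem2[r][c] + 0x80
            if test > 0xFF:
                mem1[r][c] = 0
            mem2[r][c] = test & 0xFF
            # Step 8a, second pass: adding 7f hex carries only for overflows.
            if mem2[r][c] + 0x7F > 0xFF:
                mem1[r][c] = 0xFF
            # Step 9: scale the result back.
            mem1[r][c] = min(255, int(mem1[r][c] * scale))
    return mem1


flat = [[100] * 3 for _ in range(3)]
saturated = window_3x3(flat, [1] * 9)                    # every pixel overflows
negated = window_3x3(flat, [0, 0, 0, 0, -1, 0, 0, 0, 0])  # every pixel underflows
```

With these inputs every pixel of `saturated` is set to 255 and every pixel of `negated` is zeroed, which mirrors the overflow and underflow handling of step (8a).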

3.2. n X m Local Window Operation

The philosophy of using the IP6400 to perform an n X m local linear window operation is very similar to that of the 3 X 3 local window operation. In fact, all the necessary DVP operations are of the same type. It takes nm+3-z DVP operations (z is the number of zero coefficients) for an n X m local window operation: nm-z DVP operations for accumulating the sum of the products of each window coefficient and its correspondingly scrolled image, 2 DVP operations (exactly the same operations described for the 3 X 3 window operation) for correcting underflows and overflows, and 1 DVP operation to scale the result by the same scaling factor mentioned before. One point that should be mentioned here is the evaluation of the scrolling positions for different window sizes. Consider the following window:

    w(0,0)  w(1,0)  ...  w(m,0)
    w(0,1)          ...  w(m,1)
    ...
    w(0,n)          ...  w(m,n)

Since the upper left corner of the video display is positioned by scrx=0 and scry=1ff (hex), the scrolling positions (specified by the scrx and scry registers associated with each memory channel, see Appendix A) can be related to the window coefficients by the following equations. For window position w(i,j):

$$\mathrm{scrx}[i] = (\mathrm{scrx}_0 - \lfloor m/2 \rfloor + i) \mathbin{\&} \mathrm{mask}$$

$$\mathrm{scry}[j] = (\mathrm{scry}_0 + \lfloor n/2 \rfloor - j) \mathbin{\&} \mathrm{mask}$$

where

    scrx_0 = 0
    scry_0 = 1ff (hex)
    mask = 01ff (hex)

integers.

3.3. 3 X 3 Median Filter

Using the IP6400 for two-dimensional median filtering is challenging. Median filtering is a non-linear window operator and consists of a great number of sorting operations [2][3]. One of the best known properties of median filtering is noise reduction with edge intensity preserved [3]. For a 3 X 3 window the median is the 5th largest pixel value within the window. The algorithm to determine whether a given pixel is a median value within a window is to count the number of other pixels within the window having values less than, and the number equal to, the specified pixel. If there are four pixels within the window having values less than a particular pixel, that pixel is the median of the window. If there are more than four pixels within the window having values less than the pixel, the pixel cannot be the median. When there are fewer than four pixels having values less than the particular pixel, that pixel can still be the median, depending on how many pixels within the window have gray values equal to the pixel. The conditions for a pixel being the median within the window can be summarized as follows:

    no. less   minimum no. equal   median
    0          4                   yes
    1          3                   yes
    2          2                   yes
    3          1                   yes
    4          0                   yes
    5          -                   no
    6          -                   no
    7          -                   no
    8          -                   no

Table 1 - Conditions for a pixel being the median within the window

A program for 3 X 3 median filtering using the IP6400 was written in the C programming
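The conditions of Table 1 reduce to two inequalities: the number of smaller pixels must not exceed four, and the number of smaller plus equal pixels must be at least four. The following sketch checks this for a single window; the function names are hypothetical and the code illustrates the counting rule only, not the IP6400 implementation.

```python
def is_median(n_less, n_equal, window_size=9):
    """Table 1 test: n_less and n_equal count the OTHER pixels in the window."""
    half = (window_size - 1) // 2  # 4 for a 3 X 3 window
    return n_less <= half and n_less + n_equal >= half


def median_by_counting(window):
    """Return the first pixel value of the window that passes the Table 1 test."""
    for value in window:
        n_less = sum(1 for v in window if v < value)
        n_equal = sum(1 for v in window if v == value) - 1  # exclude the pixel itself
        if is_median(n_less, n_equal, len(window)):
            return value


sample = [5, 1, 9, 3, 7, 3, 8, 2, 3]
result = median_by_counting(sample)
```

For the sample window the value 3 has two smaller and two equal neighbors, so it satisfies the row "2 / 2 / yes" of Table 1 and agrees with the fifth element of the sorted window.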

is placed in memory 2. Memory 1 is used for comparison, and hence contains the same image. Memory 3 is used to accumulate the conditions under which each pixel is a possible median of its 3 X 3 window. The algorithm consists of the following steps:

(1) copy the original image in memory 0 to memory 1.
(2) clear memory 2 and 3.
(3) do step 4 to step 8 for each window position (9 times).
(4) find the number of pixels within every 3 X 3 window that have values less than that specified by the window position. This takes eight comparisons. Accumulate this information in memory 3.
(5) mask out the impossible choices that cannot be medians.
(6) find the number of pixels within every 3 X 3 window that have values equal to that specified by the window position. This takes another eight comparisons. Accumulate this information with that of step 4 in memory 3.
(7) using the information in memory 3, determine which pixels specified by that window position are medians. The result is accumulated in memory 2.
(8) clear memory 3 so that it is ready to store the conditions for pixels specified by the next window position as possible medians.

The DVP must have the capability of comparison to carry out the above steps. Comparisons can be done through the two ALUs by performing one's complement subtraction in the test ALU, whose carry output bit in turn determines the swapping of the two operation codes of the operational ALU. Comparisons are done memory plane by memory plane. Consider the following window:

    w(0,0)  w(1,0)  w(2,0)
    w(0,1)  w(1,1)  w(2,1)
    w(0,2)  w(1,2)  w(2,2)

Comparing pixels specified by w(0,0) with the rest of the pixels specified by other window positions is done by comparing properly scrolled images, corresponding to their positions relative to w(0,0), with pixels specified by w(0,0). For example, pixels specified by w(0,0) are compared to those of w(1,0) by comparing the original image with the original image scrolled one pixel left, to those of w(1,1) with the image scrolled one pixel left and up, and so on. Since the upper left corner of the unscrolled image is positioned by scrx=0 and scry=1ff (hex), the corresponding upper left corner of the image scrolled one pixel left is then positioned by scrx=1ff (hex) and scry=1ff (hex) (see Figure 3). The scrolling positions for comparing window position w(k,l) with respect to w(i,j) are related as follows:

$$\mathrm{scrx} = (\mathrm{scrx}_0 + (i - k)) \mathbin{\&} \mathrm{mask}$$

$$\mathrm{scry} = (\mathrm{scry}_0 + (l - j)) \mathbin{\&} \mathrm{mask}$$

where scrx and scry are the values for scroll registers x and y (see Appendix A),

    scrx_0 = 0
    scry_0 = 1ff (hex)
    mask = 1ff (hex)

and & is the logical operator AND.

For the sake of programming, scrolling positions with respect to the center of the window (assuming the center position scrx=0, scry=1ff) are evaluated first. Then scrolling positions for comparisons with respect to each specified position can be evaluated in a loop without concern for the relative indices of the window.

The number of pixels having values less than that specified by a window position can be found through the operation of the two ALUs as follows:

(1) perform step 2 to step 3 for each window position within the window with respect to the specified window position (there are eight window positions with respect to the specified position, i.e. eight comparisons).
(2) perform one's complement subtraction in the test ALU (C-D-1) with the inputs of the original

only if C is greater than D, i.e. the compared pixel has a value less than that specified by the window position within its 3 X 3 neighborhood.
(3) set memory plane 3, which stores conditions of pixels that are possible medians, as the input to the operational ALU. Set up the operation codes such that the input is incremented if there is a carry bit from the test ALU; otherwise the input is passed unchanged. Accumulate the output in memory 3.

After eight comparisons for each window position, memory plane 3 holds the number of compared pixels that are less than the pixel at the specified window position. As discussed above, numbers ranging from 5 to 8 are impossible choices of medians. To mask out those impossible choices, the ALUs are employed again as follows:

(1) add 7b (hex) to memory 3. Memory 3 then consists of values ranging from 7b (hex) to 83 (hex), in which 80 (hex) to 83 (hex) are impossible choices.
(2) mask out 80 (hex) to 83 (hex). Note that adding 80 (hex) to a number greater than or equal to 80 (hex) would produce a carry bit. Therefore, add memory 3 to a constant (80 hex) in the test ALU. Set memory 3 as the input to the operational ALU. Pass the input unchanged if there is no carry bit from the test ALU. Otherwise, set the output to zero. Memory 3 then consists of zeroes and values from 7b to 7f corresponding to pixels in the image that are possible medians.

Next, the number of pixels having values equal to that specified by the window position is determined and accumulated in memory 3. This is done in the following steps:

(1) perform step 2 to step 3 for each window position within the window with respect to the specified window position.
(2) perform one's complement subtraction (C-D-1) in the test ALU with the inputs of the scrolled image (compared position) as C, and the unscrolled image (specified window position) as D. Thus, there is a carry out if and only if C is greater than D. Set REQ and RCC

code of the operational ALU is carried out if and only if C=D, i.e. the pixels compared are equal.
(3) set memory 3 as the input to the operational ALU. Increment the input of the operational ALU if the condition for the conditional operation code of the operational ALU is met, i.e. the compared pixels are equal. Otherwise, pass the input unchanged. Place the output of the operational ALU in memory 3.

Memory 3 then contains conditions for corresponding pixels of the specified window position that are medians. As mentioned above, a pixel is a median if it meets the conditions listed in Table 1. Through the above operations, impossible choices have been masked out, and thus a pixel specified by the window position is a median if the corresponding pixel in memory 3 has a value greater than or equal to 7f. Finally, medians can be accumulated in memory 2 for each specified window position as follows:

(1) set the operation code of the test ALU to addition (C+D), with memory 3 (scrolled according to window position, see explanation below) as input C, and a constant, 81 (hex), as input D. Hence, a carry bit occurs only if a pixel value in memory 3 is greater than or equal to 7f, in other words, the corresponding pixel in the original image is the median.
(2) set the inputs of the operational ALU with memory 2 (accumulated result) as A, and the original image (scrolled according to window position, see explanation below) as B. Pass B unchanged as output to memory 2 if there is a carry out from the test ALU. Otherwise, pass A unchanged.

Thus, pixels determined to be medians for each window position are accumulated in memory 2. If a pixel is determined to be the median, the center of the neighborhood should be replaced by this pixel. For a specified window position, say w(0,0), a pixel determined to be the median should be scrolled one pixel right and down, replacing the center pixel. That is why memory 3 and the original image mentioned above must be scrolled according to the specified window position as inputs to the ALUs. These scrolling positions are related to their window positions by the follow

For w(i,j):

$$\mathrm{scrx}[i] = (\mathrm{scrx}_0 + 1 - i) \mathbin{\&} \mathrm{mask}$$

$$\mathrm{scry}[j] = (\mathrm{scry}_0 - 1 + j) \mathbin{\&} \mathrm{mask}$$

where scrx[i] and scry[j] are the values of scroll registers x and y (to specify a scrolled position),

    scrx_0 = 0
    scry_0 = 1ff (hex)
    mask = 1ff (hex)

and & is the logical operator AND. The mask (1ff) is used for the same reason mentioned earlier.

The 3 X 3 median filtering takes altogether 182 DVP operations. It requires 1 DVP operation to preblank memory 2 and 3 and 1 DVP operation to copy the original image into memory 1. For each window position (there are nine) it requires 8 DVP operations to find the number of pixels having values less than a particular pixel for every pixel in the 3 X 3 window, another 8 DVP operations to find the number of pixels having equal values, 2 DVP operations to mask out impossible choices of medians, 1 DVP operation to accumulate medians, and 1 DVP operation to clear memory 3. There is a tremendous gain in speed when compared to median filtering done on a general purpose computer. It takes only 144 comparisons between image planes (up to 512x512) for the above operation. If done on a general purpose computer, 8 comparisons for each pixel of a 512x512 image (2097152 comparisons) are required. Since the operation does not involve the ITTs, there is no loss of accuracy such as the truncation inaccuracy mentioned in Sections 3.1 and 3.2.

3.4. n X m Median Filter

evaluation of the scrolling positions. For window position w(i,j):

$$\mathrm{scrx}[i] = (\mathrm{scrx}_0 + \lfloor m/2 \rfloor - i) \mathbin{\&} \mathrm{mask}$$

$$\mathrm{scry}[j] = (\mathrm{scry}_0 - \lfloor n/2 \rfloor + j) \mathbin{\&} \mathrm{mask}$$

where scrx[i] and scry[j] are the values of the scroll registers (to specify a scrolled position),

    scrx_0 = 0
    scry_0 = 1ff (hex)
    mask = 1ff (hex)

and & is the logical operator AND. The mask is used for the same reason mentioned earlier. Note that n and m must be odd numbers.

Modification also has to be made to the constant added to memory 3 (step 1 of masking out impossible choices in the previous section) so that the conditions for the following operations to determine the medians remain the same. Since the constant added for a 3 X 3 window is 7b (hex), i.e. 80 (hex) - 5 (hex), the constant added for an n X m window is then 80 (hex) - ((n X m) + 1)/2. In this way, a pixel of an image is the median of its n X m neighborhood if the corresponding pixel in memory 3, which holds conditions for possible medians, is greater than or equal to 7f (hex), the same condition as for 3 X 3 median filtering.

The whole operation takes 2nm(nm+1)+2 DVP operations. It requires 2 DVP operations for copying the original image to memory 1 and preblanking memory 2 and 3, and, for each window position (there are nm), 2(nm-1)+2 DVP operations for finding the conditions (number less, mask out choices, number equal) of possible medians, 1 DVP operation for accumulating the result, and 1 DVP operation for clearing memory 3 so that it is ready to store the conditions of the next window position. It takes 2nm(nm-1) comparisons to perform the filtering using the IP6400 and 512x512x(nm-1) comparisons
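The counting scheme of Sections 3.3 and 3.4 can be checked with a small software model. This is a sketch under stated assumptions rather than the IP6400 program: all planes are aligned to the window center before comparison (the report compares relative to the specified position and scrolls afterwards), scrolling wraps around, pixels are visited in a loop where the DVP would handle a whole plane per operation, and the helper names are hypothetical.

```python
def shifted(plane, dx, dy):
    """Plane whose pixel (r, c) is the input pixel at (r + dy, c + dx), with wrap-around."""
    h, w = len(plane), len(plane[0])
    return [[plane[(r + dy) % h][(c + dx) % w] for c in range(w)] for r in range(h)]


def mask_constant(n, m):
    """Constant added to memory 3 before masking: 80 hex - (nm + 1) / 2."""
    return 0x80 - (n * m + 1) // 2


def dvp_operations(n, m):
    """Total DVP operations for n X m median filtering: 2nm(nm + 1) + 2."""
    nm = n * m
    return 2 * nm * (nm + 1) + 2


def median_filter_sketch(image, n=3, m=3):
    h, w = len(image), len(image[0])
    k = mask_constant(n, m)
    offsets = [(dx, dy)
               for dy in range(-(n // 2), n // 2 + 1)
               for dx in range(-(m // 2), m // 2 + 1)]
    mem2 = [[0] * w for _ in range(h)]  # accumulated medians
    for cand in offsets:                # one pass per window position
        p = shifted(image, *cand)       # candidate pixels
        others = [shifted(image, *o) for o in offsets if o != cand]
        for r in range(h):
            for c in range(w):
                mem3 = sum(q[r][c] < p[r][c] for q in others)   # number less
                mem3 += k
                if mem3 + 0x80 > 0xFF:  # carry: too many smaller pixels
                    mem3 = 0
                mem3 += sum(q[r][c] == p[r][c] for q in others)  # number equal
                if mem3 + 0x81 > 0xFF:  # carry: mem3 >= 7f hex, candidate is the median
                    mem2[r][c] = p[r][c]
    return mem2


test_image = [[(7 * r + 13 * c) % 11 for c in range(5)] for r in range(4)]
filtered = median_filter_sketch(test_image)
```

For a 3 X 3 window, `mask_constant` returns 7b (hex) and `dvp_operations` returns 182, matching the figures given for the 3 X 3 median filter.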

4. Discussion

The advantage of using the IP6400 for window operations lies in the capability of processing up to one megabyte of image data in a single video frame time. Various image processing operations can be implemented through the IP6400 to achieve high speed. The Sobel operator (edge detector), for example, can be implemented using two 3 X 3 window operations and one frame addition, a total of 25 DVP operations, as follows (see Appendices B to G):

(1) place the original image in memory 0
(2) execute dwin3_3 -1 -2 -1 0 0 0 1 2 1 2 a c (detect vertical edge, result is placed in memory 1)
(3) execute dmove 1 3 (move result into memory 3)
(4) execute dwin3_3 -1 0 1 -2 0 2 -1 0 1 2 a c (detect horizontal edge, result is placed in memory 1)
(5) execute dadd 1 3 2 (add both horizontal and vertical edges and place result in memory 2)

The Laplacian operator for edge enhancement can be performed in 8 DVP operations. The tremendous gain in speed is the major advantage of using the IP6400 for various image processing operations.

5. References

[1] IP6400 Programmer's Manual, DeAnza Systems, Inc., September 1981.
[2] T. S. Huang, G. J. Yang, and G. Y. Tang, "A Fast Two-dimensional Median Filtering Algorithm," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-27, pp. 13-18, Feb. 1979.

Appendix A - IP6400 registers

CONTROL AND STATUS REGISTER

Mnemonic: ICSR
Virtual address: 117426 octal

    BIT   FUNCTION                              NAME
    0     Array processor GO bit                APGO
    1     Array processor interrupt status      APIS
    2     Peripheral interrupt status           PIS
    3     Even/Odd field status                 EOFS
    4     Vertical interval status              VIS
    5     Graphics overlay enable               GOE
    6     Interrupt enable                      INT
    7     Array processor ready                 APR
    8     Real time interrupt enable            RTIE
    9     CBC Delta field enable                DFE
    10    Phase locked oscillator status        PLOS
    11    Initialize                            INIT
    12    External I/O select                   EXT
    13    Alphanumeric overlay enable           ANOE
    14    CBC Packed data enable                PDE
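The synchronization rule of Section 3.1 (set APGO, do host-side work, then wait for APR before reloading registers) can be illustrated with the bit positions from the table above. The sketch uses a software stand-in for the register, so `FakeCSR`, `run_dvp_operation`, and the assumed behavior that APR reads as 0 while an operation is in progress are hypothetical and are not taken from the IP6400 documentation.

```python
APGO = 1 << 0  # array processor GO bit (bit 0)
APR = 1 << 7   # array processor ready bit (bit 7)


class FakeCSR:
    """Stand-in for the ICSR register; an operation finishes after a few polls."""

    def __init__(self, polls_until_ready=3):
        self.value = APR
        self._remaining = 0
        self._polls = polls_until_ready

    def write(self, value):
        self.value = value
        if value & APGO:            # starting an operation clears the ready bit
            self.value &= ~APR
            self._remaining = self._polls

    def read(self):
        if self._remaining > 0:
            self._remaining -= 1
            if self._remaining == 0:  # operation done: ready again, GO cleared
                self.value = (self.value | APR) & ~APGO
        return self.value


def run_dvp_operation(csr, prepare_next):
    """Start a DVP operation, overlap host work with it, then wait for APR."""
    csr.write(csr.read() | APGO)
    next_setup = prepare_next()     # host computes values for the next operation
    while not csr.read() & APR:     # registers must not be reloaded before this
        pass
    return next_setup


csr = FakeCSR()
setup = run_dvp_operation(csr, lambda: "scroll registers for next coefficient")
```

The point of the ordering is that the host calculation overlaps the frame time of the running operation, so only the final wait adds latency.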

MEMORY PORT CONTROL REGISTER
Mnemonic IMEMPC, Virtual Address 117424 octal

 15 14 13 12 | 11 10  9  8 |  7  6  5  4 |  3  2  1  0
   NOT USED  | I3 I2 I1 I0 | P3 P2 P1 P0 | V3 V2 V1 V0

I0-I3: Memory thru ITU to optional port
P0-P3: Memory thru ITU to array
V0-V3: Memory thru ITU to video
0 disables and 1 enables as described.

Bit    Label   Function
0-3    V0-V3   Enable data from the memory channels to pass through the ITU to video
               V0  Memory Channel 0 thru ITU to video
               V1  Memory Channel 1 thru ITU to video
               V2  Memory Channel 2 thru ITU to video
               V3  Memory Channel 3 thru ITU to video
4-7    P0-P3   Enable data from the memory channels to pass through the ITU to the Array Processor
               P0  Memory Channel 0 thru ITU to Array Processor
               P1  Memory Channel 1 thru ITU to Array Processor
               P2  Memory Channel 2 thru ITU to Array Processor
               P3  Memory Channel 3 thru ITU to Array Processor
8-11   I0-I3   Enable optional port
               I0  Memory Channel 0 thru ITU to Optional Port
               I1  Memory Channel 1 thru ITU to Optional Port
               I2  Memory Channel 2 thru ITU to Optional Port
               I3  Memory Channel 3 thru ITU to Optional Port
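The field layout of IMEMPC (video enables in bits 0-3, array processor enables in bits 4-7, optional port enables in bits 8-11) can be sketched as a small word builder. This is an illustrative Python sketch only; the function name mempc_word and its parameters are assumptions and do not correspond to any routine in the DeAnza software.

```python
# Field base positions follow the IMEMPC table: V0-V3 at bits 0-3,
# P0-P3 at bits 4-7, I0-I3 at bits 8-11. mempc_word is hypothetical.
def mempc_word(video=(), array=(), optional=()):
    """Build a 16-bit IMEMPC value that enables the given memory channels (0-3)."""
    word = 0
    for base, channels in ((0, video), (4, array), (8, optional)):
        for channel in channels:
            if not 0 <= channel <= 3:
                raise ValueError("memory channel must be 0..3")
            word |= 1 << (base + channel)
    return word


if __name__ == "__main__":
    # Channel 0 to video, channels 0 and 1 to the array processor.
    print(oct(mempc_word(video=[0], array=[0, 1])))
```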

Image Scroll and Zoom Registers - 12 Bits

MNEMONIC          VIRTUAL ADDRESS          CHANNEL
ISCRX0, ISCRY0    117460, 117462 octal     0
ISCRX1, ISCRY1    117464, 117466 octal     1
ISCRX2, ISCRY2    117470, 117472 octal     2
ISCRX3, ISCRY3    117474, 117476 octal     3
ISCRX4, ISCRY4    117514, 117516 octal     4 (overlay)

X Scroll and Zoom Register

 15 14 13 12 | 11 | 10 |  9 |  8  7  6  5  4  3  2  1  0
   NOT USED  | BL | BE | SE | E8 E7 E6 E5 E4 E3 E2 E1 E0

E0-E8: Starting Element (Horizontal)
SE:    Element Sign Bit
BE:    Scroll with Black Enable Horizontally
BL:    Scroll with Black Enable Vertically

BIT   LABEL   FUNCTION
0-8   E0-E8   Specify starting element in horizontal direction
9     SE      Element sign bit
              0  Data zeroed after horizontal wrap around if BE is 1
              1  Data zeroed before horizontal wrap around if BE is 1
10    BE      Scroll with Black Enable Horizontally - Element
              0  No data is zeroed
              1  Enable data to be zeroed with horizontal wrap around

Y Scroll and Zoom Register

 15 14 13 | 12 | 11 10 |  9 |  8  7  6  5  4  3  2  1  0
 NOT USED | DD | Z1 Z0 | SL | L8 L7 L6 L5 L4 L3 L2 L1 L0

L0-L8: Starting Line (Vertical)
SL:    Line Sign Bit
Z0-Z1: Zoom
DD:    Disable writing to image memory by the DVP

BIT     LABEL   FUNCTION
0-8     L0-L8   Specify starting line in vertical direction (777 to 0 octal is top to bottom)
9       SL      Line sign bit
                0  Data zeroed after vertical wrap around if BL is 1 (top part of video display is black)
                1  Data zeroed before vertical wrap around if BL is 1 (bottom part of video display is black)
10,11   Z0-Z1   Zoom control bits
                Z1 Z0
                0  0   1 to 1 (No zoom)
                0  1   2 to 1 Zoom
                1  0   4 to 1 Zoom
                1  1   8 to 1 Zoom
12      DD      Disable DVP writing to image memory channel
                0  Enable writing to memory channel by DVP
                1  Disable

DESTINATION REGISTERS

Mnemonic   Virtual Address   Function
IDESTX     117420            Destination X - element
IDESTY     117422            Destination Y - line

Destination X Register

Bits 12-15 and 0-3 are not used.

Bit     Label   Function
4-8     X4-X8   Starting Element Position in image memory for data from the DVP. Must be multiples of 16.
10,11   S0,S1   Direction and number of bits to shift the output of the digitizer's A/D converter before input to the array processor.

                                                        Bits zeroed on input to the array processor
S1 S0   Function                                        6-bit Digitizer    8-bit Digitizer
0  0    Normal - no shift. Accept output from           2 MSB's            None
        the A/D converter.
0  1    Shift right 1 bit - divide A/D converter        3 MSB's            MSB
        output by 2
1  0    Shift left 1 bit - multiply A/D converter       MSB, LSB           LSB
        output by 2

Destination Y Register

 15 14 13 12 11 10  9 |  8  7  6  5  4  3  2  1  0
       NOT USED       | Y8 Y7 Y6 Y5 Y4 Y3 Y2 Y1 Y0

Bit   Label   Function
0-8   Y0-Y8   Starting Line Position in image memory for data from the DVP

DVP Bit Plane Mask Registers

Mnemonic   Virtual Address   Function
IBPMA      117440            Bit plane mask A
IBPMB      117442            Bit plane mask B

  15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
 M15 M14 M13 M12 M11 M10  M9  M8  M7  M6  M5  M4  M3  M2  M1  M0

Bit plane mask bits:
0 - Do not write to bit plane (Disable)
1 - Write to bit plane (Enable)

Bit Plane Mask Bits   Label    Register   Memory channel
0-7                   M0-M7    A          0
8-15                  M8-M15   A          1
0-7                   M0-M7    B          2
8-15                  M8-M15   B          3

Constants Register
MNEMONIC - ICONST    VIRTUAL ADDRESS 117444

 15 14 13 12 11 10  9  8 |  7  6  5  4  3  2  1  0
 K7 K6 K5 K4 K3 K2 K1 K0 | C7 C6 C5 C4 C3 C2 C1 C0
 K1 Constant - OP ALU    | K2 Constant - Test ALU

Bit    Label   Function
0-7    C0-C7   K2 Constant - alternate input D of Test ALU
8-15   K0-K7   K1 Constant - alternate input B of Operational ALU

Constant value range is 0 to 377 octal. Arithmetic is performed as 1's complement. All values are positive and considered as intensities.

DVP Input Data Path
MNEMONIC - INPDP    VIRTUAL ADDRESS 117446 OCTAL

[Bit layout diagram, bits 15 to 0. Legible field labels: A, B, C and D input memory channel select; auxiliary MS input; camera A/D LS input; constant inputs; Test ALU output select for D input; force LS to 0; A, B, C and D input control.]

INPUT   ALU               ALU OUTPUT
A       Operational ALU   MS Byte of SHIFTER
B       Operational ALU   MS Byte of SHIFTER
C       Test ALU          LS Byte of SHIFTER
D       Test ALU          LS Byte of SHIFTER

Output Data Paths and Shift Control Register
MNEMONIC - IODPSH    VIRTUAL ADDRESS 117454 OCTAL

[Bit layout diagram, bits 15 to 0. Legible field labels: number of bits and direction of shift; byte/word mode; rotate (end around) or shift (zero fill); overflow (left) - all bits set to 1; output data path, which selects for each memory channel 0 - lower byte of shifter or 1 - upper byte of shifter; shift control bits.]

Upper Byte - MS byte, output of Operational ALU
Lower Byte - LS byte, output of Test ALU (or B1 selector)

Bits   Label   Function
0-2    S0-S2   Shift Code - specifies the direction and number of bits of the shift

Test ALU Operational Code Register
MNEMONIC - ITSTOP    VIRTUAL ADDRESS 117450 OCTAL

[Bit layout diagram, bits 15 to 0. Legible field labels: condition code for counter to be incremented; select one of 16 operations; complement of carry (0 - Carry, 1 - No Carry); select arithmetic or logical operation; force S0-S3 to 0 if LSB of C input is 0; determine one of 48 operations to perform; force bit 0 of data in shifter to 0; inputs A = B condition code; complement carry condition code; write-enable region (anywhere, cursor defined, or overlay defined); 2 bit condition code field for OP ALU code.]

Bit   Label   Function
0-3   S0-S3   Test ALU operational codes. See Table

Test Operational Code Table

ARITHMETIC FUNCTIONS

            Operational Code     Output Based on Complement Carry
INPUTS   M  CC  S3 S2 S1 S0      CC = 0                             CC = 1
C, D     0  X   0  0  0  0       C + 1                              C
C, D     0  X   0  0  0  1       (C or D) + 1                       (C or D)
C, D     0  X   0  0  1  0       (C or not D) + 1                   (C or not D)
C, D     0  X   0  0  1  1       0                                  377 octal
C, D     0  X   0  1  0  0       C + (C and not D) + 1              C + (C and not D)
C, D     0  X   0  1  0  1       (C or D) + (C and not D) + 1       (C or D) + (C and not D)
C, D     0  X   0  1  1  0       C - D                              C - D - 1
C, D     0  X   0  1  1  1       (C and not D)                      (C and not D) - 1
C, D     0  X   1  0  0  0       C + (C and D) + 1                  C + (C and D)
C, D     0  X   1  0  0  1       C + D + 1                          C + D
C, D     0  X   1  0  1  0       (C or not D) + (C and D) + 1       (C or not D) + (C and D)
C, D     0  X   1  0  1  1       (C and D)                          (C and D) - 1
C, D     0  X   1  1  0  0       C + C + 1                          C + C
C, D     0  X   1  1  0  1       (C or D) + C + 1                   (C or D) + C
C, D     0  X   1  1  1  0       (C or not D) + C + 1               (C or not D) + C
C, D     0  X   1  1  1  1       C                                  C - 1

LOGICAL FUNCTIONS

            Operational Code     Output (CC = DON'T CARE)
INPUTS   M  CC  S3 S2 S1 S0
C, D     1  X   0  0  0  0       not C
C, D     1  X   0  0  0  1       not (C or D)
C, D     1  X   0  0  1  0       (not C) and D
C, D     1  X   0  0  1  1       0
C, D     1  X   0  1  0  0       not (C and D)
C, D     1  X   0  1  0  1       not D
C, D     1  X   0  1  1  0       C exclusive or D
C, D     1  X   0  1  1  1       C and (not D)
C, D     1  X   1  0  0  0       (not C) or D
C, D     1  X   1  0  0  1       not (C exclusive or D)
C, D     1  X   1  0  1  0       D
C, D     1  X   1  0  1  1       C and D
C, D     1  X   1  1  0  0       377 octal, all bits set
C, D     1  X   1  1  0  1       C or (not D)
C, D     1  X   1  1  1  0       C or D
C, D     1  X   1  1  1  1       C

Operational Code Register
MNEMONIC - IOPOP    VIRTUAL ADDRESS 117452

[Bit layout diagram, bits 15 to 0. The register holds two sets of OP ALU codes: the default OP ALU codes, and the conditional OP ALU codes used if the Test ALU codes match. Each set has the same fields: S0-S3 operation code; complement of carry (0 - Force carry, 1 - No carry); mode (0 - Arithmetic, 1 - Logical); enable writing to memories.]

Bits       Label   Function
0-3,8-11   S0-S3   Operational ALU operation codes
4,12       CC      Complement of Force Carry
                   0 - Force Carry
                   1 - No Carry

Operational Code Table

ARITHMETIC FUNCTIONS

            Operational Code     Output Based on Complement Carry
INPUTS   M  CC  S3 S2 S1 S0      CC = 0                             CC = 1
A, B     0  X   0  0  0  0       A + 1                              A
A, B     0  X   0  0  0  1       (A or B) + 1                       (A or B)
A, B     0  X   0  0  1  0       (A or not B) + 1                   (A or not B)
A, B     0  X   0  0  1  1       0                                  377 octal
A, B     0  X   0  1  0  0       A + (A and not B) + 1              A + (A and not B)
A, B     0  X   0  1  0  1       (A or B) + (A and not B) + 1       (A or B) + (A and not B)
A, B     0  X   0  1  1  0       A - B                              A - B - 1
A, B     0  X   0  1  1  1       (A and not B)                      (A and not B) - 1
A, B     0  X   1  0  0  0       A + (A and B) + 1                  A + (A and B)
A, B     0  X   1  0  0  1       A + B + 1                          A + B
A, B     0  X   1  0  1  0       (A or not B) + (A and B) + 1       (A or not B) + (A and B)
A, B     0  X   1  0  1  1       (A and B)                          (A and B) - 1
A, B     0  X   1  1  0  0       A + A + 1                          A + A
A, B     0  X   1  1  0  1       (A or B) + A + 1                   (A or B) + A
A, B     0  X   1  1  1  0       (A or not B) + A + 1               (A or not B) + A
A, B     0  X   1  1  1  1       A                                  A - 1

LOGICAL FUNCTIONS

            Operational Code     Output (CC = DON'T CARE)
INPUTS   M  CC  S3 S2 S1 S0
A, B     1  X   0  0  0  0       not A
A, B     1  X   0  0  0  1       not (A or B)
A, B     1  X   0  0  1  0       (not A) and B
A, B     1  X   0  0  1  1       0
A, B     1  X   0  1  0  0       not (A and B)
A, B     1  X   0  1  0  1       not B
A, B     1  X   0  1  1  0       A exclusive or B
A, B     1  X   0  1  1  1       A and (not B)
A, B     1  X   1  0  0  0       (not A) or B
A, B     1  X   1  0  0  1       not (A exclusive or B)
A, B     1  X   1  0  1  0       B
A, B     1  X   1  0  1  1       A and B
A, B     1  X   1  1  0  0       377 octal, all bits set
A, B     1  X   1  1  0  1       A or (not B)
A, B     1  X   1  1  1  0       A or B
A, B     1  X   1  1  1  1       A

Appendix B - 3 X 3 linear window operation

Appendix (B)            UNIX Programmer's Manual            Appendix (B)

NAME
     dwin3_3 - 3X3 window operation

SYNOPSIS
     (1) dwin3_3 [w1] [w2] [w3] [w4] [w5] [w6] [w7] [w8] [w9] [s] [a] [c]
     (2) dwin3_3 [f] datafile

DESCRIPTION
     Three memory planes are used for this operation. The image to be processed should be in memory plane 0. The result is located in memory plane 1. Memory plane 2 is used for the overflow indicators. The parameters include the nine window coefficients, a scaling factor, an underflow complement option, and a memory 1 & 2 preblank option, described as follows:

     w1 to w9  window coefficients, arranged with w1 to w3 as the first row of the window, w4 to w6 as the second row, and w7 to w9 as the third row. All coefficients must be numerical values.

     s         scaling factor. It is a numerical value, usually selected to prevent overflow of the lookup table (ITT) of the DeAnza. As a rule of thumb, when the absolute values of some coefficients are larger than one, the absolute value of the largest coefficient should be the scaling factor. The scaling factor scales down the coefficients for arithmetic manipulation in the DVP of the DeAnza, and then scales up the result. Hence, the scaling factor does not change the effective coefficients of the overall window operation. Yet, it is necessary to prevent overflow of the ITT of the DeAnza during arithmetic manipulation.

     a         This option complements, instead of zeroes, underflow. The default is to zero underflow.

     c         This option preblanks memory 1 and memory 2. The default is not to preblank the memory planes.

     Parameters can be input from a datafile by typing f before the datafile.

AUTHOR
     N. Ansari (7/7/83)

FILES
     /dev/deanza
     /usr/include/sys/iz.h

SEE ALSO
     IP 6400 Programmer's Manual, DeAnza Systems, Inc.
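The arithmetic that dwin3_3 is described as performing can be sketched in software. The following Python sketch is not the DVP implementation. It makes several assumptions that the manual page does not state: coefficient w1 is applied to the upper-left neighbor, border pixels are left at 0, results above 255 are clipped, and the a option is modelled as taking the absolute value of a negative sum. The scaling factor s is omitted because, according to the description above, it is scaled out again and does not change the effective coefficients.

```python
def window3x3(image, w, complement_underflow=False):
    """Apply a 3x3 window to an 8-bit image given as a list of rows.

    w holds the nine coefficients w1..w9 in row order.
    Assumptions (not from the manual page): borders stay 0, overflow clips
    to 255, and complement_underflow is modelled as an absolute value.
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            acc = 0
            for i in range(3):
                for j in range(3):
                    acc += w[3 * i + j] * image[r + i - 1][c + j - 1]
            if acc < 0:
                # Default zeroes underflow; the "a" option complements it.
                acc = -acc if complement_underflow else 0
            out[r][c] = min(acc, 255)
    return out


if __name__ == "__main__":
    flat = [[10] * 3 for _ in range(3)]
    print(window3x3(flat, [1] * 9)[1][1])
    print(window3x3(flat, [-1] * 9, complement_underflow=True)[1][1])
```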

Appendix C - n X m linear window operation

Appendix (C)            UNIX Programmer's Manual            Appendix (C)

NAME
     dvwin - n x m window operation

SYNOPSIS
     (1) dvwin [n] [m] [w[11]] ... [w[nm]] [s] [a] [c]
     (2) dvwin [f] datafile

DESCRIPTION
     Three memory planes are used for this operation. The image to be processed should be in memory plane 0. The result is located in memory plane 1. Memory plane 2 is used for the overflow indicators. The parameters include the size of the window, the window coefficients, a scaling factor, an underflow complement option, and a memory 1 & 2 preblank option, described as follows:

     n         width, number of rows of the window.

     m         length, number of columns of the window.

     w[11] to w[nm]
               window coefficients, arranged row by row. All coefficients must be numerical values.

     s         scaling factor. It is a numerical value, usually selected to prevent overflow of the lookup table (ITT) of the DeAnza. As a rule of thumb, when the absolute values of some coefficients are larger than one, the absolute value of the largest coefficient should be the scaling factor. The scaling factor scales down the coefficients for arithmetic manipulation in the DVP of the DeAnza, and then scales up the result. Hence, the scaling factor does not change the effective coefficients of the overall window operation. Yet, it is necessary to prevent overflow of the ITT of the DeAnza during arithmetic manipulation.

     a         This option complements, instead of zeroes, underflow. The default is to zero underflow.

     c         This option preblanks memory 1 and memory 2. The default is not to preblank the memory planes.

     Parameters can be input from a datafile by typing f before the datafile.

AUTHOR
     N. Ansari (7/14/83)

FILES
     /dev/deanza
     /usr/include/sys/iz.h

SEE ALSO
     IP 6400 Programmer's Manual, DeAnza Systems, Inc.

Appendix D - n X m median filtering

Appendix (D)            UNIX Programmer's Manual            Appendix (D)

NAME
     dvmedian - median filtering

SYNOPSIS
     dvmedian [n] [m]

DESCRIPTION
     dvmedian performs two-dimensional median filtering using the DeAnza IP6400. The default window size is 3X3. The maximum window size is 25X25. Four memory planes are used for this operation. The image to be processed should be in memory plane 0. The result is located in memory plane 2. Memory planes 1 and 3 are used for intermediate processes. The only parameters specify the window size.

     n         width, number of rows of the window.

     m         length, number of columns of the window.

AUTHOR
     N. Ansari (7/23/83)

FILES
     /dev/deanza
     /usr/include/sys/iz.h

SEE ALSO
     IP 6400 Programmer's Manual, DeAnza Systems, Inc.
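For reference, the result that an n x m median filter is expected to produce can be sketched with a brute-force software version. This Python sketch sorts each window directly. It is neither the DVP algorithm used by dvmedian nor the fast histogram method of reference [2]. It assumes odd n and m and copies border pixels unchanged, neither of which is stated in the manual page.

```python
def median_filter(image, n=3, m=3):
    """Brute-force n x m median filter on an image given as a list of rows.

    Assumptions (not from the manual page): n and m are odd, and pixels
    whose window would extend past the image edge are copied unchanged.
    """
    rows, cols = len(image), len(image[0])
    half_n, half_m = n // 2, m // 2
    out = [row[:] for row in image]
    for r in range(half_n, rows - half_n):
        for c in range(half_m, cols - half_m):
            window = sorted(
                image[r + i][c + j]
                for i in range(-half_n, half_n + 1)
                for j in range(-half_m, half_m + 1)
            )
            out[r][c] = window[len(window) // 2]
    return out


if __name__ == "__main__":
    # A single bright impulse in a flat region is removed by the filter.
    noisy = [[10, 10, 10], [10, 255, 10], [10, 10, 10]]
    print(median_filter(noisy)[1][1])
```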

Appendix E - denhance

Appendix (E)            UNIX Programmer's Manual            Appendix (E)

NAME
     denhance - deblur a picture

SYNOPSIS
     denhance

DESCRIPTION
     denhance is a DeAnza version of unsharp masking. It uses a Laplacian operator to deblur a picture. It is a 3x3 window operation with a weight of 5 at the center, -1 at the nearest neighbors, and 0 at the other neighbors. The picture to be processed should be placed in memory plane 0, and the result is placed in memory plane 1.

DIAGNOSTICS
     None

AUTHOR
     N. Ansari (7/7/83)

FILES
     /dev/deanza
     /usr/include/sys/iz.h

SEE ALSO
     IP 6400 Programmer's Manual, DeAnza Systems, Inc.
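The window described for denhance can be written out explicitly. The Python sketch below applies it to a single 3x3 neighborhood; the weights come from the description above, while the function name and the clamping of the result to 0..255 are assumptions made for illustration.

```python
# Weights from the denhance description: 5 at the center, -1 at the four
# nearest neighbors, 0 at the diagonal neighbors.
KERNEL = [0, -1, 0,
          -1, 5, -1,
          0, -1, 0]


def enhance_pixel(neighborhood):
    """Return the enhanced center pixel of a 3x3 neighborhood (list of 3 rows).

    Clamping to 0..255 is an assumption, not taken from the manual page.
    """
    flat = [p for row in neighborhood for p in row]
    acc = sum(k * p for k, p in zip(KERNEL, flat))
    return max(0, min(255, acc))


if __name__ == "__main__":
    print(enhance_pixel([[7, 7, 7], [7, 7, 7], [7, 7, 7]]))
    print(enhance_pixel([[90, 90, 90], [90, 100, 90], [90, 90, 90]]))
```

Because the weights sum to 1, a flat region passes through unchanged, while a pixel that differs from its four nearest neighbors is pushed further away from them.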

Appendix F - dadd

Appendix (F)            UNIX Programmer's Manual            Appendix (F)

NAME
     dadd - adds two images from DeAnza.

SYNOPSIS
     dadd chan1 chan2 chan3

DESCRIPTION
     dadd adds the images stored in chan1 and chan2 together and stores the result in chan3. chan1, chan2, and chan3 can be any of the four channels: 0, 1, 2 or 3.

DIAGNOSTICS
     Bad args are ignored.

AUTHOR
     Nancy Cam (3/08/83)

FILES
     /dev/deanza

BUGS
     No known bugs.

Appendix G - dmove

Appendix (G)            UNIX Programmer's Manual            Appendix (G)

NAME
     dmove

SYNOPSIS
     dmove [chanX] [chanY]

DESCRIPTION
     This routine allows the user to move an image from chanX to chanY, where chanX and chanY can be from 0 to 3.

DIAGNOSTICS
     Synopsis will be given upon detecting illegal parameters.

AUTHOR
     N. Ansari (7/7/83)

FILES
     /dev/deanza
     /usr/include/sys/iz.h

SEE ALSO
     IP 6400 Programmer's Manual, DeAnza Systems, Inc.
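The commands documented in Appendices B, F and G are the ones chained together in the Sobel sequence of the Discussion. As a rough software emulation of that five-step sequence, the Python sketch below keeps the four memory planes in a dictionary. It is only a sketch under stated assumptions: the a option is modelled as an absolute value, overflow and the frame addition are assumed to clip at 255, border pixels are left at 0, and the scaling factor 2 is omitted because it does not change the effective coefficients. None of the function names belong to the DeAnza software.

```python
def window3x3(image, w, complement_underflow=False):
    """3x3 window on a list-of-rows image (borders stay 0, clip at 255)."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            acc = sum(
                w[3 * i + j] * image[r + i - 1][c + j - 1]
                for i in range(3)
                for j in range(3)
            )
            if acc < 0:
                # Assumed model of the "a" option: complement as absolute value.
                acc = -acc if complement_underflow else 0
            out[r][c] = min(acc, 255)
    return out


def sobel(image):
    """Emulate steps (1) to (5) of the Sobel sequence with planes 0..3."""
    mem = {0: image}                                                   # (1)
    mem[1] = window3x3(mem[0], [-1, -2, -1, 0, 0, 0, 1, 2, 1], True)   # (2)
    mem[3] = [row[:] for row in mem[1]]                                # (3) dmove 1 3
    mem[1] = window3x3(mem[0], [-1, 0, 1, -2, 0, 2, -1, 0, 1], True)   # (4)
    mem[2] = [                                                         # (5) dadd 1 3 2
        [min(a + b, 255) for a, b in zip(row1, row3)]
        for row1, row3 in zip(mem[1], mem[3])
    ]
    return mem[2]


if __name__ == "__main__":
    step = [[0, 0, 10], [0, 0, 10], [0, 0, 10]]
    print(sobel(step)[1][1])
```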