Mapping Irregular Algorithms in a Custom Computing Image Processing Framework

F. Planque, I. Kraljic, Y. Savaria
MiroTech Microsystems Inc.
395 Ste-Croix #202, St-Laurent, Qc H4N 2L3 Canada
Tel: (514) 744-6476
Fax: (514) 744-6018
Email: i.kraljic@mirotech.com

-------------------
1. Introduction
-------------------

A real-time image processing framework based on reconfigurable computing and the hardwired dataflow paradigm was proposed in [Belanger99]. Operations are defined at the level of basic image processing primitives (convolution, noise filtering, etc.). Each operation is implemented as one physical operator that can be mapped in reconfigurable logic. All operators are encapsulated and present uniform interfaces for data and control. Cascading operators for parallel execution is straightforward and does not increase execution time. A hardware core library containing more than 50 basic operators (n x n convolutions, median, histogram, noise filters, FIRs, FFTs, morphology, etc.) has been developed.

This paper aims to show how complex irregular algorithms can be adapted for implementation on the dataflow framework. The mapping of connected component labeling and image warping will be presented.

------------------------------------
2. Connected component labeling
------------------------------------

Connected component labeling is an essential operation for blob analysis and target recognition applications. A classical two-pass algorithm [Haralick92] has been selected for implementation in the reconfigurable dataflow framework. Because of its complex control and memory-intensive architecture, the labeler is a good candidate to demonstrate the mapping of irregular algorithms. The final paper will present the labeler's architecture in detail, including a discussion of the trade-offs that were made to achieve real-time performance.

Availability of content-addressable memories (CAMs) greatly simplifies implementation of connected component labeling.
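
For readers unfamiliar with the two-pass scheme named earlier in this section, the following is a minimal software sketch of one common formulation. It is not the labeler's hardware architecture. It assumes a binary image stored as a list of rows, 4-connectivity, and an equivalence table kept as a plain array; the function name label_components and all variable names are hypothetical and chosen for illustration only.

```python
def label_components(image):
    """Two-pass connected component labeling (illustrative sketch).

    Assumptions (not from the paper): `image` is a list of rows of 0/1
    pixels, neighbours are 4-connected, and labels start at 1.
    Returns (labels, number_of_components).
    """
    height = len(image)
    width = len(image[0]) if height else 0
    labels = [[0] * width for _ in range(height)]
    # Equivalence table: parent[i] points towards the representative of
    # temporary label i. Index 0 is reserved for the background.
    parent = [0]

    def find(label):
        # Follow the equivalence table to the representative label.
        while parent[label] != label:
            parent[label] = parent[parent[label]]  # shorten the chain
            label = parent[label]
        return label

    # Pass 1: raster scan, assign temporary labels, record equivalences.
    for y in range(height):
        for x in range(width):
            if not image[y][x]:
                continue
            up = labels[y - 1][x] if y > 0 else 0
            left = labels[y][x - 1] if x > 0 else 0
            if up == 0 and left == 0:
                parent.append(len(parent))       # new temporary label
                labels[y][x] = len(parent) - 1
            elif up and left:
                root_up, root_left = find(up), find(left)
                smallest = min(root_up, root_left)
                parent[max(root_up, root_left)] = smallest  # merge classes
                labels[y][x] = smallest
            else:
                labels[y][x] = up or left

    # Pass 2: replace each temporary label by a compact final label.
    final = {}
    for y in range(height):
        for x in range(width):
            if labels[y][x]:
                root = find(labels[y][x])
                labels[y][x] = final.setdefault(root, len(final) + 1)
    return labels, len(final)
```

In this sketch, a U-shaped blob first receives two temporary labels, one per arm; they are recorded as equivalent when the scan reaches the bottom row and are merged into a single final label in the second pass. The lookups in the equivalence table are where the irregular, data-dependent memory accesses occur.
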
State-of-the-art Xilinx FPGAs support CAMs; however, their size is small (a 4 kbit block RAM can implement a 16x8 CAM) [Brelet00]. Standard memories were thus used in the labeler.

Internally, the hardwired dataflow model is broken to allow efficient processing; at the operator level, however, the model is restored. Hence, the labeler is fully compatible with other framework cores, as it presents the same uniform interfaces.

A first labeler has been implemented and validated in real time. It is limited to 512 temporary labels and 254 final labels. Upgrading to higher label counts is straightforward. The core uses half of a Virtex xcv300 and can process 50 Mpixels/second. Several add-on cores for the labeler have been developed: area, bounding box and center of gravity.

------------------------------------
3. Image warping
------------------------------------

Image warping is another algorithm whose implementation in a hardwired dataflow is not trivial. The "inverse mapping" algorithm [Wolberg90] is better adapted to a dataflow execution mode than the forward mapping algorithm. The paper will present a generic nearest-neighbor resampling warper currently in the final stage of completion. Linear and bicubic resampling will be discussed. As with the labeler, the image warper is fully framework-compatible.

-------------
References
-------------

[Belanger99] N. Belanger, I. Kraljic, Y. Savaria. A Reconfigurable-Computing Real-Time Image Processing Framework. In Third Annual Workshop on High Performance Embedded Computing HPEC'99.

[Brelet00] J.-L. Brelet. Using Block RAM for High Performance Read/Write CAMs. Xilinx App. Note XAPP204 (v1.2), May 2, 2000.

[Haralick92] R. M. Haralick, L. G. Shapiro. Computer and Robot Vision, Vol. I. Addison-Wesley, 1992.

[Wolberg90] G. Wolberg. Digital Image Warping. IEEE Computer Society Press, 1990.