## "Emulated Digital CNN-UM Implementation of a 3-dimensional Ocean Model on FPGAs"

Zoltán Nagy and Péter Szolgay

University of Veszprém

Abstract: A Cellular Neural Network (CNN) is a non-linear dynamic processor array. Its extended version, the CNN Universal Machine (CNN-UM), was invented in 1993 [1]. The CNN paradigm is a natural framework for describing the behavior of locally interconnected dynamical systems that have an array structure, so it is straightforward to use a CNN to compute the solution of partial differential equations (PDEs). Several studies have demonstrated the effectiveness of the CNN-UM solution of different PDEs [2], [3]. However, these results cannot be used in real-life implementations because of the limitations of analog CNN-UM chips, such as low precision and limited support for non-linear templates. Emulated digital CNN-UM architectures, such as the Falcon emulated digital CNN-UM processor array [4], appear to be more flexible than their analog counterparts in both cell array size and accuracy, while their computing power is only slightly smaller. Implementing these architectures on reconfigurable chips makes it possible to change the cell model and evaluate the new architecture in a very short time.
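To illustrate why an array of locally connected cells maps naturally onto a PDE, the following minimal sketch emulates a simplified, linear cell array whose 3x3 template is a discrete Laplacian, integrated with the forward Euler method to approximate the heat equation. It is an illustration under stated assumptions, not the architecture from [4]: the function names, the fixed boundary cells, the diffusion coefficient `c`, and the time step `dt` are all hypothetical, and the output non-linearity of a full CNN cell is omitted.

```python
# Hypothetical illustration: a linear, CNN-style cell array whose 3x3
# template is a discrete Laplacian, advanced with explicit Euler steps.

def laplacian_template(c):
    """Return a 3x3 template approximating c * Laplacian on a unit grid."""
    return [[0.0, c, 0.0],
            [c, -4.0 * c, c],
            [0.0, c, 0.0]]


def euler_step(state, template, dt):
    """One forward-Euler step: x <- x + dt * sum(template * neighbourhood).

    Boundary cells are held fixed (an assumed Dirichlet boundary condition).
    The input grid is not modified; a new grid is returned.
    """
    rows, cols = len(state), len(state[0])
    new = [row[:] for row in state]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            acc = 0.0
            # Each cell only reads its 3x3 neighbourhood (local coupling).
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    acc += template[di + 1][dj + 1] * state[i + di][j + dj]
            new[i][j] = state[i][j] + dt * acc
    return new


def simulate(state, c=1.0, dt=0.1, steps=100):
    """Run several Euler steps; dt * c <= 0.25 keeps this scheme stable."""
    template = laplacian_template(c)
    for _ in range(steps):
        state = euler_step(state, template, dt)
    return state


# Usage: a single hot cell diffuses towards its neighbours.
grid = [[0.0] * 7 for _ in range(7)]
grid[3][3] = 1.0
result = simulate(grid, steps=10)
```

Because every cell update depends only on its immediate neighbours, all cells can in principle be updated in parallel, which is the property that array processors exploit.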

Simulation of compressible and incompressible fluids is one of the most exciting areas of the solution of PDEs because these equations appear in many important applications in aerodynamics, meteorology and oceanography. In general, ocean models describe the response of the variable-density ocean to atmospheric momentum and heat forcing. In the simplest barotropic ocean model, a region of the ocean’s water column is vertically integrated to obtain a single value for the horizontal currents, which in reality vary with depth. More accurate models use several horizontal layers to describe the motion in the deeper regions of the ocean. The governing equations of the ocean model are derived from the Navier-Stokes equations of incompressible fluids. A CNN-UM solution of the Navier-Stokes equations was described in [2]. However, the non-linearity of the state equations makes it impossible to utilize the huge computing power of current analog CNN-UM chips.

Previous results show that configurable emulated digital architectures can be used very efficiently in the computation of the CNN dynamics [4] and in the solution of simple PDEs [5]. The Falcon emulated digital CNN-UM architecture can be modified to handle the non-linear templates required in the computation of the advection terms of the barotropic ocean model [6]. This architecture was implemented using Handel-C on a mid-sized FPGA with one million equivalent system gates on our RC200 prototyping board from Celoxica [7]. The performance of this solution is limited by the speed of the memories and the width of the memory bus. Even this restricted solution is 60 times faster than a Pentium IV 3 GHz processor. If a larger FPGA and a wider memory bus are used, a 1700-fold performance increase can be achieved.
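Since performance here is bounded by memory speed and bus width, it may help to see how a 3x3 neighbourhood computation can be organized so that each cell value is fetched from external memory only once. The sketch below is a software analogy, not the design described in [6]: it streams the grid row by row and keeps only three rows in a small buffer. The names `CountingMemory` and `stream_stencil` are hypothetical, introduced purely for illustration.

```python
from collections import deque


class CountingMemory:
    """Hypothetical stand-in for external memory that counts cell reads."""

    def __init__(self, grid):
        self.grid = grid
        self.reads = 0

    def rows(self):
        """Yield rows one at a time, counting every cell fetched."""
        for row in self.grid:
            self.reads += len(row)
            yield row


def stream_stencil(rows_iter, width, kernel):
    """Apply a 3x3 kernel while buffering only three rows at a time.

    Yields one output row (interior columns only) for each complete
    three-row window, so every input row is consumed exactly once.
    """
    window = deque(maxlen=3)  # plays the role of on-chip row buffers
    for row in rows_iter:
        window.append(row)
        if len(window) == 3:
            top, mid, bot = window
            out = []
            for j in range(1, width - 1):
                acc = 0.0
                for dj in (-1, 0, 1):
                    acc += (kernel[0][dj + 1] * top[j + dj]
                            + kernel[1][dj + 1] * mid[j + dj]
                            + kernel[2][dj + 1] * bot[j + dj])
                out.append(acc)
            yield out


# Usage: sum each 3x3 neighbourhood of a small grid of ones.
memory = CountingMemory([[1.0] * 5 for _ in range(4)])
box_kernel = [[1.0] * 3 for _ in range(3)]
outputs = list(stream_stencil(memory.rows(), 5, box_kernel))
```

A naive evaluation would fetch nine values per output cell, whereas this arrangement reads each input cell once and reuses it from the buffer, at the cost of storing three rows locally.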

In this work, we investigate the use of FPGAs in the solution of a 3D ocean model such as the Princeton Ocean Model (POM) [8]. The governing equations of the POM are more complex than those of the barotropic ocean model, but present-day multi-million gate FPGAs are large enough to accommodate such a large arithmetic unit. One of the key elements of the implementation is to exploit the locality of the computations, which makes it possible to reduce the I/O bandwidth requirements of the processor. Additionally, the arithmetic unit should be optimized to solve the governing equations of the model using fixed-point numbers, because floating-point arithmetic requires a large area. The expected speedup of this architecture over a Pentium IV 3 GHz processor is 2-3 orders of magnitude. Moreover, the architecture makes it possible to connect the processing elements in an array structure, which further improves the computing performance.
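As a rough illustration of the fixed-point arithmetic mentioned above, the following sketch represents real values as scaled integers and multiplies them using only integer operations. The word length (32 bits), the number of fractional bits (16), and all function names are assumptions chosen for the example; the precision actually used in the architecture is not stated here.

```python
# Hypothetical Q-format example: values are stored as integers scaled
# by 2**FRAC_BITS. The bit widths below are assumed, not from the paper.
FRAC_BITS = 16
TOTAL_BITS = 32


def to_fixed(x, frac_bits=FRAC_BITS):
    """Convert a float to its nearest fixed-point integer representation."""
    return int(round(x * (1 << frac_bits)))


def to_float(q, frac_bits=FRAC_BITS):
    """Convert a fixed-point integer back to a float."""
    return q / (1 << frac_bits)


def fixed_mul(a, b, frac_bits=FRAC_BITS):
    """Multiply two fixed-point values and rescale, rounding to nearest."""
    return (a * b + (1 << (frac_bits - 1))) >> frac_bits


def saturate(q, total_bits=TOTAL_BITS):
    """Clamp a value to the signed range of the assumed word length."""
    hi = (1 << (total_bits - 1)) - 1
    lo = -(1 << (total_bits - 1))
    return max(lo, min(hi, q))


# Usage: multiply 1.5 by -2.25 using integer operations only.
product = saturate(fixed_mul(to_fixed(1.5), to_fixed(-2.25)))
approx = to_float(product)
```

A fixed-point multiplier of this kind needs only an integer multiply and a shift, which is why it tends to occupy far less logic than a floating-point unit; the trade-off is that the value range and resolution must be chosen in advance to suit the model variables.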

**References**

- [1] T. Roska and L. O. Chua, “The CNN Universal Machine. An analogic array computer”, IEEE Trans. on Circuits and Systems-II, vol. 40, pp. 163-173, 1993.
- [2] T. Roska, T. Kozek, D. Wolf, and L. O. Chua, “Solving Partial Differential Equations by CNN”, Proc. of European Conf. on Circuits Theory and Design, 1992.
- [3] P. Szolgay, G. Vörös, and Gy. Erőss, “On the Applications of the Cellular Neural Network Paradigm in Mechanical Vibrating System”, IEEE Trans. on Circuits and Systems-I: Fundamental Theory and Applications, vol. 40, no. 3, pp. 222-227, 1993.
- [4] Z. Nagy and P. Szolgay, “Configurable Multi-Layer CNN-UM Emulator on FPGA”, IEEE Trans. on Circuits and Systems-I: Fundamental Theory and Applications, vol. 50, pp. 774-778, 2003.
- [5] Z. Nagy and P. Szolgay, “Numerical solution of a class of PDEs by using emulated digital CNN-UM on FPGAs”, Proc. of 16th European Conf. on Circuits Theory and Design, Cracow, September 1-4, 2003.
- [6] Z. Nagy and P. Szolgay, “Emulated Digital CNN-UM Implementation of a Barotropic Ocean Model”, Proc. of the International Joint Conference on Neural Networks, IJCNN 2004, Budapest, Hungary, July 25-29, 2004.
- [7] Celoxica Ltd. Homepage [Online]. Available: http://www.celoxica.com
- [8] The Princeton Ocean Model (POM) Homepage [Online]. Available: http://www.aos.princeton.edu/WWWPUBLIC/htdocs.pom/index.html