Michael J. White* and Clay Gloster, Jr., Ph.D., P.E.
Department of Electrical & Computer Engineering
Howard UniversityAbstract
This paper presents the design and implementation of a reconfigurable processor that is used to implement the Discrete Fourier Transform (DFT) algorithm for complex input data. Both real and imaginary input data is represented using the IEEE standard for single precision floating point numbers. This reconfigurable processor is very similar to the Dynamic Instruction Set Computer (DISC) developed in the mid 1990's.1 While the DISC processor functions similar to a cache loading hardware for a new instruction that has not been used recently into the FPGA during program execution, this research uses a library of processors that each contain a small set of specialized instructions. The quasi-dynamic processor library consists of a set of processors with a few custom instructions that are tailored to each application. This paper demonstrates the effectiveness of this approach for the DFT.
The instruction set of the reconfigurable processor that has been developed allows the reuse of particular op-codes for different instructions that are loaded into the FPGA. This combination microcomputer/instruction set provides the performance advantages of Reduced Instruction Set Computers (RISC) as well as the benefits of a large instruction set offered by Complex Instruction Set Computers (CISC). Since each processor contained in the library uses a maximum of 16 unique instructions, the instruction decode logic is simple and the maximum clock frequency approaches that of RISC processors. With this approach, the library can theoretically contain an infinite number of unique instructions. Therefore, the quasi-dynamic reconfigurable processor- based approach can offer a large number of instructions, which is an advantage of CISC processors.
At Goddard Space Flight Center, work is under way to develop fast optical control algorithms using the Fourier transform. The desired algorithm should not be required to use an input vector that is of size 2N. While zero filling, the placing zeroes in an array of less than 2N until the array is of size 2N could be used, it is desirable to determine if this is avoidable. Another major goal is for the optical control algorithm to process a 512x512 array in 0.01 seconds.2 Consequently, since the data array can be some size other than 2N and it is still desirable to use Fourier analysis for the control application, the DFT is the obvious candidate.
The reconfigurable processor includes a flexible data path and hard-wired control unit that can be configured for each application. The processor instruction set includes a number of floating point vector instructions to enhance processor performance. A standard interface that enables different function cores to be loaded into the data unit that are specific to each application. For the DFT, a core was developed that includes several smaller cores that have been interconnected. These sub-cores include: a floating-point multiplier, adder, complex accumulator, and sine/cosine look-up table.
While many researchers have implemented the Fourier transform on FPGAs,3,4 these algorithms are typically restricted to real input data that is represented using fixed-point integer arithmetic. This approach is prevalent primarily because previous FPGAs did not contain enough logic resources to fit several modules. Today, the logic capacities of FPGAs are in the range of 2-4 million gate equivalents with maximum clock rates on the order of hundreds of MHz. More specifically, the FPGA used in this paper can fit approximately 100 floating point multipliers/adders on a single chip.
This paper presents the implementation of the Discrete Fourier Transform on a reconfigurable processor. The DFT is used as an example application to demonstrate both the flexibility and potential performance gains of the reconfigurable processor. Details of the processor design are presented along with execution times for a 1024-point DFT.
References
* Member, AIAA
- M. J. Wirthlin and B. L. Hutchings. DISC: A Dynamic Instruction Set Computer. In Third IEEE Workshop on FPGAs for Custom Computing Machines, 1995.
- Personal correspondence, Semion Kizhner, 2003.
- Botros, N and W. Zakhem, "FFT Processor Using Field Programmable Gate Arrays", Proceedings of the Fifth Annual IEEE International ASIC Conference and Exhibit, 1992, Sept 21-25, 1992, Rochester, NY, pages 115-118
- Szedo, G, V. Yang, and C. Dick, "High Performance FFT Processing using Reconfigurable Logic", Conference Record of the Thirty-Fifth Asilomar Conference on Signals, Systems, and Computers, 2001, Pacific Grove, CA, pages 1353-1356 vol.2.