"A Programmable Single Chip Digital Signal Processing Engine"
Paul Chiang and Pius Ng
Digital signal processing is widely used in satellite, radar and other surveillance applications. Ever-growing data array sizes and increasingly sophisticated processing algorithms continue to push the demand for processing power. Current solutions to these high-performance computing requirements include special-function ASICs and programmable FPGAs. The ASIC approach can deliver the required computing power. However, it has a high cost and a long development time, and its programmability is limited to parameters and configurations. A redesign is normally required for a major upgrade or a scale-up of the data set. The programmable FPGA offers the flexibility of modification and upgrades, but an individual FPGA cannot deliver the required performance. Consequently, multiple FPGA chips or multiple boards are used to meet the computational demand. This paper describes a system solution with a high-performance programmable device based on the Field Programmable Object Array (FPOA) technology. The SOA13D40-01 device provides programmability without sacrificing performance. As an example, the implementation of a full digital signal processing application is described in this paper.
From raw sensor data to final information extraction and analysis, a generic digital signal processing system usually includes the following components:
· Data extraction: Receive sensor data, sort and select useful data, and perform preprocessing to prepare for down-stream algorithms.
· Spatial vs. temporal domain processing: The signal normally covers two spatial dimensions over time. Some algorithms apply to the 2D dataset, such as an average or low-pass filter to smooth the signal, or another 2D spatial filter to detect an edge or an abnormal signal. Similar algorithms can be applied to the time domain data to extract information over time.
· Frequency vs. time domain processing: Some signal processing algorithms are more effective in the frequency domain. The time domain data is first transformed into the frequency domain using a transformation operator such as an FFT. A simple filter is then applied to the FFT output, and an IFFT converts the result back to the time domain. A typical example is convolution using the FFT.
· Feature extraction: After the above processing stages, a set of criteria is applied to extract pre-defined features.
· Characterization: A final conclusion is drawn, or information is gathered, based on the values of the collected feature set.
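The frequency domain step in the list above (FFT, simple filter on the FFT output, IFFT) can be illustrated with a minimal software sketch of convolution using the FFT. This is not the FPOA implementation described in the paper; the function names `fft`, `fft_convolve` and `direct_convolve` are hypothetical, and a plain recursive radix-2 FFT is assumed purely for illustration.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 Cooley-Tukey FFT (unscaled); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor applied to the odd-indexed half.
        tw = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out


def fft_convolve(signal, kernel):
    """Linear convolution: FFT both inputs, multiply pointwise, then IFFT."""
    out_len = len(signal) + len(kernel) - 1
    n = 1
    while n < out_len:  # zero-pad to a power of two to avoid circular wrap-around
        n *= 2
    a = fft(list(signal) + [0.0] * (n - len(signal)))
    b = fft(list(kernel) + [0.0] * (n - len(kernel)))
    product = [p * q for p, q in zip(a, b)]  # the "simple filter" on the FFT output
    y = fft(product, inverse=True)
    return [v.real / n for v in y[:out_len]]  # 1/n scaling completes the inverse


def direct_convolve(signal, kernel):
    """Reference time domain convolution, used only to check the FFT result."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(kernel):
            out[i + j] += s * h
    return out


if __name__ == "__main__":
    sig = [1.0, 2.0, 3.0, 4.0]
    ker = [0.5, 0.5]  # two-tap averaging (low-pass) filter
    print(fft_convolve(sig, ker))
    print(direct_convolve(sig, ker))
```

Both functions should agree to within floating-point rounding. The FFT route pays off for long kernels, where the direct method's cost grows with the product of the two lengths.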
Besides these basic processing blocks, a complete signal processing system also requires high-speed I/O and a high-bandwidth interface to external memory for data storage and retrieval. A single SOA13D40-01 device supports all of these requirements with the following features:
- 400 silicon objects, comprising:
  - 256 ALU objects that perform arithmetic operations and truth function generation
  - 80 RF (register file) objects that provide fast, distributed data storage
  - 64 MACs that deliver the main computation for most signal processing algorithms
- 1 GHz core clock
- 7.6 Kbytes of internal memory organized into 12 distributed blocks for concurrent access at 1/4 or 1/3 of the core clock speed
- High-speed I/O:
  - Two external memory interfaces that support RLDRAMII DDR at 250 MHz, that is, 2 Gbytes/s of bandwidth per port
  - Four GPIO ports with 40 programmable pins per port, supporting data rates of up to 125 MHz
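The figures in the feature list can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated above. The per-transfer width it derives is an inference from the stated 2 Gbytes/s (taken here as 2 x 10^9 bytes/s) and the DDR clock, not a figure given in the paper, and all variable names are hypothetical.

```python
# Object counts as listed in the feature summary.
objects = {"ALU": 256, "RF": 80, "MAC": 64}
total_objects = sum(objects.values())

# Internal memory access rates at 1/4 and 1/3 of the 1 GHz core clock.
core_clock_hz = 1_000_000_000
mem_clock_quarter_hz = core_clock_hz / 4
mem_clock_third_hz = core_clock_hz / 3

# External memory: DDR transfers data on both clock edges.
ddr_clock_hz = 250_000_000
transfers_per_second = ddr_clock_hz * 2
stated_bandwidth_bytes = 2_000_000_000  # "2 Gbytes/s per port", assumed decimal

# ASSUMPTION: inferred, not stated in the paper.
implied_bytes_per_transfer = stated_bandwidth_bytes / transfers_per_second

if __name__ == "__main__":
    print("total silicon objects:", total_objects)
    print("memory clocks (MHz):", mem_clock_quarter_hz / 1e6, mem_clock_third_hz / 1e6)
    print("implied bytes per transfer:", implied_bytes_per_transfer)
```

The three object types sum to the 400 silicon objects quoted, and the stated per-port bandwidth is consistent with a payload of 4 bytes per transfer under the assumptions noted.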
This paper describes how these computing resources were grouped into multiple blocks to support system functions such as external memory access and arbitration, an external host processor interface, and an internal memory bus that links up to three processing blocks. Each processing block can be programmed to perform data extraction, spatial or temporal processing, frequency domain processing, or final feature extraction.
This paper concludes with a performance summary of a space satellite application that has been demonstrated on an SOA13D40-01 device with a 400 MHz core clock and 206 silicon objects.
2005 MAPLD International Conference Home Page