PROGRAMMABLE TECHNOLOGIES WEB SITE

A scientific study of the problems of digital engineering for space flight systems,
with a view to their practical solution.


Arithmetic


Acknowledgements: A special thanks to Ben "VHDLCohen" for a number of these links.

VHDL and Generators

Arithmetic Module Generator for High Performance VLSI Designs


VHDL Library of Arithmetic Units
MICROSWISS Project TR-EZ-001

Reto Zimmermann
Integrated Systems Laboratory
Swiss Federal Institute of Technology (ETH) Zurich, Switzerland


Multiplication

Distributed Arithmetic Abstract
Distributed arithmetic is a bit-level rearrangement of a multiply-accumulate that hides the multiplications. It is a powerful technique for reducing the size of a parallel hardware multiply-accumulate, and it is well suited to FPGA designs. It can also be extended to other sum functions such as complex multiplies, Fourier transforms and so on.
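
As a rough illustration of the rearrangement described above, the following Python sketch computes a sum of products using only a lookup table, shifts and adds. It is a software model under stated assumptions, not code from the linked page: the names build_da_lut and da_dot_product, the 8-bit two's-complement input width and the constant coefficients are all hypothetical choices made for the example.

```python
def build_da_lut(coeffs):
    """Precompute every possible sum of coefficients.

    Entry `addr` holds the sum of coeffs[k] for each bit k set in addr.
    (Hypothetical helper; in hardware this would be a small ROM.)
    """
    n = len(coeffs)
    return [
        sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
        for addr in range(1 << n)
    ]


def da_dot_product(coeffs, samples, width=8):
    """Compute sum(coeffs[k] * samples[k]) without any multiplication.

    Assumes the samples are signed integers that fit in `width` bits
    (two's complement) and the coefficients are fixed constants.
    """
    if len(coeffs) != len(samples):
        raise ValueError("coeffs and samples must have the same length")
    lo, hi = -(1 << (width - 1)), (1 << (width - 1)) - 1
    if any(not lo <= s <= hi for s in samples):
        raise ValueError("sample does not fit in the assumed width")

    lut = build_da_lut(coeffs)
    acc = 0
    for bit in range(width):
        # Gather bit number `bit` of every sample into one table address.
        addr = 0
        for k, s in enumerate(samples):
            addr |= ((s >> bit) & 1) << k
        term = lut[addr] << bit
        # The most significant bit carries negative weight in two's complement.
        if bit == width - 1:
            acc -= term
        else:
            acc += term
    return acc


coeffs = [3, -5, 7, 2]
samples = [10, -20, 127, -128]
result = da_dot_product(coeffs, samples)
```

The table grows as 2^N with the number of terms N, so a practical design would likely split a long sum into several smaller tables and add their outputs.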
Multiplication in FPGAs Abstract
Multiplication is basically a shift-and-add operation. There are, however, many variations on how to do it, and some are more suitable for FPGA use than others. This page is a brief tutorial on multiplication hardware.
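
The shift-and-add idea can be sketched in a few lines of Python. This is only a minimal model of the simplest variation (one multiplier bit examined per step, unsigned operands); the function name shift_add_multiply and the default 8-bit width are assumptions for the example, and the linked tutorial covers other arrangements that this sketch does not attempt to show.

```python
def shift_add_multiply(a, b, width=8):
    """Multiply two unsigned `width`-bit integers using only shifts and adds.

    Each set bit of the multiplier b contributes a copy of the
    multiplicand a, shifted left by that bit's position.
    """
    limit = 1 << width
    if not (0 <= a < limit and 0 <= b < limit):
        raise ValueError("operands must be unsigned and fit in the assumed width")

    product = 0
    for bit in range(width):
        if (b >> bit) & 1:
            product += a << bit  # add the shifted partial product
    return product


example = shift_add_multiply(13, 11)
```

A serial hardware version would perform one loop iteration per clock, while a parallel array multiplier would compute all the partial products at once and sum them in an adder tree.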

 

Division

LEON Division


Hardware Divider For Leon Processor

The European Space Agency has developed a processor named LEON, and a synthesizable VHDL model of the processor is also available. The project involves the design and implementation of one of the following arithmetic units in two technologies: FPGA and Synopsys standard library.

fp_divider.pdf


A Floating Point Divider for RC Systems

  • An Overview of Floating Point Arithmetic
  • IEEE Floating Point Formats
  • Examples of Floating Point Division
  • Examples of Floating Point Addition
  • Implementation of a 32-bit Floating Point Divider
  • Conclusions
http://www.eng.uci.edu/~alberto/PhDdiss/an99phd.pdf

Low Power Division and Square Root

Abstract
The general objective of our work is to develop methods to reduce the energy consumption of arithmetic modules while maintaining the delay unchanged and keeping the increase in the area to a minimum. Here, we present techniques for dividers and square root units realized in CMOS technology. The energy dissipation reduction is carried out at different levels of abstraction: from the algorithm level down to the implementation, or gate, level. We describe the use of techniques such as switching off blocks that are not active, retiming, dual voltage, and equalizing the paths to reduce glitches. Also, we describe modifications in the on-the-fly conversion and rounding algorithm and in the redundant representation of the residual in order to reduce the energy dissipation. The techniques and modifications mentioned above are applied to several division and square root schemes, realized with static CMOS standard cells, for which a reduction in the energy dissipation of about 40 percent is obtained with respect to the standard implementation optimized for minimum delay. This reduction is expected to be even larger if low-voltage gates, for dual voltage implementation, are available.

dh_arith_97.pdf

SRT Division Architectures and Implementations

SRT dividers are common in modern floating point units.  Higher division performance is achieved by retiring more quotient bits in each cycle. Previous research has shown that realistic stages are limited to radix-2 and radix-4.  Higher radix dividers are therefore formed by a combination of low-radix stages. In this paper, we present an analysis of the effects of radix-2 and radix-4 SRT divider architectures and circuit families on divider area and performance.  We show the performance and area results for a wide variety of divider architectures and implementations.  We conclude that divider performance is only weakly sensitive to reasonable choices of architecture but significantly improved by aggressive circuit techniques.  Lang [5] analyze the tradeoffs of using several of these optimizations in the context of static CMOS standard-cells.  Williams [8] presents a self-timed dynamic CMOS divider comprising a ring of five radix-2 stages that incorporates several of these techniques, and he also presents an analysis of the performance and area effects of the architectural components. Prabhu [9] presents the tradeoffs encountered when designing the Sun UltraSparc radix-8 divider.  In contrast to previous works, this paper analyzes in detail the effects of both circuit style and divider architecture on the area and performance of divider implementations.  We present the performance results using the technology-independent
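
For readers unfamiliar with the recurrence behind a radix-2 SRT stage, the Python sketch below retires one quotient digit from the set {-1, 0, 1} per step. It is a simplified model and not taken from the paper: the function name srt_radix2_divide is hypothetical, exact fractions stand in for hardware registers, and the digit is selected by comparing the full shifted residual against +1/2 and -1/2, whereas a real divider would typically inspect only a few leading bits of a redundant residual.

```python
from fractions import Fraction


def srt_radix2_divide(x, d, n_digits):
    """Radix-2 SRT division sketch.

    Assumes the divisor is normalized to 1/2 <= d < 1 and |x| <= d.
    Returns (quotient, final_residual, digits) such that
    x == quotient * d + final_residual / 2**n_digits exactly.
    """
    x, d = Fraction(x), Fraction(d)
    half = Fraction(1, 2)
    if not (half <= d < 1):
        raise ValueError("divisor must be normalized to [1/2, 1)")
    if abs(x) > d:
        raise ValueError("dividend magnitude must not exceed the divisor")

    w = x          # residual (partial remainder)
    digits = []
    for _ in range(n_digits):
        shifted = 2 * w
        # Quotient digit selection with redundant digit set {-1, 0, 1}.
        if shifted >= half:
            q = 1
        elif shifted < -half:
            q = -1
        else:
            q = 0
        w = shifted - q * d   # recurrence: w[j+1] = 2*w[j] - q[j+1]*d
        digits.append(q)

    # Convert the signed-digit quotient to an ordinary number.
    quotient = sum(Fraction(q, 2 ** (j + 1)) for j, q in enumerate(digits))
    return quotient, w, digits


quotient, residual, digits = srt_radix2_divide(Fraction(3, 8), Fraction(3, 4), 8)
```

Because a digit of 0 is allowed, the selection does not need an exact comparison against the divisor, which is what lets hardware implementations use a short, fast estimate of the residual.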

 

CORDIC

The CORDIC Algorithm

http://www.fpga-guru.com/files/crdcsrvy.pdf

Summary

CORDIC is an acronym for COordinate Rotation DIgital Computer. It is a class of shift-add algorithms for rotating vectors in a plane. In a nutshell, the CORDIC rotator performs a rotation using a series of specific incremental rotation angles selected so that each is performed by a shift and add operation.

Rotation of unit vectors provides us with a way to accurately compute trig functions, as well as a mechanism for computing the magnitude and phase angle of an input vector. Vector rotation is also useful in a host of DSP applications including modulation and Fourier Transforms.
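
The rotation described in this summary can be sketched as a small fixed-point model in Python. The code is an illustration only, not taken from the survey: the name cordic_sin_cos, the 30 fractional bits and the 24 iterations are assumptions, and the arctangent table and gain constant are computed with the math module here, where hardware would store them as precomputed constants.

```python
import math

FRAC_BITS = 30            # assumed fixed-point precision
ONE = 1 << FRAC_BITS      # fixed-point representation of 1.0


def cordic_sin_cos(angle, iterations=24):
    """Return (sin, cos) of `angle` in radians using CORDIC rotation mode.

    Valid only inside the basic convergence range of about +/-1.74 rad;
    larger angles would need a quadrant reduction step first.
    """
    if abs(angle) > 1.74:
        raise ValueError("angle outside the CORDIC convergence range")

    # Incremental angles atan(2^-i), stored as fixed-point integers.
    atan_table = [round(math.atan(2.0 ** -i) * ONE) for i in range(iterations)]

    # Each micro-rotation stretches the vector; start with 1/gain to compensate.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))

    x = round(ONE / gain)
    y = 0
    z = round(angle * ONE)    # remaining angle to rotate through

    for i in range(iterations):
        # Rotate toward z == 0 using only shifts and adds.
        if z >= 0:
            x, y, z = x - (y >> i), y + (x >> i), z - atan_table[i]
        else:
            x, y, z = x + (y >> i), y - (x >> i), z + atan_table[i]

    return y / ONE, x / ONE


sine, cosine = cordic_sin_cos(math.pi / 6)
```

Each iteration adds roughly one bit of accuracy, so the iteration count and word width are normally chosen together to match the precision the application needs.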


FPGA Implementation of Sine and Cosine Generators Using the CORDIC Algorithm

Tanya Vladimirova
Hans Tiggeler
Surrey Space Center

Military and Aerospace Applications of Programmable Devices and Technologies International Conference
September 28-30, 1999

A2_Vladimirova_P.pdf
A2_Vladimirova_P.doc

Abstract
This paper is concerned with the FPGA implementation of CORDIC schemes for fast and silicon-area-efficient computation of the sine and cosine functions. The results of a theoretical investigation into redundant CORDIC are presented. A summary of CORDIC synthesis results based on Actel and Xilinx FPGAs is given. Finally, applications of CORDIC sine and cosine generators in small satellites are discussed.

Keywords
CORDIC, sine, cosine, FPGA, synthesis, redundant signed-digit system.


Last Revised: February 03, 2010
Digital Engineering Institute
Web Grunt: Richard Katz