NASA Office of Logic Design

A scientific study of the problems of digital engineering for space flight systems,
with a view to their practical solution.


PROGRAMMABLE LOGIC APPLICATION NOTES

March, 1997

Richard Katz
Electronic Systems Branch
Goddard Space Flight Center
(301) 286-9705
richard.b.katz@nasa.gov

This column will be provided each quarter as a source for reliability, radiation results, NASA capabilities, and other information on programmable logic devices and related applications. This quarter the column will expand on topics covered in previous editions and add to the data set; additionally, more tips for programmable logic will be discussed along with a continuation of the 'design and test' section. Many of the tips are based on actual case studies in the field at various NASA and industry sites and are fairly common. If you have information that you would like to submit or an area you would like discussed or researched, please give me a call or send an e-mail.

DESIGN AND PROGRAMMING TIPS

Single Point Failures. This section will discuss some potential single point failure (SPF) modes in Actel FPGAs and some device-specific considerations. Note that virtually all integrated circuits have 'SPFs' in them; this includes everything from advanced microprocessors down to a simple quad NAND gate. For critical applications, standard design rules should be followed to eliminate SPFs, with redundant functions placed in separate ICs. This applies to microprocessors, discrete logic chips, ASICs, and FPGAs. An exception to this rule is the class of circuits designed the way many dual-redundant MIL-STD-1553B transceivers are: there are no common pins, no common silicon resources, and two independent die are used.

Charge pumps are used in Actel FPGAs to bias the transistors that isolate modules of the logic cells during programming and connect them in normal operation. Failures of the charge pump or the charge distribution system may cause unpredictable operation of the device, and failures may propagate to the I/O pins. The possibility that a single point failure could cause multiple logic modules and multiple I/O pins to stop following their truth tables cannot be ruled out. Note that NASA and Actel have not seen any charge pump failures during normal operation; the only related failures have been during radiation testing, where this mechanism normally sets the total dose limit for devices of this class.

It should be noted that the charge pump takes a significant period of time to start after power-up, during which the I/O pins are in an unknown state. This time is normally less than 1 ms, generally shorter than the startup time of other components such as high-Q crystal oscillators, but not insignificant depending upon how the FPGA is used in the system. This is in contrast to normal CMOS devices, which generally have I/O pins that behave well and follow their truth tables at quite a low voltage. For instance, FPGA I/O modules that have been programmed as inputs may behave temporarily as outputs in the logical '1' state, and may temporarily source current into the drivers connected to them. Device outputs are also not guaranteed to follow their truth tables and may source current at startup even though they 'logically' should be sinking current to produce a logic '0', as is frequently used in power-on reset circuits. The time period for the startup transient depends on the power supply slew rate and other factors, including radiation exposure, which has been shown to increase the startup time and current requirements. With increasing radiation dose, the device consumes larger static currents and eventually fails; the failure level is dependent upon the device selected and is generally regarded as a lot-specific parameter. The only known or suspected charge pump/distribution problems seen in Actel FPGAs have been from total ionizing dose radiation testing.

FPGAs, like any VLSI gate array device, can have a variety of failure modes that affect more than a single logic gate. These include VCC and GND distribution failures, clock buffering and distribution failures, failure of lines used for tie-offs, etc. Note that in the Actel architecture the CLKBUF macro is often used to drive high fan-out loads such as flip-flop clears, with the Act 2 architecture having no dedicated clear lines. Also, a relatively small number of special-purpose pins, such as the MODE pin, must remain grounded for correct operation. This is especially important in a radiation environment. The bond wire connecting the MODE pin to the MODE pad, and the connections of the MODE pin and signals derived from it throughout the FPGA, can be considered potential single point failures. Intermittent device operation has been reported from several laboratories where breadboard and/or engineering model hardware malfunctioned on the bench because of floating MODE pins.

Actel FPGAs need time to 'start', and care must be exercised for any critical spacecraft function implemented with FPGAs (or any other component): any design that directly connects the outputs of an FPGA to critical spacecraft controls could malfunction at power-up without any failure of the FPGA. Some precautions that may be taken include:

  1. Do not attach an FPGA input to the analog part of a power-on reset circuit.
  2. Buffer and isolate the FPGA outputs from any critical spacecraft controls that require proper operation during the startup transient.
  3. Provide appropriate error detection/correction/fail-safe and isolation schemes for applications that require tolerance to single point failures. For certain reliability levels, this may require putting redundant functions in separate IC packages, as is frequently done with discrete device designs.

Act 3 and C-Modules. For moderate SEU hardness in Actel devices, C-Module flip-flops are frequently used and can be substituted for S-Module flip-flops inserted in designs by Actgen, Actmap, etc. In the Act 3 architecture, there are 3 low-skew clocks: 2 clocks are similar to the Act 2 clocks; the third clock, which has the highest performance, is called the HCLK. The HCLK is optimized and cannot be used as a general-purpose signal. In particular, it cannot be used to drive the clock inputs of C-Module flip-flops.

NASA/GSFC Intellectual Property. NASA GSFC has purchased site licenses for VHDL code, which implements PCI Initiator, PCI Target, and UART functions. These are freely available for the GSFC community to use.

Adapters

Two additional ceramic quad flat pack (CQFP) to pin grid array (PGA) adapters have been designed. The first translates the A1460 from a QFP197 to a PGA207; the second translates the A14100 from a QFP256 to a PGA257. As with our other adapters, designs may be translated between these packages without having to rerun the place and route tools, preserving the FPGA's timing analysis.

Floating Inputs. This topic has come up several times in the last month and some additional data has been taken. For review, all unused pins are programmed by the Actel software to be outputs, driving low, and they should be left unconnected. If grounded, for instance, large currents may be seen on powerup. For the special pins such as SDI, DCLK, PRA, and PRB, see the note in a previous edition of EEE Links. Some data has been taken on the A1020x family, plotting ICC vs. Vin, for a 'floating input,' with Vin driven with a 22 kohm source impedance. Significant peak currents were observed with a maximum current increase of approximately 3 mA. It is suggested that all unused inputs be properly terminated.

HARDWARE INFO

MORE ON A1020B VARIANTS: The A1020B is processed at either Matsushita (MEC) or Texas Instruments (TI), with most radiation studies being performed on the MEC devices. Tables included in these application notes show that the A1020B is susceptible to latchup. Additionally, starting about January 1995, Actel began shipping 0.9 µm A1020B devices; no radiation data, to the best of our knowledge, is available on this version of the product. Further latchup testing on A1020B 1.0 µm samples has been completed. For the samples tested, the latchup threshold of the TI A1020B was < 25 MeV-cm²/mg. For the recent MEC device, the threshold was 28 MeV-cm²/mg and 38 MeV-cm²/mg. Obviously, the latchup threshold and lot-specific testing should be carefully considered for each mission.

MORE FOUNDRIES: Previously, Actel die were manufactured at either Matsushita (MEC) or Texas Instruments (TI). With newer products such as the Act 3, XL, and A3200DX families, parts may come from other foundries such as Winbond and Chartered; some devices, such as the A1460A and the A14100A, are available from multiple foundries, depending on the speed grade. Take care in the use of radiation data and feel free to contact the author for additional information or clarification.

MORE ACT 3 DEVELOPMENTS: A lot of work is ongoing with the Act 3 devices, the A1460A and the A14100A, as has been reported here previously. Recent work by NASA, Aerospace Corp., the University of Texas, and ESA includes total dose testing, heavy ion testing, and proton testing. Testing of a manufacturing lot of MEC A14100As showed a total dose capability of approximately 15 krads (Si), less than what was seen on the lot of MEC A1460As reported previously. Note that these results are better than those for the Winbond A1460A, which showed a capability of approximately 4 krads (Si). A preliminary conclusion is that the MEC Act 3 'A' series has superior total dose performance compared with the Winbond parts. For proton susceptibility, see the section on SEE Performance. Tests on the 'BP' series are planned in the near term. The 'B' designator indicates a move to the 0.6 µm process from 0.8 µm; the 'P' indicates PCI-compliant I/O.

Total Dose Results. As mentioned above in the Act 3 Developments section, a manufacturing lot of MEC A14100As was tested and performed well up to 15 krads (Si). An A1280XL 0.8 µm device (Winbond) was TID tested at a dose rate of 9.8 rad (Si) / second. At this high dose rate, the part failed between 2 and 4 krads (Si) with a current increase of approximately 700 mA. These results are consistent with failure levels observed during proton testing. Additionally, several A1280XL 0.6 µm devices (Chartered) failed at 2.5 krads (Si) during proton exposure.
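As a rough sense of scale for the dose rate quoted above, the exposure time needed to accumulate a given dose is simply the dose divided by the dose rate. The Python sketch below is illustrative only: the function name seconds_to_dose is hypothetical, and it assumes a constant dose rate throughout the exposure.

```python
def seconds_to_dose(dose_krad, rate_rad_per_s):
    """Exposure time in seconds to accumulate dose_krad at a constant dose rate."""
    if rate_rad_per_s <= 0:
        raise ValueError("dose rate must be positive")
    return dose_krad * 1000.0 / rate_rad_per_s


if __name__ == "__main__":
    # Failure window reported above: between 2 and 4 krads (Si) at 9.8 rad (Si)/s.
    for dose in (2.0, 4.0):
        print(f"{dose} krad (Si): {seconds_to_dose(dose, 9.8):.0f} s")
```

At 9.8 rad (Si)/s, the 2 to 4 krad (Si) failure window corresponds to roughly 3.4 to 6.8 minutes of exposure.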

LPGA's

An initial test was conducted on an LPGA (laser programmable gate array). This device is configured by having a laser remove all unwanted connections in the routing array, as opposed to an antifuse or SRAM-based FPGA where desired connections are made. A Chip Express QYH500 channeled gate array was tested with heavy ions at BNL in February 1997. This device has approximately 60,000 usable gate array gates, is 883B qualified, and has a very quick turnaround time; this fits a niche between FPGAs and mask-programmable ASICs. The observed SEU threshold was approximately 35 MeV-cm²/mg and latchup was first detected at approximately 60 MeV-cm²/mg. Additionally, the device was subjected to a significant total dose during heavy ion testing and showed a slight increase in current and no functional failures. This technology will be evaluated further.

PROGRAMMABLES AND PROTONS

Further tests have been conducted on several programmables with low upset thresholds for susceptibility to protons; data is presented in a column of the SEE tables. The A14100A, the A1280XL 0.8 µm device, the A1280XL 0.6 µm device, and the CLAy-31 were tested with 200 MeV protons. All of the Actel devices' S-Module flip-flops were upsettable. For storage in the I/O Modules, upset was observed in Act 3 but not in any of the Act 2 devices, consistent with the heavy ion data. Upset was also detected for the CLAy-31 SRAM-based FPGA.

PIPELINE PERFORMANCE.

For high speed applications, operations or computations are frequently broken into stages with pipeline registers separating them. This technique can be used to maximize performance at the expense of additional flip-flops and latency through the pipeline. The increased use of pipeline registers may cause a shortage of dedicated flip-flops, particularly if TMR is used to protect against SEUs; this may necessitate the use of C-Module flip-flops in the pipeline.

Since clock skews cannot be controlled to maximize performance in FPGAs, designing for maximum performance requires selection of flip-flop types. In the event of a shortage of S-Module dedicated flip-flops, choices have to be made where to put C-Module based flip-flops.

Sample pipelines have been designed, placed and routed, and simulated to study the effects of flip-flop selection on pipeline speed and to investigate the often-heard remark that S-Modules are required for high performance. The sample design consisted of 4 blocks, each block having 10 pairs of flip-flops with three gates in between each flip-flop pair. The first two gates' output nets were given the attribute PRESERVE to instruct the combiner not to eliminate the logically unneeded gate. The third gate's output was not PRESERVED to enable the S-Module to combine the last logic gate for maximum performance. The 4 blocks used all of the combinations of flip-flops for pipelining (CC, CS, SC, and SS) to compare their performance. Simulations were run, under worst-case military voltage/temperature conditions using the RH1280 to represent Act 2 devices and the A1460A-1 to represent Act 3 devices. Designer 3.1 on a PC was used for all of the work with normal routing. Results listed are maximums, with the TIMER program including setup times in the calculations.

As can be seen from the table, the delays are broken into two separate groups with the critical factor being the choice of flip-flop receiving the data. Based on these simulations, performance will not be negatively impacted by using C-Module flip-flops either in non-critical pipeline stages or at the beginning of a critical pipeline stage.

      RH1280                  A1460A-1
      tPD(max)  Path Type     tPD(max)  Path Type
1     30.7      CC            25.9      CC
2     30.5      CC            25.8      SC
3     30.4      CC            25.4      CC
4     30.1      CC            25.3      SC
5     30.0      SC            25.2      SC
6     30.0      CC            25.2      CC
7     29.9      SC            25.2      CC
8     29.8      SC            25.1      CC
9     29.8      CC            25.1      SC
10    29.7      CC            24.8      CC
11    29.7      SC            24.7      SC
12    29.6      CC            24.7      SC
13    29.4      CC            24.4      SC
14    29.4      SC            23.6      CC
15    29.3      SC            23.4      SC
16    29.3      CC            23.0      CC
17    29.2      SC            22.9      SC
18    29.0      SC            22.8      SC
19    28.5      SC            22.8      CC
20    28.3      SC            22.1      CC
21    20.0      CS            15.8      CS
22    19.8      CS            15.7      CS
23    19.3      CS            15.7      SS
24    19.2      CS            15.6      SS
25    19.0      CS            15.4      CS
26    19.0      SS            15.3      SS
27    18.9      SS            15.3      CS
28    18.8      CS            15.3      SS
29    18.8      CS            15.3      SS
30    18.8      CS            14.9      SS
31    18.7      CS            14.9      SS
32    18.7      SS            14.9      CS
33    18.7      SS            14.8      SS
34    18.7      CS            14.8      SS
35    18.5      SS            14.6      CS
36    18.4      SS            14.4      CS
37    18.3      SS            14.4      CS
38    18.3      SS            14.4      SS
39    18.2      SS            14.4      CS
40    18.1      SS            14.3      CS
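The split into two delay groups can be checked directly from the table. The Python sketch below is illustrative only: it uses six RH1280 rows copied from the table, the helper name group_by_receiver is hypothetical, and it assumes that the path-type code lists the source flip-flop first and the receiving flip-flop second (so 'CS' is a C-Module flip-flop feeding an S-Module flip-flop), which is the reading consistent with the discussion of the receiving flip-flop.

```python
from collections import defaultdict

# A few (tPD(max), path type) rows copied from the RH1280 columns of the table.
RH1280_SAMPLE = [
    (30.7, "CC"), (30.0, "SC"), (28.3, "SC"),
    (20.0, "CS"), (19.0, "SS"), (18.1, "SS"),
]


def group_by_receiver(rows):
    """Return {receiver type: (min delay, max delay)} for the given rows."""
    groups = defaultdict(list)
    for delay, path_type in rows:
        receiver = path_type[1]  # second letter = receiving flip-flop (assumed)
        groups[receiver].append(delay)
    return {r: (min(d), max(d)) for r, d in groups.items()}


if __name__ == "__main__":
    for receiver, (lo, hi) in sorted(group_by_receiver(RH1280_SAMPLE).items()):
        print(f"receiving {receiver}-Module flip-flop: {lo} to {hi}")
```

Grouping on the receiving flip-flop alone separates the sample rows cleanly, while the type of the source flip-flop makes little difference within each group.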

Design and Test

This section will continue to address design and testability issues for programmable logic devices. The emphasis will be on what design engineers can do up front to ease the testing process and produce better, more testable designs. This edition will focus on some frequently seen design practices followed by a digital design guidelines/checklist.

Tying Inputs Together. For a high fan-in gate, two or more input pins should not be driven by the same input signal. Instead, use either a more specialized macro or tie the unused input pins to VCC or ground. This will also decrease the capacitance being driven and improve performance.

Chopper Circuit (gate-delay one-shot). The one-shot waveform, which is generated by the delay of one or more gates, does not provide a stable pulse width because of manufacturing variations. Thus, it should not be used in the logic design; it is also difficult to test, especially for ATPG. This structure comes in many forms: this is one example.

Digital Design Checklist

Initialization. Can all sequential circuits be initialized (e.g., MASTER CLEAR line included or a short initialization sequence)? Even a simple toggle flip-flop used as a clock divider presents problems to ATE and ATPG programs. Although for many circuits you can show mathematically that the circuit will self-initialize, this is difficult for ATPG to determine.
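The toggle flip-flop problem can be illustrated with a small three-valued simulation. This is only a sketch of how a simulator or ATPG tool might treat an unknown power-up state, not a model of any particular tool; the names X, toggle_ff, and run are hypothetical.

```python
X = None  # unknown logic value, as a simulator would model an uninitialized node


def logic_not(value):
    """Invert a logic value; the inverse of unknown is still unknown."""
    return X if value is X else 1 - value


def toggle_ff(q, clear=0):
    """Next state of a toggle flip-flop; an asserted clear forces the output to 0."""
    if clear == 1:
        return 0
    return logic_not(q)


def run(cycles, clear_first=False):
    """Clock the flip-flop from an unknown power-up state and return its outputs."""
    q = X
    trace = []
    for n in range(cycles):
        q = toggle_ff(q, clear=1 if (clear_first and n == 0) else 0)
        trace.append(q)
    return trace


if __name__ == "__main__":
    print(run(4))                    # stays unknown on every cycle
    print(run(4, clear_first=True))  # known from the first cycle onward
```

Without a clear, the output never leaves the unknown state in this model, even though the real divider would toggle correctly; a single clear cycle gives the tool a known starting point.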

State Machine Design.

Good state machine design dictates that there are no lockup states. Are the machines settable so that this logic may be tested? If the design uses a one-hot encoding, for instance, can an SEU activate two states resulting in a hazardous condition such as two tri-state drivers being simultaneously enabled?
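For the one-hot case, the check for an illegal state is inexpensive. The Python sketch below is a hypothetical illustration (the function names and the 4-bit example are assumptions): it tests whether a state register holds exactly one set bit and enumerates the states reachable from a legal state by a single bit flip.

```python
def is_one_hot(state, width):
    """True if exactly one of the low `width` bits of state is set."""
    s = state & ((1 << width) - 1)
    return s != 0 and (s & (s - 1)) == 0


def single_bit_upsets(state, width):
    """All register values reachable from state by flipping exactly one bit."""
    return [state ^ (1 << bit) for bit in range(width)]


if __name__ == "__main__":
    legal = 0b0100
    for upset in single_bit_upsets(legal, 4):
        print(f"{upset:04b} one-hot: {is_one_hot(upset, 4)}")
```

A single bit flip in a one-hot register always yields either no active state or two active states, so a check of this kind can flag the upset before, for instance, two tri-state drivers are enabled together. What the design does once the illegal state is detected is a separate decision.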

For long sequences, have the counters been 'broken' so that the test sequences are of reasonable length? If the counter is settable, the test length may be shortened even further.
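The saving from breaking a counter can be estimated with simple arithmetic. The Python sketch below is a hypothetical illustration, not a description of any particular design: it assumes a binary counter that, in test mode, is split into equal-width segments, each of which is clocked through all of its states one after another. The function name and the 24-bit example are chosen only for illustration.

```python
def clocks_to_cycle(bits, segments=1):
    """Clocks needed to step every segment of a counter through all its states.

    Assumes equal-width segments that are exercised one after another.
    """
    if bits <= 0 or segments <= 0 or bits % segments != 0:
        raise ValueError("bits must be a positive multiple of segments")
    segment_bits = bits // segments
    return segments * (2 ** segment_bits)


if __name__ == "__main__":
    print(clocks_to_cycle(24))     # unbroken 24-bit counter
    print(clocks_to_cycle(24, 3))  # broken into three 8-bit segments
```

Under these assumptions an unbroken 24-bit counter needs 16,777,216 clocks to reach terminal count, while three 8-bit segments need 768 clocks in total.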

Visibility. Have all key signals been made accessible to the ATE? Have these signals been given test points on the PCB?

Synchronization. Have clock signals been made directly controllable by the ATE and observable at the board level? Some FPGAs permit clock selection inside the part. For testability, it is often beneficial to route the selected clock off chip and then back into the programmable device. Circuits that use the output of a counter, for example, to clock other parts of a circuit are difficult for ATPG software to figure out. Additionally, repeated cycling through all clock/counter states generates a considerable number of vectors and increases test vector generation time (which is already too long).

Number of Clocks. Are all clocks of differing phases and frequencies derived from a single master (controllable) clock? If not, is the number of differing frequencies minimized? This will reduce the number of vectors needed to initialize the device and improve fault coverage.

Built In Test (BIT). Has built-in test been incorporated into the design? A small percentage of logic inside the programmable device, particularly in the larger devices, makes test program generation (and system checkout) much easier. Does the device include JTAG 1149.1 capability and is it enabled? Can Single Event Upsets or a software error put the part into a potentially inoperable or damaging test mode? Is the failure rate contribution of BIT within stated constraints? Is the additional power consumption attributable to BIT circuitry within stated constraints?

Redundancy. Have controllability and observability been considered for redundant circuits? Can each of the redundant elements and any voting mechanisms be tested for functionality by the ATE? During temperature tests with the equipment box closed?

Internal Oscillators. Allow internal oscillator circuits, if used, to be overridden by the test equipment (preferably by routing to the rest of the circuit via an external pin). ATPG software has a difficult time with internal oscillators.

Asynchronous Feedback. Are there asynchronous feedback loops? These are difficult to test. Additionally, they should be avoided since the pulse widths are difficult to control. For FPGAs, in particular, the delays are highly routing dependent. Clocking or asynchronously clearing multiple flip-flops with one pulse (e.g., the terminal count of a counter or sequencer) is especially troublesome.

Gated Clocks. Are gated clocks used? Are output signals gated with the clock? These are frequently 'glitch generators' and can be troublesome both from a testing and a standpoint.

SEE PERFORMANCE OF FPGAs

SUMMARY OF SEE PERFORMANCE OF VARIOUS FPGAs

Device           Feature Size  SEU LET th    Sat x-section  Temp      Upset w/ Protons
                 (µm)          (MeV-cm²/mg)  (cm²)
A1010            2.0           25            5x10^-6        R-->100C
A1020            2.0           25            5x10^-6        R-->100C
A1020A           1.2           25            3x10^-6        R
A1280 C          1.2           23            3x10^-6        R-->100C  No
A1280 S          1.2           5             8x10^-6        R-->100C  No
A1020B           1.0           28            2x10^-6        R
A1280A C         1.0           28            2x10^-6        R         No
A1280A S         1.0           5             8x10^-6        R         No
A1280A I/O In    1.0
A1280A I/O Out   1.0           28
A1280A 3.6V      1.0                                        R
A1020B           0.9
A1280XL C        0.8                                                  No
A1280XL S        0.8                                                  Yes
A1280XL I/O      0.8                                                  No
RH1280 C         0.8           22            8x10^-6        R-->125C  No
RH1280 S         0.8           4             9x10^-6        R-->125C  Yes
RH1280 I/O       0.8
A1460A C         0.8           ~30           ~2x10^-7       R
A1460A S         0.8           ~8            1x10^-6        R         Yes
A1460A I/O       0.8           ~8            2x10^-6                  No @ 5V
A1460A C 3.3V    0.8           ~25           8x10^-7        R
A1460A S 3.3V    0.8           ~6            2x10^-6        R
A1460A I/O 3.3V  0.8           ~8            7x10^-6
A14100A C        0.8           ~28           ~1x10^-6                 No
A14100A S        0.8           ~8                                     Yes
A14100A I/O      0.8                                                  Yes
A1280XL C        0.6                                                  No
A1280XL S        0.6                                                  Yes
A1280XL I/O      0.6                                                  No
Atmel AT6002     0.8           7-8
Xilinx XC3090    0.65          4-7
ATT2C04 config   0.5           < 7.88
ATT2C04 data     0.5           > 10
UT22VP10         1.2
CLAy-31 config                 ~5            ~7x10^-8       R         Yes
CLAy-31 data                   ~5            ~8x10^-7       R         Yes

Device           Feature Size  SEL             SEDR    Clock Upset
                 (µm)          (MeV-cm²/mg)
A1010            2.0           NO
A1020            2.0           NO              YES     Observed
A1020A           1.2           NO              YES
A1020B MEC       1.0           < 38            YES
A1020B TI        1.0           < 25            YES
A1280            1.2           NO              YES
A1280A           1.0           YES†            YES
A1280A 3.6VDC    1.0           NO              NO
RH1020           1.0           NO              YES
A1280XL          0.8           NO              YES
RH1280           0.8           NO              YES
A1460A           0.8           NO              YES
A14100A          0.8           NO              YES
A1280XL          0.6           NO              YES
Atmel AT6002     0.8           ~11
Xilinx XC3090    0.65          4-7
AT&T ATT2C04     0.5           < 7.88
Quick Logic††    0.6           < 60
UT22VP10         1.2
CLAy-31                        > 60†††


Notes:

  1. A1460A results same for routed global clocks and HCLK.
  2. Cross-sections are in cm²/flip-flop or cm²/bit of configuration RAM.
  3. † Latchup detected only with MODE pin high (not flown in this configuration).
  4. Single cell, C-Module latches have been tested but the data has not yet been analyzed.
  5. Blank cells denote either 'not measured' or 'not yet observed.'
  6. †† During latchup, the programmed power supply limit of 800 mA was consistently reached. Repeated latchups resulted in the eventual functional failure of the device.
  7. Note that not all test organizations test identically (e.g., VCC = 4.5V or 5.0V) and the definitions of thresholds also vary; the tables are meant as a guide only, with more detail available upon request.
  8. ††† Analysis showed that the high currents were probably not SEL; SER (causing internal device shorts) and snapback are currently theorized to have triggered latchup detection.

The obvious conclusions are:

  1. Flip-flops made from two C-modules are relatively hard.
  2. Flip-flops made from a single S-Module are relatively soft.
  3. TMR techniques are required to make flip-flops very hard ( < 10^-10 errors/bit-day).
  4. A1020B devices are the only Actel devices known to latch up.
  5. All the dielectric antifuse devices tested have shown susceptibility to SEDR. The RH1280 is more resistant with its redesigned antifuse. Running at 3.6 VDC lowers the bias across the antifuse and no SEDR was observed for A1280A devices under these conditions.
  6. RH1280 devices offer no significant improvement in SEU performance for C- and S-Modules.
  7. 3.3 volt operation eliminates SEDR from Actels and increases SEU rate.
  8. With respect to SEUs, Act 3 I/O registers behave similarly to S-Modules while the Act 2 I/O latches behave similarly to C-Modules.
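The TMR technique named in conclusion 3 rests on a majority vote across three copies of each flip-flop. The Python sketch below shows only the voting function, applied bitwise to three copies of a register; it is a generic illustration rather than a model of any Actel macro, and the function name and example values are assumptions.

```python
def majority(a, b, c):
    """Bitwise two-out-of-three vote across three copies of a register."""
    return (a & b) | (b & c) | (a & c)


if __name__ == "__main__":
    good = 0b1010
    upset = good ^ 0b1000  # one copy with a single flipped bit
    print(f"{majority(good, good, upset):04b}")   # single upset is outvoted
    print(f"{majority(good, upset, upset):04b}")  # same bit upset in two copies wins
```

A single upset copy is outvoted, but the same bit upset in two copies defeats the vote, which is why the protection depends on errors being corrected before a second one accumulates.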

Shielding and FPGAs

With the drive to use more advanced, commercial off-the-shelf (COTS) technology, coupled with the increased use of composites, total ionizing dose is becoming more of a concern and a frequent topic of discussion. The implementation of radiation shields is generally quite straightforward but requires mission-specific analysis. For instance, one cannot have a '50 krad (Si)' radiation shield. Parameters that affect the performance of the shield include the shield material, shield thickness, shield geometry relative to the die size and location, amount of material surrounding the device (i.e., box walls), and the type of particle and its energy. For instance, electrons are generally easier to stop than protons. Also, there are practical limits to the amount of shielding that can be added since the shields can get quite massive and thick. Beyond a certain point, the marginal benefit of additional shielding is not worth the increase in mass.

NASA, through the SBIR and DDF programs, is studying and testing different shielding techniques and their integration into space flight hardware. Below are curves recently obtained using 46 MeV protons. Note that for protons of this energy, two shields actually increased damage to the integrated circuit, with the protons exiting the shielding material at a decreased energy; the third shield, which was dense and thick, stopped all of the protons and fully protected the part. Additional tests, evaluations, and analyses of various techniques will continue.

ACKNOWLEDGEMENTS and REFERENCES


Last Revised: November 27, 2003