"Using Software Rules to Enhance FPGA Reliability"

Chandru Mirchandani
Lockheed Martin Transportation and Security Solutions

Abstract

System designers frequently ask, “How reliable is my design?” But what does this really mean? To a program technical lead it might mean “will the system perform at the required quality level for performance, speed and data integrity over the lifetime of the mission (and then some)?” This is a very tall order for designers to meet when there are constraints on time and money, and on how much of that money can be spent on experimentation, prototyping and implementation. Software engineers have long struggled to predict the quality of their software, and have typically done so based on historical information. However, this does not always pan out, because innovative designs tend not to reuse complex code but instead develop new complex code. Software programmers have therefore resorted to devising novel methods of improving the quality, and hence the ‘reliability’, of their systems by developing metrics that can be tracked and dynamically adapted to incorporate improvements in quality. This paper examines a process by which these software rules can be extended towards reliability and fault tolerance in Field Programmable Gate Arrays (FPGAs).

Summary

Software systems are very difficult to test. Software system testers, together with software system engineers, have therefore devised many processes to ensure that critical threads in the software are tested before the system is used operationally. Still, because some software is so complex, its reliability does not mature until the initial faults (due to integration, new code releases, unexpected scenarios, etc.) have been uncovered and removed from the system. One reason for this is that complex software is generally created for new and innovative systems and uses new lines of code rather than reused ones. Thus the predicted reliability of the ‘new’ software, though based on expert opinion and a historical knowledge base, cannot be validated completely enough to ensure that the software will meet the design reliability requirements. Using an adaptive model, software systems engineers are able to re-engineer the software based on new test data whilst the system is being tested. This increases confidence in the newly developed system and, at the same time, allows the initial infant mortality region of the Rayleigh curve to be bridged faster.
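The adaptive idea described above can be illustrated with a minimal sketch. It assumes a Rayleigh fault-discovery model, in which the cumulative number of faults found by test time t is K(1 - exp(-t^2 / (2 sigma^2))), and it assumes the total fault count K is supplied by the historical prediction. Neither the model form, the function names, nor the least-squares re-fit comes from the paper; they are hypothetical choices made only to show how new test data could update the model during testing.

```python
import math


def rayleigh_cumulative(t, total, sigma):
    """Expected cumulative faults found by time t (assumed Rayleigh model)."""
    return total * (1.0 - math.exp(-t * t / (2.0 * sigma * sigma)))


def refit_sigma(times, cumulative_faults, total):
    """Re-estimate the peak-discovery time sigma from observed test data.

    Uses the linearisation -ln(1 - F/K) = t^2 / (2 sigma^2) and a
    least-squares line through the origin. Every observed count must be
    strictly below `total` (an assumption of this sketch).
    """
    xs = [t * t for t in times]
    ys = [-math.log(1.0 - c / total) for c in cumulative_faults]
    slope = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    return math.sqrt(1.0 / (2.0 * slope))


def remaining_faults(t, total, sigma):
    """Faults the model still expects to be latent at time t."""
    return total - rayleigh_cumulative(t, total, sigma)


if __name__ == "__main__":
    # Hypothetical data: predicted total of 100 faults, weekly test counts.
    weeks = [1, 2, 3, 4, 5, 6]
    found = [3, 12, 25, 39, 54, 67]
    sigma = refit_sigma(weeks, found, total=100)
    print(f"re-fitted sigma: {sigma:.2f} weeks")
    print(f"expected latent faults at week 8: {remaining_faults(8, 100, sigma):.1f}")
```

Re-running `refit_sigma` each time a new batch of test results arrives is one simple way to keep the prediction in step with the system under test; a smaller re-fitted sigma indicates that faults are being uncovered sooner than the historical estimate suggested.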

This paper proposes a method by which an adaptive model can be developed for FPGAs so that the reliability growth of the device is more apparent in the early stages of system test. This will mitigate risk and improve the reliability of the delivered FPGA-based system.

 

2005 MAPLD International Conference Home Page