A Low Complexity Method for Detecting Configuration Upset in SRAM Based FPGAs

Raymond J. Andraka and Jennifer L. Brady
Andraka Consulting Group, Inc.

Abstract

Field Programmable Gate Arrays (FPGAs) have brought reasonably priced, high performance processing to applications that previously required expensive custom logic, opening the door to a variety of experiments that were once too costly to attempt. Their adoption in environments where upset is likely has been slower because of the high logic cost of mitigating upset. For the reconfigurable (SRAM based) FPGAs, the concern over upset extends to the configuration program itself. The devices best suited to signal processing, and those with the highest logic density, tend to be the SRAM based ones. The complexity of monitoring configuration integrity has therefore hindered the use of these devices in environments where upset is likely.

The traditional approach to monitoring the configuration is to read it back and check it against the intended configuration. This is time consuming, requires a separate controller, and is not compatible with the logic memory elements in the Xilinx devices (CLBs used as shift registers and RAM). Our design, a high data rate 4K point FFT, uses those memory elements extensively in order to fit within the device size and power limitations. Rather than using the traditional readback approach, we verify the logic by periodically sending test vectors through it and checking the results. We have taken the novel approach of using test vectors to detect changes from a baseline rather than to check for logic correctness. This leads to a surprisingly simple implementation: a linear feedback shift register generates a known pattern, and a cyclic redundancy check signature verifies that the output matches the baseline output.

Proposed paper outline (first cut):

Introduction
    Traditional SEU detection methods
        Triple modular redundancy (TMR)
        Configuration scrubbing/readback
    Drawbacks of traditional methods
    Limitations imposed by Virtex devices
        Readback of CLB memory elements corrupts configuration
        Memory elements are key components in DSP designs

Our Design
    Block floating point 4K point FFT at 100 MS/sec
    High data rate, low power, very limited real estate and memory
    Can't fit TMR
    Extensive use of SRL16s, can't rely on readback
    Short gaps in processing during load/unload operation

Implementation
    LFSR to generate and inject a known pseudo random pattern
    CRC computed on output to check for any changes to baseline
    Wavefront approach for checking delay buffer
    Assumes uncorrupted logic is correct
    Traditional methods (configuration readback) to check surrounding logic
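
The first two implementation items above can be illustrated in software. The following is a minimal sketch under stated assumptions, not the authors' hardware design: the 16-bit LFSR taps, the CRC polynomial (0x1021), the seed, and the toy `healthy_datapath` / `upset_datapath` functions are all hypothetical stand-ins chosen for illustration. The only point it makes is that the check compares a signature against a recorded baseline and never needs to know what the "correct" output is.

```python
def lfsr16(seed, count):
    """Yield `count` words from a 16-bit Fibonacci LFSR (x^16 + x^14 + x^13 + x^11 + 1)."""
    state = seed & 0xFFFF
    if state == 0:
        raise ValueError("LFSR seed must be non-zero")
    for _ in range(count):
        yield state
        bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (bit << 15)


def crc16_update(crc, word):
    """Fold one 16-bit word into a 16-bit CRC signature (polynomial 0x1021, assumed)."""
    crc ^= word & 0xFFFF
    for _ in range(16):
        if crc & 0x8000:
            crc = ((crc << 1) ^ 0x1021) & 0xFFFF
        else:
            crc = (crc << 1) & 0xFFFF
    return crc


def signature(datapath, seed=0xACE1, count=4096):
    """Drive the datapath with the LFSR pattern and return the CRC of its output."""
    crc = 0xFFFF
    for out in datapath(lfsr16(seed, count)):
        crc = crc16_update(crc, out)
    return crc


def healthy_datapath(words):
    """Hypothetical stand-in for the logic under test (not the FFT itself)."""
    for w in words:
        yield (3 * w + 1) & 0xFFFF


def upset_datapath(words, bad_input=0xACE1):
    """Same logic with one corrupted lookup entry, modelling a flipped configuration bit."""
    for w in words:
        out = (3 * w + 1) & 0xFFFF
        yield (out ^ 0x0010) if w == bad_input else out


# Record the baseline once from a known-good configuration, then compare later runs.
baseline = signature(healthy_datapath)
fault_detected = signature(upset_datapath) != baseline
```

In this toy model the corrupted entry is exercised because the test pattern contains the affected input. A change that the pattern never exercises would leave the signature unchanged.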

Application
    Non-critical data (sensor arrays)
    Detected fault: verify (two consecutive faults), throw away data since last non-faulted set, reconfigure device to restore configuration
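
The fault-handling policy in the Application items above can be sketched as a small state machine. This is an illustrative sketch only: the class name `FaultMonitor`, the action strings, and the choice to release held data when a single mismatch is not confirmed by the next test are assumptions, not details stated in the outline.

```python
class FaultMonitor:
    """Hold data until a test passes; confirm a fault on consecutive mismatches."""

    def __init__(self, baseline, confirm_count=2):
        self.baseline = baseline            # signature recorded from a good configuration
        self.confirm_count = confirm_count  # mismatches in a row needed to confirm a fault
        self.consecutive = 0
        self.pending = []                   # data sets since the last passing test
        self.accepted = []                  # data sets released downstream

    def submit(self, data_set, signature):
        """Record one data set and the test signature taken after it."""
        self.pending.append(data_set)
        if signature == self.baseline:
            # Assumption: a passing test releases everything held so far.
            self.accepted.extend(self.pending)
            self.pending.clear()
            self.consecutive = 0
            return "ok"
        self.consecutive += 1
        if self.consecutive >= self.confirm_count:
            # Confirmed fault: discard data since the last good test, then reconfigure.
            self.pending.clear()
            self.consecutive = 0
            return "reconfigure"
        return "suspect"


monitor = FaultMonitor(baseline=0x1234)
actions = [
    monitor.submit("set0", 0x1234),  # test passes
    monitor.submit("set1", 0xBEEF),  # first mismatch
    monitor.submit("set2", 0xBEEF),  # second mismatch confirms the fault
]
```

Here `actions` ends with a reconfigure request and only `set0` is released; `set1` and `set2` are discarded because no passing test followed them.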

Advantages
    Low complexity
    Very small footprint and power
    Permits use of Virtex SRL16 and CLB RAM components in susceptible environments

Limitations
    Will not detect transient events
        Suggest a reasonableness test on data (variance?)
    Requires slack time within processing to allow test set
    Does not check the check logic adequately
    Fault isolation to module only
    Useful for data path, awkward for control paths

Testing

Conclusion
    Used in combination with readback scrubbing to verify configuration integrity
    Avoids problems of readback of memory logic
    Low complexity permits use in tight situations

Acknowledgements