"A Novel High-level Dynamic Hardware-Software Remapping Technique for Mission Critical Reconfigurable Computers"

Luis E. Cordova and Duncan A. Buell
University of South Carolina

Abstract

A hybrid reconfigurable supercomputer platform combines microprocessors, reconfigurable chips, and on-board memory. Initially, the microprocessor side carries out the tasks of the main application software. To improve an application's performance, time-consuming software subroutines can be run on reconfigurable hardware with explicit parallelism. The variety of resources and the heterogeneous character of the architecture increase the complexity of providing system-level fault tolerance to a reconfigurable computer. We introduce a novel high-level dynamic hardware-software remapping technique aimed at fault tolerance for applications deployed on reconfigurable computers used for critical missions. A complete high-level description of a system enables us to remap that description dynamically across microprocessor and reconfigurable resources. Our remapping technique allows the system to adapt its implementation dynamically and gradually, from purely software to purely reconfigurable hardware, depending on the environment, while maintaining reliability. The high-level specification also allows fast prototyping, debugging, cycle-accurate simulation of mission-critical applications, fault injection, reliability profiling, and customization for different radiation environments, regardless of the availability of special hardware. We show seamless integration of our technique with error correction codes, multi-data-path voting, heartbeat signaling, and injection of soft-software and soft-hardware single event upsets through soft-software and soft-hardware saboteur high-level devices affecting on-chip memory and on-board memory. We also devise mechanisms for data-path fault diagnosis and self-repair actions for reconfigurable-reconfigurable and reconfigurable-microprocessor resources.
Our technique benefits applications ranging from space missions with adaptable high-performance computing demands to large high-performance supercomputers with terrestrial radiation-hardening demands.
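
To make the interplay of fault injection, multi-data-path voting, and self-repair concrete, the following is a minimal software-only sketch, not the authors' implementation. It assumes three redundant data paths that each compute a 32-bit checksum over their own copy of memory. All names (`saboteur`, `checksum_path`, `vote`, `run_voted`, `repair`) are hypothetical and chosen for illustration only.

```python
import random

WORD_MASK = 0xFFFFFFFF  # assumed 32-bit memory words


def saboteur(memory, rng, width=32):
    """Inject a single event upset: flip one random bit of one random word."""
    addr = rng.randrange(len(memory))
    bit = rng.randrange(width)
    memory[addr] ^= 1 << bit
    return addr, bit


def checksum_path(memory):
    """One data path: a 32-bit wrapping sum over its private memory copy."""
    total = 0
    for word in memory:
        total = (total + word) & WORD_MASK
    return total


def vote(a, b, c):
    """Bitwise two-out-of-three majority of three path results."""
    return (a & b) | (a & c) | (b & c)


def run_voted(memories):
    """Run all three paths, vote, and report which paths disagree."""
    results = [checksum_path(m) for m in memories]
    voted = vote(*results)
    faulty = [i for i, r in enumerate(results) if r != voted]
    return voted, faulty


def repair(memories, faulty):
    """Self-repair: overwrite each faulty copy from a path that agreed."""
    healthy = next(i for i in range(len(memories)) if i not in faulty)
    for i in faulty:
        memories[i] = list(memories[healthy])


if __name__ == "__main__":
    rng = random.Random(0)
    data = [1, 2, 3, 4]
    memories = [list(data) for _ in range(3)]
    saboteur(memories[1], rng)           # upset exactly one path
    voted, faulty = run_voted(memories)  # voting masks the upset
    repair(memories, faulty)
    print(voted, faulty)
```

In this sketch a single upset in one copy is masked by the vote, diagnosed by comparing each path against the voted value, and repaired by copying from an agreeing path. Two simultaneous upsets in different paths would defeat a two-out-of-three vote, which is one reason to combine voting with error correction codes.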

2005 MAPLD International Conference Home Page