## SEFI Mitigation Technique for COTS Microprocessors: Demonstration Using Proton Irradiation Experiments

Manish P. Pagey, David Czajkowski, Praveen Samudrala, and David Strobel

Space Micro Inc., 9765 Clairemont Mesa Blvd., Suite A, San Diego, CA 92124

The use of leading-edge commercial microprocessors in space applications is often precluded by their susceptibility to single-event upsets (SEU) and single-event functional interrupts (SEFI). The SEU and SEFI threshold LETs for commercial microprocessors can range from 0.2 to 9 MeV-cm²/mg[1]. In orbital environments, these thresholds translate to upset rates ranging from a few upsets per day (unacceptable) to a single upset per year (acceptable). SEFI is increasingly becoming the dominant mechanism for complex system-on-a-chip (SOC) devices. SEFIs are observed in complex integrated circuits and microprocessors as unexpected "hangs" during normal operation of the component. Such interruptions are believed to be caused by single-event upsets in critical regions (such as a state machine) that force the circuit into an invalid state.

Swift *et al.*[2] have advised space designers that SEFI is an emerging radiation hardness assurance issue, with the solution cited as "removal of power supply and subsequent re-initialization." This is currently the most common method of recovering from SEFI events in microprocessors. While very safe, this procedure can be time-consuming and can have severe consequences for system design and operation.

Figure 1: The H-Core chip detects SEFIs in commercial microprocessors and activates a set of signals to revive the system without requiring power cycling and without loss of data.

The mitigation technique presented in this paper uses an external circuit, called the "SEFI Hardened Core" (H-Core), to monitor and manage a COTS microprocessor (CPU) during SEFIs.
The H-Core is responsible for detecting SEFI events and, when one occurs, asserting a sequence of signals until complete recovery of the microprocessor is confirmed. The H-Core also provides capabilities for application programs to restore their state after recovery from a SEFI event.

This technique was tested by irradiating a commercial microprocessor with a proton beam. An 850 MHz Intel Pentium III (PIII) microprocessor was selected as the test device for our experiments. The PIII series processors have a well-known sensitivity to protons and, in particular, are known to be susceptible to SEFI[3]. The PIII processor was used with a VSBC-8d single-board computer by Versalogic Inc. The proton irradiation experiments were performed at the Crocker Nuclear Laboratory (CNL) of the University of California, Davis. The Radiation Effects Facility at CNL is based on a 76" isochronous cyclotron. In our experiments, the PIII chip was irradiated with a focused beam of 51 MeV protons at room temperature. No other components of the VSBC-8d computer were exposed to irradiation.

After reviewing the datasheet of the PIII processor[4], the H-Core technique was designed to assert the following signals (in increasing level of severity): BINIT#, INIT#, LINT0, IRQ5, LINT1/NMI, and RESET#. The VSBC-8d computer also contains a hardware watchdog timer that can be used to reset the computer in case of a hang. This watchdog was used in a subset of experiments to revive the processor after SEFI events. A software watchdog supported by the GNU/Linux operating system was also used in a subset of experiments to perform a software reboot after SEFI events. Similarly, in a subset of experiments, the local APIC timer was used as a watchdog to generate the LINT1/NMI signal if the processor stopped responding.

Figure 2: Success rates of various H-Core signals used with the PIII test processor after SEFI events.
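
The escalation strategy described above, asserting signals in increasing order of severity until recovery is confirmed, can be illustrated with a minimal Python sketch. This is not the H-Core implementation, which is an external hardware circuit. The `recover_from_sefi` function, the `SimulatedCpu` class, and the liveness check are hypothetical names introduced only for illustration, and the signal names are used purely as labels.

```python
from typing import Callable, Iterable, Optional, Sequence

# Signals named in the text, ordered from least to most severe.
ESCALATION_ORDER: Sequence[str] = (
    "BINIT#", "INIT#", "LINT0", "IRQ5", "LINT1/NMI", "RESET#",
)


def recover_from_sefi(
    assert_signal: Callable[[str], None],
    is_alive: Callable[[], bool],
    order: Sequence[str] = ESCALATION_ORDER,
) -> Optional[str]:
    """Assert signals in order until the CPU responds.

    Returns the name of the signal that restored operation, or None if
    every signal in the sequence was tried without success.
    """
    for signal in order:
        assert_signal(signal)
        if is_alive():
            return signal
    return None


class SimulatedCpu:
    """Hypothetical stand-in for a hung CPU (assumption, for illustration).

    The simulated CPU recovers only when it receives one of the signals
    listed in `responds_to`.
    """

    def __init__(self, responds_to: Iterable[str]) -> None:
        self.responds_to = set(responds_to)
        self.alive = False
        self.history = []  # every signal asserted so far, in order

    def assert_signal(self, signal: str) -> None:
        self.history.append(signal)
        if signal in self.responds_to:
            self.alive = True

    def is_alive(self) -> bool:
        return self.alive


# Example: a hang that the two mildest signals cannot clear.
cpu = SimulatedCpu(responds_to={"LINT0", "RESET#"})
used = recover_from_sefi(cpu.assert_signal, cpu.is_alive)
print(used, cpu.history)  # LINT0 ['BINIT#', 'INIT#', 'LINT0']
```

Stopping at the first signal that works is what lets the least severe recovery path be used; more disruptive signals such as RESET# are reached only when the milder ones fail.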
At least one of the H-Core signals restored normal operation of the processor in every case in which a SEFI was observed. The success rate of each H-Core signal was also recorded during the experiments and is summarized in Figure 2.

The results of this study show that the SEFI mitigation technique is capable of restoring the normal operation of an affected microprocessor without requiring the removal of the power supply. Using a sequence of signals allows the microprocessor to be revived with the least severe technique possible, so that normal operation can be restored faster. After entering a valid state, the microprocessor can resume normal operation without cycling power to the enclosing system.

## **Bibliography**

- 1: F. Irom, F. Farmanesh, A. Johnston, G. Swift, and D. Millward, Single-Event Upset in Commercial Silicon-On-Insulator PowerPC Microprocessors, 2002
- 2: G. Swift, F. Farmanesh, S. Guertin, F. Irom, and D. Millward, Single-Event Upset in the PowerPC750 Microprocessor, 2001
- 3: J. Howard Jr., M. Carts, R. Stattel, C. Rogers, T. Irwin, C. Dunsmore, J. Sciarini, and K. LaBel, Total Dose and Single Event Effects Testing of the Intel Pentium III (P3) and AMD K7 Microprocessors, 2001
- 4: Intel Corporation, Pentium III Processor for the PGA370 Socket at 500MHz to 1.13GHz: Datasheet Revision B, 2001

## First Author Information:

- **Name:** Manish P. Pagey
- **Affiliation:** Space Micro Inc.
- **Address:** 9765 Clairemont Mesa Blvd., Suite A, San Diego, CA 92124
- **Telephone:** (858) 309 6610 x3037
- **FAX:** (858) 309 6619
- **Email:** pagey@spacemicro.com