Volume 1

WIRE Mishap Investigation Board Report

June 8, 1999

 


 

Table of Contents

 

WIRE MISHAP INVESTIGATION REPORT

Volume I

Signature Page (Board Members)

List of Members, Advisors, Observers, and others

Executive Summary

Acknowledgement

Introduction
  • WIRE Program Description
  • WIRE Mishap
  • Method of Investigation
  • Identification of Possible Cause
  • Operational Scenario Timeline Overview
  • Attitude Control/Dynamics Analysis
  • Pyro Electronics Analysis: Failure Mechanism Determination
  • Summary of the Mishap
  • Root Cause of WIRE Mishap
  • Significant Contributing Causes
  • Contributing Causes
Lessons Learned and Recommendations

 


 

Volume II

Appendices

 



A.  Implementation of Wide-Field Infrared Explorer (WIRE) Contingency Plan, Dated February 1999
B.  Failure of the Wide-Field Infrared Explorer
C.  NASA MISHAP Report, NASA Form 1627
D.  WIRE Mission Launch and Early Orbit Chronology of Launch Events, May 10, 1999
E.  WIRE Anomaly Investigation Report, JPL Team, May 24, 1999
F.  Small Explorer WIRE Failure Investigation Report, Richard B. Katz, Goddard Space Flight Center, May 27, 1999
G.  GSFC Developed "Fishbone" Mishap Cause Diagram
H.  Pegasus Attitude Disturbance Impact to the Wide-Field Infrared Explorer (WIRE)
I.  Analysis of US Space Command Debris Data
J.  Timing of the WIRE Vent Opening and Associated Data, May 6, 1999
K.  Flight Data Correlation, Kimberly Brown, Dave Everett, Goddard Space Flight Center
L.  WIRE Pyro Electronics Box Description, Space Dynamics Laboratory, Utah State University
M.  WIRE Anomaly Investigation Initial Review Board Briefing, March 23, 1999
N.  Pegasus Flight 26 Quick Look Report (WIRE), Orbital Sciences, March 11, 1999
O.  NASA Parts Advisory NA-046, All Actel Families

 


Signature Page

 

____________________________
Darrell R. Branscome
Chairman
NASA Headquarters 
Deputy Associate Administrator,
Office of Space Flight
____________________________
Michael E. Card
Executive Secretary
NASA Headquarters
Program Manager, Office of Safety
and Mission Assurance
____________________________
Dr. Richard H. Freeman
Chief Engineer,
Goddard Space Flight Center
____________________________
James Adams
Chief, Rapid Spacecraft
Development Office
Goddard Space Flight Center
____________________________
Wentworth O. Denoon
Deputy Director, Office of Flight
Assurance
Goddard Space Flight Center
____________________________
John A. Hrastar
Deputy Director, Space Science
Goddard Space Flight Center
____________________________
Robert T. Sherrill
Chief Engineer
Langley Research Center
____________________________
Robert T. Bechtel
Lead, Avionics System Group
Marshall Space Flight Center
____________________________
Approved
Dr. Edward J. Weiler
Associate Administrator
for Space Science
____________________________
Concur
Frederick D. Gregory
Associate Administrator
for Office of Safety and
Mission Assurance

Advisors

Office of General Counsel: GSFC/Greg LaRosa

Office of Public Affairs: GSFC/Mark Hess


OBSERVERS

 

__________________________________
JPL/Perry Bankston
Manager
Device Research & Application Section
__________________________________
JPL/Jim Clawson
Manager
Reliability Engineering Office
__________________________________
JPL/Gary Coyle
Mechanical Systems/Structural Engineer
_______________________________
JPL/Neil Yarnell
Manager
Flight Systems Section
__________________________________
JPL/Matthew R. Landano
Deputy Manager
Planetary Flight Projects Implementation Office
__________________________________
JPL/John Klein
Manager
Europa Orbiter Integration & Testing
__________________________________
Wayne R. Frazier, Headquarters
(ex-officio)
Manager, Office of Safety and Mission Assurance
__________________________________
William T. Huddleston, Headquarters
(ex-officio)
WIRE Program Executive

 


Supporting Technical Team Members

Goddard Space Flight Center:
Tom Correll
David Everett, Lead
Bryan Fafaul, Lead
Jack Galleher, (contractor)
Taylor Hale, (contractor)
Richard B. Katz
Jet Propulsion Laboratory:
A. Acord
J. Clawson
J. Kievit
M. Landano, Lead
T. Luchik
G. Macala
T. Nguyen
R. Ross
D. Swenson
M. Underwood
Flight Operations and Data Support:
Nancy Goodman
Teresa LaFourcade
Cathy Mansperger
Jackie Mims
Mike Prior
Maxine Russell
Space Dynamics Laboratory:
Wally Gibbons
John Kemp
Frank Redd, Lead
Scott Schick
Support:
Mike Blau
Kim Brown
Pat Crouse
Quang Nguyen
Giulio Rosanova
Tom Spitzer
Mark Voyton

EXECUTIVE SUMMARY

The Wide-Field Infrared Explorer (WIRE) mission objective was to conduct a deep infrared, extragalactic science survey. WIRE was launched on March 4, 1999, and was observed to be tumbling at a rate higher than expected during its initial pass over the Poker Flat, Alaska, ground station. After significant recovery efforts, WIRE was declared a loss on March 8, 1999.

The WIRE Mishap Review Board has determined that the telescope instrument cover was ejected earlier than planned and at approximately the time the WIRE pyro electronics box was first powered on. The instrument's solid hydrogen cryogen supply started to sublimate faster than planned, causing the spacecraft to spin up to a rate of sixty revolutions per minute over the twelve hours following the opening of the secondary cryogen vent. Without any solid hydrogen remaining, the instrument could not perform its observations.

The root cause of the WIRE mission loss is a digital logic design error in the instrument pyro electronics box. The transient performance of components was not adequately considered in the box design. The failure was caused by two distinct mechanisms that, either singly or in concert, result in inadvertent pyrotechnic device firing during the initial pyro electronics box power-up. The control logic design utilized a synchronous reset to force the logic into a safe state. However, the start-up time of the Vectron crystal clock oscillator was not taken into consideration, leaving the circuit in a non-deterministic state for a time sufficient for pyrotechnic actuation. Likewise, the startup characteristics of the Actel A1020 FPGA were not considered. These devices are not guaranteed to follow their "truth table" until an internal charge pump "starts" the part. These uncontrolled outputs were not blocked from the pyrotechnic devices' driver circuitry. There has been no evidence or indication of any component failure although component failures were considered in the investigation.

A significant contributing cause of the anomaly was the failure to identify, understand, and correct the electronic design of the pyro electronics box. Design errors in the circuitry, which controlled pyro functions, were not identified. The pyro electronics box design was not peer reviewed, and other system reviews conducted by the instrument design organization did not focus on the electronics box. At the time the Systems Design Review was conducted for WIRE the design of the pyro electronics box was not completed. It is the assessment of the WIRE Mishap Investigation Board that a peer review held during the design process, by people with knowledge of and expertise regarding pyro circuit design would have identified the turn-on characteristics that led to failure.

A large number of failure scenarios were evaluated during the investigation to determine the cause of the cover ejection. These included pre-launch, launch, powered flight, separation, software, operations, design, and component reliability faults. Based on a comprehensive, systematic review of data, it was determined that the cover was most likely ejected at the time the WIRE pyro electronics box was turned on, due to a transient condition that exists in the pyro electronics during startup. This transient condition is the direct result of the non-deterministic initialization of a Field-Programmable Gate Array (FPGA) that controls both the arming and firing circuits in the pyro electronics.

Although some design attention was given to the startup behavior of the FPGA, the design contained unidentified idiosyncrasies that triggered the cover ejection. The system design did not contain sufficient start-up lockout protection or independent provisions to prevent the FPGA startup operation from propagating to the firing circuits.

The anomalous characteristics of the pyro electronics unit were not detected during subsystem or system functional testing due to the limited fidelity and detection capabilities of the electrical ground support equipment. Post-flight circuit analyses conducted as part of the failure investigation have predicted the existence of the anomaly and it has been reproduced confidently using engineering model hardware.

 


ACKNOWLEDGEMENTS

 

The WIRE Mishap Investigation Board wishes to thank the technical teams from the Goddard Space Flight Center, the Jet Propulsion Laboratory, and Space Dynamics Laboratory for their cooperation, support, and technical analyses which were crucial to the resolution of the WIRE mishap.


Introduction

WIRE Program Description

WIRE was a Small Explorer Mission designed to conduct a deep infrared, extragalactic science survey 500 times more sensitive than the Infrared Astronomy Satellite (IRAS) Faint Source Catalog. The Jet Propulsion Laboratory (JPL) and its teaming partner, the Space Dynamics Laboratory (SDL) of Utah State University, provided the WIRE instrument. The instrument consists of a cryogenically cooled, 30-centimeter telescope and all associated electronics designed to detect faint astronomical sources in two infrared wavelength bands. Goddard Space Flight Center (GSFC) provided the three-axis-stabilized spacecraft bus, system integration, and operations.

The WIRE spacecraft was launched March 4, 1999, at approximately 6:57 PM PST from the Western Range/VAFB, California, into a planned 540 kilometer orbit using an Orbital Sciences Pegasus XL launch vehicle. Planned mission duration was four months.

WIRE Mishap

The WIRE launch was nominal with the first ground station contact at McMurdo, Antarctica occurring without incident. All planned activities for the pass were accomplished with all systems appearing nominal.

The spacecraft was tracked using the facilities at McMurdo, Antarctica; Poker Flat, Alaska; and NORAD. The first tracking pass started over McMurdo about 20 minutes after the Pegasus XL separation from the L-1011 and lasted about 10 minutes. During this McMurdo pass, ground commands were transmitted as soon as practical to perform the planned venting of the secondary hydrogen tank rather than wait for the spacecraft-stored on-board sequence. The NORAD tracking began about 40 minutes after the end of the McMurdo pass and reported tracking three separate objects in orbit, one about the size and mass of the cover.

Spacecraft tumbling was observed at the initial McMurdo, Antarctica ground pass, but was consistent with expected Pegasus separation "tip off" predictions. During the next pass at Poker Flat, Alaska, the spacecraft tumbling rates, which should have been damping down due to the Attitude Control System (ACS) Bdot controller, were not reduced, but were increasing. Analyses were initiated by the WIRE operations team to understand the observed anomaly and to verify the integrity of the Bdot controller. The Three Axis Magnetometer (TAM) and the Torque Rods phasing were analyzed. After continued analyses it was determined the TAM was functioning nominally. Within 36 hours of launch, the instrument's 4-month supply of cryogen was completely exhausted. The WIRE scientific mission was declared lost on March 8, 1999. Spacecraft recovery efforts continued and were successful. (Volume II, Appendix D provides a detailed chronology of launch and early orbit events through Launch plus 7 days.)

Method of Investigation

On March 5, 1999, the WIRE Program Executive declared a spacecraft emergency and the WIRE Contingency Plan, dated February 1999, was implemented. (Volume II, Appendix A) On March 18, 1999, the Associate Administrator for Space Science established the NASA WIRE Spacecraft Mission Failure Mishap Investigation Board, with Darrell R. Branscome, Office of Space Flight, Chairman. (Volume II, Appendix B) A final, written report was requested June 1, 1999.

The Mishap Investigation Board meetings were conducted at the GSFC on March 23, April 14, and April 29. Twice weekly telecons were also conducted with the Board and technical teams through April. Weekly telecons were conducted through May.

The NASA WIRE Spacecraft Mission Failure Mishap Investigation Board was supported by technical review teams from each major mission organization. JPL formed an independent review team on March 5 to support investigation of the root cause of the anomaly and to identify actions to preclude similar occurrences on future missions. This JPL Anomaly Team performed a comprehensive, systematic and objective review of the anomaly by investigating all functional areas of the design, design review, design verification, Assembly Test and Launch Operations (ATLO) and initial flight operations (Volume II, Appendix E). The Space Dynamics Laboratory (SDL), supplier of the WIRE instrument to JPL, also formed an investigation team. The GSFC WIRE mission team also initiated failure investigations. At the request of WIRE Mishap Board member Dr. Richard H. Freeman, Richard B. Katz from the GSFC Microelectronics and Signal Processing Branch conducted a failure mechanism analysis of the electronics design (Volume II, Appendix F).

At the first WIRE Mishap Investigation Board meeting on March 23, the individual teams quickly blended to form an integrated team fostering full and open communications. The combined JPL, SDL and GSFC technical team supported the NASA HQ Mishap Investigation Board, led by Chairman Darrell Branscome. The JPL team was led by Matt Landano, the SDL team was led by Frank Redd, and the GSFC team was led by Bryan Fafaul and Dave Everett.

Identification of Possible Causes

To ensure the broadest range of possible mishap failure scenarios, JPL and GSFC independently developed thoughts regarding possible causes. JPL developed a list of eighteen (18) possible functional causes (see matrix Volume II, Appendix E) covering mechanical, thermal, environmental, electrical, software/flight sequence and operational functional disciplines. GSFC developed a detailed fish-bone cause and effect diagram that approached the possible cause based on implementation and development processes. (Volume II, Appendix G) The JPL list was compared to the GSFC fish-bone diagram and found to be functionally consistent.

Operational Scenario Timeline Overview

The WIRE spacecraft was launched from a Pegasus launch vehicle involving a captive carry on an L-1011 aircraft. At the appropriate altitude, the Pegasus was dropped with first stage ignition following approximately 5 seconds later. The spacecraft separated from the third stage of the launch vehicle approximately nine minutes after drop. All spacecraft systems appeared to operate within nominal ranges during captive carry, drop, boost and separation phases.

The WIRE Mishap Investigation Board reviewed minor Pegasus launch anomalies as possible contributors to the mishap. The Board determined that the Pegasus launch had no impact on the WIRE mishap. All launch loads were less than or equal to design launch loads. (Volume II, Appendix H)

Approximately ten seconds after spacecraft separation, the solar array release wax thermal actuators were energized and attitude control electronics were turned on. The solar arrays were fully deployed about 90 seconds after separation.

The solid hydrogen in the instrument cryogen tanks nominally absorbs a small amount of heat when ground cooling is terminated before lift off. Since the cryostat had a limited ground hold time, approximately 9 hours, the opening of the secondary tank vent as soon as possible on orbit to effect the cool down of the onboard hydrogen was important to maximize mission life. Because of this, the secondary tank vent pyro was to be opened at the earliest opportunity by ground commanding. If ground commanding were not possible, a backup sequence stored on the spacecraft would execute and open the vent about 40 minutes after separation.

The WIRE Operations team took advantage of a tracking pass from the McMurdo ground station starting at about 20 minutes after separation. The following uplink commands were transmitted on approximately one-second centers: Pyro Electronics-A on, Pyro Electronics-B on, Pyro Arm, and Secondary Vent Pyro Fire. Subsequent "quick-look" review of telemetry indicated that the pyro electronics box was initially off before the first command (as expected), and that the firing telemetry for the secondary vent command from the electronics was normal.

At about the time the command to fire the secondary vent pyro was sent, spacecraft attitude control rates were observed to increase. This was expected since the vent opening would release the small amount of hydrogen gas liberated by the heating of the cryostat after liquid helium cooling had been terminated just prior to launch. This rate was expected to be quickly damped by the attitude control system as the secondary tank restored equilibrium to the solid hydrogen at its new low in-space temperature. Spacecraft attitude control rates increased rapidly with the secondary vent opening, then increased at a slower rate. At about this time, the McMurdo tracking station pass ended because the spacecraft was no longer in view.

During the telemetry outage, spacecraft onboard sequences were executed to open the secondary vent (already opened by ground command), and open the primary tank vent (wax-thermal actuator). Execution of these events was nominal. The next tracking pass at the Poker Flat tracking station began about 90 minutes after separation from the Pegasus. At this time, cryostat temperatures were checked by turning on the WIRE Instrument Electronics (WIE). This action also caused the instrument to take image data from the focal plane. The spacecraft tumble rate was higher than at the end of the previous pass, although the magnetic torque controller was still operating. Cryostat temperatures were not exceptionally high at this point, but it was clear that the spacecraft was going "out of control". Later tracking passes were used to acquire data, which did show elevated temperatures in the cryostat and increasing attitude rates. Hours later, several contingency operations were executed, focused on regaining control of the spacecraft. These contingency operations included the intentional firing of the cover eject pyro after it was concluded that the science mission already had been lost.

NORAD tracking data was acquired that indicated the aperture cover was separated from the spacecraft. The combination of this information with analysis of image data serendipitously acquired while the WIE was on suggested that the cover was ejected well before commands were sent to cause its release. (Volume II, Appendix I)

The spacecraft attitude rates were eventually brought under control after all the solid hydrogen sublimated and was vented. The spacecraft was evaluated for functionality after attitude was stabilized. Other than the loss of all the solid cryogen, the spacecraft appeared to be intact without damage and was performing nominally, including the telescope sensors and electronics. Nominal spacecraft operations suggested that the cover ejection was not the result of catastrophic mechanical failure.

Attitude Control/Dynamics Analysis

The WIRE spacecraft attitude control dynamics time line history was reviewed by the Board to determine when the cover was deployed. Dave Everett of GSFC constructed the WIRE Launch Day Timeline from spacecraft telemetry. Details of this analysis are found in Volume II, Appendix J. Table 1 shows page two of this analysis as an example of timeline data.

TABLE 1

Time (YY-DDD-HH:MM:SS) | Event | Source
99-064-03:26:10 | First McMurdo pass begins |
99-064-03:27:07 | /SNOOP command sent | ground system event
99-064-03:27:08.5 | Barker time for SNOOP | packet 1
99-064-03:27:08.7 | FARM B counter increments for SNOOP | transfer frame time
99-064-03:27:20 | /SNOOP not in bypass sent | ground system event
99-064-03:27:21.3 | Barker time for /SNOOP | packet 1
99-064-03:27:22 | Command verification for /SNOOP | ground system event
99-064-03:27:42 | /PSACEPWR ON | ground system event
99-064-03:27:42 | /PSDSSPWR ON | ground system event
99-064-03:27:42 | /PSEARTHSENS ON | ground system event
99-064-03:27:43.5 | FARM B counter inc for /PSACEPWR ON | transfer frame time
99-064-03:27:44.7 | FARM B counter inc for /PSDSSPWR ON | transfer frame time
99-064-03:27:45 | /PSPYROA ON | ground system event
99-064-03:27:45.3 | FARM B counter inc for /PSEARTHSENS ON | transfer frame time
99-064-03:27:45.6 | All pyro box telemetry shows box is off | packet 10
99-064-03:27:46 | /PSPYROB ON | ground system event
99-064-03:27:46.3 | Barker time of a command (/PSPYROA) | packet 1
99-064-03:27:46.5 | FARM B counter inc for /PSPYROA ON | transfer frame time
99-064-03:27:47 | /IPYRO ARM | ground system event
99-064-03:27:47.2 | Pyro bus A "ON" and B "OFF" in telemetry | packet 11, PSPYRO
99-064-03:27:47.5 | Sharp increase in spacecraft body rates | packet 29
99-064-03:27:47.8 | FARM B counter inc for /PSPYROB ON | transfer frame time
99-064-03:27:48 | /ISECVENT DEPLOY | ground system event
99-064-03:27:48.2 | Pyro bus B shows "ON" in telemetry | packet 11, PSPYRO
99-064-03:27:49.0 | FARM B counter inc for /IPYRO ARM | transfer frame time
99-064-03:27:49.2 | Essential bus shows 100 mA rise in current due to pyro box arming relay | packet 11, PSESSCURR minus PSACECURR
99-064-03:27:49.5 | Barker time of a command (/ISECVENT) | packet 1
99-064-03:27:49.6 | FARM B counter inc for /ISECVENT DEPLOY | transfer frame time
99-064-03:27:50.2 | Essential bus shows 70 mA rise in current due to pyro box arming relay (previous sample caught current in the middle of its increase, this is the rest of the increase) | packet 11, PSESSCURR minus PSACECURR
99-064-03:27:50.6 | Telemetry indicates secondary vent fire voltage exceeded threshold (last sample 5 sec before) | packet 10, ISECPYROMON
99-064-03:27:52 | /ISECVENT RESET ground command | ground system event
99-064-03:27:53 | /IPYRO RESET ground command | ground system event
99-064-03:27:53 | /PSMASTERTHRM ENABLE | ground system event
99-064-03:27:53 | /PSTHERMACT1 ON | ground system event
99-064-03:27:53 | /PSTHERMACT2 ON | ground system event
99-064-03:27:53.9 | FARM B counter inc for /ISECVENT RESET | transfer frame time
99-064-03:27:54 | /SCRTSENABLE RTSNUM=15 | ground system event
99-064-03:27:54 | /SCRTSSTART RTSNUM=15 | ground system event
99-064-03:27:54 | /PSSCSRVHTR ON | ground system event
99-064-03:27:54 | /PSSCOPHTR ON | ground system event
99-064-03:27:54.5 | FARM B counter inc for /IPYRO RESET | transfer frame time

It can be seen at time 03:27:47.5 that a sharp rise in spacecraft body rates was recorded.

 

Attitude Control/Dynamic Conclusions

The spacecraft telemetry relevant to the attitude control system operation and the resulting spacecraft dynamics were reviewed by the Board. Volume II, Appendix E (JPL WIRE Anomaly Investigation Report), Appendix D (WIRE Mission Launch and Early Orbit Chronology of Launch Events), and Appendix J (Timing of the WIRE Vent Opening) provide additional details of this analysis. The following conclusions are consistent with the telemetry and observed dynamics, both flight and simulated:

    1. Spacecraft attitude control and dynamics appear to be nominal prior to opening the secondary hydrogen vent.
    2. Spacecraft dynamics initially appear to be nominal at the opening of the secondary hydrogen vent.
    3. Spacecraft dynamics after the initial venting at the opening of the secondary hydrogen vent are not nominal and are consistent with a continued venting of the hydrogen at a rate much lower than the initial vent rate.
    4. The continued venting of hydrogen resulted in a torque being applied to the spacecraft that was about twice as large as the counter torque that the magnetorquers could apply. The result was that the spacecraft continued to spin up even though the attitude control system was performing properly.
    5. The continued venting of the hydrogen at a rate that would overcome the magnetorquers' capability is consistent with that which would result from the heat load applied to the spacecraft cryogen system if the telescope cover came off at roughly the same time as the secondary hydrogen vent opening. However, there is no obvious dynamic signature in the data that could be directly identified as the impulsive ejection of the cover.

Table 2, WIRE First Pass Telemetry, plots the spacecraft x, y, and z-axis body rate change data as a function of time. It can be seen that the WIRE spacecraft begins to move after the pyro electronics box is turned on, but before the time of the secondary vent fire command. This data indicated to the Board that the cover could have been ejected about the time the pyro electronics box was turned on.
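The ordering described above can be checked against the entries in Table 1. The following is a minimal Python sketch and is not part of the Board's analysis. The event labels and the helper name seconds_of_day are illustrative assumptions; the timestamps are copied from Table 1. Because telemetry is sampled at discrete intervals, the differences are approximate.

```python
def seconds_of_day(stamp: str) -> float:
    """Convert a 'YY-DDD-HH:MM:SS[.f]' timestamp to seconds since midnight."""
    _year, _day_of_year, clock = stamp.split("-")
    hours, minutes, seconds = clock.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


# Timestamps copied from Table 1; the dictionary keys are illustrative labels.
EVENTS = {
    "pyro_a_on_frame": "99-064-03:27:46.5",        # FARM B inc for /PSPYROA ON
    "body_rate_increase": "99-064-03:27:47.5",     # sharp increase in body rates
    "sec_vent_deploy_frame": "99-064-03:27:49.6",  # FARM B inc for /ISECVENT DEPLOY
}


def elapsed(first: str, second: str) -> float:
    """Seconds from event `first` to event `second`, rounded to 0.1 s."""
    delta = seconds_of_day(EVENTS[second]) - seconds_of_day(EVENTS[first])
    return round(delta, 1)


after_power_on = elapsed("pyro_a_on_frame", "body_rate_increase")
before_vent_cmd = elapsed("body_rate_increase", "sec_vent_deploy_frame")
print(f"Rate increase {after_power_on} s after pyro A power-on frame")
print(f"Rate increase {before_vent_cmd} s before vent deploy frame")
```

On these entries, the rate increase is recorded about 1.0 second after the Pyro Electronics-A power-on transfer frame and about 2.1 seconds before the secondary vent deploy transfer frame.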

Telescope cover ejection dynamics were also considered as a possible source of torque on the spacecraft. The cover is nominally ejected at 1 m/sec and has a mass of about 7 kg. This means that an impulse of 7 kg-m/sec would be delivered to the spacecraft at cover ejection. If the line of force of the cover ejection misses the spacecraft center-of-gravity by moment arm R (in meters), then the resulting angular momentum imparted to the spacecraft would be 7R kg-m2/sec, or 7R Nms. Given the spacecraft transverse inertia of about 75 kg-m2, the spacecraft angular rate that would result from cover ejection is 0.093R rad/sec, or approximately 0.05 deg/sec per centimeter that the cover force misses the spacecraft center-of-gravity. The center-of-gravity was specified to be within 1 inch of the telescope centerline. Therefore, cover ejection would induce only a few tenths of a deg/sec on the spacecraft, much smaller than the rates observed at the secondary hydrogen vent opening.
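The arithmetic in the preceding paragraph can be reproduced with a short Python sketch. The mass, ejection speed, and transverse inertia are the approximate values quoted above; the function and constant names are illustrative, and the model assumes a single impulsive kick about one transverse axis.

```python
import math

COVER_MASS_KG = 7.0              # approximate cover mass quoted in the text
EJECT_SPEED_M_S = 1.0            # nominal ejection speed quoted in the text
TRANSVERSE_INERTIA_KG_M2 = 75.0  # approximate spacecraft transverse inertia


def induced_rate_deg_s(moment_arm_m: float) -> float:
    """Body rate (deg/sec) from an ejection impulse offset from the CG."""
    impulse = COVER_MASS_KG * EJECT_SPEED_M_S             # kg-m/sec
    angular_momentum = impulse * moment_arm_m             # kg-m2/sec (Nms)
    rate_rad_s = angular_momentum / TRANSVERSE_INERTIA_KG_M2
    return math.degrees(rate_rad_s)


rate_per_cm = induced_rate_deg_s(0.01)     # 1 cm offset
rate_one_inch = induced_rate_deg_s(0.0254) # 1 inch offset (the CG specification)
print(f"{rate_per_cm:.3f} deg/sec per cm of offset")
print(f"{rate_one_inch:.3f} deg/sec at a 1 inch offset")
```

A 1 cm offset gives roughly 0.05 deg/sec, and the 1 inch specification limit gives roughly 0.14 deg/sec.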

Cryostat/Thermal Analysis

The cryostat/thermal analysis of the WIRE spacecraft launch anomaly found no credible evidence of a dewar-related failure other than early deployment of the cover. Heat rates into the dewar during the early instrumented portion of the flight were nominal. On-orbit thermal gradients and heating rates are in complete agreement with the nearly 40 watts of solar/Earth heating expected through an open cover, and not in agreement with temperature release of the cover from some other external or internal sources. The greatly accelerated on-orbit cryogen venting rate and resulting spacecraft momentum increase are also in agreement with the premature release of the cover before or at the time of the secondary vent opening. (Volume II, Appendix E)

 

Pyro Electronics Analysis: Failure Mechanism Determination

The initial step in this investigation, conducted at GSFC with support from JPL and SDL, was to review the design schematics and the parts list. The schematics were analyzed for possible sources of error combined with known characteristics of the parts used to implement the design. Several design issues became apparent and these areas were probed further to see if there was a feasible explanation for the mishap. Initially, components available in the GSFC laboratory were tested and evaluated; based on the preliminary results, the experiments and characterization were repeated on flight spare components obtained from Space Dynamics Laboratory (SDL), lending credibility to the theories. (Volume II, Appendix F provides the full text of Richard Katz’s analysis of the failure mechanism.)

Analysis of the power-on transient characteristics of the Field Programmable Gate Array (FPGA) used to implement the logic in the Pyro Box Electronics revealed a specific signature: a memory effect. The device's transient performance (uncontrolled behavior and initial flip-flop state) varied as a function of the amount of time the FPGA had been unpowered. These transient signals directly affected the "ARM" and "FIRE" signals that control the actuation of the pyrotechnic device firing.

Utilizing the knowledge gained from GSFC laboratory testing, the failure mode was replicated on the pyro box Engineering Test Unit (ETU) at SDL. The memory effect on ETU hardware matched that from GSFC laboratory tests, indicating that the same mechanism was present. The WIRE flight Pyro Box had been powered off for a considerable amount of time prior to launch; thus, the in-flight failure was consistent with the signatures seen in both laboratory and post-flight ETU testing.

Summary of the Mishap

The probable direct cause of the WIRE mishap was transient outputs from the pyro electronics box that fired the cover pyrotechnic devices when the box was initially powered-on in flight. The underlying theme of this mishap is that the ideal models of components do not match their actual behavior and that the fidelity of simulators and other support equipment used for design and verification tests was less than required to detect the failure mode that occurred.

The Spacecraft Power Electronics (SPE) switches spacecraft battery power to the pyro electronics box via electromechanical relays. The power source and the method of connection significantly affect the transient power-up phenomenon: whether by battery and relay (hard start) or by lab power supply (soft start). As a result, the rise time of the spacecraft-supplied +28V line to the pyro electronics box, when it is powered in flight, is relatively short. This results in the immediate availability of power to the arming relay and FET for firing the pyrotechnic devices. As configured, the derived +5VDC logic in the pyro electronics box has a relatively slow rise time, affecting the transient behavior of the Vectron crystal oscillator and Actel A1020 FPGA.

Start time for both components is a function of power supply rise time and must be considered during design. The design structure used in the pyro electronics box employed a synchronous reset that was not activated until after the oscillator started. As a result, flip-flops were not initialized to a safe state. Likewise, the FPGA's outputs, which are not controlled during the startup transient, are directly connected to the relay and FET driver electronics.
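The dependence of a synchronous reset on a running clock can be illustrated with a simplified Python model. This is a sketch under stated assumptions, not a representation of the flight design: the 2000 microsecond oscillator start-up time is a hypothetical value chosen for illustration, the clock is assumed to deliver an edge at every time step once it is running, and the asynchronous reset is shown only for contrast and is not a remedy prescribed by the Board. The model also ignores the FPGA charge pump start-up, during which outputs are uncontrolled regardless of reset style.

```python
def unsafe_window_us(reset_style: str, osc_start_us: int,
                     horizon_us: int = 10_000, step_us: int = 10) -> int:
    """Microseconds during which the FIRE latch is not known to be safe.

    reset_style  -- "sync" (reset sampled on clock edges) or "async"
    osc_start_us -- hypothetical time at which the oscillator begins clocking
    """
    if reset_style not in ("sync", "async"):
        raise ValueError("reset_style must be 'sync' or 'async'")

    fire_latch = None  # None models the non-deterministic power-up state
    unsafe_us = 0
    for t_us in range(0, horizon_us, step_us):
        clock_running = t_us >= osc_start_us
        if reset_style == "async":
            fire_latch = False   # reset acts without waiting for a clock edge
        elif clock_running:
            fire_latch = False   # synchronous reset needs a clock edge
        if fire_latch is not False:
            unsafe_us += step_us
    return unsafe_us


sync_window = unsafe_window_us("sync", osc_start_us=2000)
async_window = unsafe_window_us("async", osc_start_us=2000)
print(f"synchronous reset: latch undefined for {sync_window} us")
print(f"asynchronous reset: latch undefined for {async_window} us")
```

In this model, the synchronously reset latch stays in an unknown state for the entire oscillator start-up interval, which is the window in which an uncontrolled output could reach the driver circuitry.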

Instrument-level testing at SDL used live pyrotechnic devices. However, an SPE simulator (laboratory power supply) with non-flight rise time characteristics was used, which led to false-positive test results when observations were made at the output of the pyro electronics box. Spacecraft-level testing used an existing, commonly used Electro Explosive Device (EED) simulator that did not accurately model pyrotechnic devices; additionally, the event indication of the EED simulator was ambiguous and the failure was not recognized during test processes.

Root Cause of WIRE Mishap

The root cause of a failure is the mechanism that directly caused the mishap. Contributing causes include events or conditions that could have been used to identify this condition, as the phenomenon is now understood. Contributing factors are other events or conditions that might have been able to prevent the mishap and should have been done significantly better.

The root cause of the WIRE mission loss is a digital logic design error in the instrument pyro electronics box. The transient performance of components was not adequately accounted for in its design. The failure was caused by two distinct mechanisms that, either singly or in concert, resulted in inadvertent pyrotechnic device firing during the initial pyro box power-up. The control logic design utilized a synchronous reset to force the logic into a safe state. However, the start-up time of the Vectron crystal clock oscillator was not taken into consideration, leaving the circuit in a non-deterministic state for several milliseconds, a time sufficient for pyrotechnic actuation. Likewise, the startup characteristics of the Actel A1020 FPGA were not considered. These devices are not guaranteed to follow their "truth table" until an internal charge pump "starts" the part. These uncontrolled outputs were not blocked from the pyrotechnic devices' driver circuitry. There has been no evidence or indication of any component failure although component failures were considered in the investigation.

Significant Contributing Cause

The significant contributing cause of the WIRE mishap was the failure to identify, understand, and correct the electronic design of the pyro electronics box. Design errors in the circuitry, which controlled pyro functions, were not identified. The pyro electronics box design was not peer reviewed, nor were other system reviews of the electronics box conducted. At the time the WIRE Systems Design Review was conducted, the design of the pyro electronics box was not completed. It is the assessment of the WIRE Mishap Investigation Board that a peer review held by persons with knowledge and expertise regarding pyro circuit design would have identified the turn-on characteristics of the pyro electronics box that led to failure.

Contributing Causes

1. The spacecraft system test program did not uncover the design failure mode. A contributing cause that should have been able to prevent the WIRE mishap was the failure to correctly identify the source of the signal that caused the Electro Explosive Device (EED) Simulator to "latch" upon Pyro Box power-up during spacecraft integration testing. The incident was incorrectly attributed to excessive sensitivity of the EED Simulator. In fact, it was possibly an indication of the transient that caused the in-flight mishap. (It should be noted that this device has been used on numerous occasions for successful verification testing of NSI-based pyrotechnic subsystems and is a standard GSFC test device.) The EED simulator does not accurately simulate the load of a pyrotechnic device for the first 21 milliseconds. This possibly prevented a large current transient from being registered on the power-input lines during spacecraft test. Additionally, the box does not provide adequate information about all input signals capable of firing a pyrotechnic device.

2. A second significant contributing cause to the mishap was the fact that the instrument and the pyro electronics box test programs did not uncover the design failure mode. The box-level testing used the turn-on of a laboratory power supply instead of the closure of a relay to power the pyro box. The slow rise time of the power supply (150-200 ms) masked the short (2 ms) turn-on output transient of the pyro box. Later testing with the ETU pyro box demonstrated that testing with the laboratory power supply but using a relay to apply power would have found this turn-on transient.

3. A third contributing cause to the mishap was the lack of documentation for the Actel A1020 FPGA's power-up transient characteristics in the device data sheet. This information is available in the FPGA Data Book and Design Guide in two application notes.

4. A fourth contributing cause to the mishap was the lack of documentation for the Vectron 200 kHz oscillator's start time in the device data sheet.

5. There was no system-level end-to-end test with live pyrotechnic devices in an as-flown configuration. Although not necessarily a contributing cause by itself, the absence of this test coupled with the low-fidelity simulators may also be considered a contributing factor to the mishap.
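The masking effect of the slow laboratory supply in the box-level testing can be sketched with simple arithmetic. The 2 ms transient and the 150-200 ms rise times come from the text above. The linear ramp, the assumption that the transient window begins when power is applied, the 0.1 ms rise time used for the relay closure, and the function name are all illustrative assumptions, not measured WIRE data.

```python
def supply_fraction(t_ms, rise_time_ms):
    """Fraction of final supply voltage reached at t_ms, assuming a linear ramp."""
    if rise_time_ms <= 0:
        return 1.0
    return min(max(t_ms / rise_time_ms, 0.0), 1.0)


TRANSIENT_MS = 2.0  # turn-on output transient duration quoted in the report

cases = [
    ("lab supply, 150 ms rise", 150.0),
    ("lab supply, 200 ms rise", 200.0),
    ("relay closure, assumed 0.1 ms rise", 0.1),
]
for label, rise_ms in cases:
    frac = supply_fraction(TRANSIENT_MS, rise_ms)
    print(f"{label}: {frac:.1%} of final voltage at end of transient window")
```

Under these assumptions the slow supply has delivered only about 1% of its final voltage by the time the transient is over, whereas a relay closure presents the full voltage throughout it. That is consistent with the report's finding that applying power through a relay would have exposed the transient.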

 

Lessons Learned and Recommendations

The following paragraphs provide a set of proposed lessons learned and recommendations that can be taken to preclude future similar occurrences. The "Lessons Learned Information System" (LLIS) format has been used in constructing this section.

1. Description of Driving Event:

The pyro electronics box design did not appropriately consider the effects of electrical transients known to occur at power turn-on of electronics.

Lessons Learned:

Perform electronics power turn-on characterization tests, particularly for applications involving irreversible events. In some applications, power turn-off characterization may also be important and should be considered.

Recommendations:

a. Independent, separate pyro inhibits should be considered for mission-critical events, particularly if all pyro functions can be simultaneously armed and enabled. Hence, activation of a pyro event would require two separate actions: one separate action to enable the inhibit and another to fire the pyros. This approach would preclude spurious transient pyro firings during turn-on and preclude sympathetic firings induced by sneak path and/or crosstalk/magnetic field interactions that may occur in cabling.

b. Testing only for correct functional behavior should be augmented with significant effort in testing for anomalous behavior, especially during initial turn-on and power on reset conditions.
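The two-action principle in the recommendations above can be sketched as a small state model. The class and method names are hypothetical, and the sketch models only the logical interlock, not any electrical implementation: a fire command has no effect unless a separate, earlier action has released the inhibit.

```python
class PyroChannel:
    """Toy model of a pyro channel guarded by an independent inhibit."""

    def __init__(self):
        # Power-on default: inhibit in place, device not fired.
        self.inhibit_released = False
        self.fired = False

    def release_inhibit(self):
        """First action: deliberately release the independent inhibit."""
        self.inhibit_released = True

    def fire(self):
        """Second action: the fire signal acts only if the inhibit is released."""
        if self.inhibit_released:
            self.fired = True
        return self.fired


channel = PyroChannel()
spurious = channel.fire()    # e.g. a turn-on transient on the fire line
channel.release_inhibit()    # deliberate first action
commanded = channel.fire()   # deliberate second action
print(spurious, commanded)   # False True
```

In this model a spurious fire signal at turn-on is blocked because the inhibit has not yet been released, which is the behavior the recommendation is intended to achieve.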

 

2. Description of Driving Event:

Detailed Peer Review was done on the WIRE Instrument Electronics (WIE) only. Detailed peer reviews of the pyro electronics box and its interfaces with the spacecraft were not done.

Lesson Learned:

Detailed, independent technical peer reviews are essential. Furthermore, it is essential that peer reviews be done to assess the integrity of the system design, including an evaluation of system/mission consequences of the detailed design and implementation.

Recommendations:

a. Peer reviews should be required by Project Management and held as often as necessary.
b. Peer reviews should consider the heritage capability and limitations of the support equipment to be used for testing the flight design.
c. Project review board members should consistently penetrate the system and subsystem functional design and implementation to expose risk areas, particularly where multiple/complex interfaces exist. Reviews should fully define spacecraft and payload interface requirements, and have a cognizant systems person from each program element review the other elements’ test programs and payload/spacecraft simulators for fidelity.

3. Description of Driving Event:

Vent-produced torque received little or no analysis for a worst-case venting scenario.

Because the expected nominal vent rate from the secondary tank was low, the WIRE team spent little effort on the design of the secondary vent exit. A simple T at the exit of the vent would adequately balance the thrust from a nominal flow rate, and the exit was placed as close as possible to the exit point on the cryostat shell to minimize the pressure (and therefore temperature) inside the secondary tank. Unfortunately, the team never analyzed the effect of this exit design during a worst-case venting scenario.

Lesson Learned:

The design configuration and location/mounting of external vent hardware should consider the possibility of a worst-case venting scenario to prevent mission loss or major degradation.

Recommended Action:

System and subsystem engineers should consistently evaluate functional designs and implementation to expose risk areas, particularly where multiple/complex interfaces exist. Projects with multiple components, such as a spacecraft bus and a separate instrument, require complete team cooperation, openness, and the ability to penetrate and understand each other’s design responsibilities in a timely manner.