NASA Office of Logic Design

A scientific study of the problems of digital engineering for space flight systems,
with a view to their practical solution.


Design, Analysis, and Test Guidelines


SAFETY DESIGN REQUIREMENTS AND GUIDELINES FOR MUNITION RELATED SAFETY CRITICAL COMPUTING SYSTEMS

STANAG 4404, Edition 2

Introduction (excerpt): Applications of computing systems to munitions provide increased versatility and capability that may make the munition much more effective in its intended role. However, the level of risk associated with these computing systems must be ascertained in order to ensure that the munition system conforms to the acceptable level of risk defined in the system specifications. Safety critical applications of computing systems include not only control functions, where the computer exercises direct control over a piece of hardware, but those where the output is used to make safety critical decisions, such as monitors of safety critical functions. Computing system safety is a discipline of system safety engineering that specifically addresses the unique issues and problems associated with computing systems in such applications.
     The aim of this agreement is to provide generic safety design requirements and guidelines for the design and development of all classes of conventional munition systems in which computing systems have or potentially have safety critical applications. These requirements and guidelines are designed such that, if properly implemented, they will reduce the risk of the computing system causing an unsafe condition, malfunction of a fail safe system or non-operation of a safety function in the munition.


Minimizing HDL Design Errors

Ben Cohen
VhdlCohen Publishing

Minimizing_Design_Errors_HDL.pdf
Minimizing_Design_Errors_HDL.doc

Abstract

This paper discusses processes, methodologies, and classes of tools necessary to minimize ASIC and FPGA design errors.

(added July 5, 2001)

Design Guidelines and Criteria

for

Space Flight Digital Electronics

nasa_guidelines

Introduction (excerpt)
This monograph discusses design guidelines and criteria which form a basis for the specification, design and evaluation of digital avionics for spaceborne applications. The goal of this work is to help ensure that the design of the hardware is flight worthy. The responsibility for a reliable design and hardware rests solely with the design and analysis team.  The material presented here will concentrate effort on items that are often seen to be problems in space flight digital hardware, giving us the most benefit for the time and effort expended.  This list will not state how to design a particular circuit, perform an analysis, or prepare the results, but will cover the items that need to be considered for a successful and robust design.


White Paper on Definitions for and Approach to Anomaly Handling

Professor Nancy Leveson
Aeronautics and Astronautics Department
MIT

Rich Katz, NASA Office of Logic Design

Introduction
The current definitions of in-family and out-of-family performance used by some organizations essentially allow any anomalous behavior to be labeled as “in family” once it has been accepted or waived, whether or not it satisfies the requirements, specified performance limits, or good engineering practice.  By not determining the root cause and mechanisms and by not bounding worst-case performance, the effects of repeated events are left unbounded. The Columbia accident and the treatment of the foam shedding over the life of the Shuttle demonstrate the problems with these definitions and practices.


"Usage of EEPROM in Digital Designs"

Saab Ericsson Space
D-G-NOT-00385-SE
2004

Background

In the course of a project, several problems related to EEPROMs were encountered. Since several manufacturers use the same die in their devices, these problems are relevant to those devices as well. The purpose of this document is to describe the problems encountered and their symptoms, and to suggest a safe design methodology for the use of EEPROMs in digital designs.


Clock Skew and Short Paths Timing

March, 2004
clock_skew_actel_2004.pdf

Introduction
Differences in clock signal arrival times across the chip are called clock skew. It is a fundamental design principle that timing must satisfy register setup and hold time requirements. Both data propagation delay and clock skew are parts of these calculations. Clocking sequentially-adjacent registers on the same edge of a high-skew clock can potentially cause timing violations or even functional failures. Figure 1 shows an example of sequentially-adjacent registers, where a local routing resource has been used to route the clock signal. In this situation, a noticeable clock skew is likely.
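
The setup and hold relationship described above can be sketched in a few lines of Python. This is a minimal illustration, not taken from the application note: the `RegisterPath` fields, the sign convention for skew (capture clock arrival minus launch clock arrival), and all numeric values are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class RegisterPath:
    """Timing of one register-to-register path, all values in ns (hypothetical)."""
    t_cq_min: float    # fastest clock-to-Q delay of the launching register
    t_cq_max: float    # slowest clock-to-Q delay of the launching register
    t_data_min: float  # shortest data propagation delay between the registers
    t_data_max: float  # longest data propagation delay between the registers
    t_setup: float     # setup time of the capturing register
    t_hold: float      # hold time of the capturing register


def setup_slack(path: RegisterPath, period: float, skew: float) -> float:
    """Setup margin; skew = capture clock arrival minus launch clock arrival."""
    return period + skew - (path.t_cq_max + path.t_data_max + path.t_setup)


def hold_slack(path: RegisterPath, skew: float) -> float:
    """Hold margin; a late capture clock (positive skew) eats into it."""
    return (path.t_cq_min + path.t_data_min) - (path.t_hold + skew)


# Assumed example: a short path between sequentially-adjacent registers.
short_path = RegisterPath(0.5, 1.5, 0.25, 3.0, 1.0, 0.25)
for skew in (0.0, 1.0):
    print(skew, setup_slack(short_path, 10.0, skew), hold_slack(short_path, skew))
```

With these assumed numbers, 1.0 ns of skew improves the setup slack but drives the hold slack negative, which is the short-path failure the introduction warns about. Slowing the clock does not help, because the hold check does not depend on the clock period.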


The Ten Commandments of Excellent Design

Peter Chambers, Engineering Fellow
VLSI Technology
1997  (notes)

Abstract
This report will give you some pointers that will help you design synchronous circuits that work first time. Ten commandments that should always be followed!


RT54SX72S: Propagation Delay vs. Life

sxs_delta_data

Overview
The enclosed chart shows aggregate data from multiple lots of RT54SX72S FPGAs, totalling 1,040 individual devices.  The speed data, with a mean delay of 73.8 ns, are measurements of the binning circuit, which is representative of logic paths and is used in determining device speed grade.  Note that the delays over life do not necessarily "track," as there are both differences in the changes of delay as well as some differences in the sign of a delay change.

Design, Test, and Certification Issues for Complex Integrated Circuits

Report No. DOT/FAA/AR 95/31
L. Harrison and B. Landell
August 1996

Abstract
  
This report provides an overview of complex integrated circuit technology, focusing particularly upon application specific integrated circuits. This report is intended to assist FAA certification engineers in making safety assessments of new technologies. It examines complex integrated circuit technology, focusing on three fields: design, test, and certification. It provides the reader with the background and a basic understanding of the fundamentals of these fields. Also included is material on the development environment, including languages and tools.
   Application specific integrated circuits are widely used in Boeing 777 fly by wire aircraft. Safety issues abound for these integrated circuits when they are used in safety critical applications. Since control laws are now executed in silicon and transmitted from one integrated circuit to another, reliability issues for these integrated circuits take on a new importance. This report identifies certification risks relating to the use of complex integrated circuits in fly by wire applications.


Intel® Xeon™ Processor Design Guidelines

Overview
A variety of very detailed notes on how to design with this high performance processor.


Design for Validation

Sally C. Johnson and Ricky W. Butler
NASA Langley Research Center
design_for_validation.pdf

Abstract
The use of computer hardware and software in life critical applications, such as for civil air transports, demands the use of rigorous formal mathematical validation procedures. However, formal specification and verification will only be tractable if the system is designed in a manner that lends itself to formal methods. Likewise, accurate reliability analysis will only be tractable if the number of interacting components that must be individually included in a single reliability model is kept to a low number and if their failure behavior interactions can be modeled simply. Also, the system must be designed such that the system reliability does not directly depend on system parameters that cannot be accurately determined. This paper presents a design methodology based on the concept of designing a system in such a manner that it can be rigorously validated, or "design for validation."

Timing Analysis of Asynchronous Signals

Introduction
Timing analysis of all signals must be performed.  A common mistake in the analysis of digital systems is neglecting the timing analysis of asynchronous signals.  With respect to flip-flops, these inputs are usually called PRESET and CLEAR.  For MSI devices, there are other similar signals, such as the JAM input to counters or shift registers.  Although these signals are labeled "asynchronous," they do have restrictions on their use.  Failure to meet the specifications can result in incorrect operation.  Obviously, data being "jammed" into flip-flops must meet setup and hold times with respect to the asynchronous command pulse.
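
As a rough sketch of what such a check might look like, the Python below tests an asynchronous pulse against typical data sheet limits. The limits used here (minimum pulse width, recovery time before the next clock edge, and jam data setup and hold) are the kinds of restrictions data sheets commonly list. The function names, parameter names, and the choice of reference edge for the jam data are assumptions for illustration, so the part's data sheet remains the authority.

```python
def check_async_pulse(pulse_width, release_to_clock, t_pw_min, t_recovery):
    """Check an asynchronous CLEAR or PRESET pulse; returns the violated limits."""
    violations = []
    if pulse_width < t_pw_min:
        violations.append("minimum pulse width")
    if release_to_clock < t_recovery:
        violations.append("recovery time before next clock edge")
    return violations


def check_jam_data(data_setup, data_hold, t_setup, t_hold):
    """Check data jammed in by an asynchronous load pulse.

    data_setup and data_hold are measured relative to whichever edge of the
    load pulse the data sheet specifies (an assumption left to the caller).
    """
    violations = []
    if data_setup < t_setup:
        violations.append("jam data setup")
    if data_hold < t_hold:
        violations.append("jam data hold")
    return violations


# Hypothetical worst-case numbers in ns: the pulse is wide enough, but it is
# released too close to the next active clock edge.
print(check_async_pulse(20.0, 3.0, t_pw_min=15.0, t_recovery=5.0))
print(check_jam_data(10.0, 0.0, t_setup=8.0, t_hold=2.0))
```

An empty list from both functions means the asynchronous input meets every limit that was checked. It says nothing about limits that were left out of the check.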


Suggestions for VHDL Design Presentation

Goals
  • Detailed design review and worst case analysis are the best tools for ensuring mission success.
  • The goal here is not to make more work for the designer, but to:
    • Enhance efficiency of reviews
    • Make proof of design more clear
    • Make design more transferable
    • Improve design quality
NSCAT Digital Subsystem Design Documentation and Analyses  


Mishap Investigation Procedures

TOR-0090(5530-01)-1
5 February 1990
R. A. Hartunian and Space Systems Division, Air Force Systems Command

mishap_tor-0090_5530_01_1.pdf

Introduction

When a program incurs a major launch or on-orbit failure, two general actions are required to recover from the failure:

  • An Investigation Phase - Identification of the root cause/causes of the failure.
  • A Recovery Phase - Development of a recovery program which identifies and implements positive, high-confidence fixes for the root cause/causes.

Since the Recovery Phase is very system specific, this document will primarily address the Investigation Phase.

A comprehensive and systematic evaluation of all available data is required in order to identify all credible possible causes of a failure and to provide the necessary corrective actions to assure that the failure does not recur. This report discusses proven failure investigation methodology and tools which will be useful in assuring that such comprehensive, systematic evaluations are conducted and that the necessary documentation trail to support review of the investigation is provided. The guidance provided is derived from a review of the past methodology/procedures followed by USAF Space Systems Division and The Aerospace Corporation with emphasis on those failures occurring since 1981.


Root Cause Analysis (RCA)

Q/ Associate Administrator for Safety and Mission Assurance AE/Chief Engineer
July 1, 2003

root_cause_analysis_bradley_2003.pdf

Summary
This memo provides the results of a joint engineering and safety effort to establish common terminology in NASA to facilitate improved communication and understanding of RCA.  RCA is a structured evaluation method that identifies the root causes of an undesired outcome and the actions adequate to prevent recurrence.

The Agency has adopted this definition for RCA and other revised definitions for root cause and related terms to ensure that when teams say that they have reached a "root cause," they have completed detailed evaluations, identified all relevant organizational factors, and/or exhausted all data. These definitions are applicable to both engineering and safety teams that are using RCA to evaluate problems or mishaps.


Process Based Mission Assurance:

About Mishap Investigation

    
This website is maintained by the NASA Headquarters Office of Safety and Mission Assurance to provide a collaborative environment between government agencies, academia, and the commercial sector to promote the exchange of knowledge and advance the development of accident investigation methodology and tools.
     This website has been developed to provide a timely exchange of technical information by enhancing communication and coordination between organizations and agencies. It is hoped that this site will facilitate technical interaction, promote synergy between organizations, allow sharing of documents, and assist in the identification of technical issues.


Galileo AACSE: Worst Case Analyses (WCA) Description and Criteria

wca_galileo

Foreword
This document defines the guidelines used to perform the Worst Case Analysis of the Galileo Attitude and Articulation Control Subsystem Electronics.  These guidelines reflect worst case analysis requirements defined in reference 2.


NASA ASIC Guide

 

 


Shortcomings in Ground Testing, Environment Simulations, and Performance Predictions for Space Applications

E.G. Stassinopoulos and G.J. Brucker
NASA Technical Paper 3217
April 1992

stas_92

Abstract
This paper addresses the issues involved in radiation testing of devices and subsystems to obtain the data that are required to predict the performance and survivability of satellite systems for extended missions in space. The problems associated with space environmental simulations, or the lack thereof, in experiments intended to produce information to describe the degradation and behavior of parts and systems are discussed. Several types of radiation effects in semiconductor components are presented, for example: ionization dose effects, heavy ion- and proton-induced Single-Event Upsets (SEUs), and Single-Event-Transient Upsets (SETUs). Examples and illustrations of data relating to these ground testing issues are provided. The primary objective of this presentation is to alert the reader to the shortcomings, pitfalls, variabilities, and uncertainties in acquiring information to logically design electronic subsystems for use in satellites or space stations with long mission lifetimes, and to point out the weaknesses and deficiencies in the methods and procedures by which that information is obtained.


Human-Rating Requirements and Guidelines for Space Flight Systems

NPG: 8705.2
Effective Date: June 19, 2003
Expiration Date: June 19, 2008

human_rating_n_pg_8705_0002.doc
human_rating_n_pg_8705_0002.pdf

P.1 PURPOSE
NASA’s policy is to protect the health and safety of humans involved in and exposed to space flight activities, specifically the public, the crew, passengers, and ground processing personnel. This document aids in the implementation of that policy by establishing human-rating requirements for Agency space flight systems that carry humans or whose function or malfunction may pose a hazard to NASA space systems that carry humans. This document provides the requirements, procedures, and guidelines to design and certify as human-rated all space flight systems involving humans or interfacing with human space flight systems prior to and after becoming operational. The intent of this certification is to provide the maximum reasonable assurance that a failure will not imperil the flight crew or occupants and that personnel may be recovered without a disabling injury if there is a mishap. Certification ensures that conditions that could adversely affect the safety of personnel are mitigated. The human-rating process is used to maximize the safety of the crew and passengers. Other requirements for safety and mission assurance are documented in NASA Headquarters Office of Safety and Mission Assurance (OSMA) policy and program-specific requirements documents. (For space suits and human maneuvering units, human rating implies flight certified.)

Electronic Reliability Design Handbook

MIL-HDBK-338B 1 October 1998
mil-hdbk-338b.pdf
mil_hdbk_338b_contents.pdf

1.1 Introduction This Handbook provides procuring activities and development contractors with an understanding of the concepts, principles, and methodologies covering all aspects of electronic systems reliability engineering and cost analysis as they relate to the design, acquisition, and deployment of DoD equipment/systems.

1.2 Application This Handbook is intended for use by both contractor and government personnel during the conceptual, validation, full scale development, and production phases of an equipment/system life cycle.


JPL Reliability Analyses Handbook

July 1990
JPL D-5703

jpl_d_5703.pdf

e-mail for access

I. INTRODUCTION (excerpt)

A. General: This document provides guidelines for performing and reviewing reliability analyses associated with flight equipment. It is responsive to the analysis requirements of JPL D-1489 (Ref. 1). In addition, it provides procedures for identifying, preparing, processing, tracking and resolving deficiencies in the analyses and/or design. This document does not address analyses required in direct response to safety concerns. It should be emphasized that these analyses are not an after-the-fact documentation of what resulted from the design process, but are an active integral part of the design process. There should be immediate action taken if unacceptable analysis results are found.

B. Purpose: The analyses guidelines provide a centralized source of information on performing and reviewing reliability analyses. The purpose is to promote uniformity of the various methodologies, both within a specific project and from project to project. The review guidelines not only provide information to assist the review function, but by explicitly defining what the reviewer should be looking for, the analyst performing the analysis can provide the information in a form that is understandable to the reviewer.


Payload Vibroacoustic Test Criteria

NASA-STD-7001
June 21, 1996
NASA Technical Standard

7001.pdf

Purpose:
The primary objective of this standard is to establish a uniform usage of test factors in the vibroacoustic verification process for spaceflight payload hardware. The standard provides test factors for verification of payload hardware for prototype, protoflight, and flight acceptance programs. In addition, minimum workmanship test levels are included. With the exception of minimum workmanship test levels, the test levels are given in relation to the "maximum expected flight level" (MEFL). Although the major emphasis of the standard is on test levels, the standard also covers the subjects of test duration, test control tolerances, data analysis, test tailoring, payload fill effects, and analysis methods.


Payload Test Requirements

NASA-STD-7002
July 10, 1996
NASA Technical Standard

7002.pdf

Purpose:
This standard provides a NASA-wide common basis from which test programs shall be developed for NASA payloads. The document defines a succinct standard set of flight hardware test requirements which provide for the necessary verification of design adequacy and flight worthiness of NASA spacecraft. Compliance provides consistency across the Agency and its contractors, thereby facilitating the sharing of hardware between Centers and programs. Compliance also provides a basis for establishing a baseline pedigree that allows the conduct of a "qualification by similarity" evaluation process for "heritage" hardware without the complicating need to consider the variability of test requirements.


SX-A/RT54SX-S SSO Preliminary Results

October 2, 2002
Actel Corp.

sso-10-1-02_actel.pdf

Publication Motivations
  • To address some of the concerns expressed by some customers prior to full report completion.
  • To assist customers in assessing their chip and board designs with the currently available data.
  • Currently more data are being collected. The current guidelines will be substantiated with more data.
 

DESIGN CHECKLISTS FOR MICROCIRCUITS
GUIDELINE NO. GD-ED-2203
PREFERRED RELIABILITY PRACTICES
JOHNSON SPACE CENTER

NASA Reliability Preferred Practices
A. PURPOSE

This manual is produced to communicate within the aerospace community design practices that have contributed to NASA mission success. The information presented has been collected from various NASA field centers and reviewed by a committee consisting of senior technical representatives from the participating centers.

B. APPLICABILITY

The information presented in this manual represents the "best technical advice" that NASA has to offer on reliability design and test practices. The practices contained in this manual should not be interpreted as requirements, but rather as proven technical approaches that can enhance system reliability. Application of the practices and guidelines is strongly encouraged, but the final decision regarding applicability resides with the particular program or project office. The manual is divided into two technical sections. Section II contains reliability practices, including design criteria, test procedures, and analytical techniques that have been successfully applied on previous space flight programs. Section III contains reliability guidelines, including techniques currently applied to space flight projects where insufficient information exists to certify that the technique will contribute to mission success.

An Outline of Worst Case Analysis Requirements for Digital Electronics

WCA_Requirements.pdf

Abstract
     Every designer’s goal is mission success: the production of a correctly functioning system.  One of the keys to achieving that goal is the worst case analysis (WCA). A detailed WCA, if performed during the design phase, can find design problems that may not be found during the test phase. Timing errors, interface margin problems, and other design flaws may manifest themselves only under limited operating conditions that are not present during test, such as temperature extremes, age, or radiation, or in limited operating modes that are not exercised in test. The only way to guarantee that no design flaws exist in a circuit is to carefully analyze the circuit and prove their absence.
     The purpose of a WCA is to prove the design will function as expected during its mission. The spirit of analysis is proof: all circuits are considered guilty of design flaws until proven innocent. The following is an outline of WCA requirements which introduces the circuit design items that must be reviewed as part of the WCA.

Digital Timing Analysis Tools and Techniques

Timing.pdf

Abstract
     The timing analysis is a crucial part of a digital system’s worst case analysis. Every latched device has timing requirements -- set-up times, hold times, etc. -- that must be met in order to guarantee correct system operation, and the goal of the timing analysis is to determine whether they are met. Because each device input can have many sources whose timing can vary with circuit operation mode, the timing analysis can be very complicated and time consuming.  Thus many attempts at automating the timing analysis task have been made. But the task is sufficiently complex that attempts to fully automate it have, so far, had only limited success. This report examines several timing analysis methods, and discusses their strengths and weaknesses.
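
To make the "many sources" point concrete, here is a minimal Python sketch of the bookkeeping involved. It is not one of the methods the report examines. The source names, the arrival windows, and the single-clock model (data launched at one edge and captured at the next) are all assumptions for illustration.

```python
def worst_case_arrivals(sources):
    """sources maps a name to (earliest, latest) data arrival after the clock edge."""
    earliest = min(lo for lo, _ in sources.values())
    latest = max(hi for _, hi in sources.values())
    return earliest, latest


def check_input(sources, period, t_setup, t_hold):
    """Setup uses the latest arrival over all sources; hold uses the earliest."""
    earliest, latest = worst_case_arrivals(sources)
    return {
        "setup_margin": period - t_setup - latest,
        "hold_margin": earliest - t_hold,
    }


# Hypothetical arrival windows (ns) for one latched input in two operating modes.
sources = {
    "normal_mode": (2.0, 6.0),
    "test_mode": (0.5, 8.0),
}
print(check_input(sources, period=10.0, t_setup=1.0, t_hold=1.0))
```

In this made-up case the input passes in normal mode alone, but the test-mode source arrives early enough to violate hold. This is why every source and every operating mode has to be enumerated before the margins mean anything.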

Root-Sum-Square (RSS) Calculations of Digital Timing Delays

RSS.pdf

Abstract
     The subject of RSS versus extreme value calculations arises often in worst case analyses because the calculation of a quantity, e.g., the delay of a digital parts chain, required to be less than some value, will yield a smaller result when calculated by the RSS method than by the extreme value method, making it easier to claim that requirements are met.
     The validity of RSS is often debated without exploring its mathematical basis. This report discusses the basis for RSS calculations and the method’s limitations. Although the discussion is centered around calculating the propagation delays of digital circuits, the basic theory and conclusions apply to any application of RSS.
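
The difference between the two calculations can be shown with a short Python sketch. It assumes the common formulation in which each part contributes a nominal delay plus a worst-case deviation, and in which the deviations are statistically independent. The report's own formulation may differ, and the function names and numbers below are illustrative only.

```python
import math


def extreme_value_delay(nominals, deviations):
    """Worst case: every part is at its maximum deviation at the same time."""
    return sum(nominals) + sum(deviations)


def rss_delay(nominals, deviations):
    """RSS: deviations combine as the square root of the sum of their squares."""
    return sum(nominals) + math.sqrt(sum(d * d for d in deviations))


# Hypothetical two-part chain: 10 ns nominal each, deviations of 3 ns and 4 ns.
nominals = [10.0, 10.0]
deviations = [3.0, 4.0]
print(extreme_value_delay(nominals, deviations))  # 27.0
print(rss_delay(nominals, deviations))            # 25.0
```

The RSS figure is never larger than the extreme value figure, which is why it makes requirements easier to claim. If the deviations are correlated, for example because all parts share the same temperature and supply voltage, the independence assumption behind the RSS result no longer holds.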


Electrical Power Systems, Direct Current, Space Vehicle Design Requirements

Aerospace Report No. TOR-2005(8583)-2
May 11, 2005
 

Abstract
The Technical Operation Report baselines an updated set of requirements for spacecraft electrical power and distribution systems.  It is intended to be used as a starting point for upgrading of previous military specifications in this area, or for development of a new specification dedicated solely to power system requirements.  An ancillary use of the document is to edify those in the acquisition process such that they may more thoroughly understand the basic considerations of power system design, as well as subtler and sometimes unaddressed issues that can adversely affect mission success if not addressed.

Last Revised: February 03, 2010
Digital Engineering Institute
Web Grunt: Richard Katz