THE RIGHT STUFF

for

AGING / INTERMITTENCE / NO FAULT FOUND

Brent A. Sorensen - Paul W. Sorensen

Aging Electronics Research and Development

Universal Synaptics Corporation

1801 West 21st Street Ogden, Utah 84401

Tel: (801) 731-8508 E-Mail: contact@usynaptics.com

Visit us at our web page: www.nofaultfound.com

 


ABSTRACT

For those in the aging avionics repair and maintenance business, the acronyms NFF (No Fault Found) and CND (Cannot Duplicate) are, unfortunately, all too familiar. After several decades of frustration, this elusive phenomenon continues to consume an enormous amount of test and diagnostic effort and remains a source of considerable finger-pointing within the multi-level avionics repair model.

During the early formative years of the JCAA (Joint Council on Aging Aircraft), established in response to a rash of aviation accidents and space failures, NFF was initially overlooked, or at least significantly underrated, because many considered it just a "maintenance nuisance that does not hazard the aircraft". This premise was perhaps loosely supported by the lack of proper paper trails tracking NFF problems and by the scarcity of official investigations that traced these mysterious failures to their true root cause.

There are undoubtedly many causes of NFF and all of them should be addressed. The question is: Where do you start? Which will be the most beneficial?

Our own efforts have focused on a literal, statistical reading of "No Fault Found": if a system's or device's NFF rate has increased with age and deterioration, a physical fault is probably present. If that fault isn’t found during normal testing, it probably fails only intermittently. And because it fails intermittently, it probably cannot be detected or diagnosed at test time, given the known and demonstrated limitations of the measurement equipment used to perform the tests.

In this paper we detail the problem of age-related intermittence and its testing difficulties. More importantly, we report on our recent successful efforts with Total Quality Systems (TQS) of Ogden, Utah, to develop a team-based overhaul system called IFDIS. It incorporates the testing procedures and equipment that are proving phenomenally successful in resolving this chronic "aging" aspect of the NFF problem.

Intermittence occurs randomly in time, place, amplitude and duration. The very nature of the failure mode suggests that the ability to detect and further isolate intermittence to root cause is based on detection SENSITIVITY and PROBABILITY, rather than traditional methods centered around ohmic measurement accuracy. Simply put, you can’t detect an intermittent until it occurs, and then you might only have a few chances to catch it on the specific circuit when it does. Trying to measure fractions of a milliohm one circuit at a time is ineffective for this particular failure mode.

Through extensive hands-on failure analysis and repair of NFF avionics and other aging electronics, our research revealed that nearly all failures are caused by underlying intermittence in the circuit path interconnections, not the electrical components. The electrical components generally fail "hard" and are, by comparison, easy to troubleshoot and repair. In contrast, the interconnecting "devices" mostly fail intermittently. These devices are defined here as the connectors, crimps, splices, circuit board traces and vias, solder joints, bulkhead connectors, backplanes, switches, circuit breakers, fuse receptacles, etc. In short, all the "electromechanical" devices that "glue" a circuit together.

Just like machinery, these devices wear out gradually, or contamination builds up over a very long period. Rarely, unless damaged, will they work perfectly well one minute and become a repeatable, testable, hard failure the next. Instead, electromechanical devices enter a very long and frustrating period of low-level intermittency as their mechanical tolerances change with age, wear and the environmental conditions at any given time, such as temperature, humidity, vibration and contamination.

When a circuit device’s electromechanical intermittence reaches sufficient magnitude, the electrical function it supports begins to malfunction, resulting in a growing number of intermittent-type system failures. When the system is subsequently tested on the ground in a static environment, it may very well function well enough to avoid detection.

It’s at this point that NFF’s circular logic and confusion begin. When a malfunction is reported but is no longer evident or easily detectable with standard equipment, the technician has very few expedient diagnostic choices: is the intermittence in the aircraft or is it in the box? It’s highly unlikely that the pilot was mistaken or imagined the original flight malfunction. So, line technicians are often left to simply "shotgun" the repair to try to address the original write-up in a timely manner. Unfortunately, when system elements are removed before the root of the intermittent has been located, the removed item may not be the source of the problem. Suggestions that the technician simply pulled the wrong item due only to inadequate training, tech orders, experience, etc., tend to overlook the original undetected intermittent malfunction that started the process and still exists somewhere in the system.

Since intermittence occurs primarily in the electromechanical devices, when the "most likely opportunity" is calculated, the Line Replaceable Unit (LRU) becomes the most prominent suspect. There are several hundred if not thousands of failure points in each avionics box, whereas the aircraft circuits and connections leading into the box may number just one or two hundred. An opportunity for failure of roughly 8:1 to 10:1 for the LRU over the aircraft is probably a good ballpark estimate.

THE TESTING SOLUTION

Once intermittent failure modes are clearly understood, it becomes quite evident why the vast array of traditional test equipment cannot efficiently or effectively test for this problem.

In a typical avionics system, thousands of internal and external circuit paths run through thousands of physical interconnection points. All of these points are aging to some degree and will fail intermittently long before they fail permanently, and it takes only one of them reaching this condition to condemn the entire unit. It is virtually impossible to probe such a system, and even if it were tried, the probability of measuring the right line, at the right time, looking for the right signal would be infinitesimal.

By any reasonable scientific explanation of the problem, to catch intermittents on the ground you need phenomenal testing speed (sensitivity) and 100% bandwidth. In other words, the proper technology for the task would test all of the failing system’s lines, all of the time, simultaneously and continuously. Traditional test equipment does just the opposite. These devices employ digital sampling and averaging techniques to achieve high levels of parametric accuracy. Because they sample and average and, even more importantly, can measure only one circuit at a time, they leave large gaps in intermittence testing coverage.
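
As a rough illustration of such coverage gaps, the following Python sketch estimates the chance that a sampled, one-circuit-at-a-time instrument sees a single random glitch. The model is an assumption made for illustration only and is not drawn from measurements: samples are instantaneous and taken at a fixed interval, the glitch starts at a uniformly random time, and the faulty circuit is equally likely to be any of the circuits being scanned. The function name and the example values are hypothetical.

```python
def single_event_catch_chance(glitch_s, sample_interval_s, circuits):
    """Chance that a scanning, sampling tester sees one random glitch.

    Assumed model (illustrative only):
      - samples are instantaneous, taken every sample_interval_s seconds
      - the glitch lasts glitch_s seconds and starts at a random time
      - the tester watches one of `circuits` lines, each equally likely
    """
    if glitch_s <= 0 or sample_interval_s <= 0 or circuits < 1:
        raise ValueError("durations must be positive and circuits >= 1")
    # A glitch shorter than the sample interval is seen only if a
    # sample instant happens to fall inside it.
    time_coverage = min(1.0, glitch_s / sample_interval_s)
    # Only one circuit out of `circuits` is watched at any moment.
    line_coverage = 1.0 / circuits
    return time_coverage * line_coverage


# Assumed example: a 100 ns glitch, one sample every 100 us, 1000 lines.
chance = single_event_catch_chance(100e-9, 100e-6, 1000)
print(f"{chance:.1e}")
```

Under these assumptions the instrument has roughly a one-in-a-million chance of seeing any single glitch, which is why repeated static tests of an intermittent unit can keep passing.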

To confront these problems, the IFD was developed specifically with intermittence requirements in mind. It uses highly sensitive analog detection technology on the front end and digital reporting and data processing technology on the back end, and it does so in an efficient, parallel manner. The IFD line of Intermittent Fault Detectors consistently detects intermittent circuit events on an unlimited number of circuits, simultaneously, at glitch durations as short as 50 nanoseconds.

What does this mean in the overall scope of intermittence detection probabilities?

While we are certainly not comparing ourselves to Albert Einstein, his formula E = mc², which explained the forces unleashed by the atomic bomb, is very similar to the probability gains derived from the IFD technology when used to catch random intermittents. To explain and demonstrate this enhanced capability in a large system of simultaneous circuit paths under test, we use a formula that we affectionately call:

Universal Synaptics’ Law of Intermittent Fault Detection Probability or

P = SC²

P is the Probability that IFD technology will be able to catch an intermittent versus any other comparable piece of test equipment.

S is the Single circuit intermittence detection Speed improvement ratio that the IFD has over the single circuit intermittent detection speed capability of any comparable testing technology (for the IFD use 100ns).

Simply stated, what is the ratio of the shortest glitch detectable by any two pieces of test equipment?

Example: 100 µs divided by 100 ns = 1,000:1, or 100 ms divided by 100 ns = 1,000,000:1

C is the number of circuits that require simultaneous testing to root out the system’s intermittence.

Example: 1000 test points in a typical avionics box.

Result:

Applying this formula to compare the IFD at 100 ns against some of the best digital technology at 100 µs gives S = 1,000 (1,000 times faster on a single line). With C = 1,000 lines tested simultaneously, C squared is 1,000 x 1,000. The resulting IFD probability gain is P = 1,000 x 1,000 x 1,000, or ultimately 1,000,000,000:1

Note: The reason the number of circuits under test is squared is that while other single point or even scanning-type testers are testing one circuit at a time, the IFD is simultaneously testing all the other circuits for the same duration. As the other tester moves on to test a new circuit, the IFD continues to test all the other connected circuits for the same period and so forth and so on.
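
The arithmetic above can be reproduced with a short Python sketch. It simply evaluates the formula as defined in the text. The function and parameter names are ours, chosen for illustration, and glitch durations are expressed in nanoseconds to keep the division exact.

```python
def speed_ratio(other_min_glitch_ns, ifd_min_glitch_ns=100):
    """S: ratio of the shortest glitch each instrument can detect.

    The text says to use 100 ns for the IFD; names are illustrative.
    """
    return other_min_glitch_ns / ifd_min_glitch_ns


def probability_gain(s, c):
    """P = S * C**2, as the formula is defined in the text."""
    return s * c ** 2


s = speed_ratio(100_000)      # 100 us comparison instrument vs. 100 ns
c = 1000                      # test points in a typical avionics box
p = probability_gain(s, c)
print(f"S = {s:,.0f}:1, C = {c:,}, P = {p:,.0f}:1")
```

With S = 1,000 and C = 1,000 this yields the billion-to-one figure quoted in the Result. Changing either input shows how strongly the squared circuit count dominates the outcome.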

This billion-to-one improvement in detection probability using IFD technology in a typical avionics system may, at first glance, seem hard to believe, and, indeed, we have found ourselves having to defend the technology in some circles because of its tremendous probability gains. But, these same rather simple to compute metrics show conclusively why the IFD technology works so well for the intermittence phenomenon and why this technology can see real intermittent circuit occurrences that other methods may not be able to. Given this "explosion" in test coverage, it should be clear why the IFD is the only applicable technology capable of detecting, resolving and, possibly just as important, gauging the overall problem and levels of Intermittence and NFF.

IFDIS: THE RIGHT STUFF

Intermittent Fault Detection and Isolation Systems (IFDIS) can best be described as a 3-pillared approach to resolving NFF / Intermittence.

Its primary pillars consist of:

1.) The implementation and use of serialized data tracking to identify bad actors and repeat problems by Aircraft or Line Replaceable Unit (LRU).

2.) The application of environmental stimuli to duplicate the operational environment and rapidly expose even the "softest" of intermittence during test time.

3a.) The use of specialized intermittence testing technology -- IFD (Intermittent Fault Detectors) -- developed by Universal Synaptics specifically to detect and isolate the underlying intermittent causes at levels of sensitivity and probability never before possible.

3b.) Highly engineered test adaptation to ensure that all of the potential failing circuit interconnects in the suspect devices are all tested simultaneously and continuously while closely simulating the actual operational environment.
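
The first pillar, serialized data tracking, might be sketched as follows in Python. This is a minimal illustration, not the IFDIS data system: the record layout, the serial numbers, the outcome labels and the threshold of two NFF removals are all hypothetical.

```python
from collections import Counter

# Hypothetical removal records: (LRU serial number, shop test outcome).
# Serial numbers and outcomes are made up for illustration.
removals = [
    ("SN-0012", "NFF"),
    ("SN-0047", "REPAIRED"),
    ("SN-0012", "NFF"),
    ("SN-0103", "NFF"),
    ("SN-0012", "NFF"),
    ("SN-0047", "NFF"),
]


def find_bad_actors(records, min_nff=2):
    """Return (serial, NFF count) pairs with at least min_nff NFF
    outcomes, ordered from most to fewest NFF removals."""
    nff_counts = Counter(sn for sn, outcome in records if outcome == "NFF")
    return [(sn, n) for sn, n in nff_counts.most_common() if n >= min_nff]


bad_actors = find_bad_actors(removals)
print(bad_actors)
```

Units that keep returning with an NFF outcome are the natural first candidates for environmental stimulus and intermittence testing under the remaining pillars.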

 

TESTING SUCCESS

To test our theory of NFF, we teamed with TQS (Total Quality Systems) of Ogden and with Hill AFB on an SBIR project to examine the NFF testing aspect of a typical avionics test and repair facility. Hill’s shop provided a number of problematic LRU chassis that would not pass functional testing and yet could not be diagnosed down to a specific defect that could be fixed (NFF).

IFDIS testing was able to pinpoint the cause of failure in nearly 98% of the test cases. In addition to pinpointing specific intermittent reasons for test failure, we also discovered, on a rough average, 2-3 other defects that had previously caused failures or were about to. In about 15-20% of the test cases, hard failures and miswires were also discovered that legacy testing could not detect or diagnose.
