While concern over the cost of test is changing attitudes to when and how it’s done, the cost of failure means that we’re now testing everything, everywhere.
We are entering the era of total test: products stay on test throughout their lifecycle, not just at the end of the production line, in an effort to avoid failures in the field. Industries such as automotive and aerospace illustrate how attitudes to test have changed. Why? The high cost of product recalls and the increasing reliance on automation for safety.

Stefan Singer, automotive field applications engineer at chipmaker Freescale, says car makers are keen to eradicate component failures: “The trend towards zero defects is a big topic in automotive: customers have moved from measuring the number of failed parts per million to parts per billion.”
This does not necessarily mean that car producers and other manufacturers have suddenly started paying more for the higher quality. “Typically automotive customers look for very cheap solutions,” says Singer.
The problem for manufacturers is that exhaustive test is an expensive process. In high-volume electronics, for example, the cost of testing can easily outpace that of producing parts in the fab.
Chris White, product development manager of semiconductor test at National Instruments, said at NI Week: “The cost of manufacture has been declining but the cost of test has remained stubbornly flat.”
Rebeca Jimenez, vice president of worldwide test at chipmaker IDT, adds: “Our main challenge is reducing the cost of test and increasing test utilisation.”
Hafeez Najumudeen, product marketing manager for power meters and analysers at Yokogawa Europe & Africa, says: “The need to increase efficiency has taken top priority. Automation is the overall trend even for simple end-of-line tests.”
Even with automation, the tests themselves take time. How long can you afford to let a product sit on a manufacturing test rig before deeming it good or bad? One option is simply to do more in parallel, which is what companies such as Averna and IDT have done with custom rigs based on NI’s modular test hardware.
“Customers want to reduce test time by testing in parallel with multiple parts on the same station. We run as many tests as possible at the same time to reduce the cost of test and have reduced system cost by 40 per cent with this system,” Jean-Levy Beaudoin, vice president of sales at Averna, said at NI Week, pointing to a dual-device test rig built for automotive subsystem supplier Continental.
Rigs developed by IDT can test up to four chips at the same time. Because the tests do not all take the same amount of time, IDT test director Glen Peer said the throughput gain is not quite fourfold over a single-site tester, but it still delivers significant cuts in test cost. Further gains, Beaudoin says, can be made by using intelligent fixtures that detect different parts and adjust the tests to suit, so that equipment does not have to be duplicated and then sit idle for long periods because certain products are not coming off the line in a consistent order.
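The throughput arithmetic behind multi-site testing is easy to sketch. The toy model below uses purely illustrative numbers (not IDT's or Averna's figures; `throughput_per_hour` is an invented helper) to show why a four-site rig with slightly unequal test times gains less than fourfold: in a lockstep handler, every site waits for the slowest before the next batch can load.

```python
# Toy model of multi-site parallel test throughput. All numbers are
# illustrative, not real production figures. In a lockstep handler every
# site loads together, so each cycle lasts as long as the slowest site's
# test plus one shared index (load/unload) time.

def throughput_per_hour(test_times_s, index_time_s):
    """Devices tested per hour for one group of parallel sites."""
    cycle_s = max(test_times_s) + index_time_s   # wait for the slowest site
    return len(test_times_s) * 3600 / cycle_s

single = throughput_per_hour([10.0], 2.0)
quad = throughput_per_hour([10.0, 10.5, 9.8, 11.0], 2.0)  # unequal test times

print(f"single-site: {single:.0f} devices/hour")
print(f"quad-site:   {quad:.0f} devices/hour (gain {quad / single:.2f}x, not 4x)")
```

With these made-up timings the gain works out at roughly 3.7x rather than 4x, which is the shape of the shortfall Peer describes.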
Welcome to the real world
Not even in industries such as automotive, which chase the zero-defect point, can parts be exhaustively tested once they are on the rig. Instead, manufacturers have to look for proxies that give an insight into the likelihood of a part failing at some point in the future. That means digging past the actual functions and into the mechanisms of failure, for both mechanical parts and their electronic controls and subsystems.
The growing sophistication of electronics presents major issues. It may seem to make sense just to test explicit functions, but this is becoming less likely to pinpoint potential failures: it is more or less impossible to test a product 100 per cent functionally in any reasonable amount of time. For some systems, the real world is simply too slow.
Take the example of navigation systems that use the Global Positioning System. One way to test GPS receivers is simply to turn on the receiver and see what happens. But this sort of ‘open sky’ test takes time. In developing its test systems for GPS receivers, Spirent focused on attributes of the signal likely to trip up failing modules.
For other telecom systems, Agilent developed its EXTC communications system to exercise multiple radio interfaces at once, under conditions that a handset would rarely encounter in real life.
Robert Hum, general manager of Mentor Graphics’s deep sub-micron IC division, says some tests need to find ways to work out what is going on inside the package and not just decide that a product is OK if it performs user-visible functions correctly: “If you test only at the logic level, you will never get to zero defects. You need to go down to the transistor and the layout. Every fab’s process will give you a different defect spectrum. And the tests developed for logic won’t help you for memory or analogue circuits, which have different failure modes.”
Traditionally, electronic logic tests focused on abstract models of failures, using concepts such as stuck-at faults. The tester would be programmed to look for conditions where logic gates seemed to give the same answer no matter the input – their outputs would be stuck at a certain logic value. But this model has run into problems in recent years. One is that many of the gates are hidden behind others and cannot easily be tested individually. And the typical defects themselves have changed. Many of the problems that plague ICs today are connection issues, where a wiring element shorts out or touches a conductor it is not meant to. Tests have begun to focus on the physical consequences of these problems.
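The stuck-at model can be illustrated in a few lines. This is a deliberately tiny sketch, not production test-generation software: a two-gate circuit with invented net names, where a fault forces one net to a fixed logic value, and an input vector "detects" the fault when the faulty output differs from the fault-free one.

```python
# Toy stuck-at fault simulation for the circuit y = (a AND b) OR c.
# A fault forces one net ('a', 'b', 'c', internal 'n1' or output 'y')
# to a fixed value; a vector detects it when the output changes.
from itertools import product

def simulate(vector, fault=None):
    def force(net, value):
        # Override the net's value if it is the one stuck at 0 or 1.
        if fault is not None and fault[0] == net:
            return fault[1]
        return value
    a, b, c = (force(net, v) for net, v in zip("abc", vector))
    n1 = force("n1", a & b)       # internal net: a AND b
    return force("y", n1 | c)     # output net:   n1 OR c

# For every stuck-at fault, list the input vectors that detect it.
for net, stuck in product(("a", "b", "c", "n1", "y"), (0, 1)):
    detecting = [v for v in product((0, 1), repeat=3)
                 if simulate(v, fault=(net, stuck)) != simulate(v)]
    print(f"{net} stuck-at-{stuck}: detected by {detecting}")
```

Even in this tiny circuit, the internal net `n1` can only be observed through the OR gate in front of it, a miniature version of the visibility problem described above.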
Reading the signs
Steve Pateras, product marketing manager at Mentor, describes an approach that analyses defects at the level of individual logic cells: “It’s a concept that was not developed for automotive but turns out to be very good for it. We look at each logic cell in the library and we analyse its layout and understand the types of defect that can occur and then inject all possible shorts and opens.
“We then run an analogue simulation of all the possible combinations and see which ones result in a different output to what was intended. This turns out to be much more effective than just looking at stuck-at faults.”
For assembled equipment, similar pressures apply. How can you speed up production test and target the issues that mark out likely failures, even in units that do not yet appear to be broken? The key is to look for signs of problems known to lead to failures rather than to test on real-world signals. In power systems, for example, spurious harmonics in the electrical waveform, analysed in the frequency domain, can be the fingerprint of a looming problem. In more general-purpose electronics, unusual current consumption can be a warning sign of trouble ahead.
Najumudeen says harmonics analysis is more often used during development to determine the effectiveness of different circuit designs but “in the case of critical products such as power systems at the border of public energy supply networks, customers do measure harmonics at the end of the production line”.
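As a rough illustration of this kind of frequency-domain fingerprinting, the sketch below (using numpy; the 8 per cent harmonic injection and the 5 per cent flag threshold are invented for illustration, not Yokogawa's criteria) synthesises a clean 50 Hz waveform and a 'suspect' one carrying a fifth-harmonic component, then compares each harmonic's magnitude against the fundamental.

```python
# Toy harmonic 'fingerprint' check on a synthesised mains waveform.
# The 8% fifth-harmonic injection and 5% flag threshold are invented.
import numpy as np

fs = 10_000                       # sample rate, Hz
f0 = 50                           # mains fundamental, Hz
t = np.arange(0, 1.0, 1 / fs)    # one second of samples -> 1 Hz per FFT bin

healthy = np.sin(2 * np.pi * f0 * t)
suspect = healthy + 0.08 * np.sin(2 * np.pi * 5 * f0 * t)  # 8% 5th harmonic

def harmonic_ratio(signal, harmonic):
    """Magnitude of the nth harmonic relative to the fundamental."""
    spectrum = np.abs(np.fft.rfft(signal))
    return spectrum[harmonic * f0] / spectrum[f0]  # bins are 1 Hz apart here

for name, sig in (("healthy", healthy), ("suspect", suspect)):
    ratio = harmonic_ratio(sig, 5)
    verdict = "FLAG" if ratio > 0.05 else "pass"
    print(f"{name}: 5th harmonic = {100 * ratio:.1f}% of fundamental -> {verdict}")
```

Because the capture window is exactly one second, each FFT bin lands on an integer frequency and the harmonic magnitudes can be read off directly without windowing.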
Long-term learning can further optimise testing, by identifying problems in returned products that were missed during test or by locating tests that are redundant or simply unnecessary. In the semiconductor industry, yield analysis using test data is becoming one of the ways companies try not only to drive down test cost but also to make designs less failure-prone.
Jim Robinson, general manager of the Internet of Things solutions group at Intel, says of his company: “We measure everything. Whatever you think can happen within a factory, we have a spreadsheet on it. We believe that because of the massive savings you can get by extracting data, measuring it and making decisions, it is possible to completely disrupt and change manufacturing.”
Luke Schreier, senior manager for automated test product marketing at NI, says analysis can help streamline test programmes as well as detect latent issues: “You may see in the data that a certain test has never failed in a million, so why not take it out altogether?”
Test need not be confined to the end of the production line. Airbus wants to move some test functions into the processes themselves for its ‘factory of the future’, using tools that have their own computerised measurement and communications functions.
Intelligence in components
Bernard Duprieu, head of manufacturing engineering R&D at Airbus, said at NI Week: “In our processes we have a lot of manual tasks. We don’t have live input; it’s dead input: we have to datalog manually. With cyber-physical systems we can simplify this task and determine in real time, for example, the torque applied to a fastener.”
David Fuller, vice president of R&D for application and embedded software at NI, added that such intelligence could potentially be added to critical fasteners: “It would be great if a screw being tightened says ‘I need to be fixed’. A robot then zooms over and turns it to the right torque.”
Such fasteners could report their status over the lifetime of the final product as test and measurement begin to head out of the factory. This trend is partly driven by safety standards such as ISO 26262 for the automotive industry and partly by the growing ability to interrogate systems remotely over the internet.
Joe Salvo, director of GE Global Research, says: “Every part that is going to be made will have a unique pedigree that it will carry from the day it’s produced to the day it’s decommissioned.” The qualities of that pedigree will change over time. A big problem that microelectronics now faces is that of ageing: despite being solid-state, the componentry inside the more advanced chips in use today can degrade to the point of failure long before the end of its estimated service life. And each system will age differently.
ARM fellow Rob Aitken says: “Ageing is very workload-dependent: what it is running, and how long it runs and how long it sits there doing nothing.”
Hum adds: “A lot depends on the thermal environment of each little transistor. We’ve seen a couple of weird cases where we get one transistor ageing faster than its buddies, so you start to get timing skew. The transistor itself doesn’t fail but the circuit does because things get out of sync downstream. You can analyse and predict the effects but you have to know where to look.”
Those ageing effects need to be tested over time. Pateras says: “A key aspect of ISO 26262 is long-term reliability. You want to be able to look for failure mechanisms over time and that requires you to periodically test the part. That requires built-in self-test mechanisms, where you apply test patterns generated on-chip.”
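A classic way to generate test patterns on-chip, as Pateras describes, is logic BIST: a linear-feedback shift register (LFSR) produces pseudo-random patterns and a multiple-input signature register (MISR) compacts the responses into a signature that is compared against a known-good value. The sketch below is a toy software model of the idea (a 4-bit LFSR and an invented stand-in for the circuit under test), not any vendor's implementation.

```python
# Toy logic BIST: an LFSR generates pseudo-random patterns on-chip and a
# MISR compacts the responses into a signature. The 'circuit under test'
# is invented stand-in logic, not real silicon.

def lfsr_patterns(seed=0b1001, n=15):
    """4-bit maximal-length LFSR (taps at bits 3 and 2): 15 distinct patterns."""
    state = seed
    for _ in range(n):
        yield state
        bit = ((state >> 3) ^ (state >> 2)) & 1
        state = ((state << 1) | bit) & 0xF

def circuit_under_test(pattern, stuck=False):
    out = (pattern ^ (pattern >> 1)) & 0xF      # stand-in combinational logic
    return out | 0b0100 if stuck else out       # optional stuck-at-1 on bit 2

def misr_signature(outputs, poly=0x1021):
    """16-bit multiple-input signature register compacting a response stream."""
    sig = 0
    for out in outputs:
        feedback = poly if sig & 0x8000 else 0
        sig = ((sig << 1) & 0xFFFF) ^ feedback ^ out
    return sig

golden = misr_signature(circuit_under_test(p) for p in lfsr_patterns())
faulty = misr_signature(circuit_under_test(p, stuck=True) for p in lfsr_patterns())
print(f"golden signature: {golden:#06x}")
print("self-test:", "PASS" if faulty == golden else "FAIL (fault detected)")
```

Only the compact golden signature needs to be stored on-chip, which is what makes periodic in-field self-test of the kind ISO 26262 calls for practical.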
Sanity checks will be needed to ensure that the systems don’t provide false diagnoses because they have insufficient information. Aitken says: “I was thinking of the problem of error reporting with a car I owned. When I was driving along it piped up and said the ABS is broken. Not only that, it said the traction control is broken and engine control is broken. I ignored it and after a while the light went off. What really happened was that the battery was low.”
As long as the sensors are verified to be sending worthwhile data they can provide more information about the overall behaviour of the system they are in. Terry Wilson, IT principal at Duke Energy, says the company has installed in-field test hardware to monitor its network of equipment remotely. “Using this we found a bearing defect on a blower motor in one of our locations. It allowed plenty of time to plan a repair months ahead during downtime. All of the analysis and detection were performed remotely.”
The use of early alarms can inform more than maintenance. NI president and co-founder James Truchard uses the example of a combine harvester fitted with sensors and communications to help “decide when to service it. You could use the data to see what was working well and see where ears of corn are being dropped and then use the data to design the next generation of combine harvester.”
Pateras adds: “We see the IoT [Internet of Things] as a huge area for these techniques. The IoT requires knowledge that the connected devices are working properly. BIST provides that capability.”
Salvo says such online instrumentation will be vital to keeping products going: “The physical world is going to evolve at the speed of software. You are going to be forced to keep upgrading and innovating or you are going to die.”