Low-power chip will bring voice recognition to IoT devices
MIT researchers have developed a low-power chip specialised for automatic speech recognition, which they believe could significantly expand the range of gadgets that can be voice-operated.
Although many modern smartphones are now capable of speech recognition, they require roughly one watt of power to process commands. The new chip, in comparison, requires between 0.2 and 10 milliwatts, depending on the number of words it has to recognise.
In a real-world application, that probably translates to power savings of 90 to 99 per cent, which could make voice control practical for relatively simple electronic devices.
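The raw arithmetic behind those figures can be checked with a quick back-of-the-envelope calculation; the sketch below simply compares the quoted power draws, so the chip-level savings it prints come out slightly higher than the hedged "real-world" 90 to 99 per cent in the article.

```python
# Compare the ~1 W quoted for smartphone speech recognition with the
# chip's 0.2-10 mW range (the spread reflects vocabulary size).
SMARTPHONE_W = 1.0                 # ~1 watt for phone-based recognition
CHIP_RANGE_W = (0.0002, 0.010)     # 0.2 mW to 10 mW

for chip_w in CHIP_RANGE_W:
    savings = 1 - chip_w / SMARTPHONE_W
    print(f"{chip_w * 1000:.1f} mW -> {savings:.2%} chip-level saving")
```

At the chip level the saving is 99 per cent or better; the article's lower 90 per cent bound presumably accounts for the rest of the system around the chip.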
That includes power-constrained devices that have to harvest energy from their environments or go months between battery charges. Such devices form the technological backbone of the Internet-of-Things (IoT), which is the idea that vehicles, appliances, civil-engineering structures, manufacturing equipment, and even livestock will soon have sensors that report information directly to networked servers, aiding with maintenance and the coordination of tasks.
“Speech input will become a natural interface for many wearable applications and intelligent devices,” said Anantha Chandrakasan, who led the MIT group. “The miniaturisation of these devices will require a different interface than touch or keyboard. It will be critical to embed the speech functionality locally to save system energy consumption compared to performing this operation in the cloud.”
“I don’t think that we really developed this technology for a particular application,” said Michael Price, who led the design of the chip. “We have tried to put the infrastructure in place to provide better trade-offs to a system designer than they would have had with previous technology, whether it was software or hardware acceleration.”
Currently, the best-performing speech recognisers are based on neural networks – virtual networks of simple information processors roughly modelled on the human brain. Much of the new chip’s circuitry is concerned with implementing speech-recognition networks as efficiently as possible.
But even the most power-efficient speech recognition system would quickly drain a device’s battery if it ran without interruption. So the chip also includes a simpler ‘voice activity detection’ circuit that monitors ambient noise to determine whether it might be speech. If the answer is yes, the chip fires up the larger, more complex speech-recognition circuit.
The prototype chip actually contains three different voice-activity-detection circuits, with different degrees of complexity and power demands.
Which circuit is most power efficient depends on context, but in tests simulating a wide range of conditions, the most complex of the three circuits led to the greatest power savings for the system as a whole. Even though it consumed almost three times as much power as the simplest circuit, it generated far fewer false positives; the simpler circuits often chewed through their energy savings by spuriously activating the rest of the chip.
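The trade-off described above, in which a cheap always-on detector wakes a far more expensive recogniser, can be sketched as an average-power model. All names, power figures, and rates below are hypothetical illustrations, not specifications of the MIT chip.

```python
# Duty-cycling model of voice-activity gating: the detector runs
# continuously, while the full recogniser runs only during real speech
# plus whatever false positives the detector lets through.
VAD_POWER_MW = 0.05   # always-on voice-activity detector (hypothetical)
ASR_POWER_MW = 5.0    # full recogniser while awake (hypothetical)

def total_power_mw(speech_fraction: float, false_positive_rate: float) -> float:
    """Average system power for given speech and false-trigger duty cycles."""
    asr_duty = speech_fraction + false_positive_rate
    return VAD_POWER_MW + asr_duty * ASR_POWER_MW

# A pickier detector that draws three times the power but rarely
# misfires can still win overall -- the outcome the MIT tests found.
simple_vad = total_power_mw(0.02, 0.20)                     # cheap, trigger-happy
complex_vad = 3 * VAD_POWER_MW + (0.02 + 0.01) * ASR_POWER_MW
print(f"simple VAD:  {simple_vad:.2f} mW average")
print(f"complex VAD: {complex_vad:.2f} mW average")
```

With these illustrative numbers, the false positives of the cheap detector dominate the energy budget, so the costlier detector yields the lower system average.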
But even pared-down speech-recognition networks are too large to fit into the chip's on-board memory, which is a problem because fetching data from off-chip memory costs far more energy than retrieving it from local stores.

So the MIT researchers also designed the chip to minimise the amount of data it has to retrieve from off-chip memory.
Last year, Barclays bank announced it would start storing the vocal imprints of its personal banking customers in order to allow them to clear security checks by just speaking rather than remembering a password.