Robots learning to adapt and thrive when faced with new challenges
Image credit: UPMC
Robots are learning old skills in new ways, but computer power and reliability may slow down progress.
If there is a given in technology demonstrations, it’s that they go wrong. A visit by politicians to the Pierre and Marie Curie University (UPMC) in Paris seemed destined to follow the rule.
In preparing the laboratory the night before, someone decided to polish the floor. For a robot that had learned to walk successfully on the unpolished surface, even with limbs missing, this could easily have been one change too far. The robot at first missed its footing as its legs splayed out on the shiny surface. Thankfully, the same software that let it lose legs and still find a way to move forward proved successful in this case. The robot scrolled through its memory looking for gaits that might work, tried them out and found one that would let it continue on its way. Demonstration saved.
To develop the TBR-Evolution robot, the UPMC researchers borrowed an idea from biology: that animals use acquired knowledge to deal with new situations. The robot was able to draw on thousands of computer simulations of gaits using different numbers of legs and orientations. When faced with a situation it had not encountered before, the robot paused and sifted through its matrix of gaits until it found candidates for a better way of walking.
If the first attempt failed, it worked out which other gaits would not work, and then settled on a shortlist of probable winners until it found the one that would work. In one experiment, having had a couple of limbs removed to test the software, the robot ended up flipping itself over so that, in effect, it could walk on its knees.
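The try, rule out, shortlist loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the UPMC team's code: the function name `adapt`, the one-number gait descriptors, the similarity weighting and every speed value are hypothetical, chosen only to show how one failed trial can discount similar gaits while leaving very different ones on the shortlist.

```python
import math


def adapt(gait_map, try_gait, good_enough, length_scale=0.5, max_trials=20):
    """Trial-and-error search over a precomputed map of gaits.

    gait_map: dict mapping a gait descriptor (tuple of floats) to the
              speed predicted for it in simulation (hypothetical units).
    try_gait: callable that runs a gait on the real robot and returns
              the speed actually measured.
    """
    expected = dict(gait_map)   # current belief about every gait
    untried = list(gait_map)    # gaits not yet tested on the robot
    best_gait, best_speed = None, float("-inf")
    trials = 0
    while untried and trials < max_trials:
        # Try the gait currently believed to be the most promising.
        candidate = max(untried, key=lambda g: expected[g])
        untried.remove(candidate)
        measured = try_gait(candidate)
        trials += 1
        if measured > best_speed:
            best_gait, best_speed = candidate, measured
        if measured >= good_enough:
            break
        # The gait underperformed: lower expectations for similar gaits,
        # which in effect rules them out without testing each one.
        shortfall = expected[candidate] - measured
        for other in untried:
            distance = math.dist(other, candidate)
            similarity = math.exp(-distance ** 2 / (2 * length_scale ** 2))
            expected[other] -= similarity * shortfall
    return best_gait, best_speed, trials


# Hypothetical map: three similar fast gaits and two very different slower ones.
simulated = {(0.0,): 1.0, (0.1,): 0.95, (0.2,): 0.9, (2.0,): 0.6, (2.1,): 0.55}


def damaged_robot(gait):
    # Assumed damage: the first family of gaits barely moves the robot.
    return 0.05 if gait[0] < 1.0 else simulated[gait]


print(adapt(simulated, damaged_robot, good_enough=0.5))
```

In this toy run the first failed trial is enough to discount the two neighbouring gaits, so the second trial jumps straight to the other family of gaits and succeeds.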
UPMC researcher Jean-Baptiste Mouret says: “Upside down was in the map: we tried to find as many ways for it to walk as possible. The move inspired the next robot: one that’s bigger with feet on both sides.”
Although robots able to evolve new behaviours already existed when Mouret of UPMC and colleagues put TBR-Evolution together, the ability to put acquired knowledge into the machine from hours of earlier simulation greatly accelerated the process. Acquired knowledge is driving the software behind other robots in development to make them more flexible.
“What distinguishes this new generation of robots is that they can adapt,” says Professor David Lane of the University of Edinburgh and director of the Edinburgh Centre for Robotics. The team led by Lane wrote a report on robotics and autonomous systems for the Lloyd’s Register Foundation, in which Lane describes the coming generation of machines as “the arms, legs and sensors of big data working in the Internet of Things”.
The neural network represents one way of connecting big data to control, though it faces big challenges. First conceived more than 50 years ago, artificial neural networks have swung in and out of favour in AI research. Neural networks saw a flurry of interest in the 1990s because of their ability to self-learn with very little supervision. But the technique fell by the wayside, beaten on benchmarks for speech and image recognition by more supervision-intensive approaches. It proved difficult to scale up the size of the neural networks for them to be able to cope with complex, high-resolution inputs.
Work by Geoffrey Hinton and Ruslan Salakhutdinov of the University of Toronto reinvigorated the neural network idea. They found a way to let the networks do much more sophisticated analysis of images and other data using stacks of 2D arrays of simulated neurons. These stacks, which are much deeper than the two- or three-layer stacks of the original concept, lent the reworked technique its popular name: ‘deep learning’.
The multiple layers complicate the job of training the neural networks - it was finding a way to deal with this that gave Hinton and Salakhutdinov their breakthrough. The training process at least became tractable, albeit extremely time consuming on conventional computer processors.
It was the advent of graphics processing units (GPUs) that could be reprogrammed for tasks other than spraying textured polygons into a frame that gave the neural network an opportunity to make a comeback. Researchers such as Dan Ciresan and colleagues at the Swiss research institute IDSIA found that general-purpose graphics processors could crunch through the immense demand for floating-point arithmetic that came from a new generation of neural networks.
Ciresan says that in one problem, where deep learning was deployed on microscope images to try to spot cells in the process of division, the GPU version took three days versus the five months needed for a conventional computer processor such as an Intel Xeon.
Ciresan’s work showed that it was possible for a well-trained neural network to be better at recognising road signs than humans. The network his team trained was able to pull the correct meaning from signs that were bleached and damaged almost beyond recognition - employing visual cues that humans do not tend to use. But even with GPUs, the computing demand is intense.
In Edinburgh, Lane notes: “We’ve just spent £120,000 on a machine-learning box full of GPUs. It’s big and it’s hot. That’s the reason why Siri is in the cloud. Because it won’t fit in your cellphone.”
The GPU is not the only option. Baidu and Microsoft have used programmable hardware devices made by Altera and Xilinx to run neural networks. Although these chips lack the peak performance of GPUs, they are more energy-efficient and so easier to cool when deployed en masse in server farms. Their performance is generally adequate for inferencing. Training, which is performed far less often than the inferencing work carried out during image searches, can be offloaded to dedicated machines loaded with GPUs.
Having been used to implement speech recognition for systems like Siri and image recognition for smarter searching by the likes of Baidu, Google and Microsoft, the neural network is moving into robots.
At UC Berkeley, a team led by Pieter Abbeel has developed a robot called Brett that is teaching itself to perform domestic jobs. At Google, other machines are training themselves to grab, hold and move objects, rather than requiring developers to write algorithmic code to do the job. Google research scientist Sergey Levine wrote in a blog post in August: “While initially the grasps are executed at random and succeed only rarely, each day the latest experiences are used to train a deep convolutional neural network to learn to predict the outcome of a grasp, given a camera image and a potential motor command... Observing the behaviour of the robot after over 800,000 grasp attempts, which is equivalent to about 3,000 robot-hours of practice, we can see the beginnings of intelligent reactive behaviours.”
Roke Manor Research has developed a neural-network-based threat-analysis system called Startle for the UK’s Royal Navy that incorporates another form of inspiration from the biological brain. “The concept is based on the mammalian conditioned fear response,” says Mike Hook, principal consultant at Roke.
“The traditional way for computers to analyse threats is to collect data, process it and then figure out what you’ve got. That doesn’t go on in biology. An animal doesn’t want to attend to all that data. You are not aware of much of the data feeding into your brain: you focus on the stuff that matters to you through the mechanism of attention,” Hook explains. Part of the brain, the amygdala, takes inputs from most of the body’s senses.
“It is taking a view of the vast mass of data that is coming in. When there is a sudden noise, that will cause the animal, which is grazing, to look up, twitch its ears, trying to locate the source - this idea of being able to take in a lot of data, recognise what is important and then cue higher-order processing. It also triggers the animal to get ready to run or fight. It gets the body ready while, at the same time, the higher-order processing can evaluate: is there really a threat? The architecture in Startle directly models that.”
The Startle system today is meant to give human operators better information on threats rather than to run in fully autonomous systems. It couples the machine-learning front-end with a goal-proving system, developed with input from experts, that looks more closely at any potential threats indicated by the neural network. The company sees the overall Startle system as a potential front-end to autonomous vehicles and similar robotic systems. To reduce the computational burden, Roke opted for a shallow neural network rather than deep learning, using pre-processing to make the data more suitable for machine learning.
“It is computationally efficient so we can deal with very complex environments in real time. In other applications though we might use a deep-learning front end,” Hook says.
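The two-stage arrangement described here, a cheap front-end that cues a slower, more deliberate analysis, can be illustrated with a short Python sketch. Everything in it is a hypothetical stand-in rather than Roke's design: `startle_score` replaces the shallow neural network with a hand-written score, `evaluate` replaces the goal-proving system with three simple rule checks, and the `Track` fields and thresholds are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Track:
    # Hypothetical sensor track; field names and units are assumptions.
    name: str
    speed_knots: float
    closing: bool
    range_nm: float


@dataclass
class Assessment:
    track: Track
    is_threat: bool
    evidence: list = field(default_factory=list)


def startle_score(track):
    """Cheap first pass, standing in for the shallow neural network."""
    score = 0.0
    if track.closing:
        score += 0.5
    if track.speed_knots > 30:
        score += 0.3
    if track.range_nm < 5:
        score += 0.2
    return score


def evaluate(track):
    """Slower second pass: rule checks that record why they fired."""
    evidence = []
    if track.closing:
        evidence.append("closing on own position")
    if track.speed_knots > 30:
        evidence.append(f"high speed ({track.speed_knots} kn)")
    if track.range_nm < 5:
        evidence.append(f"short range ({track.range_nm} nm)")
    # Assumed rule: two or more independent indicators count as a threat.
    return Assessment(track, is_threat=len(evidence) >= 2, evidence=evidence)


def analyse(tracks, cue_threshold=0.5):
    """Only tracks that trip the front-end receive the detailed evaluation."""
    return [evaluate(t) for t in tracks if startle_score(t) >= cue_threshold]


tracks = [
    Track("ferry", speed_knots=12, closing=False, range_nm=8),
    Track("fast boat", speed_knots=40, closing=True, range_nm=3),
]
for assessment in analyse(tracks):
    print(assessment.track.name, assessment.is_threat, assessment.evidence)
```

The point of the split is that most tracks never reach the expensive stage, and any track that does comes back with a list of reasons that an operator can review.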
The split between compute-intensive training run on a server and inferencing performed locally can make deep learning a feasible option for a new generation of mobile robots and not just those bolted to the deck of a ship or in Google’s California lab.
“The first mass-produced robot that will hit our daily life is the self-driving car,” says Samer Hijazi, senior architect in the intellectual property group at Cadence Design Systems.
Although Google has performed high-profile demonstrations of self-driving vehicles using deep learning, the technology faces a major struggle before it is adopted widely. And it might not be successful. “The technology is not widely understood by embedded systems architects, which limits the ability to deploy it,” says Hijazi. “There is currently no sufficiently high-performance, power-efficient hardware platform that is suitable for these embedded systems. An order of magnitude higher power efficiency is needed on the hardware front.”
Even with hardware more suited to running neural networks, a large amount of processing will need to be devolved to remote servers. “When it comes to deciding what can be done in the car and what needs to be done in the cloud we will need to consider both power and bandwidth cost. What will it cost us to shift data to the cloud? And what about the reliability of those transfers? How fast does the decision need to be made? How frequently do I need to do retraining?” Hijazi asks. “Who is going to be liable for that decision and the regulations around it? These are issues yet to be addressed.”
The reason for offloading processing is not just one of hardware speed and the availability of compute power. “Another reason for being on the cloud is the aggregation of data. As data is aggregated and used from many different sources, you get a better performance from systems like Siri,” says Lane.
The nature of the neural network introduces other problems that may derail the use of the technology in safety-critical systems such as self-driving cars. “Introducing AI techniques to a system that is safety-critical is fraught with problems,” says Professor John Colley of the University of Southampton.
A number of scientists and engineering groups who provided evidence to the House of Commons Science and Technology Committee pointed to the potential problems of verifying the correct behaviour of self-learning systems. Research by Google into the behaviour of its own deep-learning systems showed that a fully trained network can fail to recognise images that are, to a human, indistinguishable from those it classifies correctly. Debugging the trained network today is more or less impossible - the solution used is to continue the training to try to deal with the features in images that confuse it.
Hook says: “It’s very hard to determine why a neural network has decided what it’s decided. Perhaps certification for such standalone systems would be much closer to how we certify air-traffic controllers: by giving them lots of examples and seeing how they behave.”
The expectation that training will continue in parallel with deployment, with new training data relayed from the cloud at regular intervals, makes the verification problem more complex. Innovate UK’s evidence to the select committee argued: “No clear paths exist for the verification and validation of autonomous systems whose behaviour changes with time.”
Lane says: “We use a lot of systems that haven’t been fully verified. It would be quite strong to say that we won’t be able to verify these systems. If we get to the point where we couldn’t verify the systems for safety reasons, then safety-critical and large infrastructure companies would not be prepared to invest in them and take the risk if they can’t get the necessary insurance. That might have an impact on what is used. But just because we haven’t got something today doesn’t mean we can’t find a way to deal with the problem.”
Hook says the use of the goal-proving system behind the neural network is important to check Startle’s overall behaviour: “It’s very important we can verify that the goal-proving threat analyser is behaving in the same way that a highly trained human would think. And moreover it’s producing a body of evidence to support its overall judgement. You can then review why the system thought it was a threat. That makes it much easier to deploy than a pure neural network solution, because it provides transparency into the reasoning process.”
Difficulties with verification in fully autonomous systems may lead to another shift away from neural networks, back towards techniques being developed in parallel that are less opaque in how they come to conclusions. Lane points to a self-guiding underwater inspection robot that uses Markov processes. “The robot learns by probability. It is a powerful way of teaching robots behaviours that they can apply to things later on.”
Some researchers are focusing on Gaussian processes for another way of employing probabilistic inferencing. Proponents argue the technique can deliver good results with far less data than is required for neural networks. The resulting systems may also need much less computational horsepower.
The UPMC hexapod robot TBR-Evolution uses Gaussian processes to work out which gait it needs to use for a particular situation.
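As a rough illustration of why Gaussian processes suit small amounts of data, the Python sketch below updates a speed prediction for any gait from a single real-world trial. It rests on assumptions the article does not confirm: that the simulated speed serves as the prior mean, that a squared-exponential kernel measures similarity between gaits, and that gaits are described by a single number. The function names and all values are hypothetical.

```python
import math


def rbf(a, b, length_scale):
    """Squared-exponential kernel: near 1 for similar gaits, near 0 otherwise."""
    return math.exp(-math.dist(a, b) ** 2 / (2 * length_scale ** 2))


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def gp_posterior_mean(query, tried, measured, prior,
                      length_scale=0.5, noise=1e-3):
    """Predicted speed of `query` after measuring the gaits in `tried`.

    prior: function giving the simulated speed of a gait (the prior mean).
    """
    # Model only the gap between reality and simulation.
    residuals = [y - prior(x) for x, y in zip(tried, measured)]
    kernel = [[rbf(a, b, length_scale) + (noise if i == j else 0.0)
               for j, b in enumerate(tried)]
              for i, a in enumerate(tried)]
    weights = solve(kernel, residuals)
    k_star = [rbf(query, x, length_scale) for x in tried]
    return prior(query) + sum(k * w for k, w in zip(k_star, weights))


def simulated_speed(gait):
    # Hypothetical prior: simulation says every gait reaches speed 1.0.
    return 1.0


tried, measured = [(0.0,)], [0.1]   # one real trial that went badly
print(gp_posterior_mean((0.1,), tried, measured, simulated_speed))
print(gp_posterior_mean((5.0,), tried, measured, simulated_speed))
```

One disappointing measurement pulls down the prediction for nearby gaits while leaving distant ones essentially at their simulated value, which is how a handful of trials can say something about thousands of stored gaits.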
The data held by TBR-Evolution boils down to 4 bytes for each of the 36 parameters it holds for a particular walking style. Mouret says: “In that robot, we have about 13,000 different gaits. I don’t think we will be limited by space. The robot could work with other types of machine-learning technique. But we opted for Gaussian processes because they are very good when you have small amounts of data to deal with.”
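The storage claim is easy to check with back-of-the-envelope arithmetic. The short Python calculation below uses only the three figures quoted above and treats "about 13,000" as exactly 13,000; it ignores any indexing or metadata overhead, which the article does not describe.

```python
BYTES_PER_PARAMETER = 4
PARAMETERS_PER_GAIT = 36
GAIT_COUNT = 13_000  # "about 13,000" in the article, taken as exact here

bytes_per_gait = BYTES_PER_PARAMETER * PARAMETERS_PER_GAIT
total_bytes = bytes_per_gait * GAIT_COUNT

print(bytes_per_gait)            # 144
print(total_bytes)               # 1872000
print(total_bytes / 1_000_000)   # 1.872 (megabytes)
```

At under 2 MB for the whole map, the gait parameters would fit comfortably in the memory of even a modest embedded controller.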
Lane sees the uncertainty as to which technique may emerge as the verifiable winner as an opportunity for organisations outside the Silicon Valley bubble of autonomous driving: “It’s not true that Google has done it all already. They have just done some demonstrations. What Google has done is show the way. There is competition out there and everything to play for.”