Google faces challenge of ‘brittle’ and opaque AI, says internet pioneer
Internet pioneer and Google vice-president Vint Cerf has appeared before a House of Lords committee, defending the approach Google takes towards search ranking and content moderation.
The Lords Select Committee on Democracy and Digital Technology had invited Cerf – recognised as a ‘father of the Internet’, thanks to his role in developing the internet protocol suite – to answer oral questions about how Google designs its search and recommendation algorithms.
Google uses a combination of machine learning algorithms and several thousand human moderators. Both are guided by regularly updated criteria laying out what Google considers to be high-quality content or low-quality/harmful content.
Several committee members questioned how authorities such as Ofcom (which will be assigned responsibility for regulating online content) could be certain that Google is behaving responsibly in how it treats content; for example, ensuring that it presents unbiased search results.
While acknowledging that earning trust requires transparency, Cerf said it is a challenge for Google to be fully transparent about how search results are ranked. He cited the opacity of machine learning algorithms, issues of scale, and the need to prevent people from gaming the algorithms in order to rise up search rankings.
“Trying to explain what the algorithms are, from my point of view as a computer scientist, is quite difficult because of the introduction of machine learning and neural networks,” he said.
Cerf said that it is easiest to understand how Google judges content by looking at its publicly available criteria for ranking. He also suggested that independent researchers could build their own machine learning algorithms, based on the same criteria and training sets used by Google, to see whether the rankings those algorithms produce match Google’s: “That would actually be quite helpful,” he said.
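As a rough illustration of the comparison Cerf describes, an independent researcher could score how closely two rankings of the same results agree. The sketch below uses Kendall's rank correlation, which is one common choice and not something Cerf or Google specified. The function name and the example page labels are hypothetical.

```python
from itertools import combinations


def kendall_tau(ranking_a, ranking_b):
    """Kendall rank correlation between two orderings of the same items.

    Returns 1.0 for identical orderings and -1.0 for exactly reversed ones.
    Assumes no ties: each item appears exactly once in each ranking.
    """
    if len(set(ranking_a)) != len(ranking_a) or len(set(ranking_b)) != len(ranking_b):
        raise ValueError("rankings must not contain duplicates")
    if set(ranking_a) != set(ranking_b):
        raise ValueError("rankings must contain the same items")

    position_in_b = {item: index for index, item in enumerate(ranking_b)}
    concordant = 0
    discordant = 0
    # combinations() yields pairs (x, y) where x is ranked above y in ranking_a.
    for x, y in combinations(ranking_a, 2):
        if position_in_b[x] < position_in_b[y]:
            concordant += 1  # both rankings agree on the order of this pair
        else:
            discordant += 1  # the rankings disagree on this pair
    total_pairs = concordant + discordant
    if total_pairs == 0:
        return 1.0  # zero or one item: trivially in agreement
    return (concordant - discordant) / total_pairs


# Hypothetical example: a reference ranking and a researcher's replica.
reference = ["page_a", "page_b", "page_c", "page_d"]
replica = ["page_a", "page_c", "page_b", "page_d"]
agreement = kendall_tau(reference, replica)
```

A score near 1 would suggest the replica orders results much as the reference does; it would not show that the two systems reach those orderings for the same reasons.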
Chair of the committee Lord Puttnam raised the point that Google search results are extremely influential and that occasional errors – such as favourably ranking a fake news story claiming that British Muslims don’t pay council tax, appearing to legitimise the accusation – had the potential to “set off a riot”. Cerf acknowledged that the machine learning algorithms responsible for ranking search results are “brittle” and are easily tricked into promoting content which a human moderator would instantly recognise as low quality or outright harmful.
“There are cases where the change of just two or three pixels which a human being would not recognise as a change causes a [hypothetical image recognition] system to say, 'It’s not a cat, it’s a fire engine' and your reaction to this is 'WTF? How did that possibly happen?' The answer is these systems do not recognise things the same way we do,” he explained.
“We abstract from images, we recognise cats as having little triangular ears, they’re furry, they have a tail and we’re pretty sure fire engines don’t. But the mechanical system of recognition in machine learning systems doesn’t work the same way our brains do, so we know they can be brittle.”
“We are working on trying to remove those problems or identify where they occur and that’s still an area of significant research. So […] are we conscious of the sensitivity and potential failure modes? Yes. Do we know how to prevent all those failure modes? No, not yet, but when they get reported you can be sure there is a reaction, which is: can we adjust the system to eliminate that particular brittleness?”
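The brittleness Cerf describes can be sketched with a toy example. The code below is a deliberately tiny linear classifier with made-up weights, not a neural network and not any system Google uses, and every name and number in it is hypothetical. Unlike Cerf's two-or-three-pixel case, it nudges every pixel by a small amount, but it shows the same effect: a change too slight to alter what a person would see flips the label.

```python
# Hypothetical 8-pixel "image" with brightness values between 0 and 1,
# and made-up weights for a toy linear classifier.
WEIGHTS = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, -0.5]
BIAS = 0.0
IMAGE = [0.6, 0.5, 0.6, 0.5, 0.6, 0.5, 0.6, 0.5]
EPSILON = 0.06  # largest change allowed to any single pixel


def score(weights, bias, pixels):
    """Weighted sum of pixel values; a positive score means 'cat'."""
    return sum(w * p for w, p in zip(weights, pixels)) + bias


def classify(weights, bias, pixels):
    return "cat" if score(weights, bias, pixels) > 0 else "fire engine"


def perturb(weights, pixels, epsilon):
    """Shift each pixel by at most epsilon in the direction that lowers the score.

    Pixels with a positive weight are dimmed and pixels with a negative
    weight are brightened, so every small change pushes the score the same way.
    """
    def sign(value):
        return (value > 0) - (value < 0)

    return [
        min(1.0, max(0.0, p - epsilon * sign(w)))  # keep values within [0, 1]
        for w, p in zip(weights, pixels)
    ]


adversarial_image = perturb(WEIGHTS, IMAGE, EPSILON)
original_label = classify(WEIGHTS, BIAS, IMAGE)
adversarial_label = classify(WEIGHTS, BIAS, adversarial_image)
```

Each individual change is small, but because all of them push the score in the same direction their effects add up, which is one reason models with many inputs can be sensitive to perturbations a person would not notice.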
Google’s UK policy head Kate O’Donovan, who also gave evidence during the same session, described the difficulties Google faces in handling content which cannot be straightforwardly categorised as safe or harmful, such as YouTube videos of 'drill' music, a genre of 'trap' music popular in London and occasionally associated with real-world gang violence. Making well-informed judgements on which of these videos are legitimate artistic expression and which are threats requires contributions from the Metropolitan Police, the Mayor of London’s office, YouTube channels and musicians, she explained.
Cerf added that a serious challenge is that this sort of subtle content moderation needs to be undertaken in hundreds of languages, all of which have their own rapidly-evolving slang. “I’m always astonished at the amount of language that we have to understand in some sort of mechanical way to detect these problems,” he said.
In contrast to some recent hearings in which representatives of tech giants have appeared before lawmakers to answer difficult questions, the hearing remained cordial, with committee members treating Cerf with a degree of reverence (describing him as the "David Attenborough of the digital world"), and Cerf maintaining an urbane manner throughout.
The only hint of conflict emerged towards the end of the hearing when crossbench peer Lord Mitchell expressed discomfort about Google’s “monopolistic position” in the search market. Lord Mitchell asked why Google deserved to be trusted to handle search rankings under these circumstances. “It just feels like you have all the aces in this situation and I would like to feel that there are moderating forces,” he said.
“The fact that we have a lot more users is not the same as having no competitors,” Cerf countered.