Chinese neural network beats humans in reading comprehension test
Image credit: REUTERS/Steve Marcus
A machine learning system developed by Alibaba’s AI (artificial intelligence) research arm edged above human participants in a standard reading comprehension test. This is the first time an AI has beaten humans in a language-related assessment.
Machine learning software – which learns to detect patterns and perform tasks increasingly well by processing massive amounts of data – frequently makes headlines when beating humans at specific, well-defined tasks, such as when IBM’s Deep Blue supercomputer beat world champion Garry Kasparov in a chess match.
However, deeply human skills such as understanding language and emotion have proved more of a challenge to machine learning systems.
Alibaba has grown to become one of the largest companies in the world by providing online sales and payment services. More recently, it has been exploring more advanced technological services such as cloud computing. In October 2017, it was revealed that the company was investing $15bn (£10.9bn) in research and development in “foundational and disruptive technology”, including AI, the internet of things and quantum computing.
Alibaba AI’s new machine learning system is a deep neural network: a multi-layered machine learning system modelled approximately on the structure of the human brain. Deep learning could allow for far faster machine learning.
The network scored higher than human participants on the Stanford Question Answering Dataset (SQuAD). This dataset includes more than 100,000 question-and-answer pairs, based on hundreds of Wikipedia articles. It has been used by Google, IBM, Facebook and many academics to test the performance of their machine learning models.
In a test run on 11 January, Alibaba’s neural network scored 82.44 on the test, compared with an average human score of 82.304.
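To give a sense of how a score like this can be produced, here is a minimal sketch that assumes the figures are exact-match percentages: the share of questions where the predicted answer matches one of the human reference answers after light normalisation. The article does not say which metric was reported, and the function names and toy data below are invented for illustration rather than taken from the benchmark's own scoring script.

```python
import re
import string


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, references: list[str]) -> bool:
    """True if the prediction equals any reference answer after normalisation."""
    return any(normalize(prediction) == normalize(ref) for ref in references)


def exact_match_score(predictions: dict[str, str],
                      answers: dict[str, list[str]]) -> float:
    """Percentage of questions whose prediction exactly matches a reference."""
    hits = sum(exact_match(predictions.get(qid, ""), refs)
               for qid, refs in answers.items())
    return 100.0 * hits / len(answers)


# Invented toy data (hypothetical question IDs and answers).
answers = {
    "q1": ["gravity", "the force of gravity"],
    "q2": ["water vapour condensing"],
    "q3": ["Wikipedia"],
    "q4": ["11 January"],
}
predictions = {
    "q1": "Gravity.",        # matches after lowercasing and punctuation removal
    "q2": "condensation",    # close in meaning, but not an exact match
    "q3": "the Wikipedia",   # matches once the article "the" is dropped
    "q4": "11 January",
}
score = exact_match_score(predictions, answers)
print(f"Exact match: {score:.1f}")
```

Under this kind of scoring, an answer that is close in meaning but worded differently counts as wrong, which is one reason a gap as small as the one reported here (about 0.14 points) should be read as a narrow edge rather than a decisive win.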
In an interview with the Alibaba-owned South China Morning Post, Si Luo, chief scientist of natural language processing at Alibaba’s research department, said the result means that questions such as “what causes rain?” can be answered with a high level of accuracy by the neural network.
“We believe the underlying technology can be gradually applied to numerous applications such as customer service, museum tutorials, and online response to inquiries from patients, freeing up human efforts in an unprecedented way,” said Si.
The reading comprehension test scores are a demonstration of the improving ability of machines to perform human tasks. A neural network capable of better-than-human reading comprehension could play a valuable role at Alibaba and other companies by performing tasks such as customer service.
Alibaba used the underlying technology behind this neural network to answer customer queries during “Singles Day”, the world's largest e-commerce event, which falls on 11 November each year. In the month leading up to Singles Day, Alibaba reportedly used machine learning software to generate 400 million customised banner adverts and answer 3.5 million simple customer queries a day.
According to Si, the neural network works best when answering questions with well-defined answers, such as standard customer service questions which could be answered with information from a document. If the language used to ask questions is too vague or does not use standard grammar, the network may need help from a human operator.
Alibaba and other Chinese internet giants, including Tencent and Baidu, are all racing to develop machine learning models that improve users’ online experiences, for example through better search results, targeted advertising and social media feeds.
Last year, the Chinese government laid out plans to make China the world leader in AI by 2030, with an AI industry worth $150bn (£109bn).