Example of a captcha on a web page

Neural network cracks CAPTCHAs after minimal training

Image credit: Dreamstime

A California-based start-up has built an artificial neural network capable of mimicking humans in the most widely used online test for distinguishing between humans and robots.

A CAPTCHA, which stands for a ‘Completely Automated Public Turing test to tell Computers and Humans Apart’, presents an image of letters and numbers distorted such that they are difficult for a machine to interpret.

The Turing Test, devised (but not named) by computing pioneer Alan Turing, is a proposed method for identifying whether a machine could be considered intelligent, by testing whether it can be distinguished from a human by another human.

CAPTCHAs were developed in the late 1990s after internet companies such as Yahoo – which offered free email accounts – found that pornography websites and other companies were using bots to sign up for thousands of new email accounts every minute in order to send junk mail. They are used to ensure that internet services – such as email accounts and forums – are being used by humans rather than bots.

Now, a California-based artificial intelligence start-up, Vicarious, has published a study in Science detailing a program capable of solving popular CAPTCHAs with well over 50 per cent accuracy.

A CAPTCHA is considered obsolete if a machine can solve it at least 1 per cent of the time.

Vicarious achieved this unprecedented success rate using an artificial neural network called the Recursive Cortical Network (RCN). An artificial neural network is a machine learning system loosely modelled on the human brain, with layers of nodes (‘neurons’) connected in a complex web which adjusts itself as it is fed training data. Neural networks have been used before to solve CAPTCHAs, although they required millions of training samples.

The RCN, however, was able to solve CAPTCHAs using (on average) 300 times less training data. According to the researchers, this was made possible by modelling the network on the human visual system. The RCN models a CAPTCHA image as a collection of shapes and textures, and is able to estimate which letter or number each distorted character is most likely to be.

This neural network was capable of solving Google reCAPTCHA with 66.6 per cent accuracy (after training on just five examples per letter or number), BotDetect with 64.4 per cent accuracy, Yahoo with 57.4 per cent accuracy and PayPal with 57.1 per cent accuracy.
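
To put these figures in context, the short Python sketch below sets each reported accuracy against the 1 per cent threshold mentioned earlier. It is only an illustration of the arithmetic: the function and variable names are made up for this example and do not come from the Vicarious study.

```python
# Solve rates reported in the article, in per cent.
REPORTED_ACCURACY = {
    "reCAPTCHA": 66.6,
    "BotDetect": 64.4,
    "Yahoo": 57.4,
    "PayPal": 57.1,
}

# Threshold quoted in the article: a CAPTCHA is considered obsolete
# if a machine can solve it at least 1 per cent of the time.
OBSOLETE_THRESHOLD = 1.0


def is_obsolete(accuracy_pct, threshold_pct=OBSOLETE_THRESHOLD):
    """Return True if the solve rate meets or exceeds the threshold."""
    return accuracy_pct >= threshold_pct


def multiple_of_threshold(accuracy_pct, threshold_pct=OBSOLETE_THRESHOLD):
    """Return the solve rate expressed as a multiple of the threshold."""
    return accuracy_pct / threshold_pct


for name, accuracy in REPORTED_ACCURACY.items():
    print(
        f"{name}: {accuracy} per cent, obsolete={is_obsolete(accuracy)}, "
        f"{multiple_of_threshold(accuracy):.1f}x the threshold"
    )
```

On these numbers, every scheme tested is solved at more than 50 times the rate at which a CAPTCHA is considered obsolete.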

According to the researchers, while the RCN’s success proves the urgent need for the development of more sophisticated bot-detecting measures, it also demonstrates the potential of deep-learning systems to tackle challenges in computer vision.
