Captcha is dying. This is how it's being reinvented for the AI age

Captcha sucks, but work is underway to make it more human

The goal of the Turing test is for a human to work out whether they're communicating with another person or a machine. Computers have gotten better at imitating people, but they've also improved at reversing the test: tricking machines into thinking they're human.

Back in the 2000s, when crude bots plagued the web, the solution turned out to be a variation on Turing's test: the Completely Automated Public Turing test to tell Computers and Humans Apart. Better known as the Captcha, the system allowed websites to distinguish between human and machine behaviour. Or that was the idea, anyway.

"There is always a battle between usability and security," says Nan Jiang, a human-computer interaction lecturer at Bournemouth University. Early Captchas, which were created by a research team including the founder of Dulingo, involved identifying letters and numbers in an image. The system was purchased by Google in 2009 and converted to reCaptcha to help digitise books.

By detecting non-human behaviour, websites have been able to block automated traffic with a degree of success. But early iterations were fairly awful and despised by users. In 2013, Ticketmaster dropped the "hated" Captcha and the tide started to turn against the technology.

Advances in machine vision, where computers are able to interpret images, have left Captchas unfit for purpose. As the now-outdated Captcha website explains: "Either a Captcha is not broken and there is a way to differentiate humans from computers, or the Captcha is broken and an AI problem is solved".

Breaking Captcha

Back in 2013, artificial intelligence startup Vicarious announced it had defeated Captcha using its AI. Now, four years later, it has published the methodology in the peer-reviewed journal Science.

Dileep George, the co-founder of Vicarious, says his firm's algorithm uses less data-intensive methods to break Captcha and reCaptcha. Using a recursive cortical network, George says, it was possible to solve Captcha text, identify handwritten digits, and spot text in real-world scenarios, while using 5,000 times fewer training images than other methods.

"You really need to understand what the letter is," George says. He explains that his firm's system is able to build models of letters to understand how they are formed. This doesn't require teaching the AI system using previous examples of Captchas. "If you build that model then even if people change the background you can use the model to recognise the letter," he explains. He says the company has only just published the work as it was waiting for these Captcha systems to fall out of use and that his small team started working on the research paper at the end of 2015.

This isn't the first time that Captchas have been broken. Audio Captchas that read out the words needed to authenticate a person were used by Microsoft, Digg, eBay and others up until 2011, when they were decoded by Stanford computer scientists.

There have also been attempts to brute-force Captchas using humans. In 2008 it was found that companies in India were paying people to sit and answer MySpace and Google Captchas. At the time, you could get 1,000 solved Captchas for just $2. The businesses were a Mechanical Turk for Captchas, although one study in 2010 found that humans could only get Captchas right 71 per cent of the time.

Even sophisticated Captchas have been broken. Snapchat's point-and-click Captcha was unpicked by automation. In 2014 Google officially killed the text-based Captcha and replaced it with the 'I'm not a robot' button. The AI-based system also included a secondary test asking humans to click on every image in a grid that contains a cat or some other specified object.

Three researchers from Columbia University used deep learning to automatically solve 70 per cent of reCaptcha challenges from Google. "We also apply our attack to the Facebook image captcha and achieve an accuracy of 83.5 per cent," the academics wrote in a paper. With Captcha so comprehensively broken, it was clear something needed to be done.

Resetting Captcha

The battle to protect websites from spammers while keeping Captchas usable has become invisible. At the end of 2016, Google announced an Invisible reCaptcha that would use what it calls its Advanced Risk Analysis.

This system uses Google's AI to look for signs of human behaviour. It runs in the background, detecting movements of a mouse and how long it takes to click on a page, and removes the 'I am not a robot' box from webpages. The firm's security blog says the Invisible system, which launched in March 2017, has "enabled millions of human users to pass through with zero click everyday". It hasn't given any more details on how the system works.
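Google hasn't said how that risk analysis actually works, but the general idea of scoring background behaviour can be illustrated with a rough sketch. Everything below is hypothetical: the signal names, thresholds and two-signal heuristic are illustrative stand-ins, not Google's method.

```python
# Illustrative sketch only: Google has not published how Invisible reCaptcha
# scores visitors, so the signals and thresholds here are invented.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Interaction:
    mouse_path: List[Tuple[float, float]]  # sampled (x, y) cursor positions
    seconds_to_click: float                # time from page load to first click

def looks_human(event: Interaction) -> bool:
    """Crude behavioural heuristic: bots often jump straight to the target
    and click almost instantly, while humans wander and hesitate."""
    enough_movement = len(event.mouse_path) > 20           # cursor actually moved around
    plausible_timing = 0.5 < event.seconds_to_click < 60.0  # not instant, not stalled
    # A real system would feed many such signals into a trained model
    # rather than rely on two hand-tuned thresholds.
    return enough_movement and plausible_timing
```

In practice the point is that the visitor never sees a challenge at all; the scoring happens silently and only suspicious sessions get escalated to an explicit test.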

Other methods are being developed that still require some human knowledge. Bournemouth's Jiang has developed a mobile Captcha dubbed Tapcha. He says that in a world where desktop computers are becoming less relevant, there need to be new, more sophisticated ways to outsmart AI technologies. The Tapcha system has been designed to work on mobiles and builds on the distorted-text approach of old.

"We use this approach to create the instruction," he says. One example has a distorted text task that a person must read and then act upon. The system tells a person to move a star shape from one side of the screen to the other, leaving it on top of another shape. "There's a context built into it," Nan says. He believes that for this approach to be cracked by a machine would require it to understand not just what is written, but also the context behind it and the task required.

Elsewhere, Amazon has patented a Captcha system that humans are meant to fail. The patent explains that humans are likely to fail some basic logic tests (such as counting the number of times a specific letter appears in a sentence), while machines would find them easy to get right. A separate Amazon patent tests your ability to understand physics. As with the example proposed by Jiang, a system would have to understand what is happening in the image and come to a conclusion all by itself. Even with advances in AI, that's difficult to achieve at the moment.
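The letter-counting puzzle the patent mentions shows why the test is inverted: a machine can answer it perfectly in a single line of code, while people routinely miscount under time pressure. The snippet below is a toy illustration, not an example taken from Amazon's patent.

```python
# Toy illustration (not from Amazon's patent): counting a specific letter is
# trivial for a machine, so answering *too* accurately flags you as a bot.
sentence = "How many times does the letter F appear in this sentence of fluff?"
print(sentence.lower().count("f"))  # a machine gets the exact count instantly
```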

So are such systems infallible? Jiang predicts there's the potential for all Captcha systems to be broken if the AI is human-like enough. "If we have really good AI technologies these could be mimicked by some AI algorithms we don't really know yet. It's a challenge between how we can retain the usability of the Captcha scheme whilst maintaining good security."

Everyone might hate them, but Captchas will need to keep some elements of human interaction to outsmart increasingly agile AI systems. Perhaps better that than black boxes that mysteriously determine the differences between man and machine.

This article was originally published by WIRED UK