Imperva 5 Cyber Security Predictions for 2015

Imperva Data Security Blog

January 14, 2015
5 Cyber Security Predictions for 2015
Imperva has been in the business of protecting the high-value applications and data assets at the heart of the enterprise since 2002. In the years since, we've gained tremendous knowledge about cyber security and the origins and nature of cyber attacks. This knowledge has come from analyzing the data collected by our SecureSphere products in installations around the world, as well as from working closely with over 3,500 customers from across many industries.

When security vendors are challenged at the end of each calendar year to come up with predictions for the year ahead, we like to combine the data we’ve collected from our products with the insights that we’ve gathered from our customers, to come up with some meaningful commentary and helpful guidance. What follows are our predictions for the year ahead, with more to come throughout the year as we continue to analyze what our products have to tell us about how the security landscape is evolving.

2015

1. The year of revolt

2015 could be the year when merchants in the US revolt against the credit card companies' policy of sticking them with both the liability for fraud and the responsibility for protecting what is essentially un-protectable: credit card numbers that have to be shared in order to be used, and which can be abused simply by knowing what they are. Fallout from such a change could vary widely, but it's possible that we will see the rise of separate infrastructure for secure payments (like Apple Pay) or a more secure credit card infrastructure (chip-and-PIN) in the United States.

2015 could also be the year when consumers revolt at the prospect of having to change their credit card numbers so often, which has been the typical response to mega-breaches, with many issuers cycling cards. While this is ultimately in the consumer's best interest, it's a pain for people to re-enroll in automatic payments, update records with their various business associates and begin anew. Besides driving the rise of separate infrastructure for secure payments (above), could we see a credit card issuer outcompete its peers based on cardholder security?

2. The rise of Cyber Insurance

Due to the breaches in 2013 and 2014 that wreaked havoc on the businesses, brands, reputations and leadership of way too many enterprises, 2015 will be the year that Cyber Insurance gains velocity and popularity. The Board and the C-Suite will have an appetite for reducing risk by offloading it to insurance providers. Government agencies and insurance companies are already at work establishing guidelines to support the growth of the cyber insurance market. Reduced Cyber Insurance premiums could be a new business benefit touted by security vendors, as premiums are reduced when a company demonstrates proof of having critical security controls in place.

See how Imperva can help you jump start your efforts to reduce risk.

3. The “cloudification” of IT will accelerate

In 2015, the "cloudification" of IT will accelerate, and we will see some big organizations using the cloud, including more and more financial institutions offering services via SaaS platforms. New compliance mandates for the cloud (ISO 27016, SSAE 16, etc.) are contributing to this phenomenon, because they enable businesses to validate their security posture and risk levels.

This leads us directly to a longer-term prediction: by 2017, the term "on-premises data center" will be a thing of the past for the small- and mid-size business market, which will move entirely to the SaaS model.

Access this reference architecture for protecting your AWS-based web applications. It capitalizes on Skyfence, which can be used to protect all your SaaS applications.

4. The first Big Data-related breach

As practical applications for Big Data grow, and the amount of information managed by businesses of every size reaches astronomical proportions, the temptation for hackers to claim the prize of being the first to hack a Big Data installation will mount as well. In 2015, the first big Big Data-related data breach will occur. The lack of administration and security knowledge in such installations, combined with advancements in server-side attacks, will result in hackers attempting to infiltrate this growing platform, and succeeding.

Learn how to address the top threats facing database and Big Data resources.

5. DDoS Attackers Take a Page from the APT Playbook

In 2014, DDoS attacks became much more sophisticated. Though much of the reporting focused on the size of attacks, a more troubling trend was the advancement in attack techniques. Much like their APT brethren, DDoS attacks can now morph and adapt based on the defenses in place. Hackers also dupe sites using impersonation, looking for vulnerabilities and cataloging them for future exploitation. Though not as stealthy as APTs, DDoS attackers are learning from the successes of APT hackers and adopting some of their techniques for an equally troubling network-based attack trend. And DDoS attacks are becoming increasingly common; a majority of organizations can expect to be hit with DDoS attacks in 2015. (Sources: Incapsula DDoS Trends Report 2014, DDoS Impact Survey 2014.)

Facts about ReCAPTCHA and NoCAPTCHA

Sorry Google, No CAPTCHA reCAPTCHA doesn’t stop bots
ShieldSquare
Google recently launched a new version of reCAPTCHA, which it claims is more robust against bots and easier on humans.

While Google's YouTube video is pretty convincing too, things got a little interesting when we dug deeper. The new approach, which appears to be a sophisticated bot-identification algorithm, turns out to rely on little more than browser cookies.

So here’s what happens when you are thrown a reCAPTCHA challenge:

* You are asked to solve a reCAPTCHA image the first time.
* The result of evaluating the text string you entered is cached in your browser's cookies.
* The next time you visit the page, or any page which requires you to pass reCAPTCHA, the information from these cookies is used to identify whether you have passed the test before.

A simple test can be done here: https://wordpress.org/support/register.php.

After solving the reCAPTCHA image for the first time, it does not require you to solve an image when you visit again. But once you delete your cookies and try again ... there! Back to square one: you are required to solve the image to complete the form submission. Google has simply used cookies to retain information about your authenticity.

What does this mean for bots? A bot can use an OCR tool to solve the image, or have somebody solve it manually the first time, after which the bot can retain the cookies and continue scraping!

P.S.: We haven't even gotten to the main course yet!

The new version of reCAPTCHA can also be bypassed by another technique. This can be done by using the website's public key (called data-sitekey). Wait, what? Yes! Let's say a bot wanted to bypass website X's reCAPTCHA without letting a user (on website Y) know that he is helping a bot do so. More technically, this is called clickjacking, or a UI redress attack. The bot could use the data-sitekey of website X and disable the Referer header on a web page in Y where the user would be asked to solve the reCAPTCHA.

Once the user solves the CAPTCHA, the response (called "g-recaptcha-response") can be used by a bot running in the background to submit a form on website X. This way, the bot could trick Google into thinking that the solved reCAPTCHA response was originating from website X (while it is actually coming from Y). Hence, the bot is able to proceed with scraping website X. This works because Google doesn't validate the Referer header if it has been disabled by the client or is empty. A genuine user just contributed to a bot scraping website X without realizing that he was being used as an access card.

This post was inspired by the original blog article by @homakov. A sample of this has already been implemented and hosted on GitHub.

Bots Outnumber Humans on the Web

Bots Now Outnumber Humans on the Web
BY ROBERT MCMILLAN 12.18.14 | 9:00 AM

Diogo Mónica once wrote a short computer script that gave him a secret weapon in the war for San Francisco dinner reservations.

This was early 2013. The script would periodically scan the popular online reservation service, OpenTable, and drop him an email anytime something interesting opened up—a choice Friday night spot at the House of Prime Rib, for example. But soon, Mónica noticed that he wasn't getting the tables that had once been available.

By the time he’d check the reservation site, his previously open reservation would be booked. And this was happening crazy fast. Like in a matter of seconds. “It’s impossible for a human to do the three forms that are required to do this in under three seconds,” he told WIRED last year.

Mónica could draw only one conclusion: He’d been drawn into a bot war.

Everyone knows the story of how the world wide web made the internet accessible for everyone, but a lesser known story of the internet's evolution is how automated code—aka bots—came to quietly take it over. Today, bots account for 56 percent of all website visits, says Marc Gaffan, CEO of Incapsula, a company that sells online security services. Incapsula recently ran an analysis of 20,000 websites to get a snapshot of part of the web, and on smaller websites, it found that bot traffic can run as high as 80 percent.

People use scripts to buy gear on eBay and, like Mónica, to snag the best reservations. Last month, the band Foo Fighters sold tickets for their upcoming tour at box offices only, an attempt to strike back against the bots used by online scalpers. "You should expect to see it on ticket sites, travel sites, dating sites," Gaffan says. What's more, a company like Google uses bots to index the entire web, and companies such as IFTTT and Slack give us ways to use bots for good, personalizing our internet and managing the daily informational deluge.

But, increasingly, a slice of these online bots are malicious—used to knock websites offline, flood comment sections with spam, or scrape sites and reuse their content without authorization. Gaffan says that about 20 percent of the Web’s traffic comes from these bots. That’s up 10 percent from last year.

Often, they’re running on hacked computers. And lately they’ve become more sophisticated. They are better at impersonating Google, or at running in real browsers on hacked computers. And they’ve made big leaps in breaking human-detecting captcha puzzles, Gaffan says.

“Essentially there’s been this evolution of bots, where we’ve seen it become easier and more prevalent over the past couple of years,” says Rami Essaid, CEO of Distil Networks, a company that sells bot-blocking software.

But despite the rise of these bad bots, there is some good news for the human race. The total percentage of bot-related web traffic is actually down this year from what it was in 2013. Back then it accounted for 60 percent of the traffic, 4 percent more than today.

Ellipsis announces Human Presence technology

Bot or Not?
MARCH 21, 2014 by Jennifer Oladipo

Ellipsis targets “human presence” on the Web
Rather than make website visitors prove they’re human, Ellipsis wants companies to use Human Presence technology to figure that out automatically.

Bill West, Ellipsis chairman and CEO, said the company studied more than 80 million actions of Web users to create software that can tell the difference between human and botnet traffic within milliseconds. This vastly reduces the need for Turing tests – those codes, math problems, images and other devices people often must solve to prove that they are human, he said.

Ellipsis – named after the “…” used to signify missing text – pitches its software as a chance to “win the arms race” in Web security, arguing that machine-learning algorithms can stay one step ahead of bot evolution.

West is also a managing partner with The Atlantic Partners, which acquires and sells underperforming companies on behalf of private equity firms. He sat down with UBJ to explain the company’s product and plan.

How was Ellipsis founded?

It was an Atlanta company called Pramana, and actually funded by UCAN [Upstate Carolina Angel Network]. It was abandoned, basically nothing more than software. We moved about a year ago. We liked what they did but didn’t like the way they did it.

Atlantic Partners is one of the owners, along with the other three founders on the management team, UCAN and other investors. We bought the intellectual property, and put all Greenville talent on it to get it working. We rewrote virtually all of their code.


How did you gather a team to revamp the product?

I worked in the various high-tech groups in town, so I knew who was capable of handling this kind of deal. We had the choice of hiring a staff, but everybody we have has their own company. I thought we could put a part-time team together that's really some of the most talented people in town. They were also involved in the initial analysis when looking at the software.

Peter Waldschmidt, vice chairman, is brilliant working on design, data collection and algorithmic models. Andy Kurtz is CEO of ProActive technology, a premier programming shop in the area. His crew, led by Kelly Summerlin and Rob Hall, built the data collection processes. We got financial guidance from Matt Dunbar at UCAN. The Atlantic Partners provided overall strategy and oversight.

What need does Human Presence meet?

Unwanted botnet traffic is a problem. Attackers come in with bots and scrape information from websites. There’s also click fraud [bots clicking on ads to generate revenue]. Bots are on track to waste nearly $10 billion of advertising dollars spent in 2013.

But 3 percent of Web users log off immediately when they run into Turing tests. More than 30 percent fail on the first attempt to solve the puzzle. We were trying to do something that was totally nonintrusive. Instead of annoying 100 percent of customers, you'll only annoy maybe five percent.

Then we also wanted painless, simple installation for site owners. It’s a single line of JavaScript that can be installed in less than a minute.


How does it work?

We’ve studied the time it takes people to press a key, move the mouse around the form, other data points. We can detect in milliseconds whether or not we have a bot. We give businesses a free report to know if they have a bot problem or not. If there is a problem, more detailed reporting is available for a fee.
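To illustrate the general idea (this is a hypothetical sketch, not Ellipsis's code), the snippet below shows the kind of browser-side timing signals such a system might gather and attach to a form submission. The element ID, field names and chosen data points are assumptions for the example.

```typescript
// Hypothetical sketch: gather coarse timing signals from a signup form and
// attach them to the submission for server-side scoring. Field names,
// element IDs and data points are illustrative.
type BehaviorSignals = {
  formShownAt: number;       // when the form was rendered
  firstKeyAt: number | null; // timestamp of the first keystroke
  keyCount: number;          // keystrokes observed in the form
  mouseMoves: number;        // mouse-move events over the form
};

const signals: BehaviorSignals = {
  formShownAt: performance.now(),
  firstKeyAt: null,
  keyCount: 0,
  mouseMoves: 0,
};

const form = document.querySelector<HTMLFormElement>("#signup-form");
if (form) {
  form.addEventListener("keydown", () => {
    if (signals.firstKeyAt === null) signals.firstKeyAt = performance.now();
    signals.keyCount += 1;
  });

  form.addEventListener("mousemove", () => {
    signals.mouseMoves += 1;
  });

  form.addEventListener("submit", () => {
    // Serialize the signals into a hidden field so the server can score them.
    const hidden = document.createElement("input");
    hidden.type = "hidden";
    hidden.name = "behavior_signals";
    hidden.value = JSON.stringify({ ...signals, submittedAt: performance.now() });
    form.appendChild(hidden);
  });
}
```

A production system would score these signals on the server against known human behavior, as West describes; the collection above is only the raw-data half of that picture.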

(Full disclosure: The Greenville Journal was one of the beta sites.)

Who’s your target market?

Real estate, periodicals and blog sites that have lots of content are vulnerable. So are online ticketing companies that deal with bots that are scalpers. Those are easy. When you get into banking it’s a little more complex, and we can do that, too.

What’s the next step?

There will be some staffing up, then we're going to market mid-March. We've gotten inquiries to buy from West Coast and East Coast companies. An exit plan was there from the beginning. We're seeking partners for distribution, investment or acquisition.

Where did the name Ellipsis come from?

I guess when we were all sending emails back and forth to each other in the early stages, I noticed that everyone was using ellipses, like there was more to think about. I thought, that's definitely something a human would do, and not what a bot would do.


Current Methods for Website Security

If you run a website that allows visitors to comment, or where your clients have to set up user accounts, you need some kind of security in place to prevent abuse. Hackers can create robots that can enact malicious attacks on your site by posing as humans. Some of those attacks include making comments and registration requests. Because robots, or “bots,” can work much faster than humans, they could easily bog down your website with multiple attacks in a short time. For this reason, you need some kind of security that can distinguish between humans and machines, and protect your site from malicious attacks.

Types of Security

There are several ways to secure your site from robot attacks, ranging from complex to simple:

CAPTCHAs

CAPTCHA stands for Completely Automated Public Turing test to tell Computers and Humans Apart. It is a complex, but effective, way to differentiate humans from machines.

CAPTCHA uses graphic representations of letters, words, and symbols, which the users are required to type in. The idea is that a robot will not be able to recognize the letters and symbols in order to replicate them.

Unfortunately, the problem with traditional CAPTCHA is that many humans can’t recognize them either. Some CAPTCHAs have an audio option for visually impaired users, but the audio can often be just as difficult to decipher.

In response to the issues with traditional CAPTCHAs, companies like Confident Technologies have spearheaded image-based CAPTCHAs, which use photo images instead of text graphics. The images are easier for most humans to recognize, but still difficult for a bot to manipulate.

The image-based CAPTCHAs could be presented as a single image or as a mini game where users have to solve a puzzle with the images.

Text, Email or Phone Verification

With text, email or phone verification, your site will send a text or an email, or place a phone call, to anyone who tries to create an account, post a comment, or perform other actions on your site. The user then has to respond to the message, either by clicking a verification link, pressing a button, or by returning to your site to enter a code.

The advantage to this type of verification is that it requires your users to enter specific information to proceed, which a robot might not be able to do. The disadvantage is that it requires your users to have their phones or email handy. In most cases this should not be an issue, especially considering the advances in smartphone technology, but there could be occasions where a phone or email might not be available.
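For illustration, here is a minimal sketch of the server side of such a code-verification flow. The function names, the in-memory store and the five-minute expiry are assumptions for the example, not any particular vendor's implementation.

```typescript
import { randomInt } from "node:crypto";

// Hypothetical sketch of issuing and checking one-time verification codes.
// A real system would persist codes in a database or cache with expiry.
const pendingCodes = new Map<string, { code: string; expiresAt: number }>();
const CODE_TTL_MS = 5 * 60 * 1000; // codes valid for five minutes (assumption)

function issueCode(contact: string): string {
  const code = String(randomInt(100000, 1000000)); // six digits
  pendingCodes.set(contact, { code, expiresAt: Date.now() + CODE_TTL_MS });
  // A real deployment would send the code here via SMS, email or a phone call.
  return code;
}

function verifyCode(contact: string, submitted: string): boolean {
  const entry = pendingCodes.get(contact);
  pendingCodes.delete(contact); // codes are one-time use
  if (!entry || Date.now() > entry.expiresAt) return false;
  return entry.code === submitted;
}
```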

The Honey Pot

A honey pot is a trap designed to lure the victim into doing something they shouldn't. In terms of security for your site, it means luring a robot into filling a field that it's not supposed to, or filling it incorrectly. The field usually carries an instruction like "leave this field blank"; a robot won't read the instruction and will disqualify itself by entering data.

The issue with the honey pot is that sometimes users can also neglect to read the instructions and disqualify themselves.
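As a rough illustration, the server-side honeypot check can be as small as the sketch below. The trap field name "website_url" and the reject behavior are assumptions for the example.

```typescript
// Hypothetical server-side honeypot check. The trap field "website_url" is
// rendered hidden (or labeled "leave this field blank"), so humans never
// fill it in, while naive bots that populate every field do.
function passesHoneypot(formFields: Record<string, string>): boolean {
  const trapValue = formFields["website_url"] ?? "";
  return trapValue.trim() === "";
}

// Usage: quietly discard any submission that fails the check, e.g.
// if (!passesHoneypot(fields)) { /* treat as spam */ }
```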

Submission Timing

Submission timing is simply the amount of time it takes to complete a task. Since robots generally complete tasks faster than humans, if there is too short a time frame between tasks, especially similar tasks, the system reads it as a robot and displays a warning message. If the actions continue, the system blocks the user.

The advantage of submission timing is that it's fairly simple and straightforward. The Ellipsis Human Presence technology combines these timing and measurement data points with proprietary algorithms, a database of known human behavior and machine learning to accurately detect human site visitors and quarantine all traffic that does not meet its criteria.
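A bare-bones version of the timing check alone (just the simple threshold idea described above, not Ellipsis's algorithm) might look like the following sketch; the thresholds are illustrative.

```typescript
// Hypothetical timing check: the server records when the form was served
// and compares that with the submission time. Thresholds are illustrative.
const MIN_FILL_TIME_MS = 3_000;          // faster than ~3 seconds suggests a script
const MAX_FILL_TIME_MS = 30 * 60 * 1000; // very stale forms are also suspicious

function plausiblyHumanTiming(formServedAt: number, submittedAt: number): boolean {
  const elapsed = submittedAt - formServedAt;
  return elapsed >= MIN_FILL_TIME_MS && elapsed <= MAX_FILL_TIME_MS;
}
```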

Check Boxes

Check boxes are one of the simplest ways to secure against robots. It is essentially a checkbox on the form that’s invisible to a machine, yet visible to the user. The user must check the box to proceed.

The advantage to the check box is that it is very simple and easy to implement. The disadvantage is that robots have progressed to the point where some can recognize the boxes.
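One common way to keep the checkbox out of sight of simple bots is to render it with JavaScript, which scripts that only fetch raw HTML never execute. The sketch below assumes a container element with the id "confirm-human" and a field named "not_a_robot"; both are illustrative.

```typescript
// Hypothetical sketch: inject the confirmation checkbox with JavaScript so
// bots that never execute scripts never see it, while real browsers do.
const container = document.querySelector("#confirm-human");
if (container) {
  const box = document.createElement("input");
  box.type = "checkbox";
  box.name = "not_a_robot";
  box.id = "not_a_robot";

  const label = document.createElement("label");
  label.htmlFor = "not_a_robot";
  label.textContent = "I'm a person";

  container.append(box, label);
}
// On the server, reject submissions where the "not_a_robot" field is absent.
```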

Read more at http://www.business2community.com/tech-gadgets/types-website-security-01092313#MmEft7QREqJoMlip.99

Marketers Could Lose $6.3 Billion To Bots In 2015

Marketers have a billion-dollar bot problem.

Global advertisers could lose $6.3 billion to bots in 2015 if current fraud rates continue. That's according to new research from the ANA (Association of National Advertisers), based on a study conducted with White Ops, an ad-fraud detection firm.

Nearly one-fourth (23%) of all video ads are served to bots, while 11% of display ads are bot-infected. Per the ANA study, bots accounted for 17% of all programmatic ad traffic and 19% of retargeted ads.

The data comes from 181 campaigns from 36 ANA member companies. The campaigns were measured for 60 days and accounted for 5.5 billion impressions across three million domains. The campaigns came from major brands including Ford, Walmart, Verizon, Prudential, MasterCard, Kellogg’s, Johnson & Johnson and more.

While the average bot activity for programmatically-purchased ads was 17%, it was significantly higher in some instances.

For example, one agency for a CPG brand ran a video campaign through a publicly traded video supply-side platform (SSP) and 62% of the ads were served to bots, per the study.

Additionally, "for 18 of the 36 study participants, three well-known programmatic ad exchanges supplied programmatic traffic with over 90% bots," the report reads. "A bot site used the opacity of programmatic display traffic sourcing through demand-side platforms (DSPs) to systematically defraud advertisers."

To clarify, that does not mean the three exchanges were supplying 90% fraudulent inventory. Rather, there was one site that all three exchanges were sourcing from, and 90% of the inventory from that site was fraudulent. In the end, 18 of 36 advertisers ended up purchasing inventory from that site via the three major ad exchanges.

It’s not just the open ad exchanges that have a bot problem; the report also notes that premium publishers can be infected as well. One CPG brand purchased 230,000 impressions from a premium U.S. media company, per the study, but 19% of the inventory purchased was fraudulent.

The report found that the majority of bots came from residential IPs, noting that 67% of bot traffic “comes from everyday computers that have been hacked.” In an unrelated study released earlier this year, Forensiq, another digital ad detection firm, showed an example of what bots can do behind the scenes once a computer is infected.

“By using the computer of real people … the bots do not just blend in, they get targeted,” the White Ops and ANA report reads. “Sophisticated bots moved the mouse, making sure to move the cursor over ads. Bots put items in shopping carts and visited many sites to generate histories and cookies to appear more demographically appealing to advertisers and publishers.”

Buying tickets with less aggravation


Buying concert tickets these days requires a lot of planning and speed.

You have to be on the dot when tickets go on sale, and even if you are, page loading, internet speed and the always-annoying captcha code can slow you down. Marketplace Tech’s Ben Johnson may have a trick up his sleeve to get tickets faster and with less hassle.

Ticketmaster now has an app for buying tickets. Ben says the things that usually make people nervous about ticket buying— like giving out personal information — can be strengths when purchasing through an app.

“It knows your location, it knows your identity and that means you might get tickets faster and score better seats if you use the app,” Johnson says.

Your smartphone can also help you find closer seats even when you've already purchased a ticket. iBeacon technology pinpoints your location using your mobile phone, and ticketing apps use this feature to help you move up. Unfortunately, if this option doesn't suit you, there isn't really an alternative, save for standing in line.

New research on bot traffic

 

The Incapsula Blog

Back to Blog

December 09

Report: Bot traffic is up to 61.5% of all website traffic

Igal Zeifman

Last March we published a study showing that the majority of website traffic (51%) was generated by non-human entities, 60% of which were clearly malicious. As we soon learned, these facts came as a surprise to many Internet users, for whom they offered a rare glimpse "between the lines" of Google Analytics.

Since then we have been approached with numerous requests for an updated report. We were excited about the idea, but had to wait: first to allow a significant interval between data sets, and then for the implementation of new Client Classification features.

With all the pieces in place, we went on to collect the data for the 2013 report, which we’re presenting here today.


Research Methodology

For the purpose of this report we observed 1.45 billion visits over a 90-day period. The data was collected from a group of 20,000 sites on Incapsula's network, which includes clients from all available plans (Free to Enterprise). Geographically, the traffic covers all of the world's 249 countries, per the country codes defined by the ISO 3166-1 standard.

Report Highlights

Bot Traffic is up by 21%

Compared to the previous report from 2012, we see a 21% growth in total bot traffic, which now represents 61.5% of website visitors. The bulk of that growth is attributed to increased visits by good bots (i.e., certified agents of legitimate software, such as search engines) whose presence increased from 20% to 31% in 2013. Looking at user-agent data we can provide two plausible explanations of this growth:

  • Evolution of Web-Based Services: The emergence of new online services introduces new bot types into the pool. For instance, we see newly established SEO-oriented services that crawl a site at a rate of 30-50 daily visits or more.
  • Increased activity of existing bots: Visitation patterns of some good bots (e.g., search engine crawlers) consist of recurring cycles. In some cases we see that these cycles are getting shorter and shorter to allow higher sampling rates, which also results in additional bot traffic.

31% of Bots Are Still Malicious, but with Far Fewer Spammers

While the relative percentage of malicious bots remains unchanged, there is a noticeable reduction in Spam Bot activity, which decreased from 2% in 2012 to 0.5% in 2013. The most plausible explanation for this steep decrease is Google’s anti-spam campaign, which includes the recent Penguin 2.0 and 2.1 updates.

SEO link building was always a major motivation for automated link spamming. With its latest Penguin updates, Google managed to increase the perceived risk of comment-spamming SEO techniques, while also driving down their actual effectiveness.

Based on our figures, it looks like Google was able to discourage link spamming practices, causing a 75% decrease in automated link spamming activity.

Evidence of More Sophisticated Hacker Activity

Another point of interest is the 8% increase in the activity of “Other Impersonators” – a group which consists of unclassified bots with hostile intentions.

The common denominator for this group is that all of its members are trying to assume someone else’s identity. For example, some of these bots use browser user-agents while others try to pass themselves as search engine bots or agents of other legitimate services. The goal is always the same – to infiltrate their way through the website’s security measures.
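One widely documented defense against search-engine impersonators is to verify the claimed crawler with a reverse-and-forward DNS check. The sketch below is a hedged example of that idea for a visitor claiming to be Googlebot, not Incapsula's classification code.

```typescript
import { resolve4, reverse } from "node:dns/promises";

// Hedged example of the standard two-step check for a visitor claiming to be
// Googlebot: reverse-resolve the IP, confirm the hostname ends in
// googlebot.com or google.com, then forward-resolve that hostname and make
// sure it maps back to the same IP.
async function isGenuineGooglebot(ip: string): Promise<boolean> {
  try {
    const hostnames = await reverse(ip);
    for (const host of hostnames) {
      if (!/\.(googlebot\.com|google\.com)$/i.test(host)) continue;
      const forward = await resolve4(host);
      if (forward.includes(ip)) return true;
    }
  } catch {
    // Treat DNS failures as "not verified".
  }
  return false;
}
```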

The generalized definition of such non-human agents also reflects on these bots’ origins. Where other malicious bots are agents of known malware with a dedicated developer, GUI, “brand” name and patch history, these “Impersonators” are custom-made bots, usually crafted for a very specific malicious activity.

One common scenario: an un-categorized DDoS bot with a spoofed IE6 user-agent.

In terms of their functionality and capabilities, such “Impersonators” usually represent a higher-tier in the bot hierarchy. These can be automated spy bots, human-like DDoS agents or a Trojan-activated barebones browser. One way or another, these are also the tools of top-tier hackers who are proficient enough to create their own malware.

The 8% increase in the number of such bots highlights the increased activity of such hackers, as well as the rise in targeted cyber-attacks.

This is also reflective of the latest trends in DDoS attacks, which are evolving from volumetric Layer 3-4 attacks to much more sophisticated and dangerous Layer 7 multi-vector threats.

CAPTCHA by-pass software is now readily accessible

Purchase captcha bypass software For Reasonable Price.

BriefingWire.com, 11/19/2013 – Even if you don't have much experience with computers and the world wide web, you have still encountered CAPTCHAs, or Completely Automated Public Turing tests to tell Computers and Humans Apart, when visiting internet sites that protect themselves from spam and other abuse widespread on the web. So what is a CAPTCHA, and why are many people around the world searching for the most dependable, cheap and efficient CAPTCHA bypass software?

First, you should know that a CAPTCHA is a simple device created many years ago and still used to distinguish humans from automated robots. Commonly it requires typing the particular symbols displayed in a presented image. According to this technique, only a human can make out those symbols, while a program cannot. But in these times of advancement and innovation, people have developed software for bypassing such a test.

If you're looking to break Gmail's CAPTCHA, then we have the solution for you. Visit the following web page: http://de-captcher.com/ where you will find one of the most efficient and cheap CAPTCHA-solving services available to anyone. This type of software is in great demand among entrepreneurs, mainly because this style of marketing and advertising is extremely effective at promoting sites, increasing popularity and attracting more clients who will discover your website on the most widely used search engines. Since CAPTCHAs were created as a defense against spam, automated registration and other fraud of that kind, you need to find a dependable CAPTCHA solver (PHP, Python) that will be able to get around this defense.

So don't waste your time and energy: seize this incredible chance to bypass any type of CAPTCHA, even the most challenging, with just a few clicks of your mouse and the most reliable software program, which will bypass any CAPTCHA very rapidly. It is very simple to use, and all you need to do is follow the simple instructions you will find on the offered website. There are various strategies that will help you bypass this kind of system, but the one presented is the fastest and the best. If you are a frequent internet user you have probably encountered CAPTCHA defenses on various websites. Many ordinary users are sick of this frustrating system and would like to be rid of it. If you are one of those people, then automated CAPTCHA bypass is exactly what you have been looking for.

Human Detection

 

The Economist explains

How can websites tell humans and robots apart?

 

Sign up for a new e-mail account, buy a concert ticket or leave a comment on a website and you will often be confronted with an image of skewed and swirled letters and numbers to be transcribed before you are allowed to proceed. "Completely Automated Public Turing-tests to tell Computers and Humans Apart", or CAPTCHAs, are used by websites to determine whether the user is a real person or a robot. Recently boffins at Vicarious, a tech firm, posted a video of its artificial-intelligence software outwitting popular sites' CAPTCHAs, but so far the company has provided no rigorous proof of its system's capabilities. Commercial and academic researchers frequently release papers describing how they broke one CAPTCHA type or other, but programmers react by making the distortions even more fiendish and unreadable. How does a CAPTCHA decide who is a human and who is a robot?

CAPTCHAs are used by websites to prevent abuse by spammers and scammers, who use automated tools to scrape data, make mass purchases of limited-availability items or push advertising messages. The term was coined by Luis von Ahn, then a 22-year-old student, and colleagues in 2000. The approach relies on the gap between computers’ and humans’ visual processing. Computer vision attempts to identify details in still or moving images, which it can then analyse. It can be as simple as optical character recognition (OCR), in which software converts scanned documents into editable text files, or as sophisticated as robotic systems that can identify objects by sight and act accordingly. But although computer vision has steadily progressed in sophistication over the past few decades, distorting text by skewing, rotating or squishing it together and adding extraneous details is still enough to baffle even the latest software.

The trouble is that such mangled text also pushes the limits of a human’s ability. Software designers have to strike a fine and continually adjusted balance between deterring unwanted visitors and allowing entry to legitimate users. Each advance in computer vision, whether made by academics or in criminal netherworlds, pushes the line further towards illegibility. In October Google, which a few years back purchased Dr von Ahn’s ReCAPTCHA system (which turns humans into OCR engines to transcribe words in old newspaper articles and books), said it would use a variety of behavioural cues by visitors to determine whether to pop up a fiendish text or a simpler numbers-only puzzle.

In practice, CAPTCHAs can be easily solved en masse by those willing to throw a few cents at low-paid workers in poor countries—no robots are needed, unless one looks to the original meaning of the word "robot". And as CAPTCHAs start to yield to computer-based recognition, text will be replaced by photographs or illustrations in which the user must identify parts of a scene. But there is a certain academic glee among even those who deploy CAPTCHAs about computers' evolving ability to beat the system. As Dr von Ahn et al noted in their seminal 2003 paper on the topic, any sufficiently advanced artificial-intelligence program that could consistently solve CAPTCHAs would represent a significant and useful advance in the field of study. Turing would probably have approved, too.