New research on bot traffic

 

The Incapsula Blog


December 9, 2013

Report: Bot traffic is up to 61.5% of all website traffic

Igal Zeifman

Last March we published a study showing that the majority of website traffic (51%) was generated by non-human entities, 60% of which were clearly malicious. As we soon learned, these figures came as a surprise to many Internet users, offering a rare glimpse “between the lines” of their Google Analytics reports.

Since then we have been approached with numerous requests for an updated report. We were excited about the idea but had to wait: first, to allow a significant interval between the two data sets, and then for the implementation of new Client Classification features.

With all the pieces in place, we went on to collect the data for the 2013 report, which we’re presenting here today.

Bot traffic is up to 61.5% of all website traffic

Research Methodology

For the purpose of this report we observed 1.45 billion visits, which occurred over a 90-day period. The data was collected from a group of 20,000 sites on Incapsula’s network, which consists of clients from all available plans (Free to Enterprise). Geographically, the traffic covers all 249 countries and territories assigned country codes by the ISO 3166-1 standard.

Report Highlights

Bot Traffic is up by 21%

Compared to the previous report from 2012, we see 21% growth in total bot traffic, which now represents 61.5% of website visitors. The bulk of that growth is attributed to increased visits by good bots (i.e., certified agents of legitimate software, such as search engines), whose presence increased from 20% to 31% in 2013. Looking at user-agent data, we can offer two plausible explanations for this growth:

  • Evolution of Web-based services: The emergence of new online services introduces new bot types into the pool. For instance, we see newly established SEO-oriented services that crawl a site at a rate of 30-50 daily visits or more.
  • Increased activity of existing bots: The visitation patterns of some good bots (e.g., search engine crawlers) consist of recurring cycles. In some cases we see these cycles getting shorter and shorter to allow higher sampling rates, which also results in additional bot traffic.
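To make the user-agent angle concrete, here is a minimal, hypothetical classifier in the spirit of the report’s client-classification idea. The pattern lists are illustrative, not Incapsula’s actual rules, and a real system would also weigh IP ranges and behaviour, since a user-agent string is trivially spoofed:

```python
import re

# Illustrative bot names only -- not an actual classification ruleset.
GOOD_BOT_PATTERNS = [r"Googlebot", r"bingbot", r"Baiduspider"]
SPAM_BOT_PATTERNS = [r"EmailCollector", r"spambot"]

def classify(user_agent: str) -> str:
    """Bucket a visit by its claimed user-agent string."""
    if any(re.search(p, user_agent, re.I) for p in GOOD_BOT_PATTERNS):
        return "good bot"
    if any(re.search(p, user_agent, re.I) for p in SPAM_BOT_PATTERNS):
        return "spam bot"
    if "Mozilla" in user_agent:
        return "human (claimed)"   # a claim, not a fact
    return "unclassified"

print(classify("Mozilla/5.0 (compatible; Googlebot/2.1; "
               "+http://www.google.com/bot.html)"))  # -> good bot
```

Note that the good-bot check runs first: Googlebot’s own user-agent contains “Mozilla”, so ordering matters.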

31% of Bots Are Still Malicious, but with Much Fewer Spammers

While the relative percentage of malicious bots remains unchanged, there is a noticeable reduction in Spam Bot activity, which decreased from 2% in 2012 to 0.5% in 2013. The most plausible explanation for this steep decrease is Google’s anti-spam campaign, which includes the recent Penguin 2.0 and 2.1 updates.

SEO link building has always been a major motivation for automated link spamming. With its latest Penguin updates, Google managed to increase the perceived risk of comment-spamming SEO techniques while also driving down their actual effectiveness.

Based on our figures, it looks like Google was able to discourage link spamming practices, causing a 75% decrease in automated link spamming activity.

Evidence of More Sophisticated Hacker Activity

Another point of interest is the 8% increase in the activity of “Other Impersonators” – a group which consists of unclassified bots with hostile intentions.

The common denominator for this group is that all of its members are trying to assume someone else’s identity. For example, some of these bots use browser user-agents, while others try to pass themselves off as search engine bots or agents of other legitimate services. The goal is always the same: to slip past the website’s security measures.
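One well-known countermeasure to this kind of impersonation (published by the search engines themselves, and not specific to Incapsula) is the double reverse-DNS check: a user-agent string costs nothing to fake, but a PTR record inside the engine’s own domain, confirmed by a matching forward lookup, does not. A sketch:

```python
import socket

def host_belongs_to_google(host: str) -> bool:
    # Pure check, split out so it can be tested without network access.
    return host.endswith((".googlebot.com", ".google.com"))

def is_real_googlebot(ip: str) -> bool:
    """Return True only if the IP reverse-resolves into Google's domain
    and the forward lookup of that hostname points back to the same IP."""
    try:
        host, _, _ = socket.gethostbyaddr(ip)       # reverse (PTR) lookup
        if not host_belongs_to_google(host):
            return False
        return socket.gethostbyname(host) == ip     # forward-confirm
    except OSError:                                 # no PTR record / lookup failure
        return False
```

The same pattern works for Bingbot and other crawlers; only the trusted domain suffixes change.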

This catch-all definition also reflects these bots’ origins. Where other malicious bots are agents of known malware, with a dedicated developer, GUI, “brand” name and patch history, these “Impersonators” are custom-made bots, usually crafted for one very specific malicious activity.

One common scenario: an un-categorized DDoS bot with a spoofed IE6 user-agent.

In terms of their functionality and capabilities, such “Impersonators” usually represent a higher-tier in the bot hierarchy. These can be automated spy bots, human-like DDoS agents or a Trojan-activated barebones browser. One way or another, these are also the tools of top-tier hackers who are proficient enough to create their own malware.

The 8% increase in the number of such bots highlights the increased activity of such hackers, as well as the rise in targeted cyber-attacks.

This is also reflective of the latest trends in DDoS attacks, which are evolving from volumetric Layer 3-4 attacks to much more sophisticated and dangerous Layer 7 multi-vector threats.

CAPTCHA bypass software is now readily accessible

Purchase captcha bypass software For Reasonable Price.

BriefingWire.com, 11/19/2013 – Even if you don’t have a lot of experience in the sector of computers and the world wide web, you have still noticed captcha, or Completely Automated Public Turing test to tell Computers and Humans Apart, when visiting different internet sites that offer security from spam and other bad things widespread on the web. So, what is captcha and why are many people around the world searching for the most dependable, cheap and efficient captcha bypass software?

Initially you need to know that captcha is a simple device that was created many years ago and is still utilized to tell humans apart from automatically operating robots. Commonly it demands typing particular symbols that are displayed in a presented photo. According to this technique, only a human can determine these symbols, but a program can’t. However, in the current times of advancement and innovation, people have developed software for bypassing such a process.

If you’re looking for a gmail captcha breaker then we have an incredible solution for you. Visit the following web page: http://de-captcher.com/ where you will get one of the most efficient and cheap captcha solvers designed for every person. Mainly this type of software is in great demand among entrepreneurs, because this style of marketing and advertising is extremely good at promoting, increasing popularity and getting more clients who will discover your internet site on the most widely used search engines. As captcha has been made as a security measure against spam, automated subscription and other fraud of this type, you need to find a dependable captcha solver (PHP, Python) that will be able to avoid this defense.

So, don’t waste your time and energy: use this incredible chance to bypass any type of captcha, even the most challenging, with just a few clicks of your mouse and the most reliable software program, which will bypass any captcha very rapidly. It is very simple to use, and all you need to do is follow the simple instructions that you will find on the offered website. There are various strategies that can help you bypass this kind of system, but the presented choice is the fastest and the best. If you are a frequent internet user you have probably encountered captcha defenses on different websites. Most simple users are sick of this frustrating system and would like to be done with it. If you are one of those people then automated captcha bypass is exactly what you have been looking for.

Human Detection

 

The Economist explains

How can websites tell humans and robots apart?

 

SIGN up for a new e-mail account, buy a concert ticket or leave a comment on a website and you will often be confronted with an image of skewed and swirled letters and numbers to be transcribed before you are allowed to proceed. “Completely Automated Public Turing-tests to tell Computers and Humans Apart”, or CAPTCHAs, are used by websites to determine whether the user is a real person or a robot. Recently boffins at Vicarious, a tech firm, posted a video of its artificial-intelligence software outwitting popular sites’ CAPTCHAs, but so far the company has provided no rigorous proof of its system’s capabilities. Commercial and academic researchers frequently release papers describing how they broke one CAPTCHA type or other, but programmers react by making the distortions even more fiendish and unreadable. How does a CAPTCHA decide who is a human and who is a robot?

CAPTCHAs are used by websites to prevent abuse by spammers and scammers, who use automated tools to scrape data, make mass purchases of limited-availability items or push advertising messages. The term was coined by Luis von Ahn, then a 22-year-old student, and colleagues in 2000. The approach relies on the gap between computers’ and humans’ visual processing. Computer vision attempts to identify details in still or moving images, which it can then analyse. It can be as simple as optical character recognition (OCR), in which software converts scanned documents into editable text files, or as sophisticated as robotic systems that can identify objects by sight and act accordingly. But although computer vision has steadily progressed in sophistication over the past few decades, distorting text by skewing, rotating or squishing it together and adding extraneous details is still enough to baffle even the latest software.
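Leaving aside the image distortion itself (which needs an imaging library), the server side of a CAPTCHA round trip is a small challenge-response protocol. A hypothetical minimal sketch, which stores only a salted hash of the answer plus an expiry rather than the answer itself:

```python
import hashlib
import os
import secrets
import string
import time

TTL = 120  # seconds a challenge stays valid (assumed value)

def issue_challenge() -> tuple[str, dict]:
    """Generate challenge text and the server-side record for it.
    `text` goes to the image renderer; `record` is kept in the session."""
    alphabet = string.ascii_uppercase + string.digits
    text = "".join(secrets.choice(alphabet) for _ in range(6))
    salt = os.urandom(8)
    record = {
        "salt": salt,
        "digest": hashlib.sha256(salt + text.encode()).hexdigest(),
        "expires": time.time() + TTL,
    }
    return text, record

def verify(answer: str, record: dict) -> bool:
    """Case-insensitive, constant-time check of the user's transcription."""
    if time.time() > record["expires"]:
        return False
    digest = hashlib.sha256(record["salt"] + answer.upper().encode()).hexdigest()
    return secrets.compare_digest(digest, record["digest"])
```

The design choice worth noting is that the plaintext answer never needs to be stored server-side, so a leaked session store does not leak solvable challenges.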

The trouble is that such mangled text also pushes the limits of a human’s ability. Software designers have to strike a fine and continually adjusted balance between deterring unwanted visitors and allowing entry to legitimate users. Each advance in computer vision, whether made by academics or in criminal netherworlds, pushes the line further towards illegibility. In October Google, which a few years back purchased Dr von Ahn’s ReCAPTCHA system (which turns humans into OCR engines to transcribe words in old newspaper articles and books), said it would use a variety of behavioural cues by visitors to determine whether to pop up a fiendish text or a simpler numbers-only puzzle.

In practice, CAPTCHAs can be easily solved en masse by those willing to throw a few cents at low-paid workers in poor countries—no robots are needed, unless one looks to the original meaning of the word “robot”. And as CAPTCHAs start to yield to computer-based recognition, text will be replaced by photographs or illustrations in which the user must identify parts of a scene. But there is a certain academic glee among even those who deploy CAPTCHAs about computers’ evolving ability to beat the system. As Dr von Ahn et al noted in their seminal 2003 paper on the topic, any sufficiently advanced artificial-intelligence program that could consistently solve CAPTCHAs would represent a significant and useful advance in the field of study. Turing would probably have approved, too.