How to Exclude Bot Traffic in Google Analytics

Bot traffic can severely distort your reporting data, leading to false assumptions, hampering site performance, and even increasing site maintenance costs. While you may believe bot traffic does not affect your site, recent reports suggest that 59 percent of all site visits may be associated with bots. With this in mind, it’s important to understand how to spot bot traffic so that you can report your data accurately.
 
This post outlines best practices for detecting bot traffic in your Google Analytics reports, as well as ways to eliminate bots using filters and other recommended techniques. I will also cover some industry best practices that should be followed alongside your Google Analytics filters.
 
1. Identifying Bots 
 
Some basic signs of bot traffic to look for in your reports are:
  • High bounce rates
  • Low average session duration
  • No goal completions or revenue associated with traffic
  • Almost 100% new visitor traffic
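
The four signals above can be combined into a simple check on data exported from a report. The following is only a minimal sketch: the SegmentMetrics type and the threshold values are hypothetical and should be tuned against your own baseline traffic.

```python
from dataclasses import dataclass


@dataclass
class SegmentMetrics:
    """Metrics for one traffic segment, e.g. one row exported from a report."""
    name: str
    bounce_rate: float          # fraction between 0.0 and 1.0
    avg_session_seconds: float
    goal_completions: int
    new_session_share: float    # fraction between 0.0 and 1.0


# Hypothetical thresholds, not values recommended by Google Analytics.
MIN_BOT_BOUNCE_RATE = 0.90
MAX_BOT_SESSION_SECONDS = 5.0
MIN_BOT_NEW_SHARE = 0.95


def looks_like_bot(segment: SegmentMetrics) -> bool:
    """Flag a segment only when all four warning signs appear together."""
    return (
        segment.bounce_rate >= MIN_BOT_BOUNCE_RATE
        and segment.avg_session_seconds <= MAX_BOT_SESSION_SECONDS
        and segment.goal_completions == 0
        and segment.new_session_share >= MIN_BOT_NEW_SHARE
    )
```

Requiring all four signs at once keeps the check conservative, since any single metric (a high bounce rate, for example) is also common in legitimate traffic.
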
The following reports provide good information when checking for unexpected spikes:
 
A. New vs. Returning Users 
 
Location in Google Analytics: Audience > Behavior > New vs. Returning Users
 
This report breaks down new vs. returning sessions, which helps you tie traffic spikes to bots. Bots are generally registered as new users, so bot activity typically shows up as a large number of new visitor sessions accompanied by a drop in overall site engagement metrics. The report can be broken down using secondary dimensions and compared on a daily, weekly, or monthly basis. When bot traffic is the cause, a sharp rise in new visitors will be visible over the affected time period.
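
If you export daily new-user session counts from this report, the spike check can be automated. The sketch below is one possible approach, not a Google Analytics feature: it compares each day with the median of the preceding days, and the function name, the 7-day window, and the 3x factor are all assumptions chosen for illustration.

```python
from statistics import median


def find_new_user_spikes(daily_new_sessions, window=7, factor=3.0):
    """Return the indexes of days whose new-user sessions exceed
    `factor` times the median of the preceding `window` days."""
    spikes = []
    for i in range(window, len(daily_new_sessions)):
        baseline = median(daily_new_sessions[i - window:i])
        # Skip days with no baseline traffic to avoid flagging everything.
        if baseline > 0 and daily_new_sessions[i] > factor * baseline:
            spikes.append(i)
    return spikes
```

A median baseline is used here instead of a mean so that one earlier spike does not hide the next one.
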
 
 
B. Browser & Browser Version 
 
Location in Google Analytics: Audience > Technology > Browser & OS
 
This report lets you narrow your results down to a specific browser. Once you have identified a browser with an unusually high number of sessions, you can drill down into its versions to find the one responsible. A high bounce rate and a low average session duration for that version would support the conclusion that the traffic is automated.
 
 
C. Network Domain
 
Location in Google Analytics: Audience > Technology > Network
 
Another area of the reporting section to concentrate on is the “Network” report. A number of service providers, such as Google, Amazon, and Microsoft, tend to generate large amounts of bot traffic. Adding “Network Domain” as a secondary dimension lets you break the traffic down by domain, which helps greatly. The most common of the lot is “amazonaws.com”, which generates a huge amount of bot traffic. Once you have determined which domain the bot traffic comes from, you can apply custom filters to exclude that traffic completely.
 
 
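To exclude a specific ISP organization, create a custom filter under Admin > View > Filters with the “Exclude” filter type, the “ISP Organization” filter field, and a filter pattern that matches the provider’s name.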
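
Before saving a filter pattern, it can help to try it against the network domains seen in your reports. The sketch below assumes the filter pattern is treated as a regular expression and uses Python's re module as a stand-in for trying it out; the helper names and the domain “examplehost.net” are hypothetical.

```python
import re


def build_exclude_pattern(domains):
    """Join domains into one alternation, escaping dots so that
    'amazonaws.com' does not also match 'amazonawsxcom'."""
    return "|".join(re.escape(domain) for domain in domains)


def is_excluded(network_domain, pattern):
    """Return True if the exclude pattern matches the network domain."""
    return re.search(pattern, network_domain, re.IGNORECASE) is not None


# "examplehost.net" is a made-up domain used only for illustration.
pattern = build_exclude_pattern(["amazonaws.com", "examplehost.net"])
```

The escaped pattern is what would be pasted into the filter pattern field. Applying it to a test view first shows how much traffic it removes.
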
 
 
2. Filtering Bots 
 
A.  Admin View Settings
 
Under the “Admin” section, you can edit your “View” settings and check the box that excludes all hits from known bots and spiders. It is highly recommended to create a test view first and review the effect on your data before applying the setting to your main view. The list of excluded bots comes from the IAB/ABC International Spiders and Bots List, a paid list that is not available to the public.
 
  
B.  Using IP Address & User Agent 
 
If you think a particular IP address is responsible for the bot traffic, you can use “View Filters” to exclude that IP address. Bots frequently change their IP addresses to avoid detection, however, so an IP filter alone is rarely enough. If you are using Google Tag Manager, you can pass each visitor’s user agent string to Google Analytics as a custom dimension and exclude sessions based on it. Create a “user agent” custom dimension in Google Analytics, then populate it from a JavaScript variable in Google Tag Manager that returns “navigator.userAgent”. You can then create filters that exclude the offending user agents based on this dimension. The following images illustrate these steps:
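
Once user agent strings are collected as a custom dimension, candidate exclusion rules can be tried offline on exported data. The following is a rough sketch only: the substring list is a hypothetical starting point and not an authoritative bot list, and the session dictionaries with a “user_agent” key stand in for whatever export format you use.

```python
import re

# Hypothetical substrings often seen in automated clients.
# This is not an exhaustive or official list.
BOT_UA_PATTERN = re.compile(
    r"bot|crawler|spider|headless|python-requests|curl",
    re.IGNORECASE,
)


def is_bot_user_agent(user_agent: str) -> bool:
    """Treat empty user agents and known bot substrings as automated."""
    return not user_agent.strip() or bool(BOT_UA_PATTERN.search(user_agent))


def split_sessions(sessions):
    """Split exported sessions into (humans, bots) by their user agent."""
    humans = [s for s in sessions if not is_bot_user_agent(s["user_agent"])]
    bots = [s for s in sessions if is_bot_user_agent(s["user_agent"])]
    return humans, bots
```

A pattern that behaves well on exported data can then be reused as the filter pattern on the custom dimension. Note that bots are free to send a browser-like user agent, so this catches only the ones that identify themselves.
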
 
 
 
 
 
C. Eliminate Bot Traffic
 
Various industry practices can be followed outside of Google Analytics, one of them being a CAPTCHA service. Google introduced a new version of its popular CAPTCHA service, known as “No CAPTCHA”. It detects human behavior, such as mouse usage, and makes its decision based on that, so the user does not need to type a phrase for verification. This CAPTCHA can be shown when a user visits the site for the first time, and the Google Analytics tag is then fired only after the CAPTCHA has been completed successfully. A session cookie can be set after this process so that the user is not challenged again, which should get rid of most of the bot traffic entering your website.
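
On the server side, the flow described above could look roughly like the sketch below. It assumes Google's reCAPTCHA verification endpoint, which accepts the “secret” and “response” form fields and returns JSON with a “success” flag. The cookie name, the function names, and the injectable “post” parameter (included so the logic can be exercised without a network call) are hypothetical.

```python
import json
import urllib.parse
import urllib.request

VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"
VERIFIED_COOKIE = "captcha_verified"  # hypothetical cookie name


def post_form(url, fields):
    """POST form fields and return the decoded JSON response."""
    data = urllib.parse.urlencode(fields).encode("utf-8")
    with urllib.request.urlopen(url, data=data, timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))


def captcha_passed(token, secret, post=post_form):
    """Ask the verification endpoint whether the CAPTCHA token is valid."""
    result = post(VERIFY_URL, {"secret": secret, "response": token})
    return bool(result.get("success"))


def handle_visit(token, secret, cookies, post=post_form):
    """Return (fire_analytics_tag, cookies_to_set) for one request."""
    # Returning visitors with the session cookie skip the challenge.
    if cookies.get(VERIFIED_COOKIE) == "1":
        return True, {}
    if token and captcha_passed(token, secret, post):
        return True, {VERIFIED_COOKIE: "1"}
    # No valid CAPTCHA yet: do not render the analytics tag.
    return False, {}
```

A plain cookie value like the one in this sketch is easy to forge, so a real deployment would want a signed or server-side session value instead.
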
 
 
As a follow-up to this process, you can present a form asking the user for their email address and send out an activation link that is valid for 24 hours. This adds an extra layer of verification when filtering bot traffic.
 
With all of the advanced bots spamming the Internet today, it is effectively impossible to have 100% bot-free traffic in your analytics tools. However, by following the above recommendations, you can filter out the majority of the bot traffic.
 
Want to know more about excluding bot traffic? Please contact info@stratigent.com
 
Images courtesy of Luna Metrics and SwellPath
 
 
 
 
 
 
 
By Akshay Ahluwalia
About the Author:

Akshay Ahluwalia is an Analyst at Stratigent.
