Bot traffic can severely skew your reporting data, leading to false conclusions, degraded site performance, and even higher site maintenance costs. Even if you believe bot traffic does not affect your site, recent reports suggest that 59 percent of all site visits may come from bots. With this in mind, it's important to understand how to spot bot traffic so you can report your data accurately.
This post outlines some best practices for detecting bot traffic in your Google Analytics reports, along with ways to eliminate bots using filters. I will also cover some industry techniques that should be used alongside your Google Analytics filters.
1. Identifying Bots
Some basic signals of bot traffic to look for in your reports are:
• High bounce rates
• Low average session duration
• No goal completions or revenue associated with traffic
• Almost 100% new visitor traffic
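The signals above can be combined into a simple screen over exported report data. This is a rough sketch only; the column names (`bounce_rate`, `avg_session_duration`, `goal_completions`, `pct_new_sessions`) and thresholds are assumptions for illustration, so match them to your actual export.

```python
def looks_like_bot(row):
    """Return True when a traffic segment shows all four bot signals."""
    return (
        row["bounce_rate"] >= 0.9            # high bounce rate
        and row["avg_session_duration"] < 2  # seconds; near-zero engagement
        and row["goal_completions"] == 0     # no conversions or revenue
        and row["pct_new_sessions"] >= 0.95  # almost 100% new visitors
    )

# Illustrative rows shaped like a Google Analytics export
segments = [
    {"source": "amazonaws.com", "bounce_rate": 0.98,
     "avg_session_duration": 0.5, "goal_completions": 0,
     "pct_new_sessions": 1.0},
    {"source": "organic", "bounce_rate": 0.45,
     "avg_session_duration": 120, "goal_completions": 12,
     "pct_new_sessions": 0.6},
]

suspects = [s["source"] for s in segments if looks_like_bot(s)]
```

No single signal is conclusive on its own; requiring all four at once keeps false positives down.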
The following reports provide good information when checking for unexpected spikes:
a. New vs. Returning Users
Location in Google Analytics: Audience > Behavior > New vs. Returning Users
This report breaks down new vs. returning sessions and is helpful for matching traffic spikes to bots. Bots are generally registered as new users, so you will typically see a large number of new visitor sessions alongside a drop in overall engagement metrics. The report can be segmented with secondary dimensions and compared on a daily, weekly, or monthly basis; over time, a sharp rise in new visitors often points to bot traffic.
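The comparison over time can be automated with a small spike check. This sketch assumes a simple date-to-session-counts shape for the exported data, and the 1.5x threshold is an arbitrary starting point, not a standard value.

```python
def new_visitor_share(day):
    """Fraction of a day's sessions that came from new visitors."""
    total = day["new"] + day["returning"]
    return day["new"] / total if total else 0.0

def spike_days(days, threshold=1.5):
    """Return dates whose new-visitor share exceeds threshold x the mean."""
    shares = [new_visitor_share(d) for d in days]
    mean = sum(shares) / len(shares)
    return [d["date"] for d, s in zip(days, shares) if s > threshold * mean]

# Illustrative data: three normal days, then a suspicious jump
days = [
    {"date": "2015-03-01", "new": 400, "returning": 600},
    {"date": "2015-03-02", "new": 380, "returning": 620},
    {"date": "2015-03-03", "new": 410, "returning": 590},
    {"date": "2015-03-04", "new": 1900, "returning": 100},
]
```

Days flagged this way are candidates for further drill-down, not proof of bots by themselves.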
b. Browser & Browser Version
Location in Google Analytics: Audience > Technology > Browser & OS
Monitoring this report lets you narrow your results down to a specific browser. Once you have identified a browser with an unusually high number of sessions, you can drill down into specific versions to find the one responsible. Supporting metrics, such as a high bounce rate and a low average session duration, would back up the conclusion that the traffic is bot-driven.
c. Network Domain
Location in Google Analytics: Audience > Technology > Network
Another area of the reporting section to concentrate on is "Network". A handful of service providers, such as Google, Amazon, and Microsoft, tend to account for a large share of bot traffic. Adding the secondary dimension "Network Domain" lets you break traffic down by domain, which helps greatly; the most common offender is "amazonaws.com", which generates a huge amount of bot traffic. Once you have determined which domain the bot traffic comes from, you can apply custom filters to exclude that traffic completely.
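Before committing to a view filter, you can test the exclusion logic against exported data. In this sketch the domain list is illustrative, not exhaustive; a real list should come from your own Network Domain report.

```python
# Domains seen to generate heavy bot traffic (illustrative examples only)
BOT_PRONE_DOMAINS = {"amazonaws.com", "googlebot.com", "search.msn.com"}

def is_bot_prone(network_domain):
    """True when the domain equals or is a subdomain of a listed domain."""
    return any(
        network_domain == d or network_domain.endswith("." + d)
        for d in BOT_PRONE_DOMAINS
    )
```

Matching on the domain suffix (rather than substring) avoids accidentally excluding a legitimate domain that merely contains a listed name.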
To exclude a specific ISP organization, create a custom filter (Admin > Filters) with the Filter Type set to Custom > Exclude, the Filter Field set to ISP Organization, and the Filter Pattern set to the organization you want to exclude.
2. Filtering Bots
a. Admin View Settings
Under the "Admin" section, you can edit your "View" settings and check the box to exclude hits from known bots and spiders. It is highly recommended to create a test view first to see the effect before applying the setting to your main view. The exclusions are based on the IAB/ABC International Spiders and Bots List, a paid list that is not available to the public.
b. Using IP Address & User Agent
If you can identify the IP addresses that bot traffic originates from, you can exclude them with a predefined view filter ("Exclude traffic from the IP addresses"). User-agent checks, by contrast, are best handled on your own site, by skipping the analytics tag for requests whose user agent matches known bot signatures.
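One common approach under this heading is a server-side user-agent check that decides whether to include the analytics snippet in the rendered page. A minimal sketch follows; the token list and the `should_fire_analytics` helper name are illustrative assumptions, not an authoritative bot list.

```python
# Substrings that commonly appear in bot user-agent strings (illustrative)
BOT_TOKENS = ("bot", "spider", "crawl", "slurp", "headless")

def is_probable_bot(user_agent):
    """Cheap heuristic check against known bot user-agent substrings."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in BOT_TOKENS)

def should_fire_analytics(user_agent):
    """Only include the analytics tag for traffic that passes the check."""
    return not is_probable_bot(user_agent)
```

Well-behaved crawlers identify themselves this way; sophisticated bots can spoof the header, which is why this check is a complement to, not a replacement for, the other techniques in this post.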
c. Eliminating Bot Traffic
Various industry techniques can be applied outside of Google Analytics, one of them being a CAPTCHA service. Google's newer take on its popular CAPTCHA, known as "No CAPTCHA", detects human behavior, such as mouse usage, and makes a decision based on it; there is no need to type a verification phrase. When a user visits the site for the first time, you can show this CAPTCHA and fire the Google Analytics tag only after it has been completed successfully. Setting a session cookie after this step should keep most bot traffic from ever entering your reports.
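The gating logic described above can be sketched in a few lines. The cookie name and helper functions here are hypothetical; the actual CAPTCHA verification would be done against the CAPTCHA provider's API before `on_captcha_success` is called.

```python
CAPTCHA_COOKIE = "captcha_passed"  # assumed cookie name, not a standard

def page_includes_analytics(cookies):
    """Decide whether the rendered page should contain the GA snippet."""
    return cookies.get(CAPTCHA_COOKIE) == "1"

def on_captcha_success(cookies):
    """Called after the CAPTCHA service reports success; sets the cookie."""
    cookies[CAPTCHA_COOKIE] = "1"
    return cookies

# First visit: no cookie, so no analytics tag is rendered.
visitor = {}
first_view = page_includes_analytics(visitor)   # False
on_captcha_success(visitor)
second_view = page_includes_analytics(visitor)  # True
```

Because the tag never fires for sessions that fail the CAPTCHA, those sessions simply never appear in your data, which is cleaner than filtering them afterwards.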
As a follow-up to this process, you can present a form asking the user for their email address and send out an activation link valid for 24 hours. This adds an extra layer of protection when filtering bot traffic.
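A time-limited activation link can be built by signing the email address together with an expiry timestamp, so the server does not need to store pending tokens. This is a sketch under assumptions: the secret, the token format, and the 24-hour window are all placeholders for your own implementation.

```python
import hashlib
import hmac
import time

SECRET = b"change-me"          # placeholder; keep the real secret private
VALID_FOR = 24 * 60 * 60       # 24 hours, in seconds

def make_token(email, now=None):
    """Build 'email:expiry:signature' for use in an activation link."""
    expires = int(now if now is not None else time.time()) + VALID_FOR
    msg = f"{email}:{expires}".encode()
    sig = hmac.new(SECRET, msg, hashlib.sha256).hexdigest()
    return f"{email}:{expires}:{sig}"

def verify_token(token, now=None):
    """Check the signature and reject tokens older than 24 hours."""
    email, expires, sig = token.rsplit(":", 2)
    msg = f"{email}:{expires}".encode()
    expected = hmac.new(SECRET, msg, hashlib.sha256).hexdigest()
    current = now if now is not None else time.time()
    return hmac.compare_digest(sig, expected) and current < int(expires)
```

`hmac.compare_digest` is used instead of `==` to avoid leaking signature information through timing differences.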
With all of the advanced bots spamming the Internet today, it is impossible to achieve 100% bot-free traffic in your analytics tools. However, following the above recommendations will filter out the majority of bot traffic.
Images courtesy of Luna Metrics and SwellPath