Has your analytics run aground because you’ve started seeing Google Data Sampling appear?

GOOGLE DATA SAMPLING – WHAT IS IT?

When a web site records more than 500,000 sessions in the time range you are querying, GA ignores some of your data. Google refers to this as “sampled data”.

The elimination of some of your data occurs on the web “property” data first, before the subsets of data for Views, Segments, and Filters are derived.

In other words, the use of Google Analytics Views, segments, and/or filtering doesn’t affect GA sampling.

HOW SAMPLING HARMS YOUR BUSINESS

Google deserves credit for offering GA as a free service.

But the data it reports to you is wrong. Let’s not ignore that.

The data is wrong in ways that are neither predictable nor consistent, and it does not even represent a trend or pattern in your data, because the sampling algorithm is undocumented and comes with no guarantees. For Google’s explanation of sampling, visit: https://support.google.com/analytics/answer/2637192?hl=en. It’s a good try, but it won’t help when you make bad decisions based on wrong data.

Go ahead and use it, but be aware that it is wrong. It is as if Google gave you a car that sometimes steers you towards a cliff, and you don’t know at which corner it will do that. And that corner is different every time.

The severity of the problem becomes obvious when the data should be the same on multiple reports but isn’t. For example, a figure for the current month on a report you use to make decisions might have a different value in the quarterly report you distribute to management. It’s going to be difficult to justify certain actions if you and management are looking at different numbers. And you won’t necessarily realize that you are looking at different numbers! This could have serious ramifications for your business, not to mention your career.

The common sense advice is: if these erroneous numbers are important metrics (and why would you work with ones that aren’t?), then you should NOT distribute them within your organization. If you distribute reports where the values change or are obviously wrong, the integrity of the information will be questioned and, more significantly, people could make important business decisions based on wrong data. You might even be subject to litigation if you provide this data to someone without a specific caution.

NEXT ANALYTICS OFFERS A FREE SOLUTION

We offer you a way to save the $150,000 a year that GA Premium costs.

If you have more than 500,000 sessions per day but still don’t want to spend $150,000 a year, another option is to split your web site into different properties, each with fewer than 500,000 sessions per day. Perhaps your business can do this.

Once each of your GA properties is logging fewer than 500,000 sessions per day, you can use NEXT Analytics “Avoid Data Sampling”. NEXT Analytics software deconstructs your GA query into a series of one-day queries and submits them one at a time until all days have been downloaded. If each day has fewer than 500,000 sessions, there will be no sampling. Once our software has downloaded all of the days in your time range, unsampled, it automatically combines the days into a single query result.
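The day-by-day approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual NEXT Analytics implementation: `fetch_day` is a hypothetical stand-in for whatever call retrieves one day of GA data, and the function and metric names are made up for the example. Note that adding daily results together is valid only for additive metrics such as sessions or pageviews; counts of unique users cannot simply be summed across days.

```python
from collections import Counter
from datetime import date, timedelta
from typing import Callable, Dict, Iterator


def each_day(start: date, end: date) -> Iterator[date]:
    """Yield every calendar day from start to end, inclusive."""
    day = start
    while day <= end:
        yield day
        day += timedelta(days=1)


def query_by_day(start: date, end: date,
                 fetch_day: Callable[[date], Dict[str, int]]) -> Dict[str, int]:
    """Run one query per day and add the daily metrics together.

    fetch_day is a hypothetical callable that returns additive metrics
    (for example {"sessions": 120}) for a single day.
    """
    totals: Counter = Counter()
    for day in each_day(start, end):
        totals.update(fetch_day(day))  # Counter.update adds values per key
    return dict(totals)
```

Passing the fetch function in as a parameter keeps the splitting and combining logic independent of any particular GA client library.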

NEXT ANALYTICS SOLVES THE GOOGLE ANALYTICS DATA SAMPLING PROBLEM

You have saved $150K by not having to buy GA Premium. What else can you do with NEXT Analytics?

Many people start with Excel for analysis.

  1. NEXT has a built-in analytics engine. It can filter and store a subset in the worksheet.
  2. NEXT’s engine can also summarize data into calculations and write only the calculations to the worksheet.

If you want to save the transactions for import into another application, you need to bypass storing the data in Excel, because a worksheet is limited to approximately 1 million rows.

With such a busy web site, you have to employ other options.

One choice is: Put the data into a database using a bulk CSV upload!

* Use NEXT Analytics to create a CSV file that you can use with another application such as Tableau, or use a bulk upload feature common to various database tools. This may or may not be automatable, depending on the database software you are using.
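For readers who want to see what such a CSV file involves, here is a minimal sketch using Python’s standard `csv` module. It is an assumption-laden illustration, not NEXT Analytics output: the function name `rows_to_csv` and the column names in the usage note are hypothetical.

```python
import csv
import io
from typing import Dict, Iterable, List


def rows_to_csv(rows: Iterable[Dict[str, object]], columns: List[str]) -> str:
    """Serialize rows to CSV text with a header line.

    The csv module quotes any value that contains a comma or a quote,
    so the result is safe to hand to a bulk-upload tool.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

For example, `rows_to_csv([{"date": "2024-01-01", "sessions": 120}], ["date", "sessions"])` produces a header line followed by one data line. For large exports you would more likely pass an open file to `csv.DictWriter` directly instead of building the text in memory.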

Maybe you want to put the data into a Google Spreadsheet.

* NEXT Analytics can keep a Google Spreadsheet up to date, or create new ones every day.

NEXT Analytics can put the records into a database for you.

* Store the data in a local database such as MySQL or SQL Server. You can use the database with any business intelligence software.

* NEXT Analytics can also connect to remote databases. Store the data in a database hosted by Amazon Web Services or Microsoft Azure. You can use the database with any business intelligence software.

Keeping a database updated can be very effective and fully automated, with no need for human interaction. Indeed, once you have GA data in your own database, any business intelligence tool has access to it for analysis, reporting and report distribution. All the business analysts in the world are now empowered to use the GA data, not just the small community of web analysts.
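As a rough sketch of what an automated daily update involves, the example below loads one row per day into a database and overwrites a day if it is loaded again, so a scheduled job can be rerun safely. It makes several assumptions: SQLite (from Python’s standard library) stands in for MySQL or SQL Server, whose upsert syntax differs, and the table name `ga_daily` and its columns are hypothetical.

```python
import sqlite3
from typing import Iterable, Tuple


def upsert_daily(conn: sqlite3.Connection,
                 rows: Iterable[Tuple[str, int, int]]) -> None:
    """Insert or overwrite one row per day so reruns do not duplicate data.

    Each row is (day, sessions, pageviews); the table layout is hypothetical.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ga_daily ("
        "day TEXT PRIMARY KEY, sessions INTEGER, pageviews INTEGER)"
    )
    # The primary key on day makes INSERT OR REPLACE overwrite an existing day.
    conn.executemany(
        "INSERT OR REPLACE INTO ga_daily (day, sessions, pageviews) "
        "VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
```

Keying the table on the day is a design choice that suits one-day queries: each daily download maps to exactly one row, and any business intelligence tool can then read the table with ordinary SQL.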

The coaches and experts at NEXT Analytics can help you design a workflow that avoids paying for GA Premium while still avoiding Google Data Sampling. Start by signing up for a PRO license and get a conversation going!