Reduce the amount of sampled data in your Google Analytics Reporting API requests

Have you ever wondered whether there is a way to reduce the amount of sampled data in your Google Analytics Reporting API requests? Well, there is. Let me explain how.

The problem

Let's look at the problem first. It consists of two parts:

  • 1: Sampled data
  • 2: API restrictions

1: Sampled data

Sampling kicks in when the number of sessions your data is based on exceeds 500,000 (25M for premium). If you're interested in the details, you can find them in Google's documentation on data sampling.

2: API restrictions

The API has two main restrictions:

  • Result limits - the API returns at most 10,000 rows per call. When you hit that limit, you can use the start-index parameter to request the next batch of rows.
  • General quota - Google allows you to call the API 50,000 times a day per project, and up to 10 queries per second per IP.
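To make the result limit concrete, here is a minimal sketch of paging through a large result with start-index. It assumes the 1-based `start-index` and `max-results` parameters of the Core Reporting API; `fetch_page` is a hypothetical callable standing in for whatever client you use to run a single query.

```python
from typing import Callable, Dict, List

MAX_RESULTS = 10_000  # per-call row limit mentioned above

# Hypothetical signature: takes query parameters, returns the rows of one page.
FetchPage = Callable[[Dict[str, object]], List[list]]


def fetch_all_rows(fetch_page: FetchPage, params: Dict[str, object]) -> List[list]:
    """Collect every row of a query by advancing start-index page by page."""
    rows: List[list] = []
    start_index = 1  # assumption: start-index counts from 1
    while True:
        page = fetch_page(
            {**params, "start-index": start_index, "max-results": MAX_RESULTS}
        )
        rows.extend(page)
        if len(page) < MAX_RESULTS:  # a short page means nothing is left
            return rows
        start_index += MAX_RESULTS
```

Every page costs one API call, so a query that returns 25,000 rows uses three calls of your daily quota.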

When you import larger periods of data, you'll be more likely to run into sampled data. Luckily, there's an easy fix.

The solution

When you import a larger period of data, you normally ask the API to get data from the start date, to the end date:

start-date => '2016-01-01'
end-date => '2016-01-31'

The solution is easy: chop up your dates. Based on the start and end date, you can easily get all the dates in between. So looking at the example above, this will result in the following list of dates:

  • 2016-01-01
  • 2016-01-02
  • ...
  • 2016-01-30
  • 2016-01-31
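Building that list takes only a few lines. The sketch below uses Python's standard library; the function name `dates_between` is made up for this example.

```python
from datetime import date, timedelta
from typing import List


def dates_between(start: str, end: str) -> List[str]:
    """Return every date from start to end (inclusive) as YYYY-MM-DD strings."""
    first = date.fromisoformat(start)
    last = date.fromisoformat(end)
    days = (last - first).days
    return [(first + timedelta(days=n)).isoformat() for n in range(days + 1)]


dates = dates_between("2016-01-01", "2016-01-31")
print(len(dates), dates[0], dates[-1])  # 31 2016-01-01 2016-01-31
```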

Next, loop through this list of dates and request the results for each date separately:

Request date 1

start-date => '2016-01-01'
end-date => '2016-01-01' 

Request date 2

start-date => '2016-01-02'
end-date => '2016-01-02'

...

Request date 30

start-date => '2016-01-30'
end-date => '2016-01-30' 

Request date 31

start-date => '2016-01-31'
end-date => '2016-01-31' 

With this approach, you collect the data day by day and stitch the results into one data set. This greatly reduces the chance of running into sampling: a request is only sampled when a single day on its own exceeds the session threshold.
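Put together, the loop could look roughly like this. It is a sketch, not a finished importer: `run_query` is a hypothetical callable that runs one request for the given start and end date and returns the rows plus a flag telling whether the response was sampled (the Core Reporting API reports this as `containsSampledData`).

```python
from datetime import date, timedelta
from typing import Callable, List, Tuple

# Hypothetical: runs one query, returns (rows, contains_sampled_data).
RunQuery = Callable[[str, str], Tuple[List[list], bool]]


def fetch_day_by_day(
    run_query: RunQuery, start: str, end: str
) -> Tuple[List[list], List[str]]:
    """Request each day separately and stitch the rows into one data set."""
    last = date.fromisoformat(end)
    day = date.fromisoformat(start)
    all_rows: List[list] = []
    sampled_days: List[str] = []
    while day <= last:
        d = day.isoformat()
        rows, sampled = run_query(d, d)  # start-date == end-date
        # Prefix each row with its date so the days stay distinguishable.
        all_rows.extend([d] + list(row) for row in rows)
        if sampled:
            sampled_days.append(d)  # this day alone exceeded the threshold
        day += timedelta(days=1)
    return all_rows, sampled_days
```

The second return value lists the days that were still sampled, so you can see at a glance whether splitting by day was enough.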

Don't worry about the request limit

If you're worried about the request limit, you shouldn't be. With 50,000 calls per project per day and one call per day of data, you can fetch almost 137 years' worth of data before hitting the limit.
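The arithmetic behind that number, as a quick check. It assumes one request per day of data (so each day fits within the 10,000-row limit) and uses the quota figures quoted earlier.

```python
DAILY_QUOTA = 50_000       # calls per project per day
QUERIES_PER_SECOND = 10    # per-IP rate limit
DAYS_PER_YEAR = 365.25     # average, including leap years

# One call per day of data: how many years fit into one day's quota?
years = DAILY_QUOTA / DAYS_PER_YEAR
print(round(years, 1))  # 136.9

# Minimum time to spend the whole quota at the maximum request rate.
minutes = DAILY_QUOTA / QUERIES_PER_SECOND / 60
print(round(minutes, 1))  # 83.3
```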