Google Analytics web data vastly different from the Core Reporting API data
I am trying to pull the exact same data I am seeing in my dashboard via the Core Reporting API for Google Analytics. However, I just don't understand why the data can be so different, even for the same time period and metrics!
This is my table structure in the web UI dashboard:
**Display the following columns:**
Dimension: Month of Year
Metric: Pageviews
**Filter this data:**
Only show **Page** containing "/blog/"
And this is what I see in my web UI for the period 09/26/2013 to 12/26/2013:
Month of Year Pageviews
201312 151,502
201311 136,856
201310 183,555
201309 22,689
In my script, I use the exact same dimension, metric and filter (apart from naming differences between the web UI and the API):
dimensions = ga:yearMonth
start-date = 2013-09-26
start-index = 1
metrics = [u'ga:pageviews']
filters = ga:pagepath=@/blog/
end-date = 2013-12-26
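For reference, the parameters above can be assembled into a request URL roughly as follows. This is only a sketch: the function name `build_query_url` is made up for illustration, the `sort` value is taken from the report ID further down, and authentication (which a real request needs) is left out entirely.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Endpoint as it appears in the report's ID / Self Link fields.
BASE_URL = "https://www.googleapis.com/analytics/v3/data/ga"


def build_query_url(view_id, start_date, end_date):
    """Hypothetical helper: build the query URL for the /blog/ pageview report."""
    params = {
        "ids": view_id,
        "dimensions": "ga:yearMonth",
        "metrics": "ga:pageviews",
        "sort": "-ga:yearMonth",
        "filters": "ga:pagepath=@/blog/",  # =@ means "contains substring"
        "start-date": start_date,
        "end-date": end_date,
        "start-index": 1,
    }
    # urlencode percent-encodes ':', '=', '@' and '/', which is equivalent
    # to the partially encoded form shown in the report output.
    return BASE_URL + "?" + urlencode(params)


url = build_query_url("ga:xxxxxx", "2013-09-26", "2013-12-26")
print(url)
```

Decoding the query string again with `parse_qs(urlsplit(url).query)` is a quick way to check that the filter survived encoding unchanged.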
And this is what I see:
Rows:
201312 148626
201311 160769
201310 154770
201309 16099
Report Infos:
Contains Sampled Data = False
Kind = analytics#gaData
ID = www.googleapis.com/analytics/v3/data/ga?ids=ga:xxxxxx&dimensions=ga:yearMonth&metrics=ga:pageviews&sort=-ga:yearMonth&filters=ga:pagepath%3D@/blog/&start-date=2013-09-26&end-date=2013-12-26
Self Link = www.googleapis.com/analytics/v3/data/ga?ids=ga:xxxxxx&dimensions=ga:yearMonth&metrics=ga:pageviews&sort=-ga:yearMonth&filters=ga:pagepath%3D@/blog/&start-date=2013-09-26&end-date=2013-12-26
Pagination Infos:
Items per page = 1000
Total Results = 4
So as we can see, the data format is correct but the data itself is wrong. What's worse is that the trend is different.
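To put numbers on the gap, here is a small sketch that lines up the two result sets month by month. The figures are copied from the tables above; the function name `compare` is just for illustration.

```python
# Pageviews per month, copied from the web UI table and the API rows above.
web_ui = {"201312": 151502, "201311": 136856, "201310": 183555, "201309": 22689}
api = {"201312": 148626, "201311": 160769, "201310": 154770, "201309": 16099}


def compare(web, api_rows):
    """Return (month, web, api, absolute diff, percent diff relative to web)."""
    rows = []
    for month in sorted(web, reverse=True):
        diff = api_rows[month] - web[month]
        pct = 100.0 * diff / web[month]
        rows.append((month, web[month], api_rows[month], diff, pct))
    return rows


rows = compare(web_ui, api)
for month, w, a, diff, pct in rows:
    print(f"{month}  web={w:>7}  api={a:>7}  diff={diff:>+7}  ({pct:+.1f}%)")

total_web = sum(web_ui.values())
total_api = sum(api.values())
print(f"total   web={total_web}  api={total_api}  diff={total_api - total_web:+}")
```

The totals for the whole period differ by only about 3 percent, while individual months are off by much more and in both directions, which is why the trend looks different.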
2 Comments
I had (what I think was) the same question when comparing Python-generated reports with the Google-provided web tool. I found the difference was because the web tool uses sampling:
"This report is based on 96,693 sessions (92.19% of sessions)"
You have one data point that is actually higher in the web tool though... can't explain that :)
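The API side can be checked for sampling as well. The sketch below assumes the response has already been parsed into a dict and that it carries the `containsSampledData`, `sampleSize` and `sampleSpace` fields (my understanding of the v3 response is that the last two are strings and only present when the data is sampled). The function name and the example responses are made up for illustration.

```python
def sampling_summary(response):
    """Describe whether a parsed Core Reporting API response is sampled."""
    if not response.get("containsSampledData", False):
        return "unsampled"
    # Assumption: sampleSize / sampleSpace arrive as numeric strings.
    size = int(response["sampleSize"])
    space = int(response["sampleSpace"])
    return f"sampled: {size} of {space} sessions ({100.0 * size / space:.2f}%)"


# Hypothetical responses, not real data.
unsampled = {"containsSampledData": False}
sampled = {"containsSampledData": True, "sampleSize": "500", "sampleSpace": "1000"}

print(sampling_summary(unsampled))
print(sampling_summary(sampled))
```

In the question above the API report says "Contains Sampled Data = False", so if the web tool shows a "based on N% of sessions" note for the same report, the two are not computed from the same set of sessions.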
Actually this is pretty good. Your numbers are pretty close. On my end, the stats from my own systems show about 4x more hits than Google Analytics.
Now... why the discrepancy? There are many factors. These are the ones I can think of at this point:
You have a cache between you and your clients. Google Analytics will count every single hit; your system will not, since a cached page never reaches it.
Your system may return a 304 and not count those responses as hits.
Your system counts all hits, including hits from spiders (e.g. Googlebot). Google Analytics knows of many spiders and does not count their hits.
Your system counts hacker accesses since they hit your server. Google Analytics does not, since hackers (web spammers, etc.) do not execute its JavaScript code.
Google Analytics counts hits from HTML pages only; your server may serve other data (PDF files, images, etc.) that gets counted too.
Google Analytics also counts differently for visitors who browse your website and "returning visitors," which most often a CMS won't grasp in the same way.
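To see how much of a server-side count comes from the factors above, one option is to filter the access log before counting. This is only a rough sketch under several assumptions: the log is in the common "combined" format, bots can be recognised by a few user-agent substrings, and the sample lines, marker list and suffix list are all invented for illustration.

```python
import re

# Matches the request, status, size and (optionally) referrer / user agent
# of a combined-format access log line.
LOG_PATTERN = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
    r'(?: "[^"]*" "(?P<agent>[^"]*)")?'
)
BOT_MARKERS = ("bot", "spider", "crawl")  # illustrative, far from complete
ASSET_SUFFIXES = (".pdf", ".png", ".jpg", ".gif", ".css", ".js")


def count_hits(lines):
    """Return (raw hits, hits that look like real HTML page views)."""
    raw = 0
    page_like = 0
    for line in lines:
        m = LOG_PATTERN.search(line)
        if m is None:
            continue
        raw += 1
        agent = (m.group("agent") or "").lower()
        path = m.group("path").split("?", 1)[0].lower()
        if any(marker in agent for marker in BOT_MARKERS):
            continue  # spider traffic
        if m.group("status") != "200":
            continue  # e.g. 304 Not Modified
        if path.endswith(ASSET_SUFFIXES):
            continue  # PDFs, images and other non-HTML files
        page_like += 1
    return raw, page_like


# Invented sample lines for demonstration only.
SAMPLE = [
    '192.0.2.1 - - [26/Dec/2013:10:00:00 +0000] "GET /blog/post HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '192.0.2.1 - - [26/Dec/2013:10:05:00 +0000] "GET /blog/post HTTP/1.1" 304 0 "-" "Mozilla/5.0"',
    '192.0.2.2 - - [26/Dec/2013:10:06:00 +0000] "GET /blog/post HTTP/1.1" 200 5120 "-" "Googlebot/2.1"',
    '192.0.2.3 - - [26/Dec/2013:10:07:00 +0000] "GET /files/report.pdf HTTP/1.1" 200 90000 "-" "Mozilla/5.0"',
]

print(count_hits(SAMPLE))
```

A filter like this cannot recover hits that a cache answered before they reached the server, so it narrows the gap in one direction only.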