Standard ratio of cookies to "visitors"?
As noted in a recent blog post, we see a large discrepancy between Google Analytics "visitors" and Quantcast "visitors".
Also, for reasons we have never figured out, Google Analytics just gets larger numbers than Quantcast. Right now GA is showing more visitors (15 million) on stackoverflow.com alone than Quantcast sees on the whole network (14 million):
blog.stackoverflow.com/wp-content/uploads/GA-Visitors1.png
Why? I don’t know. Either Google Analytics loses cookies sometimes, or Quantcast misses visitors. Counting is an inexact science.
We think this is because Quantcast uses a more conservative ratio of cookies-to-visitors. Whereas Google Analytics might consider every cookie a "visitor", Quantcast will only consider every 1.24 cookies a "visitor". This makes sense to me, as people may access our sites from multiple computers, multiple browsers, etcetera.
I have two closely related questions:
Is there an accepted standard ratio of cookies to visitors? This is obviously an inexact science, but is there any emerging rule of thumb?
Is there any more accurate way to count "visitors" to a website other than relying on browser cookies? Or is this just always going to be kind of a best-effort estimation crapshoot no matter how you measure it?
6 Comments
Here's a recent study from MediaMind (May 4, 2011 -- yesterday as I write this) with "Cookie Inflation Multipliers" for different markets:
press release
report (requires filling out a form and giving them your e-mail address)
direct link to the report (because it's not unique for the e-mail address)
write up by eMarketer
Their calculated inflation factor ranges from 2.2 for Germany to 3.0 for the US.
Maybe your GA visitor numbers are more inflated than a normal site's because of the more technical nature of its audience? For example, programmers, and web developers especially, are more likely to use a range of browsers, which increases the cookie count.
For question 1, I guess that, as with many metrics, it's better to use data from your own site than to look for global standards, as aggregates can be misleading. One way to get a cookie-to-real-visitor ratio might be to count how many cookies you see from each registered user and derive the ratio from that.
As for number 2, theoretically the best way to count real visitors would be to force everyone to register an account. As that's obviously not a good idea, you could look at normalization instead. For example, you could take the average number of cookies per registered user, as suggested above, and apply it to the visitor numbers that GA is reporting.
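A minimal sketch of that normalization, assuming you can log (user ID, cookie ID) pairs for logged-in traffic. The function names and the sample data are made up for illustration, and the approach assumes registered users behave like everyone else, which may not hold:

```python
from collections import defaultdict


def cookies_per_user(events):
    """Average number of distinct cookies seen per registered user.

    `events` is an iterable of (user_id, cookie_id) pairs taken from
    logged-in traffic (a hypothetical log format).
    """
    cookies_by_user = defaultdict(set)
    for user_id, cookie_id in events:
        cookies_by_user[user_id].add(cookie_id)
    if not cookies_by_user:
        raise ValueError("no registered-user events to calibrate on")
    total_cookies = sum(len(cookies) for cookies in cookies_by_user.values())
    return total_cookies / len(cookies_by_user)


def estimate_people(unique_cookies, ratio):
    """Scale a cookie-based visitor count down by the cookies-per-person ratio."""
    return unique_cookies / ratio


# Made-up sample: alice has 2 cookies, bob 1, carol 3.
sample_events = [
    ("alice", "c1"), ("alice", "c2"), ("alice", "c1"),
    ("bob", "c3"),
    ("carol", "c4"), ("carol", "c5"), ("carol", "c6"),
]
ratio = cookies_per_user(sample_events)         # 6 cookies / 3 users
estimated = estimate_people(15_000_000, ratio)  # GA-style count, normalized
print(ratio, estimated)
```

The 15 million figure is only the GA number quoted in the question; the ratio you measure on your own site is what matters.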
Quantcast emailed me:
You mentioned that there was a fairly substantial delta between your GA numbers and your QC numbers. While this doesn’t happen often, it does happen and there are several reasons this can occur. For instance, we account for 3rd party cookies and auto-refreshes and GA does not. We also ask publishers to place our tag near the bottom of the page to comply with MRC and IAB standards. If your other measurement tags are higher on the page, they could fire when Quantcast’s does not. (We are the only MRC accredited traffic measurement service). Also, the numbers are never going to be exactly the same because of time zone considerations - we use a normalizing function and GA’s is fixed.
If you would like to learn more about how we determine our numbers, please check out: www.quantcast.com/how-we-do-it. We also have white papers on our cookie-corrected audience data and our methodology located here.
Perusing the white papers I see that they are, actually, doing what Jeff suggests: fudging the "official" numbers to get something that they think is closer to the true number of people. They have a Cookie Corrected Audience White Paper (PDF link) which implies that their system is rather elaborate, not as simple as just dividing by a magic number:
The Quantcast Quantified Publisher program captures over 75 billion media consumption events every month, generated by more than 1.4 billion cookies (data as of June, 2008). What’s more, many of our Quantified Publisher partners share anonymous identifiers with us that are independent of cookies. Our model also includes several panels which provide for people-based reference points and calibration which are free of cookie deletion. We triangulate across this mass of data with different collection processes, biases and issues. Our models take into account visit frequency, time periods, the likelihood of multiple computer usage and even the impact of multiple people using the same computer to deliver people based estimates.
Our model for translating unique cookies to people has been validated using hold-out samples and independent data sets. Further, our model is dynamic and recalibrated on an ongoing basis to reflect the evolving nature of Internet traffic patterns.
The ratio of cookies to unique visitors is usually between 1.3 and 1.7 for sites with over a million visits.
While yc01 is correct that GA uses first-party cookies vs third-party cookies, we at RealSelf.com use two first-party analytics providers (GA and Comscore Direct) and GA still shows 30% more Absolute Unique Visitors than Comscore's Unique Visitors.
Comscore only shows unique visitors by country, so to compare GA to Comscore we have to calculate the number of US-based absolute unique visitors as follows:
US Visits / Global Visits * Absolute Unique Users
(1,150,110 / 1,650,979) * 1,273,059 = 886,842 US-based Unique Users
In contrast, Comscore reports 680,900 US-based Unique Users. So GA shows 30.2% more.
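The arithmetic above can be checked with a few lines of Python. The variable names are mine; the figures are the ones quoted in this answer, and the calculation keeps the same assumption that the US share of visits equals the US share of unique visitors:

```python
# Figures quoted above (GA and Comscore, same period).
us_visits = 1_150_110
global_visits = 1_650_979
absolute_unique_visitors = 1_273_059
comscore_us_unique = 680_900

# Assumption: US share of visits ~= US share of unique visitors.
us_share = us_visits / global_visits
ga_us_unique = us_share * absolute_unique_visitors

# How much higher GA's estimate is than Comscore's, in percent.
excess_pct = (ga_us_unique / comscore_us_unique - 1) * 100

print(f"GA US-based unique users (estimated): {ga_us_unique:,.0f}")
print(f"GA vs. Comscore: {excess_pct:.1f}% more")
```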
Comscore has built their business around trying to be accurate, while GA is primarily a free way to track and optimize sites that use AdWords and AdSense. Comscore has a panel of people that they also use to estimate traffic, and they use that panel to determine an average number of cookies per person. With more people using mobile devices (our mobile usage is 15%), it makes sense that unique cookies overstate the number of unique people.
I think the IP address is trustworthy enough. When I built a statistics system like GA in Python, I used a method like this:
Send a cookie to the browser and store all the user-agent data in the database.
The easy case: if a request already carries the cookie, it is not a new visitor, so I save it as a returning visit. (I also store a date and a delay time, so that a repeat visit after 2 hours counts as a new visit.)
Save the user's IP and an ID that ties together the user, the IP and the cookie (the ID is also saved in the cookie).
A request arrives without any cookie. Is this IP new? If yes, it is a new user: just record the user agent and IP. If no, how many times has this user come? If more than the limit, it is not really a new visit; if not more than the limit with this user agent, it counts as new. :D
This method has flaws, but it is not bad and gives nearly valid data. (It also depends on the delay used to detect a new visit, i.e. the gap between two visits, and on the retry limit for users who don't have a cookie.)
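The heuristic described above could be sketched roughly as follows. This is my reading of the steps, not the poster's actual code: the class and label names are hypothetical, an in-memory store stands in for the database, the cookieless limit of 3 is an arbitrary choice, and the handling of cookieless repeat requests is one interpretation of an ambiguous description:

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional

NEW_VISIT_DELAY = timedelta(hours=2)  # gap after which a known cookie starts a new visit
COOKIELESS_LIMIT = 3                  # assumed cap on cookieless hits per (IP, user agent)


@dataclass
class VisitorStore:
    """In-memory stand-in for the database described above."""
    last_seen: dict = field(default_factory=dict)        # cookie id -> last request time
    cookieless_hits: dict = field(default_factory=dict)  # (ip, user agent) -> hit count
    known_ips: set = field(default_factory=set)

    def classify(self, ip: str, user_agent: str,
                 cookie_id: Optional[str], now: datetime):
        """Return (label, cookie id to set on the response)."""
        if cookie_id is not None and cookie_id in self.last_seen:
            # Easy case: a known cookie is never a new visitor.
            gap = now - self.last_seen[cookie_id]
            self.last_seen[cookie_id] = now
            self.known_ips.add(ip)
            label = "new_visit" if gap >= NEW_VISIT_DELAY else "same_visit"
            return label, cookie_id

        # No usable cookie: issue one, then fall back to IP and user agent.
        new_id = uuid.uuid4().hex
        self.last_seen[new_id] = now
        key = (ip, user_agent)
        hits = self.cookieless_hits.get(key, 0) + 1
        self.cookieless_hits[key] = hits
        ip_is_new = ip not in self.known_ips
        self.known_ips.add(ip)
        if ip_is_new or hits <= COOKIELESS_LIMIT:
            return "new_visitor", new_id
        # Same IP and user agent keeps arriving without a cookie:
        # probably one client that refuses cookies, not many new people.
        return "repeat_without_cookie", new_id
```

A caller would pass the cookie value from the request (or `None`) and set the returned ID as the cookie on the response. As the answer says, the result depends heavily on the two thresholds, and shared IPs (offices, mobile carriers) will still be misclassified.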
There's another factor at play with Quantcast undercounting: they use third-party cookies (cookies served from the .quantserve.com domain), whereas Google Analytics uses first-party cookies (stackexchange.com, etc.).
This is pretty crucial, as some browsers (particularly Safari, but more recently Firefox and Chrome) disable third-party cookies as the default setting, and many users may individually choose privacy settings that bar third-party cookies. This means there is a subset of the population that will never get tracked by Quantcast's cookies. Inherently, that means Google Analytics will always return a higher visitor count.
I'd say there is no rule of thumb. As an analytics practitioner, I'd say that the quest for a 'true' visitor count is hopeless, and that you should instead focus on the visits themselves. For example, to your Google Analytics account, I'm at least 8 different visitors, having accessed StackOverflow from Chrome, Safari and Firefox on my work laptop, my personal laptop, my phone, and my iPad. Analytics services all count in different ways, and thus all return significantly different numbers.
Even with a perfect implementation, Google Analytics will almost always show lower visit counts than a server-log based analytics system, but higher visit counts than a third-party cookie based system like Quantcast. The important thing isn't to look at the raw totals, but at the trends that each method shows in its strengths. So, never compare Quantcast numbers to Google Analytics numbers; instead, use the numbers within the contexts in which they were collected.
Another issue could be that your Google Analytics implementation isn't correct. Configuring it for your kind of multiple-domain-and-subdomain setup is hard to get right, and a misconfiguration could lead to a single browser being counted as multiple visitors, inflating your count. This is never an issue for Quantcast, as all cookies are set at their one third-party domain.