How to correctly count pageviews?
I've been using a free 3rd party sitestats counter for years, and I would like to get rid of the external party.
For this simple thing I'm only interested in pageviews (I use Google Analytics for other stats). So I'm performing an insert in the footer of every page (excluding all bots I have seen come by in the last week), but for some reason my counts are 45% higher than the old counter's and 40% higher than Google's pageview counts.
Are there any additional checks I need to do to prevent counting incorrect pageviews?
Thanks!
2 Comments
Ok I'm getting somewhere now...
I have added some checks, so now I'm not counting a pageview when:
useragent contains "bot"
useragent contains "crawl"
useragent shorter than 40 chars (98% of those visits are spiders and bots it seems)
request_url is the user registration page (it seems that 99% of those requests are spambots trying to create spam accounts; even better would be to check IPs against bot databases, but this gets me near enough)
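The checks above can be sketched as a single filter function. This is only a minimal illustration in Python, not the code I actually run: the function name `should_count_pageview` and the registration path `/register` are made up for the example, and the 40-character threshold is just the value I mentioned above.

```python
def should_count_pageview(user_agent, request_path, registration_path="/register"):
    """Return True if the request looks like a real pageview.

    registration_path is a hypothetical default; use your site's own URL.
    """
    ua = (user_agent or "").lower()

    # Self-identified bots and crawlers.
    if "bot" in ua or "crawl" in ua:
        return False

    # Very short (or missing) user agents are almost always spiders.
    if len(ua) < 40:
        return False

    # Registration page requests are mostly spambots.
    if request_path.rstrip("/") == registration_path.rstrip("/"):
        return False

    return True
```

You would call this just before the insert in the footer and skip the insert when it returns False. Note that the substring checks are crude: they also drop any legitimate browser whose user agent happens to contain "bot" or "crawl".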
Result:
compared to Google Analytics (with javascript): +8% pageviews
compared to my old counter (with an image): +27% pageviews
I'm still evaluating, but once I get a constant difference, I'll just multiply the pageviews registered by the old counter by that factor, so here *1.27
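As a rough sketch of that scaling step (again in Python, with hypothetical function names and made-up sample numbers): work out the factor from a period where both counters ran side by side, then apply it to the old counter's historical figures so they line up with the new counter.

```python
def calibration_factor(old_total, new_total):
    """Ratio of new-counter to old-counter pageviews over the same period."""
    if old_total <= 0:
        raise ValueError("old_total must be positive")
    return new_total / old_total


def scale_old_counts(old_counts, factor):
    """Scale historical old-counter pageviews to the new counter's level."""
    return [round(count * factor) for count in old_counts]


# Made-up overlap period: old counter saw 1000 views, new counter saw 1270.
factor = calibration_factor(1000, 1270)
scaled = scale_old_counts([2000, 3000], factor)
```

This only makes sense once the difference between the two counters has stayed stable for a while; if the ratio keeps drifting, a single factor will misstate the older numbers.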
I hope this helps people trying to make their own pageview counter.
You'll never be able to get one analytics tool to match the same numbers as another. In this case, you're comparing GA to your own system. You should be looking at trends, and not numbers.
However, 40% is quite large. I've found that anywhere from 2-10% of traffic will have JavaScript disabled, depending on the industry the site serves. You also say you are excluding bots, but how exactly are you determining whether a visitor is a bot? GA has a large list of bots, search engines, etc. that they filter out. That is most likely your largest source of difference.
Why aren't you depending more on GA for analysis instead of trying to make your own solution? If you are doing it just to grab numbers for a backend purpose, use GA's API to grab the data for graphs/whatever.