
How to correctly count pageviews?

@Welton855

Posted in: #GoogleAnalytics #Php #Statistics

I've been using a free 3rd party sitestats counter for years, and I would like to get rid of the external party.

For this simple task I'm only interested in pageviews (I use Google Analytics for other stats). So I perform an insert in the footer of every page (excluding all the bots I have seen come by in the last week), but for some reason my counts are 45% higher than the old counter's and 40% higher than Google's pageview counts.

Are there any additional checks I need to do to prevent counting incorrect pageviews?

Thanks!


2 Comments


@Looi9037786

Ok I'm getting somewhere now...

I have added some checks, so now I don't count a pageview when:


the user agent contains "bot"
the user agent contains "crawl"
the user agent is shorter than 40 characters (about 98% of those visits seem to be spiders and bots)
the request URL is the user registration page (about 99% of those requests seem to be spambots trying to create spam accounts; checking IPs against bot databases would be even better, but this gets me near enough)
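The four checks above could be combined into a single filter along these lines. This is only a minimal sketch in Python (the same logic ports directly to PHP); the function name `should_count`, the `/register` path, and the parameter names are assumptions made for illustration, not part of my actual setup.

```python
def should_count(user_agent: str, request_path: str,
                 registration_path: str = "/register",  # hypothetical path
                 min_ua_length: int = 40) -> bool:
    """Return True if the request should be recorded as a pageview."""
    ua = (user_agent or "").lower()

    # Skip clients that identify themselves as bots or crawlers.
    if "bot" in ua or "crawl" in ua:
        return False

    # Skip very short user agents, which are mostly spiders and bots.
    if len(ua) < min_ua_length:
        return False

    # Skip the registration page, which is hit mostly by spambots.
    # The query string is dropped before comparing paths.
    path = request_path.split("?", 1)[0].rstrip("/")
    if path == registration_path:
        return False

    return True
```

The footer code would then run the insert only when `should_count(...)` returns true for the current request's user agent and URL.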


Result:


compared to Google Analytics (JavaScript-based): +8% pageviews
compared to my old counter (image-based): +27% pageviews


I'm still evaluating, but once the difference stays constant, I'll just multiply the pageviews registered by the old counter by that factor, so here by 1.27.
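As a rough illustration of that rescaling, here is a small Python sketch. The function names and the sample counts are made up for the example; only the 1.27 factor comes from my measurements, and it assumes the ratio between the two counters really does stay constant.

```python
def calibration_factor(new_count: int, old_count: int) -> float:
    """Ratio between the new counter and the old one over the same period."""
    if old_count <= 0:
        raise ValueError("old_count must be positive")
    return new_count / old_count


def rescale(old_counts, factor):
    """Scale historical counts from the old counter to the new counter's level."""
    return [round(count * factor) for count in old_counts]


# Hypothetical overlap period: both counters ran side by side.
factor = calibration_factor(new_count=1270, old_count=1000)

# Hypothetical historical daily totals from the old counter.
adjusted_history = rescale([1000, 2500], factor)
```

With a factor of 1.27, an old-counter day of 1000 pageviews becomes 1270 on the new scale, so old and new figures can sit on the same graph.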

I hope this helps people trying to make their own pageview counter.


@Bethany197

You'll never get one analytics tool to match another's numbers exactly. In this case, you're comparing GA to your own system. You should be looking at trends, not absolute numbers.
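To make "trends, not numbers" concrete, here is a minimal Python sketch that compares period-over-period percentage changes instead of raw totals. The function name and the sample figures are invented for illustration.

```python
def percent_changes(counts):
    """Percentage change between each pair of consecutive periods."""
    changes = []
    for previous, current in zip(counts, counts[1:]):
        if previous == 0:
            raise ValueError("cannot compute a change from a zero count")
        changes.append((current - previous) / previous * 100)
    return changes


# Hypothetical weekly pageviews from two tools that disagree by a constant 40%.
ga_counts = [1000, 1100, 990]
own_counts = [1400, 1540, 1386]

ga_trend = percent_changes(ga_counts)
own_trend = percent_changes(own_counts)
```

Both series move up about 10% and then down about 10%, so the two tools tell the same story even though their absolute numbers never match.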

However, 40% is quite large. I've found that anywhere from 2-10% of traffic will have JavaScript disabled, depending on the site's industry. You also say you are excluding bots, but how exactly are you determining whether a visitor is a bot? GA has a large list of bots, search engines, etc. that it filters out. That is most likely your largest source of difference.
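A quick back-of-the-envelope check shows why disabled JavaScript alone cannot explain a 40% gap. The Python sketch below assumes the only difference between the two counters is that GA misses pageviews from visitors without JavaScript; the function name is made up for the example.

```python
def expected_overcount(js_disabled_share: float) -> float:
    """Relative amount by which a server-side counter would exceed GA
    if GA missed only the pageviews made with JavaScript disabled."""
    if not 0 <= js_disabled_share < 1:
        raise ValueError("share must be in the range [0, 1)")
    # GA sees (1 - share) of what the server sees, so the server count
    # is 1 / (1 - share) times the GA count.
    return js_disabled_share / (1 - js_disabled_share)


low = expected_overcount(0.02)   # 2% of traffic without JavaScript
high = expected_overcount(0.10)  # 10% of traffic without JavaScript
```

Under that assumption the server-side count would come out roughly 2% to 11% higher than GA, which leaves most of a 40% difference to be explained by something else, such as bot traffic.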

Why aren't you depending more on GA for analysis instead of trying to make your own solution? If you are doing it just to grab numbers for a backend purpose, use GA's API to grab the data for graphs/whatever.
