Clicks counting and crawler bots
I am currently running a small affiliate program for Facebook users. We use an auto-poster to publish links to fan pages. Every hit is stored in our database, and we have a 24-hour reload block per IP address. My problem is that the PHP script also stores every hit from the bots that crawl my website. I was thinking of blocking those bots with my site's robots.txt, but I am afraid that this will have a negative effect on my AdSense ads.
Does anybody have an idea how to solve this?
More posts by @Martha676
Avoid using robots.txt to block known robots. You want robots to crawl your site; you only need to distinguish which hits should be logged and which should not.
To solve this, check the client's user agent and store only the hits that meet your criteria. For example, you could use a simple function to check whether the hit comes from a known crawler (you can add more keys to the list):
    function client_is_crawler($user_agent)
    {
        // Known crawlers; each value is a regular expression fragment
        $crawlers = array(
            'google' => 'GoogleBot|Google Web Preview|Mediapartners-Google|Wireless\s*Transcoder',
            'alexa' => 'ia_archiver',
            // No trailing ";" here: it would stop the closing \b from matching
            'yahoo' => 'compatible; Yahoo! Slurp',
            'msn' => 'msnbot',
            'bing' => 'bingbot',
            'apache_bench' => 'ApacheBench',
            'baiduspider' => 'Baiduspider',
            'grapeshot' => 'GrapeshotCrawler',
            'archive.org' => 'archive\.org_bot',
            'spider' => 'spider',
            'indexer' => 'indexer',
            'admantx' => 'admantx\.com',
            'robot' => 'robot',
            'bot' => 'bot',
            'search' => 'search',
            'genieo' => 'Genieo'
        );
        // Loop through the list and check whether the client is a known crawler
        foreach ($crawlers as $key => $crawler)
        {
            // (?: ... ) makes the word boundaries apply to every alternative
            if (preg_match('/\b(?:' . $crawler . ')\b/i', $user_agent) > 0)
                return $key; // Known crawler: return its key
        }
        return false; // Not a known crawler
    }
Then call it with the client's user agent, taken from the $_SERVER superglobal array. The header is optional, so fall back to an empty string when it is missing:
    if (client_is_crawler($_SERVER['HTTP_USER_AGENT'] ?? '') === false) // Client is not a known bot
        do_store_hit();
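Requests that send no User-Agent header at all are usually scripts rather than visitors, so you may prefer not to count them either. The following is only a sketch of one possible policy, not part of the original answer: should_store_hit() and $demo_detector are hypothetical names, and the stand-in detector exists only so the example runs on its own. In practice you would pass 'client_is_crawler' as the second argument.

```php
<?php
// Hypothetical wrapper (an assumption, not from the original answer).
// $is_crawler is any callable that returns false for human visitors,
// for example the client_is_crawler() function defined earlier.
function should_store_hit(array $server, callable $is_crawler): bool
{
    // The User-Agent header is optional, so the key may not exist.
    $user_agent = $server['HTTP_USER_AGENT'] ?? '';

    // Assumption: a request without a User-Agent is not a real visitor.
    if (trim($user_agent) === '') {
        return false;
    }

    return $is_crawler($user_agent) === false;
}

// Stand-in detector so the sketch is self-contained.
$demo_detector = function (string $user_agent) {
    return preg_match('/bot\b/i', $user_agent) === 1 ? 'bot' : false;
};

// A regular browser is counted.
var_dump(should_store_hit(
    ['HTTP_USER_AGENT' => 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)'],
    $demo_detector
)); // bool(true)

// A crawler is skipped.
var_dump(should_store_hit(
    ['HTTP_USER_AGENT' => 'Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)'],
    $demo_detector
)); // bool(false)

// A request with no User-Agent header is skipped.
var_dump(should_store_hit([], $demo_detector)); // bool(false)
```

With the real function in place, the call would look like should_store_hit($_SERVER, 'client_is_crawler'). Keep in mind that user agents can be spoofed, so this filters well-behaved crawlers only.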