If a visitor's IP address contains "google" or a similar keyword, does this mean they were a crawler?
I have a huge list of IP addresses recorded from various visitors to a website. A large share of the visitors, over 70% in some months, came from addresses whose hostnames contained keywords such as google, yahoo, bot, crawler, etc.
Does this mean that those visitors were in fact search engine crawlers?
If so, why are there so many crawlers in my visitor records compared with genuine human visitors? (And if not, what's the explanation?)
Thanks in advance.
UPDATE
Here are a few examples of the data:
livebot-65-55-209-133.search.live.com
crawl2.cosmixcorp.com
crawl-66-249-70-78.googlebot.com
lj511965.crawl.yahoo.net
lj611054.inktomisearch.com
ss125.dal0.gigablast.com
crawl-15.cuill.com
2 Comments
These are the hostnames of search engine crawlers, which are more commonly identified by their user-agent strings. The subdomain part of each hostname normally refers to a server ID (sometimes including the IP address).
You can find a comprehensive list at www.user-agents.org/
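To get a rough idea of how much of a log is crawler traffic, matching on the end of the hostname is safer than searching for a keyword anywhere in it. Here's a minimal sketch in Python; the suffix list is only an assumption built from the sample hostnames in the question, and the function names are made up for illustration:

```python
# Hypothetical suffix list, taken only from the sample hostnames in the question.
CRAWLER_SUFFIXES = (
    ".search.live.com",
    ".cosmixcorp.com",
    ".googlebot.com",
    ".crawl.yahoo.net",
    ".inktomisearch.com",
    ".gigablast.com",
    ".cuill.com",
)


def is_crawler_hostname(hostname):
    """Return True if the hostname ends with one of the known crawler suffixes."""
    host = hostname.strip().lower().rstrip(".")
    # str.endswith accepts a tuple, so this checks every suffix at once.
    return host.endswith(CRAWLER_SUFFIXES)


def crawler_share(hostnames):
    """Return the fraction (0.0 to 1.0) of non-empty hostnames that look like crawlers."""
    hosts = [h for h in hostnames if h.strip()]
    if not hosts:
        return 0.0
    return sum(is_crawler_hostname(h) for h in hosts) / len(hosts)
```

Matching on the suffix means a name like googlebot.com.example.net is not counted, whereas a plain keyword search for "google" would count it.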
Those look genuine, as it's fairly hard to spoof a source domain name. As for why there are so many: crawlers can generate a lot of traffic, because they often check pages for changes quite frequently.
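If you want more certainty for a particular address, a common check is a reverse DNS lookup followed by a forward lookup on the resulting name, to confirm that it maps back to the same IP. A rough sketch in Python, assuming IPv4 addresses only; the function name, the parameters, and the suffix list you pass in are all hypothetical:

```python
import socket


def verify_crawler_ip(ip, allowed_suffixes, reverse_lookup=None, forward_lookup=None):
    """Forward-confirmed reverse DNS check for an IPv4 address.

    The lookup functions can be swapped out (for example in tests);
    by default the standard socket resolvers are used.
    """
    reverse_lookup = reverse_lookup or socket.gethostbyaddr
    forward_lookup = forward_lookup or socket.gethostbyname_ex

    # Step 1: reverse lookup (IP -> hostname).
    try:
        hostname = reverse_lookup(ip)[0]
    except OSError:  # socket.herror and socket.gaierror are subclasses
        return False

    # Step 2: the hostname must belong to a domain you trust.
    host = hostname.lower().rstrip(".")
    if not host.endswith(tuple(allowed_suffixes)):
        return False

    # Step 3: forward lookup (hostname -> IPs) must include the original IP.
    try:
        addresses = forward_lookup(host)[2]
    except OSError:
        return False
    return ip in addresses
```

For example, verify_crawler_ip("66.249.70.78", [".googlebot.com"]) would do two live DNS queries, so the answer depends on what DNS returns at the time you run it.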
You can slow down many of them with a non-standard (but fairly well supported) addition to the Robots Exclusion Protocol, although some crawlers, Googlebot among them, ignore it. Create a file called robots.txt that's served from your web server's root directory with the following contents:
User-agent: *
Crawl-delay: 60
The number on the second line is the number of seconds you want each crawler to wait between page loads on your site. (If you've already got a robots.txt file you'll need to modify it instead.)
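If you'd like to sanity-check the file before deploying it, Python's standard urllib.robotparser module can read the Crawl-delay value. A small sketch, assuming Python 3.6 or later; the helper name and the "ExampleBot" user agent are made up for illustration:

```python
from urllib.robotparser import RobotFileParser

# The same two lines shown above.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 60
"""


def crawl_delay_for(robots_txt, user_agent):
    """Return the Crawl-delay (in seconds) that applies to user_agent, or None."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


if __name__ == "__main__":
    print(crawl_delay_for(ROBOTS_TXT, "ExampleBot"))
```

This only confirms that the file parses the way you expect. Whether a given crawler honours the delay is up to that crawler.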