Yeniel560

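The best approach will most likely involve getting the IP address of the visitor to the page, performing a reverse DNS lookup, and checking whether the resulting domain name matches a known list of web crawlers. To rule out a forged reverse record, also do a forward lookup on the returned host name and confirm that it resolves back to the same IP address. As far as I know, this is pretty much foolproof (discounting DNS spoofing, which is unlikely to be a major problem).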

For the Google web crawler, this is described in the blog post How to verify Googlebot.

Here's a list of the domain name wildcards for the most common spider bots/web crawlers:


Google (Googlebot): *.googlebot.com
Bing (msnbot): (Not resolvable, see IP ranges)
Yahoo (Yahoo Slurp): *.yahoo.com
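
As a rough illustration, the lookup-and-match check could be sketched in Python along these lines. The suffixes come from the list above; the function names, and the forward lookup that confirms the host name maps back to the same IP, are my own assumptions about how you might wire it up rather than anything the search engines prescribe. Note that socket.gethostbyname_ex only handles IPv4, so IPv6 visitors would need socket.getaddrinfo instead.

```python
import socket

# Domain suffixes taken from the list above; extend as needed.
CRAWLER_SUFFIXES = (".googlebot.com", ".yahoo.com")


def hostname_matches(hostname, suffixes=CRAWLER_SUFFIXES):
    """Return True if hostname ends with one of the crawler domain suffixes."""
    host = hostname.rstrip(".").lower()
    return any(host.endswith(suffix) for suffix in suffixes)


def is_verified_crawler(ip, suffixes=CRAWLER_SUFFIXES,
                        reverse=socket.gethostbyaddr,
                        forward=socket.gethostbyname_ex):
    """Reverse-resolve ip, check the domain, then forward-confirm the name.

    The reverse/forward parameters are hypothetical hooks so the resolvers
    can be swapped out (for example in tests).
    """
    try:
        hostname = reverse(ip)[0]          # reverse DNS (PTR) lookup
    except OSError:                        # covers socket.herror / gaierror
        return False
    if not hostname_matches(hostname, suffixes):
        return False
    try:
        addresses = forward(hostname)[2]   # forward lookup of that name
    except OSError:
        return False
    return ip in addresses                 # must map back to the visitor IP
```

You would call is_verified_crawler with the visitor's IP address (the remote address of the request) and treat a False result as "not a known crawler". DNS lookups are slow, so caching the result per IP is probably worthwhile.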


Though I'm not sure how often the IP address ranges for the main crawlers change, there's also this page, which lists such ranges for the three main search engines.
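
If you go the IP range route instead, a check like the following might do. This is only a sketch: the ranges below are reserved documentation blocks used as placeholders, not real crawler ranges, and the names EXAMPLE_RANGES and crawler_for_ip are made up for the example. You would fill in the actual ranges from whatever list you trust and refresh them periodically.

```python
import ipaddress

# Placeholder ranges (reserved documentation blocks), NOT real crawler ranges.
EXAMPLE_RANGES = {
    "examplebot": ["192.0.2.0/24", "2001:db8::/32"],
}


def crawler_for_ip(ip, ranges=EXAMPLE_RANGES):
    """Return the name of the crawler whose range contains ip, or None."""
    addr = ipaddress.ip_address(ip)
    for name, cidrs in ranges.items():
        for cidr in cidrs:
            # Membership is simply False when IPv4/IPv6 versions differ.
            if addr in ipaddress.ip_network(cidr):
                return name
    return None
```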

(Note: I believe the bots do set the user-agent HTTP header on requests, but this is very easy to fake of course.)

Hope this helps.
