
@Bryan171

Posted in: #UserAgent

Does the user agent in any regular browser contain 'bot' or 'crawl'?

I check the user agent on my site to see whether a request is coming from a bot. If it is, I can apply a few small optimizations, since bots don't log in. (I don't change the content at all.)

After adding checks for 30-40+ bots, I'm getting tired of adding them. So I was wondering whether I could just check if the user agent contains 'bot' or 'crawl'. I know that won't catch all bots, but it would catch a lot of them. But if it caused any false positives, it would totally break the ability to add to cart, place an order, and log in.


3 Comments


 

@Ravi8258870

According to the list at www.useragentstring.com/pages/useragentstring.php?typ=Browser with over 9000 user agent strings from various browsers:


0 browser user agent strings contain the word "bot"
2 browser user agent strings contain the word "crawl"
0 browser user agent strings contain the word "spider"


(The 2 that contain "crawl" are "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0; YComp 5.0.2.6; MSIECrawler)" and "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0; MSIECrawler)". I think it is safe to ignore those.)

According to the list at www.useragentstring.com/pages/useragentstring.php?typ=Crawler with 442 user agent strings listed as bots:


208 bot user agent strings contain the word "bot"
63 bot user agent strings contain the word "crawl"
37 bot user agent strings contain the word "spider"
282 bot user agent strings contain at least one of "bot", "crawl" or "spider"


My conclusion: it is safe to detect bots by checking the user agent string for the words "bot", "crawl" and "spider". It's not bullet-proof (it catches 282 of the 442 listed bots), but it is definitely better than nothing.

Note: When searching for the keywords I used case insensitive searching.
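As a minimal sketch of that check, assuming a Python backend (the question does not say which language the site uses), the three keywords can be matched case-insensitively with one regular expression. The function name `looks_like_bot` is made up for illustration, and treating a missing user agent as a regular visitor is an assumption chosen so that cart and login keep working:

```python
import re

# Keywords from the counts above; re.IGNORECASE mirrors the
# case-insensitive search used on the lists.
BOT_PATTERN = re.compile(r"bot|crawl|spider", re.IGNORECASE)


def looks_like_bot(user_agent):
    """Return True if the user agent contains 'bot', 'crawl' or 'spider'.

    Hypothetical helper, not part of any framework.
    """
    if not user_agent:
        # Assumption: a missing header is treated as a regular visitor,
        # so nobody loses cart or login features by accident.
        return False
    return BOT_PATTERN.search(user_agent) is not None
```

Note that this would also flag the two old MSIECrawler strings quoted above, which is the trade-off already accepted in this answer.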



 

@Yeniel560

A better solution IMO would be to detect whether the user is logged in. If they are not, show the standard page (this could be cached). A web spider will never be logged in, and if you are optimizing for spiders anyway, why not do the same for new visitors to your site?
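A rough sketch of that idea in Python, under the assumption that the site can tell whether a request belongs to a logged-in session. Every name here (`choose_response`, `cached_page`, `render_personalized`) is hypothetical and only illustrates the branching, not any particular framework:

```python
def choose_response(is_logged_in, cached_page, render_personalized):
    """Serve the cached standard page to anyone who is not logged in.

    is_logged_in:        bool, however the site determines it (assumed)
    cached_page:         the pre-rendered standard page
    render_personalized: callable that builds the page for a logged-in user
    """
    if not is_logged_in:
        # Bots and first-time visitors both land here; no user agent
        # sniffing is needed, so there are no false positives to worry about.
        return cached_page
    return render_personalized()
```

The personalized renderer is passed as a callable so that it only runs for logged-in users.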



 

@Sue5673885

This question from Stack Overflow should help: "Is there an online user agent database?"

You could quickly scan the database to find out (or import it).
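Scanning such a list could look roughly like the Python sketch below. It assumes the user agent strings have already been loaded into a list of plain strings; how they are exported from a given database is not covered here, and the function name `count_keyword_hits` is made up for illustration:

```python
KEYWORDS = ("bot", "crawl", "spider")


def count_keyword_hits(user_agents, keywords=KEYWORDS):
    """Count how many user agent strings contain each keyword.

    Matching is case-insensitive. The extra "any" entry counts strings
    that contain at least one of the keywords.
    """
    counts = {keyword: 0 for keyword in keywords}
    counts["any"] = 0
    for user_agent in user_agents:
        lowered = user_agent.lower()
        matched = [keyword for keyword in keywords if keyword in lowered]
        for keyword in matched:
            counts[keyword] += 1
        if matched:
            counts["any"] += 1
    return counts
```

Running it over a list of regular browser strings shows how many false positives a keyword would cause; running it over a list of known bots shows how many it would catch.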


