Which token from a long User-Agent should I use in robots.txt?
The definition of User-Agent states that several tokens can be included, as deemed necessary by the client.
I want to block certain bots via robots.txt and I am confused as to which part of the User-Agent string to use, especially for more obscure bots. For example:
Mozilla/5.0 (compatible; uMBot-LN/1.0; mailto: crawling@ubermetrics-technologies.com)
JS-Kit URL Resolver, js-kit.com/
Mozilla/5.0 (compatible; SEOkicks-Robot +http://www.seokicks.de/robot.html)
Do I use the second token? Can tokens contain spaces, or did the SEOkicks folks forget a semicolon after SEOkicks-Robot? I don't intend to make this question specific to a couple of bots. I want to know the general guideline: which part of the UA do I put in robots.txt for these exotic bots with a UA as long as a haiku? For instance, would it be:
User-agent: uMBot-LN/1.0
Disallow: /
PS: Thank you but I do not need to hear that undesirable bots are better blocked with mod_security. I already have commercial mod_sec rules in place.
Web crawlers that honor robots.txt often publish a page describing the crawler, with instructions on how to block it in robots.txt:
Google:
The Google user-agent is (appropriately enough) Googlebot.
Yandex:
Examples: User-agent: Yandex
Yahoo:
Yahoo Slurp obeys the first entry in the robots.txt file with a User-agent containing "Slurp."
There is also a database of robot names that can be used in robots.txt on the robots website: www.robotstxt.org/db.html
Unfortunately, I could not find such a page for either of the robots you posted as examples, and neither is listed in the robots database. However, as a pattern, I would not expect a slash (and the version number after it) to be appropriate in the User-agent line of robots.txt. None of the examples I have come across recommend that. So I would use the bare product name:
User-agent: uMBot-LN
Disallow: /
User-agent: SEOkicks-Robot
Disallow: /
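If you want a quick sanity check of rules like these before deploying them, here is a minimal sketch using Python's standard-library urllib.robotparser. It only shows how that one parser reads the rules; the bots themselves may match names differently. SomeOtherBot and the path /any/page.html are made-up names for illustration.

```python
from urllib.robotparser import RobotFileParser

# Rules as suggested above: product names only, no "/version" suffix.
ROBOTS_TXT = """\
User-agent: uMBot-LN
Disallow: /

User-agent: SEOkicks-Robot
Disallow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Pass the crawler's short product name, not the whole User-Agent header.
# "SomeOtherBot" is a hypothetical crawler that no rule mentions.
for bot in ("uMBot-LN", "SEOkicks-Robot", "SomeOtherBot"):
    allowed = parser.can_fetch(bot, "/any/page.html")
    print(f"{bot}: {'allowed' if allowed else 'blocked'}")
```

The two named bots should come out as blocked and the unlisted one as allowed, since there is no catch-all `User-agent: *` group in these rules.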