
Do I really have to block MJ12Bot (as the prevailing visitor on my site)?

@Murray432

Posted in: #Block #RobotsTxt #WebCrawlers

I am all for allowing any legitimate search engine to visit my site, but I've noticed that on my business-card-style website about every other request comes from MJ12Bot. Since it is a niche SEO bot, it doesn't send any human visitors back, so I'm quite disappointed by the noise it generates.

% cut -f12- -d" " constantine.su.access.log | sort | uniq -c | fgrep -i -e bot -e spider | sort -nr | head
421 "Mozilla/5.0 (compatible; MJ12bot/v1.4.5; www.majestic12.co.uk/bot.php?+)"
69 "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
64 "woobot/1.1"
62 "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
61 "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
39 "Mozilla/5.0 (compatible; SeznamBot/3.2; +http://napoveda.seznam.cz/en/seznambot-intro/)"
30 "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
14 "Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)"
13 "woobot/2.0"
12 "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"




Is there a way to quiet down MJ12Bot ambitions (by something like 20×)? Or, due to the distributed nature of the MJ12bot project, do I just have to block 'em all outright as parasitic?


3 Comments


@Odierno851

MJ12bot adheres to the robots.txt standard. If you want to prevent the bot from crawling your website, add the following to your robots.txt:

User-agent: MJ12bot
Disallow: /
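
One way to sanity-check this rule before deploying it is to run it through the robots.txt parser in Python's standard library. This is only a sketch: urllib.robotparser approximates how a well-behaved crawler reads the file and says nothing about MJ12bot's own parser, and both example.com and the is_allowed helper are placeholders made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The rule from the answer above.
ROBOTS_TXT = """\
User-agent: MJ12bot
Disallow: /
"""


def is_allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under `robots_txt` (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Pass the bare product token, not the full User-Agent header.
mj12_allowed = is_allowed(ROBOTS_TXT, "MJ12bot", "http://example.com/")
google_allowed = is_allowed(ROBOTS_TXT, "Googlebot", "http://example.com/")
print(mj12_allowed, google_allowed)
```

Under this parser the rule blocks MJ12bot everywhere on the site while leaving crawlers with other names, such as Googlebot, unaffected.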


@Alves908

Judging from your comments on another answer, MJ12Bot is visiting your site less than once an hour (421 times in 25 days). The best thing to do is not to worry about it. Crawl-Delay is useless for you because no crawler will obey a Crawl-Delay that large.
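
The arithmetic behind that estimate can be spelled out as below. It is a rough sketch using only the figures quoted in this thread (421 requests, 25 days, a 20x reduction), and it assumes the requests are evenly spaced, which real crawls are not.

```python
# Figures quoted in this thread: 421 MJ12bot requests over 25 days.
REQUESTS = 421
DAYS = 25
REDUCTION = 20  # the "20x" reduction asked for in the question

SECONDS_PER_DAY = 24 * 60 * 60

requests_per_hour = REQUESTS / (DAYS * 24)
avg_gap_seconds = DAYS * SECONDS_PER_DAY / REQUESTS
# Assumption: requests are evenly spaced, which real crawls are not.
target_gap_seconds = avg_gap_seconds * REDUCTION

print(f"{requests_per_hour:.2f} requests/hour")
print(f"average gap: {avg_gap_seconds / 3600:.1f} hours")
print(f"gap needed for a {REDUCTION}x cut: {target_gap_seconds / 3600:.1f} hours")
```

That works out to roughly 0.7 requests per hour, or one request every 1.4 hours on average. A 20x cut would need a gap of more than a day between requests, far beyond the delays of a few seconds that Crawl-Delay is normally used for.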


@Ann8826881

Is there a way to quiet down MJ12Bot ambitions


The MJ12Bot reportedly obeys robots.txt and the (non-standard) Crawl-Delay directive:


How can I slow down MJ12bot?

You can easily slow down bot by adding the following to your robots.txt file:

User-Agent: MJ12bot
Crawl-Delay: 5


Crawl-Delay should be an integer number and it signifies number of seconds of wait between requests. MJ12bot will make an up to 20 seconds delay between requests to your site - note however that while it is unlikely, it is still possible your site may have been crawled from multiple MJ12bots at the same time. Making high Crawl-Delay should minimise impact on your site.


Reference: mj12bot.com/
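
To see how a generic parser reads the quoted snippet, here is a small sketch using Python's standard-library urllib.robotparser (its crawl_delay method needs Python 3.6 or later). It only shows what the directive says, not how MJ12bot itself interprets it, and example.com is a placeholder.

```python
from urllib.robotparser import RobotFileParser

# The snippet quoted above from the MJ12bot documentation.
ROBOTS_TXT = """\
User-Agent: MJ12bot
Crawl-Delay: 5
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

delay = parser.crawl_delay("MJ12bot")          # seconds between requests, or None
other_delay = parser.crawl_delay("Googlebot")  # no matching group, so None
still_allowed = parser.can_fetch("MJ12bot", "http://example.com/")

# Upper bound on requests per day for a single crawler honouring the delay.
max_requests_per_day = 24 * 60 * 60 // delay

print(delay, other_delay, still_allowed, max_requests_per_day)
```

Note that a 5-second delay still permits up to 17280 requests per day from a single crawler, so Crawl-Delay caps bursts rather than reducing an already low request rate.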
