How to block the most popular spider crawlers via robots.txt?

@Goswami781

Posted in: #RobotsTxt #WebCrawlers

I want to disallow my website from being indexed, via robots.txt, by the MSN/Bing, Yahoo, Ask Jeeves, Baidu and Yandex spider bots.

I want to block both their content crawlers and their media (image and video) crawlers.

The reason is that my website targets only Google and the US market, and it is hosted on a plan with limited resources.

I found various rules while googling and merged everything together:

# Block Bing
User-agent: bingbot
Disallow: /

User-agent: msnbot
Disallow: /

# Block Yahoo
User-agent: slurp
User-agent: yahoo
Disallow: /

# Block Ask
User-agent: jeeves
User-agent: teoma
Disallow: /

# Block Baidu
User-agent: baidu
Disallow: /

# Block Yandex
User-agent: yandex
Disallow: /


Are these rules correct?

Or did I miss something?

Or maybe I added something redundant?

Are there official robots.txt rules documented for each web crawler?

@Phylliss660

If you test the robots.txt in one of the many robots.txt validators, you'll see that it does what you want.

For instance, testing the URL / with the SEOBook robots.txt validator shows that these bots should not spider your website.
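
If you would rather check this locally than in an online tool, the same rules can be fed to Python's standard urllib.robotparser module. This is only a minimal sketch: it uses just two of the groups from the question, and example.com stands in for your own site.

from urllib.robotparser import RobotFileParser

# Two of the groups from the question's robots.txt, pasted verbatim.
RULES = """\
# Block Bing
User-agent: bingbot
Disallow: /

# Block Yahoo
User-agent: slurp
User-agent: yahoo
Disallow: /
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# can_fetch(user_agent, url) returns False when that agent is disallowed.
for bot in ("bingbot", "slurp", "Googlebot"):
    print(bot, parser.can_fetch(bot, "https://example.com/"))

This should print False for bingbot and slurp and True for Googlebot, since no group matches it and there is no catch-all User-agent: * group.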

Whether you really want to do that is another question. If a bot crawling the website puts too much strain on your resources, then you may also need to look at the performance of the website and/or the server.
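
As for official names: each search engine documents its own crawler tokens, and from that documentation (worth double-checking, since the names change over time) Bing uses bingbot plus the older msnbot, Yahoo uses Slurp, Ask uses Teoma, Baidu uses Baiduspider, and Yandex uses Yandex, which is meant to match all of its robots. A consolidated file built on those tokens could look like this sketch:

# Bing
User-agent: bingbot
User-agent: msnbot
Disallow: /

# Yahoo
User-agent: Slurp
Disallow: /

# Ask
User-agent: Teoma
Disallow: /

# Baidu
User-agent: Baiduspider
Disallow: /

# Yandex
User-agent: Yandex
Disallow: /

Everything not listed, including Googlebot, keeps full access because there is no catch-all User-agent: * group. If blocking the image and video crawlers matters to you, check each engine's documentation for whether its media crawler (Bing's msnbot-media, for example) follows the main token or needs a group of its own.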
