How to block the most popular spider crawlers via robots.txt?

@Goswami781

Posted in: #RobotsTxt #WebCrawlers

I want to disallow my website from being indexed, via robots.txt, by the MSN/Bing, Yahoo, Ask Jeeves, Baidu and Yandex spider bots.

I want to block both their content crawlers and their media (image and video) crawlers.

The reason is that my website targets only Google and the US market, and it is hosted on a plan with limited resources.

I found various rules while googling and merged everything together:

# Block Bing
User-agent: bingbot
Disallow: /

User-agent: msnbot
Disallow: /

# Block Yahoo
User-agent: slurp
User-agent: yahoo
Disallow: /

# Block Ask
User-agent: jeeves
User-agent: teoma
Disallow: /

# Block Baidu
User-agent: baidu
Disallow: /

# Block Yandex
User-agent: yandex
Disallow: /


Are these rules correct?

Or did I miss something?

Or maybe I added something redundant?

Are there official robots.txt rules documented for each web crawler?

@Phylliss660

If you test the robots.txt in one of the many robots.txt validators, you'll see that it does what you want.

For instance, testing the URL / with the SEOBook robots.txt validator shows that these bots should not spider your website.
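
If you would rather check this locally than in an online tool, the same rules can be fed to Python's standard urllib.robotparser module. This is only a minimal sketch: it uses just two of the groups from the question, and example.com stands in for your own site.

from urllib.robotparser import RobotFileParser

# Two of the groups from the question's robots.txt, pasted verbatim.
RULES = """\
# Block Bing
User-agent: bingbot
Disallow: /

# Block Yahoo
User-agent: slurp
User-agent: yahoo
Disallow: /
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# can_fetch(user_agent, url) returns False when that agent is disallowed.
for bot in ("bingbot", "slurp", "Googlebot"):
    print(bot, parser.can_fetch(bot, "https://example.com/"))

This should print False for bingbot and slurp and True for Googlebot, since no group matches it and there is no catch-all User-agent: * group.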

Whether you really want to do that is another question. If a bot crawling the website puts too much strain on your resources, then you may also need to look at the performance of the website and/or the server.
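
As for official names: each search engine documents its own crawler tokens, and from that documentation (worth double-checking, since the names change over time) Bing uses bingbot plus the older msnbot, Yahoo uses Slurp, Ask uses Teoma, Baidu uses Baiduspider, and Yandex uses Yandex, which is meant to match all of its robots. A consolidated file built on those tokens could look like this sketch:

# Bing
User-agent: bingbot
User-agent: msnbot
Disallow: /

# Yahoo
User-agent: Slurp
Disallow: /

# Ask
User-agent: Teoma
Disallow: /

# Baidu
User-agent: Baiduspider
Disallow: /

# Yandex
User-agent: Yandex
Disallow: /

Everything not listed, including Googlebot, keeps full access because there is no catch-all User-agent: * group. If blocking the image and video crawlers matters to you, check each engine's documentation for whether its media crawler (Bing's msnbot-media, for example) follows the main token or needs a group of its own.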
