Block test domains from being indexed
We have a web application that serves many websites, each of which has a test and a production environment.
The application consists of a single folder and is hosted on IIS.
We already apply noindex, nofollow on the test domains and index, follow on the production domains, but this is not working for static files (images, PDF documents, etc.).
Is there a way to set up robots.txt so that every single test domain is disallowed?
Example
Disallow: test.domain1.com
Disallow: test.domain2.com
etc.
Note that there are 200+ domains and they are added and removed very quickly, so robots.txt would be a good solution for us. We do not have access to each domain's Google Search Console to request removal from Google's index. The solution should also work for Bing and other search engines.
Can you see any way to achieve this?
We already have noindex nofollow, index follow on respective domains
Presumably this is implemented using a <meta name="robots" ...> element in the HEAD section of the HTML? (Strictly speaking these values should be comma-separated, i.e. "noindex, follow".)
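On the test domains, such an element might look like this (a sketch; pick the directives you actually need):
<meta name="robots" content="noindex, nofollow">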
but it's not working for static files
For static files you would need to use the corresponding X-Robots-Tag HTTP response header. For example:
X-Robots-Tag: noindex, nofollow
If the robots meta tag is currently working OK for you, you could instead use the X-Robots-Tag header for everything on that domain.
Reference: developers.google.com/webmasters/control-crawl-index/docs/robots_meta_tag#using-the-x-robots-tag-http-header
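Since your sites are hosted on IIS, one way to add this header only on the test domains is an outbound rule with the URL Rewrite module. This is only a sketch: it assumes the URL Rewrite module is installed and that all test hostnames start with "test." (adjust the {HTTP_HOST} pattern to your actual naming scheme):

<configuration>
  <system.webServer>
    <rewrite>
      <outboundRules>
        <!-- Set the X-Robots-Tag response header on test hosts only -->
        <rule name="NoindexTestDomains">
          <match serverVariable="RESPONSE_X_ROBOTS_TAG" pattern=".*" />
          <conditions>
            <add input="{HTTP_HOST}" pattern="^test\." />
          </conditions>
          <action type="Rewrite" value="noindex, nofollow" />
        </rule>
      </outboundRules>
    </rewrite>
  </system.webServer>
</configuration>

Because IIS applies the rule to every response, this would cover static files (images, PDFs) as well as HTML pages.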
Disallow: test.domain1.com
Disallow: test.domain2.com
Aside: robots.txt doesn't work like this. It operates on URL paths, not hosts/domains. To disallow all crawling you would serve the same robots.txt from the root of each test domain. For example:
User-agent: *
Disallow: /
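Because all 200+ sites are served from the same folder, you can't drop a different physical robots.txt into each test site. One way around that on IIS, again assuming the URL Rewrite module and a "test." hostname prefix, is to rewrite robots.txt requests on test hosts to a second file (robots-test.txt is a hypothetical name; it would contain the two lines above):

<configuration>
  <system.webServer>
    <rewrite>
      <rules>
        <!-- Serve a blanket-disallow robots file on test hosts only -->
        <rule name="TestRobotsTxt" stopProcessing="true">
          <match url="^robots\.txt$" />
          <conditions>
            <add input="{HTTP_HOST}" pattern="^test\." />
          </conditions>
          <action type="Rewrite" url="/robots-test.txt" />
        </rule>
      </rules>
    </rewrite>
  </system.webServer>
</configuration>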
Note, however, that robots.txt blocks crawling. It doesn't necessarily prevent URLs from being indexed if they get linked to. To specifically prevent indexing you use the robots meta tag (as you are already doing) and/or the X-Robots-Tag header. Don't combine those with a robots.txt Disallow, since blocking crawling prevents the crawler from ever seeing the noindex directive.
I would not recommend relying on the robots.txt file. Search engines may respect it, or they may not; Disallow is a recommendation rather than a rule.
If you want to be sure your development environments are hidden from the public, in my opinion the only reliable way to block robots / search engines from indexing pages is password protection, e.g. via a .htaccess file on Apache or the equivalent authentication settings on IIS. AFAIK .htaccess files and password rules can also be generated on the fly.
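Since the asker's sites run on IIS, where .htaccess does not apply, the same idea might look roughly like this in web.config. A sketch only: the authentication sections are usually locked at the server level and must be unlocked before a site-level web.config can set them, and you would scope this to the test sites:

<configuration>
  <system.webServer>
    <security>
      <authentication>
        <!-- Deny anonymous access so crawlers receive 401 responses and index nothing -->
        <anonymousAuthentication enabled="false" />
        <basicAuthentication enabled="true" />
      </authentication>
    </security>
  </system.webServer>
</configuration>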