Preventing Spiders on One Subdomain?
We currently host multiple sites on the same domain that all live in the same physical directory on the server. Each is database driven and lives at a unique subdomain. For example, site1.example.com, site2.example.com, etc. However, because all of these sites live in the same directory, they share a robots.txt file.
I would like to set up a test/demo site on this same codebase (i.e., demo.example.com), but I do not want it to be indexed by search engines. Is there any way to configure robots.txt to disallow an entire subdomain without affecting the other subdomains that live in the same physical location?
I ended up using mod_rewrite to accomplish what I needed:
RewriteEngine On
# Serve a demo-specific robots file on the demo subdomain
RewriteCond %{HTTP_HOST} ^demo\.example\.com$ [NC]
RewriteRule ^robots\.txt$ robots-demo.txt [L]
# Every other subdomain gets the shared robots file
RewriteCond %{HTTP_HOST} !^demo\.example\.com$ [NC]
RewriteRule ^robots\.txt$ robots-othersites.txt [L]
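For completeness, the two robots files referenced above could look something like this (a minimal sketch using standard robots.txt directives; the demo file blocks all crawlers, the other leaves everything crawlable):
# robots-demo.txt (served on demo.example.com)
User-agent: *
Disallow: /
# robots-othersites.txt (served on all other subdomains)
User-agent: *
Disallow: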
You can configure the subdomain to send an additional X-Robots-Tag HTTP header, assuming each subdomain is set up as its own virtual host.
That header lets you tell spiders not to index that subdomain. The catch is that I do not know how many crawlers currently support it; a quick look indicates that Google and Yahoo! do.
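A minimal sketch of what that could look like, assuming Apache with mod_headers enabled and a separate VirtualHost for the demo subdomain (the DocumentRoot path is just a placeholder):
<VirtualHost *:80>
    ServerName demo.example.com
    # Placeholder path; point it at the shared codebase
    DocumentRoot /var/www/shared
    # Ask crawlers not to index or follow anything served by this vhost
    Header set X-Robots-Tag "noindex, nofollow"
</VirtualHost>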
You can use a meta robots tag, which all of the major search engines support today. Since your sites share a codebase, you would need to output the tag only when the request host is the demo subdomain:
<meta name="robots" content="noindex" />
I'm not exactly sure how to do this with robots.txt, though.