Prevent duplicate content when using CloudFlare
I have a website which uses the CloudFlare CDN. During the CDN setup, CloudFlare created a sub-domain, direct.example.com, that points to my website and can be used to bypass CloudFlare and access the site directly.
When I do a Google search for "site:direct.example.com" it comes up with results, which means Google has crawled and indexed that sub-domain as well. The problem is that example.com and direct.example.com serve the same content, so this ends up as duplicate content (which I think is bad for SEO).
So what I want is for Googlebot not to crawl and index "direct.example.com". I tried to use robots.txt to do the trick, but I failed since both domains use the same robots.txt. What should I do to entirely prevent my sub-domain from being indexed? Are there any other options to overcome this problem?
Thank you.
The simplest solution would be to disable the 'direct' subdomain. If, however, you want to keep that subdomain, you will have to take a more creative approach.
One way to do it is to serve a dynamic robots.txt. When a web spider requests robots.txt, we rewrite the request to a dynamic robots page. If the host name matches the direct subdomain we send a "disallow everything" response; otherwise we just serve the normal robots.txt.
If you use Apache, your rewrite rule might look something like this:
RewriteEngine On
RewriteRule ^/?robots\.txt$ /var/www/myweb/robots.php [L]
The PHP file itself is straightforward:
<?php
// Always serve the response as plain text, like a normal robots.txt
header('Content-type: text/plain');

// Block all crawling when the request comes in on the direct subdomain
if ($_SERVER['HTTP_HOST'] == 'direct.example.com') {
    echo "User-agent: *\n";
    echo "Disallow: /\n";
} else {
    // Otherwise serve the regular robots.txt unchanged
    include('robots.txt');
}
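Keep in mind that a robots.txt Disallow only stops crawling; URLs that are already indexed can linger in the results for a while. A complementary option is to send an X-Robots-Tag: noindex header on every response served via the direct subdomain. Here is a minimal sketch, assuming all requests pass through PHP and that direct.example.com (the host name from the question) is the CloudFlare-bypass host:

<?php
// Hypothetical snippet for a common include or front controller:
// mark every response on the direct (CloudFlare-bypass) subdomain
// as non-indexable. The host name is taken from the question.
if ($_SERVER['HTTP_HOST'] == 'direct.example.com') {
    // Ask search engines not to index or follow anything served here
    header('X-Robots-Tag: noindex, nofollow');
}

One caveat: if you also block the subdomain in robots.txt, crawlers may never fetch the pages again and so never see this header, so some people use the noindex header alone until the stale URLs have dropped out of the index.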