
Prevent Azure subdomain indexation

@Holmes151

Posted in: #Azure #Domains #GoogleSearch #RobotsTxt #Subdomain

Let me explain my situation: I have an Azure website (on an azurewebsites.net subdomain) and a custom domain.com, both serving the same ASP.NET MVC application.

Both are being indexed by Google, but I've noticed the custom domain is being penalized: it doesn't show up in results, and only appears when I search for "site:domain.com".
I want to remove and block the azurewebsites.net subdomain from Google.
I've read about the "possible" solutions:

Adding a robots.txt won't work, because the subdomain and the domain serve the exact same content, so subdomain.azurewebsites.net/robots.txt would lead to the same file as domain.com/robots.txt, removing the domain from the index as well.
Adding the noindex meta tag is the same situation as the previous point.
I'm using a CNAME record to point the domain to the subdomain, so I can't redirect to a subdirectory.


Do you have any other ideas?


2 Comments


@Chiappetta492

If you are using IIS, you can add IIS rewrite rules to your web.config to control which robots.txt is returned, depending on the host name the user (and thus the crawler) is browsing to. You can use HTTP_HOST pattern conditions to specify which robots.txt file should be served for which domain, as sketched below.
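A minimal sketch of such a rule, assuming the IIS URL Rewrite module is installed and that the blocking file is saved as robots-azure.txt (a hypothetical name) next to the regular robots.txt:

<configuration>
  <system.webServer>
    <rewrite>
      <rules>
        <!-- Hypothetical rule: serve robots-azure.txt when the site
             is reached via the *.azurewebsites.net host -->
        <rule name="AzureRobots" stopProcessing="true">
          <match url="^robots\.txt$" />
          <conditions>
            <add input="{HTTP_HOST}" pattern="\.azurewebsites\.net$" />
          </conditions>
          <action type="Rewrite" url="robots-azure.txt" />
        </rule>
      </rules>
    </rewrite>
  </system.webServer>
</configuration>

With this in place, a request for robots.txt on the custom domain falls through to the physical robots.txt, while the same request on the Azure host is rewritten to the blocking file.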

An article which explains this perfectly: www.tekcent.com/articles/2013/dynamic-robots-exclusion-file-in-aspnet/
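The article's approach generates robots.txt dynamically from code instead of rewriting to static files. A rough sketch of that idea in ASP.NET MVC; the controller name is illustrative, a route for "robots.txt" must be registered before the default one, the physical robots.txt removed, and IIS configured to pass the .txt request to MVC (e.g. runAllManagedModulesForAllRequests or a handler mapping):

using System;
using System.Text;
using System.Web.Mvc;

public class RobotsController : Controller
{
    // Serves a host-dependent robots.txt: block crawlers on the
    // *.azurewebsites.net subdomain, allow them on the custom domain.
    public ContentResult Index()
    {
        bool isAzureHost = Request.Url != null &&
            Request.Url.Host.EndsWith(".azurewebsites.net",
                StringComparison.OrdinalIgnoreCase);

        string robots = isAzureHost
            ? "User-agent: *\nDisallow: /"   // block everything
            : "User-agent: *\nDisallow:";    // allow everything

        return Content(robots, "text/plain", Encoding.UTF8);
    }
}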

 

@Hamaas447

If you don't want to serve separate robots.txt files, and you don't want to add noindex meta tags, then your options are very limited; one is to password-protect your Azure site, but that will also restrict access for regular visitors.

There is some more about this here.

On the other hand, is your robots.txt really that complex? Does it have tons of rules? Why give it so much importance? Usually it is just something like this:

User-agent: *
Disallow:


Just delete it from your local folder, and never upload it via FTP (or whatever deployment method you use); instead, create an independent robots.txt file for each site and let each one grow apart.
Put this simple robots.txt on your Azure site:

User-agent: *
Disallow: /


And put your "complex" robots.txt on the site where robots are allowed:

User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /~joe/
