
Remove subdomains from Google index and stop indexing them

@Alves908

Posted in: #DuplicateContent #GoogleIndex #GoogleSearchConsole #RobotsTxt #Subdomain

I am running static content through a CDN; I am using the subdomains cdn1 through cdn5 for that.

I am loading only images, CSS and JS files this way, but apparently Google has indexed some pages on the subdomains, and they now appear in the Google index as duplicates of my "normal" pages.

The thing is that the CDN is set up so that files appear on the subdomains without any extra uploading; the subdomains are mirror copies of the content on the main site. I can't upload files to the subdomains themselves: I can only upload to the main site, and changing www to cdn1 in the address bar shows the same content through the CDN as on my site.

I have 2 questions:


How do I remove the subdomains from the Google index in GWT if it only allows me to enter what goes after `http://domain.com/`?
How do I stop bots from indexing the pages on the subdomains when I can't upload a separate robots.txt file or a Google verification file to them to prove my ownership in GWT?


Is there something else that I need to know related to this matter?

UPDATE: text in bold is updated


3 Comments


 

@LarsenBagley505

You can remove the sub-domains in Webmaster Tools, but first you need to add the sub-domains as separate sites and then submit a site removal request. They should be gone within a day or so.

See these instructions for removing a site from Google: support.google.com/webmasters/answer/1663427?hl=en



 

@Jamie184

Short answer.

Put <meta name="robots" content="noindex"> in the head of your HTML on all pages. Once the search engines have spidered these pages and you are sure of it, put

User-agent: *
Disallow: /


...in a robots.txt file in the root directory of each sub-domain.
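
Since you mentioned you can't upload separate files to the sub-domains, one way to get a different robots.txt per hostname is to rewrite the request on the origin server. The following is only a sketch, assuming Apache with mod_rewrite, mod_setenvif and mod_headers, a pull-based CDN that forwards the original Host header, and a hypothetical robots_cdn.txt file (containing the Disallow rules above) in the main site's web root:

# Serve robots_cdn.txt instead of robots.txt on the cdn1-cdn5 hostnames
RewriteEngine On
RewriteCond %{HTTP_HOST} ^cdn[1-5]\. [NC]
RewriteRule ^robots\.txt$ /robots_cdn.txt [L]

# Also send a noindex header on those hostnames, which covers files
# that can't carry a meta tag (images, CSS, JS)
SetEnvIfNoCase Host ^cdn[1-5]\. is_cdn
Header set X-Robots-Tag "noindex" env=is_cdn

If the CDN does not pass the original Host header through to the origin, this won't distinguish the requests, so treat it as an idea to adapt rather than a drop-in configuration.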

This will take time, of course. It typically takes 30-60 days for Google to notice the changes and reflect them in the SERPs. It can take less or more time depending on how Google gauges freshness for your sub-domains.



 

@Yeniel560

There are different ways; here are a couple of them. You can use just one or combine them:


Use rel="canonical" on the duplicate pages, pointing to the corresponding main-site URLs.
If you can use a .htaccess file, set up a 301 redirect on the hosts that you don't want to get indexed (see the sketch below this list).
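
As a rough sketch of both options, with example.com standing in for your real domain and assuming Apache with mod_rewrite for the redirect: the canonical tag goes in the head of every page (the mirrored copies on cdn1-cdn5 will then point back to the www version), and the .htaccess rules redirect page requests on the CDN hostnames while still letting static assets be served from them.

<link rel="canonical" href="http://www.example.com/some-page.html">

# Redirect pages requested on cdn1-cdn5 back to www, but leave static assets alone
RewriteEngine On
RewriteCond %{HTTP_HOST} ^cdn[1-5]\.example\.com$ [NC]
RewriteCond %{REQUEST_URI} !\.(css|js|png|jpe?g|gif|svg|ico|woff2?)$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

The file extension list is only an example; adjust it to whatever your CDN actually serves.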


As for robots.txt, you can use it, but it's a much better option to use a solution that is more robust and that all crawlers have to follow, like the redirect.

There is a short video from Matt Cutts talking about 301 redirects vs. rel="canonical". A short extract from that page and video:


Okay, I sometimes get a question about whether Google will always use the url from rel=canonical as the preferred url. The answer is that we take rel=canonical urls as a strong hint, but in some cases we won’t use them:


For example, if we think you’re shooting yourself in the foot by accident (pointing a rel=canonical toward a non-existent/404 page), we’d reserve the right not to use the destination url you specify with rel=canonical.
Another example where we might not go with your rel=canonical preference: if we think your website has been hacked and the hacker added a malicious rel=canonical. I recently tweeted about that case. On the “bright” side, if a hacker can control your website enough to insert a rel=canonical tag, they usually do far more malicious things like insert malware, hidden or malicious links/text, etc.



In the video he mentions some more situations and reasons, like the fact that a 301 has to be followed by everybody.


