
Google keeps indexing /comment/reply URL

@Yeniel560

Posted in: #Drupal

Since Google's new algorithm update called Penguin, I think my site is being penalized for webspam. Of course, I don't create posts that would look like spam to Google. I think the problem is how Google indexes my site.

I found that Google has indexed URLs on my site such as:

www.example.com/comment/reply/3866/26556

Many comment/reply URLs like this are in Google's index. I have already added the following to robots.txt:

Disallow: /comment/reply/
Disallow: /?q=comment/reply/

but Google still indexes these URLs.

Any idea how to prevent Google from indexing comments?


4 Comments


@Angela700

Using Disallow in your robots.txt file will not stop Google from indexing those links or pages. It only tells Google not to crawl them.

If those pages are linked to from other pages on your domain, Google can still index the URLs.


@Kevin317

You haven't mentioned how long ago you added those Disallow rules. The effect isn't instantaneous: at the very least you have to wait until your site is spidered again, and even then it might take a bit longer for the URLs to actually be removed from the index and results.

If you use Webmaster Tools, are the rules showing up in your "Crawler access" screen (under Site Configuration)? That will at least give you an idea of when the robots.txt file was last fetched.


@Shakeerah822

You can use Google Webmaster Tools (Site configuration -> Sitelinks) to demote links on your website. You can also use robots.txt, as suggested by Ilmari Karonen, as well as configure .htaccess (or httpd.conf) to perform a 301 redirect.


@Michele947

Have you made sure that your robots.txt syntax is correct? If you've signed up for Google's Webmaster Tools, you can use their robots.txt testing tool to see how Googlebot interprets it, but there are also several third-party robots.txt syntax checkers on the web.
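As a rough local sanity check, something like the following Python sketch can show how a standard parser reads the rules from the question. It uses the standard library's urllib.robotparser, which is not Googlebot's parser and may interpret edge cases differently. The "User-agent: *" line and the helper name is_crawl_allowed are assumptions added for illustration; the question only shows the two Disallow lines.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body. The "User-agent: *" line is an assumption:
# the question only shows the two Disallow rules.
ROBOTS_TXT = """\
User-agent: *
Disallow: /comment/reply/
Disallow: /?q=comment/reply/
"""


def is_crawl_allowed(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if this parser would let `agent` crawl `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# A comment/reply URL from the question and an ordinary page for comparison.
reply_allowed = is_crawl_allowed(
    ROBOTS_TXT, "http://www.example.com/comment/reply/3866/26556"
)
node_allowed = is_crawl_allowed(ROBOTS_TXT, "http://www.example.com/node/3866")
print(reply_allowed, node_allowed)
```

This only answers whether crawling is allowed. It says nothing about whether a URL is already in the index.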

You can also add robots meta tags to your reply pages to stop search engines from indexing them. One reason to do this, even if you have the pages disallowed in robots.txt, is that not all bots necessarily understand the fancier robots.txt syntax extensions such as * wildcards, or at least may not understand them the same way.
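To illustrate the meta tag idea, here is a minimal Python sketch of the decision a page template could make. It is not Drupal code: the function name robots_meta_for and the "noindex, follow" value are assumptions chosen for the example, and the two URL patterns are taken from the Disallow rules in the question. Note that a crawler can only see the tag on pages it is allowed to fetch.

```python
from urllib.parse import parse_qs, urlsplit

# The standard robots meta tag; "noindex, follow" is one common choice
# and is an assumption here, not something the thread specifies.
NOINDEX_TAG = '<meta name="robots" content="noindex, follow">'


def robots_meta_for(url: str) -> str:
    """Return a noindex meta tag for comment reply pages, else an empty string."""
    parts = urlsplit(url)
    # Non-clean URLs of the form /?q=comment/reply/... carry the path in "q".
    q_value = parse_qs(parts.query).get("q", [""])[0]
    if parts.path.startswith("/comment/reply/") or q_value.startswith("comment/reply/"):
        return NOINDEX_TAG
    return ""


print(robots_meta_for("http://www.example.com/comment/reply/3866/26556"))
```

The returned string would go inside the page's head element for matching URLs only, so ordinary pages stay indexable.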
