
How to stop Google from indexing URLs like example.com/post/post/post/1-foobar.html?

@Lengel546

Posted in: #CrawlErrors #Googlebot

I have a weird anomaly with one project: Google is indexing URLs like:


example.com/post/post/post/1-foobar.html


when it should be indexing:


example.com/post/1-foobar.html


I recently added canonical tags, so ultimately it should use those, right?

I think part of the problem is in my page display script: it doesn't return a 404 if the URL format isn't correct. But how on earth does Google even get to those links?

Is it because of links like <a href="/post/1-foobar.html">Example</a>? I also had a base URL set in the head. Somehow it thinks it's a directory and goes on and on.

I have now changed all my links to the form <a href="http://example.com/post/1-foobar.html">Example</a>. But I really don't see the point; it's just extra bytes to load.


1 Comment


@Speyer207

When a page contains a link without nofollow directly in its markup, Google will most likely follow it. In your case, when Google follows such a link, your script serves a page for it, so this is not entirely Google's fault. You have a few options here:


Add a canonical link. You mentioned that you have already done that.
Add rel=nofollow to the link on your page that is generating these URLs, so that Google no longer follows it, at least for your new pages.
To remove the already indexed pages from Google, file a page removal request in Google Webmaster Tools (now Google Search Console).


These are workarounds. The ideal solution is to prevent this from happening in the first place: make the relative URL an absolute one, and stop your script from serving a page when someone requests a URL that does not exist. The current behavior has most likely led to duplicate content on your site, and fixing it should be one of your top priorities.
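To illustrate both points, here is a minimal Python sketch. It rests on assumptions: the question shows a root-relative href, so the link written without a leading slash (`post/1-foobar.html`) is a hypothetical cause, and the `/post/<id>-<slug>.html` pattern and the helper names `resolve` and `status_for` are invented for illustration. The first part shows how a crawler would resolve such a link against the current page and nest one level deeper on each hop; the second shows a strict path check that returns a 404 for anything not in the expected format.

```python
import re
from urllib.parse import urljoin

# Hypothetical page URL, taken from the example in the question.
PAGE = "http://example.com/post/1-foobar.html"


def resolve(page_url: str, href: str) -> str:
    """Resolve an href against the page URL, as a browser or crawler would."""
    return urljoin(page_url, href)


# Assumed canonical format: /post/<numeric id>-<slug>.html
POST_PATH = re.compile(r"/post/\d+-[a-z0-9-]+\.html")


def status_for(path: str) -> int:
    """Return 200 only for a well-formed post path, otherwise 404."""
    return 200 if POST_PATH.fullmatch(path) else 404


if __name__ == "__main__":
    # A href without a leading slash is resolved relative to the
    # current "directory", so every hop adds another /post/ segment.
    first = resolve(PAGE, "post/1-foobar.html")
    second = resolve(first, "post/1-foobar.html")
    print(first)
    print(second)

    # A root-relative (or absolute) href always resolves to the same URL.
    print(resolve(second, "/post/1-foobar.html"))

    # Strict matching rejects the nested variants instead of serving them.
    for path in ("/post/1-foobar.html", "/post/post/post/1-foobar.html"):
        print(path, status_for(path))
```

If the display script currently accepts any path ending in `1-foobar.html`, a strict match like this (or a 301 redirect to the canonical URL, as an alternative) would stop the nested URLs from returning a normal page.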
