Somehow Google treats a properly 301'd URL as a 200 and is still indexing the new content under the old page?
We redirected all the old URLs to the new ones properly using .htaccess. The problem is that Google is somehow still finding content at the old page (which it shouldn't) and stores it in the cache under the old URL rather than the new one.
For example:
Old page: www.natures-energies.com/iching.htm
New page: www.natures-energies.com/index.php?option=com_content&view=article&id=760
If you type the old URL into the browser, it redirects. If you fetch the old URL as Googlebot in Webmaster Tools, the header says 301 (permanently redirected). If I crawl it as any other bot, it still says 301. Even if you click the old link in Google's results, it redirects to the new URL.
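The checks described above can be repeated from a script. This is only a minimal sketch: the helper names `fetch_status` and `is_permanent_redirect` are made up for illustration, and sending a Googlebot-style `User-Agent` string from your own machine is an assumption about how to approximate the crawler, not the same thing as a fetch made by Google itself.

```python
import http.client
from urllib.parse import urlsplit

# User-Agent string that Googlebot announces; sending it only imitates the header.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def fetch_status(url, user_agent, timeout=10):
    """Request `url` once, without following redirects.

    Returns a (status_code, location_header) tuple.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path, headers={"User-Agent": user_agent})
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def is_permanent_redirect(status, location, expected_target):
    """True only for a 301 whose Location header is exactly the expected URL."""
    return status == 301 and location == expected_target
```

For instance, `fetch_status("http://www.natures-energies.com/iching.htm", GOOGLEBOT_UA)` should return a 301 together with the new URL if the .htaccess rule applies to that user agent as well.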
Only in its cache does it show the old URL, and moreover it shows the new content under it!
I am stumped as to how Google manages to grab the new content and file it under the old URL instead of the new one!
One more interesting thing: if I request the cache for the new page, it shows the cache of the new content with the old URL!
Any help would be appreciated. I am at my wits' end. I think I have tried almost everything. Is there anything I'm missing?
You can use this search to find the old URLs. Maybe you'll spot some patterns that I missed.
site:www.natures-energies.com inurl:htm -inurl:https|index.
More posts by @Candy875
2 Comments
Part of the problem may be that your new URL is much more complex than the old URL. Your new URL is served on a dynamic page with three URL parameters. The old URL appears to be just a static page.
Why do you have three URL parameters? This version of the new URL seems to work fine: www.natures-energies.com/index.php?id=760
Google might be more willing to believe the 301 redirect if the URL were simplified a bit.
This most likely comes from the way they save the data in their index. Google makes use of a database system they call Bigtable. Apache Cassandra is an open source database whose data model borrows from Bigtable, if you want to learn more about it.
It seems clear to me that the main key in their indexes is the URI of the page (with the host name written in reverse to simplify the sort order). Google does not take a 301 literally. The fact is that when you create a 301, you may change your mind a few times (i.e. you really meant 302, or you rename the page again and the URI changes again). I think one reason they update their index keys (the URI) more slowly than the content of the pages is that changing a key is much more problematic than changing content. Since the URI is a key, it must appear many times over in the database, and changing it has a huge impact.
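To make the "written in reverse" idea concrete, here is a small sketch of how such a key could be built. It follows the reversed-host-name convention from the published Bigtable design, and the function name `row_key` is purely illustrative. It is not a description of how Google's index actually stores these two pages.

```python
from urllib.parse import urlsplit


def row_key(url):
    """Build a sort-friendly key: reversed host name, then path and query."""
    # Accept URLs written without a scheme, as in the question.
    parts = urlsplit(url if "://" in url else "http://" + url)
    reversed_host = ".".join(reversed(parts.hostname.split(".")))
    key = reversed_host + (parts.path or "/")
    if parts.query:
        key += "?" + parts.query
    return key


old_key = row_key("www.natures-energies.com/iching.htm")
new_key = row_key(
    "www.natures-energies.com/index.php?option=com_content&view=article&id=760"
)
```

Both keys start with `com.natures-energies.www`, so pages of one site sort next to each other, but the old and the new page are still two different keys. Under this assumption, honouring the 301 means moving or re-pointing a key, not just refreshing the content stored under it.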
Just in case, I checked your Drupal site and you do not have a canonical URI defined. I suppose that could have an effect too. But I think the index will be updated later; I do not know how long it takes, though. Why did you decide to remove the URL aliases?
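If you want to verify the canonical URI yourself, a check along these lines works on the HTML of any page. It is a minimal sketch using only Python's standard library, and the names `CanonicalFinder` and `find_canonical` are invented for this example.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated tokens, or be missing entirely.
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_tokens:
            self.canonical = attributes.get("href")


def find_canonical(html):
    """Return the canonical URL declared in `html`, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical
```

Feeding it the source of the new page should return `None` for as long as no canonical link is defined there.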