Remove several URLs of a website from Google
I want to remove multiple URLs from my website (developed in ASP MVC). All these URLs have the format mysite.com/planning/xxx
I have already adapted the code to issue a 301 redirect and to tell robots to remove the pages from their index.
Now, when I check Webmaster Tools, I see many soft 404s corresponding to these URLs.
So, I have some questions:
Why does Google consider them soft 404s even though I do a permanent redirect with a 301 status?
Are these URLs really removed from Google's index? (I don't think so, because many of them are still found by Google.)
Is there another (better :) ) way to remove multiple URLs?
Would adding Disallow: /planning to robots.txt help, or does it only prevent indexing without removing pages already in the index?
Thanks
I want to remove multiple URLs from my website (developed in ASP MVC). All these URLs have the format mysite.com/planning/xxx
I have already adapted the code to issue a 301 redirect and to tell robots to remove the pages from their index...
A 301 here is a mistake. You actually misled the robots: a 301 tells them that good content to index can be found at the URL given in the Location HTTP header (if you provided one), which is not what you want for content that is gone.
Now, when I check Webmaster Tools, I see many soft 404s corresponding to these URLs.
This is the robots' way of saying you misled them: you issued a redirect to a URL that they consider an error page (even though the HTTP status returned is 200).
Why does Google consider them soft 404s even though I do a permanent redirect with a 301 status?
A redirect means "the content has moved to a new URL." It's the resulting URL that matters, and in your case the redirect lands on a URL that produces a soft 404.
Soft 404s are pages whose text leads robots to believe they are actual error pages, but they are "soft" because the HTTP header doesn't return a 404 status; it returns a 200 (success) status instead.
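To illustrate the distinction, here is a toy classifier in Python. The phrase list and the function are assumptions for illustration only, not Google's actual heuristic: a page is a soft-404 candidate when the status line claims success but the body reads like an error message.

```python
# Toy soft-404 detector: the status code says "success" but the page text
# reads like an error. The phrase list is an assumption for illustration,
# not Google's real detection logic.
ERROR_PHRASES = ("not found", "no longer available", "does not exist")

def looks_like_soft_404(status_code: int, body: str) -> bool:
    # A real 404 is not "soft": the header already reports the error.
    if status_code != 200:
        return False
    text = body.lower()
    return any(phrase in text for phrase in ERROR_PHRASES)
```

For example, a 200 response whose body says "Sorry, this page does not exist" is flagged, while the same body served with a 404 status is an ordinary (hard) 404.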
Are these URLs really removed from Google's index? (I don't think so, because many of them are still found by Google.) Is there another (better :) ) way to remove multiple URLs? Would adding Disallow: /planning to robots.txt help, or does it only prevent indexing without removing pages already in the index?
The best way to remove the URLs is to first make a list of all the URLs Google complains about in Webmaster Tools that you never plan to turn into actual pages, then make a list of all the URLs whose code redirects to error pages.
Then, when someone requests a URL on either list, return a page with the 410 status code, which means GONE. The first line of the HTTP response must begin with HTTP/x.x 410 (where x.x is the HTTP version you use, probably 1.1).
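Since your site is ASP MVC, here is only a language-neutral sketch of the idea as a minimal Python WSGI app (the /planning/ prefix and the message text are assumptions for illustration): any request for a removed URL gets a 410 Gone response instead of a redirect, so robots see an unambiguous removal signal.

```python
# Minimal WSGI sketch of the 410 approach. The "/planning/" prefix and the
# body text are assumptions; the real site would do the equivalent in ASP MVC.
GONE_PREFIXES = ("/planning/",)

def app(environ, start_response):
    path = environ.get("PATH_INFO", "/")
    if path.startswith(GONE_PREFIXES):
        # Tell robots the page is gone for good: no redirect, no soft 404.
        body = b"This page has been permanently removed."
        start_response("410 Gone", [("Content-Type", "text/plain")])
        return [body]
    body = b"OK"
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body]
```

The key design choice is answering on the original URL itself rather than redirecting: the 410 status travels in the response for the very URL Google is asking about.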
I could give you Apache and PHP code if you need it, but I'm not sure whether that would work with your server setup.