Stop bots from crawling old links with extensions
I've recently switched to MVC3, which uses extension-less URLs, but Google and Bing have a wealth of links that they are still crawling which no longer exist.
So I'm trying to find out whether there is a way, through robots.txt or some other method, to tell Google/Bing that any link ending in an extension is no longer valid. Is this possible?
On pages that I'm concerned a user may have saved as a favorite, I'm displaying a 404 page that lists the links to follow to reach the new pages (I decided not to just redirect them, as I don't want to maintain those redirects forever). For Google's and Bing's sake I do have the canonical tag in the header (a minimal example is shown at the end of this question). My current robots.txt is:
User-agent: *
Allow: /
Disallow: /*.*
EDIT: I just added the third line (shown in the text above) and it appears to do what I want: allow a path, but disallow a file. Can anyone confirm this?
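For reference, the canonical tag mentioned above is just a single link element in the head of the new page. A minimal sketch, with a placeholder URL rather than my actual site:

<link rel="canonical" href="http://www.example.com/products/widgets" />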
1 Answer
First, the "Allow" directive in your robots.txt does nothing here: robots spider everything by default, so Allow: / is redundant.
Blocking robots from *.* is probably OK in some situations, but remember that you are blocking every URL that simply contains a dot, not only URLs ending in a file extension. A more reliable method may be blocking individual extensions (if there are not too many), e.g. *.html and *.php on separate lines.
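A minimal sketch of that per-extension approach, assuming the old site served .aspx and .html pages (substitute whatever extensions your URLs actually used); Google and Bing also support a trailing $ to anchor the pattern to the end of the URL:

User-agent: *
Disallow: /*.aspx$
Disallow: /*.html$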
The preferred method of moving pages to new URLs is a 301 redirect, which should always be used unless it is technically difficult. (Although they are 'permanent' redirects, you do not need to maintain them forever: a few months is fine.) It's better for users too, as they get a seamless experience.
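On an IIS/MVC3 site, one way to issue those 301s is a rewrite rule in web.config. This is a sketch only: it assumes the IIS URL Rewrite module is installed and that stripping .aspx from an old URL yields a valid new route, neither of which may hold for your site.

<!-- web.config: 301-redirect old .aspx URLs to their extension-less equivalents -->
<system.webServer>
  <rewrite>
    <rules>
      <rule name="Redirect legacy .aspx URLs" stopProcessing="true">
        <match url="^(.*)\.aspx$" />
        <action type="Redirect" url="/{R:1}" redirectType="Permanent" />
      </rule>
    </rules>
  </rewrite>
</system.webServer>

If the mapping from old to new URLs is not that mechanical, you would need a small lookup table of redirects instead of a single pattern rule.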