Stop bots from crawling old links with extensions

@Cooney921

Posted in: #Googlebot #RobotsTxt #WebCrawlers

I've recently switched to MVC3, which uses extensionless URLs, but Google and Bing have a wealth of old links that they are still crawling, and those pages no longer exist.

So I'm trying to find out if there is a way, via robots.txt or some other method, to tell Google/Bing that any link ending in an extension is no longer valid. Is this possible?

On pages that I'm concerned a user may have saved as a favorite, I'm displaying a 404 page that lists links to the new pages (I decided not to just redirect them, as I don't want to maintain redirects forever). For Google's and Bing's sake, I do have the canonical tag in the header.
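For reference, the canonical tag looks like the following in the page head (the URL here is just a placeholder for one of the new extensionless pages):

<link rel="canonical" href="http://example.com/products/list" />

Here is my current robots.txt: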

User-agent: *
Allow: /
Disallow: /*.*


EDIT: I just added the third line (shown above) and it appears to do what I want: allow a path, but disallow a file. Can anyone confirm this?


1 Comment


@Cofer257

First, the "Allow" directive in your robots.txt does nothing, since robots spider everything by default.

Blocking robots from *.* is probably OK in some situations, but remember that you are blocking every URL that simply contains a dot. A more reliable method is to block individual extensions (if there are not too many), e.g. *.html and *.php on separate lines, as sketched below.
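A minimal robots.txt sketch of that approach (the extensions listed are examples; substitute whichever ones your old URLs actually used; note that the * wildcard and the $ end-of-URL anchor are extensions to the original robots.txt standard, but both Google and Bing support them):

User-agent: *
Disallow: /*.html$
Disallow: /*.php$
Disallow: /*.aspx$

The $ anchor matches only URLs that end with the extension, so URLs that merely contain a dot elsewhere remain crawlable.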

The preferred method of moving to new pages is a 301 redirect, which should always be used unless it is technically difficult. (Although they are 'permanent' redirects, you do not need to maintain them forever: a few months is fine.) It's better for users too, as they get a seamless experience.
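As a rough sketch of what that could look like in MVC3 (the controller, action, and route names here are hypothetical, and in practice you would constrain the catch-all route so it only matches the old extension URLs):

using System.Web.Mvc;

// Hypothetical catch-all controller for legacy extension URLs.
// Assumes a route registered last in Global.asax, e.g.:
// routes.MapRoute("Legacy", "{*path}",
//     new { controller = "Legacy", action = "Go" });
public class LegacyController : Controller
{
    public ActionResult Go(string path)
    {
        // Drop the old extension (.html, .aspx, etc.) and issue a
        // permanent (301) redirect to the extensionless URL.
        var extensionless = "/" + System.IO.Path.ChangeExtension(path ?? "", null);
        return RedirectPermanent(extensionless);
    }
}

RedirectPermanent sends a 301 status code, so search engines transfer the old URL's ranking to the new one.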


