Robots.txt disallowing URLs
I need to disallow some URLs on my site but I am not sure how to do that. I have a site that has products and reviews. When someone makes a review, the site generates a URL automatically like this:
mysite.com/addreview_1.htm
mysite.com/addreview_2.htm
....
mysite.com/addreview_9999.htm
I need some way to disallow all the URLs which will appear in the future.
2 Comments
The original robots.txt specification has no concept of a "full" URL. Whatever you specify as the value for Disallow is always treated as the start of the URL paths you want to block.
For example, see this robots.txt:
# robots.txt for example.com
User-agent: *
Disallow: /foobar.html
This will obviously block example.com/foobar.html. But it will also block:
example.com/foobar.html?foo=bar
example.com/foobar.html.zip
example.com/foobar.html.for.example
example.com/foobar.html/foo/bar
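To see this prefix matching in action, here is a minimal sketch using Python's standard urllib.robotparser module, which treats a Disallow value as a plain path prefix. The is_blocked helper and the ExampleBot user agent name are made up for illustration; they are not part of any crawler.

```python
from urllib.robotparser import RobotFileParser

# Same rules as the example robots.txt above.
ROBOTS_TXT = """\
# robots.txt for example.com
User-agent: *
Disallow: /foobar.html
"""


def is_blocked(robots_txt, url, agent="ExampleBot"):
    """Return True if the rules in robots_txt forbid `agent` from fetching `url`.

    `agent` is a hypothetical crawler name; any name falls under "User-agent: *".
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/foobar.html",
        "https://example.com/foobar.html.zip",
        "https://example.com/foobar.html/foo/bar",
        "https://example.com/other.html",
    ):
        print(url, "blocked" if is_blocked(ROBOTS_TXT, url) else "allowed")
```

Other parsers may differ in details, so treat this as a quick local check rather than proof of how a particular crawler behaves.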
So, in your case you just need:
User-agent: *
Disallow: /addreview
It will block all URLs whose path begins with /addreview:
example.com/addreview
example.com/addreview.html
example.com/addreview_1.htm
example.com/addreview_9999.htm
But it will of course also block a URL like example.com/addreviewer (let's assume it exists), which may or may not be what you want, depending on the URLs your site uses.
So you need to find a path prefix that matches all the URLs you want blocked and doesn't match any others.
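One way to check a candidate prefix before putting it in robots.txt is to run it against a list of your site's paths and see what it would catch. The sketch below assumes plain prefix matching as described above; the blocked_by_prefix helper and the /products.htm path are hypothetical, and /addreview_ is just one tighter prefix suggested by the URLs in the question.

```python
def blocked_by_prefix(prefix, paths):
    """Return the paths that a 'Disallow: <prefix>' rule would block."""
    return [path for path in paths if path.startswith(prefix)]


# Hypothetical sample of paths on the site; replace with your real ones.
SITE_PATHS = [
    "/addreview_1.htm",
    "/addreview_9999.htm",
    "/addreviewer",
    "/products.htm",
]

if __name__ == "__main__":
    for candidate in ("/addreview", "/addreview_"):
        print(candidate, "->", blocked_by_prefix(candidate, SITE_PATHS))
```

With this sample, /addreview also catches /addreviewer, while /addreview_ catches only the generated review pages.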
You can add a wildcard entry to the robots.txt like:
Disallow: /addreview*
Google and other major crawlers honor wildcards, but since wildcards were not part of the original robots.txt specification, there are probably still crawlers that ignore them.
This will also only work if the URLs you want to disallow have a common element that is not found in URLs you want crawled.
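For a rough idea of how a wildcard-aware crawler might evaluate such a rule, here is a minimal sketch that turns a Disallow pattern into a regular expression. It handles only the * wildcard and is an assumption about typical behavior, not any crawler's actual implementation; wildcard_rule_matches is a made-up name.

```python
import re


def wildcard_rule_matches(pattern, path):
    """Return True if `path` is matched by a Disallow `pattern` containing '*'.

    Each '*' stands for any run of characters; the pattern still only has to
    match from the start of the path, as with plain prefix rules.
    """
    # Escape the literal pieces so characters like '.' are matched literally.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.match(regex, path) is not None


if __name__ == "__main__":
    for pattern in ("/addreview", "/addreview*", "/addreview_*.htm"):
        for path in ("/addreview_1.htm", "/addreviewer", "/products.htm"):
            print(pattern, path, wildcard_rule_matches(pattern, path))
```

Under these assumptions a trailing * adds nothing: /addreview* matches exactly the same paths as /addreview. A wildcard only changes the result when it sits in the middle of the pattern, as in /addreview_*.htm.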