Disallow Crawling of All Search Pages Using robots.txt
Using robots.txt, I am attempting to stop all crawling of our search URLs.
Disallow: /rest_of_url/search&tour*
Above is what I am using. Our URL looks like the following for all search results; however, everything after search&tour can be different, for example:
www.example.com.au/rest_of_url/search&tour-sdfs=the-palce+lcation+&tour-duration=1/
Will the Disallow code above stop robots from crawling all of my search result pages?
Will the Disallow code above stop robots from crawling all of my search result pages?
Yes, it will stop the (good) bots that obey the robots.txt "standard".
However, you don't need the trailing *. robots.txt uses prefix matching, so the "wildcard" * at the end can simply be omitted. (Wildcard-type matches are an extension of the original standard anyway.)
And you obviously need a User-agent directive to precede this rule, if you haven't got one already:
User-agent: *
Disallow: /rest_of_url/search&tour
Disallow sets the files or folders that are not allowed to be crawled.
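If you want to sanity-check the rule before deploying it, here is a minimal sketch using Python's standard urllib.robotparser (it implements the original prefix-matching behaviour and does not understand the * extension, so test without it); the rule and the sample URL are the ones from the question:

import urllib.robotparser

# Feed the proposed rules straight to the parser (no need to fetch a live robots.txt).
rules = [
    "User-agent: *",
    "Disallow: /rest_of_url/search&tour",
]
parser = urllib.robotparser.RobotFileParser()
parser.parse(rules)

# The sample search URL from the question should be reported as blocked.
url = ("http://www.example.com.au/rest_of_url/"
       "search&tour-sdfs=the-palce+lcation+&tour-duration=1/")
print(parser.can_fetch("Googlebot", url))  # False -> crawling is disallowed

Because the match is a simple prefix match, the same rule covers every URL that begins with /rest_of_url/search&tour, whatever follows it.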
In addition, you can prevent a page from appearing in Google Search by including a noindex meta tag in the page's HTML code. When Googlebot next crawls that page, Googlebot will see the noindex meta tag and will drop that page entirely from Google Search results, regardless of whether other sites link to it.
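The tag goes in the page's head section and looks like this:

<meta name="robots" content="noindex">

Bear in mind that Googlebot can only see the noindex tag on pages it is allowed to crawl, so for a given URL use either the robots.txt Disallow or the noindex approach, not both.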