
Disallow Crawling of All Search Pages Using robots.txt

@Welton855

Posted in: #Google #Googlebot #GoogleSearchConsole #RobotsTxt #WebCrawlers

Using robots.txt, I am attempting to stop all crawling of search URLs:

Disallow: /rest_of_url/search&tour*


Above is what I am using. All of our search result URLs look like the following; however, everything after search&tour can vary, for example:
www.example.com.au/rest_of_url/search&tour-sdfs=the-palce+lcation+&tour-duration=1/

Will the Disallow code above stop robots from crawling all of my search result pages?


2 Comments


 

@Alves908

Will the Disallow code above stop robots from crawling all of my search result pages?


Yes, it will stop the (good) bots that obey the robots.txt "standard".

However, you don't need the trailing *. robots.txt matching is prefix-based, so the "wildcard" * at the end can simply be omitted. (Wildcard-type matches are an extension of the original standard anyway.)

And you obviously need the User-agent directive that precedes this rule, if you haven't got it already:

User-agent: *
Disallow: /rest_of_url/search&tour
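
If you want to sanity-check the rule against a real URL, here is a minimal sketch using Python's standard-library urllib.robotparser (it implements only the original prefix-matching behaviour, not the wildcard extensions, which is fine for a plain prefix rule like this one); the example URL is the one from the question:

from urllib.robotparser import RobotFileParser

# Parse the proposed rules directly instead of fetching a live robots.txt.
rules = [
    "User-agent: *",
    "Disallow: /rest_of_url/search&tour",
]
rp = RobotFileParser()
rp.parse(rules)

# can_fetch() returns False when the URL is disallowed for the given user agent.
url = "http://www.example.com.au/rest_of_url/search&tour-sdfs=the-palce+lcation+&tour-duration=1/"
print(rp.can_fetch("*", url))  # False - blocked by the Disallow rule
print(rp.can_fetch("*", "http://www.example.com.au/rest_of_url/other-page/"))  # True - not blocked

Both results are what you would expect: the prefix rule covers every URL that starts with /rest_of_url/search&tour, and nothing else.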



 

@Heady270

The Disallow directive sets out the files or folders that are not allowed to be crawled.

In addition, you can prevent a page from appearing in Google Search by including a noindex meta tag in the page's HTML code. When Googlebot next crawls that page, Googlebot will see the noindex meta tag and will drop that page entirely from Google Search results, regardless of whether other sites link to it.
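
For example, the standard robots meta tag, placed in the page's <head>, looks like this:

<meta name="robots" content="noindex">

Keep in mind that a crawler can only see this tag on pages it is allowed to fetch, so it takes effect on pages that are not also blocked by robots.txt.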


