
What does "Disallow: /search" mean in robots.txt?

@Steve110

Posted in: #GoogleSearchConsole #RobotsTxt #SearchEngines #WebCrawlers

In my blog's Google Webmaster Tools panel, I found the following code in the blocked URLs section of my robots.txt report.

User-agent: Mediapartners-Google
Disallow: /search
Allow: /


I know that Disallow tells a crawler not to fetch a page, but I don't understand the usage of Disallow: /search.

What is the exact meaning of Disallow: /search?


5 Comments


@Welton855

It means that the user agent Mediapartners-Google is not allowed to fetch any URL whose path starts with /search:

/search/go: blocked
/search: blocked
/: not blocked
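These matches can be sanity-checked with Python's standard-library urllib.robotparser, which applies a first-match prefix rule per user agent. It is a simplified model of what Google's crawlers actually do, but it agrees with the cases above (example.com is a placeholder domain):

```python
import urllib.robotparser

# The rules from the question, fed to the parser directly.
rules = """\
User-agent: Mediapartners-Google
Disallow: /search
Allow: /
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(rules.splitlines())

agent = "Mediapartners-Google"
print(rp.can_fetch(agent, "http://example.com/search/go"))  # False: blocked
print(rp.can_fetch(agent, "http://example.com/search"))     # False: blocked
print(rp.can_fetch(agent, "http://example.com/"))           # True: not blocked
```

Because Disallow: /search appears before Allow: /, any path beginning with /search hits the disallow rule first; everything else falls through to the allow rule.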


@Heady270

Other answers explain how robots.txt is processed to apply this rule, but don't address why you would want to disallow bots from crawling your search results.

One reason might be that your search results are expensive to generate. Telling bots not to crawl those pages could reduce load on your servers.

Search results pages are also not great landing pages. A search result page typically just has a list of 10 pages from your site with titles and descriptions. Users would generally be better served by going directly to the most relevant of those pages. In fact, Google has said that they don't want your site search results indexed by Google. If you don't disallow them, Google could penalize your site.


@Ann8826881

In the Disallow field you specify the beginning of the URL paths that should be blocked.

So if you have Disallow: /, it blocks everything, as every URL path starts with /.

If you have Disallow: /a, it blocks all URLs whose paths begin with /a. That could be /a.html, /a/b/c/hello, or /about.

In the same sense, if you have Disallow: /search, it blocks all URLs whose paths begin with the string /search. So it would block the following URLs, for example (if the robots.txt is at example.com/robots.txt):

example.com/search
example.com/search.html
example.com/searchengine
example.com/search/
example.com/search/index.html

While the following URLs would still be allowed:

example.com/foo/search
example.com/sea


Note that robots.txt doesn't know or care whether the string matches a directory, a file, or nothing at all. It only looks at the characters in the URL.
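A quick sketch of this character-by-character prefix behavior, using Python's standard urllib.robotparser (SomeBot and example.com are placeholder names):

```python
import urllib.robotparser

# A wildcard rule blocking every path that begins with "/a".
rp = urllib.robotparser.RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /a",
])

print(rp.can_fetch("SomeBot", "http://example.com/a.html"))      # False: starts with /a
print(rp.can_fetch("SomeBot", "http://example.com/about"))       # False: starts with /a
print(rp.can_fetch("SomeBot", "http://example.com/foo/search"))  # True: path starts with /f
```

Note that /about is blocked even though there is no "a" directory: the rule is a pure string prefix, exactly as described above.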


@Becky754

Since the OP indicated in his comments that he was only interested in a "/search" directory, my answer below deals with disallowing just a "search" directory:

The following is a directive for robots not to crawl something named "search" located in the root directory:

Disallow: /search


According to the Google Webmaster Tools help document below, directory names should be preceded and followed by a forward slash /, as also specified in the other reference sources:

Google Webmaster Tools - Block or remove pages using a robots.txt file


To block a directory and everything in it, follow the directory name with a forward slash.
Disallow: /junk-directory/


Robotstxt.org - What to put in it

User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /~joe/

In this example, three directories are excluded.


Wikipedia - Robots exclusion standard

This example tells all robots not to enter three directories:

User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/


So according to Google (as quoted above), the following would disallow bots with the user agent Mediapartners-Google from crawling the "search" directory located in the root directory, while allowing all other directories to be crawled:

User-agent: Mediapartners-Google
Disallow: /search/
Allow: /
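The difference the trailing slash makes can be checked with Python's standard urllib.robotparser (example.com is a placeholder). Unlike Disallow: /search, the /search/ form leaves /search itself and /searchengine crawlable:

```python
import urllib.robotparser

# The directory-style rule with a trailing slash, as recommended above.
rp = urllib.robotparser.RobotFileParser()
rp.parse("""\
User-agent: Mediapartners-Google
Disallow: /search/
Allow: /
""".splitlines())

agent = "Mediapartners-Google"
print(rp.can_fetch(agent, "http://example.com/search/index.html"))  # False: inside the directory
print(rp.can_fetch(agent, "http://example.com/search"))             # True: no trailing slash
print(rp.can_fetch(agent, "http://example.com/searchengine"))       # True: different prefix
```

So the trailing-slash form only blocks URLs inside the directory; whether that is what you want depends on which URLs your site actually serves under /search.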


@Kristi941

It tells the AdSense crawler not to fetch any files in the /search directory or below (i.e., in any subdirectories of /search).
