
Robots.txt Disallow: *?s=

@Welton855

Posted in: #RobotsTxt #WebCrawlers

I'm looking at the robots.txt file of our soon-to-be website, and I want to know what prevents the site from being crawled. Is it this line? If not, what does it disallow?

Disallow: *?s=


3 Comments


@Jamie184

Usually, this line keeps internal search result pages from being crawled.
The best way to keep a site from being crawled is to put it behind a password (custom authorization).


@Ann8826881

Disallow: *?s=


Bots following the original robots.txt specification would not be allowed to crawl URLs like these:

http://example.com/*?s=
http://example.com/*?s=foo
http://example.com/*?s=/

So they interpret *, ? and = literally (i.e., these characters have to appear at the beginning of the URL path).

But many bots use (their own) extensions to the robots.txt specification, where some characters are reserved, i.e., they get a specific meaning.

Google, for example, uses * for pattern matching:


To block any sequence of characters, use an asterisk (*).


That means the Googlebot is not allowed to crawl URLs like these:

http://example.com/?s=
http://example.com/?s=foo
http://example.com/foo?s=
http://example.com/foo?s=bar
http://example.com/foo/foo/foo?s=bar

Other bots may have other interpretations.
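To make the Google-style reading above concrete, here is a minimal Python sketch that treats * as "any sequence of characters" and checks a few URLs against the rule. The helper names rule_to_regex and is_disallowed are made up for this illustration, and the sketch only handles *; it is not Googlebot's actual matcher.

```python
import re
from urllib.parse import urlsplit


def rule_to_regex(rule):
    """Turn a robots.txt path pattern into a regex (hypothetical helper).

    Assumption: '*' matches any run of characters and everything else,
    including '?' and '=', is matched literally.
    """
    return re.compile(".*".join(re.escape(part) for part in rule.split("*")))


def is_disallowed(url, rule):
    """Return True if the URL's path (plus query string) matches the rule."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # re.match anchors at the start of the path, like a robots.txt rule.
    return rule_to_regex(rule).match(target) is not None


if __name__ == "__main__":
    for url in (
        "http://example.com/?s=foo",
        "http://example.com/foo/foo/foo?s=bar",
        "http://example.com/foo?page=2",
    ):
        print(url, is_disallowed(url, "*?s="))
```

Under this reading, the rule only matches when s is the first query parameter, because the pattern requires the literal sequence ?s= somewhere in the URL.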


@Shanna517

If you want to stop all robots from crawling your site, simply use:

User-agent: *
Disallow: /


User-agent: * means that the rule that follows applies to all robots, and Disallow: / prevents them from crawling any path. You can read more about this on robotstxt.org.
I think your Disallow: *?s= means that robots are not allowed to crawl URIs with parameters, but I'm not sure about that.
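If you want to check the blanket rule yourself, here is a small sketch using Python's standard urllib.robotparser module. The bot name ExampleBot and the helper allowed are placeholders for illustration. It only covers the User-agent: * / Disallow: / case; I have not used it to settle what *?s= means.

```python
from urllib.robotparser import RobotFileParser

# The two-line robots.txt from above, held in a string for the test.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(url, agent="ExampleBot"):
    """Return True if the given (hypothetical) bot may fetch the URL."""
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in ("http://example.com/", "http://example.com/foo?s=bar"):
        print(url, allowed(url))
```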
