Robots.txt: Disallow *?s=

I'm looking at the robots.txt file of our soon-to-be-launched website, and I want to know what is preventing the site from being crawled. Is it this line? If not, what does it disallow?

Disallow: *?s=

@Welton855


3 Comments


@Jamie184

Usually, this line disallows crawling of internal search result pages.
The best way to prevent a site from being crawled is to protect it with a password (custom authorization).


@Ann8826881

Disallow: *?s=

Bots following the original robots.txt specification would not be allowed to crawl URLs like these:

http://example.com/*?s=
http://example.com/*?s=foo
http://example.com/*?s=/

So they interpret *, ? and = literally (i.e., the string *?s= has to appear as-is at the beginning of the URL path).

But many bots use (their own) extensions to the robots.txt specification, where some characters are reserved, i.e., they get a specific meaning.

Google, for example, uses * for pattern matching:

To block any sequence of characters, use an asterisk (*).

That means the Googlebot is not allowed to crawl URLs like these:

http://example.com/?s=
http://example.com/?s=foo
http://example.com/foo?s=
http://example.com/foo?s=bar
http://example.com/foo/foo/foo?s=bar

Other bots may have other interpretations.
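To make the difference between the two readings concrete, here is a minimal Python sketch. The function names `literal_blocked` and `wildcard_blocked` are made up for illustration, and the matching is a simplification under stated assumptions: it handles only `*`, and it follows the examples above in treating a rule without a leading slash as if it started at the site root. It is not how any particular bot is implemented.

```python
import re


def literal_blocked(rule: str, path: str) -> bool:
    """Literal reading: the rule is a plain prefix of the URL path."""
    # Assumption: a rule without a leading slash is treated as rooted at "/",
    # mirroring the example URLs given for the original specification.
    if not rule.startswith("/"):
        rule = "/" + rule
    return path.startswith(rule)


def wildcard_blocked(rule: str, path: str) -> bool:
    """Wildcard reading: '*' stands for any sequence of characters."""
    # Escape every literal piece, then join the pieces with ".*".
    pattern = ".*".join(re.escape(part) for part in rule.split("*"))
    # re.match anchors at the start of the path, like a prefix rule.
    return re.match(pattern, path) is not None


if __name__ == "__main__":
    rule = "*?s="
    for path in ["/*?s=foo", "/?s=foo", "/foo?s=bar", "/foo"]:
        print(path, literal_blocked(rule, path), wildcard_blocked(rule, path))
```

Note that in this sketch the wildcard reading still needs `?s=` to appear literally, so a URL such as `/foo?x=1&s=2` would not be matched by the rule.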


@Shanna517

If you want to disallow all robots from crawling your site, simply use:

User-agent: *
Disallow: /

User-agent: * means that all robots should follow the rules that come next, and Disallow: / prevents them from crawling any path. You can read more about this on robotstxt.org.
I think your Disallow: *?s= means that robots are not allowed to crawl URIs with parameters, but I'm not sure about that.
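If you want to check what the block-everything rule does before going live, one option is Python's standard-library `urllib.robotparser`. The sketch below is only an illustration: `is_allowed`, `BLOCK_ALL`, the bot name `MyBot` and the URLs are placeholders. As far as I know this parser does plain prefix matching and does not implement wildcard extensions, so it is not a good way to test a rule like *?s=.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The two-line robots.txt from the answer above.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"

if __name__ == "__main__":
    for url in ["http://example.com/", "http://example.com/blog/post"]:
        print(url, is_allowed(BLOCK_ALL, "MyBot", url))
```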

