Robots.txt and pattern matching

@Bryan171

Posted in: #Googlebot #RobotsTxt

I'm adding this to my robots.txt:

User-agent: *
Disallow: /*action=*$


How do robots that don't recognize wildcards handle this?





1 Comment


@Angela700

Robots that do not recognize wildcards (wildcard support is not part of the original robots.txt specification) will treat * as a literal character. Since a literal * is unlikely to ever appear in your URLs, they may simply ignore the rule altogether. In either case, the rule will most likely have no effect on them.
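To make the contrast concrete, here is a minimal Python sketch (the rule and sample path are just illustrative values, and the function names are my own) comparing Googlebot-style wildcard matching with the plain prefix matching a wildcard-unaware crawler would fall back to:

import re

def wildcard_match(pattern: str, path: str) -> bool:
    # Googlebot-style semantics: '*' matches any run of characters,
    # a trailing '$' anchors the pattern to the end of the path.
    regex = re.escape(pattern).replace(r"\*", ".*")
    if regex.endswith(r"\$"):
        regex = regex[:-2] + "$"
    return re.match(regex, path) is not None

def literal_match(pattern: str, path: str) -> bool:
    # A wildcard-unaware crawler just does prefix matching,
    # treating '*' and '$' as ordinary characters.
    return path.startswith(pattern)

rule = "/*action=*$"
path = "/index.php?action=edit"

print(wildcard_match(rule, path))  # True  -> Googlebot treats the URL as disallowed
print(literal_match(rule, path))   # False -> a literal matcher never blocks this path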

Exactly how this plays out depends on each crawler's robots.txt implementation, so it cannot be entirely relied on.

If you want to avoid this, you could use a separate configuration for Googlebot (and for other crawlers that do honor wildcards).

E.g.

# All other crawlers: block everything
User-agent: *
Disallow: /

# Googlebot understands wildcards, so it gets only this rule
User-Agent: Googlebot
Disallow: /*action=*$


This bans all robots except Googlebot: a crawler obeys only the most specific User-agent group that matches it, so Googlebot ignores the blanket Disallow: / and follows only the wildcard rule in its own group.
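As a rough sanity check, here is a small self-contained sketch (the group table, helper names, and sample paths are hypothetical, and real crawlers may differ in edge cases) modelling which paths each crawler may fetch under that configuration:

import re

# Model of the two groups above: a crawler obeys only the most
# specific User-agent group that matches it.
GROUPS = {
    "googlebot": ["/*action=*$"],  # Googlebot: only the wildcard rule
    "*": ["/"],                    # every other crawler: everything disallowed
}

def pattern_to_regex(pattern: str) -> str:
    # Same wildcard translation as before: '*' -> '.*', trailing '$' -> end anchor.
    regex = re.escape(pattern).replace(r"\*", ".*")
    return regex[:-2] + "$" if regex.endswith(r"\$") else regex

def is_blocked(agent: str, path: str) -> bool:
    rules = GROUPS.get(agent.lower(), GROUPS["*"])
    return any(re.match(pattern_to_regex(rule), path) for rule in rules)

print(is_blocked("Googlebot", "/index.php?action=edit"))   # True:  wildcard rule matches
print(is_blocked("Googlebot", "/index.php?page=2"))        # False: Googlebot may crawl it
print(is_blocked("SomeOtherBot", "/index.php?page=2"))     # True:  the '*' group blocks it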


