
@Alves908

A Disallow field that does not follow a User-agent field (ie. is not part of a group) is invalid and bots should simply ignore it. However, the rules that govern robots.txt are not a strict standard, so as with anything "invalid" in this respect, robot behaviour could be unpredictable.
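
As a rough illustration (not part of the specs themselves), Python's standard-library urllib.robotparser is one parser that behaves this way: a Disallow line that appears before any User-agent line is silently dropped, while rules inside a proper group still apply. The robots.txt content below is made up for the example.

from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: an orphaned Disallow line (no preceding
# User-agent) followed by a well-formed group.
ROBOTS_TXT = """\
Disallow: /orphaned/

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The orphaned rule is dropped, so this URL remains fetchable:
print(parser.can_fetch("*", "https://example.com/orphaned/page"))   # True

# The rule inside the "User-agent: *" group applies as normal:
print(parser.can_fetch("*", "https://example.com/private/page"))    # False

This only shows how one well-known parser handles it; as noted above, other bots may behave differently.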

The original robots.txt standard (1994) simply states:


The record starts with one or more User-agent lines, followed by one or more Disallow lines, as detailed below. Unrecognised headers are ignored.


In this respect, a Disallow field that is not part of a group could perhaps be seen as an "unrecognised header". (?)

The 1997 Internet Draft specification, "A Method for Web Robots Control", states with respect to the User-agent line:


If no such record exists [ie. a specific user-agent match], it should obey the first record with a User-agent line with a "*" value, if present. If no record satisfied either condition, or no records are present at all, access is unlimited.
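
This fallback behaviour can also be seen in urllib.robotparser (again only a sketch, with made-up group names and URLs):

from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt with a bot-specific group and a catch-all "*" group.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /examplebot-only/

User-agent: *
Disallow: /everyone-else/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# "ExampleBot" matches its own group, so only that group's rules apply:
print(parser.can_fetch("ExampleBot", "https://example.com/examplebot-only/x"))  # False
print(parser.can_fetch("ExampleBot", "https://example.com/everyone-else/x"))    # True

# Any other bot falls back to the "User-agent: *" group:
print(parser.can_fetch("OtherBot", "https://example.com/everyone-else/x"))      # False

Remove the "*" group as well and can_fetch() returns True for any URL requested by an unlisted bot, ie. access is unlimited.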


The Google docs (Robots.txt specifications) state that a disallow field is "only valid as a group-member record" and a user-agent field indicates the "start of group". The Google docs also include a more formal definition which indicates that an entry must start with a "startgroupline" (ie. a user-agent field).

If you test this in the "robots.txt Tester" in Google Search Console (formerly Google Webmaster Tools), any Disallow field that is not part of a group (ie. is not preceded by a User-agent field) is simply flagged as an error and ignored.
