A Disallow field that does not follow a User-agent field (i.e. is not part of a group) is invalid, and bots should ignore it. However, the rules that govern robots.txt are not a strict standard, so as with anything "invalid" in this respect, robot behaviour could be unpredictable.
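For example, in the following hypothetical robots.txt, the first Disallow line sits outside any group and would typically be ignored, while the second belongs to the "*" group and takes effect:

```
# Orphan rule: appears before any User-agent line, so it is not
# part of a group and should be ignored by compliant bots.
Disallow: /tmp/

User-agent: *
Disallow: /private/
```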
The original robots.txt standard (1994) simply states:
The record starts with one or more User-agent lines, followed by one or more Disallow lines, as detailed below. Unrecognised headers are ignored.
In this respect, a stray Disallow field could arguably be treated as an "unrecognised header" and simply ignored.
The 1997 Internet Draft specification, A Method for Web Robots Control, states with respect to the User-agent line:
If no such record exists [i.e. no specific user-agent match], it should obey the first record with a User-agent line with a "*" value, if present. If no record satisfied either condition, or no records are present at all, access is unlimited.
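This behaviour can be seen in practice with Python's standard-library urllib.robotparser. This is a sketch using one concrete parser implementation, not a normative reference; other crawlers may differ, since robots.txt handling is not strictly standardised. The URL and user-agent name are illustrative:

```python
from urllib import robotparser

# Orphan Disallow: no preceding User-agent line, so the parser
# discards it; with no records at all, access is unlimited.
orphan = robotparser.RobotFileParser()
orphan.parse(["Disallow: /private/"])
print(orphan.can_fetch("MyBot", "http://example.com/private/"))  # True

# The same rule inside a "*" group is honoured.
grouped = robotparser.RobotFileParser()
grouped.parse(["User-agent: *", "Disallow: /private/"])
print(grouped.can_fetch("MyBot", "http://example.com/private/"))  # False
```

Note how the orphan rule is silently dropped rather than raising an error, which matches the "ignore invalid lines" behaviour described above.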
The Google docs (Robots.txt specifications) state that a disallow field is "only valid as a group-member record" and a user-agent field indicates the "start of group". The Google docs also include a more formal definition which indicates that an entry must start with a "startgroupline" (ie. a user-agent field).
If you test this in the "robots.txt Tester" in Google Search Console (formerly Google Webmaster Tools), any Disallow field that is not part of a group (i.e. is not preceded by a User-agent field) is simply flagged as an error and ignored.