How do you disallow the root in robots.txt, but allow a subdirectory? Using robots.txt, how do you disallow the root of a site (http://www.example.com/) but allow a subdirectory (http://www.example.com/lessons/)?
You must list the Allow lines first, because the file is read on a first-match basis.
To evaluate if access to a URL is allowed, a robot must attempt to
match the paths in Allow and Disallow lines against the URL, in the
order they occur in the record. The first match found is used. If no
match is found, the default assumption is that the URL is allowed.
Reference: www.robotstxt.org/norobots-rfc.txt
Google provides a robots.txt testing tool in Webmaster Tools, and I always recommend testing your file with it. See "Test a site's robots.txt file" near the bottom of:
support.google.com/webmasters/bin/answer.py?hl=en&answer=156449
User-agent: *
Allow: /lessons/
Allow: /other-dir/
Disallow: /
This disallows the entire website while explicitly allowing the listed directories. (Note that the Allow lines come before Disallow: / — under first-match evaluation, putting Disallow: / first would block everything, including /lessons/.)
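If you want to sanity-check the ordering locally, Python's standard-library urllib.robotparser evaluates rules in order and uses the first match, like the robotstxt.org spec quoted above. A quick sketch (example.com is just the placeholder from the question):

```python
from urllib.robotparser import RobotFileParser

# Allow lines listed before the blanket Disallow, as required
# under first-match evaluation.
rules = """\
User-agent: *
Allow: /lessons/
Allow: /other-dir/
Disallow: /
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

print(rp.can_fetch("*", "http://www.example.com/"))          # False: root blocked
print(rp.can_fetch("*", "http://www.example.com/lessons/"))  # True: subdirectory allowed
```

Swapping the rules so Disallow: / comes first makes can_fetch return False for /lessons/ too, which is exactly the mistake the ordering advice guards against. (Googlebot itself uses most-specific-match rather than strict first-match, so Allow-first ordering is safe for both interpretations.)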