Robots.txt: block all webpages except a few?
I have a few doubts regarding robots.txt. Say my domain is stackoverflow.com.
A) Will the code below do the following for all the crawlers?
User-agent: *
Disallow: /
Allow: /$
Allow: /a/$
Allow: /a/login.php
Allow: /a/login.php?return=/pligg/
Accepting stackoverflow.com/ (will this also accept stackoverflow.com?)
Accepting stackoverflow.com/a/
Accepting stackoverflow.com/a/login.php
Accepting stackoverflow.com/a/login.php?return=/pligg/
Not accepting any other page on stackoverflow.com
B) Which is right: robots.txt or robot.txt?
Your robots.txt is invalid: blank lines are not allowed within a record. It should look like this:
User-agent: *
Disallow: /
Allow: /$
Allow: /a/$
Allow: /a/login.php
Allow: /a/login.php?return=/pligg/
Will the code below do the following for all the crawlers?
No, your robots.txt won’t work that way for all crawlers.
Allow is not part of the original robots.txt specification. Only some parsers understand it (and they may have implemented the wildcard matching differently); all other parsers will simply ignore the Allow lines.
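As a concrete illustration of that difference, Python's standard-library parser, urllib.robotparser, does understand Allow but does not implement the $ end-anchor wildcard and evaluates rules in file order, so the broad Disallow: / matches every URL before any Allow line is reached. A minimal sketch (the URLs are just the examples from the question):

```python
from urllib.robotparser import RobotFileParser

# The corrected robots.txt from the answer above.
robots_txt = """\
User-agent: *
Disallow: /
Allow: /$
Allow: /a/$
Allow: /a/login.php
Allow: /a/login.php?return=/pligg/
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# urllib.robotparser checks rules in file order and treats "$" as a
# literal character, so "Disallow: /" matches every path first:
print(rp.can_fetch("*", "https://stackoverflow.com/"))             # False
print(rp.can_fetch("*", "https://stackoverflow.com/a/login.php"))  # False
```

Googlebot, by contrast, honors Allow and the $ wildcard and prefers the most specific matching rule, so the same file can behave as intended for Google while blocking everything for a parser like the one above; that is exactly why the behavior is not uniform across crawlers.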