Proper content for robots.txt on server maintenance page?
I know that the proper server response header should be 503 Service Unavailable, to let search bots know they shouldn't index pages like "Update in progress" and "Server maintenance". But what is the proper content for the robots.txt on these pages?
In my setup the maintenance page is not a sub-page of the site; it is a dedicated website in the server configuration, and I add bindings to it whenever I take the original site offline. Would you recommend an empty robots.txt, a robots.txt containing User-agent: * / Disallow: /, or the same robots.txt as the original website?
robots.txt is meant to prevent search engines from accessing pages, but for server maintenance pages, if the maintenance window is short (say, less than a couple of hours), I wouldn't worry about changing robots.txt at all and would just continue with the maintenance.
Search engine companies are mainly interested in indexing URLs that return status code 200 (meaning the page is good and has content people want to see). If your pages keep returning status 503, chances are they will eventually be dropped from the index.
I would suggest forcing a noindex only on pages that never offer value to the public.
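As a sketch of that short-outage approach: you can return 503 for every request together with a Retry-After header, which tells crawlers the outage is temporary and when to come back. This is a minimal .htaccess fragment, assuming Apache 2.4 with mod_rewrite and mod_headers enabled; the path /maintenance.html is an illustrative assumption.

```apache
# Sketch: temporary-maintenance setup, not a drop-in config.
RewriteEngine On
# Let the maintenance page itself through; everything else gets a 503.
RewriteCond %{REQUEST_URI} !^/maintenance\.html$
RewriteRule .* - [R=503,L]
# Serve this document as the body of the 503 response.
ErrorDocument 503 /maintenance.html
# Hint to crawlers to retry in one hour (value in seconds).
Header always set Retry-After "3600"
```

Note that Retry-After is set here on all responses for simplicity; during the maintenance window nearly everything returns 503 anyway, so this is usually harmless, but you could scope it further if needed.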
You set the 503 with your server configuration (Apache → .htaccess), like
RewriteRule .* - [R=503,L]
(with the R=503 flag, mod_rewrite ignores the substitution URL, which is why the target is written as -), combined with
ErrorDocument 503 /maintenance.html
The page example.com/maintenance.html itself should be set to noindex, because otherwise it can appear in the index. But it should remain crawlable - no crawling exclusion such as
Disallow: /maintenance.html
in robots.txt, because in that case Google could not read the noindex rule.
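One way to apply that noindex without editing the page's HTML is an X-Robots-Tag response header. A sketch in .htaccess, assuming mod_headers is enabled and the file name maintenance.html from the example above:

```apache
# Sketch: send a noindex directive for the maintenance page only.
<Files "maintenance.html">
    Header set X-Robots-Tag "noindex"
</Files>
```

Google only sees this header (or a meta robots noindex tag) if it is allowed to crawl the URL, which is exactly why the page must not be disallowed in robots.txt.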