Optional query string specification with "Disallow" in robots.txt
I'm creating a robots.txt file for my website, which now has a lot of pages. I'm disallowing all the pages that I don't want crawled, like so:
Disallow: /folder/file.aspx
Some of those pages use query strings. How do I account for a query string that may or may not be present, so that the robot is disallowed from crawling the page either way?
I've tried this:
Disallow: /folder/file.aspx?*
This will disallow file.aspx with any query string parameters. But what about a file that is never expected to have query string parameters? Will it still be disallowed if I write
Disallow: /folder/file_with_no_query_string.aspx?*
To cut it short: if I specify "?*", will the rule only match URLs that actually have a query string?
Disallow: /folder/file.aspx
What you had in the beginning is all that's required to block both /folder/file.aspx and /folder/file.aspx?foo=bar. If in doubt, check it with the robots.txt testing tool in Google Search Console.
robots.txt is prefix matching, so there is never any need to include the wildcard (*) at the end of the URL path.
The wildcard char (*) is also an extension to the original "standard", so for maximum compatibility it should be avoided anyway.
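You can check the prefix-matching behaviour yourself. Below is a quick sketch using Python's standard urllib.robotparser, which implements the original prefix-matching spec (it has no support for the * wildcard extension); the example.com URLs are just placeholders:

from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /folder/file.aspx",
])

# Prefix matching blocks the bare URL and every query-string variant:
print(rp.can_fetch("*", "https://example.com/folder/file.aspx"))          # False
print(rp.can_fetch("*", "https://example.com/folder/file.aspx?foo=bar"))  # False
# Other paths are unaffected:
print(rp.can_fetch("*", "https://example.com/folder/other.aspx"))         # True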
As long as none of the query-string variants need to be crawled, you can simply disallow access to the entire file using
Disallow: /folder/file.aspx*
This will disallow crawling of the file on its own as well as of any query string appended to it.
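For crawlers that do honour the wildcard extension (Googlebot, for instance), the documented semantics are: * matches any sequence of characters, a trailing $ anchors the end of the URL, and everything else is a literal prefix match. Here is a rough Python sketch of those rules (the helper names are mine, not from any library), which also answers the original question about "?*":

import re

def rule_to_regex(rule: str) -> "re.Pattern[str]":
    # '*' matches any run of characters; a trailing '$' anchors the end.
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    pattern = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(pattern + ("$" if anchored else ""))

def is_disallowed(rule: str, path: str) -> bool:
    # re.match anchors at the start only, which gives prefix matching.
    return rule_to_regex(rule).match(path) is not None

# "?*" requires the "?" separator, so the bare file is NOT matched:
print(is_disallowed("/folder/file.aspx?*", "/folder/file.aspx"))          # False
print(is_disallowed("/folder/file.aspx?*", "/folder/file.aspx?foo=bar"))  # True

# A plain prefix (with or without a trailing "*") matches both:
print(is_disallowed("/folder/file.aspx", "/folder/file.aspx"))            # True
print(is_disallowed("/folder/file.aspx", "/folder/file.aspx?foo=bar"))    # True

So under the wildcard extension, "?*" does require a query string to be present, which is exactly why the plain prefix rule is the safer choice.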