How do I disallow variable parent directories and subdirectories with robots.txt?
I want to prevent robots from indexing some pages of my site. For example, I want to disallow the following pages:
/category/name1/page/1
/category/name1/page/2
/category/etc/page/3
But I want to allow these pages:
/category/name1/
/category/etc/
How do I express this in robots.txt?
User-agent: *
Disallow: /category/*/page/*
Is that right or not?
More posts by @Jamie184
Yes, to disallow crawling of the URLs under page/ you can use a wildcard as you mentioned. However, there is no need for a wildcard at the end of the pattern, since robots.txt rules use prefix matching by default: a rule matches any URL path that starts with the pattern.
User-agent: *
Disallow: /category/*/page/
That will still allow the crawling of the less specific URLs mentioned.
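As a rough illustration of why the trailing wildcard is redundant, here is a minimal Python sketch of prefix matching with * wildcards. The function name rule_matches is hypothetical, and the logic is a simplification of how wildcard-aware crawlers are generally described as behaving, not any particular crawler's implementation.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches the start of a URL path.

    Simplified sketch: only the * wildcard is handled.
    """
    # Escape each literal piece, then join the pieces with ".*" so that
    # every * in the pattern matches any run of characters.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    # re.match anchors at the start of the path only, which gives prefix matching.
    return re.match(regex, path) is not None


rule = "/category/*/page/"
paths = [
    "/category/name1/page/1",
    "/category/name1/page/2",
    "/category/etc/page/3",
    "/category/name1/",
    "/category/etc/",
]
for path in paths:
    print(path, "->", "disallowed" if rule_matches(rule, path) else "allowed")
```

Under these assumptions, the pattern with a trailing wildcard (/category/*/page/*) gives the same result for every path above, because the prefix match already covers anything that follows page/.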
Note that the wildcard * is an extension of the original robots.txt "standard". Whilst all the main search engine bots support it, not all (good) bots do.