
How do I disallow variable parent directories and subdirectory with robots.txt?

@Jamie184

Posted in: #RobotsTxt #SearchEngines #Seo

I want to prevent robots from indexing some pages of my site. For example, I want to disallow the following pages:

/category/name1/page/1
/category/name1/page/2
/category/etc/page/3


But I want to allow these pages:

/category/name1/
/category/name1/
/category/etc/


How do I express this in robots.txt?

User-agent: *
Disallow: /category/*/page/*


Is that right or not?


@Ogunnowo487

To disallow crawling of the URLs that contain /page/ under any category, then yes, you can use a wildcard like you mentioned. However, there is no need for a wildcard at the end of the pattern, since robots.txt rules use prefix matching by default.

User-agent: *
Disallow: /category/*/page/


That will still allow the crawling of the less specific URLs mentioned.
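
If you want to sanity-check a pattern before deploying it, here is a minimal Python sketch of the matching logic described above: * matches any run of characters, and a rule otherwise matches as a prefix. The function name rule_matches is hypothetical, and this is a hand-rolled approximation for illustration, not any crawler's actual implementation.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Approximate robots.txt path matching (hypothetical helper).

    '*' matches any sequence of characters, a trailing '$' anchors the
    end of the URL path, and otherwise the pattern matches as a prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal parts and join them with '.*' in place of each '*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start only, which gives prefix matching.
    return re.match(regex, path) is not None


RULE = "/category/*/page/"

for path in (
    "/category/name1/page/1",
    "/category/name1/page/2",
    "/category/etc/page/3",
    "/category/name1/",
    "/category/etc/",
):
    status = "blocked" if rule_matches(RULE, path) else "allowed"
    print(f"{path}: {status}")
```

Under these assumptions the three /page/ URLs from the question come out as blocked and the category URLs as allowed, and the pattern with the trailing * behaves the same way.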

Note that the wildcard * is an extension of the "standard". Whilst all the main search engine bots support it, not all (good) bots do.
