How to disallow indexing but allow crawling?
On the front page of my website, I have previews of articles (each with a short introduction) that link to the full articles.
I want to disallow the front page to prevent duplicate content. But if I do this in robots.txt, would it still be crawled?
That is, would the crawler still reach the full articles even though I disallowed the only page that links to them?
I do want the crawler to access the page and follow the links on it; I just don't want it to index the content, which is repeated in the full articles.
More posts by @Margaret670
1 Comments
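That is what the robots meta tag is for: it controls indexing and link following on a per-page basis.
I've come to prefer it over robots.txt because it gives finer control. A Disallow rule in robots.txt stops the crawler from fetching the page at all, so it would not follow the links on it either.
For your page, you'd want noindex,follow. Leave the page out of robots.txt, since a blocked crawler never fetches the page and so never sees the tag. The robot will then read the page, not index it, but follow all the links on it: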
<meta name="robots" content="noindex,follow" />
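If you want to confirm that a page actually carries the tag, here is a minimal sketch using Python's standard html.parser module. The names RobotsMetaParser, robots_directives and FRONT_PAGE, as well as the sample markup, are made up for illustration; it only inspects the HTML you hand it and says nothing about how any particular search engine behaves.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> also end up here by default.
        if tag != "meta":
            return
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() != "robots":
            return
        # The content attribute is a comma-separated list of directives.
        for part in attr_map.get("content", "").split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def robots_directives(html):
    """Return the set of robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


# Hypothetical front page used only for this example.
FRONT_PAGE = """<html><head>
<title>Home</title>
<meta name="robots" content="noindex,follow" />
</head><body><a href="/articles/first">First article</a></body></html>"""

directives = robots_directives(FRONT_PAGE)
print("noindex" in directives, "follow" in directives)  # prints: True True
```

The tag belongs inside the page's head element, as in the sample markup above.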