Specifying that crawlers should not crawl links which depend on external APIs
I have links on my site which lead to internal pages that depend on first fetching data from an external API. This only takes time the first time they are requested; pages whose data already exists in the DB load a lot faster. I want to specify that search engines should only crawl the pages that are already in the DB. This is what I thought of:
1. Creating a sitemap with the internal links (a rough sketch is below)
2. Adding this to every page on my site <META NAME="ROBOTS" CONTENT="NOFOLLOW" />
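For reference, a minimal sketch of the sitemap I have in mind. The domain and page paths are placeholders, not my real URLs:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- Only pages whose data is already in the DB and load fast -->
  <url>
    <loc>https://www.example.com/products/widget-1</loc>
  </url>
  <url>
    <loc>https://www.example.com/products/widget-2</loc>
  </url>
</urlset>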
Will this succeed? Does the sitemap override the nofollow?
Would you suggest a different direction?
Thanks
1 Comment
Adding that meta tag to every page tells search engines not to follow any of the links on your site, which effectively prevents them from crawling the whole site, so you should avoid that! Adding pages to the sitemap should still allow them to be crawled and indexed (since you are not telling search engines to ignore the pages themselves). But if search engines see no links to them on your own site, they will not rank well, or at all.
One solution would be to use robots.txt and block the URLs that use the API, assuming they follow a standard format or it is easy to generate the list of things to block.
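As a rough sketch, assuming the API-dependent pages all live under a path like /api-pages/ (that path is just a placeholder for whatever pattern your URLs follow), the robots.txt could be:

# Block the slow, API-dependent URLs from being crawled
User-agent: *
Disallow: /api-pages/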
However, the better solution would be to spider the site yourself and make sure all pages you link to are already generated and in your database. This way the pages will be fast for users (your main concern of course) and search engines as well.
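A minimal sketch of that idea in Python, assuming the start URL is your home page and that staying on the same host is enough to find all internal pages (a real crawler would also want rate limiting and better HTML parsing):

import urllib.request
import urllib.parse
import re

START_URL = "https://www.example.com/"  # placeholder: your own site's home page
seen = set()
queue = [START_URL]

# Breadth-first crawl of internal links; requesting each page forces the
# site to fetch the external API data once so it ends up in the DB.
while queue:
    url = queue.pop(0)
    if url in seen:
        continue
    seen.add(url)
    try:
        with urllib.request.urlopen(url, timeout=30) as resp:
            html = resp.read().decode("utf-8", errors="replace")
    except Exception as exc:
        print("failed:", url, exc)
        continue
    # Very naive link extraction; an HTML parser would be more robust.
    for href in re.findall(r'href="([^"#]+)"', html):
        link = urllib.parse.urljoin(url, href)
        # Stay on the same host so we only pre-generate our own pages.
        if urllib.parse.urlparse(link).netloc == urllib.parse.urlparse(START_URL).netloc:
            queue.append(link)
    print("warmed:", url)

Running something like this after each content update, or from a scheduled job, would keep all linked pages pre-generated for both users and crawlers.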