Do search engines still crawl a noindex page?

Posted in: #DuplicateContent #Nofollow #Noindex #Seo #WebCrawlers

Do search engines crawl a page that has a 'noindex' meta attribute?

I ask because we have near-duplicate content caused by faceted navigation. The filtered pages have stated 'noindex' and I was wondering if these pages would still be detected as duplicates?

Do I have to add a 'nofollow' attribute to the link whilst we make these pages unique?


2 Comments


@Alves908

As Goyllo has already stated, search engine bots will crawl pages that have a noindex meta tag. If you think about it, they need to crawl the page in order to see the noindex meta tag in the first place. (You could use an X-Robots-Tag HTTP response header instead and, in theory, a bot would only need to do a HEAD request in order to see the noindex attribute - but that's not how Google rolls.)

If a page is noindex, it can still be follow (which it would be by default, unless you explicitly state nofollow as well), so the page would obviously need to be crawled in order to discover any links to follow.
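To make the point above concrete, here is a minimal sketch of how a crawler might read robots directives once it has fetched a page. It is not how any particular search engine does it. The names `RobotsMetaParser`, `robots_directives`, `may_index` and `may_follow` are hypothetical, and the sketch ignores bot-specific forms such as `googlebot: noindex` in the header.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(
                token.strip().lower() for token in content.split(",") if token.strip()
            )


def robots_directives(html, headers=None):
    """Return the directives found in the HTML and in any X-Robots-Tag header."""
    parser = RobotsMetaParser()
    parser.feed(html)
    directives = set(parser.directives)
    for name, value in (headers or {}).items():
        if name.lower() == "x-robots-tag":
            directives.update(
                token.strip().lower() for token in value.split(",") if token.strip()
            )
    return directives


def may_index(directives):
    # "none" is shorthand for "noindex, nofollow".
    return "noindex" not in directives and "none" not in directives


def may_follow(directives):
    # Following links is the default unless nofollow (or none) is stated.
    return "nofollow" not in directives and "none" not in directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex"></head></html>'
    found = robots_directives(page)
    print(found, may_index(found), may_follow(found))
```

The page body has to be downloaded before `robots_directives` can see the meta tag, which is the reason a noindex page still gets crawled. With only `noindex` present, `may_follow` stays true, so links on the page would still be discovered.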

Do I have to add a 'nofollow' attribute to the link whilst we make these pages unique?

That simply discounts that particular link from the ranking algorithm. So, that particular link will not be used as a ranking factor for the target URL. I assume it's highly likely that there are other inbound links to that page as well?

...pages have stated 'noindex' and I was wondering if these pages would still be detected as duplicates?

Duplicate of what? A page can only be considered a duplicate (in the eyes of the search engine index) if it is indexed. If it's not indexed then it can't be a duplicate.

The duplicate content "problem" arises when two (or more) duplicate pages have been crawled and indexed, so the search engine must decide which page to return in the SERPs. Unless you resolve the duplication yourself (with a redirect, a canonical tag, or simply by making the content unique), the decision is out of your control: the search engine makes it for you. You are also potentially diluting your search ranking, as users discover different pages and link back to one or the other.
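As a rough illustration of the canonical tag option for faceted navigation, the sketch below strips filter parameters from a URL and builds a canonical link element for it. The parameter names in `FACET_PARAMS` and the function names are assumptions for the example only, so swap in whatever your own filters use.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical facet parameters; replace with the filters your navigation adds.
FACET_PARAMS = {"color", "size", "sort"}


def canonical_url(url):
    """Drop facet parameters (and any fragment) so filtered pages share one URL."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in FACET_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url):
    """Build the <link rel="canonical"> element for the page's <head>."""
    return '<link rel="canonical" href="{}">'.format(escape(canonical_url(url), quote=True))


if __name__ == "__main__":
    print(canonical_link_tag("https://example.com/shoes?color=red&size=9"))
```

Each filtered page would then point at the unfiltered listing, which tells the search engine which version you prefer rather than leaving the choice to it.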

To prevent a page from being crawled (i.e. not even requested), you can include an entry in your robots.txt file. However, this means the search engines will be unable to see your noindex meta tag. Whilst this should prevent the page appearing in normal search results, it doesn't necessarily prevent the page from appearing as a link-only result in the SERPs (i.e. "indexed") if it is linked to. However, it still can't be considered a "duplicate", because its content won't have been read and indexed.
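If you want to check which URLs a robots.txt entry would block, Python's standard `urllib.robotparser` can evaluate the rules locally. This is only a sketch: the `/filter/` path and the helper names are made up for the example, and the standard-library parser does plain prefix matching, so real search engine bots may interpret wildcard rules differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks a path-based facet section.
ROBOTS_TXT = """\
User-agent: *
Disallow: /filter/
"""


def build_parser(robots_txt):
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_crawlable(parser, url, user_agent="*"):
    """True if the rules allow this user agent to request the URL."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    rules = build_parser(ROBOTS_TXT)
    for url in ("https://example.com/filter/red", "https://example.com/shoes"):
        print(url, is_crawlable(rules, url))
```

Any URL this reports as blocked is never requested by a compliant bot, so a noindex meta tag on that page goes unseen.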


@Nimeshi995

Yes, Google still crawls webpages that have a noindex tag.

But if you have the same content on two different webpages and one URL contains a noindex tag while the other does not, then you should not worry about it, because out of all the duplicate content only one webpage is indexed by Google. The rest of the webpages are crawlable but not indexed in Google search results, so that is fine.

