Solving "Googlebot encountered extremely large numbers of links on your site."
I'm getting the following warning over and over again in Google Webmaster Tools:
Googlebot encountered extremely large numbers of links on your site.
The examples it shows don't give me much of a clue about what is actually wrong here. How would you suggest I resolve this issue?
UPDATE: My site has a large number of pages (40M), with ~10M indexed. Should I consider adding noindex to some of the pages to make the site 'smaller' for search engines?
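By noindex I mean something like the following, applied either as a robots meta tag in the page's HTML or as an HTTP response header (the page is just a placeholder):

    <!-- In the <head> of a page that should not be indexed -->
    <meta name="robots" content="noindex">

    # Equivalent HTTP response header, useful for non-HTML resources
    X-Robots-Tag: noindex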
2 Answers
Essentially this message means that we (Google) discovered a surprisingly large number of unique URLs while crawling previously-known URLs. The message is sent out before we attempt to crawl those new, unique URLs (since that can take quite some time), so it can serve as an early warning of crawlability issues in your website's structure. Because it's sent out before those new URLs are crawled, the robots.txt, any noindex robots tags, or a rel=canonical on them are not known at that point.
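For reference, a rel=canonical (one of the signals mentioned above that can't be seen until a URL is actually crawled) looks like this; the URLs are placeholders:

    <!-- On a parameterized or duplicate URL, pointing at the preferred version -->
    <link rel="canonical" href="https://www.example.com/widgets">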
While it's true that large sites tend to see this message more frequently, it's also the case that especially large sites profit more from having a clean & crawlable URL structure from the start. When crawling, we have a limited number of fetches that we can make on a server before it starts to slow down, so if you send us 5-100x more URLs than you actually have content for, that can result in us not being able to pick up new content as quickly as we could if we were able to crawl more efficiently.
My recommendation would be to double-check whether there's something you could do to catch these URLs early (e.g. avoid linking to them at all, or perhaps use rel=nofollow in the depths of multi-faceted search sections), and to see whether the URL parameter handling tool can be used for your site. Alternatively, if you're sure that the search results for your site are "fresh enough," and that the crawling is not a user-noticeable load on your server, then it may be worth just keeping this on your list as something to check when your developers make bigger changes anyway.
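As a rough sketch of the rel=nofollow idea, with a made-up faceted-search URL, a link deep in a filter section might be marked up like this:

    <!-- Hypothetical faceted-navigation link; rel=nofollow hints that crawlers should not follow it -->
    <a href="/search?color=red&size=m&sort=price" rel="nofollow">Red, size M, sorted by price</a>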
There is generally no solution for this "problem". If your site has a large number of pages, you will get this message: with that many pages, "an extremely large number of links on your site" is expected.
I get that message on one of my sites that has 10,000 pages. Another site I worked with, which had millions of pages, also got it.
If your site has only a few pages, this could be caused by Googlebot crawling search results or session IDs. In that case, Googlebot will find a large number of URLs and links, but you only want a few pages indexed. If so, you should block Googlebot from search pages using robots.txt, or configure your site not to put session IDs in URLs served to Googlebot.
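For example, assuming your search results live under a /search path and sessions use a sessionid URL parameter (both placeholders for whatever your site actually uses), rules along these lines in robots.txt would keep Googlebot out of both:

    User-agent: Googlebot
    Disallow: /search
    Disallow: /*?sessionid=
    Disallow: /*&sessionid=

Googlebot supports the * wildcard in robots.txt patterns, which is what makes the session ID rules possible here.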
There is a discussion about this topic at WebmasterWorld.