Google crawl speed -- how fast can it go?
I have a huge website with 5 million pages. Currently Google indexes about 10,000 pages per day. This is very slow, and I still have lots of pages that I can't get indexed. Does anyone know what the upper threshold for crawl speed is?
More posts by @Turnbaugh106
4 Comments
I've found that it is possible to reach a crawl speed of 2 pages/second by improving server response time. Each page should respond as fast as possible. This may require garbage collector tuning, DB tuning, and code tuning. In my experience, once the average response time drops below 50 ms, Google indexes at about 2 pages/sec.
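To put those numbers in perspective, here is a rough back-of-the-envelope calculation in Python. It uses the 5 million pages and 10,000 pages/day from the question and the 2 pages/second figure above. The helper name days_to_crawl is made up for illustration, and the sketch assumes a constant crawl rate with every page fetched exactly once (no re-crawls).

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def days_to_crawl(total_pages, pages_per_day):
    """Days needed to fetch every page once at a constant rate (illustrative helper)."""
    return total_pages / pages_per_day


total_pages = 5_000_000                 # site size, from the question
current_rate = 10_000                   # pages/day, from the question
faster_rate = 2 * SECONDS_PER_DAY       # 2 pages/sec = 172,800 pages/day

print(days_to_crawl(total_pages, current_rate))            # 500.0
print(round(days_to_crawl(total_pages, faster_rate), 1))   # 28.9
```

So under these assumptions the full site takes about 500 days at the current rate and roughly a month at 2 pages/second.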
Google's crawl rate is a function of:
Pagerank -- the more reputation and inbound links your site has, the more it will be crawled. Within your site the most prominent pages (like the home page) will get crawled more often because they have higher pagerank.
How often your pages change -- pages that change frequently will get re-crawled more often than pages that don't.
How fast your server is -- rather than having a number of pages per day that Googlebot downloads, it appears that it is limited by the amount of time spent downloading pages. Making pages smaller and increasing the speed of the server can both let Googlebot crawl faster.
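The last point can be sketched as a simple model: if Googlebot spends a roughly fixed amount of time per day downloading from a site, then the number of pages crawled per day is that time divided by the average download time per page. The daily budget below is an invented number chosen only for illustration, not anything Google publishes, and the model assumes one request at a time.

```python
def pages_per_day(budget_ms, avg_download_ms):
    """Pages fetched per day under a fixed download-time budget (hypothetical model)."""
    return budget_ms // avg_download_ms


# Assumed daily download-time budget in milliseconds; purely illustrative.
budget_ms = 2_000_000

for avg_ms in (400, 200, 50):
    print(f"{avg_ms} ms/page -> {pages_per_day(budget_ms, avg_ms)} pages/day")
```

In this model, halving the average download time doubles the number of pages crawled, which is why smaller pages and a faster server both help.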
In addition, Googlebot has several different crawl modes.
Re-crawl mode -- it will come back and visit pages that it has visited before.
Fresh crawl mode -- it will crawl lots of new pages in a new section of a site. The higher the pagerank of the site, the more pages get crawled.
Stale pages mode -- Googlebot finds a box of old links in the basement and plows through them just for "fun". These pages are often all pages that no longer exist and are redirected to other pages. They often have no pagerank and are crawled in URL-length order.
The upshot of this is that the best way to get your site crawled faster is to get inbound links and increase the pagerank.
If they're crawling your pages and they're not being found in the search results then the crawl rate is not an issue. This sounds like your website is full of low quality content that Google does not want in its index. Is this original content? Is it quality content? Google not listing your pages indicates that it is not.