How to decrease the frequency of Google/search-engine bots crawling my site?

@BetL925

Posted in: #CrawlRate #Googlebot #Performance #WebCrawlers

My server handles my visitors fine, but recently I found that many search engine bots are crawling my site, and my server gets quite busy serving them.

My site generates over 1000 posts each day, so it may be normal for bots to crawl it often.

However, is it possible to decrease the frequency at which bots crawl my site? Or to tell the bots to crawl only my new posts? They seem to be crawling my site more and more often, and my server becomes slow when they visit.


3 Comments


 

@Cofer257

You should also specify a crawl delay in robots.txt for all the other search engines (Yandex and Baidu can be quite aggressive in their crawling). Add this:

User-agent: *
Crawl-delay: 5


The crawl delay is in seconds. Make sure not to go too high: 5-10 seconds at most should lighten the server load considerably. If you have 1000 new pages per day, you want search engines to be able to find them all.

However, the best method for Google (and possibly Bing) is still their webmaster tools, since Google ignores the Crawl-delay directive in robots.txt.
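As a rough sanity check on the delay value (a sketch, assuming a single crawler fetching sequentially, one request per delay interval), you can work out the maximum number of requests a bot can make per day:

```python
def max_daily_requests(crawl_delay_seconds):
    # 86,400 seconds in a day; one sequential request per delay interval.
    return 86_400 // crawl_delay_seconds

print(max_daily_requests(5))   # 17280 requests/day
print(max_daily_requests(10))  # 8640 requests/day
```

Even at a 10-second delay, a single bot can still cover well over 1000 new pages per day, so the delay throttles load without hiding new content.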



 

@Eichhorn148

I think this help document from Google should solve the problem:


Change the crawl rate:

1. On the Webmaster Tools Home page, click the site you want.
2. Click the gear icon, then click Site Settings.
3. In the Crawl rate section, select the option you want.

The new crawl rate will be valid for 90 days.



 

@Dunderdale272

The de facto standard for telling robots what to crawl is robots.txt.

Make the URLs of your posts fit a pattern so that you can generate a robots.txt that permits crawling of just the new postings. You can generate it automatically with a script, e.g. on the fly (as a CGI script), in a batch job you run every hour, or in some other way.
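A minimal sketch of such a generator, assuming a hypothetical date-based URL scheme like /posts/YYYY/MM/DD/slug (not something the site necessarily uses): allow the last few days of posts explicitly, then disallow the rest of the archive. Crawlers that support Allow, such as Googlebot, apply the most specific matching rule.

```python
from datetime import date, timedelta

def generate_robots_txt(days_to_allow=7, today=None):
    """Build a robots.txt permitting only recent date-based post URLs."""
    today = today or date.today()
    lines = ["User-agent: *"]
    # Explicitly allow each of the last N days of posts...
    for offset in range(days_to_allow):
        d = today - timedelta(days=offset)
        lines.append(f"Allow: /posts/{d.year:04d}/{d.month:02d}/{d.day:02d}/")
    # ...and block the older archive.
    lines.append("Disallow: /posts/")
    return "\n".join(lines) + "\n"

print(generate_robots_txt(days_to_allow=3, today=date(2024, 5, 17)))
```

Run hourly from cron (or served on the fly), this keeps bot traffic focused on fresh content; note that bots may cache robots.txt for up to a day, so the allowed window should be longer than the regeneration interval.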


