Is there a reason msnbot crawls in spikes?

@Jessie594

Posted in: #Bing #RobotsTxt #WebCrawlers

I've been experiencing high RPM spikes recently. Something like this:


When I debugged, I found reason to believe that msnbot suddenly makes a massive crawl and then stops. I assume mine is not the only site that has trouble suddenly handling 5x the normal RPM, so why does msnbot do this? Is there any valid explanation or technical reason for such a hit-and-run?


1 Comments


@Ann8826881

The msnbot was retired from active web crawling in 2010 and replaced with bingbot - is that what you meant?

Regardless, as covered here, factors that can affect its crawl rate are:


The total number of pages on a site (is the site small, large, or somewhere in-between?)
The size of the content (PDFs and Microsoft Office files are typically much larger than regular HTML files)
The freshness of the content (how often is content added/removed/changed?)
The number of allowed concurrent connections (a function of the web server infrastructure)
The bandwidth of the site (a function of the host’s service provider; the lower the bandwidth, the lower the server’s capacity to serve page requests)
How highly the site ranks (content judged as not relevant won’t be crawled as often as highly relevant content)


Taking the above into account might help explain the spikes in your requests per minute.
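Before changing anything, it may help to confirm which user agent is actually behind the spike. The following is only a rough sketch, assuming your server writes access logs in the common "combined" format (timestamp in square brackets, user agent as the last quoted field). The function name `requests_per_minute`, the `SAMPLE` lines, and the IP addresses are made up for illustration.

```python
import re
from collections import Counter
from datetime import datetime

# Assumption: combined log format, e.g.
# 1.2.3.4 - - [10/Oct/2020:13:55:36 +0000] "GET / HTTP/1.1" 200 512 "-" "UA string"
# The pattern captures the bracketed timestamp and the last quoted field.
LINE_RE = re.compile(r'\[(?P<ts>[^\]]+)\].*"(?P<ua>[^"]*)"\s*$')


def requests_per_minute(lines, bot_names=("msnbot", "bingbot")):
    """Count requests per (minute, bot) where the user agent contains a bot name."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # skip lines that do not look like combined log format
        user_agent = match.group("ua").lower()
        for bot in bot_names:
            if bot in user_agent:
                stamp = datetime.strptime(match.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
                counts[(stamp.strftime("%Y-%m-%d %H:%M"), bot)] += 1
                break
    return counts


# Hypothetical sample lines; in practice you would pass an open log file instead.
SAMPLE = [
    '203.0.113.5 - - [10/Oct/2020:13:55:36 +0000] "GET /a HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"',
    '203.0.113.5 - - [10/Oct/2020:13:55:40 +0000] "GET /b HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"',
    '203.0.113.5 - - [10/Oct/2020:13:56:01 +0000] "GET /c HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"',
    '198.51.100.7 - - [10/Oct/2020:13:56:02 +0000] "GET /d HTTP/1.1" 200 512 "-" "Mozilla/5.0 (X11; Linux x86_64) Firefox/80.0"',
]

if __name__ == "__main__":
    for (minute, bot), hits in sorted(requests_per_minute(SAMPLE).items()):
        print(minute, bot, hits)
```

If the per-minute counts for one bot jump at the same moments as your RPM graph, that is the user agent to name in robots.txt. A user agent string can be spoofed, so treat the match as a hint and not as proof.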

To slow down the crawl rate, specify in your site's robots.txt:

User-agent: msnbot
Crawl-delay: 1


Change msnbot to bingbot if you determine that's the bot/user-agent causing the spike. And use a crawl-delay of 5 (very slow) or 10 (extremely slow) if your server's performance is suffering.
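To double-check that the file says what you intend, you could read it back with Python's standard `urllib.robotparser`. This is only a sketch: the `ROBOTS_TXT` content below is an assumed example that keeps the msnbot rule from above and adds a bingbot entry with a delay of 5, and `crawl_delay_for` is a hypothetical helper name. It shows how Python's parser reads the file, not how any particular crawler will behave.

```python
from urllib.robotparser import RobotFileParser

# Assumed example file: the msnbot rule from the answer plus a slower bingbot rule.
ROBOTS_TXT = """\
User-agent: msnbot
Crawl-delay: 1

User-agent: bingbot
Crawl-delay: 5
"""


def crawl_delay_for(robots_txt, user_agent):
    """Return the Crawl-delay that applies to user_agent, or None if none is set."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # Pass the short bot token (e.g. "bingbot"), not the full User-Agent header.
    return parser.crawl_delay(user_agent)


if __name__ == "__main__":
    for agent in ("msnbot", "bingbot", "googlebot"):
        print(agent, crawl_delay_for(ROBOTS_TXT, agent))
```

Agents with no matching entry and no `*` default get `None`, meaning this file sets no delay for them.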
