
BingBot hitting multiple subdomains all at the same time, causing panic

@Murray155

Posted in: #Bingbot #MultiSubdomains #WebCrawlers

I have a site with multiple subdomains. At certain hours of the day, Bingbots would gather at my site and do a massive scan like this:

01:23:11 a.example.com GET /index HTTP/1.1 200 Bot.A
01:23:11 b.example.com GET /index HTTP/1.1 200 Bot.A
01:23:11 c.example.com GET /index HTTP/1.1 200 Bot.A
01:23:11 d.example.com GET /index HTTP/1.1 200 Bot.A
01:23:12 e.example.com GET /index HTTP/1.1 200 Bot.A
01:23:12 f.example.com GET /index HTTP/1.1 403 Bot.A
01:23:12 g.example.com GET /index HTTP/1.1 403 Bot.A
01:23:22 h.example.com GET /index HTTP/1.1 200 Bot.B
01:23:22 i.example.com GET /index HTTP/1.1 200 Bot.B
01:23:22 j.example.com GET /index HTTP/1.1 200 Bot.B
01:23:22 k.example.com GET /index HTTP/1.1 200 Bot.B
01:23:23 l.example.com GET /index HTTP/1.1 200 Bot.B
01:23:23 m.example.com GET /index HTTP/1.1 403 Bot.B
01:23:23 n.example.com GET /index HTTP/1.1 403 Bot.B


Because the bots scan across multiple subdomains, and each subdomain serves its own robots.txt, the Crawl-delay: 1 directive has no effect on this behaviour: Bingbot honours the delay per host, but the combined request rate across all the subdomains stays high. The server's defence mechanism then kicks in and blocks the crawlers with 403 errors, as seen in the log above.
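
For context, the defence is effectively a single rate limit shared across all subdomains, keyed on the crawler's user agent. A rough sketch of the idea as Python WSGI middleware (illustrative only, not my actual stack; the RATE and BURST numbers are made up):

import threading
import time

RATE = 1.0   # sustained requests/second allowed across ALL subdomains (illustrative)
BURST = 5    # short burst tolerated before blocking kicks in (illustrative)

class CrawlerThrottle:
    """Token bucket shared across every vhost/subdomain on the server."""

    def __init__(self, app):
        self.app = app
        self.lock = threading.Lock()
        self.tokens = float(BURST)
        self.last = time.monotonic()

    def __call__(self, environ, start_response):
        ua = environ.get("HTTP_USER_AGENT", "").lower()
        if "bingbot" in ua:
            with self.lock:
                now = time.monotonic()
                # refill at RATE tokens/second, capped at BURST
                self.tokens = min(BURST, self.tokens + (now - self.last) * RATE)
                self.last = now
                if self.tokens < 1.0:
                    # this is where the 403s in the log come from; a gentler
                    # option would be 429 with a Retry-After header
                    start_response("403 Forbidden",
                                   [("Content-Type", "text/plain")])
                    return [b"Crawl rate exceeded\n"]
                self.tokens -= 1.0
        return self.app(environ, start_response)

Bingbot's Crawl-delay bookkeeping is per host, while a limit like this is global, so the two never agree: each subdomain individually looks within budget, yet the shared bucket still runs dry and starts returning the 403s shown in the log.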

Is there a way to spread out BingBot's crawling more evenly? The crawl pattern configured in Bing Webmaster Tools doesn't seem to be followed.


1 Comment


@Ann8826881

This is Vincent from Bing Webmaster Tools and I noticed your post.

First of all, I'm sorry to hear about the problem you're having with our crawler's activity across your subdomains. I'm sure we can do better.

Couple of things:

I noticed you mentioned the crawl pattern setting in Webmaster Tools wasn't working for you. The reason is that a Crawl-delay: directive in robots.txt always takes precedence over any Crawl Control settings in Bing Webmaster Tools, which is why the setting isn't working as expected (see the note at www.bing.com/webmaster/help/crawl-control-55a30302, and the robots.txt example at the end of this reply).
On the other hand, mitigating this through several subdomain-specific robots.txt files with different Crawl-delay: directives isn't optimal, and I don't have a good self-service solution here. That's why I suggest you contact Bing Webmaster Support and share the domain/subdomain information so they can pass it to the right team to take a closer look (they may ask for server logs to help with the investigation).

To contact Webmaster Support, go to go.microsoft.com/fwlink/p/?linkid=261881, fill out the required fields, select "Under-Crawling or Over-Crawling inquiry" in the "What type of problem do you have?" dropdown, and describe the problem you are seeing. Even if they don't come back with a personalized response immediately (it can take 24-48 hrs.), this should at least get the ball rolling.
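
To make the precedence point concrete: if a subdomain's robots.txt contains something like the following (the bingbot user-agent line is just an illustration; your files may target a different agent),

User-agent: bingbot
Crawl-delay: 1

then the Crawl-delay: line overrides whatever pattern is configured in Crawl Control for that host, and removing it is what allows the Webmaster Tools setting to take effect.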
