
MSNBot/BingBot not reporting it is a bot?

@Alves908

Posted in: #Bing #WebCrawlers

I have Apache logs from my server, and I filter out visits from bots/crawlers/scrapers using a Python script that checks for user agent strings containing text like 'bot', 'googlebot', etc.
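For context, a minimal sketch of that kind of filter is below. It assumes the combined log format, where the user agent is the last quoted field on the line; the marker list and the function names are hypothetical, not the actual script.

```python
import re

# Hypothetical marker list; the post only names 'bot' and 'googlebot'.
BOT_MARKERS = ("bot", "crawler", "spider")

# Assumption: the user agent is the last double-quoted field on the line,
# as in Apache's combined log format. Escaped quotes inside the user
# agent are not handled by this simple pattern.
_LAST_QUOTED = re.compile(r'"([^"]*)"\s*$')


def user_agent(line):
    """Return the user agent field of a log line, or '' if none is found."""
    match = _LAST_QUOTED.search(line)
    return match.group(1) if match else ""


def looks_like_bot(line):
    """True if the user agent contains any marker, case-insensitively."""
    ua = user_agent(line).lower()
    return any(marker in ua for marker in BOT_MARKERS)
```

A check like this only catches crawlers that identify themselves, so a visit with a plain browser user agent passes straight through.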

Lately, there have been a number of visits to my site from what I believe is msnbot/bingbot, but they don't report it in their user agent.

An example of a log line is:

207.46.12.74 - - [27/May/2011:07:45:07 -0400] ...stuff... "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/5.0; SLCC1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.648)"

And a reverse DNS of the IP address:

Name: msnbot-207-46-12-74.search.msn.com
Address: 207.46.12.74


Right now I'm thinking of running rDNS on visiting IP addresses and filtering out the ones that resolve to msnbot. When I presented this to a coworker, though, he felt there might be another explanation for Microsoft's bots not identifying themselves in the user agent, such as IE private browsing or Bing's safe-website crawler.
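A rough sketch of the rDNS filtering idea, assuming the client IP is the first field of each log line. The `.search.msn.com` suffix comes from the lookup result above; treating it as the only suffix these crawlers use is an assumption, and the function names are made up for illustration.

```python
import socket
from functools import lru_cache

# Suffix taken from the rDNS result quoted above (an assumption that it
# covers every address the crawler uses).
MSN_SUFFIX = ".search.msn.com"


@lru_cache(maxsize=None)
def reverse_name(ip):
    """Return the PTR hostname for ip, or '' if the lookup fails.

    Results are cached so each distinct IP is resolved only once.
    """
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return ""


def is_msnbot_ip(ip, lookup=reverse_name):
    """True if the reverse DNS name of ip ends with the MSN suffix."""
    return lookup(ip).lower().rstrip(".").endswith(MSN_SUFFIX)


def drop_msnbot_lines(lines, lookup=reverse_name):
    """Keep only the log lines whose client IP is not an msnbot address."""
    return [ln for ln in lines
            if not is_msnbot_ip(ln.split(" ", 1)[0], lookup)]
```

The `lookup` parameter is there so the resolver can be swapped for a precomputed table when reprocessing old logs offline.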

I've looked on Project Honeypot, various sites that have databases of user agents, and have confirmed net blocks of IP addresses that MSN/Bing bots use, but I think he wants even stricter confirmation.

Anyone know the behavior of these bots and why they're not reporting their agent strings as being 'bots'?


@Gretchen104

This post, although it's old, tells you how to verify the msnbot: www.bing.com/community/site_blogs/b/search/archive/2006/11/29/search-robots-in-disguise.aspx
You've done steps 2, 3 and 4 and it all checks out OK, so I think the user agent not saying msnbot may be a mistake on their part. When search.live.com became Bing, they kept the same msnbot user agent and changed the version number:
www.bing.com/community/site_blogs/b/webmaster/archive/2009/11/04/msnbot-1-1-is-retired.aspx
I would submit your findings to the Indexing and Ranking forum and see if you get a response from Microsoft.

The DNS lookups should enable you to filter out this misbehaving bot though.
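If stricter confirmation is wanted, one common approach is to forward-confirm the reverse lookup: resolve the IP to a name, check the name's suffix, then resolve the name back and make sure it returns the same IP. The sketch below assumes IPv4 and the `.search.msn.com` suffix seen in the question; `verify_crawler` and its parameters are hypothetical names, and whether this matches the linked post's steps exactly is an assumption.

```python
import socket


def _reverse(ip):
    # PTR lookup: IP address -> primary hostname
    return socket.gethostbyaddr(ip)[0]


def _forward(host):
    # A lookup: hostname -> list of IPv4 addresses
    return socket.gethostbyname_ex(host)[2]


def verify_crawler(ip, suffix=".search.msn.com",
                   reverse=_reverse, forward=_forward):
    """Forward-confirmed reverse DNS check.

    Returns True only if the PTR name ends with the expected suffix
    and that name resolves back to the same IP. Lookup failures count
    as "not verified".
    """
    try:
        host = reverse(ip).lower().rstrip(".")
        if not host.endswith(suffix):
            return False
        return ip in forward(host)
    except OSError:
        return False
```

The forward step matters because whoever controls an IP block also controls its PTR records, so a reverse name on its own can be spoofed.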
