
Bingbot Crawling URLs Like Folders, Causing Thousands Of 404 Errors

@Tiffany637

Posted in: #Bing #BingWebmasterTools #WebCrawlers

I first started seeing this a little over two months ago on a site I run, but now I am seeing it on several others I manage. It appears the bot is trying to parse each URL as a series of nested folders, and this is what's causing the issue.

For instance this URL is correct:
amgoa.org/Proposed-Alaska-Gun-Law-SCR6/State-Law/8895
But then Bing tries to access this URL:
amgoa.org/Proposed-Alaska-Gun-Law-SCR6/State-Law
And then this URL:

/Proposed-Alaska-Gun-Law-SCR6 (sorry, I can't post more than 2 complete links)

The latter two, of course, throw a 404.

This site has over 67,000 pages, and this error on their part is driving us nuts, loading up the error logs with tens of thousands of 404s for URLs that are incorrect.

About a month ago I built an XML sitemap script, hoping that would solve the issue. I submitted the sitemap to Google and Bing via their webmaster tools. Google correctly indexed all 67,000+ pages, while Bing is still trying to parse these non-existent URLs.

Has anyone else seen this and, more importantly, does anyone know how to either stop this or contact Bing to get them to stop?


1 Comment


@Radia820

The problem you have isn't Bing but the way your server is handling error responses.

Your 404 pages are reporting: SERVER RESPONSE: HTTP/1.1 200 OK

It should be reporting: SERVER RESPONSE: HTTP/1.1 404 Not Found

So search engines are assuming these are valid pages, and that's why they keep being crawled. Fix this and Bing should stop hammering those pages.

You can test your header response using Firebug, Google Webmaster Tools, or these online tools: site-scan.com, seobook.
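If you would rather check from a script, here is a minimal sketch in Python using only the standard library. The function names fetch_status and is_soft_404 are made up for this example, and it assumes plain HTTP on the given port. It simply reads the status code from the first line of the response without following redirects.

```python
import http.client


def fetch_status(host, path, port=80, timeout=10):
    """Return (status_code, reason) from the status line of a GET request.

    Redirects are not followed, so this is the first response a crawler sees.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path, headers={"User-Agent": "status-check-sketch"})
        resp = conn.getresponse()
        resp.read()  # drain the body so the connection closes cleanly
        return resp.status, resp.reason
    finally:
        conn.close()


def is_soft_404(host, path, port=80):
    """True if a path that should NOT exist is answered with 200."""
    status, _reason = fetch_status(host, path, port)
    return status == 200


# Example call (hits the network, so it is left commented out):
# print(fetch_status("amgoa.org", "/Proposed-Alaska-Gun-Law-SCR6/State-Law"))
```

Run it against one of the truncated URLs from the error log. If it reports 200 for a page that does not exist, the error page is being served with the wrong status.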

The odd thing is that you do have a status of 404 Not Found in your header response, but it's not effective because the first line of the response says 200 OK. Basically you have a soft 404, which Yahoo and Bing don't take seriously.
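To illustrate the difference, here is a rough sketch of a handler that puts the 404 in the status line itself. It is written as a Python WSGI app purely for illustration. I don't know what your site actually runs on, and KNOWN_PATHS and app are hypothetical names standing in for your real page lookup.

```python
# Hypothetical set of valid article paths; a real site would query its database.
KNOWN_PATHS = {"/Proposed-Alaska-Gun-Law-SCR6/State-Law/8895"}


def app(environ, start_response):
    """WSGI app that answers unknown paths with a real 404 status line."""
    path = environ.get("PATH_INFO", "")
    if path in KNOWN_PATHS:
        status = "200 OK"
        body = b"Article content goes here."
    else:
        # The friendly error page is still sent, but the status line says 404,
        # so crawlers treat the URL as missing instead of as a valid page.
        status = "404 Not Found"
        body = b"Sorry, that page does not exist."
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    return [body]
```

The point is that the status passed to start_response becomes the first line of the response. Adding a separate 404 header after a 200 OK status line, which is what your server appears to be doing, does not have the same effect.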
