
Strange Bingbot hits in my website access logs

@Hamm4606531

Posted in: #Htaccess #Logs

I've recently been seeing many hits to my site in the access logs and I'm not sure what to do about them. The pages they are trying to reach do not exist, and the requests claim to come from Bingbot, but I don't think those are Bing IP addresses. Does anyone have any ideas on how I should handle these, either via .htaccess or by reporting it to Bing?

66.249.69.1 - - [11/Aug/2016:07:41:23 -0400] "GET /index.php/write-academic-papers-for-money/js/jquery-1.8.2.min.js HTTP/1.1" 200 10014 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
70.208.74.141 - - [11/Aug/2016:07:41:28 -0400] "GET /images/ways.jpg HTTP/1.1" 200 188202 "http://tt.tennis-warehouse.com/index.php?threads/nice-mean-pros-on-tour.570480/" "Mozilla/5.0 (iPhone; CPU iPhone OS 8_2 like Mac OS X) AppleWebKit/600.1.4 (KHTML, like Gecko) Version/8.0 Mobile/12D508 Safari/600.1.4"
40.77.167.6 - - [11/Aug/2016:07:41:30 -0400] "GET /index.php/buy-research-paper-no-plagiarism/gifts-gear.php HTTP/1.1" 200 9866 "-" "Mozilla/5.0 (compatible; bingbot/2.0;)"



@Ann8826881

The three log records shown all look like legitimate traffic (both the Google and Bing IP addresses appear valid) and, as closetnoc has already pointed out, only the last one references Bingbot.
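If you want to check for yourself whether an IP address really belongs to Bingbot or Googlebot, one common approach is a forward-confirmed reverse DNS lookup: resolve the IP to a hostname, check the hostname's domain, then resolve that hostname back and confirm it returns the same IP. Here is a minimal Python sketch. The function name, the suffix table and the injectable lookup parameters are my own illustration, and the hostname suffixes are assumptions based on the search engines' published guidance, so confirm them against the current Bing and Google docs.

```python
import socket

# Assumed hostname suffixes for genuine crawlers (verify against current docs).
CRAWLER_SUFFIXES = {
    "googlebot": (".googlebot.com", ".google.com"),
    "bingbot": (".search.msn.com",),
}


def is_verified_crawler(ip, crawler, reverse_lookup=None, forward_lookup=None):
    """Return True if `ip` passes a forward-confirmed reverse DNS check.

    `reverse_lookup` and `forward_lookup` are hypothetical hooks so the DNS
    calls can be replaced (e.g. in tests); by default the socket module is used.
    """
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])

    # Step 1: IP -> hostname (PTR record).
    try:
        hostname = reverse_lookup(ip).rstrip(".").lower()
    except OSError:
        return False

    # Step 2: the hostname must sit under one of the crawler's domains.
    if not hostname.endswith(CRAWLER_SUFFIXES[crawler]):
        return False

    # Step 3: hostname -> IPs, which must include the original address.
    try:
        return ip in forward_lookup(hostname)
    except OSError:
        return False
```

Calling is_verified_crawler("40.77.167.6", "bingbot") performs live DNS lookups, so the result depends on your resolver and on the current DNS records for that address.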


"The pages they are trying to reach do not exist"


But your server is returning a 200 OK status, which is potentially allowing these URLs to be indexed by the search engines. If these URLs returned a 404 Not Found then it wouldn't be such a problem.
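To see how many of these URLs your server is actually serving, you could scan the access log for requests that carry extra path segments after index.php and were answered with 200. The sketch below assumes the Apache combined log format shown in the question. The regular expressions and the function name are illustrative only, not part of any Apache tooling.

```python
import re

# Apache combined log format:
# ip ident user [time] "method path protocol" status size "referer" "agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# A path with extra segments after index.php, i.e. trailing path info.
PATH_INFO_URL = re.compile(r'^/index\.php/.+', re.IGNORECASE)


def suspicious_hits(lines):
    """Yield (ip, path, agent) for index.php path-info requests served with 200."""
    for line in lines:
        m = LOG_LINE.match(line)
        if m and m["status"] == "200" and PATH_INFO_URL.match(m["path"]):
            yield m["ip"], m["path"], m["agent"]
```

Passing it an open log file (each line is one record) lists the affected URLs. Of the three records in the question, the first and third would be reported, while the image request would not.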

It looks like your site has been the target of an XSS-like attack to create spammy links in the SERPs for keywords that are irrelevant to your site.


"Is there something I can do to prevent any /index.php/XXXXXX requests"


Yes. The additional XXXXXX in the URL after a valid filename is trailing pathname information (PATH_INFO). The default behaviour on Apache generally allows this additional path info (although it depends on the handler).

However, this can be disabled with the AcceptPathInfo directive in your server config or .htaccess file. For example:

AcceptPathInfo Off


This will result in Apache returning a 404 Not Found error for such requests.

Apache docs: https://httpd.apache.org/docs/2.4/mod/core.html#acceptpathinfo


Depending on your website URL structure, you could just block any direct requests to index.php. Something like the following, using mod_rewrite in the root .htaccess file:

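RewriteEngine On
# Match only the original client request line, not internally rewritten URLs
RewriteCond %{THE_REQUEST} ^GET\ /index\.php [NC]
# Refuse such requests with 403 Forbidden
RewriteRule ^index\.php - [F]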


This would need to go before any URL routing directives (e.g. WordPress).

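THE_REQUEST contains only the first line of the original HTTP request as sent by the client (for example, GET /index.php/XXXXXX HTTP/1.1) and is not changed by internal rewrites, so you are still OK to internally rewrite to index.php if you are using a front controller (for example).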
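To get a feel for which requests the condition would catch, here is a rough Python approximation of the pattern. It assumes the intent is to match a raw request line that starts with GET /index.php, case-insensitively (the [NC] flag). The function name is hypothetical, and Apache's regex engine is PCRE rather than Python's re, so treat this as an illustration and not as a test of your actual config.

```python
import re

# Approximation of the RewriteCond pattern with [NC] (case-insensitive).
DIRECT_INDEX_REQUEST = re.compile(r"^GET /index\.php", re.IGNORECASE)


def would_be_forbidden(request_line):
    """Return True if the raw request line matches the sketch's pattern."""
    return DIRECT_INDEX_REQUEST.match(request_line) is not None


# Direct request from a client: matches, so it would be refused.
print(would_be_forbidden("GET /index.php/buy-research-paper HTTP/1.1"))
# Pretty URL that a front controller rewrites internally: the client never
# asked for index.php, so the request line does not match.
print(would_be_forbidden("GET /about-us HTTP/1.1"))
```

Note that, as written, the pattern only covers GET. A POST or HEAD request made directly to index.php would not match, so widen the method part of the pattern if that matters for your site.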
