How do I block a user-agent from Apache?
How do I block requests by user-agent (UA) string with a regular expression in the config files of my Apache web server?
For example: I would like to block all bots on my Debian server whose user-agent matches /\b\w+[Bb]ot\b/ or /Spider/.
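As a quick aside, a pattern like this can be sanity-checked from the shell before it goes into the Apache config. The sketch below assumes GNU grep with PCRE support (`-P`), which is standard on Debian, and assumes the intended pattern is /\b\w+[Bb]ot\b|Spider/ with the backslashes restored:

```shell
# Test the UA regex against a sample bot user-agent string.
pattern='\b\w+[Bb]ot\b|Spider'
ua='Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)'
echo "$ua" | grep -qP "$pattern" && echo "blocked"
# prints "blocked"
```

Note that Apache's mod_rewrite uses PCRE as well, so `grep -P` is a reasonable stand-in for how the pattern will behave in a RewriteCond.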
Those bots should not be able to see any page on my server, and they should appear neither in the access logs nor in the error logs.
global-security.blogspot.de/2009/06/how-to-block-robots-before-they-hit.html suggests using mod_security for that, but isn't there a simpler directive for httpd.conf?
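For what it's worth, there is a rewrite-free sketch using mod_setenvif's BrowserMatchNoCase directive together with Apache 2.4 authorization (this assumes Apache 2.4 with mod_setenvif and mod_authz_core enabled; the `bad_bot` variable name is arbitrary):

```apache
# Tag matching user-agents, then deny them access to the docroot
BrowserMatchNoCase "\b\w+[Bb]ot\b|Spider" bad_bot
<Directory /var/www/>
    <RequireAll>
        Require all granted
        Require not env bad_bot
    </RequireAll>
</Directory>
```

This returns 403 for the tagged agents without involving mod_rewrite, though it does not by itself keep them out of the logs.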
I enabled the rewrite module in Apache:
a2enmod rewrite
and added this block to my /etc/apache2/httpd.conf:
<Directory /var/www/>
<IfModule mod_rewrite.c>
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} googlebot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} sosospider [NC,OR]
RewriteCond %{HTTP_USER_AGENT} BaiduSpider [NC]
# Still allow access to robots.txt and the 403 error page;
# without these exceptions the rule would loop on the error document
RewriteCond %{REQUEST_URI} !^/robots.txt$
RewriteCond %{REQUEST_URI} !^/403.shtml$
RewriteRule ^.* - [F,L]
</IfModule>
</Directory>
and restarted Apache:
apache2ctl graceful
Now all requests from those spiders result in 403 errors, as the logs confirm:
grep -E 'spider|bot' /var/log/apache2/*.log
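The question also asked that blocked bots not show up in the logs at all, which the rewrite rule alone does not achieve. A hedged sketch using mod_setenvif plus mod_log_config's conditional logging (the `dont_log` variable name and the log path are assumptions):

```apache
# Tag bot user-agents and exclude them from the access log
SetEnvIfNoCase User-Agent "bot|spider" dont_log
CustomLog /var/log/apache2/access.log combined env=!dont_log
```

The error log cannot be filtered this way; suppressing those entries would indeed require something like mod_security, as the blog post suggests.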