Redirecting bots and crawlers, but not human visitors, to another page via .htaccess
I would like to implement this diagram via .htaccess. I have tried a lot of code, but it failed every time.
So I need to redirect bots and crawlers, especially Facebook's, via .htaccess.
More posts by @Odierno851
2 Comments
OK, I may have a solution. Try this (you can customize the list):
SetEnvIfNoCase User-Agent "Abonti|aggregator|AhrefsBot|asterias|BDCbot|BLEXBot|BuiltBotTough|Bullseye|BunnySlippers|ca-crawler|CCBot|Cegbfeieh|CheeseBot|CherryPicker|CopyRightCheck|cosmos|Crescent|discobot|DittoSpyder|DotBot|Download Ninja|EasouSpider|EmailCollector|EmailSiphon|EmailWolf|EroCrawler|Exabot|ExtractorPro|Fasterfox|FeedBooster|Foobot|Genieo|grub-client|Harvest|hloader|httplib|HTTrack|humanlinks|ieautodiscovery|InfoNaviRobot|IstellaBot|Java/1.|JennyBot|k2spider|Kenjin Spider|Keyword Density/0.9|larbin|LexiBot|libWeb|libwww|LinkextractorPro|linko|LinkScan/8.1a Unix|LinkWalker|LNSpiderguy|lwp-trivial|magpie|Mata Hari|MaxPointCrawler|MegaIndex|Microsoft URL Control|MIIxpc|Mippin|Missigua Locator|Mister PiX|MJ12bot|moget|MSIECrawler|NetAnts|NICErsPRO|Niki-Bot|NPBot|Nutch|Offline Explorer|Openfind|panscient.com|PHP/5.{|ProPowerBot/2.14|ProWebWalker|Python-urllib|QueryN Metasearch|RepoMonkey|RMA|SemrushBot|SeznamBot|SISTRIX|sitecheck.Internetseer.com|SiteSnagger|SnapPreviewBot|Sogou|SpankBot|spanner|spbot|Spinn3r|suzuran|Szukacz/1.4|Teleport|Telesoft|The Intraformant|TheNomad|TightTwatBot|Titan|toCrawl/UrlDispatcher|True_Robot|turingos|TurnitinBot|UbiCrawler|UnisterBot|URLy Warning|VCI|WBSearchBot|Web Downloader/6.9|Web Image Collector|WebAuto|WebBandit|WebCopier|WebEnhancer|WebmasterWorldForumBot|WebReaper|WebSauger|Website Quester|Webster Pro|WebStripper|WebZip|Wotbox|wsr-agent|WWW-Collector-E|Xenu|Zao|Zeus|ZyBORG|coccoc|Incutio|lmspider|memoryBot|SemrushBot|serf|Unknown|uptime files" bad_bot
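RewriteEngine On
# SetEnvIfNoCase above sets bad_bot to "1" when the User-Agent matches the list
RewriteCond %{ENV:bad_bot} =1
# An external redirect needs an absolute URL (with scheme) and the R flag
RewriteRule ^ http://www.exemple.com/custom_page [R=302,L]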
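If the custom page is served from the same site as this .htaccess file, the redirected request would match the rule again and the bot would loop. A minimal sketch of one way to avoid that, assuming the page lives at /custom_page on the same host and that bad_bot was set by the SetEnvIfNoCase line above:

```apache
RewriteEngine On
# Skip the landing page itself so flagged clients are not redirected in a loop
RewriteCond %{REQUEST_URI} !^/custom_page
# Only act on requests flagged by SetEnvIfNoCase
RewriteCond %{ENV:bad_bot} =1
# Temporary (302) redirect to the landing page on the same host
RewriteRule ^ /custom_page [R=302,L]
```

A 302 is used here so the redirect is not cached as permanent while you are still testing the list.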
What you are trying to do could technically be classified as cloaking, which is a violation of Google's terms and can result in your site being removed from the Google index. Google is very strict about what it classes as cloaking; the basic rule is that whatever the end user sees, the crawler has to see as well. If you are trying to block malicious bots, the easiest thing to do is simply block their user agent strings using .htaccess. But if you try cloaking against a legitimate crawler such as Googlebot, it will be detected and will result in severe penalties and manual action notices, which can seriously affect your SERP ranking.
Google uses not only the known Googlebot user agent but also other bots that send the user agent strings of real browsers from IP addresses not affiliated with Google, as a way to detect cloaking, so there is no way to avoid being caught doing this.
Now having given that warning...
You mention the Facebook crawler specifically. Facebook has three different user agents for crawling: facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php) and facebookexternalhit/1.1, which are used when a user shares your website to their wall, and Facebot, which is used to help improve advertising performance. Of these, only Facebot respects robots.txt; the others are triggered only by a user action and so are in effect treated the same as a web browser. If you want to block Facebook crawling, simply add an .htaccess rule that detects these user agent strings and either blocks them or returns an error page stating that crawlers are not permitted. Trying to forward them to an alternate site with different content will simply complicate matters and could reduce your SERP ranking, because the pages the bots can access would not have context-appropriate content.
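As a rough sketch of the blocking approach described above: the rules below flag the Facebook user agents named in this answer and return 403 Forbidden. The variable name fb_crawler and the error message text are illustrative assumptions, and the snippet assumes mod_setenvif and mod_rewrite are enabled and allowed in .htaccess.

```apache
# Flag Facebook's crawlers by User-Agent (case-insensitive substring match).
# fb_crawler is a hypothetical variable name chosen for this example.
SetEnvIfNoCase User-Agent "facebookexternalhit|Facebot" fb_crawler

RewriteEngine On
# Only act on requests flagged above
RewriteCond %{ENV:fb_crawler} =1
# Refuse the request with 403 Forbidden; no substitution is performed
RewriteRule ^ - [F]

# Optional: short message sent with the 403 response (example text)
ErrorDocument 403 "Crawlers are not permitted."
```

Note that blocking facebookexternalhit will likely stop Facebook from generating link previews when someone shares your pages.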