
Redirecting bots and crawlers, but not humans, to another page via .htaccess

@Odierno851

Posted in: #Facebook #Googlebot #Htaccess #RobotsTxt #WebCrawlers

I would like to apply this diagram via .htaccess. I have tried a lot of code, but it failed every time.

So I need to redirect bots and crawlers, especially Facebook's, via .htaccess.


2 Comments


@Moriarity557

OK, I may have a solution. Try this (you can customize the list):

SetEnvIfNoCase User-Agent "Abonti|aggregator|AhrefsBot|asterias|BDCbot|BLEXBot|BuiltBotTough|Bullseye|BunnySlippers|ca-crawler|CCBot|Cegbfeieh|CheeseBot|CherryPicker|CopyRightCheck|cosmos|Crescent|discobot|DittoSpyder|DotBot|Download Ninja|EasouSpider|EmailCollector|EmailSiphon|EmailWolf|EroCrawler|Exabot|ExtractorPro|Fasterfox|FeedBooster|Foobot|Genieo|grub-client|Harvest|hloader|httplib|HTTrack|humanlinks|ieautodiscovery|InfoNaviRobot|IstellaBot|Java/1.|JennyBot|k2spider|Kenjin Spider|Keyword Density/0.9|larbin|LexiBot|libWeb|libwww|LinkextractorPro|linko|LinkScan/8.1a Unix|LinkWalker|LNSpiderguy|lwp-trivial|magpie|Mata Hari|MaxPointCrawler|MegaIndex|Microsoft URL Control|MIIxpc|Mippin|Missigua Locator|Mister PiX|MJ12bot|moget|MSIECrawler|NetAnts|NICErsPRO|Niki-Bot|NPBot|Nutch|Offline Explorer|Openfind|panscient.com|PHP/5.{|ProPowerBot/2.14|ProWebWalker|Python-urllib|QueryN Metasearch|RepoMonkey|RMA|SemrushBot|SeznamBot|SISTRIX|sitecheck.Internetseer.com|SiteSnagger|SnapPreviewBot|Sogou|SpankBot|spanner|spbot|Spinn3r|suzuran|Szukacz/1.4|Teleport|Telesoft|The Intraformant|TheNomad|TightTwatBot|Titan|toCrawl/UrlDispatcher|True_Robot|turingos|TurnitinBot|UbiCrawler|UnisterBot|URLy Warning|VCI|WBSearchBot|Web Downloader/6.9|Web Image Collector|WebAuto|WebBandit|WebCopier|WebEnhancer|WebmasterWorldForumBot|WebReaper|WebSauger|Website Quester|Webster Pro|WebStripper|WebZip|Wotbox|wsr-agent|WWW-Collector-E|Xenu|Zao|Zeus|ZyBORG|coccoc|Incutio|lmspider|memoryBot|SemrushBot|serf|Unknown|uptime files" bad_bot
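RewriteEngine On
# SetEnvIfNoCase sets bad_bot to 1 when the User-Agent matches the list above
RewriteCond %{ENV:bad_bot} =1
# Skip the target itself to avoid a redirect loop if it is on the same host
RewriteCond %{REQUEST_URI} !^/custom_page
RewriteRule ^ http://www.exemple.com/custom_page [R=302,L]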


@LarsenBagley505

What you are trying to do could technically be classified as cloaking, which violates Google's terms and can get your site removed from the Google index. Google is very strict about what it classes as cloaking; the basic rule is that the crawler must see whatever the end user sees. If you are trying to block malicious bots, the easiest approach is simply to block their user agent strings in .htaccess. If you try cloaking against a legitimate crawler such as Googlebot, however, it will be detected and will result in severe penalties and manual action notices, which can seriously hurt your SERP ranking.
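
As a rough illustration of blocking by user agent string, the sketch below denies matching requests with a 403 instead of redirecting them. It assumes Apache 2.4 with mod_setenvif and mod_authz_core available and AllowOverride permitting these directives; the three bot names and the variable name bad_bot are only examples, not a recommended list.

```apache
# Flag requests whose User-Agent contains any of these (case-insensitive).
# The names are illustrative; substitute your own list.
SetEnvIfNoCase User-Agent "AhrefsBot|MJ12bot|SemrushBot" bad_bot

# Allow everyone except requests carrying the bad_bot flag (Apache 2.4 syntax).
<RequireAll>
    Require all granted
    Require not env bad_bot
</RequireAll>
```

Because humans and bots that are let through all receive the same pages, this kind of blocking is not cloaking.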

Google does not rely only on the known Googlebot user agent. It also uses other bots that send the user agent strings of real browsers from IP addresses not affiliated with Google in order to detect cloaking, so there is no way to avoid being caught doing this.

Now having given that warning...

You mention the Facebook crawler specifically. Facebook has three different user agents for crawling: facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php) and facebookexternalhit/1.1, which are used when a user shares your website to their wall, and Facebot, which is used to help improve advertising performance. Of these, only Facebot respects robots.txt; the others are triggered only by a user action and so are, in effect, treated the same as a web browser.

If you want to block all Facebook crawling, add a .htaccess rule that detects these user agent strings and either blocks them or returns an error page stating that crawlers are not permitted. Forwarding them to an alternate site with different content will only complicate matters and could reduce your SERP ranking, because the pages the bots can access would not have context-appropriate content.
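
A minimal sketch of such a rule, assuming mod_rewrite is enabled and that matching on the substrings facebookexternalhit and Facebot is enough to catch the three user agents listed above. The wording of the error message is only a placeholder.

```apache
# Short plain-text body returned with every 403 response (placeholder wording).
ErrorDocument 403 "Crawlers are not permitted on this site."

RewriteEngine On
# Match any of the Facebook crawler user agents, case-insensitively.
RewriteCond %{HTTP_USER_AGENT} (facebookexternalhit|Facebot) [NC]
# Leave the URL unchanged and answer with 403 Forbidden.
RewriteRule ^ - [F]
```

Note that blocking facebookexternalhit would also prevent Facebook from fetching link previews when a user shares one of your pages.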
