Jessie594

How to detect search engines on Cloudfront?

@Jessie594

Posted in: #AmazonCloudfront #Seo

Is there a way to "ask" Google to add a query string parameter to the URL when crawling my website (maybe in robots.txt)? For example, when it crawls example.com, it would add something like ?iam=google. Because we use Cloudfront to serve our website, we need a way to detect search engines at Cloudfront and forward them to the origin server.


@Eichhorn148

There is no way to get Googlebot to add parameters. Google does not support such functionality.

Googlebot does send a User-Agent header:

Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)


However, you should not be doing anything differently for Googlebot than you do for users. Google calls that practice cloaking, and it will penalize sites that treat Googlebot specially.

There should be no reason that your Cloudfront servers can't serve the same cached content to Googlebot that they serve to users.

If your pages are AJAX and you need Google to be able to access the content, you can implement the hash bang AJAX crawling scheme.

To implement it, use #! in your URLs where you would normally use a # sign. Googlebot will then fetch a URL with the parameter _escaped_fragment_=everything+after+the+hash+sign (note the trailing underscore in the parameter name). Your server needs to be configured to return the content that would be shown to the user at that point.
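As a rough illustration of the URL rewrite described above, here is a minimal Python sketch of how a hash-bang URL maps to the URL the crawler requests. The function name to_escaped_fragment_url and the exact set of characters left unescaped are assumptions made for this example; Google's specification is the authority on the escaping rules.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Characters left unescaped in the parameter value (an assumption for this
# sketch). "#", "%", "&", "+" and whitespace are deliberately absent, so
# quote() percent-encodes them.
_SAFE = "!$'()*,/:;=?@"


def to_escaped_fragment_url(url: str) -> str:
    """Map a #! URL to its _escaped_fragment_ equivalent."""
    parts = urlsplit(url)
    if not parts.fragment.startswith("!"):
        return url  # not a hash-bang URL, nothing to transform
    value = quote(parts.fragment[1:], safe=_SAFE)
    param = "_escaped_fragment_=" + value
    # Append to any existing query string instead of replacing it.
    query = parts.query + "&" + param if parts.query else param
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


print(to_escaped_fragment_url("http://example.com/#!key=value"))
```

A URL with an ordinary fragment such as http://example.com/#top is returned unchanged, because only fragments starting with ! take part in the scheme.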

If you use pushState to change your URLs (so they don't have a hash), but the page still uses JavaScript to load the content (rather than the server sending all the data with the first page), you can still use a similar technique. You can include a meta tag to tell Googlebot to fetch the _escaped_fragment_ version of the URL:

<meta name="fragment" content="!">


See point #4 from "Transformation of the URL" in Google's documentation
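On the server side, the origin has to recognize these crawler requests. The sketch below, in Python, shows one way to do that by reading the _escaped_fragment_ parameter. The function name requested_fragment is hypothetical. With the meta tag approach the parameter arrives with an empty value, which the sketch treats as "render the full page for this path". It also assumes the CDN in front of the origin is set up to pass the query string through.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def requested_fragment(url: str) -> Optional[str]:
    """Return the decoded fragment state a crawler asks for, or None.

    None means an ordinary request without the parameter. An empty string
    means the parameter was present but empty (the meta tag case).
    """
    query = urlsplit(url).query
    # keep_blank_values=True so "?_escaped_fragment_=" is not dropped.
    params = parse_qs(query, keep_blank_values=True)
    values = params.get("_escaped_fragment_")
    if values is None:
        return None
    return values[0]


print(requested_fragment("http://example.com/?_escaped_fragment_=key=value"))
```

The content returned for such a request should match what a user would see at that state, in keeping with the advice above about not treating Googlebot differently.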
