
How is Googlebot finding URLs that are only visible to authenticated users?

@Sent6035632

Posted in: #Google #Googlebot #SearchEngines

Here is one of my customers, performing some action after having logged in to his account. The unique token is simply an encrypted user id + timestamp.


94.254.xxx.xxx - - [02/Jul/2011:22:25:46 +0200] "GET /some-action/unique-token-123abc HTTP/1.1" 200 410 "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)"


Now, Googlebot somehow found out about this unique link and tried to access the exact same URL one week later.


66.249.71.179 - - [10/Jul/2011:09:56:01 +0200] "GET /some-action/unique-token-123abc HTTP/1.1" 302 - "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


(status code is 302 because the token had expired)
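To check whether other private URLs were picked up the same way, the access log can be scanned for token URLs requested from more than one IP address. The following is only a rough sketch: it assumes the log uses the combined format shown in the two entries above, and the function name `paths_seen_from_multiple_ips` and the `/some-action/` prefix default are made up for illustration.

```python
import re
from collections import defaultdict

# Combined log format:
# ip ident user [time] "request" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def paths_seen_from_multiple_ips(lines, prefix="/some-action/"):
    """Return {path: [(ip, user_agent), ...]} for paths under `prefix`
    that were requested from more than one distinct IP address."""
    hits = defaultdict(list)
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match.group("path").startswith(prefix):
            hits[match.group("path")].append(
                (match.group("ip"), match.group("agent"))
            )
    # A single-use token URL should only ever be requested by one client.
    return {
        path: entries
        for path, entries in hits.items()
        if len({ip for ip, _ in entries}) > 1
    }
```

Feeding it the lines of the access log (for example `paths_seen_from_multiple_ips(open("access.log"))`, where the file name is a placeholder) would list every token URL that was later requested by a second client, together with the user agents involved.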



Let me emphasize that this is a unique URL that was visible exactly once, for only 2 seconds, before the user clicked it and proceeded to visit that page. It was not sent in an email or published anywhere public.

What is going on here? How could Google have found this unique URL?



2 Comments


 

@Deb1703797

I just realized that the user must have found an outbound link on this authenticated page, and then leaked the private URL as Referer when clicking through to some other website. This is the only possible explanation, and should really have been obvious from the start.

Once leaked, the private URL could have reached Google in a number of ways; for example, the target site might have published its access logs publicly. Note: none of the sites behind the outbound links were using Google Analytics, so this does not indicate that Googlebot is using referrer URLs from Analytics.

Lesson relearned: never put sensitive data in URLs. HTTPS only helps partially here: browsers omit Referer when following a link from an HTTPS page to a plain HTTP page, but a link from one HTTPS site to another can still carry it.
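One way to limit this kind of Referer leak in current browsers is to send a `Referrer-Policy` response header on the authenticated pages. The sketch below is not what the site in question actually ran; it assumes a Python WSGI application, and the wrapper name `add_referrer_policy` is hypothetical. It only adds the header and does nothing about URLs that leak by other routes, such as browser toolbars.

```python
def add_referrer_policy(app, policy="no-referrer"):
    """Wrap a WSGI app so every response carries a Referrer-Policy header.

    With the "no-referrer" policy, browsers that support the header
    send no Referer at all when the user follows a link off the page.
    """
    def wrapped(environ, start_response):
        def start_with_policy(status, headers, exc_info=None):
            # Drop any Referrer-Policy the app set itself, then add ours.
            headers = [
                (name, value)
                for name, value in headers
                if name.lower() != "referrer-policy"
            ]
            headers.append(("Referrer-Policy", policy))
            return start_response(status, headers, exc_info)

        return app(environ, start_with_policy)

    return wrapped
```

An existing application object would be wrapped once at startup, for example `application = add_referrer_policy(application)`. Keeping the token out of the URL altogether (for instance in a POST body or the session) remains the more robust fix.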


 

@Kevin317

It's hard to say for sure, but here are two likely scenarios:


The user has a browser toolbar or extension installed that reports the URLs they visit to Google.
Someone linked to that URL and Google found it by crawling the page with that link on it.
