Which domain should I authorize in my .htaccess to let Google Spreadsheets access files on my site?

@Samaraweera270

Posted in: #Apache #Google #Htaccess

I use the Google Spreadsheets feature that lets an image be the content of a cell. The cell is defined by:

=image("http://www.example.com/my-pics/my-pic.jpg")


It works pretty well.

But I don't want my pics to be public, so I added a .htaccess file in the my-pics directory of my web site. Here is the template I used:

order deny,allow
deny from all
allow from <domain used by google-spreadsheet to get image>


My problem is that I don't know which domain I should allow. Do you have any idea?

EDIT: Just an idea, but I cannot test it on my own (I use the web server my ISP provides me and I don't have access to logs). If my .htaccess is simply

order deny,allow
deny from all


and Google Spreadsheets tries to access the image, the web server should log somewhere that the domain some-google-linked-domain.com tried to access the file. Am I wrong?


@Jennifer507

I ran a quick test and got the following in my Apache access logs from adding an image to a Google Spreadsheet:

64.233.172.188 - - [09/May/2014:05:14:51 -0700] "GET twitter.png HTTP/1.1" 200 1842 "-" "Mozilla/5.0 (compatible) Feedfetcher-Google; (+http://www.google.com/feedfetcher.html)"
66.249.80.216 - - [09/May/2014:05:14:52 -0700] "GET twitter.png HTTP/1.1" 200 1842 "-" "Mozilla/5.0 (compatible) Feedfetcher-Google; (+http://www.google.com/feedfetcher.html)"
66.249.80.216 - - [09/May/2014:05:14:52 -0700] "GET twitter.png HTTP/1.1" 200 1842 "-" "Mozilla/5.0 (compatible) Feedfetcher-Google; (+http://www.google.com/feedfetcher.html)"


Interestingly, Google grabs the image three times. Aside from that, whitelisting by IP isn't going to work, as Google has a lot of addresses (200,000+ apparently); the three requests above already came from two different ones.

Based on this, the best bet is to allow by user-agent. This isn't foolproof, since any client can send the same User-Agent header, but it should be enough for your purposes. Naturally StackExchange has the answer already, so for what you're after, start with something like:

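# Flag requests whose User-Agent contains "Feedfetcher-Google" (case-insensitive)
SetEnvIfNoCase User-Agent "Feedfetcher-Google" search_robot

# Deny everyone except requests carrying the flag
Order Deny,Allow
Deny from All
Allow from env=search_robot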
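
The Order/Deny/Allow directives above are the Apache 2.2 access-control syntax; on Apache 2.4 they are only available through mod_access_compat. I can't tell from the question which version your ISP runs, so treat the following as a minimal sketch that assumes Apache 2.4 with mod_setenvif and mod_authz_core enabled. It reuses the same search_robot variable:

```apache
# Flag requests whose User-Agent contains "Feedfetcher-Google" (case-insensitive)
SetEnvIfNoCase User-Agent "Feedfetcher-Google" search_robot

# Apache 2.4 syntax: grant access only when the flag is set
Require env search_robot
```

Use one style or the other in a given .htaccess; the Apache documentation discourages mixing the old directives with Require.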
