
Make server force Excel to hand over hyperlink to browser without trying to load the page itself

@Bethany197

Posted in: #Http #Microsoft

I am not sure if this is an Excel (SuperUser) or WebMasters question since I may have exhausted the server option.

Use case

Our site allows downloading a result list in XLS format. A hyperlink is present for each result which will load the result into the default browser.

Looking in our log file we find several requests per click:


OPTIONS, which I now answer with a 405. When I started this quest it returned all options with a 200 OK, but then I read "Stopping Microsoft Office 2010 from integrating with Subversion server as if it's Sharepoint" and KB838028.
Link pre-fetching: HTTP GET from an MSIE 7.0 user agent (Microsoft's built-in URL handler) - 200 OK - roughly three quarters of the actual data (57288 of 75326 bytes in the log entries below)
HTTP GET from the IE 11 user agent - 200 OK - full data


Log file entries

1.2.3.4 - - [17/Oct/2014:11:20:02 +0200] "OPTIONS /myfolder/ HTTP/1.1" 405 36 "-" "Microsoft Office Protocol Discovery" - -
1.2.3.4 - - [17/Oct/2014:11:20:02 +0200] "GET /myfolder/mypage?myparm=myvalue HTTP/1.1" 200 57288 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; WOW64; Trident/7.0; ....; InfoPath.3; ms-office)" - -
1.2.3.4 - - [17/Oct/2014:11:20:03 +0200] "GET /myfolder/mypage?myparm=myvalue HTTP/1.1" 200 75326 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko" - -


The second request is a waste of bandwidth and, most likely, of server processing.
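To put a number on that waste, here is a minimal Python sketch that parses the three log entries above and separates the bytes served to the MSIE 7.0 pre-fetch from the bytes served to the real browser. The regular expression assumes the log layout shown in the excerpt; `LINE_RE` and `split_get_bytes` are illustrative names, not part of any existing tool.

```python
import re

# The three entries from the log excerpt above, copied as-is
# (the "...." in the second user agent is elided in the original).
LOG_LINES = [
    '1.2.3.4 - - [17/Oct/2014:11:20:02 +0200] "OPTIONS /myfolder/ HTTP/1.1" 405 36 "-" "Microsoft Office Protocol Discovery" - -',
    '1.2.3.4 - - [17/Oct/2014:11:20:02 +0200] "GET /myfolder/mypage?myparm=myvalue HTTP/1.1" 200 57288 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; WOW64; Trident/7.0; ....; InfoPath.3; ms-office)" - -',
    '1.2.3.4 - - [17/Oct/2014:11:20:03 +0200] "GET /myfolder/mypage?myparm=myvalue HTTP/1.1" 200 75326 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko" - -',
]

# Assumed layout: ip, two ignored fields, [time], "request", status, size, "referer", "agent".
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<target>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+) "[^"]*" "(?P<agent>[^"]*)"'
)


def split_get_bytes(lines):
    """Return (prefetch_bytes, other_get_bytes) summed over the GET requests."""
    prefetch = other = 0
    for line in lines:
        match = LINE_RE.match(line)
        if match is None or match.group("method") != "GET":
            continue  # skip unparsable lines and the OPTIONS probe
        size = int(match.group("size"))
        if "MSIE 7.0" in match.group("agent"):
            prefetch += size  # treated as the Office pre-fetch (heuristic)
        else:
            other += size
    return prefetch, other
```

For the excerpt, `split_get_bytes(LOG_LINES)` separates the 57288 pre-fetch bytes from the 75326 bytes the browser actually needed, so each click transfers roughly 76% more than necessary.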

Is there a way on the server to avoid serving data for the second request? I guess Excel loads the page just in case it is something Excel itself can show, and then gives up when it is not.

In the link Microsoft Office Link Pre-fetching and Single Sign-On given by @Perry I see an enticing

RewriteCond %{HTTP_USER_AGENT} ;\sms-office(\)|;) [I] ...


but I cannot see how I can leverage this to have Excel NOT partly download the page that we need to open in the default browser.

UPDATE

I saw that some Office clients actually send ms-office as part of the user agent, but mine does not. My Windows 7 Office 2013 sends:


Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; WOW64; Trident/7.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E; InfoPath.3)


UPDATE: Will abandon this, but with a suggestion:

If you produce links in an actual Excel sheet and you are certain you do not need to support IE7 for the actual page, add a parameter to the link and return 200 OK with an empty payload to MSIE 7 user agents.
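A minimal Python sketch of that suggestion follows. The marker parameter name `fromxls` and the helper `should_send_empty_body` are hypothetical, chosen only for illustration; the check also assumes that no real visitor reaches the page with IE7.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical marker parameter, added only to links written into the XLS file.
MARKER_PARAM = "fromxls"


def should_send_empty_body(request_target: str, user_agent: str) -> bool:
    """Return True when a request looks like Office's pre-fetch of a spreadsheet link.

    Assumption: genuine visitors never use IE7, so an "MSIE 7.0" user agent
    on a marked link is taken to be Office's internal client.
    """
    query = parse_qs(urlsplit(request_target).query, keep_blank_values=True)
    return MARKER_PARAM in query and "MSIE 7.0" in user_agent
```

When the function returns True, the server would answer 200 OK with a zero-length body; the browser request that follows carries a different user agent and gets the full page.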


@Barnes591

Interesting - I never knew that!

Just tested from my custom web server (Rapid Server) and got the same results (3 GET requests after clicking on a hyperlink), so we know it's not the web server:

127.0.0.1:1062 - - [17/Oct/2014:03:57:45 -0700] "GET / HTTP/1.1" 0
127.0.0.1:1063 - - [17/Oct/2014:03:57:47 -0700] "GET / HTTP/1.1" 0
127.0.0.1:1064 - - [17/Oct/2014:03:57:47 -0700] "GET /favicon.ico HTTP/1.1" 404


What I have observed is that it won't make the extra request repeatedly, just the first time and then it gets cached.

It seems to be related to the way Office Protocol Discovery works. KB838028 explains it:


The decision about how to open the Web resource is resolved by
investigating the folder path where the document comes from and by
investigating the capabilities of the server that manages that path.
To determine what capabilities the server supports, Office 2003 issues
an HTTP 1.1 standard OPTIONS command. The OPTIONS command requests
that the server identify what commands and what methods that the
server supports for the folder where the document is located. The
server identification is done according to the rules that are outlined
in RFC 2616.

Office also tries to determine the Web-server type. This determination
is based on header information that is returned by the OPTIONS call.
Specifically, Office looks for header values that indicate
communications with a SharePoint document library or an Exchange
WebStore folder.


Basically it happens due to the way Office tries to play nicely or provide extensibility for other collaborative software that may be related to these hyperlinks.

You could create a rewrite rule to intercept the OPTIONS request and return a 405, see here.
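If the site is served by application code rather than by a rewrite rule, the same idea might look like the WSGI sketch below. It is only an illustration: `app` and `probe` are made-up names, and the placeholder body stands in for the real result page.

```python
def app(environ, start_response):
    """Minimal WSGI app: refuse OPTIONS with a 405, serve everything else."""
    if environ.get("REQUEST_METHOD") == "OPTIONS":
        start_response(
            "405 Method Not Allowed",
            [("Allow", "GET, HEAD"), ("Content-Length", "0")],
        )
        return [b""]
    body = b"result page placeholder\n"
    start_response(
        "200 OK",
        [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))],
    )
    return [body]


def probe(method, path="/"):
    """Call app() directly, without a socket, and return (status_line, body)."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    chunks = app({"REQUEST_METHOD": method, "PATH_INFO": path}, start_response)
    return captured["status"], b"".join(chunks)
```

To try it over HTTP, `wsgiref.simple_server.make_server("127.0.0.1", 8000, app).serve_forever()` from the standard library will serve it locally. Note that, as the log in the question shows, the pre-fetch GET still follows the 405, so this only quiets the protocol discovery step.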

EDIT: looks like I missed some of the details in your question when I first posted this answer, and I haven't really provided a useful answer. Our HTTP results are similar, but also a bit different. In the interim, these articles appear to offer some additional insights: 1, 2, 3. Looking at your results again, the second request's user agent is "older" which could mean that it's an internal web client in Office making that request, as you suspected. Oddly, I see a GET for favicon.ico in my results, but not yours.

From this article:


When you click a hyperlink in an Office file (Word document, Excel
spreadsheet, PowerPoint presentation, etc.), the URL is not
immediately passed to your browser. First, Office will internally
fetch the address. If the link returns a 3xx Redirection code, Office
will request the new address and repeat. If the link returns a 4xx
Client Error code, Office will abort the request and tell the user
that the link is unavailable, having never opened a browser (I’m
guessing that’s the whole point of this “feature”). And if the link
returns a 2xx Success code, the URL is finally opened in a browser.
Only the final URL is passed to the browser; any redirects are masked
by Office.
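The behaviour described in that quote can be modelled in a few lines of Python. This is a sketch of the description only, not of Office's actual implementation: `fetch` is a hypothetical stand-in for Office's internal client, the `max_hops` limit is my own assumption, and status codes the article does not mention (1xx, 5xx) are simply treated as an abort here.

```python
from typing import Callable, Optional, Tuple

# fetch(url) -> (status_code, Location header or None); hypothetical stand-in
# for the internal web client that Office uses for the pre-fetch.
Fetch = Callable[[str], Tuple[int, Optional[str]]]


def resolve_like_office(url: str, fetch: Fetch, max_hops: int = 10) -> Optional[str]:
    """Return the URL that would be handed to the browser, or None if aborted."""
    for _ in range(max_hops):  # max_hops is an assumption, not from the article
        status, location = fetch(url)
        if 300 <= status < 400 and location:
            url = location  # 3xx: request the new address and repeat
            continue
        if 200 <= status < 300:
            return url  # 2xx: only this final URL reaches the browser
        return None  # 4xx per the article; other codes treated the same (assumption)
    return None  # too many redirects
```

With a fake `fetch` backed by a dictionary, a link that redirects from `/a` to `/b` resolves to `/b`, which matches the remark that redirects are masked by Office.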


To sum it up, when you click a hyperlink in a Microsoft Office program, up to 4 HTTP requests could be made:


Office makes an HTTP OPTIONS request using an internal web client to check the web server's capabilities - user agent is "Microsoft Office Protocol Discovery".
Office makes an HTTP GET request for the clicked hyperlink URL using an internal web client to ensure the web server returns 200 OK before launching the URL in the web browser - user agent starts with "Mozilla/4.0 (compatible; MSIE 7.0; ..." and, on some clients, contains "ms-office".
Office launches the URL in the web browser, and the web browser makes the third HTTP request, a second GET for the clicked hyperlink URL - user agent starts with "Mozilla/5.0".
The web browser (not always, apparently) makes an HTTP GET request for favicon.ico - user agent starts with "Mozilla/5.0".
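The four steps can be told apart in a log with a small classifier such as the Python sketch below. The labels and the function name `classify_request` are illustrative, and the user agent tests are heuristics based only on the strings seen in this thread.

```python
def classify_request(method: str, path: str, user_agent: str) -> str:
    """Map one logged request to the step of the sequence it most likely belongs to."""
    if method == "OPTIONS" and "Microsoft Office Protocol Discovery" in user_agent:
        return "protocol-discovery"
    if method == "GET" and user_agent.startswith("Mozilla/4.0") and (
        "ms-office" in user_agent or "MSIE 7.0" in user_agent
    ):
        # Heuristic: a genuine IE7 visitor would be classified here too.
        return "office-prefetch"
    if method == "GET" and path == "/favicon.ico":
        return "browser-favicon"
    if method == "GET" and user_agent.startswith("Mozilla/5.0"):
        return "browser"
    return "other"
```

Applied to the log entries in the question, the three lines come out as `protocol-discovery`, `office-prefetch` and `browser`, in that order.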


References:


How documents are opened from a Web site in Office 2003 (KB838028)
You are redirected to a logon page or an error page, or you are prompted for authentication information when you click a hyperlink to a SSO Web site in an Office document (KB899927)
Authentication requests when you open Office documents (KB2019105)
Stopping Microsoft Office 2010 from integrating with Subversion server as if it's Sharepoint
