File Download and Filenames
Several months ago, I blogged about IE’s support for International Filenames on Downloads. Today’s post is a bit simpler and describes two cases when IE may rename downloaded files.
Filename Extension and QueryString Parameters
If a file download HTTP response does not contain a Content-Disposition header, Internet Explorer will determine the filename from the URL. This may behave unexpectedly if the response's Content-Type is an ambiguous MIME type (e.g. application/octet-stream, typically sent for executable files) and there are querystring parameters in the URL. For example, downloading www.fiddlercap.com/dl/FiddlerCapSetup.exe?id=123 will result in a file named “FiddlerCapSetup” without a file extension. This file will show the Windows Open With dialog if you attempt to run the file:
The IE Team ourselves ran into this issue when launching an IE8 Beta. The website team had decorated every link on the install page with a querystring parameter, including the installer URL, for analytics purposes. For an short period of time, users had to manually rename the installer in order to install the beta.
To work around the issue, you can:
- Send a Content-Disposition header with the desired filename specified
- Remove all querystring parameters from the URL
- Add a bogus querystring parameter at the end with the desired extension: www.fiddlercap.com/dl/FiddlerCapSetup.exe?id=123&x=.exe
Any of these three approaches will restore the lost extension and allow the downloaded file to work properly.
Now, the obvious questions are: “Why does IE do that? Can’t you just change IE? ” As it turns out, some server-side “CGI” programs have a .EXE file extension and utilize a querystring parameter to convey a download’s filename to IE. So, unfortunately, there would be a compatibility impact to changing IE to ignore the querystring, even if the sites in question are relying on a non-standard behavior.
The Content-Disposition Header and Run-From-Cache
Recently, a product team contacted us to report that their website’s downloads didn’t work properly when the site was in the Internet Zone, but worked fine from the Intranet and Trusted Sites zones. Intrigued, I asked for further details, and the team explained their architecture.
They host a webpage that allows the user to download an “online installer” that installs one or more individual products. The installer itself is simply a “stub” that downloads and installs the selected products. Interestingly, there’s only a single installer, and the team relies upon the filename of the stub installer to decide what products to install. The filename is set via a Content-Disposition header, and the stub installer itself is downloaded via navigation to an ASHX URL.
For instance, http://example.com/getinstaller.ashx?InstallProducts=1,2,3 will return the installer with the header Content-Disposition: attachment; filename="Install-1-2-3.exe" while http://example.com/getinstaller.ashx?InstallProducts=4,7 will return the installer with a Content-Disposition: attachment; filename="Install-4-7.exe" . Upon starting, the stub installer examines its own filename and downloads and installs the requested products.
Now, at first glance, this appears to be a clever architecture, because it solves a number of problems. For instance, to otherwise use a single installer for all of these combinations, they would have to find a way to interrogate the user’s browser for which products to install, e.g. by checking cookies or some other approach that is browser specific. Other strategies would either require that they keep multiple versions of the installer on the server (one for every permutation) or dynamically build and resign the installer for each individual user (requiring that they keep a code-signing certificate on the server).
The problem is that there are two major problems with this approach—one obvious, and one not so obvious. The obvious problem is that the user might not immediately run the installer and might instead save the installer with a different name. If the user renames the file (e.g. “CoolStuffToInstall.exe”) then the installer will fail. The more subtle problem is that you cannot rely upon the filename if the file is Run directly from the Temporary Internet Files cache. The product team correctly noted that the cache filename can have a numeric disambiguater appear at the end when run from the cache (e.g. “Install-1-2-3.exe”) and they simply changed their parser to ignore this number. However, they ran into a much more subtle problem later, when they deployed their service from the Local Intranet developer test site to the live Internet site.
The problem is that when an executable file is downloaded from an Internet server in Protected Mode, it must be copied from the Protected Mode (low integrity) Temporary Internet Files (TIF) folder to the non-Protected Mode (medium integrity) TIF folder. When this copy occurs, the file is renamed using a filename generated from the original download URL. So, the stub installer was always being copied and executed with the filename getinstaller.exe, causing the command-line parsing logic to fail. The team had never encountered this problem in their internal testing because they were always testing the service using a server in the Local Intranet zone. The Local Intranet Zone runs outside of Protected Mode, so the renaming issue was never encountered.
Update: IE9 Release Candidate has changed this behavior and files now keep their name when copied between caches.
We suggested a variety of workarounds to the team, including using a HTTP Handler to allow any URL within a given path to return the installer, like so: http://example.com/getinstaller/Install-1-2-3.exe, such that the filename is preserved upon copy. Of course, this approach still suffers from the “renamed by user” problem. We also suggested that the installer itself offer checkboxes to enable the user to select which products to install.
The stub-installer itself also installed a URL Protocol Handler, which allowed websites to later invoke the installer using a special URL (e.g. installer://site/params/etc). While URL Protocol Handlers incur a significant attack surface and must be carefully reviewed for security issues, their one major advantage is that they are supported by all browsers on Windows and they do not suffer from the file-renaming issues that are inherent in the original “magic filename” architecture.
Until next time,