Vulnerability Development mailing list archives

Re: Plain text files in internet explorer


From: Marc Slemko <marcs () znep com>
Date: Tue, 3 Sep 2002 08:42:38 -0700 (PDT)

On Mon, 2 Sep 2002, Dan Kaminsky wrote:

I'm serious; we have an extension <-> filetype LUT in the web server,
the one component that cares least about the content, and it's breaking
at precisely this point.  Extensions are file types.  Period.

This shows a distinct lack of knowledge and experience with the
diversity present on the web today.  The reality is that there are
many servers where MIME types do NOT come from any such trivial
extension-to-MIME-type mapping.

There is no such thing as a filename in a URL, and no such thing as a
filename extension in a URL.  Heck, let's look at the most common
case: a path ending in a trailing "/", such as http://www.example.com/foo/
How do you know if that is plain text, HTML, XML, etc.?
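
To make the point concrete, here is a minimal Python sketch of what guessing
from the URL alone gives you. The helper name guess_from_url is made up for
illustration; it only wraps the standard library's extension table.

```python
import mimetypes
from urllib.parse import urlsplit


def guess_from_url(url):
    """Guess a MIME type from the URL path alone (the approach being criticised)."""
    path = urlsplit(url).path
    mime, _encoding = mimetypes.guess_type(path)
    return mime


# A path ending in "/" has no extension, so there is nothing to look up.
print(guess_from_url("http://www.example.com/foo/"))
# A path that happens to end in ".html" matches the table, but that says
# nothing about what the server will actually declare or send.
print(guess_from_url("http://www.example.com/foo.html"))
```

The first call has nothing to go on; the only reliable answer for that URL is
whatever Content-Type the server sends.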


What about .cgi that looks like HTML but declares itself to be
text/plain?

Photoshop makes a JPEG.  It's a JPEG.
Imagemagick makes a JPEG.  It's a JPEG.
Some crazy hacker with a hex editor makes a JPEG.  It's a JPEG.

The implementation does not define the format.  Exposing CGI/PHP/ASP is
marketing, nothing more.  We actually shouldn't be seeing foo.cgi...but
if we are, I'll accept MIME type being used as a *hack* to expose the
type of *backend* data.

Um, the filename extension is entirely an artifact of the implementation;
the operating system implementation, in this case.  Some OSes have limits
on filenames (e.g. DOS 8.3, meaning the file would be called a ".jpg" most
commonly).  Some OSes have built-in support for file types outside of the
filename.  Some OSes have other restrictions or lack thereof on filenames
and extensions.

You agree the implementation doesn't define the format.  So how
the heck can the file extension define the format?  MIME types are
an abstraction designed exactly because the filename is a
platform-specific implementation artifact and not suitable for
inter-platform communication of file types.

Perhaps the author of an image archive site intends his .gif/.jpg/.bmp
files to be downloaded straight, not rendered, so uses
application/octet-stream.

So at the layer of the web server, he's going to subvert the GIF mapping
into octet stream?

Do consider how ridiculous this sounds.

Maybe to you, but lots of people do similar types of things every
single day.  Files are commonly downloaded from a dynamic source, and
it is a completely bogus kludge to require that people set up their
web servers to execute their code when they get a request for foo.mpeg
or what have you.
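
As a rough illustration of the dynamic case, here is a small WSGI sketch in
Python. Everything in it is assumed for the example: the /download?id=42 URL,
the download_app and fetch names, and the placeholder bytes standing in for
real movie data. The URL has nothing resembling an extension, and the declared
Content-Type is the only type information the client gets.

```python
from wsgiref.util import setup_testing_defaults


def download_app(environ, start_response):
    """Serve a movie from a dynamic URL such as /download?id=42 (hypothetical)."""
    # Placeholder bytes standing in for MPEG data from a database or generator.
    body = b"\x00\x00\x01\xba" + b"\x00" * 16
    headers = [
        ("Content-Type", "video/mpeg"),  # declared type, independent of the URL
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def fetch(app, path, query):
    """Call a WSGI app in-process and return (status, headers dict, body)."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    environ["QUERY_STRING"] = query
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


status, headers, body = fetch(download_app, "/download", "id=42")
print(status, headers["Content-Type"], len(body))
```

Nothing here requires mapping a request for foo.mpeg onto code; the server
simply says what it is sending.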

http://www.foobar.com/movie.mpg is a direct handle to an mpeg movie.
http://www.foobar.com/foobar.exe is a direct handle to an executable.

Suppose for a moment we keep the URLs the same, but swap file content
and MIME header (i.e. you go to download the movie and instead run the
code in foobar.exe).  Sure, this is an obvious breach of security, but
it's something *more* than that.  It's a spoofing attack.  The user has
as much a legitimate right to consider themselves downloading a batch of
video data as they do to believe the content is coming from foobar.com.

I have no idea what this has to do with security; if your browser is
set up to automatically execute .exe files, then it is completely
irrelevant if you type it directly into the browser or if a page loads it
in some other fashion so that you never see the URL.  Either way, your
browser has a security problem that should be fixed.

However, trying to guess the MIME type based on non-standard and
unexpected rules does have major security consequences.  Thanks to
Microsoft's brain damage, it is impossible to serve up arbitrary
text files without the risk of them being interpreted as HTML.  To
make them accessible safely, they need to be converted to HTML and
encoded.
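
The conversion mentioned above could look something like this in Python. It
is a sketch, not a complete defence: text_to_safe_html is a name invented for
the example, and wrapping the text in a pre element is just one plausible way
to keep its layout.

```python
import html


def text_to_safe_html(text):
    """Wrap arbitrary text in an HTML page with all markup characters escaped."""
    # html.escape turns &, <, > (and quotes, with quote=True) into entities,
    # so nothing in the original text can be parsed as a tag.
    escaped = html.escape(text, quote=True)
    return "<!DOCTYPE html>\n<html><body><pre>" + escaped + "</pre></body></html>"


print(text_to_safe_html("log entry: <script>alert(1)</script>"))
```

The result is then served as text/html, so whether the browser honours the
declared type or sniffs the content, it reaches the same answer.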

From a security perspective, the more complex the behaviour the
harder it is to analyze, predict, and filter.  Using the MIME type
properly is about the simplest behaviour possible.  Having some
complex system of sometimes looking at the MIME type, sometimes
looking at what you think is a filename, and sometimes looking at
the content to guess what to do with it is a nightmare, and has
already been the direct cause of a number of security holes in IE.
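
For comparison, the simple behaviour fits in a few lines. This Python sketch
is only an illustration; the choose_action name and the action strings are
made up, and a real client would handle many more types. The point is that
the function takes the declared type and nothing else: no URL, no body.

```python
def choose_action(content_type):
    """Decide what to do with a response using only its declared Content-Type."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "text/plain":
        return "show as plain text"
    if media_type == "text/html":
        return "render as HTML"
    if media_type.startswith("image/"):
        return "display image"
    # Anything unrecognised is never executed or rendered, only offered for saving.
    return "offer to save"


print(choose_action("text/plain; charset=us-ascii"))
print(choose_action("application/octet-stream"))
```

With one input and no guessing, it is easy to predict what the client will do
with a given response, and easy for a filter to predict it too.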

