Vulnerability Development mailing list archives

Re: Plain text files in internet explorer


From: Philip Rowlands <phr () doc ic ac uk>
Date: Mon, 2 Sep 2002 17:29:56 +0100 (BST)

On Sun, 1 Sep 2002, Dan Kaminsky wrote:

All things being equal, I'll go with correct behavior being first that
which matches what is presented to the user in the title bar, using
standard (Microsoftian!) in-band filename notation, then if nothing
usable is there, use the MIME-type as a hint.

<RANT>
And how will everybody discover your scheme? What exists now is a mess,
caused by Microsoft's typical embrace-and-extend behaviour of tweaking
their apps to be annoyingly inconsistent with the standard.

If you dislike the RFC, then don't use it. The worst thing you can do is
make up your own rules. This creates more problems than it solves.
</RANT>

foobar.txt is always read as text.
foobar.html is always read as html.
foobar.php and foobar.php, which really *should* be foobar.html because
-- dear god, they contain html -- can use the MIME-type to hint
themselves into HTML parsing.
foobar.gif is always read as gif.
a javascript virus is always obviously either javascript(foo.js) or
parsed as a gif(foo.gif).

Importantly, I cannot conceive of a circumstance in which this can be
described as incorrect behavior.  None.

Consider a tutorial site teaching basic HTML, which serves code snippets
as text/plain so the student can read the markup, even though the same
file would be saved to the hard disk as .html.

What is .rpm? Is it an RPM Package Manager file, or a RealAudio plugin?
Both exist.

What about .cgi that looks like HTML but declares itself to be
text/plain?

Perhaps the author of an image archive site intends his .gif/.jpg/.bmp
files to be downloaded straight, not rendered, so uses
application/octet-stream.
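
The cases above can be put side by side in a rough sketch. Everything
named here is an assumption made for illustration: the extension table,
the example.com URLs, the extension_first() helper, and the two MIME
strings listed for .rpm. It models the quoted proposal only loosely and
is not how any real browser is implemented.

```python
import posixpath
from urllib.parse import urlsplit

# Hypothetical extension table; real browsers and servers differ.
EXTENSION_TYPES = {
    ".txt": "text/plain",
    ".html": "text/html",
    ".gif": "image/gif",
    ".jpg": "image/jpeg",
}

# One extension, two unrelated formats (type strings are illustrative).
AMBIGUOUS = {
    ".rpm": ["application/x-rpm", "audio/x-pn-realaudio-plugin"],
}

def extension_first(url, declared_type):
    """Prefer the type implied by the URL's extension, else the header."""
    ext = posixpath.splitext(urlsplit(url).path)[1].lower()
    return EXTENSION_TYPES.get(ext, declared_type)

# (URL, Content-Type the author deliberately chose)
cases = [
    ("http://example.com/lesson1.html", "text/plain"),
    ("http://example.com/archive/cat.gif", "application/octet-stream"),
    ("http://example.com/report.cgi", "text/plain"),
]

# Collect every case where the extension overrides the author's choice.
overridden = [
    (url, declared, extension_first(url, declared))
    for url, declared in cases
    if extension_first(url, declared) != declared
]

for url, declared, used in overridden:
    print(f"{url}: author said {declared}, extension says {used}")
```

In this sketch the tutorial page and the image download are both
overridden against the author's wishes, while the .cgi file falls
through to the header only because its extension is not in the table.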

Let's consider the opposite: a browser that consistently uses the MIME
type in all situations. It follows that developers/authors get it right
during testing, users see the "correct" rendering of the content as the
author intended, and everyone's happy :)
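
As a minimal sketch of that policy, assuming a made-up table of handler
names (HANDLERS and handler_for() are hypothetical, not taken from any
browser), the dispatch depends on the Content-Type header and nothing
else:

```python
from email.message import Message

# Hypothetical handler names, for illustration only.
HANDLERS = {
    "text/plain": "show_as_text",
    "text/html": "render_html",
    "image/gif": "render_image",
    "application/octet-stream": "offer_download",
}

def handler_for(content_type_header):
    """Pick a handler from the Content-Type header alone.

    The URL and its extension are never consulted.
    """
    msg = Message()
    msg["Content-Type"] = content_type_header
    # get_content_type() lowercases the value and drops parameters
    # such as charset.
    media_type = msg.get_content_type()
    # Unknown types are offered for download rather than guessed at.
    return HANDLERS.get(media_type, "offer_download")

print(handler_for("text/plain; charset=iso-8859-1"))  # show_as_text
print(handler_for("TEXT/HTML"))                        # render_html
```

Falling back to a download prompt for unknown types is a design choice
made for this sketch; the point is only that the same header always
produces the same behaviour, whatever the URL looks like.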

I expect the exploit stream will eventually lead to MIME-type
deprecation.

That's a huge (and IMHO backward) paradigm shift. The Uniform Resource
Locator is just that, a "handle" on some content. It does not specify
the type of data, nor its size, age, TTL, language, caching
characteristics etc. All of these belong out-of-band, so to speak, in
the protocol headers.
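
To make the split concrete, here is a small sketch that reads this
metadata from a response header block. The header values and the URL
are invented for illustration; only the header names are standard HTTP.

```python
from email.parser import Parser

# A made-up response header block, for illustration only.
raw_headers = (
    "Content-Type: text/html; charset=iso-8859-1\r\n"
    "Content-Length: 2048\r\n"
    "Content-Language: en\r\n"
    "Last-Modified: Sun, 01 Sep 2002 12:00:00 GMT\r\n"
    "Cache-Control: max-age=3600\r\n"
    "\r\n"
)

# The URL is only a handle; nothing below is derived from it.
url = "http://example.com/foobar.php"

# HTTP headers use the same "Name: value" syntax as mail headers,
# so the standard library's email parser can read them.
headers = Parser().parsestr(raw_headers, headersonly=True)

metadata = {
    "type": headers.get_content_type(),
    "size": int(headers["Content-Length"]),
    "language": headers["Content-Language"],
    "modified": headers["Last-Modified"],
    "caching": headers["Cache-Control"],
}

print(url, metadata)
```

Type, size, language, age and caching hints all come from the headers
here, even though the URL ends in .php.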


Cheers,

Phil

