WebApp Sec mailing list archives

RE: Content monitorting in Application Security


From: "Ofer Shezaf" <Ofer.Shezaf () breach com>
Date: Wed, 19 Jan 2005 04:23:29 -0500

(Mark: I think this is interesting and it was returned on timeout)

Thanks Jeremiah

One righteous within the City of Sodom.... I had already assumed that I
would get only tons of answers showing me the light of "file" ...

-----Original Message-----
on Monday, January 10 Jeremiah Grossman wrote:

Explicitly trusting the content-type header value and the file
extension merely because they match is not a good idea.
Here's a process I have seen used in the past that seems to work fairly
effectively.

1) Is the content-type header value a valid file type you're expecting to
receive (HTML/XML/GIF)? If not, fail.
2) Do the content-type header value and file extension match a
mime-type entry? If not, fail.
3) Is the file within certain size constraints? If not, fail.

As this is not directly type related, it sort of belongs to a long list
of checks that an Application IDS should do (RFC compliance, for
example).

4) Does the file have the proper format for what it's claiming to
be? If not, fail.
5) If it is an HTML file, then run it through some security filtering
libraries.

Probably true of other file types as well...


* Hopefully I got the steps right. Someone might want to double check
the logic flow *
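
The first three checks in the list above can be sketched in a few lines
of Python. This is only an illustration: the table of allowed types and
extensions, the size limit and the function name are assumptions made for
the example, not values from this thread. Steps 4 and 5 depend on the file
type and are left out.

```python
import os

# Hypothetical policy: allowed types with their accepted extensions,
# and a size limit. None of these values come from the thread.
ALLOWED = {
    "text/html": {".html", ".htm"},
    "text/xml": {".xml"},
    "image/gif": {".gif"},
}
MAX_SIZE_BYTES = 1024 * 1024


def check_upload(filename, content_type, data):
    """Apply steps 1-3 and return (accepted, reason)."""
    # Ignore parameters such as "; charset=utf-8" and letter case.
    claimed = content_type.split(";", 1)[0].strip().lower()

    # Step 1: is the claimed type one we expect to receive?
    if claimed not in ALLOWED:
        return False, "unexpected content type"

    # Step 2: does the file extension belong to the claimed type?
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ALLOWED[claimed]:
        return False, "extension does not match content type"

    # Step 3: is the file within the size limit?
    if len(data) > MAX_SIZE_BYTES:
        return False, "file too large"

    return True, "passed steps 1-3"
```

A call such as check_upload("logo.gif", "image/gif", data) returns a
(bool, reason) pair; passing here only means the file may go on to the
format check in step 4.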

With respect to the question, step 4 is where the details really
matter. While you could use the unix 'file' command (as suggested by
another poster) to determine the actual file type, I would prefer another
approach.

How right you are regarding "file": while file is a very nice and useful
utility, it is productivity oriented and not very security oriented:
- It matches very short signatures, making it relatively simple to
evade.
- It has some big identification holes, at least in the magic file I'm
using (while it detects sub-versions of a PDF file, it detects both Word
and Excel as a "Microsoft office document").
- It does little to detect the content of text files, so Perl, shell
and JavaScript files are all detected the same.

The reason for these shortcomings is not just that the magic file is not
large enough: the detection operators it supports are rather limited. For
example, it does not support scanning the file for a signature, but only
looking for it at a predefined offset.
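
The difference between the two kinds of operator can be shown with a
small Python sketch. It does not reproduce how 'file' is implemented; the
function names and the sample bytes are made up for the illustration.

```python
GIF_MAGIC = b"GIF89a"
SCRIPT_MARKER = b"<script"


def matches_at_offset(data, signature, offset=0):
    """Fixed-offset test: only the bytes at one position are examined."""
    return data[offset:offset + len(signature)] == signature


def found_anywhere(data, signature):
    """Scanning test: the signature may appear anywhere (ASCII case ignored)."""
    return signature.lower() in data.lower()


# A file that starts like a GIF but carries script content further in.
sample = GIF_MAGIC + b"\x01\x00\x01\x00" + b"<SCRIPT>alert('xss')</SCRIPT>"
```

Here matches_at_offset(sample, GIF_MAGIC) is true, so a check at offset 0
sees nothing but a GIF, while found_anywhere(sample, SCRIPT_MARKER) also
notices the script text that follows the header.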

I'm also not sure that it is very well optimized for the real-time traffic
inspection required by an application security protection system such as
my company's product.


Use the content-type header value that the file claims to be and parse
based on that premise. For instance, if the file claims to be GIF when
it hits step 4, run it through an image parser and see if there are any
errors. Usually when I see files uploaded via a web interface, the
expected type of file is fairly limited for the most part. Normally maybe
a few types of text files (HTML, CSV, XML), pictures (GIF, JPG, PNG),
possibly mp3's, etc. I would handle each type of file on a case-by-case
basis.
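
A rough Python sketch of "parse based on the claimed type" follows. It is
an illustration under stated assumptions, not a complete validator: PNG
stands in for GIF because its per-chunk CRCs keep a structural check
short, the PNG check only walks the chunk list and does not decode the
image, and the PARSERS table and function names are hypothetical.

```python
import struct
import xml.etree.ElementTree as ET
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def parse_png(data):
    """Walk the PNG chunk list, verifying lengths and CRCs."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("bad PNG signature")
    pos = len(PNG_SIGNATURE)
    first = True
    while True:
        if pos + 8 > len(data):
            raise ValueError("truncated chunk header")
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        end = pos + 8 + length + 4  # header + body + CRC
        if end > len(data):
            raise ValueError("truncated chunk")
        body = data[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack(">I", data[end - 4:end])
        if zlib.crc32(ctype + body) != crc:
            raise ValueError("bad chunk CRC")
        if first and ctype != b"IHDR":
            raise ValueError("first chunk is not IHDR")
        first = False
        pos = end
        if ctype == b"IEND":
            break
    if pos != len(data):
        raise ValueError("data after IEND")


def parse_xml(data):
    """Raise ET.ParseError unless the data is well-formed XML."""
    ET.fromstring(data)


# Hypothetical dispatch table: one parser per accepted content type.
PARSERS = {"image/png": parse_png, "text/xml": parse_xml}


def parses_as_claimed(content_type, data):
    """True only if a parser exists for the type and accepts the data."""
    parser = PARSERS.get(content_type)
    if parser is None:
        return False  # no parser for this type: fail closed
    try:
        parser(data)
    except (ValueError, ET.ParseError):
        return False
    return True
```

Each accepted type gets its own entry, which matches the case-by-case
handling described above. For untrusted XML, the Python documentation
recommends a hardened parser such as defusedxml over the standard library
one used in this sketch.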


The problem with full parsing of each type is that it just takes too
long for a real-time product such as ours. I'm looking for an
interim solution that does not require full parsing but does not rely on
limited signatures.

One tool that I've found is trid (http://mark0.ngi.it/soft-trid-e.html).
It is signature based but employs much stronger signatures. It also has
a unique tool to build those signatures from a collection of files.
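
How trid builds its signatures is not described here, so the following
Python sketch shows only the general idea of deriving a signature from a
collection of sample files. It assumes, purely for illustration, that a
signature is the set of byte positions holding the same value in every
sample; both function names are hypothetical.

```python
def build_signature(samples):
    """Return {offset: byte} for positions identical in every sample."""
    if not samples:
        return {}
    shortest = min(len(s) for s in samples)
    first = samples[0]
    return {
        i: first[i]
        for i in range(shortest)
        if all(s[i] == first[i] for s in samples)
    }


def match_score(data, signature):
    """Fraction of signature positions that the data matches (0.0 to 1.0)."""
    if not signature:
        return 0.0
    hits = sum(
        1 for offset, value in signature.items()
        if offset < len(data) and data[offset] == value
    )
    return hits / len(signature)
```

The more samples go into build_signature, the fewer accidental matches
survive, so the signature ends up covering many more bytes than a short
hand-written magic entry while still being much cheaper to apply than a
full parse.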


I don't know exactly how the unix 'file' command works, but I believe
it's going to make its best guess based on certain identifiable format
indicators. And if this is all you're looking for, it's a great util.
Personally I think it's better to know if a file would actually parse
rather than just appear to be something it might not be.

Just preference between the different methods.


jeremiah-



On Sunday, January 9, 2005, at 01:22  PM, Ofer Shezaf wrote:


Hi Jeremiah,

I have lately been researching the issue of ensuring that files (uploaded
and downloaded) are of the right type.

Do you think that matching the extension and content-type header would be
enough? If not, are you aware of any technology to determine a file type
according to its content?

~ Ofer

Ofer Shezaf
CTO, Breach Security

Tel: +972.9.956.0036 ext.212
Cell: +972.54.443.1119
ofers () breach com
http://www.breach.com

-----Original Message-----
From: Jeremiah Grossman [mailto:jeremiah () whitehatsec com]
Sent: Saturday, January 08, 2005 3:44 AM
To: Alfred Hitchcock
Cc: webappsec () securityfocus com
Subject: Re: Content monitorting in Application Security

Sounds like common web site functionality and the resulting security
challenge.

Here are techniques that may help...

1) When receiving an uploaded file of any kind, use various parser
libraries to sanity check the actual format of the data, ensuring that the
file being uploaded is what it claims to be, with the incoming file
extension and content-type header in agreement. jpeg's should be
formatted like jpegs, mp3's like mp3's, html like html and so on.

2) If you plan on handling files beyond plain text, such as zips and
exe's, you may consider using some type of A/V product as well. A nice
security add-on that can be useful depending on the situation.

3) The following method is strictly about XSS and HTML/JavaScript
content.

While it's fairly easy to filter all HTML tags from a file to prevent
XSS, it's much harder to separate HTML from executable
client-side code (JavaScript), especially when the HTML is freeform and
most tags need to be supported on the web site. I've long said it's a
slippery slope to support user-submitted HTML, but sometimes it can't be
helped.

There are a few things that can help mitigate the risk of the
uploaded files.

   a. Filter out potentially malicious HTML tags or only allow a strict
set of safe HTML tags.
   b. Filter out potentially malicious tag attributes or only allow a
strict set of safe tag attributes.

   * Either choice is a give and take of security vs.
functionality/ease-of-use.
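
The "only allow a strict set" variant of (a) and (b) might look roughly
like the Python sketch below, built on the standard html.parser module.
The allowlists, the accepted URL prefixes and the names AllowlistFilter
and sanitize are assumptions for the example; a real site would pick its
own lists and would probably prefer a maintained sanitizing library.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlists; a real site would choose its own.
SAFE_TAGS = {"p", "b", "i", "em", "strong", "a", "ul", "ol", "li"}
SAFE_ATTRS = {"a": {"href", "title"}}
SAFE_URL_PREFIXES = ("http://", "https://", "mailto:")
DROP_WITH_CONTENT = {"script", "style"}


class AllowlistFilter(HTMLParser):
    """Re-emits only allowlisted tags and attributes; text is escaped."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in SAFE_TAGS:
            return
        kept = []
        for name, value in attrs:
            if name not in SAFE_ATTRS.get(tag, ()):
                continue  # drops onclick, style and anything unlisted
            value = value or ""
            if name == "href" and not value.strip().lower().startswith(SAFE_URL_PREFIXES):
                continue  # drops javascript: and other unlisted schemes
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and tag in SAFE_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data))


def sanitize(html_text):
    """Return html_text reduced to the allowlisted tags and attributes."""
    parser = AllowlistFilter()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

Widening SAFE_TAGS and SAFE_ATTRS buys functionality at the cost of a
larger attack surface, which is the give and take mentioned above.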

   Depending on the programming language you are using, there might be
some libraries available that could help make this process easier. I
haven't used them, but I noticed there are libraries available for
Perl.

http://cpan.uwinnipeg.ca/dist/HTML-StripScripts
http://cpan.uwinnipeg.ca/dist/HTML-Scrubber-StripScripts

There might be some available if you use some other language.


best of luck!


jeremiah-






On Friday, January 7, 2005, at 04:55  AM, Alfred Hitchcock wrote:

Hi All,
I have a major doubt; it would be of great help if anybody could provide
a solution to this.
I have a web page which allows users to upload files such as jpeg and
html files.
Are there any mechanisms which can detect malicious html files? E.g. if
an html page contains a malicious JavaScript such as alert('xss'), then
how can we check for these things? One more point to be noted here is
that files can be uploaded by any user.
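
As a starting point for the detection question, a Python sketch using the
standard html.parser module can report the most obvious places where an
HTML file carries client-side code. The names ScriptFinder and
find_active_content and the list of URL attributes are assumptions for
the example, and the list of constructs is far from complete, so an empty
result is not proof that a file is safe.

```python
from html.parser import HTMLParser

# Attributes that commonly hold a URL (an illustrative, partial list).
URL_ATTRS = {"href", "src", "action"}


class ScriptFinder(HTMLParser):
    """Collects constructs that can run client-side code."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.findings.append("script element")
        for name, value in attrs:
            if name.startswith("on"):
                # onclick, onerror, onload and similar handlers
                self.findings.append(f"event handler: {name}")
            elif name in URL_ATTRS and value:
                # Remove whitespace that may be used to hide the scheme.
                scheme = "".join(value.split()).lower()
                if scheme.startswith("javascript:"):
                    self.findings.append(f"javascript: URL in {name}")


def find_active_content(html_text):
    """Return a list of findings; empty if none of the checks fired."""
    finder = ScriptFinder()
    finder.feed(html_text)
    finder.close()
    return finder.findings
```

For the alert('xss') example, a page containing
<script>alert('xss')</script> yields the finding "script element".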




