WebApp Sec mailing list archives

RE: Content monitorting in Application Security

From: "Ofer Shezaf" <Ofer.Shezaf () breach com>
Date: Wed, 19 Jan 2005 04:23:29 -0500

(Mark: I think this is interesting and it was returned on timeout)

Thanks Jeremiah

One righteous within the City of Sodom.... I already thought that I'd
get only tons of answers showing me the light of "file" ...

-----Original Message-----
on Monday, January 10 Jeremiah Grossman wrote:

Explicitly trusting the content-type header value and the file
extension alone, just because they match, is not a good idea.
Here's a process I have seen done in the past that seems to work fairly
effectively.

1) Is the content-type header value a valid file type you're expecting
to receive (HTML/XML/GIF)? If not, fail.
2) Do the content-type header value and file extension match a
mime-type entry? If not, fail.
3) Is the file within certain size constraints? If not, fail.


As this is not directly type related, it sort of belongs to a long list
of checks that an Application IDS should do (RFC compliance, for
example).

4) Does the file have the proper format according to what it's claiming
to be? If not, fail.
5) If it is an HTML file, then run it through some security filtering
libraries.


Probably true of other file types as well...


* Hopefully I got the steps right. Someone might want to double check
the logic flow *
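
As an illustration of steps 1 to 3 above, a minimal Python sketch might
look like the following. The type table, the size limit and the function
name precheck_upload are assumptions made up for this example, not
anything taken from the thread; steps 4 and 5 are left out.

```python
import os

# Hypothetical policy values, chosen only for illustration.
MIME_TABLE = {
    ".html": "text/html",
    ".htm": "text/html",
    ".xml": "text/xml",
    ".gif": "image/gif",
}
MAX_SIZE = 1024 * 1024  # 1 MiB, an arbitrary limit


def precheck_upload(filename, content_type, data):
    """Return (ok, reason) for steps 1-3; the first failing step wins."""
    # Step 1: the declared type must be one we expect to receive.
    declared = content_type.split(";")[0].strip().lower()
    if declared not in set(MIME_TABLE.values()):
        return False, "unexpected content type"

    # Step 2: the file extension must map to the declared type.
    ext = os.path.splitext(filename)[1].lower()
    if MIME_TABLE.get(ext) != declared:
        return False, "extension and content type disagree"

    # Step 3: the file must be within the size constraint.
    if len(data) > MAX_SIZE:
        return False, "file too large"

    return True, "ok"
```

A file that passes this precheck has only been shown to be consistently
labelled, which is why step 4 still has to look at the content itself.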

With respect to the question, step 4 is where the details really
matter. While you could use the unix 'file' command (as suggested by
another poster) to determine the actual file type, I would prefer
another approach.


How right you are regarding "file": while file is a very nice and useful
utility, it is productivity oriented and not very security oriented:
- It matches very short signatures, making it relatively simple to evade
it.
- It has some big identification holes, at least in the magic file I'm
using (while it detects sub versions of a PDF file, it detects both Word
and Excel as a "Microsoft office document").
- It does little to detect the content of text files, so that perl,
shell and JavaScript files are all detected the same.

The reason for these shortcomings is not just that the magic file is not
large enough: the detection operators it supports are rather limited.
For example, it does not support scanning the file for a signature, but
only looking for it at a predefined offset.
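
The difference between the two kinds of operator can be sketched in a few
lines of Python. The function names and the 4096-byte scan limit are
assumptions for this example and do not describe how 'file' is
implemented.

```python
def match_at_offset(data: bytes, signature: bytes, offset: int = 0) -> bool:
    """True only if the signature sits exactly at the given offset."""
    return data[offset:offset + len(signature)] == signature


def match_anywhere(data: bytes, signature: bytes, limit: int = 4096) -> bool:
    """True if the signature occurs anywhere in the first `limit` bytes."""
    return signature in data[:limit]


# A payload that starts like a GIF but carries script further in.
sample = b"GIF89a" + b"\x00" * 10 + b"<script>alert('xss')</script>"

print(match_at_offset(sample, b"GIF89a"))   # fixed-offset check passes
print(match_at_offset(sample, b"<script"))  # fixed-offset check misses the script
print(match_anywhere(sample, b"<script"))   # scanning finds it
```

A fixed-offset test on a short signature only says how the file begins,
which is what makes it easy to evade.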

I'm also not sure that it is very well optimized for real time traffic
inspection required by an application security protection system such as
my company's product.


Use the content-type header value that the file claims to be and parse
based on that premise. For instance, if the file claims to be a GIF when
it hits step 4, run it through an image parser and see if there are any
errors. Usually when I see files uploaded via a web interface, the
expected types of file are fairly limited. Normally maybe a few types of
text files (HTML, CSV, XML), pictures (GIF, JPG, PNG), possibly mp3's,
etc. I would handle each type of file on a case-by-case basis.
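
A rough Python sketch of "parse according to the claimed type" is shown
below, using only the standard library. The PARSERS table and the
function names are hypothetical, and the GIF check reads only the header
and logical screen descriptor rather than decoding the whole image, so
it is a much weaker test than a real image parser.

```python
import struct
import xml.etree.ElementTree as ET


def parses_as_xml(data: bytes) -> bool:
    # ElementTree is not hardened against malicious XML; a production
    # check would need a hardened parser.
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


def looks_like_gif(data: bytes) -> bool:
    """Check the 6-byte header and the screen size fields only."""
    if len(data) < 13 or data[:6] not in (b"GIF87a", b"GIF89a"):
        return False
    width, height = struct.unpack("<HH", data[6:10])
    # Hypothetical sanity rule: reject a zero-sized logical screen.
    return width > 0 and height > 0


# One checker per claimed content type, handled case by case.
PARSERS = {"text/xml": parses_as_xml, "image/gif": looks_like_gif}


def format_matches_claim(content_type: str, data: bytes) -> bool:
    """Fail closed: unknown types and parse errors both return False."""
    checker = PARSERS.get(content_type)
    return checker is not None and checker(data)
```

Types without a registered checker are rejected, which fits the point
that the set of expected upload types is usually small.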


The problem with full parsing of each type is that it just takes too
long for a real time product such as the one we do. I'm looking for an
interim solution that does not require full parsing but does not rely on
limited signatures.

One tool that I've found is trid (http://mark0.ngi.it/soft-trid-e.html).
It is signature based but employs much stronger signatures. It also has
a unique tool to build those signatures from a collection of files.
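
The general idea of deriving a signature from a collection of sample
files can be illustrated as follows. This is a deliberately simplified
sketch and not TrID's actual algorithm, which is not described in this
thread; build_signature, matches and the 64-byte window are invented for
the example.

```python
def build_signature(samples, window=64):
    """Map offset -> byte for positions where every sample agrees."""
    n = min(window, min(len(s) for s in samples))
    first = samples[0]
    return {
        i: first[i]
        for i in range(n)
        if all(s[i] == first[i] for s in samples)
    }


def matches(data, signature):
    """True if data has the expected byte at every signature offset."""
    # An empty signature carries no information, so it never matches.
    return bool(signature) and all(
        i < len(data) and data[i] == b for i, b in signature.items()
    )


samples = [b"GIF89a\x01\x00", b"GIF89a\x02\x00"]
sig = build_signature(samples)
print(sorted(sig))  # offsets shared by all samples
```

The more sample files are fed in, the fewer accidental agreements
survive, so the resulting signature tends towards the bytes that really
characterise the format.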


I don't know exactly how the unix 'file' command works, but I believe
it's going to make its best guess based on certain identifiable format
indicators. And if this is all you're looking for, it's a great util.
Personally I think it's better to know if a file would actually parse
rather than that it just appears to be something it might not be.

Just preference between the different methods.


jeremiah-



On Sunday, January 9, 2005, at 01:22  PM, Ofer Shezaf wrote:


Hi Jeremiah,

I have lately been researching the issue of ensuring that files
(uploaded and downloaded) are of the right type.

Do you think that matching the extension and content type header would
be enough? If not, are you aware of any technology to determine a file
type according to its content?

~ Ofer

Ofer Shezaf
CTO, Breach Security

Tel: +972.9.956.0036 ext.212
Cell: +972.54.443.1119
ofers () breach com
http://www.breach.com

-----Original Message-----
From: Jeremiah Grossman [mailto:jeremiah () whitehatsec com]
Sent: Saturday, January 08, 2005 3:44 AM
To: Alfred Hitchcock
Cc: webappsec () securityfocus com
Subject: Re: Content monitorting in Application Security

Sounds like common web site functionality and the resulting security
challenge.

Here are techniques that may help...

1) When receiving an uploaded file of any kind, use various parser
libraries to sanity check the actual format of the data, ensuring the
file being uploaded is what it claims to be, with the incoming file
extension and content type header in agreement. jpeg's should be
formatted like jpegs, mp3's like mp3's, html like html and so on.

2) If you plan on handling files beyond plain text, such as zips and
exe's, you may consider using some type of A/V product as well. A nice
security add-on that can be useful depending on the situation.

3) This following method is strictly about XSS and HTML/JavaScript
content.

While it's fairly easy to filter all HTML tags from a file to prevent
XSS, it's exponentially harder to separate HTML from executable
client-side code (JavaScript), especially when the HTML is freeform and
most tags need to be supported on the web site. I've long said it's a
slippery slope to support user-submitted HTML, but sometimes it can't be
helped.

There are a few things that can help mitigate the risk of the
uploaded files.

   a. Filter out potentially malicious HTML tags, or only allow a
      strict set of safe HTML tags.
   b. Filter out potentially malicious tag attributes, or only allow a
      strict set of safe tag attributes.

   * The either/or is a give and take of security vs.
     functionality/ease-of-use.
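
The "strict set of safe tags and attributes" variant of (a) and (b) could
be sketched in Python with the standard html.parser module, as below.
The allowed tag and attribute sets, the http/https-only rule for href
and the names AllowlistFilter and filter_html are assumptions chosen for
illustration; a real deployment would more likely rely on a maintained
filtering library.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlists; everything not listed is dropped.
ALLOWED_TAGS = {"b", "i", "p", "a"}
ALLOWED_ATTRS = {"a": {"href"}}


class AllowlistFilter(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return
        kept = []
        for name, value in attrs:
            if name not in ALLOWED_ATTRS.get(tag, ()) or value is None:
                continue
            # Only plain web links survive; javascript: and the like do not.
            if name == "href" and not value.lower().startswith(("http://", "https://")):
                continue
            kept.append(' %s="%s"' % (name, escape(value, quote=True)))
        self.out.append("<%s%s>" % (tag, "".join(kept)))

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and tag in ALLOWED_TAGS:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        # Text is kept but re-escaped; script and style bodies are dropped.
        if not self.skip_depth:
            self.out.append(escape(data))


def filter_html(text):
    parser = AllowlistFilter()
    parser.feed(text)
    parser.close()
    return "".join(parser.out)
```

The allowlist direction is the stricter side of the trade-off mentioned
above: unknown tags and attributes vanish by default, at the cost of
breaking any legitimate markup that is not on the list.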

   Depending on the programming language you are using, there might be
some libraries available that could help make this process easier. I
haven't used them, but I noticed there are libraries available for Perl.


http://cpan.uwinnipeg.ca/dist/HTML-StripScripts
http://cpan.uwinnipeg.ca/dist/HTML-Scrubber-StripScripts

There might be some available if you use some other language.


best of luck!


jeremiah-






On Friday, January 7, 2005, at 04:55  AM, Alfred Hitchcock wrote:


Hi All,
I have a major doubt, and it would be of great help if anybody could
provide a solution to this.
I have a web page which allows users to upload files such as jpeg and
html files.
Is there any mechanism which can detect malicious html files? E.g. if
an html page has got a malicious JavaScript such as alert('xss'), then
how can we check these things? One more point to be noted here is that
uploading of files can be done by any user.


