WebApp Sec mailing list archives

Re: Preventing cross site scripting


From: Jeremiah Grossman <jeremiah () whitehatsec com>
Date: 19 Jun 2003 19:23:56 -0700

On Thu, 2003-06-19 at 11:28, Andrew Beverley wrote:
I am currently writing a web application that, as a small part of it,
needs to display an email message. The message may well be in HTML
format, in which case it could be sent straight to the browser for display.

I would like to know the best way of filtering out undesirable html. I
understand the best way is to only allow acceptable information, in this
case all the different html formatting tags.

However, there are a lot of acceptable tags. Another approach
would be to strip out all the bad stuff such as <SCRIPT>, <OBJECT>,
<APPLET>, and <EMBED>, but this is far from ideal because new tags
keep becoming available, and so on.

Are there any functions available (for PHP) that will take an HTML page
as input and strip out all the nasty stuff? Does anyone have suggestions as
to how to do this as easily as possible?

This is a very tough problem to solve, and no one to my knowledge has
done it completely effectively. Any HTML-aware web application faces
this dilemma, especially given web browsers' loose interpretation of
(D)HTML and JavaScript.

Let me say first....

Attempting to safely allow HTML into your system is playing with fire,
plain and simple. With that in mind, we can move on to a decent
solution to implement.

Use a strict allow list of HTML tags and attributes:
Start with a small, safe set of allowable HTML: just the tags and
attributes you feel your users need to get the job done, and that won't
allow other client-side technologies (JS/ActiveX/Java/etc.) to leak
through.

Parse your HTML content and allow only those tags and attributes to pass
unfiltered. Replace any other tags/attributes with HTML entities. Be
wary of all STYLE tags and attributes, as well as all *SRC attributes.
Also be careful about ASCII whitespace characters and their HTML entity
equivalents. These have been used to bypass HTML filters.
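
A minimal sketch of the allow-list idea described above, written in Python
rather than PHP for illustration. The ALLOWED table, the SAFE_SCHEMES tuple,
and the sanitize() helper are hypothetical names made up for this example, and
the tag set is only a placeholder. It relies on the standard library's
html.parser, and it is a starting point under those assumptions, not a
complete or proven filter.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allow list: tag name -> set of permitted attribute names.
ALLOWED = {
    "b": set(),
    "i": set(),
    "p": set(),
    "br": set(),
    "a": {"href"},
}
# Hypothetical allow list of URL prefixes for href values.
SAFE_SCHEMES = ("http://", "https://", "mailto:")


class AllowListFilter(HTMLParser):
    def __init__(self):
        # convert_charrefs=False: entities in text are reported separately,
        # so they can be passed through unchanged.
        super().__init__(convert_charrefs=False)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED:
            # Not on the list: show the raw tag as inert, escaped text.
            self.out.append(escape(self.get_starttag_text()))
            return
        kept = []
        for name, value in attrs:
            if name not in ALLOWED[tag] or value is None:
                continue  # drop style, on* handlers, src, and the rest
            # The parser has already decoded entities in the value, so
            # "java&#09;script:" arrives here as "java<TAB>script:".
            if name == "href" and not value.strip().lower().startswith(SAFE_SCHEMES):
                continue
            kept.append(' %s="%s"' % (name, escape(value, quote=True)))
        self.out.append("<%s%s>" % (tag, "".join(kept)))

    def handle_startendtag(self, tag, attrs):
        # Treat <br/> like <br>; do not emit a matching end tag.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        text = "</%s>" % tag
        self.out.append(text if tag in ALLOWED else escape(text))

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    # Comments, declarations, and processing instructions are dropped,
    # because the default handlers emit nothing.


def sanitize(html_text):
    f = AllowListFilter()
    f.feed(html_text)
    f.close()
    return "".join(f.out)
```

In this sketch, anything not explicitly listed is either escaped (tags) or
dropped (attributes), so a new tag or attribute stays blocked until someone
adds it to the list on purpose.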

This should get you fairly close to safety or at least make things
harder to bypass.
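
To illustrate the whitespace and entity tricks mentioned above, here is a
small Python sketch. The normalize_url() helper is a hypothetical name for
this example. It shows why a plain substring check on the raw text is not
enough: the value has to be entity-decoded and stripped of whitespace and
control characters before it is compared against anything.

```python
import html
import re


def normalize_url(value):
    """Decode HTML entities, then remove whitespace and control characters."""
    decoded = html.unescape(value)
    return re.sub(r"[\x00-\x20\x7f]+", "", decoded).lower()


raw = "java&#09;script:alert(1)"

# A naive check on the raw text does not see the scheme at all.
naive_hit = "javascript:" in raw.lower()

# After normalization the scheme is visible again.
normalized_hit = normalize_url(raw).startswith("javascript:")
```

Even with normalization in place, checking the result against a short list of
permitted schemes is likely safer than trying to enumerate the bad ones.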

 
Hope this helps...


Jeremiah-






