WebApp Sec mailing list archives

Re: Data sanitization approaches in Java


From: Stephen de Vries <stephen () twisteddelight org>
Date: Mon, 17 Jan 2005 11:38:11 +0000


Hi Ben,

While sanitising data is best left to custom-built functions, it is possible to implement some form of data validation using the validation frameworks in Apache Struts and JavaServer Faces (the new MVC kid on the block). Faces allows you to do things like specify the patterns for valid data right in the presentation layer. Some implementations of Faces provide only minimal built-in validators (and let you write your own), while others provide a number of commonly used validators in ready-to-use form. The open source MyFaces implementation (www.myfaces.org) provides some such validators. For example, the following snippet could be used to check user-supplied input against 1) a valid email address and 2) a regular expression:

<h:inputText id="email" value="#{validateForm.email}" required="true">
     <f:validator validatorId="net.sourceforge.myfaces.validator.Email"/>
</h:inputText>

<h:inputText id="regExprValue" value="#{validateForm.regExpr}" required="true">
     <x:validateRegExpr pattern='\d{5}' />
</h:inputText>
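
For readers who want the same checks outside the presentation layer, here is a minimal plain-Java sketch. The class and method names (InputPatterns, isFiveDigits, isPlausibleEmail) are hypothetical, the email pattern is a deliberately simple assumption rather than the rule MyFaces applies, and the sketch assumes the whole value must match the pattern:

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of MyFaces or JSF.
final class InputPatterns {
    // Same pattern as the validateRegExpr example above: five digits.
    private static final Pattern FIVE_DIGITS = Pattern.compile("\\d{5}");

    // Simplified email shape (an assumption): local part, '@', dotted host name.
    private static final Pattern SIMPLE_EMAIL =
            Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\\.[A-Za-z0-9-]+)+");

    // matches() requires the entire input to match, so trailing junk is rejected.
    static boolean isFiveDigits(String input) {
        return input != null && FIVE_DIGITS.matcher(input).matches();
    }

    static boolean isPlausibleEmail(String input) {
        return input != null && SIMPLE_EMAIL.matcher(input).matches();
    }
}
```

Using matches() rather than find() is the important design choice here: a value such as "12345<script>" contains five digits but is still rejected.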

For more info on JSF, see: http://www.jsfcentral.com/ and the Java Tutorial http://java.sun.com/docs/books/tutorial/ (there's a chapter on JSF).

Stephen


On 16 Jan 2005, at 14:11, Jeff Williams wrote:

Ben,

I did a presentation at last year's OWASP AppSec conference on this subject. There's a link to the presentations on the conference page (http://www.owasp.org/conferences/appsec2004nyc.html).

Essentially, the approaches range from completely external (deep packet inspection/web app firewall), to a web server plugin (mod_security), to a J2EE filter, to a common validation library, to just doing it everywhere in your code. There are advantages and disadvantages to all of them, although I find the J2EE filter approach to be the most flexible.
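
As an illustration of the kind of check a filter or common validation library might centralize, here is a minimal sketch of a parameter whitelist. It is not taken from the presentation; the class name ParameterWhitelist and its methods are hypothetical, and the servlet plumbing is left out so the example stays self-contained:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper: one regular expression per expected parameter name.
final class ParameterWhitelist {
    private final Map<String, Pattern> rules = new HashMap<>();

    // Registers the pattern that every value of the named parameter must fully match.
    ParameterWhitelist allow(String name, String regex) {
        rules.put(name, Pattern.compile(regex));
        return this;
    }

    // True only if every parameter is known and every value matches its rule.
    boolean accepts(Map<String, String[]> parameters) {
        for (Map.Entry<String, String[]> entry : parameters.entrySet()) {
            Pattern rule = rules.get(entry.getKey());
            if (rule == null) {
                return false; // unexpected parameter name
            }
            for (String value : entry.getValue()) {
                if (value == null || !rule.matcher(value).matches()) {
                    return false; // value outside the allowed pattern
                }
            }
        }
        return true;
    }
}
```

In a servlet filter, doFilter would typically pass the request's parameter map to accepts() and reject the request (for example with a 400 response) when it returns false, so the application code behind the filter only ever sees parameters that passed.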

Also, I noticed that you use the word "sanitization" -- did you mean actually modifying the input data? This is a little tricky in J2EE, although possible. If that's what you're after, let me know.

Oh, and URL encoding is really not a very good idea. Many interpreters just decode URL encoding automatically. HTML entity encoded data (&lt; &gt; &quot;) is generally not interpreted. There's not an HtmlEntityEncoder built into J2EE, so you'll have to roll your own. I could post one if there's interest.
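
To make the idea concrete, here is a minimal sketch of what a hand-rolled entity encoder could look like. It is not the encoder offered above; the class name HtmlEntityEncoder is borrowed from the text, and the choice to also encode the ampersand and single quote is an assumption:

```java
// Hypothetical sketch of a roll-your-own HTML entity encoder.
final class HtmlEntityEncoder {
    // Replaces the characters that are significant in HTML with entity references.
    static String encode(String input) {
        if (input == null) {
            return "";
        }
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '&':  out.append("&amp;");  break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break; // numeric form of the single quote
                default:   out.append(c);
            }
        }
        return out.toString();
    }
}
```

Encoding is meant to be applied once, at the point where data is written into an HTML page; running already-encoded text through it again turns &lt; into &amp;lt;.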

--Jeff

Jeff Williams, CEO
Aspect Security, Inc.
http://www.aspectsecurity.com

----- Original Message -----
From: "Benjamin Livshits" <livshits () cs stanford edu>
To: <webappsec () securityfocus com>
Sent: Friday, January 14, 2005 4:20 PM
Subject: Data sanitization approaches in Java


I was wondering about data sanitization strategies commonly used in
today's Web applications, especially those written using J2EE. I am
aware of libraries that would simplify the sanitization process for
you; however, I haven't really seen many applications that use
anything more sophisticated than URL-encoding the user-supplied
string data.

Are there some common sanitization strategies that people actually use
in their code on a regular basis?

Thanks in advance,
-Ben

