WebApp Sec mailing list archives

Re: Input validation


From: Jeremiah Grossman <jeremiah () whitehatsec com>
Date: 19 Jun 2003 19:37:48 -0700

On Thu, 2003-06-19 at 10:38, Kooper, Larry wrote:
I am a newbie to this list - apologies if this question is often asked.  (I
don't know if the list has a FAQ).  

When securing a web site against attacks such as SQL injection and XSS, what
approach do you recommend following to validate user input?  

1) Attempt to massage data so that it becomes valid
2) Reject input that is known to be bad
3) Accept only input that is known to be good

(The three categories are taken from this paper, p. 22:
http://www.nextgenss.com/papers/advanced_sql_injection.pdf)

The problem with solutions 1 and 2 is that you may miss some forms of bad
input.  Another subtle problem with solutions 1 and 2 is that a "bad" string
can be embedded in good input.  For example, if someone searches for
"director's selections" and the filter strips the string "select" (as a SQL
command), the result is "director's ions."
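
The embedding problem can be reproduced in a few lines. This is only a
sketch: the one-word blacklist and the function name strip_bad_words are
made up for illustration and do not come from the paper or from any real
filter.

```python
import re

# Hypothetical blacklist for illustration: a single SQL keyword.
BLACKLIST = re.compile(r"select", re.IGNORECASE)

def strip_bad_words(value: str) -> str:
    """Naive "massage" filter: delete blacklisted substrings wherever they occur."""
    return BLACKLIST.sub("", value)

# A harmless search term is damaged because "select" is embedded in "selections".
print(strip_bad_words("director's selections"))
```

The filter cannot tell a keyword from a substring of ordinary text, so the
legitimate search is corrupted.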

Solution 3 seems like the most secure but also the most expensive to
implement.  And the problem seems more difficult when validating free-format
fields such as a name or an address.  One could reject non-alphanumeric
characters, but then things like # (for apartment number) or - (hyphen)
would be kicked out. Any thoughts?



I personally like #3. Proper sanity checking can be difficult to
implement in some cases, but it is probably less difficult than
massaging data back into conformity as suggested by #1. I personally
find the hardest part is not the code itself, but remembering to do the
sanity checking on all input and not becoming lazy in the process.

The three main things I do when sanity checking input to keep things
safe are:

a character-set check, a length check, and escaping all input. That
means making sure I only get the characters I expect, within the
min/max length I expect, and always escaping all data. For anything
else, I kick out an error and never echo user-supplied input.
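
As a rough illustration of those three checks, here is a minimal Python
sketch. The allowed character set, the length limits, the function name
check_input, and the choice of HTML escaping are all assumptions made for
the example, not something stated in this thread. The right escaping
depends on where the data ends up; for SQL, bound parameters are the usual
choice rather than hand escaping.

```python
import html
import re

# Assumed whitelist: ASCII letters, digits, space, and a few punctuation marks.
ALLOWED = re.compile(r"[A-Za-z0-9 #\-.,']+")
MIN_LEN, MAX_LEN = 1, 60  # assumed limits for a short free-text field

def check_input(value: str) -> str:
    """Return an HTML-escaped copy of value, or raise ValueError."""
    # Length check: reject anything outside the expected size.
    if not (MIN_LEN <= len(value) <= MAX_LEN):
        raise ValueError("invalid length")
    # Character-set check: the whole string must consist of allowed characters.
    if ALLOWED.fullmatch(value) is None:
        raise ValueError("invalid characters")
    # Escape before the value is written into HTML output.
    return html.escape(value, quote=True)
```

The error messages are fixed strings on purpose, so a rejected value is
never echoed back to the user.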

About the # and - characters: escaping should solve the problem, or you
can simply allow a few more characters in your set. Just watch those
meta characters.

For your apt. question, my regex skills on display:
/^([\d#\-]{1,5})$/
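
For anyone who wants to try that pattern outside Perl, a small Python
sketch follows. The function name is_valid_apt is hypothetical. Two details
are choices of this sketch rather than part of the regex above: fullmatch
is used in place of the ^ and $ anchors so that a trailing newline is not
accepted, and re.ASCII restricts \d to the digits 0-9.

```python
import re

# Same character class and length bounds as the pattern above: 1 to 5
# characters, each a digit, '#', or '-'.
APT = re.compile(r"[\d#\-]{1,5}", re.ASCII)

def is_valid_apt(value: str) -> bool:
    """True if value is 1 to 5 characters drawn from digits, '#', and '-'."""
    return APT.fullmatch(value) is not None

for sample in ["#12", "4-2", "123456", "12B", ""]:
    print(repr(sample), is_valid_apt(sample))
```

Note that letters are not in the set, so a unit such as "12B" is rejected
until the character class is widened.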


regards,


Jer-










