Full Disclosure mailing list archives

Re: FW: Introducing a new generic approach to detecting SQL injection


From: "Paul J. Morris" <mole () morris net>
Date: Fri, 22 Apr 2005 16:39:05 -0400

On Fri, 22 Apr 2005 15:26:41 -0400
Mohit Muthanna <mohit.muthanna () gmail com> wrote:
> > Once the allowed character set gets beyond $sanitized =
> > preg_replace("/[^a-zA-Z0-9]/", "", $untrusted) especially into the
> > realm

> Don't use simple regexp matching.
Why not?  I am not matching known attacks, I am stripping everything but
a small set of known good characters.  How are you going to construct a
sql injection attack using the character set [A-Za-z]?   Yes, you can
try to overflow preg_replace (or the dbms if I don't truncate your
input), but the set [A-Za-z] isn't going to enable a sql injection
attack.  If I have a single field being submitted from a form where the
characters in a legitimate query will only be in the set [A-Za-z], I
know with certainty that $sanitized will not contain a sql injection
attack if I filter it with $sanitized = preg_replace("/[^a-zA-Z]/", "",
$untrusted), regardless of any other dependencies (e.g. with php, I am
not dependent on the settings of safe_mode or of magic_quotes_gpc).  If
the set of legitimate characters includes quote characters or slashes or
the like, then I entirely agree with you that escaping and encoding
libraries are an important element.
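
A minimal sketch of the known-good filter described above, written in Python rather than PHP so that it runs standalone.  The function name sanitize_letters and the 64-character limit are assumptions made for illustration and do not come from the thread; the pattern is the same [^a-zA-Z] that is passed to preg_replace.

```python
import re

# Hypothetical upper bound on field length; pick one that fits the column.
MAX_LEN = 64

def sanitize_letters(untrusted: str, max_len: int = MAX_LEN) -> str:
    """Drop everything except ASCII letters, then truncate the result."""
    # In a str pattern the ranges a-z and A-Z cover ASCII letters only,
    # so quotes, spaces, digits and non-ASCII characters are all removed.
    letters_only = re.sub(r"[^a-zA-Z]", "", untrusted)
    # Truncate so an oversized value is never handed on to the DBMS.
    return letters_only[:max_len]

print(sanitize_letters("Robert'); DROP TABLE students;--"))
```

Nothing that could end a string literal or a statement survives the filter, which is why it only suits fields whose legitimate values are purely alphabetic.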

> This technique, though novel, is really
> _reckless_misuse_of_resources_.
    Agreed, most of the time there are better and more efficient ways to
handle the problem.  I find it interesting because it appears (and I'm
not sure that this is true) to rely on passing the known good rather
than filtering out a set of known attacks.

> I'll reiterate; unless your regexp is robustly tested don't use it.
> There are many libraries out there for URL/Base64/Unicode/etc. etc.
> encoding, decoding and escaping. Use them to clean up your input.
     I have seen too many discussions of ways to get around escaping of
attack characters by interesting twists on encoding to be sure that the
library I choose has thought of all of the possible ways around the
decoding and escaping.   Encoding/decoding/escaping relies on the
library recognizing known attack characters, something it may be very
good at, but something experience has taught us is hard to do.  

> If your database API supports it, use prepared statements and
> parameter binding.
    Agreed.  Prepared statements are a very powerful tool, when
available.
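
For illustration, a small sketch of parameter binding using Python's standard sqlite3 module with an in-memory database.  The table specimens, the column genus and the helper find_by_genus are hypothetical names chosen for the example; other database APIs use different placeholder syntax.

```python
import sqlite3

# In-memory database with a hypothetical table, just for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE specimens (id INTEGER PRIMARY KEY, genus TEXT)")
conn.executemany(
    "INSERT INTO specimens (genus) VALUES (?)",
    [("Helix",), ("Conus",)],
)

def find_by_genus(connection: sqlite3.Connection, genus: str) -> list:
    """Look up rows by genus; the value is bound, never spliced into the SQL."""
    cursor = connection.execute(
        "SELECT id, genus FROM specimens WHERE genus = ?",
        (genus,),
    )
    return cursor.fetchall()

print(find_by_genus(conn, "Helix"))
# The whole hostile string is compared as data, so it matches no row.
print(find_by_genus(conn, "Helix' OR '1'='1"))
```

Because the value travels separately from the statement text, the quote characters in the second call never take part in parsing the query.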

> Don't use simple string interpolation (without quote handling).
    I don't see the rationale for this.  The rationale for never filtering
out known bad characters is clear, but filtering out all but a small set
of known good characters seems the simplest and surest way of sanitizing
input.  

> It's really that easy.
    In the realm of multibyte characters with multiple kinds of clients
I'm not at all convinced it is.  I don't know that an attacker isn't
going to encode a query terminating character in a way that is going to
get through the decoding and escaping.  The fundamental principle of
escaping is that of recognizing known bad characters - something that
experience teaches us is inferior to allowing only known good characters
through. 
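
To make the worry concrete, here is a hedged sketch in Python contrasting a known-bad filter with a known-good one.  Both naive_escape and downstream_fold are hypothetical: the first stands in for an escaper that recognizes only the ASCII single quote, the second for any later layer that folds compatibility characters to ASCII.  Whether a real client, driver or DBMS performs such folding is an assumption here, not a claim about any particular library.

```python
import re
import unicodedata

def naive_escape(untrusted: str) -> str:
    # Hypothetical known-bad filter: doubles only the ASCII single quote.
    return untrusted.replace("'", "''")

def allow_letters(untrusted: str) -> str:
    # Known-good filter: keep ASCII letters, drop everything else.
    return re.sub(r"[^a-zA-Z]", "", untrusted)

def downstream_fold(text: str) -> str:
    # Assumed stand-in for a later layer; NFKC maps the fullwidth
    # apostrophe U+FF07 to the ASCII apostrophe U+0027.
    return unicodedata.normalize("NFKC", text)

payload = "x\uff07 OR 1=1"  # fullwidth apostrophe, not an ASCII quote

escaped = downstream_fold(naive_escape(payload))
allowed = downstream_fold(allow_letters(payload))

print(repr(escaped))  # an unescaped ASCII quote appears after folding
print(repr(allowed))  # only letters remain
```

The escaper never sees a character it recognizes, so the quote arrives unescaped once folded; the allow-list has nothing left to fold.
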
> --
> Mohit Muthanna [mohit (at) muthanna (uhuh) com]

Merry Snailing,
-Paul
--------------
Paul J. Morris
Biodiversity Information Manager, The Academy of Natural Sciences
1900 Ben Franklin Parkway, Philadelphia PA, 19103, USA
mole () morris net  AA3SD  PGP public key available

_______________________________________________
Full-Disclosure - We believe in it.
Charter: http://lists.grok.org.uk/full-disclosure-charter.html
Hosted and sponsored by Secunia - http://secunia.com/
