Secure Coding mailing list archives

Lateral SQL injection paper


From: Everhart at gce.com (Mary and Glenn Everhart)
Date: Tue, 29 Apr 2008 20:04:48 -0400

Let me suggest something a little different:
Perhaps when speaking of web app security, an already enormous area, it is
not so useful to enlarge it still more, but "fools rush in...".

One way to look at web code (and many other kinds) is that we are sending
strings to an interpreter and it does things. What makes security hard is
that the underlying interpreter doesn't give us much (any?) help in figuring
out what set of functions/operations gets performed. So if we put together
some string to pass, which we want to do some set of things, and it does
some different set because an attacker supplied part of it, we don't have
any easy way to find out or do anything about it.

(This is easier to see with SQL or other languages where the string passing
tends to be easier to identify.)

Suppose the interpreter were made to count how many times major functions
ran - stuff from its parse tree - and make some kind of hash or structure
that encapsulated these counts (or even the functions only, counting just
"done" or "not done"), and returned that the first time it was run, and gave
a way to rerun the call a second time if this count were what was wanted?

You'd need a way to train your app in what was wanted, or otherwise somehow
figure out these hashes or structures without editing every time, and you'd
need a way to get the underlying interpreter to check what was passed to it,
compare the "legality" functions, and finally execute what was legal. (This
can be viewed as an access control system if you like.)
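
A minimal sketch of the idea above, assuming a crude regex tokenizer as a
stand-in for the interpreter's real parse tree. The names `fingerprint`,
`ProfiledGate`, and `lookup_sql` are hypothetical; a real version would need
the interpreter itself to report what it is about to do.

```python
import hashlib
import re
from collections import Counter

# Crude stand-in for a parse tree: literals collapse to placeholders,
# everything else is kept as a structural token.
_TOKEN = re.compile(
    r"""
    '(?:[^']|'')*'            # complete single-quoted string literal
  | \d+(?:\.\d+)?             # numeric literal
  | [A-Za-z_][A-Za-z_0-9]*    # keyword or identifier
  | \S                        # any other non-space character
    """,
    re.VERBOSE,
)


def fingerprint(sql):
    """Hash the counts of structural tokens, ignoring literal values."""
    counts = Counter()
    for tok in _TOKEN.findall(sql):
        if len(tok) >= 2 and tok[0] == "'" and tok[-1] == "'":
            counts["<str>"] += 1
        elif tok[0].isdigit():
            counts["<num>"] += 1
        else:
            counts[tok.upper()] += 1
    canonical = ";".join(f"{name}={n}" for name, n in sorted(counts.items()))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ProfiledGate:
    """Allows only statements whose fingerprint was seen during training."""

    def __init__(self):
        self._approved = set()

    def train(self, sql):
        self._approved.add(fingerprint(sql))

    def allows(self, sql):
        return fingerprint(sql) in self._approved


def lookup_sql(user_input):
    # Deliberately unsafe string building, the kind of code the gate guards.
    return "SELECT name FROM users WHERE id = '" + user_input + "'"


gate = ProfiledGate()
gate.train(lookup_sql("alice"))
print(gate.allows(lookup_sql("bob")))            # True: same structure
print(gate.allows(lookup_sql("x' OR '1'='1")))   # False: extra OR and literals
```

The gate only compares shapes, so it cannot tell a legitimate new query from
an attack; anything not seen in training is refused, which is the "train your
app" burden described above.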

But might such a system not give a way to keep web apps (or others) from 
doing unexpected things?

The next question might be: could a web page be constructed so that this
kind of thing might be done, altering only logic at the server?

If it can be, then, would it not make sense to think about building a server
or servers with such properties available, so that one could write a web
site that would tend only to behave in predictable ways?

Or would such a thing so constrain what could be done that it is useless?

It seems to me that the numerous attacks are such that removing them one at
a time is a bit like using a hammer to wipe out a roach infestation, and
some more generic approaches should be asked about. But what about it? Does
anyone have some suggestions that might be generic and might be possible to
implement a site at a time?

Glenn C. Everhart
Everhart at gce.com


Arian J. Evans wrote:
So I'd like to pull this back to a few salient points. Weirdly,
some folks seem quick to dismiss the paper with a
didactic shot of "folks shouldn't code that way anyway"
which has nothing to do with the subject.

1. I think everyone on SC-L gets the idea of strong
patterns and implementations, and why parameterized
SQL is a good thing, and why cached queries are also
a good thing (for performance, at least, and security if
by doing so you enforce avoidance of EXEC())

2. David's paper is interesting, because out in the real
world people do not, and sometimes cannot, follow
ideal patterns, command patterns, and/or implementations
that are safe. (e.g. delegation of privilege on Windows
accessing the DB for security inheritance vs. the negative
impact to thread pooling and process safety -- it is
a real tradeoff, and *never* made on the side of security)

David's paper is interesting because out in the real
world people still follow many borderline unsafe practices
and understanding new attack vectors is essential to
assessing risk, and understanding whether refactoring,
or hotfixing, vs. logging, filtering, or *ignoring* the code,
is the right business choice to make.

David's example is more CVE instance than CWE class.

--

Steven, I like the grouping of your two main abstractions
below; for purpose of discussion & education I like to put
these together a little differently into Semantic and Syntax
software security-defect buckets. I'm curious what your
thoughts are (and take this offline if the response is too tangential)


1. Semantic -- I place message structure, delimiting,
and all entailments of semantic conversation, including
implications of use-case and business rules here, where
the latter relate to enforcing specific semantic user/caller-
dialogues with the application.

I place callee requirement to enforce workflow, order,
message structure, state and sequence, and *privilege* here.

2. Syntax -- at heart we have a data/function boundary
problem, right? And most modern implementation level
languages do not give us constructs to address/enforce
this, so all our kludged workarounds, from stack canaries
to crappy \ escaping in SQL to attempts to use HTML
named entities to encode output, fall into this grouping.

I place callee requirements for everything to do with
message encoding, canonicalization, buffer, and
case (i.e., all syntax issues) into this grouping.

Now, arguably you could call a buffer or heap overflow
semantic, if you argue it's privilege related, but I
would say that is a result of language defects (or
realities) and it still starts syntactically.

Where would you put the recent URI-handler issues
in this structure?

Why did you specify privilege burden on the caller?

I tend to leave out/ignore the caller responsibilities
when I am thinking of software. This could be a
bias of predominantly web-centric (and db client/server
where I don't control the client) programming and
design over the years.

While it makes sense to enforce some syntax
structure upon the caller, in general I tend to
put all semantic responsibilities upon the callee,
and even assume the callee should enforce
some notion of syntax requirements upon
the caller, and feed said back to caller.

-- 
Arian J. Evans.

I spend most of my money on motorcycles, mistresses, and martinis. The 
rest of it I squander.



On Tue, Apr 29, 2008 at 9:10 AM, Steven M. Christey
<coley at linus.mitre.org> wrote:


    On Tue, 29 Apr 2008, Joe Teff wrote:

    > > If I use Parameterized queries w/ binding of all variables, I'm 100%
    > > immune to SQL Injection.
    >
    > Sure. You've protected one app and transferred risk to any other
    > process/app that uses the data. If they use that data to create dynamic
    > sql, then what?

    Let's call these "using apps" for clarity of the rest of this post.

    I think it's the fault of the "using apps" for not validating their own
    data.
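
The exchange above can be sketched with Python's standard `sqlite3` module.
This is an illustration under assumptions, not anyone's actual code: the
`users` table, its contents, and the hostile string are all made up. The
first app stores the data safely with a bound parameter; the "using app"
then splices the stored value into dynamic SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('admin', 'admin')")

# First app: a bound parameter stores the hostile text verbatim and safely.
hostile = "x' OR role = 'admin"
conn.execute("INSERT INTO users VALUES (?, 'guest')", (hostile,))

# "Using app": reads the stored value back and builds dynamic SQL with it.
stored = conn.execute(
    "SELECT name FROM users WHERE role = 'guest'"
).fetchone()[0]
dynamic = "SELECT name, role FROM users WHERE name = '" + stored + "'"
leaked = conn.execute(dynamic).fetchall()

# The same lookup with binding in the using app treats the value as data.
safe = conn.execute(
    "SELECT name, role FROM users WHERE name = ?", (stored,)
).fetchall()

print(leaked)  # [('admin', 'admin')]
print(safe)    # [("x' OR role = 'admin", 'guest')]
```

The stored value was never altered by the first app; the injection happens
only where the using app builds SQL by concatenation.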

    Here's a pathological and hopefully humorous example.

    Suppose you want to protect those "using apps" against all forms of
    attack.

    How can you protect every "using app" against SQL injection, XSS, *and*
    OS command injection?  Protecting against XSS (say, by setting "<" to
    "&lt;" and other things) suddenly creates an OS command injection
    scenario because "&" and ";" typically have special meaning in Unix
    system() calls.  Quoting against SQL injection "\'" will probably fool
    some XSS protection mechanisms and/or insert quotes after they'd already
    been stripped.
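
A small illustration of the clash described above, using only Python's
standard `html` and `shlex` modules. The sample string and the set of shell
metacharacters are assumptions chosen for the demonstration, and nothing is
actually passed to a shell.

```python
import html
import shlex

untrusted = "<b>hi</b>"

# Encoding for the HTML sink removes "<" and ">" ...
for_html = html.escape(untrusted)

# ... but the result now carries characters that a Unix shell treats
# specially, so it is not "safe" for a system()-style sink.
shell_meta = set("&;|<>$`")
leftover = sorted(set(for_html) & shell_meta)

# Each sink needs its own encoding, applied to the raw value at that sink.
for_shell = shlex.quote(untrusted)

print(for_html)   # &lt;b&gt;hi&lt;/b&gt;
print(leftover)   # ['&', ';']
print(for_shell)  # '<b>hi</b>'
```

No single pre-encoded form of the value is correct for both sinks, which is
the point of the pathological example.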

    As a result, the only safe data would be alphanumeric without any
    spaces - after all, you want to protect your "using apps" against
    whitespace, because that's what's used to introduce new arguments.

    But wait - buffer overflows happen all the time with long alphanumeric
    strings, and Metasploit is chock full of alpha-only shellcode, so
    arbitrary code execution is still a major risk.  So we'll have to trim
    the alphanumeric strings to... hmmm... one character long.

    But, a one-character string will probably be too short for some "using
    apps" and will trigger null pointer dereferences due to failed error
    checking.  Worse, maybe there's a buffer underflow if the using app does
    some negative offset calculations assuming a minimum buffer size.

    And what if we're providing a numeric string that the using app might
    treat as an array index?  So, anything that looks like an ID should be
    scrubbed to a safe value, say, 1, since presumably the programmer
    doesn't allocate 0-size arrays.  But wait, a user ID of "1" is often
    used to identify the admin in a using app, so this would be tantamount
    to giving everyone admin privileges!  We shouldn't accept any numbers at
    all.

    And, we periodically see issues where an attacker can bypass a
    lowercase-only protection mechanism by using uppercase, so we'd best set
    the characters to all-upper or all-lower.

    So, maybe the best way to be sure we're protecting "using apps" is to
    send them no data at all (which will still trigger crashes in apps that
    assume they'll be hearing from someone eventually).

    Or, barring that, you pass along some meta-data that explicitly states
    what protections have or have not been applied to the data you're
    sending - along with an integrity check of your claims.
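
One way the meta-data idea above might look, as a sketch only. It assumes
the sender and the using app share a secret key, and the names `SHARED_KEY`,
`seal`, and `open_sealed` are hypothetical; the post does not specify any
particular mechanism.

```python
import hashlib
import hmac
import json

# Assumption: both sides hold this key. Demo value only.
SHARED_KEY = b"demo-key-not-for-production"


def seal(data, applied):
    """Bundle data with the list of protections applied, plus an HMAC tag."""
    body = json.dumps({"data": data, "applied": applied}, sort_keys=True)
    tag = hmac.new(SHARED_KEY, body.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"body": body, "tag": tag}


def open_sealed(msg):
    """Verify the tag before trusting the claims about applied protections."""
    expected = hmac.new(
        SHARED_KEY, msg["body"].encode("utf-8"), hashlib.sha256
    ).hexdigest()
    if not hmac.compare_digest(expected, msg["tag"]):
        raise ValueError("integrity check failed")
    return json.loads(msg["body"])


msg = seal("&lt;b&gt;hi&lt;/b&gt;", ["html-escaped"])
claims = open_sealed(msg)
print(claims["applied"])  # ['html-escaped']
```

The check only helps a using app that actually calls something like
`open_sealed` before consuming the data.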

    Of course, some "using apps" won't check that integrity and will accept
    bad data from anywhere, not just you, so they'll be vulnerable again,
    despite your best intentions.

    The alternate approach is to pick and choose which vulns you'll protect
    using apps against.  But then, if you've protected a using app against
    SQL injection, but it moves to a non-database model instead, you've just
    broken your legitimate functionality.  So, you're stuck with modeling
    which using apps are using which technologies and might be subject to
    which vulns.  You will also need a complete model of what the using
    app's behaviors are, and you'll need to keep different models for each
    different version and operating environment.  This will become brittle
    and quickly unmaintainable, and eventually introduce unrelated security
    issues as a result of that brittleness.

    To my current way of thinking, the two main areas of responsibility are:

    - for the caller to make sure that the request/message is perfectly
    structured and delimited, and semantically correct for what the caller
    is asking the callee to do.  The current browser URI handler
    vulnerabilities, and argument injection in general, are examples of
    violations of this responsibility.

    - for the caller, given any arbitrary message/request, to prove (or
    enforce) that it is well-formed, to make sure that the caller has the
    appropriate privileges to make that message/request in the first place,
    and to protect itself against SQL injection when interacting with a DB,
    against XSS when printing out to a web page, etc.
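
The two responsibilities listed above can be sketched as a pair of
functions: one on the sending side that builds a well-structured request,
and one on the receiving side that re-checks form and privilege before
acting. Everything here is hypothetical (the `Request` type, the
`PERMISSIONS` table, the function names); it is one possible reading, not a
prescribed design.

```python
from dataclasses import dataclass

# Assumption: a toy privilege table keyed by user name.
PERMISSIONS = {"alice": {"read"}, "root": {"read", "delete"}}


@dataclass(frozen=True)
class Request:
    user: str
    action: str
    record_id: int


def build_request(user, action, record_id):
    """Sending side: produce a structured message, never a spliced string."""
    # int() rejects input such as "7; rm -rf /" with ValueError.
    return Request(user=user, action=action, record_id=int(record_id))


def handle(req):
    """Receiving side: verify form and privilege, whoever sent the message."""
    if not isinstance(req.record_id, int) or req.record_id < 0:
        raise ValueError("malformed record id")
    if req.action not in PERMISSIONS.get(req.user, set()):
        raise PermissionError(f"{req.user} may not {req.action}")
    return f"{req.action} record {req.record_id}"


print(handle(build_request("alice", "read", "7")))  # read record 7
```

The receiving side repeats the structural check even though the sender
already did it, since it cannot assume every sender is well behaved.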


    I recognize that you might not have a choice with stovepipe or legacy
    applications, or in proxy/firewall code that resides between two
    components.  I feel for anyone wrestling with those problems.  But,
    "protect using apps against themselves" as general advice seems fraught
    with peril.

    - Steve
    _______________________________________________
    Secure Coding mailing list (SC-L) SC-L at securecoding.org
    List information, subscriptions, etc -
    http://krvw.com/mailman/listinfo/sc-l
    List charter available at -
    http://www.securecoding.org/list/charter.php
    SC-L is hosted and moderated by KRvW Associates, LLC
    (http://www.KRvW.com)
    as a free, non-commercial service to the software security community.
    _______________________________________________


