Dailydave mailing list archives

Re: DBs and Patents and Obama and Crypto

From: Zack Payton <zpayton () gmail com>
Date: Fri, 23 Oct 2015 16:39:01 -0700

I'm a huge fan of that approach to DB monitoring (passive tap -> reassemble
-> protocol parse -> parse SQL syntax -> analyze).

Applications have a limited set of queries that they make and when you can
model those queries in terms of an AST, you can remove the data (aka the
literals) and simply hash the control logic, then build a Bloom filter for
each application to do near-wire-speed lookups.  If something triggers,
either an app has updated (which is simple to verify), or the control logic
has changed via some form of injection (or, third, your baseline wasn't
accurate, which is a wholly separate topic).  No fancy machine learning
necessary.  Users are a little harder to model, but we start by assessing
how much access to data they actually exercise and alert on large
increases.  A stealthy attacker can stay under the radar by not doing
anything aggressive but most attackers will fall right into it.
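
A minimal sketch of the fingerprint-and-lookup idea above, under loud assumptions: a regex that blanks out string and numeric literals stands in for a real SQL parser/AST, and the Bloom filter is a toy built on hashlib. The names fingerprint and BloomFilter, the filter sizing, and the sample queries are all hypothetical, not anything from a real product.

```python
import hashlib
import re

# Single-quoted strings (with '' escapes) or bare numeric literals.
# A real system would strip literals from a parsed AST, not with a regex.
_LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")


def fingerprint(sql: str) -> str:
    """Blank out the data (literals), normalize, and hash what is left."""
    skeleton = _LITERAL.sub("?", sql)
    skeleton = " ".join(skeleton.split()).lower()
    return hashlib.sha256(skeleton.encode("utf-8")).hexdigest()


class BloomFilter:
    """Toy Bloom filter: false positives are possible, false negatives are not."""

    def __init__(self, size_bits: int = 1 << 16, num_hashes: int = 4):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item: str):
        # Derive num_hashes bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


# Baseline: one filter per application, filled from known-good traffic.
baseline = BloomFilter()
baseline.add(fingerprint("SELECT name FROM users WHERE id = 42"))

# Same control logic, different data: still matches the baseline.
same_shape = fingerprint("select name from users where id = 7") in baseline

# Injection changes the control logic, so the fingerprint is new.
injected = fingerprint("SELECT name FROM users WHERE id = 7 OR 1=1") in baseline
```

Because a Bloom filter can return false positives, a miss is a reliable signal that the query shape is new, while a hit only means the shape was probably seen before.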

I haven't (yet) spent the time reversing M$ wire protocols or doing this on
closed source grammars, but it is possible to do query logging and feed the
logs into the AST generator some other way.  Agents are tricky, but you can
make them lightweight by just scraping off the socket and still doing your
protocol analysis rather than a more invasive debugger-based approach.

I started down this route because most people don't really understand their
data models, or the apps and users that use them.  Once you get into that
space there are all kinds of benefits, like auditing app privileges versus
columns actually read/written to start enforcing the principle of least
privilege, spotting apps that don't use prepared
statements just from the wire protocol, analyzing source IP and command
styles for database admins, spotting shared accounts, weak passwords,
identifying highly privileged accounts, etc.  My point is there is a large
amount of data that can enable better decisions.  I don't agree
that it's as complex as the NSA signals problems, but then again I don't
see new database features being developed every 2 weeks that the devs are
just dying to take advantage of, so keeping up with ensuring your
grammars have semantic predicates for every major version of your fancy
dialect isn't really a problem.  You just gotta staff a guy that understands
compilers for when shit breaks.
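
The privilege audit mentioned above (granted access versus columns actually read or written) reduces to a set difference once the parsed traffic has been summarized. A rough sketch, assuming the grants and the observed accesses have already been extracted into simple tuples; the function name unused_privileges, the data layout, and the sample accounts and columns are all made up for illustration.

```python
from collections import defaultdict


def unused_privileges(granted, observed):
    """Return, per account, the (column, action) grants never seen in traffic.

    granted:  {account: {(column, action), ...}}   hypothetical layout
    observed: iterable of (account, column, action) from parsed queries
    """
    used = defaultdict(set)
    for account, column, action in observed:
        used[account].add((column, action))

    report = {}
    for account, privileges in granted.items():
        unused = privileges - used[account]
        if unused:
            report[account] = sorted(unused)
    return report


# Hypothetical grants for one application account.
granted = {
    "webapp": {
        ("users.name", "read"),
        ("users.name", "write"),
        ("users.password_hash", "read"),
    }
}

# Hypothetical accesses recovered from the wire over the baseline period.
observed = [
    ("webapp", "users.name", "read"),
    ("webapp", "users.name", "read"),
]

candidates_to_revoke = unused_privileges(granted, observed)
```

Anything in the report is only a candidate for revocation; how long the observation window needs to be before a grant counts as unused is a judgment call.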

When you start thinking about parsing databases like that, it starts to
remind me of the techniques applied by Ram Shankar and Sacha Faust in their
excellent Data Driven Offense talk <https://vimeo.com/133292422>, only
instead of an LDAP database, we're talking about some other type and
applying similar techniques for different ends.

Z



On Fri, Oct 23, 2015 at 9:18 AM, Dave Aitel <dave () immunityinc com> wrote:

I wanted to talk about patents in our industry, but I can't because
everyone is all like "Software patents are evil" _until they get one_ and
it gives me the sads.

So instead I'm going to talk about this company I saw yesterday, which is
basically this simple diagram:

Web App
[span port of your mid-tier] -----> [Parser for TDS] ---> [Machine learning to find SQLi]

The good thing about being on the network stack is that you can get
access to clusters. The bad thing is that every minor change of the TDS
stack or SQL syntax or anything of that nature means your system starts
failing. And you have to auto-detect all possible variation in the network
traffic because you're modeling what happens in an immensely complex piece
of software on one side that you don't have access to.

To avoid all possible ambiguity: This is an impossible problem to get
right, even if you limit it to "parse one version of TDS exactly the same
as SQL Server 2010 at a known patch level".

The other option is to install debugger-like instrumenters on every DB
server. In fact, a script to do this came out with an early version of
Immunity Debugger, which integrated with SPIKE Proxy so you could scan for
SQL Injection and use the feedback loop to guide your scanner around
filters and false positives. The downside is of course having to install
things on every DB server. In theory MS would release an API that allows a
logical "span port" that gave you every SQL request, and I bet there IS one
somewhere in the auditing section.

Aside from the horribleness of every possible solution in that area, which
probably STILL works better than a few other things, I wanted to point out
a KEY sentence you might have missed in the Crypto-War guidelines
<https://assets.documentcloud.org/documents/2426450/read-the-nsc-draft-options-paper-on-strategic.pdf>
the administration pointed out. It was this: Without voluntary
<https://lists.immunityinc.com/pipermail/dailydave/2015-September/001016.html> and
enthusiastic help from Apple and Google, really bad things we won't specify
will happen, even if we force it all to be in cleartext. That "parse all
variations of TDS" problem that we just looked at is the same as the SIGINT
problem faced by the FBI/NSA/etc. Even WITH THE KEYS, the problem is
completely intractable if Google and Apple and Microsoft want to make it
so.

I can hear Google's lawyers now: "Oh, we delivered you our latest protocol
spec sheet, every two weeks as promised. Of course, our spec changes every
two weeks right after we deliver it, and you are always out of date, and
even if you WERE in date, only our software knows which version anyone is
at at any given time, and parsing it incorrectly means you are wildly
wrong, and if you can't provide a provably correct parser, no court will
accept your analysis, etc. Hey, did we mention that every block is not
encrypted, but of course it is XORed with this value which we calculate
with the most crazy slow algorithm we could find, one million times. That's
just this week though. Next week we are reversing every block, but we
aren't going to update the version number on the wire."

Just food for thought! ;)
-dave

_______________________________________________
Dailydave mailing list
Dailydave () lists immunityinc com
https://lists.immunityinc.com/mailman/listinfo/dailydave

