Dailydave mailing list archives

Re: More from Taiwan

From: nnp <version5 () gmail com>
Date: Wed, 8 Jul 2009 22:21:43 +0100

It's quite a difficult problem really, to give an answer that is
correct a large percentage of the time without getting sucked into
more heavyweight analysis, i.e. dataflow and path conditions. One
approach is to go the whole way and just try to generate an exploit,
but this gets rather complicated quite quickly, and involves a lot of
analysis that isn't going to be feasible to run for 200,000 test cases
(I'm currently looking at ~1 hour for a single input on VLC).

Especially with things like heap overflows, the determining factor on
whether a bug is exploitable or not might not even be obvious from the
path executed by the fuzz file. You may need to also analyse how much
memory massaging you can do; an automated solution for which is
probably going to need a pretty complicated dynamic/static analysis
tool implementing something like [1], among other stuff. (I do have a
tendency to overcomplicate things though, so I'd love to hear more
hackish solutions)

Given the sheer number of crashes Ben has, there seems to be a real
opportunity to see if there are parameters of an executed path that
make the paths cluster into groups that are exploitable and not
exploitable. One option is to use an NN optimising over some parameters
of a path, although the requirement for training data might be
prohibitive. There are similar clustering algorithms that don't have
this exact drawback though. What I have in mind is something like the
vector space model used to classify similarity between documents. A
calculation is run on each path to give its position in the vector
space and then clusters can be assigned manually, or via something
like the k-means algorithm [2].
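
To make the vector space idea a bit more concrete, here is a minimal
sketch in Python. Everything in it is an assumption for illustration:
the traces are made-up lists of basic block IDs, each path is turned
into a unit-length vector of block counts (much like term counts for a
document), and the clustering is a plain Lloyd-style k-means with a
deterministic farthest-first seeding swapped in for the usual random
start. It is not a tuned or validated classifier.

```python
import math
from collections import Counter


def unit_vector(trace, dims):
    """Map a trace (list of basic block IDs) to a unit-length count vector."""
    counts = Counter(trace)
    raw = [float(counts[d]) for d in dims]
    norm = math.sqrt(sum(x * x for x in raw))
    return [x / norm for x in raw] if norm else raw


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Lloyd-style k-means; returns one cluster label per input vector."""
    # Farthest-first seeding (an assumption made here to keep runs repeatable).
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(vectors,
                       key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(list(farthest))
    labels = []
    for _ in range(iterations):
        # Assignment step: nearest centroid wins.
        labels = [min(range(k), key=lambda i: sq_dist(v, centroids[i]))
                  for v in vectors]
        # Update step: each centroid becomes the mean of its members.
        for i in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == i]
            if members:
                centroids[i] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels


# Hypothetical traces: two touch blocks 1-3, two touch blocks 7-9.
traces = [[1, 2, 3, 1, 2, 3], [1, 2, 3, 3], [7, 8, 9, 9], [7, 8, 8, 9]]
dims = sorted({block for t in traces for block in t})
vectors = [unit_vector(t, dims) for t in traces]
labels = kmeans(vectors, k=2)
print(labels)
```

Whether the resulting clusters line up with exploitable vs. not
exploitable depends entirely on which properties go into the vectors.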

The real trick/problem is in coming up with measurable properties of a
single path that, when this calculation is run, place it in/around the
correct cluster. Basic blocks executed is one; assigning unique
numbers to loops and using the number of times each loop is executed
might be another... any other ideas?
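
As a rough sketch of those two properties in Python: the function below
records which basic blocks a path executed and how often each loop ran.
The names are hypothetical, and it assumes you already know which block
IDs are loop headers (say, from some static CFG analysis). It also
approximates "times the loop is executed" by the number of times the
header block shows up in the trace, which can be off by one depending
on how the loop is laid out.

```python
from collections import Counter


def path_features(trace, loop_headers):
    """Turn one executed path into a sparse feature dict.

    trace        -- list of basic block IDs in execution order
    loop_headers -- set of block IDs known to be loop headers (assumed input)
    """
    features = Counter()
    for block in trace:
        # Presence feature: was this basic block executed at all?
        features[f"bb:{block}"] = 1
        # Count feature: header hits as a proxy for loop iterations.
        if block in loop_headers:
            features[f"loop:{block}"] += 1
    return dict(features)


# Hypothetical path: block 2 heads a loop whose body is block 3.
example = path_features([1, 2, 3, 2, 3, 2, 4], loop_headers={2})
print(example)
```

The keys of such dicts, collected over all paths, would then define the
dimensions of the vector space.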

[1] http://bitblaze.cs.berkeley.edu/papers/EECS-2009-34.pdf
[2] http://en.wikipedia.org/wiki/K-means_clustering

On Wed, Jul 8, 2009 at 11:13 AM, Dave Aitel <dave () kof immunityinc com> wrote:

Ok, so here's the thing Ben Nagy and I were going on about at lunch. I
thought I'd share it with thousands of people.

Ben's problem is that he has 200,000[2] crashes in the latest Word. Word
2007 or whatever. He classifies these problems with !exploitable from
Microsoft, which drops them into buckets of various sorts. But saying "This
is probably exploitable"[1] or not is a really hard problem - far beyond
what !exploitable is useful for. (It claims to do data tainting, but this is
clearly a misnomer?). Basically it divides things into "Definitely likely to
be exploitable because EIP is 41414141", "Pretty much likely to be
exploitable cause we're writing to bad memory" and "Everything else".

So here's my little idea (which I'm sure everyone else has had at least
twice cause I'm not a special snowflake): Take each basic block and number
it. Execute the program twice, once with your crashing file, and once with
your template. This generates two signals, which have a stream of numbers in
them (from the execution trace). Then you can do interesting things by
converting to the frequency domain (i.e. FFT?) and doing filtering and
visualization. Ben thinks you want to attach state to your numbers too (i.e.
memory and register info?). I'm not so keen on that because I think too much
data can be as bad as too little, but whatever. Each to their own.
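
For what it's worth, a rough Python sketch of the numbering and
frequency-domain steps described above. The block addresses are
made up, the shorter trace is zero-padded so both signals have the same
number of bins (an assumption, there are other ways to align them), and
the transform is a naive O(n^2) DFT rather than a real FFT; an FFT
gives the same values faster.

```python
import cmath


def number_blocks(*traces):
    """Assign each distinct basic block address a small integer, from 1."""
    ids = {}
    for trace in traces:
        for addr in trace:
            ids.setdefault(addr, len(ids) + 1)
    return ids


def to_signal(trace, ids, length):
    """Convert a trace to its stream of block numbers, zero-padded."""
    signal = [float(ids[addr]) for addr in trace]
    return signal + [0.0] * (length - len(signal))


def magnitude_spectrum(signal):
    """Naive discrete Fourier transform; returns |X[k]| for each bin k."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal)))
        for k in range(n)
    ]


# Hypothetical execution traces (basic block start addresses).
template_trace = [0x401000, 0x401010, 0x401020, 0x401010, 0x401020, 0x401030]
crash_trace = [0x401000, 0x401010, 0x401020, 0x401040]

ids = number_blocks(template_trace, crash_trace)
length = max(len(template_trace), len(crash_trace))
template_signal = to_signal(template_trace, ids, length)
crash_signal = to_signal(crash_trace, ids, length)

template_spectrum = magnitude_spectrum(template_signal)
crash_spectrum = magnitude_spectrum(crash_signal)
print(template_spectrum)
print(crash_spectrum)
```

One thing to keep in mind is that the block numbers are arbitrary
labels, so the spectrum depends on how the blocks were numbered.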

I'm not sure what the interesting thing here is that magically tells you
something is worth really digging into? Maybe you take your two signals, and
subtract their frequencies and visualize how different they are? Throw that
at an HMM/NN and make it tell you something?
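
A minimal Python sketch of the "subtract and look at it" step, assuming
you already have two magnitude spectra with the same number of bins.
The spectra below are made-up numbers, and the distance and ASCII bars
are just one cheap way to get a single score and a picture; nothing
here says which differences actually matter.

```python
import math


def spectral_difference(spec_a, spec_b):
    """Per-bin difference of two equal-length magnitude spectra."""
    if len(spec_a) != len(spec_b):
        raise ValueError("spectra must have the same number of bins")
    return [a - b for a, b in zip(spec_a, spec_b)]


def spectral_distance(spec_a, spec_b):
    """Euclidean norm of the per-bin difference, as a single score."""
    diff = spectral_difference(spec_a, spec_b)
    return math.sqrt(sum(d * d for d in diff))


def ascii_bars(diff, width=40):
    """Crude text visualisation: one bar per bin, scaled to the largest."""
    peak = max((abs(d) for d in diff), default=0.0)
    lines = []
    for k, d in enumerate(diff):
        bar = "#" * (round(abs(d) / peak * width) if peak else 0)
        sign = "+" if d >= 0 else "-"
        lines.append(f"{k:3d} {sign} {bar}")
    return "\n".join(lines)


# Hypothetical spectra for the template run and the crashing run.
template_spectrum = [15.0, 2.0, 1.0]
crash_spectrum = [11.0, 4.0, 1.0]

diff = spectral_difference(template_spectrum, crash_spectrum)
print(spectral_distance(template_spectrum, crash_spectrum))
print(ascii_bars(diff))
```

The per-bin differences (or the single distance) could then be fed to
whatever classifier you like as input features.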

-dave

[1] Ben: Do you have a !exploitable in Immunity Debugger? Me: Yes, it just
returns true. :>
[2] Literally.

_______________________________________________
Dailydave mailing list
Dailydave () lists immunitysec com
http://lists.immunitysec.com/mailman/listinfo/dailydave

