Dailydave mailing list archives

Re: More from Taiwan


From: "Piotr Bania" <bania.piotr () gmail com>
Date: Wed, 8 Jul 2009 14:45:04 +0200

Yo,

I think dynamic data flow analysis (register/memory tracking, i.e. taint
analysis, etc.) can give you a lot of answers in this case. Basically, you
can analyze how the input data (let's say a fuzzed .doc* file) influences
the execution flow of a program (in this case Microsoft Word). Whenever an
exception happens, you can test whether the faulting instruction used
operands that were previously tainted (i.e. came from the input, directly
or indirectly). That's why I created SpiderPig [1].
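
A rough sketch of the tainted-operand check described above, in Python. This
is not how SpiderPig works internally; the instruction tuple format, the
location names and both function names are made up for illustration, and a
real tool would track taint per byte of memory and per register.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical toy instruction: (mnemonic, destination, source locations).
Instr = Tuple[str, str, Tuple[str, ...]]


def propagate_taint(trace: Iterable[Instr], input_locs: Set[str]) -> Set[str]:
    """Return the locations that are tainted after replaying the trace."""
    tainted = set(input_locs)
    for _mnemonic, dst, srcs in trace:
        if any(src in tainted for src in srcs):
            tainted.add(dst)      # destination now depends on the input
        else:
            tainted.discard(dst)  # overwritten with untainted data
    return tainted


def tainted_fault_operands(trace: Iterable[Instr],
                           fault_operands: Iterable[str],
                           input_locs: Set[str]) -> List[str]:
    """List the operands of the faulting instruction that are tainted."""
    tainted = propagate_taint(trace, input_locs)
    return sorted(op for op in fault_operands if op in tainted)


if __name__ == "__main__":
    trace = [
        ("mov", "eax", ("file[0x10]",)),  # eax <- byte read from the file
        ("add", "eax", ("eax", "ebx")),   # still derived from the input
        ("mov", "ecx", ("imm",)),         # constant, not tainted
        ("mov", "edx", ("eax",)),         # edx inherits the taint
    ]
    # Pretend the crash happened on an instruction using edx and ecx.
    print(tainted_fault_operands(trace, ["edx", "ecx"], {"file[0x10]"}))
    # prints ['edx']
```

If the list comes back empty, the fault was probably not driven directly by
the fuzzed bytes, at least as far as this simple propagation rule can tell.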

One of the problems here is speed. I have no idea how you are going to
create the execution trace, but if you are thinking about single-stepping
via your debugger API, I really wish you luck and a lot of patience :). In
my SpiderPig project I used the Virtual Code Integration technique (as
explained in the article); however, I am replacing it right now with my own
dynamic binary instrumentation engine (not because of speed), which should
be ready when it's ready :-) I also advise you to look at other projects
that use dynamic taint analysis.

Btw, Julio Auto should be speaking about some data flow coolness at SOURCE
Barcelona 2009 [2]; perhaps his talk can give you some hints too.


best regards,
pb

[1] - http://piotrbania.com/all/spiderpig/

[2] - http://www.sourceconference.com/index.php/source-barcelona-2009/schedule



----- Original Message ----- 
From: "Dave Aitel" <dave () kof immunityinc com>
To: <dailydave () lists immunitysec com>
Sent: Wednesday, July 08, 2009 12:13 PM
Subject: [Dailydave] More from Taiwan


Ok, so here's the thing Ben Nagy and I were going on about at lunch. I
thought I'd share it with thousands of people.

Ben's problem is that he has 200,000[2] crashes in the latest Word. Word
2007 or whatever. He classifies these problems with !exploitable from
Microsoft, which drops them into buckets of various sorts. But saying "This
is probably exploitable"[1] or not is a really hard problem - far beyond
what !exploitable is useful for. (It claims to do data tainting, but this is
clearly a misnomer?). Basically it divides things into "Definitely likely to
be exploitable because EIP is 41414141", "Pretty much likely to be
exploitable cause we're writing to bad memory" and "Everything else".
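
To make the three buckets concrete, here is a toy triage function in Python.
It only mirrors the simplified description above and does not reproduce
!exploitable's actual rules; the Crash fields, the bucket() name and the
0x41414141 marker default are all assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Crash:
    """Hypothetical crash record; the fields are illustrative only."""
    eip: int                 # instruction pointer at the time of the fault
    is_write: bool           # True if the faulting access was a write
    fault_addr_mapped: bool  # True if the accessed address was mapped


def bucket(crash: Crash, marker: int = 0x41414141) -> str:
    """Sort a crash into one of the three buckets described above."""
    if crash.eip == marker:
        return "definitely"        # attacker-looking value ended up in EIP
    if crash.is_write and not crash.fault_addr_mapped:
        return "probably"          # write to bad memory
    return "everything else"


if __name__ == "__main__":
    print(bucket(Crash(eip=0x41414141, is_write=False, fault_addr_mapped=False)))
    # prints definitely
    print(bucket(Crash(eip=0x30001234, is_write=True, fault_addr_mapped=False)))
    # prints probably
```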

So here's my little idea (which I'm sure everyone else has had at least
twice cause I'm not a special snowflake): Take each basic block and number
it. Execute the program twice, once with your crashing file, and once with
your template. This generates two signals, which have a stream of numbers in
them (from the execution trace). Then you can do interesting things by
converting to frequency domain (i.e. FFT?) and doing filtering and
visualization. Ben thinks you want to attach state to your numbers too (i.e.
memory and register info?). I'm not so keen on that because I think too much
data can be as bad as too little, but whatever. Each to their own.
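
A minimal Python sketch of the numbering and frequency-domain steps, under
some assumptions: a trace is taken to be a list of basic-block start
addresses, block IDs are handed out in order of first appearance (an
arbitrary choice that does change the resulting spectrum), and the transform
is a naive O(n^2) DFT from the standard library rather than a real FFT such
as numpy.fft.fft. All function names are made up.

```python
import cmath
from typing import Dict, List, Sequence


def number_blocks(*traces: Sequence[int]) -> Dict[int, int]:
    """Give every distinct basic-block address a small integer ID."""
    ids: Dict[int, int] = {}
    for trace in traces:
        for addr in trace:
            ids.setdefault(addr, len(ids))
    return ids


def to_signal(trace: Sequence[int], ids: Dict[int, int]) -> List[float]:
    """Turn an execution trace into a stream of block numbers."""
    return [float(ids[addr]) for addr in trace]


def dft_magnitudes(signal: Sequence[float]) -> List[float]:
    """Magnitude of each DFT bin (naive version, fine for short traces)."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal)))
        for k in range(n)
    ]


if __name__ == "__main__":
    # Made-up trace: the program bounces between two blocks.
    trace = [0x401000, 0x401020, 0x401000, 0x401020]
    ids = number_blocks(trace)
    signal = to_signal(trace, ids)
    print(signal)  # prints [0.0, 1.0, 0.0, 1.0]
    print([round(m, 6) for m in dft_magnitudes(signal)])
    # prints [2.0, 0.0, 2.0, 0.0]
```

Passing both the crashing trace and the template trace to number_blocks()
keeps the block IDs consistent between the two signals.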

I'm not sure what the interesting thing here is that magically tells you
something is worth really digging into? Maybe you take your two signals, and
subtract their frequencies and visualize how different they are? Throw that
at an HMM/NN and make it tell you something?
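
One way the subtraction step could look in Python, as a sketch only. It
assumes you already have two magnitude spectra of equal length (so both
traces were padded or truncated to the same length before the transform);
the function names are made up, and whether the resulting numbers say
anything about exploitability is exactly the open question.

```python
from typing import List, Sequence, Tuple


def spectral_difference(crash: Sequence[float],
                        template: Sequence[float]) -> List[float]:
    """Per-bin difference between two magnitude spectra."""
    if len(crash) != len(template):
        raise ValueError("spectra must have the same number of bins")
    return [c - t for c, t in zip(crash, template)]


def l2_distance(diff: Sequence[float]) -> float:
    """Single score for how far apart the two runs are."""
    return sum(d * d for d in diff) ** 0.5


def top_bins(diff: Sequence[float], count: int = 3) -> List[Tuple[int, float]]:
    """Bins with the largest absolute difference, biggest first."""
    ranked = sorted(enumerate(diff), key=lambda p: abs(p[1]), reverse=True)
    return ranked[:count]


if __name__ == "__main__":
    # Made-up spectra for a crashing run and a template run.
    diff = spectral_difference([2.0, 0.0, 2.0, 0.0], [2.0, 1.0, 0.0, 0.0])
    print(diff)               # prints [0.0, -1.0, 2.0, 0.0]
    print(top_bins(diff, 2))  # prints [(2, 2.0), (1, -1.0)]
```

The per-bin differences are what you would plot; the L2 score or the top
bins are the sort of features you could hand to a classifier.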

-dave

[1] Ben: Do you have a !exploitable in Immunity Debugger? Me: Yes, it just
returns true. :>
[2] Literally.



--------------------------------------------------------------------------------


_______________________________________________
Dailydave mailing list
Dailydave () lists immunitysec com
http://lists.immunitysec.com/mailman/listinfo/dailydave

