Dailydave mailing list archives

Re: Cloud fuzzing.


From: Ben Nagy <ben () iagu net>
Date: Fri, 22 May 2009 16:25:35 +0800

I just thought I'd drop in a few details here, based on some questions
I've been getting.

On Wed, May 13, 2009 at 2:12 PM, Dave Aitel <dave () kof immunityinc com> wrote:
> Today at SyScan Ben Nagy of COSEINC gave a talk on a fuzzing cluster
> he's built that does 1.2 million fuzz cases a day against Word 2007.

The arithmetically inclined will, of course, note that 20 tests / sec
is more like 1.7 mil / day. Currently I'm cruising at more like 25/s -
this is with full page heap enabled, which roughly halves the speed.
The main reason I wanted to give a lot of metrics is so that someone
else will bite and release theirs. :) We use 5 Quad-Xeon servers, and
the total cost was around $15k USD. Running 72 clients under ESXi, I
figure we broke even over EC2 after about 3 months. Plus, Kostya can't
cloudburst me and steal all my 0day when the machines are on an
in-house cluster. :P
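For the record, the back-of-envelope arithmetic on those rates:

```ruby
# Back-of-envelope throughput from the per-second rates quoted above.
SECONDS_PER_DAY = 24 * 60 * 60  # 86_400

def cases_per_day(rate_per_sec)
  rate_per_sec * SECONDS_PER_DAY
end

puts cases_per_day(20)  # 1_728_000 -- ~1.7 mil, not 1.2
puts cases_per_day(25)  # 2_160_000 at the current cruising speed
```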

> He's using !exploitable for WinDBG to get an approximation of the
> problem. It's a talk full of real metrics.

!exploitable is fantastic for grouping crashes, and providing a rough
description of the problem. I don't trust its assessment of the result
in absolute terms, but this is a hard problem. I just deleted 250k
files which all triggered what seems to be a very common and useless
bug, and I have about 60k left, spread across ~25 'buckets' (although
only about 15 distinct eips, the rest are split up because of
different crash states).
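To illustrate why the same EIP can land in more than one bucket: the bucket key is a function of more than just the faulting address. This is a toy sketch, not !exploitable's actual hashing (the field names and hash inputs here are hypothetical):

```ruby
# Toy crash-bucketing sketch (hypothetical -- real triage here is done
# with !exploitable's major/minor hashes). Crashes sharing a faulting
# EIP still split into separate buckets when the crash state differs,
# because the key combines EIP, exception code and top stack frames.
require 'digest'

CrashRecord = Struct.new(:eip, :exception_code, :top_frames)

def bucket_key(crash)
  Digest::SHA1.hexdigest(
    [crash.eip, crash.exception_code, crash.top_frames.join('|')].join(':')
  )[0, 12]
end

crashes = [
  CrashRecord.new(0x30ab1f2c, 0xc0000005, %w[word!sub_a word!sub_b]),
  CrashRecord.new(0x30ab1f2c, 0xc0000005, %w[word!sub_a word!sub_b]),
  CrashRecord.new(0x30ab1f2c, 0xc0000409, %w[word!sub_c]),
]

buckets = crashes.group_by { |c| bucket_key(c) }
puts buckets.size  # 2 -- same EIP, two distinct crash states
```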

> 10% cause crashes

Quite a few people said that this seems high. It is. I mentioned in
the talk that those stats depend very much on the test case generator
I am running. The talk was mainly about the harness and general Word
info that others can use as a Quick Start manual for distributed Word
fuzzing. I didn't talk much about my own case generation, mainly
because I don't believe that's the hard part, but partially because
not everything about what I'm doing is ready for public release.

To get the percentage that high I need to parse the OLE2 file
structure, parse the File Information Block, use that to find a
specific structure, parse that structure out, and then start
manipulating the internal fields. Once I run that, I find some field
values that caused crashes and refine the case generator to focus on
those fields. So, it's both deep and "adaptive" (which is a term I
hate) in the sense that a human writes an iteratively deeper fuzzer.
My case generators are not adaptive, intelligent, game changing, new
or particularly clever. They are, however, in Ruby.
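In the same spirit, the "parse deep, then hammer specific fields" approach boils down to something like the following toy sketch. It is not the real generator: it assumes the interesting structure has already been located via the OLE2 container and the File Information Block, and that the field offsets and widths inside it are known.

```ruby
# Toy targeted field mutator (illustrative only). Given a binary blob
# and a list of [offset, width] field locations pulled out of a parsed
# structure, emit one mutant per field per boundary value.
BOUNDARY_VALUES = [0x0, 0x7fff, 0x8000, 0xffff]  # 16-bit edge cases

def mutate_fields(blob, fields)
  cases = []
  fields.each do |offset, width|
    BOUNDARY_VALUES.each do |v|
      mutant = blob.dup
      # little-endian: 'v' packs 16-bit, 'V' packs 32-bit
      mutant[offset, width] = [v].pack(width == 2 ? 'v' : 'V')
      cases << mutant
    end
  end
  cases
end

blob  = ("\x00" * 16).b             # stand-in for the parsed structure
cases = mutate_fields(blob, [[4, 2], [8, 2]])
puts cases.length  # 8 -- 2 fields x 4 boundary values
```

Refining the generator then just means feeding the field/value combinations that crashed back into lists like BOUNDARY_VALUES, which is all "adaptive" means here.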

I'm pretty sure we're going to release the harness and the parsers
under some kind of OSS license for SyScan Singapore.

> A small percentage say they are possibly exploitable, and out of
> those, largely false positives.

These are all _crashes_, not failures, so "false positive" is only used
here in the sense that !exploitable says "UNKNOWN" or "PROBABLY
EXPLOITABLE". For example, it sees any read AV on a block data move and
plays it safe with "PROBABLY EXPLOITABLE".
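A coarse caricature of that kind of rule (the real rules live in Microsoft's !exploitable plugin; this only illustrates why a read AV on a block data move gets the play-it-safe rating):

```ruby
# Caricature of !exploitable-style rating rules (hypothetical and
# heavily simplified -- see the MSEC plugin for the real logic).
def rate(access, block_data_move: false)
  case access
  when :write then 'EXPLOITABLE'            # attacker-influenced write
  when :read  then block_data_move ? 'PROBABLY_EXPLOITABLE' : 'UNKNOWN'
  else 'UNKNOWN'
  end
end

puts rate(:read, block_data_move: true)  # PROBABLY_EXPLOITABLE
puts rate(:read)                         # UNKNOWN
```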

> The problem of fuzzing is exponential, but if you architect your
> fuzzer right, you can scale linearly with your budget. Perhaps your
> budget also grows exponentially? :>

Yes, soon I plan to have it automatically sell the bugs, and place
online orders to Dell with the proceeds. :)

> The problems for the future are interesting. Classification of
> potential exploitability is a problem that involves diffing program
> runs, examining programs deeply for structure and behavior, and all
> this has to scale up with your 200K cases a day.

And this is where I'm going to be spending my time for a while. There
is lots of really excellent work out there by lots of people, but
adapting it to get fully hands-free operation is not trivial. Answers
on a postcard, please. :)

Cheers,

ben
_______________________________________________
Dailydave mailing list
Dailydave () lists immunitysec com
http://lists.immunitysec.com/mailman/listinfo/dailydave

