Dailydave mailing list archives

Re: Fuzzing


From: Jared DeMott <demottja () msu edu>
Date: Tue, 16 May 2006 14:53:37 -0400

Thanks for your input. :)

Both Miller and Jack mention randomness, but Jack also mentions security. Peter also mentions security, but not randomness; he says invalid data. Some use the terms fault injection or stress testing. Does the way we make data invalid HAVE to include randomness? Vuagnoux prefers a list of attacks rather than random data.

Depending on your level of knowledge of your target's input vectors and
data structures, no, the invalid data does not have to be random.  For
example, if you're fuzzing an application's protocol stack, and you know
the message's first field in its header is an 8 bit integer, why send
it random data?  You can fairly easily (and quickly) send it all 256
possible values for complete coverage.  With a little bit of knowledge
or educated guessing, you can vastly reduce the list of potential
exceptional values you are sending to your target and thus greatly
reduce your testing time.
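
As a rough sketch of the point above (not code from either poster): an 8-bit field has 256 possible values, 0 through 255, so every one of them can be enumerated instead of sampled. The message layout and the `first_field_cases` helper are hypothetical; the only assumption is a message whose first byte is that header field.

```python
import struct


def first_field_cases(template: bytes) -> list[bytes]:
    """Return one message per possible value of the 8-bit first field."""
    if not template:
        raise ValueError("template must contain at least the one-byte field")
    # struct.pack("B", value) encodes value as a single unsigned byte.
    # Everything after the first byte is copied from the template unchanged.
    return [struct.pack("B", value) + template[1:] for value in range(256)]


# Hypothetical valid message: a one-byte type field followed by a payload.
cases = first_field_cases(b"\x01HELLO")
```

Each element of `cases` would be sent to the target in turn; the whole field is covered in 256 test cases with no randomness involved.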

Well, I agree and disagree here. Suppose we have a protocol that has many "levels" or legs. Suppose at each stage of the protocol we're required to supply various ints, strings, etc. Now consider every combination of states and data. Complete coverage is an intractable problem. Thus, most fuzzers introduce heuristics (long strings, format strings, etc.) and/or "bounds checking" to bring the scope of the problem back to planet earth. A fuzzer could do this completely deterministically (see a tool known as Autodafe that's about to be released.) The runtime is then (list of attacks) * (variables) * (time per run). This is good for getting a baseline like any other testing tool might. However, my question is about randomness. After said baseline completes, would it not be beneficial to continue (regardless of whether bugs were found or not) in some stochastic way? (I hope to give a talk about this at BlackHat this year.) Next year I'd like to study how Genetic Algorithms might be able to help us here.
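
A minimal sketch of the two phases discussed above, under stated assumptions: the attack list, the field names, and all three function names are hypothetical, and this is not how Autodafe or any particular fuzzer works. The deterministic pass substitutes each attack into each field once, so its size is attacks * variables; the stochastic pass is seeded so that any interesting case can be replayed.

```python
import itertools
import random

# Hypothetical heuristic attack list; real fuzzers ship much larger ones.
ATTACKS = [b"A" * 1024, b"%n%n%n%n", b"\x00", b"-1"]


def baseline_cases(fields: dict[str, bytes]):
    """Deterministic pass: put every attack into every field, one at a time."""
    for name, attack in itertools.product(fields, ATTACKS):
        case = dict(fields)
        case[name] = attack
        yield case


def estimated_runtime(num_fields: int, seconds_per_run: float) -> float:
    """attacks * variables * time per run, in seconds."""
    return len(ATTACKS) * num_fields * seconds_per_run


def random_cases(fields: dict[str, bytes], seed: int, count: int):
    """Stochastic pass: overwrite one randomly chosen field with random bytes."""
    rng = random.Random(seed)  # seeded so a run can be reproduced exactly
    names = sorted(fields)
    for _ in range(count):
        case = dict(fields)
        name = rng.choice(names)
        length = rng.randint(1, 64)
        case[name] = bytes(rng.randrange(256) for _ in range(length))
        yield case
```

With two fields and half a second per run, `estimated_runtime(2, 0.5)` gives the baseline budget in seconds; `random_cases` can then run for as long as one likes after the baseline finishes.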

It's clear that pure randomness is only good "one level deep". We need to add some structure (protocol knowledge) to advance multiple layers into any non-trivial protocol.

Correct.  For a great example of this, try fuzzing the 3rd or 4th
message in an IKE or IKEv2 session.  Guess what?  You might as well just
implement the entire protocol and all the crypto required inside your
fuzzer, because you're going to likely need it to get to the part of the
session where you want to perform your fuzzing.  (And on that note, YAY
for racoon2 iked's -P option!)

Fuzzing is a testing technique (usually software) used to find bugs. That's about the only consensus currently.

I think you can also safely include here that fuzzing requires sending
(hopefully) exceptional data to one or more of your target's input
vectors.

So my question is to all you fuzzers and software testers: what is the difference between fuzzing and software testing? (If someone knows of a good software testing list, please forward this on.) In theory, they seem very similar. In practice, the second-party, security focus of fuzzing has made it effective in finding exploitable bugs. But more academically, what is the difference between the two (or the definition of each)?

Fuzzing is a subset of software testing.  It just so happens that it's
a particularly useful testing method for security researchers...

So we have: "Fuzzing: a software testing technique (hardware too?) that delivers exceptional data to some or all of the target's input vectors. Fuzzing has been widely used by security researchers because it is efficient and cost effective." My desire is to further expand that with possible metrics. What is exceptional and why is one exceptional value "better" than another? What makes one fuzzer better than another? For fuzzers that employ randomness and run forever, what is considered a "complete" run? Why has it been effective for security researchers when companies already do in-house testing? (This seems obvious to me: gap testing. We're testing in a way that's different and more security targeted than what the company probably did.) Could we also add something in the definition about "typically includes randomness" or is that just not true and unnecessary?
The flip side of fuzzing (and testing) is determining when a fault has occurred. This seems to receive less attention than how input data is malformed. Here's an important question when defining fuzzing: What are the different (current and future) methods/trends for detecting when a failure/fault/bug has been found?

Summarizing your last two quotes (which I snipped for brevity), in the
general sense, being able to observe your target's behavior in some form
as you send it exceptional data is required.  Sometimes this is
achieved, using network protocol fuzzing as an example, by connecting to
a service, sending exceptional data, then checking if the socket was
forcibly closed or, after disconnecting, attempting a legitimate
reconnect.  If the socket was forcibly closed or the reconnect fails,
you've likely found a successful test case.  In the case of non-network
application testing, such as fuzzing an application's document parser,
your input will likely be a malformed data file of some form, and the
application is likely on your local machine.  In this case you can
attach a debugger, perhaps integrated directly with your fuzzer.  If
your fuzzer/debugger combo tool also supported some form of scripting,
then you'd have a fairly robust and automated testing tool.
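
The reconnect check described above could be sketched roughly as follows, assuming a plain TCP service. The function names are hypothetical, and a failed reconnect only suggests a fault: the service may simply be slow, or rate-limiting new connections.

```python
import socket


def is_alive(host: str, port: int, timeout: float = 2.0) -> bool:
    """Attempt a legitimate reconnect; False suggests the target may be down."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, resets, and timeouts alike.
        return False


def run_case(host: str, port: int, payload: bytes, timeout: float = 2.0) -> bool:
    """Send one test case, then report whether the target still accepts connections."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(payload)
    except OSError:
        pass  # a failed send is only a hint; the reconnect below decides
    return is_alive(host, port, timeout)
```

A driver loop would call `run_case` once per test case and log the payload whenever it returns `False`, so the suspicious case can be replayed under a debugger.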

Yes, such a tool would be good. Peter, I believe, has one such tool. But as far as I know, there isn't one freely available. Also, I guess I was looking for more detail/theory. We all know if we get a RST or coredump we may have hit gold. But what about second generation bugs, like uninitialized stack/heap problems? They may or may not crash an app. I'm wondering if anyone has thought of novel ways to determine if a program is acting "correctly" or not. (There is a paper out called "Efficient Context-Sensitive Intrusion Detection" that may prove useful in this area.)
