Dailydave mailing list archives

Re: Month of Kernel Bugs and fsfuzzer release (0.6)


From: L.M.H. <lmh () info-pull com>
Date: Tue, 24 Oct 2006 22:16:50 +0200

On 10/24/06, Jared DeMott <demottja () msu edu> wrote:
> Ah, yes, this is a general fuzzing issue I've been thinking about.  I've
> done a bit of research trying to figure out which heuristics to
> fuzz with.  And then it hit me: even more important than finding a "good"
> long string, for example, is how to get it properly delivered.  The
> "test harness" + "knowledge of where the test ends up" is almost more
> important than the "test", if you will.

Well, I don't have time for a long explanation of this stuff, so
consider this a preliminary version of a longer e-mail I'll send
sometime later (tomorrow).

You don't really need to take the hard route of overcomplicated
models and the like. In practice, though, you need at least enough
data sets to cover all the legitimate structures that the engine is
capable of handling.

If it supports 50 different structures or variations of the same
structure, you need no fewer than 50-100 samples that exercise those
structures over and over. The more samples, the more finely tuned your
'heuristics' can be.
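In code, that coverage requirement boils down to something like the
following sketch (the structure names and corpus tags here are made up
for illustration; a real corpus would be tagged by a parser, not by hand):

```python
# Sketch: check that a sample corpus exercises every structure a
# target engine claims to support. All names here are hypothetical.
supported = {f"struct_{i}" for i in range(50)}  # 50 structures the engine handles

# Each sample is tagged with the structures it actually uses.
corpus = [
    {"name": "sample_a", "uses": {"struct_0", "struct_1"}},
    {"name": "sample_b", "uses": {"struct_2"}},
]

covered = set().union(*(s["uses"] for s in corpus))
missing = supported - covered
print(f"{len(covered)}/{len(supported)} structures covered; {len(missing)} missing")
```

Anything left in `missing` is a structure your heuristics have never
seen, so you keep collecting samples until that set is empty.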

An example of this concept is the 'Haar training' used with the OpenCV
library when you want to perform object detection in images [1]. You
feed in a large number of samples of the same kind of object, usually
hand-selecting the region where the object is contained. The
background changes, the colors may change, the lighting and shadows
change, etc., but the object keeps the same shape, edges, and so on.
Haar training is a lengthy process, and can take days to complete on a
not-so-powerful machine.

But if you provide enough samples (say, thousands) and fine-tune the
process, you end up with a highly precise 'profile' that can be used
to detect the object in almost any image out there.

You also need to provide 'negative samples': samples that in no way
contain the object in question.

The same can be applied to fuzzing, at least in theory.
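To make the analogy concrete, here is a minimal sketch of learning a
per-offset byte 'profile' from positive samples (valid files) and
sanity-checking it against negative ones, in the spirit of the
training process above. The file format is invented for illustration:

```python
# Sketch: derive a per-offset 'profile' of allowed byte values from
# valid samples, reject it if any negative sample slips through, then
# mutate relative to that profile. Format and samples are hypothetical.
def learn_profile(samples):
    """For each offset, record the set of byte values seen in valid samples."""
    length = min(len(s) for s in samples)
    return [{s[i] for s in samples} for i in range(length)]

def matches(profile, data):
    """True if data is plausible under the learned profile."""
    return len(data) >= len(profile) and all(
        data[i] in allowed for i, allowed in enumerate(profile))

positives = [b"FMT\x01AAAA", b"FMT\x02BBBB"]  # valid samples
negatives = [b"XXXXXXXX"]                     # samples with no valid structure

profile = learn_profile(positives)
assert all(matches(profile, p) for p in positives)
assert not any(matches(profile, n) for n in negatives)

# A fuzzer can now mutate *within* the profile to stay parseable, or
# deliberately step outside it at one offset to probe edge handling.
mutant = bytearray(positives[0])
mutant[3] = 0xFF  # violate one learned constraint
```

The point is not this toy model itself, but that positive and negative
samples together tell you where the interesting boundaries are.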

Instead of focusing on a 'magic fuzzer', I would invest some effort in
a fully flexible language that lets the user cleanly define the
structures, etc. Visual modeling can be an extremely valuable resource
for a user when determining the format/structure of the data being
manipulated.

And then provide helpers that use positive and negative samples to aid
in the process, automating the analysis before manual fine-tuning.
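For a taste of what such a structure-definition language might look
like, here is a deliberately tiny sketch: a declarative spec, a
builder, and a one-field mutator. The field names and types are
invented; a real tool would be far richer than this:

```python
# Sketch: a minimal declarative description of a structure, plus
# helpers to build it and mutate one field at a time. Everything
# here (field names, types, values) is hypothetical.
import struct

SPEC = [
    ("magic",  "bytes", b"FMT\x01"),  # fixed marker
    ("length", "u32",   8),           # little-endian length field
    ("body",   "bytes", b"A" * 8),    # payload
]

def build(spec):
    """Serialize a spec into the on-disk byte layout."""
    out = b""
    for _name, kind, value in spec:
        out += struct.pack("<I", value) if kind == "u32" else value
    return out

def mutate_field(spec, target, new_value):
    """Return a copy of the spec with one field replaced."""
    return [(n, k, new_value if n == target else v) for n, k, v in spec]

baseline = build(SPEC)
broken   = build(mutate_field(SPEC, "length", 0xFFFFFFFF))  # lie about length
```

With something like this, the user describes the format once, and the
fuzzer gets field boundaries and types for free instead of guessing
them from raw bytes.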

Cheers.

[1]: http://lab.cntl.kyutech.ac.jp/~kobalab/nishida/opencv/OpenCV_ObjectDetection_HowTo.pdf
http://www.intel.com/technology/itj/2005/volume09issue02/art03_learning_vision/p04_face_detection.htm
_______________________________________________
Dailydave mailing list
Dailydave () lists immunitysec com
http://lists.immunitysec.com/mailman/listinfo/dailydave
