Dailydave mailing list archives
Re: [fuzzing] Coverage and a recent paper by L. Suto
From: "matthew wollenweber" <mwollenweber () gmail com>
Date: Wed, 17 Oct 2007 00:28:44 -0400
> The main downside is that I don't know why it is that finding bugs should depend linearly on anything. Is it really true that if run A explores 10 more blocks than run B, then run A has a .10 better chance of finding a bug in the program?
I think the above is the crux of the problem. Despite lots of intuition about where the bugs are or how to find them, there hasn't been much analysis (to my knowledge) of modeling bug prediction. Clearly a lot of bugs occur where unchecked data is transferred into fixed-size buffers, but from a systems perspective I've not seen anything quantifiable that suggests where bugs will be. For your weights a, b, c (or even the linear equation itself), you're right that we don't have metrics on what the best fit looks like.

I think it would be interesting for an academic or two to put a lot of "small enough" programs through a lot of fuzzers: fuzz most or all of the possible paths, then analyze the common traits to answer basic questions about how diversity, coverage, and interesting functions relate to finding bugs. (To any professors out there: I haven't finished my masters yet and would love to find an interesting program.) The cool research out there (Molnar, Aitel, DeMott, etc.) seems to focus a lot on the feedback systems used to find more bugs, but as yet I've seen no reason why any particular fitness function is well suited to the problem.

On 10/16/07, David Molnar <dmolnar () eecs berkeley edu> wrote:
On Mon, 2007-10-15 at 21:38 -0700, J.M. Seitz wrote:

> So, let's try this:
>
> spankage = code coverage + (path diversity weight * 2) + (num interesting funcs hit * 5)
>
> This stems from many talks with Jared, Pedram, and Charlie. I also proposed that a proximity value to dangerous functions be thrown in, but let's not get too far ahead of ourselves.

Cool, this is interesting. I wonder if you could also flip it around and try to estimate the values of those coefficients instead of picking "2" and "5". I don't think those are bad values; it'd just be interesting if there is a way to derive them from data about which fuzzing runs have been successful so far.

One way this could work is as follows: suppose we have a set of fuzzing runs R_1, R_2, ..., R_n, and suppose for each run we know whether the run found a bug or not. Define Y_i to be 1 if run R_i found at least one bug and 0 otherwise. Then we could set up a linear model:

Y_i = a + b*coverage_i + c*path_diversity_i + d*num_interesting_funcs_i + error_i

This just says we think whether a bug is found on run i depends linearly on coverage, path diversity, and number of interesting functions, plus an "intercept term" a and an error term. In your equation above for "spankage", we have a = 0, b = 1, c = 2, d = 5.

The error term represents, among other things, the effect of all the variables we did not measure. There's no variable in this equation for "did the developers do a lot of threat modeling," so if that has a big impact on whether a specific run finds a bug, we will see it as "error" when estimating how important coverage, path diversity, and number of interesting functions are to finding a bug. The error can also account for things like randomization in the fuzzer, difficulty in measuring path diversity, and so on. The main point is that if we assume this model holds, we can estimate values for a, b, c, and d from the data using any of several methods.
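The "spankage" heuristic quoted above amounts to a weighted sum per run. A minimal sketch follows; the function name mirrors the thread's term, and the measurements passed in are hypothetical numbers, not data from any real fuzzing run:

```python
# Sketch of the scoring heuristic from the thread: a weighted sum of
# per-run measurements, with weights 1, 2, 5 as proposed (a = 0 intercept).
def spankage(code_coverage, path_diversity, interesting_funcs_hit):
    # weights correspond to b = 1, c = 2, d = 5 in the linear model above
    return code_coverage + 2 * path_diversity + 5 * interesting_funcs_hit

# Hypothetical run: 120 blocks covered, 7 distinct paths, 3 interesting functions hit
print(spankage(120, 7, 3))  # 120 + 14 + 15 = 149
```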
For example, if we believe the errors are independent of the other variables and have a mean value of 0, then we can use ordinary least squares regression. We can then look at the resulting estimates for a, b, c, and d to get some idea of whether coverage, path diversity, and number of interesting functions in fact have an impact on finding bugs. Finally, we could feed the resulting estimates back into your Fuzzing Proxy Coverage Spanker-abob.

The main downside is that I don't know why finding bugs should depend linearly on anything. Is it really true that if run A explores 10 more blocks than run B, then run A has a .10 better chance of finding a bug in the program? Mainly this came to mind because I'm just learning about linear regressions now and have to read dozens of papers where people run, like, regressions of computer use vs. wages and compare them to regressions of pencil use vs. wages to see whether computers or pencils better predict high wages. Still, it seems like we should be able to leverage the data generated on which fuzzer runs produce bugs and which do not.

-David Molnar
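The OLS estimation described above can be sketched end to end in plain Python via the normal equations. Everything here is synthetic: the run measurements are made up, and the outcomes are generated from the exact weights a = 0, b = 1, c = 2, d = 5 (with no error term), so the fit recovers them exactly; real bug/no-bug data would of course be noisy:

```python
# Sketch: estimating a, b, c, d by ordinary least squares from synthetic
# fuzzing-run data. Runs are (coverage, path_diversity, interesting_funcs);
# all numbers are hypothetical.
runs = [(10, 2, 1), (25, 1, 0), (40, 5, 3), (15, 3, 2), (30, 4, 1), (5, 0, 0)]

# Outcomes generated noiselessly from the thread's weights, so OLS is exact.
y = [cov + 2 * div + 5 * funcs for cov, div, funcs in runs]

# Design matrix with an intercept column for the term "a".
X = [[1.0, cov, div, funcs] for cov, div, funcs in runs]

def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

# Normal equations: (X^T X) beta = X^T y
k = len(X[0])
XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
a, b, c, d = solve(XtX, Xty)
print(a, b, c, d)  # recovers approximately 0, 1, 2, 5
```

With real data the same machinery applies, but since Y_i is 0/1 rather than a score, the estimates would only be interpretable up to the linear-probability caveats Molnar raises above.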
--
Matthew Wollenweber
mwollenweber () gmail com | mjw () cyberwart com
www.cyberwart.com
_______________________________________________
Dailydave mailing list
Dailydave () lists immunitysec com
http://lists.immunitysec.com/mailman/listinfo/dailydave
Current thread:
- Coverage and a recent paper by L. Suto Dave Aitel (Oct 15)
- Re: Coverage and a recent paper by L. Suto matthew wollenweber (Oct 15)
- Re: Coverage and a recent paper by L. Suto Stephen John Smoogen (Oct 16)
- Re: Coverage and a recent paper by L. Suto Charles Miller (Oct 16)
- Message not available
- Message not available
- Re: [fuzzing] Coverage and a recent paper by L. Suto matthew wollenweber (Oct 17)
- Re: [fuzzing] Coverage and a recent paper by L. Suto JFV (Oct 18)
- Re: [fuzzing] Coverage and a recent paper by L. Suto Nicolas RUFF (Oct 25)
- Re: [fuzzing] Coverage and a recent paper by L. Suto Alexander Sotirov (Oct 26)
- Re: [fuzzing] Coverage and a recent paper by L. Suto Lance M. Havok (Oct 26)
- Re: [fuzzing] Coverage and a recent paper by L. Suto Nicolas RUFF (Oct 27)
- Re: [fuzzing] Coverage and a recent paper by L. Suto Lance M. Havok (Oct 27)
- Re: [fuzzing] Coverage and a recent paper by L. Suto Alexander Sotirov (Oct 28)
- Re: [fuzzing] Coverage and a recent paper by L. Suto Andre Gironda (Oct 29)
- Re: [fuzzing] Coverage and a recent paper by L. Suto J.M. Seitz (Oct 29)
- Re: Coverage and a recent paper by L. Suto matthew wollenweber (Oct 15)