WebApp Sec mailing list archives

Re: Code Complexity vs. Security


From: <athena () buyukada co uk>
Date: Mon, 26 Jul 2004 10:02:32 +0100 (BST)

The authors state that (paraphrasing):

"estimates are anywhere between 5 and 50 bugs per KLOC (thousand lines
of code) in that book. The numbers correspond, respectively, to a system
that has undergone rigorous quality assurance and a system that has only
been feature tested, like most commercial software."

They also include LOC counts for some software systems: Windows XP
(40M), Space Station (40M), Linux (1.5M), Windows 95 (<5M), etc.


I think this comes from an IBM study somewhere. KaVaDo's marketing
literature quotes similar figures from an IBM study, and they make it
something like one bug per 1500 lines.
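
To put the quoted figures side by side, here is a rough back-of-the-envelope
sketch in Python. The LOC counts and the 5 to 50 bugs per KLOC range are the
ones quoted above. Treating Windows 95 as exactly 5M lines is an assumption
(the quote only says fewer than 5M), and the function names are made up for
illustration.

```python
# Back-of-the-envelope arithmetic on the figures quoted in this thread.
# LOC counts are as quoted; Windows 95 is quoted as "<5M", so 5M is
# used here as an assumed upper bound.
LOC = {
    "Windows XP": 40_000_000,
    "Space Station": 40_000_000,
    "Linux": 1_500_000,
    "Windows 95": 5_000_000,
}

LOW_PER_KLOC = 5    # quoted rate for rigorously QA'd code
HIGH_PER_KLOC = 50  # quoted rate for code that was only feature tested


def estimated_bugs(loc, bugs_per_kloc):
    """Estimated bug count for a code base of `loc` lines."""
    return loc * bugs_per_kloc / 1000


def per_kloc(lines_per_bug):
    """Convert 'one bug per N lines' into bugs per KLOC."""
    return 1000 / lines_per_bug


if __name__ == "__main__":
    for name, loc in LOC.items():
        low = estimated_bugs(loc, LOW_PER_KLOC)
        high = estimated_bugs(loc, HIGH_PER_KLOC)
        print(f"{name}: {low:,.0f} to {high:,.0f} bugs")
    print(f"1 bug per 1500 lines = {per_kloc(1500):.2f} bugs per KLOC")
```

One bug per 1500 lines works out to roughly 0.67 bugs per KLOC, well below
the low end of the 5 to 50 range, so the two sources may not be counting the
same thing.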

I can't think of any code complexity metrics other than LOC, and even
that isn't very satisfying. Can anyone think of a general one?

LOC can be seen as an indicator of code complexity, but not really of the
number of bugs. It's development models, code complexity and readability
that are the real indicators.
For example, NetBSD could have X number of bugs, whereas OpenBSD would
probably have fewer because of the secure development principles being
followed, even though they share a common code base. Although this is a
somewhat extreme example, I'd say most applications 'out there' don't
adhere to a secure development model.
As pointed out elsewhere on the list, things such as complex loops,
complex string handling (such as format strings and regexes) and the use
of static buffers or loosely typed variables tend to be better
indicators. Anything that adds complexity increases the risk of a bug.
This is generally taught at college, but so many people ignore it.
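
As a rough illustration of counting something other than lines, the sketch
below tallies branching constructs and the deepest loop nesting in a piece
of Python source, using the standard `ast` module. Nobody on the list
proposed this particular measure. The choice of which node types count as
a branch, the function names and the sample code are all assumptions made
for the example.

```python
import ast

# Node types treated as decision points. This selection is an assumption
# made for illustration; it is not a standard metric definition.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.Try, ast.BoolOp)
LOOP_NODES = (ast.For, ast.While)


def branch_count(source):
    """Count branching constructs in a piece of Python source."""
    tree = ast.parse(source)
    return sum(isinstance(node, BRANCH_NODES) for node in ast.walk(tree))


def max_loop_depth(source):
    """Return the deepest level of loop nesting in the source."""
    def depth(node, current):
        if isinstance(node, LOOP_NODES):
            current += 1
        deepest = current
        for child in ast.iter_child_nodes(node):
            deepest = max(deepest, depth(child, current))
        return deepest
    return depth(ast.parse(source), 0)


# Hypothetical sample: two nested loops and a compound condition.
SAMPLE = """
def find(rows, target):
    for row in rows:
        for cell in row:
            if cell == target and cell is not None:
                return True
    return False
"""

if __name__ == "__main__":
    print("branches:", branch_count(SAMPLE))
    print("max loop depth:", max_loop_depth(SAMPLE))
```

A seven-line function can still score several branches and a nesting depth
of two, which is the kind of difference a plain line count hides.
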
Another major indicator of flawed code is commenting. When you have code
that's maintained over a long period of time by multiple authors, switches
in coding and commenting style can be an indicator that no author has
looked at the code as a whole, or that fixes have been layered on top of
the original code base with more importance placed on getting the fix out
of the door than on integrating it into the existing code. So many
developers comment inappropriately, in the wrong place, or too much (or
too little) that it comes as no surprise that code can often look messy by
the time it reaches a 3.x release. The 'messiness', or things such as code
obfuscation, reduces the readability and subsequent maintainability of the
code whilst adding 'cruft' to the code base as people desperately try to
work around what they don't understand. Look at sendmail if you'd like to
see proof of this in action.

IMHO, it wouldn't be wrong to think "more LOC, more bugs" as a general
guideline. The number and nature of entry points to the program can
also help in determining the level of risk the software is exposed
to.

In most cases, the LOC count can give an idea of complexity, which is one
factor in bug counts, but it's important to remember that not all bugs are
security bugs. And if there are functionality problems with the thing
you're reviewing, that's also a pretty big indicator that things are
going to get ugly.
Steve



