Full Disclosure mailing list archives

Coding securely, was Linux (in)security


From: Paul Schmehl <pauls () utdallas edu>
Date: Sun, 26 Oct 2003 19:39:53 -0600

--On Sunday, October 26, 2003 8:04 PM -0500 Bill Royds <broyds () rogers com> wrote:

You are saying that a language that requires every programmer to check for
security problems on every statement of every program is just as secure as
one that enforces proper security as an inherent part of its syntax?

Well, no, that's not at all what I'm saying. What I'm saying is that, no matter how well the language is devised, programmers must still understand how to write secure code and be aware of those places in the code where problems can arise and prevent them.
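
To make that concrete, here's the kind of spot I mean (just a sketch; the function names and buffer size are mine, made up for illustration):

    /* A typical dangerous place: formatting untrusted data into a
     * fixed-size stack buffer.  sprintf() will happily run past the end
     * of the buffer; the programmer has to know this and bound the write. */
    #include <stdio.h>

    void log_login_unsafe(const char *user)
    {
        char msg[64];
        sprintf(msg, "login by %s", user);               /* overflows if user is long */
        puts(msg);
    }

    void log_login_safe(const char *user)
    {
        char msg[64];
        snprintf(msg, sizeof msg, "login by %s", user);  /* write is bounded */
        puts(msg);
    }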

    And I suppose that you also believe in the tooth fairy.

Of course.  Just last week I found a dime under my pillow....

[snipped a bunch]
I have been programming in C since the '70s, so I am quite aware of what
the language can do, and I appreciate its power. But that power comes at
the price of making it much more difficult to keep C code secure and
readable. Since one can do just about anything in C, the compiler will not
prevent you from shooting yourself in the foot. Other languages restrict
what you can do in order to prevent some security problems.
   But there is so much code out there written in C (or its bastard child
C++) that we are not going to get rid of it soon. Java would actually be a
good language if Sun would allow one to write compilers for it that
generated native machine code, not just Java byte code. But converting the
mindset of the world's programmers to restricting the use of insecure
language features will take eons, so I give it no hope.

So which makes more sense to you? To convert the world's programmers to a new language? Or to teach them to code securely? Surely, if we were to replace C today, they would just find other ways to write insecure code?

A programmer certainly cannot know what his pointers refer to. That would
require the writer of a function to know all possible circumstances in
which the routine would be called and to somehow prevent her routine from
being linked in with code that calls it incorrectly. That is often called
the halting problem. Most security problems come from exactly the case
where the subroutine user "knows" what the arguments are for all calls in
the original use and handles those. The infinity of all other cases cannot
be checked at run time without either significantly slowing down the code
or risking missing some.

But it shouldn't be the job of the writer of a subroutine to verify the inputs. The writer of a subroutine defines what the appropriate inputs to that routine are, and it's up to the *user* of that subroutine to use it properly. The entire concept behind OOP is that you cannot know what's in the "black box" you're using. That makes it incumbent on you as the *user* of a subroutine to use the correct inputs and to *verify* those inputs when necessary.

Now a subroutine writer is perfectly free to do error checking if they choose, but the user of that subroutine should never *assume* that the subroutine *does* do error checking.
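
For example (a sketch with made-up names; the 15-character limit is just an assumed documented contract):

    #include <string.h>

    static char g_name[16];

    /* Documented contract: 'name' is NUL-terminated and at most 15
     * characters.  The routine itself does no checking; that is the
     * caller's job. */
    static void set_machine_name(const char *name)
    {
        strcpy(g_name, name);       /* relies on the caller honoring the contract */
    }

    /* The *user* of the routine verifies the untrusted input before
     * calling it, because the input cannot be known in advance. */
    int configure_host(const char *untrusted_name)
    {
        if (untrusted_name == NULL || strlen(untrusted_name) > 15)
            return -1;              /* contract would be violated; reject */
        set_machine_name(untrusted_name);
        return 0;
    }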

   The recent MSBlaster worm is a case in point. The writers of the RPC
code "knew" that code within MS Windows never passed more than a
16-Unicode-character (32-byte) machine name as one of its parameters, so
they did not check (the argument was not of type wchar * but of type
wchar[16]). Since C does not really implement arrays at all, but only uses
the array syntax [] as an alias for a pointer, the only way to prevent
buffer overflow in a C routine is to never allow string arrays as
parameters to functions, completely obscuring the meaning of the code.
The problem is that C encourages bad coding practice and obscures the
actual meanings of various data structures, and even the code-auditing
techniques of the OpenBSD crowd do not find all the possible mistakes.
A language will never be goof-proof, but it should not make it easier to
goof than to be correct.

I'm not disagreeing with this point at all. I'm simply saying that programmers *must* verify inputs when they cannot be known. In this particular example, you're pointing out a classic mistake. The programmers of the RPC code *assumed* that they knew what the input would be when in fact they could not *know* that for certain. And so we ended up with another classic example of a buffer overflow (actually several). Assumptions are the mother of all problems.
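
To illustrate (this isn't the actual RPC code, just a sketch of the general shape of the mistake, with made-up names):

    #include <wchar.h>

    /* The [16] below is documentation only.  To the compiler the parameter
     * is just a wchar_t *, so nothing enforces the 16-character assumption. */
    void copy_machine_name_unchecked(wchar_t name[16], const wchar_t *input)
    {
        wcscpy(name, input);        /* overflows 'name' if 'input' is longer */
    }

    /* The assumption has to be verified explicitly, because the routine
     * cannot *know* what its callers will pass. */
    int copy_machine_name_checked(wchar_t name[16], const wchar_t *input)
    {
        if (input == NULL || wcslen(input) >= 16)
            return -1;              /* doesn't fit, counting the terminator */
        wcscpy(name, input);
        return 0;
    }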

You complain that the code would be really slowed down if consistent and complete error checking were done. I wonder whether anyone has ever really tried to write code that way and then tested it to see if it really *did* slow down the process, or if this is just another one of those "truisms" in computing that has never really been put to the test.
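
It wouldn't even be hard to measure. Something along these lines would give a first approximation (just a sketch, not a careful benchmark; the counts and sizes are arbitrary, and you'd want to compile without aggressive optimization so the loops aren't thrown away):

    #include <stdio.h>
    #include <string.h>
    #include <time.h>

    #define ITERATIONS 10000000L
    #define BUF_SIZE   32

    int main(void)
    {
        static char dest[BUF_SIZE];
        const char *src = "a-typical-machine-name";
        long i;
        clock_t t0, t1, t2;

        t0 = clock();
        for (i = 0; i < ITERATIONS; i++)
            strcpy(dest, src);                   /* unchecked copy */
        t1 = clock();
        for (i = 0; i < ITERATIONS; i++) {
            if (src != NULL && strlen(src) < BUF_SIZE)
                strcpy(dest, src);               /* copy with the check */
        }
        t2 = clock();

        printf("unchecked: %.3f s\n", (double)(t1 - t0) / CLOCKS_PER_SEC);
        printf("checked:   %.3f s\n", (double)(t2 - t1) / CLOCKS_PER_SEC);
        return 0;
    }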

BTW, in my example, I didn't use strlen.

Paul Schmehl (pauls () utdallas edu)
Adjunct Information Security Officer
The University of Texas at Dallas
AVIEN Founding Member
http://www.utdallas.edu

_______________________________________________
Full-Disclosure - We believe in it.
Charter: http://lists.netsys.com/full-disclosure-charter.html

