Secure Coding mailing list archives

Coding with errors in mind - a solution?


From: leichter_jerrold at emc.com (Leichter, Jerry)
Date: Tue, 5 Sep 2006 11:54:16 -0400 (EDT)

[Picking out one minor point:]
| [Exceptions] can simplify the code because
| -as previously mentioned by Tim, they separate error handling from normal
| logic, so the code is easier to read (it is simpler from a human reader's
| perspective).  I have found bugs in my own code by going from error handling
| to exceptions -- it made the mistake plain to see.
I agree with this ... but:

        - Many years ago, I worked in a language called BASIC PLUS.
                (This was an extension to BASIC that DEC bought along
                with the RSTS operating system.  It was actually a
                surprisingly productive environment.  But the details
                aren't really relevant here.)

          BASIC PLUS had a simple "exception handling" approach:  You
                specified "ON ERROR <statement>", which requested that
                if any system-detected error occurred (and, in modern
                terms, BASIC PLUS was a "safe" language, so *all* errors
                were detected by the run-time system) then the given
                statement was to be executed.  Almost universally,
                <statement> was a GOTO to a particular line number.
                At that line, you had access to the particular error
                that occurred (and error number); the line number on
                which it occurred; the current state of any variables;
                and the ability to resume execution (with some funny
                limitations).  This provided exactly the separation
                of normal code flow from error handling that seems
                like a fine idea ... until you try to do anything with
                it other than logging the error and aborting at some
                high level of abstraction.

          I wrote an incredibly hacky error handler and a function called
                FNSET() that was used essentially as follows:

                IF (FNSET(errs))
                THEN    <normal code>
                ELSE    <error path>

          "errs" encoded the error numbers that would be handled by
                <error path>; any errors not on the list took the
                usual log-and-abort route.

          So ... this was essentially a try/catch block (which I don't
                think I'd seen - this was in 1976 or thereabouts),
                with the odd fillip that you declared the error
                conditions you handled in the "try" rather than in
                the "catch".  It worked very well, and supplanted
                the old monolithic error handlers that preceded it.
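
          A rough modern equivalent (a sketch in C++ with a made-up
                error type, nothing from any real program) would put
                the handled errors in the catch instead of the try,
                but the shape is otherwise the same:

                #include <iostream>
                #include <stdexcept>

                // Hypothetical error type, for illustration only.
                struct RecordNotFound : std::runtime_error {
                    using std::runtime_error::runtime_error;
                };

                void updateRecord() {
                    throw RecordNotFound("no such record");
                }

                int main() {
                    try {
                        updateRecord();                  // <normal code>
                    } catch (const RecordNotFound& e) {
                        std::cerr << e.what() << "\n";   // <error path>
                    }
                    // Errors not caught here take the usual route up
                    // to a top-level handler: log and abort.
                }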

          But notice that it moved the error handlers right up close
                to the normal operational code.  Yes, it's not as
                close as a test after every call - *some* degree of
                separation is good.  Really, what I think matters
                is that the error handling code live at the right
                semantic level relative to the code that it's
                covering.  It's fine for the try/catch to be three
                levels of call up from where the throw occurs *if
                the semantics it reflects are those of the code
                explicitly in its try block, not the semantics
                three levels down*.  This is also what goes wrong
                with a try block containing 5 calls, whose catch
                block is then stuck with figuring out how far into
                the block we got in order to understand how to unwind
                properly.  *Those 5 calls together* form the semantic
                unit being protected, and the catch should be written
                at that level.  If it can't be, the organization of
                the code needs to be re-thought.  (Notice that in
                this case you can end up with a try/catch per function
                call.  That's a bad result:  Returning a status
                value and testing it would probably be more readable
                than all those individual try/catches!)
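
          To make that concrete, here is a small C++ sketch (with
                hypothetical names, and only a sketch):  the five
                calls together install a configuration, so the catch
                speaks in terms of that operation, and an RAII guard
                does the unwinding rather than the catch guessing how
                far we got:

                #include <iostream>
                #include <stdexcept>

                // Stand-ins for the five calls; any of them may throw.
                void writeFiles()      {}
                void updateRegistry()  {}
                void restartServices() {
                    throw std::runtime_error("service would not restart");
                }
                void notifyPeers()     {}
                void recordAudit()     {}

                // Rolls back partial work unless commit() was reached,
                // so the catch never reconstructs our progress.
                struct Transaction {
                    bool committed = false;
                    void commit() { committed = true; }
                    ~Transaction() { if (!committed) { /* undo */ } }
                };

                // The five calls together are the unit being protected.
                void installConfig() {
                    Transaction tx;
                    writeFiles();
                    updateRegistry();
                    restartServices();
                    notifyPeers();
                    recordAudit();
                    tx.commit();
                }

                int main() {
                    try {
                        installConfig();
                    } catch (const std::exception& e) {
                        // At the level of "installing the configuration
                        // failed", not "which of the five calls failed?"
                        std::cerr << "config install failed: "
                                  << e.what() << "\n";
                    }
                }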

        - On a much broader level:  Consider the traditional place
                where "exceptions" and "errors" occur - on an assembly
                line, where the process has "bugs", which are detected
                by QA inspections (software analogue:  Assertions) which
                then lead to rework (exception handling).  In the
                manufacturing world, the lesson of the past 50 years
                or so is that *this approach is fundamentally flawed*.
                You shouldn't allow failures and then catch them later;
                you should work to make failures impossible.

          Too much of our software effort is directed at better
                expression (try/catch) and implementation (safe
                languages, assertions, contract checking) of the
                "assume it will fail, send it back for rework"
                approach.  Much, much better is to aim for designs
                that make failure impossible to begin with.

          Consider all the blood spilled over buffer overflows that
                occur because buffers don't know their own sizes.
                After much pain, we're fixing that ... but in many
                cases what we do is change an unchecked assertion
                (which ends up being "checked" by unrelated code
                whose invariants get tromped on) into a checked one.
                But then what do we do when the checked assertion is
                triggered?  Wouldn't it be better to grow the buffer
                dynamically so that no failure is possible at all?
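
          A minimal C++ sketch of that idea (assuming a growable
                buffer is acceptable for the data involved):  the
                buffer knows its own size and grows on demand, so
                writing past the end is simply not a reachable state:

                #include <string>
                #include <vector>

                // Appends a line to a buffer that manages its own
                // storage; no fixed-size array, hence no overflow.
                void appendLine(std::vector<char>& buf,
                                const std::string& line) {
                    buf.insert(buf.end(), line.begin(), line.end());
                    buf.push_back('\n');
                }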

          Yes, there are always edge cases, like running out of
                memory to expand the buffer.  But (a) those can be
                very, very rare, so handled by catch-alls; (b) many
                of them can be avoided by proper design to begin
                with.  The way to handle running out of memory
                because a hacked field in a message indicates
                that some object is of a huge length is not to
                worry about elegant ways to handle out-of-memory
                conditions - it's to enforce constraints in the
                message definition and the implementation of the
                code that handles such messages.
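
          A sketch of that last point in C++ (the message layout and
                the 64 KB limit are invented purely for illustration):
                the length field is checked against what the protocol
                allows before any allocation happens, so the hacked
                "huge" length never reaches the allocator at all:

                #include <cstddef>
                #include <cstdint>
                #include <stdexcept>
                #include <vector>

                // Assumed protocol constraint: no field exceeds 64 KB.
                constexpr std::size_t kMaxFieldLen = 64 * 1024;

                // Reads one length-prefixed field from a raw message.
                std::vector<char> readField(const std::uint8_t* wire,
                                            std::size_t wireLen) {
                    if (wireLen < 4)
                        throw std::invalid_argument("truncated message");
                    std::size_t len = (std::size_t(wire[0]) << 24) |
                                      (std::size_t(wire[1]) << 16) |
                                      (std::size_t(wire[2]) << 8)  |
                                       std::size_t(wire[3]);
                    // Enforce the constraint here rather than trying to
                    // handle an out-of-memory condition elegantly later.
                    if (len > kMaxFieldLen || len > wireLen - 4)
                        throw std::invalid_argument("bad length field");
                    return std::vector<char>(wire + 4, wire + 4 + len);
                }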

I'm not arguing that error checking, error handling, and such are not
important.  They are and always will be.  I'm arguing that error
*prevention* - by design, by implementation of error *amelioration*
at the earliest and most appropriate semantic level - is a much
better approach.  And, yes, we need libraries and languages and tools
to support this approach - libraries and languages and tools that
don't really exist today.
                                                        -- Jerry

