Nmap Development mailing list archives

Nmap uses ctype macros improperly


From: Solar Designer <solar () openwall com>
Date: Sat, 18 Jul 2009 17:20:34 +0400

Hi,

There are lots of improper uses of ctype macros, such as isupper() and
tolower() (and many others), in Nmap source code (latest in SVN).  In
fact, most of the uses look problematic.

Some invoke the macros on "char" variables or expressions, which are
signed on most archs and with most compilers.  This breaks down when the
C library directly uses the macro's argument as array subscript and does
not provide a 384-element array (some do, specifically as a workaround
for improper uses like that).  Indeed, behavior on negative array
subscripts is undefined anyway, but those C libraries with the
workaround assume "support" by the "native" C compiler (or at least
provide the workaround in case this happens to be "supported").

Some cast "char" to "int" first, but this is of little help unless the C
library has a workaround, because a "signed char" value with the 8th bit
set would remain negative even after the cast to "int" (sign extension).

Finally, some are almost-correct, using a cast to "unsigned char".
However, there are a couple of problems with that too: (1) the ctype
macros are formally defined to work on "int" only, so it is correct for
an implementation to assume the argument type to be "int", and (2) on
some platforms, using a "char" (whether signed or not) array subscript
is broken.  Specifically, I saw it fail with old C compilers when
generating code for Alpha without BWX (sign- or zero-extension would not
be performed prior to using the value as displacement).  I don't know if
this has since been fixed in the compilers, or if there's any
justification for this being "undefined behavior".

Thus, the best approach appears to be to cast any "char" expressions to
(int)(unsigned char) (yes, double-cast) before passing them to the ctype
macros.  In some contexts, code such as:

int c;
...
c = (unsigned char)*p++;
...
if (islower(c)) ...;

could be shorter and easier to read (the cast to int is implicit and not
repeated all over the place).

I hope this helps.

Alexander

_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://SecLists.Org


Current thread: