Nmap Development mailing list archives

RE: Small Problem w/RegEx for Service Detection...


From: "Jay Freeman (saurik)" <saurik () saurik com>
Date: Sun, 3 Sep 2000 14:37:54 -0500

Expect actually looks really useful... but, as you mentioned, it is a
function set for existing scripting environments.  I had thought a while
back about having nmap use something for its back-end, requiring TCL or some
such, but it always seems so wrong to rely on so complex a "library" somehow
:) (at least Xerces boils down to a single .so when all is said and done).
Not to mention that Fyodor is less than enthusiastic about doing so as well.
The POSIX regcomp() was really nice in that most Unix OSes had it, and I was
able to include a source file replacement for it without making nmap become
some crazy mess.  What I am really interested in is a _slightly_ more
powerful regex parser, or even a regex alternative.

The real crux of the problem is that Expect's parser isn't any better than
regcomp().  Not even for an administrative reason, but for an engineering
one:  internally it just uses a slightly modified version of regcomp() that
was designed specifically for TCL that (while it may be faster or something,
don't know) isn't any more powerful than the one in libc... or the one I
include in regex.c.  It doesn't have internal escaping, takes even fewer
arguments (a single "char *"), and has almost the same error message when
you try to escape a literal '\0':

        case '\\':
                if (*rcstate->regparse == '\0')
                        FAIL("trailing \\");  // << regcomp() says "Trailing backslash"
                ret = regnode(EXACTLY,rcstate);
                regc(*rcstate->regparse++,rcstate);
                regc('\0',rcstate);
                *flagp |= HASWIDTH|SIMPLE;
                break;
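
For comparison, here is a minimal sketch of how the same two limits look from
the caller's side of POSIX regcomp().  The helper name posix_match() is made up
for illustration and is not in nmap or regex.c; the only real calls are
regcomp(), regexec(), and regfree() from <regex.h>.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only.
 * Returns 1 if `pattern` compiles and matches somewhere in `text`,
 * 0 if it compiles but does not match, and -1 if regcomp() rejects it. */
static int posix_match(const char *pattern, const char *text)
{
    regex_t re;
    int rc;

    /* regcomp() takes a NUL-terminated string, so it stops reading the
     * pattern at the first '\0' byte it meets. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

Called as posix_match("ab\0cd", "ab"), the compiler only ever sees "ab",
because the pattern is handed over as a NUL-terminated "char *".  A pattern
that ends in a lone backslash, such as "abc\\", should be rejected outright,
in which case the helper returns -1.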

Expect is really just a framework to wrap the parser to make it easier to
use (instead of nmap+V's "line noise"... I love that analogy :) )... it
doesn't actually offer a better parser itself.  It is a nice set of
functions that lets you get access to the data and then work with it using the
full power of the scripting language you are currently working in (such as TCL),
but I don't see it being more useful than the XML formats that have been
proposed....  Fyodor mentioned to me a while back that a custom, slightly
more restrictive file format is much more likely to parallelize easily, since
you would know at what points you could stop things without having to
mingle your I/O with a very complex language parser.  That was what pretty
much shut down my TCL efforts, as I knew Fyodor was really looking for
something he could run in parallel.

Expect isn't even designed for sockets internally; to talk to most systems
it has the developer "spawn" copies of "telnet".  When working with
applications that shoot back binary data it becomes just as annoying to work
with as nmap+V (which also supports most of what Expect adds to TCL:
obtaining and branching based on replies, sending bricks of data, etc.... it
just isn't a pretty sight...); and thanks to how TclRegComp() works, it has the
same '\0' limitation.

*sigh*, maybe I should just futz around with regcomp() until I remove the
restriction... shouldn't be _that_ difficult....   Since I know the length
of the string up front I could allow trailing backslashes and then end when
I run out of string instead of when '\0' is found.  That's probably the way
to do it.
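
A rough sketch of that idea, assuming the pattern length is passed in alongside
the pointer.  The function name unescape_n() and its signature are
hypothetical, and it only covers the backslash-escape step, not a full
regcomp() replacement:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of nmap or regex.c.
 * Copies `len` bytes of pattern text into `out`, resolving backslash
 * escapes.  The loop is bounded by `len` rather than by a terminator, so a
 * '\0' byte (escaped or not) is ordinary data.  A backslash that is the
 * very last byte is kept as a literal backslash instead of failing.
 * Returns the number of bytes written, or (size_t)-1 if `out` is too small. */
static size_t unescape_n(const char *pat, size_t len, char *out, size_t outcap)
{
    size_t i = 0;   /* read position in pat  */
    size_t n = 0;   /* write position in out */

    assert(pat != NULL || len == 0);

    while (i < len) {
        char c = pat[i++];

        /* An escape takes the next byte verbatim, even if it is '\0'.
         * If no byte follows, c simply stays a backslash. */
        if (c == '\\' && i < len)
            c = pat[i++];

        if (n == outcap)
            return (size_t)-1;
        out[n++] = c;
    }
    return n;
}
```

Given the five-byte input 'a', '\\', '\0', 'b', '\\', the loop writes four
bytes: 'a', a NUL, 'b', and a literal backslash.  The same length-bounded loop
condition is what the parser itself would need in place of its '\0' check.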

Sincerely,
Jay Freeman (saurik)
saurik () saurik com

-----Original Message-----
From: Paul Tod Rieger [mailto:prie () abl com]
Sent: Sunday, September 03, 2000 1:26 PM
To: nmap-dev () insecure org; saurik () saurik com
Subject: Re: Small Problem w/RegEx for Service Detection...

Anyone know of a more powerful, rather portable regex parser that supports

http://expect.nist.gov/

Expect automates interactive processes.  It's a Tcl add-on, but I
think there's also a Perl module.  And there's also the source
code.

Tod
abl.com


---------------------------------------------------------------------
For help using this (nmap-dev) mailing list, send a blank email to 
nmap-dev-help () insecure org . List run by ezmlm-idx (www.ezmlm.org).


