Nmap Development mailing list archives

Re: CouchDB and MongoDB


From: David Fifield <david () bamsoftware com>
Date: Mon, 15 Feb 2010 21:03:03 -0700

On Wed, Feb 03, 2010 at 10:07:03PM +0100, Martin Holst Swende wrote:
> Hi,
>
> I have now implemented the following:
> - json.lua is heavily reworked (and now according to specs :)), except
> for collapsing Unicode escapes into UTF-8, which is not done. If you
> have any good pointers on how to do that, they are welcome.

The \uXXXX escapes are UTF-16 code units, so first you must decode from
UTF-16, then you must reencode with UTF-8. I think the Unicode character
handling is important to have, at least so we're not passing things like
"\u0041" on to scripts. It will not be terribly hard to implement, but
if you like you can leave it until last.
I am thinking of writing a unicode.lua or charenc.lua library to handle
encodings like these, and then json.lua can just use that.
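
As a rough sketch of what such a library might provide, decoding a
surrogate pair and reencoding the code point as UTF-8 could look like
the following. The names utf16_decode and utf8_encode are invented for
this example and are not existing NSE functions.

```lua
-- Hypothetical helpers; none of these names come from json.lua or NSE.

-- Encode one Unicode code point (0 .. 0x10FFFF) as a UTF-8 byte string.
local function utf8_encode(cp)
  local floor, char = math.floor, string.char
  if cp < 0x80 then
    return char(cp)
  elseif cp < 0x800 then
    return char(0xC0 + floor(cp / 0x40), 0x80 + cp % 0x40)
  elseif cp < 0x10000 then
    return char(0xE0 + floor(cp / 0x1000),
                0x80 + floor(cp / 0x40) % 0x40,
                0x80 + cp % 0x40)
  else
    return char(0xF0 + floor(cp / 0x40000),
                0x80 + floor(cp / 0x1000) % 0x40,
                0x80 + floor(cp / 0x40) % 0x40,
                0x80 + cp % 0x40)
  end
end

-- Combine UTF-16 code units into one code point.
-- hi is the first unit; lo is the unit that follows it, or nil.
-- Returns the code point and the number of units consumed,
-- or nil and an error message for an unpaired surrogate.
local function utf16_decode(hi, lo)
  if hi >= 0xD800 and hi <= 0xDBFF then
    if lo and lo >= 0xDC00 and lo <= 0xDFFF then
      return 0x10000 + (hi - 0xD800) * 0x400 + (lo - 0xDC00), 2
    end
    return nil, "unpaired high surrogate"
  elseif hi >= 0xDC00 and hi <= 0xDFFF then
    return nil, "unpaired low surrogate"
  end
  return hi, 1
end
```

With these, the units 0xD834 0xDD1E decode to code point 0x1D11E, which
encodes to the four bytes 240 157 132 158. A lone 0xD800 is reported as
an error.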

> - However, couchdb as it is in the matchlines matches as "httpd". I'm
> not saying that is wrong, since http is the protocol used, but it may be
> misleading. Can it somehow be both? (There is an html web-interface for
> couchdb on http://localhost:5984/_utils/ , which other http-scripts
> could be interested in). One snag is that the portrule for couchdb
> should really not use "httpd" to match against, since it will give a lot
> of false positives - which means that the portrule must match on port
> only, and miss couchdb instances on other ports. So, maybe that should
> be changed to 'couchdb'? Or, how should I set the portrule?

That's a good question. For now, I would just use

portrule = shortport.portnumber({5984})

> Attaching the files (but also available from
> http://martin.swende.se/hgwebdir.cgi/nsescripts/)

Thanks, this is looking better and better! However, there are still some
bugs I can see. One problem is string parsing. The code parses this
string incorrectly:

"test\\t"

(That is a string containing 7 bytes: 't' 'e' 's' 't' '\' '\' 't'. It
would be escaped in Lua as "test\\\\t". \t does not represent a tab
here.) It should be unescaped to the 6 bytes

test\t

That is, 't' 'e' 's' 't' '\' 't'. But the code takes the \t off the end
first, producing the 6-byte string

test\<TAB>

('t' 'e' 's' 't' '\' followed by the tab character.) A better unescape
function is something like this:

return str.gsub(str, "\\[\\\"/bfnrt]", ESCAPE_TABLE)
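
For illustration, here is that call in a self-contained form. The
contents of ESCAPE_TABLE and the function name unescape are assumptions,
since the actual table in json.lua is not shown in this message.

```lua
-- Assumed shape of ESCAPE_TABLE: each key is a complete two-byte escape
-- sequence, because gsub looks up the whole match when the pattern has
-- no captures.
local ESCAPE_TABLE = {
  ['\\"'] = '"', ["\\\\"] = "\\", ["\\/"] = "/",
  ["\\b"] = "\b", ["\\f"] = "\f", ["\\n"] = "\n",
  ["\\r"] = "\r", ["\\t"] = "\t",
}

-- Hypothetical wrapper around the one-line replacement.
local function unescape(str)
  -- gsub scans left to right, so the "\\" escape is consumed as a unit
  -- before the following "t" is examined. The outer parentheses drop
  -- gsub's second return value (the match count).
  return (string.gsub(str, "\\[\\\"/bfnrt]", ESCAPE_TABLE))
end
```

Called on the 7-byte input above (written "test\\\\t" in Lua source),
this returns the 6-byte string test\t with a literal backslash. It does
not handle \uXXXX escapes.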

The way you look for the end of a string can also be fooled. The string

"abc\\"

(5 bytes 'a' 'b' 'c' '\' '\') should be unescaped to

abc\

But the code sees \" and thinks it must be an escaped quote and the
string is not done yet, even though the backslash is part of an earlier
escape.

The right way to handle these strings is to proceed byte by byte in
order, and every time you see a backslash, look at the next byte and
insert the appropriate character. That will also let you reliably detect
the end of the string. JSON strings aren't allowed to begin and end with
single quotes, so you don't have to worry about that.
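
A minimal sketch of that byte-by-byte approach follows. The names
parse_string and ESCAPES are hypothetical, and \uXXXX escapes and
control-character checks are left out to keep it short.

```lua
-- Hypothetical sketch, not the json.lua code under review.
local ESCAPES = {
  ['"'] = '"', ["\\"] = "\\", ["/"] = "/",
  b = "\b", f = "\f", n = "\n", r = "\r", t = "\t",
}

-- Parse a JSON string whose opening quote is at index pos of s.
-- Returns the unescaped value and the index just past the closing
-- quote, or nil and an error message.
local function parse_string(s, pos)
  if s:sub(pos, pos) ~= '"' then
    return nil, "expected opening quote at " .. pos
  end
  local out = {}
  local i = pos + 1
  while i <= #s do
    local c = s:sub(i, i)
    if c == '"' then
      -- An unescaped quote: this is the real end of the string.
      return table.concat(out), i + 1
    elseif c == "\\" then
      -- Consume the backslash and the byte after it together.
      local rep = ESCAPES[s:sub(i + 1, i + 1)]
      if not rep then
        return nil, "bad escape at " .. i
      end
      out[#out + 1] = rep
      i = i + 2
    else
      out[#out + 1] = c
      i = i + 1
    end
  end
  return nil, "unterminated string"
end
```

Because each escape is consumed as a two-byte unit, the closing quote in
"abc\\" is recognized as the end of the string rather than as part of
an escaped quote.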

Properly, the code should first tokenize the input, breaking it up into
discrete elements like left-bracket, string, colon, number, and then
parse based on the tokens. This would fix another troublesome case I
found:

{"a}": 1}

The code is fooled by the } character inside the string.
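
To make the idea concrete, here is a small tokenizer sketch. The
function name tokenize and the token type names are made up for this
example, string tokens keep their raw (still escaped) text, and the
number check is looser than the JSON grammar.

```lua
-- Hypothetical tokenizer sketch; token names are invented here.
local PUNCT = {
  ["{"] = "lbrace", ["}"] = "rbrace", ["["] = "lbracket",
  ["]"] = "rbracket", [":"] = "colon", [","] = "comma",
}

-- Split JSON text into a list of {type = ..., text = ...} tokens.
-- Returns the list, or nil and an error message.
local function tokenize(s)
  local tokens = {}
  local i = 1
  while i <= #s do
    local c = s:sub(i, i)
    if c:find("^[ \t\r\n]") then
      i = i + 1
    elseif PUNCT[c] then
      tokens[#tokens + 1] = { type = PUNCT[c], text = c }
      i = i + 1
    elseif c == '"' then
      -- Walk to the closing quote, stepping over each backslash escape
      -- as a two-byte unit so that \" and \\ are never misread.
      local j = i + 1
      while j <= #s and s:sub(j, j) ~= '"' do
        if s:sub(j, j) == "\\" then j = j + 1 end
        j = j + 1
      end
      if j > #s then return nil, "unterminated string at " .. i end
      tokens[#tokens + 1] = { type = "string", text = s:sub(i + 1, j - 1) }
      i = j + 1
    else
      -- Numbers and the literals true, false, null: take the run of
      -- characters that can appear in them, then classify it.
      local word = s:match("^[%w%.%+%-]+", i)
      if not word then return nil, "unexpected character at " .. i end
      local kind
      if word:find("^%-?%d") and tonumber(word) then
        kind = "number"  -- tonumber accepts more than JSON allows
      elseif word == "true" or word == "false" or word == "null" then
        kind = "literal"
      else
        return nil, "unexpected token '" .. word .. "' at " .. i
      end
      tokens[#tokens + 1] = { type = kind, text = word }
      i = i + #word
    end
  end
  return tokens
end
```

On the input {"a}": 1} this yields five tokens: lbrace, string (a}),
colon, number (1), rbrace. The } inside the string never reaches the
parser as punctuation.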

Here are the test cases I want you to add to the library:

        '',                     -- error
        'null',                 -- error
        '"abc"',                -- error
        '{a":1}',               -- error
        '{"a" bad :1}',         -- error
        '["a\\\\t"]',           -- Should become Lua {"a\\t"}
        '["a\\\\"]',            -- Should become Lua {"a\\"}
        '{"a}": 1}',            -- Should become Lua {["a}"] = 1}
        '["key": "value"]',     -- error
        '["\\u0041"]',          -- Should become Lua {"A"}
        '["\\uD800"]',          -- error
        '["\\uD834\\uDD1E"]',   -- Should become Lua {"\240\157\132\158"}

json.NULL is a good idea, and I understand the reason for it, but it
can't be a constant string like it is now. So I'm going to ask anyone
reading, is there a good way to create a unique Lua object that can't be
mistaken for any other type of object?
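
One common Lua idiom that might fit, offered only as a sketch: use an
empty table as the sentinel, since a table is equal only to itself under
raw comparison. NULL and is_null are illustrative names here, not the
library's API.

```lua
-- Hypothetical null sentinel; json.NULL in the real library may be
-- defined differently.
local NULL = setmetatable({}, {
  -- Only affects how the sentinel prints; identity is what matters.
  __tostring = function() return "null" end,
})

-- True only for the sentinel itself, never for the string "null",
-- nil, or any other table.
local function is_null(v)
  return rawequal(v, NULL)
end
```

The sentinel still reports type "table", so code that walks decoded
values would need to test for it before treating a value as an array or
object.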

David Fifield
_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://seclists.org/nmap-dev/

