Nmap Development mailing list archives
Re: CouchDB and MongoDB
From: David Fifield <david () bamsoftware com>
Date: Mon, 15 Feb 2010 21:03:03 -0700
On Wed, Feb 03, 2010 at 10:07:03PM +0100, Martin Holst Swende wrote:
Hi, I have now implemented the following: - json.lua is heavily reworked (and now follows the spec :)), except for collapsing Unicode escapes into UTF-8, which is not done. If you have any good pointers on how to do that, they are welcome.
First you must decode from UTF-16, then you must reencode with UTF-8. I think the Unicode character handling is important to have, at least so we're not passing things like "\u0041" on to scripts. It will not be terribly hard to implement, but if you like you can leave it until last. I am thinking of writing a unicode.lua or charenc.lua library to handle encodings like these, and then json.lua can just use that.
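As a rough sketch of those two steps (this is not the proposed unicode.lua; the names utf8_encode and utf16_units_to_utf8 are made up for illustration), the conversion could look something like the following. It assumes the \uXXXX escapes have already been parsed into numeric UTF-16 code units, and it uses plain arithmetic rather than bit operators so that it does not depend on a particular Lua version.

```lua
-- Encode one Unicode code point (0..0x10FFFF) as a UTF-8 byte string.
local function utf8_encode(cp)
  if cp < 0x80 then
    return string.char(cp)
  elseif cp < 0x800 then
    return string.char(0xC0 + math.floor(cp / 0x40),
                       0x80 + cp % 0x40)
  elseif cp < 0x10000 then
    return string.char(0xE0 + math.floor(cp / 0x1000),
                       0x80 + math.floor(cp / 0x40) % 0x40,
                       0x80 + cp % 0x40)
  else
    return string.char(0xF0 + math.floor(cp / 0x40000),
                       0x80 + math.floor(cp / 0x1000) % 0x40,
                       0x80 + math.floor(cp / 0x40) % 0x40,
                       0x80 + cp % 0x40)
  end
end

-- Convert a list of UTF-16 code units (hypothetical input format: a Lua
-- array of numbers) to UTF-8. Returns nil and a message when a surrogate
-- is not part of a valid high/low pair.
local function utf16_units_to_utf8(units)
  local out = {}
  local i = 1
  while i <= #units do
    local u = units[i]
    if u >= 0xD800 and u <= 0xDBFF then
      -- High surrogate: it must be followed by a low surrogate.
      local lo = units[i + 1]
      if not lo or lo < 0xDC00 or lo > 0xDFFF then
        return nil, "unpaired high surrogate"
      end
      local cp = 0x10000 + (u - 0xD800) * 0x400 + (lo - 0xDC00)
      out[#out + 1] = utf8_encode(cp)
      i = i + 2
    elseif u >= 0xDC00 and u <= 0xDFFF then
      return nil, "unpaired low surrogate"
    else
      out[#out + 1] = utf8_encode(u)
      i = i + 1
    end
  end
  return table.concat(out)
end
```

With this sketch, utf16_units_to_utf8({0x41}) gives "A", and the surrogate pair {0xD834, 0xDD1E} gives the four bytes "\240\157\132\158".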
-- However, couchdb currently matches as "httpd" in the match lines. I'm not saying that is wrong, since HTTP is the protocol used, but it may be misleading. Can it somehow be both? (There is an HTML web interface for couchdb at http://localhost:5984/_utils/ , which other HTTP scripts could be interested in.) One snag is that the portrule for couchdb should really not match against "httpd", since that would give a lot of false positives. That means the portrule must match on port only, and will miss couchdb instances on other ports. So maybe the service name should be changed to 'couchdb'? Or how should I set the portrule?
That's a good question. For now, I would just use

    portrule = shortport.portnumber({5984})
Attaching the files (but also available from http://martin.swende.se/hgwebdir.cgi/nsescripts/)
Thanks, this is looking better and better! However, there are still some bugs I can see.

One problem is string parsing. The code parses this string incorrectly:

    "test\\t"

(That is a string containing 7 bytes: 't' 'e' 's' 't' '\' '\' 't'. It would be escaped in Lua as "test\\\\t". \t does not represent a tab here.) It should be unescaped to the 6 bytes

    test\t

That is, 't' 'e' 's' 't' '\' 't'. But the code takes the \t off the end first, producing the 6-byte string test\<TAB> ('t' 'e' 's' 't' '\' followed by the tab character). A better unescape function is something like this:

    return str.gsub(str, "\\[\\\"/bfnrt]", ESCAPE_TABLE)

The way you look for the end of a string can also be fooled. The string

    "abc\\"

(5 bytes 'a' 'b' 'c' '\' '\') should be unescaped to

    abc\

But the code sees \" and thinks it must be an escaped quote and that the string is not done yet, even though the backslash is part of an earlier escape.

The right way to handle these strings is to proceed byte by byte in order, and every time you see a backslash, look at the next byte and insert the appropriate character. That will also let you reliably detect the end of the string. JSON strings aren't allowed to begin and end with single quotes, so you don't have to worry about that.

Properly, the code should first tokenize the input, breaking it up into discrete elements like left-bracket, string, colon, number, and then parse based on the tokens. This would fix another troublesome case I found:

    {"a}": 1}

The code is fooled by the } character inside the string.
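To make the byte-by-byte approach concrete, here is a minimal sketch of a string reader. It is only an illustration, not the code in json.lua: the names ESCAPES and read_json_string are invented here, and \uXXXX escapes are deliberately left out and reported as unsupported.

```lua
-- Single-character escapes allowed after a backslash in a JSON string.
local ESCAPES = {
  ['"'] = '"', ['\\'] = '\\', ['/'] = '/',
  b = '\b', f = '\f', n = '\n', r = '\r', t = '\t',
}

-- Read one JSON string from s, where s:sub(pos, pos) is the opening quote.
-- Returns the unescaped value and the index just past the closing quote,
-- or nil and an error message.
local function read_json_string(s, pos)
  if s:sub(pos, pos) ~= '"' then
    return nil, "expected opening quote"
  end
  local out = {}
  local i = pos + 1
  while i <= #s do
    local c = s:sub(i, i)
    if c == '"' then
      -- An unescaped quote always ends the string, because every
      -- backslash has already consumed the byte that follows it.
      return table.concat(out), i + 1
    elseif c == '\\' then
      local repl = ESCAPES[s:sub(i + 1, i + 1)]
      if not repl then
        return nil, "unsupported escape at byte " .. i
      end
      out[#out + 1] = repl
      i = i + 2
    else
      out[#out + 1] = c
      i = i + 1
    end
  end
  return nil, "unterminated string"
end
```

Because each backslash consumes the byte after it, read_json_string('"abc\\\\"', 1) returns the 4 bytes abc\ and stops at the real closing quote, and read_json_string('{"a}": 1}', 2) returns a} with the next index 6, so a tokenizer built on top of it never sees the } inside the string as a delimiter.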
Here are the test cases I want you to add to the library:

    '',                    -- error
    'null',                -- error
    '"abc"',               -- error
    '{a":1}',              -- error
    '{"a" bad :1}',        -- error
    '["a\\\\t"]',          -- Should become Lua {"a\\t"}
    '["a\\\\"]',           -- Should become Lua {"a\\"}
    '{"a}": 1}',           -- Should become Lua {["a}"] = 1}
    '["key": "value"]',    -- error
    '["\\u0041"]',         -- Should become Lua {"A"}
    '["\\uD800"]',         -- error
    '["\\uD834\\uDD1E"]',  -- Should become Lua {"\240\157\132\158"}

json.NULL is a good idea, and I understand the reason for it, but it can't be a constant string like it is now. So I'm going to ask anyone reading: is there a good way to create a unique Lua object that can't be mistaken for any other type of object?

David Fifield
_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://seclists.org/nmap-dev/
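On the json.NULL question, one common Lua idiom (offered as a sketch, not necessarily what the library should adopt) is to use a fresh table as the sentinel. Tables compare by identity, so the sentinel is equal only to itself and cannot be confused with any string, number, or other table that a document decodes to. The json table and the decoded value below are stand-ins for illustration.

```lua
-- Stand-in for the library's module table.
local json = {}

-- A fresh table is equal only to itself. The __tostring metamethod is
-- optional and just makes the sentinel print nicely.
json.NULL = setmetatable({}, {
  __tostring = function() return "null" end,
})

-- Hypothetical decoder output for the JSON text ["a", null, "null"].
local decoded = { "a", json.NULL, "null" }

for i, v in ipairs(decoded) do
  if v == json.NULL then
    print(i, "is JSON null")
  else
    print(i, "is the " .. type(v) .. " " .. tostring(v))
  end
end
```

The string "null" in the third slot is not mistaken for the sentinel, which is the problem with using a constant string for json.NULL.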