Nmap Development mailing list archives

Re: CouchDB and MongoDB


From: Martin Holst Swende <martin () swende se>
Date: Sat, 27 Feb 2010 22:25:49 +0100

Hi,

David Fifield wrote:
On Mon, Feb 22, 2010 at 11:09:09PM +0100, Martin Holst Swende wrote:
  
I have now rewritten the json library. I got tired of dealing with
state by passing parameters and return values, and made an OO-approach
(for the parser-part). It became much nicer. It also made it easier
to handle errors, so the library now says stuff like

NSE: Json:Syntax error near pos 2: Expected '"', got 'a' input: {a":1}

I did not go all the way with a tokenizing lexer, a grammar etc - the
grammar is part of the parser flow, but otherwise I have skipped all
%b{} and such 'shortcuts' and parse character by character.
    

The new design looks really good. It has worked well in my testing.

I want you to change the names of the fromJson and toJson functions to
"parse" and "generate" respectively. The camelcase names don't fit with
the style of other libraries, and there's no need to have "json" twice
in calls like json.fromJson and json.toJson.
  
Good point, fixed.
  
I disagree about one point though:

    '["a\\"]',               -- Should become Lua {"a\"}

a\\" is interpreted in lua as a\", which to the parser looks like an
escaped quote, and gives syntax error. Did you mean '["a\\\\"]' , or am
I lost again? :)
    

You are right, I meant '["a\\\\"]'.
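
To make the escaping point concrete, here is a small sketch in plain Lua. The variable names are made up for illustration; the two literals are the ones discussed above.

```lua
-- In Lua source, each "\\" inside a quoted string is ONE backslash.
local two_backslashes  = '["a\\"]'    -- JSON text seen by the parser: ["a\"]
local four_backslashes = '["a\\\\"]'  -- JSON text seen by the parser: ["a\\"]

-- The first literal leaves the JSON string unterminated, because \" is an
-- escaped quote in JSON. The second holds the JSON escape \\, which stands
-- for one backslash inside the string "a\".
print(#two_backslashes, two_backslashes)
print(#four_backslashes, four_backslashes)
```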

  
json.NULL is a good idea, and I understand the reason for it, but it can't
be a constant string like it is now. So I'm going to ask anyone reading,
is there a good way to create a unique Lua object that can't be mistaken
for any other type of object?
      
What problem do you see with using a constant string? The way I see it,
if the script using this library gets a "null" from the server, one way
or the other, it will fail unless it checks explicitly for equality with
json.NULL. The reason I chose 'JAVASCRIPT NULL' is that it gives script
writers a chance to read the text and look it up in the library. Perhaps
something like "JAVASCRIPT NULL : check documentation in json.lua" would
be better?
    

The problem here is that if a table happens to contain the string
"JAVASCRIPT NULL", then it will look like a null to the JSON library.
For example, the call json.toJson({"a", "JAVASCRIPT NULL", "b"}) returns
["a", null, "b"]. It's better to use an object that can be distinguished
from all other objects. How about this:

NULL = {}

It would have to be documented that code must check for json.NULL before
checking for a table, otherwise this NULL could look like an empty array
or object. Lua tables have their own identity; two different empty
tables are not equal. This makes it impossible for someone to insert a
null value in a table without using json.NULL.
  
I see, well that definitely fits a lot better then. Fixed.
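
The identity argument can be sketched as follows. This is an illustration only, not the library's code: NULL stands in for json.NULL, and describe() is a hypothetical helper.

```lua
-- A fresh table is equal only to itself, so it works as a unique sentinel.
local NULL = {}

-- Hypothetical helper: the NULL test must come before the generic table
-- test, otherwise NULL would be mistaken for an empty array or object.
local function describe(v)
  if v == NULL then
    return "null"
  elseif type(v) == "table" then
    return "array or object"
  end
  return type(v)
end

print(describe(NULL))
print(describe({}))
print(describe("JAVASCRIPT NULL"))
```

Because table equality in Lua is by identity (absent an __eq metamethod), neither another empty table nor any string can compare equal to the sentinel.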

Here are some other notes.

Numbers are allowed to contain more than your parser allows. The grammar
has
      number = [ minus ] int [ frac ] [ exp ]
Try adding these test cases:
        '[-1]',
        '[0.0.0]',
        '[5e3]',
        '[5e+3]',
        '[5E-3]',
        '[5.5e3]',
The JSON syntax is kind of strict in that it doesn't allow leading
zeroes and requires at least one digit on either side of the dot. I
don't think we have to enforce that on input, as long as we observe it
on output. I think just calling tonumber and checking if it worked is
sufficient.
  
I changed the parser to be 'strict', we could set it back if you want.
At least it now checks if tonumber works, so if we 'un-strict' it again
it should work fine.
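
For reference, the lenient approach (collect the candidate characters, then let tonumber decide) might look like the sketch below. parse_number and its return convention are assumptions made for this example, not the library's actual interface.

```lua
-- Hypothetical helper: read a number starting at byte position pos.
-- Returns the number and the position just past it, or nil plus a message.
local function parse_number(s, pos)
  -- Grab every character that can legally appear in a JSON number.
  local token = s:match("^[%-%+%d%.eE]+", pos)
  local n = token and tonumber(token)
  if not n then
    return nil, "Error parsing numeric value"
  end
  return n, pos + #token
end

-- Try the test cases from the list above (position 2 skips the "[").
for _, text in ipairs({ "[-1]", "[0.0.0]", "[5e3]", "[5e+3]", "[5E-3]", "[5.5e3]" }) do
  print(text, parse_number(text, 2))
end
```

This accepts some inputs that the JSON grammar forbids, such as a leading "+" or a missing digit before the dot, which is the leniency on input described above.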

I think you are missing a return after
      if not a or not b then
              self:syntaxerror("Error parsing numeric value")

'[0.0.0]' above is an interesting case because it points to a need for a
little more error checking in the parser. Right now, parsing that string
returns the empty array {}. The reason is that while the number parsing
fails, parseArray doesn't check the return value of parseValue (even if
I add the missing return from the previous paragraph). There are other
places where low-level errors are caught only because they cause
higher-level errors when the parser fails to advance, for example with
the string '[invalid]' I get the error
      Syntax error near pos 3: Expected , input: [invalid]
The error message is misleading because there's no reason for the parser
to expect a comma here. The lower-level "Error parsing numeric value"
has been lost because the higher level kept on parsing as if the numeric
parsing had been successful.
  
I have added some more error checking to fix those issues.
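
The kind of check being asked for can be sketched with a miniature parser that handles only arrays of numbers and no whitespace. All names and the (value, position, error) return convention are assumptions made for this example; the real library reports errors through self:syntaxerror.

```lua
-- Hypothetical low-level parser: returns value and next position on
-- success, or nil, the failing position and a message on failure.
local function parse_value(s, pos)
  local token = s:match("^[%-%+%d%.eE]+", pos)
  local n = token and tonumber(token)
  if not n then
    return nil, pos, ("Error parsing value near pos %d"):format(pos)
  end
  return n, pos + #token
end

-- Hypothetical array parser that stops at the first low-level error
-- instead of carrying on as if the value had been parsed.
local function parse_array(s, pos)
  if s:sub(pos, pos) ~= "[" then
    return nil, pos, ("Expected '[' near pos %d"):format(pos)
  end
  pos = pos + 1
  local result = {}
  if s:sub(pos, pos) == "]" then
    return result, pos + 1
  end
  while true do
    local value, next_pos, err = parse_value(s, pos)
    if err then
      return nil, next_pos, err       -- propagate the original message
    end
    result[#result + 1] = value
    pos = next_pos
    local c = s:sub(pos, pos)
    if c == "]" then
      return result, pos + 1
    elseif c ~= "," then
      return nil, pos, ("Expected ',' or ']' near pos %d"):format(pos)
    end
    pos = pos + 1
  end
end

print(parse_array("[1,2]", 1))
print(parse_array("[0.0.0]", 1))
print(parse_array("[invalid]", 1))
```

With the return value checked, '[0.0.0]' and '[invalid]' both report the low-level error at the position where it occurred rather than an empty array or a misleading "Expected ," message.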

For parseValue, I think it would be better to have a separate function
for parsing the highest level, the one that only accepts an object or an
array, instead of having a "first" parameter.

  
Added that, but it basically calls the other function anyway - I don't want
to duplicate any code...
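
A thin top-level entry point that avoids duplicating code could look roughly like this. parse_top is a hypothetical name, and the general value parser is passed in as an argument only to keep the sketch self-contained.

```lua
-- Hypothetical top-level function: accept only an object or an array,
-- then hand over to the shared value parser.
local function parse_top(s, parse_value)
  local pos = s:find("%S")                 -- first non-whitespace character
  local c = pos and s:sub(pos, pos)
  if c ~= "{" and c ~= "[" then
    return nil, "Expected '{' or '[' at top level"
  end
  return parse_value(s, pos)               -- shared code does the real work
end
```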

Also, I added the unicode-decoding that you provided me with. Needless
to say, it works like a charm...:)
Attaching latest, also available from my repo.
Regards,
Martin

There is no need to handle Unicode escapes when escaping a string for
output. Anything representable by a \u escape is also representable in
UTF-8, and it's easier to just use UTF-8.
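
On the input side, turning a \u escape into UTF-8 needs only integer arithmetic. The sketch below is not the decoding code referred to in this thread; utf8_encode and decode_escapes are hypothetical names, and surrogate pairs are deliberately left unhandled.

```lua
-- Hypothetical helper: encode one Unicode code point as UTF-8 bytes.
local function utf8_encode(cp)
  local floor = math.floor
  if cp < 0x80 then
    return string.char(cp)
  elseif cp < 0x800 then
    return string.char(0xC0 + floor(cp / 0x40),
                       0x80 + cp % 0x40)
  elseif cp < 0x10000 then
    return string.char(0xE0 + floor(cp / 0x1000),
                       0x80 + floor(cp / 0x40) % 0x40,
                       0x80 + cp % 0x40)
  end
  return string.char(0xF0 + floor(cp / 0x40000),
                     0x80 + floor(cp / 0x1000) % 0x40,
                     0x80 + floor(cp / 0x40) % 0x40,
                     0x80 + cp % 0x40)
end

-- Hypothetical helper: replace every \uXXXX escape in s with UTF-8 bytes.
-- Assumes no surrogate pairs and no escaped backslash before the "u".
local function decode_escapes(s)
  return (s:gsub("\\u(%x%x%x%x)", function(hex)
    return utf8_encode(tonumber(hex, 16))
  end))
end
```

On output nothing comparable is needed: the UTF-8 bytes already in the Lua string can be written as they are.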

David Fifield
_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://seclists.org/nmap-dev/
  

Attachment: couch.tar.gz