Wireshark mailing list archives

Re: Fixing #12958 (Duplicated keys in -T json output)


From: Daan De Meyer <daan.j.demeyer () gmail com>
Date: Tue, 13 Jun 2017 10:10:15 +0000

I've submitted a patch that refactors the JSON output functions to support
grouping multiple nodes in a JSON array. Before a node's children are
printed, they are grouped using a grouping function. If multiple children
end up in the same group, they are printed as a JSON array in the output.

Right now the grouping function puts every child in a separate group so as
not to change the current JSON output. Removing duplicate keys from the
output is as simple as switching to a grouping function that groups
children by their JSON key. More complex grouping functions could also be
added in the future.
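
To make the grouping idea concrete, here is a minimal Python sketch. It is not the code from the patch (which changes Wireshark's C output functions); the function names, the flat (key, value) representation of a node's children, and the sample field values are all assumptions made for illustration.

```python
import json

def group_separately(children):
    """Put every child in its own group, so duplicate keys stay in the output."""
    return [[child] for child in children]

def group_by_key(children):
    """Group children that share a key, keeping first-seen key order."""
    groups = {}
    for key, value in children:
        groups.setdefault(key, []).append((key, value))
    return list(groups.values())

def render(children, grouping):
    """Render (key, value) children as JSON object text using a grouping function."""
    members = []
    for group in grouping(children):
        key = group[0][0]
        values = [value for _, value in group]
        # A single-member group is printed as a plain value,
        # a larger group as a JSON array.
        payload = values[0] if len(values) == 1 else values
        members.append(json.dumps(key) + ": " + json.dumps(payload))
    return "{" + ", ".join(members) + "}"

# Hypothetical children of one node; not taken from a real capture.
children = [("ip.addr", "10.0.0.1"), ("ip.addr", "10.0.0.2"), ("ip.ttl", "64")]

separate = render(children, group_separately)  # duplicate "ip.addr" keys remain
grouped = render(children, group_by_key)       # "ip.addr" becomes one array
print(separate)
print(grouped)
```

Swapping the grouping function is the only change between the two calls, which is the property the refactoring is meant to provide.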

The patch can be found here: https://code.wireshark.org/review/#/c/22064/
I've tested the changes by diffing the JSON output from this commit against
the JSON output from the current master branch. The output is exactly the
same for multiple traces with multiple combinations of options enabled (-x,
-j, -T jsonraw).

Is creating the change on the code review site all I need to do, or is
some other step required before the patch can get reviewed?

Regards,

Daan


On Wed, 7 Jun 2017 at 21:32 Daan De Meyer <daan.j.demeyer () gmail com> wrote:

Hello,

Right now, to use the tshark -T json output in a project, I have to use a
streaming JSON parser to keep the values of duplicated keys from being
overwritten. Using a standard JSON parser like JavaScript's JSON.parse()
results in only the last value of a duplicated key being available in the
parsed object. This is not ideal, and I'd like to fix this bug so I can use
JSON.parse() instead of a streaming JSON parser to read tshark's JSON
output.
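
The overwriting behaviour is easy to reproduce. The sketch below uses Python's json.loads as a stand-in for JSON.parse(); like JSON.parse(), it keeps only the last value for a repeated key. The input string is a made-up fragment, not real tshark output.

```python
import json

# Illustrative JSON text with a repeated key (hypothetical values).
text = '{"ip.addr": "10.0.0.1", "ip.addr": "10.0.0.2"}'

parsed = json.loads(text)

# The first "ip.addr" value is silently dropped; only the last one survives.
print(parsed)
```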

The way I work around the problem at the moment is by intercepting each
duplicated key/value pair before it gets overwritten and storing the values
as an array next to the duplicated key, under the same key with an "_array"
suffix.

I'd solve the problem in Wireshark in a similar way. A key that is
duplicated in the current output would be written only once (in the object)
and its value would be a JSON array containing all the different values for
the key. A simple suffix like "_array" or "s" could be added to the key to
clearly indicate that the key has multiple values.
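
As a rough illustration of the proposed shape, the following Python sketch rewrites an object with duplicated keys on the consumer side. It relies on the object_pairs_hook parameter of json.loads, which passes every object to a callback as a list of (key, value) pairs before duplicates are lost. The helper name merge_duplicates and the sample data are assumptions for this example, not part of tshark or of the proposed patch.

```python
import json

def merge_duplicates(pairs, suffix="_array"):
    """Build a dict from (key, value) pairs, collecting repeated keys into a list."""
    # First pass: count how often each key occurs in this object.
    counts = {}
    for key, _ in pairs:
        counts[key] = counts.get(key, 0) + 1

    # Second pass: unique keys are kept as-is, repeated keys are written
    # once under key + suffix with all of their values in an array.
    result = {}
    for key, value in pairs:
        if counts[key] == 1:
            result[key] = value
        else:
            result.setdefault(key + suffix, []).append(value)
    return result

# Illustrative input with a repeated key (hypothetical values).
text = '{"ip.addr": "10.0.0.1", "ip.addr": "10.0.0.2", "ip.ttl": "64"}'

parsed = json.loads(text, object_pairs_hook=merge_duplicates)
print(parsed)
```

One thing this sketch does not handle is a collision between a suffixed key and a key that already exists in the object, which is worth keeping in mind when choosing the suffix.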

My current workaround with a streaming JSON parser does the same thing, and
it has worked for the IP, TCP, HTTP and HTTP/2 tshark JSON output.
However, I don't know whether there are other protocols where this approach
would not work.

Would this be a good solution for the problem or am I missing something?

Regards,

Daan
