Wireshark mailing list archives

Re: Collection of captures for each supported dissector?


From: Peter Wu <peter () lekensteyn nl>
Date: Mon, 30 Jun 2014 16:52:30 +0200

(adding back the list, adding Gerald)

On Monday 30 June 2014 09:33:29 Evan Huus wrote:
On Mon, Jun 30, 2014 at 9:05 AM, Peter Wu <peter () lekensteyn nl> wrote:
On Monday 30 June 2014 07:12:56 Evan Huus wrote:
The "menagerie" is our collection of capture files that the fuzz-bot uses
to test with. It contains a substantial number of files across as many
protocols as we have been able to accumulate. However, I am not sure it
is entirely publicly accessible?

I have seen the menagerie mentioned in bug reports, but could never find
it publicly.

The best public collection I've been able to find is [1], which holds all
the fuzzed captures that have ever caused the fuzz-bot to fail. It's worth
noting that the vast majority of the remaining menagerie (~90% at a rough
guess) is harvested from Bugzilla attachments, so most of the individual
captures are already public; they're just not easily browsable.

5 GiB of non-specific captures; I think I'll pass on this for now.
 
[1] http://www.wireshark.org/download/automated/captures/

Additionally, it is not indexed. There is a script somewhere to use tshark
to extract the protocols contained in each capture and build a list, but
it only works for protocols which are dissectible by default (no "decode
as", decryption, or other special settings usually).
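
For illustration, such an index could be sketched roughly as follows. This
is not the script mentioned above; the function names unique_protocols and
index_capture are made up here, and tshark is assumed to be on PATH. It
relies on tshark's frame.protocols field, so it shares the same limitation:
only protocols dissected with default settings show up.

```shell
#!/bin/sh
# Sketch: build a "capture: protocols" index with tshark.
# Function names are hypothetical; tshark is assumed to be installed.

# Read frame.protocols lines (e.g. "eth:ethertype:ip:tcp") on stdin and
# print each distinct protocol name once, sorted.
unique_protocols() {
    tr ':' '\n' | sed '/^$/d' | LC_ALL=C sort -u
}

# Print one line for a capture file: "<file>: proto1 proto2 ..."
index_capture() {
    protos=$(tshark -r "$1" -T fields -e frame.protocols 2>/dev/null \
        | unique_protocols | tr '\n' ' ')
    printf '%s: %s\n' "$1" "${protos% }"
}
```

A possible use is to run index_capture over every file in a directory,
redirect the output to a text file, and then grep that file for a protocol
name.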

One of the ideas floated at Sharkfest this year was the possibility of a
proper interface to the menagerie, but I don't think anything really came
of it. What protocol are you interested in right now?

There is no particular protocol I am interested in; it was an idea to
improve regression testing. Right now I am looking at all dissectors below
TCP (or on top, depending on how you look at it).


By the way, could I get delete permissions for attachments for the
SampleCaptures page on the wiki?

I think Gerald has to grant this.

Gerald, could I get delete privileges for the captures on the SampleCaptures 
page?

There are a bunch of duplicates (and even some empty files) listed as
attachments but not linked. Some are not even capture files, although
their extensions suggest so.

Empty files:
mount-de.pcap.gz
omron-test-csum.pcap
wireshark.org.pcap.gz

Not pcap (but tcpdump text output or even a media file):
packetout.pcap
RTSP.pcap
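
A check along these lines could be automated. The following is only a
sketch: the function names magic_hex and classify_capture are made up for
illustration, and it looks at nothing beyond the first four bytes, comparing
them with the pcap magic numbers (both byte orders, microsecond and
nanosecond variants) and the pcapng section header block type. A gzip file
that yields no data is reported as empty.

```shell
#!/bin/sh
# Sketch: flag attachments that are empty or do not start with a known
# capture magic number. Function names are hypothetical.

# Print the first four bytes of stdin as lowercase hex, e.g. "d4c3b2a1".
magic_hex() {
    od -An -tx1 -N4 | tr -d ' \n'
}

# Classify one file as empty, pcap, pcapng, or unknown.
# gzip-compressed files (magic 1f 8b) are decompressed first.
classify_capture() {
    [ -s "$1" ] || { echo empty; return; }
    magic=$(magic_hex < "$1")
    case $magic in
        1f8b*) magic=$(gzip -dc "$1" 2>/dev/null | magic_hex) ;;
    esac
    case $magic in
        '')                 echo empty ;;
        a1b2c3d4|d4c3b2a1)  echo pcap ;;    # microsecond timestamps
        a1b23c4d|4d3cb2a1)  echo pcap ;;    # nanosecond timestamps
        0a0d0d0a)           echo pcapng ;;
        *)                  echo unknown ;;
    esac
}
```

Looping over the downloaded attachments and printing classify_capture's
result next to each file name gives a quick list of candidates to review
by hand.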

Duplicates can be found with:
md5sum * | sort | uniq -w32 -D | while read -r sum file; do
    echo "$sum" "$(date +"%Y-%m-%d %H:%M" -r "$file")" "$(du -hD "$file")"
done

Are there known efforts to index the files? I don't think that the wiki is
a sustainable way to collect them?

No efforts I know of, but I agree the current method isn't scaling.

The script I mentioned to get the list of protocols in one (or more)
capture files is in git as ./tools/list_protos_in_cap.sh. Pipe it to a text
file and then grep for the protocols you're looking for.

Thanks for the pointers; they may already be sufficient for my validation
purposes.

Kind regards,
Peter
https://lekensteyn.nl

