IDS mailing list archives

Re: Intrusion Detection Evaluation Datasets


From: Damiano Bolzoni <damiano.bolzoni () utwente nl>
Date: Sat, 07 Mar 2009 11:27:51 +0100

On 04/03/2009 6.09, snort user wrote:

> For evaluating a new technique or methodology using a dataset, especially
> when presenting the results to a conference, the validity of the dataset
> is critical. How does one solve this problem, given the limited number of
> standard datasets available?

You have to carefully document how you built the dataset; here are some suggestions (this list is not exhaustive):
- is it artificial, or did you collect live traffic?
- for how many hours has your "collector" been running?
- which applications/systems have you been collecting traffic from?
- if there are any attacks, how many of them and of which types? Did you inject the attacks artificially? If so, how did you select which ones to inject? Where did you get them (Nessus, milw0rm)? (see the sketch after this list)
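
If you do inject attacks artificially, it also helps the reviewer if the ground truth is machine-readable rather than implicit in your prose. Below is a minimal sketch in Python with Scapy; the file names and the CSV label format are my own assumptions, not an established convention:

    #!/usr/bin/env python3
    # Merge artificially injected attack traffic into a live background
    # capture and emit a per-packet ground-truth file for evaluation.
    # Assumes two input files (names are illustrative): background.pcap
    # (benign live traffic) and attack.pcap (injected attacks, e.g.
    # replayed Nessus or milw0rm exploits).
    import csv

    from scapy.all import rdpcap, wrpcap

    background = rdpcap("background.pcap")
    attacks = rdpcap("attack.pcap")

    # Tag each packet with its label before interleaving, so the ground
    # truth survives the merge: 0 = benign, 1 = injected attack.
    tagged = [(p, 0) for p in background] + [(p, 1) for p in attacks]
    tagged.sort(key=lambda t: float(t[0].time))  # order by capture timestamp

    wrpcap("dataset.pcap", [p for p, _ in tagged])

    with open("labels.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["timestamp", "is_attack"])
        for p, label in tagged:
            writer.writerow([float(p.time), label])

With a per-packet label file like this, anyone with access to the dataset can recompute your detection and false-positive rates instead of taking them on faith.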

The point is the following. Because, mainly for privacy reasons, your dataset will remain private, you have to "convince" the reviewer that 1) you didn't cheat and 2) your methodology is solid. By describing in detail what you have done, you stand a better chance that the reviewer will understand that the way you approached the problem was correct, and that you couldn't have done better. When I review a paper that hardly says how the testing environment was set up, I become suspicious about its terrific results.
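
One concrete way to describe it is to ship a small machine-readable manifest next to the capture, answering exactly the questions listed above. Here is a sketch using only the Python standard library; all field names and values are illustrative placeholders, not a standard format:

    #!/usr/bin/env python3
    # Provenance manifest for an IDS evaluation dataset: records how the
    # traffic was obtained, for how long, from which systems, and which
    # attacks were injected and where they came from. All values below
    # are placeholders for illustration.
    import json

    manifest = {
        "traffic_source": "live",          # "live" or "artificial"
        "collection_hours": 72,            # how long the collector ran
        "monitored_systems": [
            "Apache 2.2 web server",
            "Postfix mail server",
        ],
        "injected_attacks": [
            {"type": "SQL injection", "count": 15, "origin": "milw0rm"},
            {"type": "buffer overflow", "count": 4, "origin": "Nessus"},
        ],
        "selection_criteria": "exploits applicable to the monitored services",
    }

    with open("dataset-manifest.json", "w") as f:
        json.dump(manifest, f, indent=2)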

--
Damiano Bolzoni

damiano.bolzoni () utwente nl
Homepage http://dies.ewi.utwente.nl/~bolzonid/
PGP public key http://dies.ewi.utwente.nl/~bolzonid/public_key.asc
Skype ID: damiano.bolzoni () utwente nl

Distributed and Embedded Security Group - University of Twente
P.O. Box 217 7500AE Enschede, The Netherlands
Phone +31 53 4892477
Mobile +31 629 008724
ZILVERLING building, room 3013
