IDS mailing list archives

Re: Re: Intrusion Detection Evaluation Datasets

From: zubair.shafiq () yahoo com
Date: 10 Mar 2009 08:55:29 -0000

An ideal IDS dataset would be fully diverse (in terms of attack types) and completely free of artifacts (introduced
during creation and pre-processing). However, ideal scenarios do not hold in real life; if they did, they would
not be real...

I agree that it is very hard to obtain datasets with payloads because of privacy constraints. Good anonymization
procedures, however, mostly retain the relative statistics of the data. For an example, see the following work by people at ICSI:

http://www.icir.org/enterprise-tracing/devil-ccr-jan06.pdf
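
To make "retaining relative statistics" concrete, here is a minimal sketch of consistent pseudonymization. It is an illustration only and is not taken from the ICSI paper: the function name `pseudonymize_ip`, the key, and the sample addresses are all hypothetical. Each IPv4 address is mapped through a keyed hash, so the same host always gets the same pseudonym and per-host packet counts survive. Note that this simple scheme does not preserve subnet prefixes, and truncating the digest to 32 bits means collisions are possible in principle.

```python
import hashlib
import hmac
import ipaddress
from collections import Counter


def pseudonymize_ip(ip: str, key: bytes) -> str:
    """Map an IPv4 address to a pseudonymous IPv4 address (hypothetical helper).

    The mapping is deterministic for a given key, so identical inputs always
    yield identical outputs. It is NOT prefix-preserving.
    """
    packed = ipaddress.IPv4Address(ip).packed
    digest = hmac.new(key, packed, hashlib.sha256).digest()
    # Keep the first 4 bytes of the digest as the new address.
    return str(ipaddress.IPv4Address(digest[:4]))


if __name__ == "__main__":
    key = b"example-secret-key"  # assumption: kept private by the data owner
    sources = ["10.0.0.1", "10.0.0.2", "10.0.0.1", "192.168.1.5", "10.0.0.1"]
    anonymized = [pseudonymize_ip(ip, key) for ip in sources]
    # The distribution of per-host counts is the same before and after.
    print(sorted(Counter(sources).values()))
    print(sorted(Counter(anonymized).values()))
```

An IDS that only looks at how often each endpoint appears would see the same count distribution on both versions of the trace; an IDS that relies on subnet structure would not, which is one reason the choice of anonymization procedure matters.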

An overwhelming majority of network-based IDSs use only the spatial information present in packet headers. The datasets
that I mentioned in my earlier post can be used to evaluate such IDSs. Moreover, you can find details of the
endpoint worm propagation dataset in the following papers:

http://www.nexginrc.org/papers/tr15-zubair.pdf
http://www.nexginrc.org/papers/gecco08-zubair.pdf

In my view, there are two directions to take dataset labeling further:

1. Improve injection procedures to minimize artifacts. This is more feasible if you know all the parameters
and environmental conditions during trace collection -- Know Thy Data.

2. Use "semi-automated" (or "semi-manual") labeling procedures.
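
As a rough sketch of the first direction, the snippet below injects an attack trace into a benign trace and labels the result. It is only an illustration under simplifying assumptions: the `Packet` record, the `inject` function, and the sample addresses are hypothetical, and only timestamps are adjusted. A real injection procedure would also have to reconcile addresses, TTLs, link rates, and similar fields, which is exactly where artifacts creep in.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Packet:
    """Hypothetical header-only record: timestamp, endpoints, ground-truth label."""
    ts: float
    src: str
    dst: str
    label: str = "benign"


def inject(benign: List[Packet], attack: List[Packet], start_ts: float) -> List[Packet]:
    """Shift the attack trace so it begins at start_ts, label it, and merge by time."""
    if not attack:
        return sorted(benign, key=lambda p: p.ts)
    shift = start_ts - min(p.ts for p in attack)
    shifted = [replace(p, ts=p.ts + shift, label="attack") for p in attack]
    # sorted() is stable, so on timestamp ties benign packets come first.
    return sorted(benign + shifted, key=lambda p: p.ts)


if __name__ == "__main__":
    benign = [Packet(0.0, "10.0.0.1", "10.0.0.9"),
              Packet(1.0, "10.0.0.2", "10.0.0.9"),
              Packet(2.0, "10.0.0.1", "10.0.0.9")]
    attack = [Packet(100.0, "10.0.0.66", "10.0.0.9"),
              Packet(100.5, "10.0.0.66", "10.0.0.9")]
    for p in inject(benign, attack, start_ts=0.5):
        print(p.ts, p.src, p.label)
```

Because the labels are attached at injection time, the ground truth is known exactly; the open question is how closely the shifted packets resemble traffic that was really captured in that environment.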

@Stefano: You have probably missed this point. Semi-automated procedures still require manual intervention; however,
they reduce the amount of it significantly. So, we are not exactly developing a typical anomaly detection system.
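
To illustrate what I mean by semi-automated, here is a minimal sketch. The `triage` function, the thresholds, and the flow scores are all hypothetical; the scores are assumed to come from some automated pre-labeling step. Flows with a confident score are labeled automatically, and only the uncertain ones are handed to a human.

```python
from typing import Dict, List, Tuple


def triage(scores: Dict[str, float],
           low: float = 0.2,
           high: float = 0.8) -> Tuple[Dict[str, str], List[str]]:
    """Split flows into auto-labeled ones and ones needing manual review.

    scores maps a flow id to a suspicion score in [0, 1] (assumed to be
    produced elsewhere). The thresholds are illustrative, not recommended values.
    """
    labels: Dict[str, str] = {}
    review: List[str] = []
    for flow_id, score in scores.items():
        if score >= high:
            labels[flow_id] = "attack"
        elif score <= low:
            labels[flow_id] = "benign"
        else:
            review.append(flow_id)  # a human decides
    return labels, review


if __name__ == "__main__":
    scores = {"f1": 0.95, "f2": 0.05, "f3": 0.5, "f4": 0.1}
    labels, review = triage(scores)
    print(labels)
    print(review)
```

In this toy run only one of four flows needs a manual look. The automated step is there to shrink the manual workload, not to make the final call on the hard cases.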

Let me know what you think.
