IDS mailing list archives
Re: Machine Learning for IDS: which dataset?
From: "J.A." <centurion () phreaker net>
Date: Thu, 08 Jun 2006 11:13:29 +0200
trantichphuoc () yahoo com wrote:
Hi there, I am interested in applying machine learning algorithms to detecting network intrusions. I have read many papers and realized that KDD-99 is the most well-known dataset used in the field. However, this dataset was provided by MIT in 1999, and obviously it's pretty old. As we all know, defensive technologies evolve fast, and so do hacking techniques. Clearly, the KDD-99 dataset would not provide a true representation of a network at the current time. So, could anyone please tell me which dataset is more up to date and specialized for machine learning research in IDS? Thanks, Patrick
Hi, Patrick. I am using the KDD-99 dataset in my research work. Although it is the most well-known dataset, it has several drawbacks that limit what you can do with it. For example, as you note, the distribution of normal data and attack data does not represent a real network.
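To check the distribution of normal and attack records for yourself, something like the following minimal sketch may help. It assumes KDD-99-style records, where the class label is the last comma-separated field and carries a trailing period (for example "normal." or "smurf."). The function name `label_distribution` and the shortened sample records are hypothetical, made up for illustration only.

```python
from collections import Counter


def label_distribution(lines):
    """Return {label: fraction of records} for KDD-99-style CSV lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Assumption: the label is the last comma-separated field and
        # ends with a period, e.g. "normal." or "smurf.".
        label = line.rsplit(",", 1)[-1].rstrip(".")
        counts[label] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


# Hypothetical records, shortened to three fields for illustration;
# real KDD-99 records have 41 features before the label.
sample = [
    "0,tcp,normal.",
    "0,icmp,smurf.",
    "0,icmp,smurf.",
    "0,tcp,neptune.",
]
dist = label_distribution(sample)
for label, fraction in sorted(dist.items(), key=lambda kv: -kv[1]):
    print(f"{label}: {fraction:.2%}")
```

To run it on a local copy of the data, pass an open file handle instead of the sample list. The function reads one line at a time, so it should cope with the full file without loading it into memory.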
I think a better choice is the original data from which the KDD-99 dataset was generated. It can be obtained from www.ll.mit.edu.
Cheers,
Juan A. Suárez-Romero