Nmap Development mailing list archives

Re: Request for Comments: New IPv6 OS detection machine learning engine


From: Mathias Morbitzer <m.morbitzer () runbox com>
Date: Fri, 20 Jan 2017 18:31:38 +0100

Hi everyone,


Meanwhile, I managed to get feedback on the new implementation we started last summer from some people who know their ML.

Let me start by saying that we are doing quite well! :) Of course, there are still things we could improve, and comments on future work:


1) Considering the size of our DB (300+ fingerprints), the random forest model is a good choice. To make use of more complex models,

such as neural networks or deep learning, we would need a much bigger database. Therefore, I suggest sticking with random forest

for the near future and instead focusing on improving in other areas.
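To make the setup concrete, here is a minimal sketch of training a random forest with scikit-learn. The feature matrix and labels are random stand-ins for our fingerprint DB (the real feature extraction is not shown), so only the shapes mirror our situation:

```python
# Sketch: random forest on a DB-sized problem (~300 fingerprints, 695 features).
# X and y are synthetic placeholders, not real Nmap fingerprint data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.random((300, 695))          # ~300 fingerprints, 695 features each
y = rng.integers(0, 5, size=300)    # 5 OS classes as a stand-in

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X, y)
print(clf.predict(X[:3]).shape)     # (3,)
```

A forest of 100 trees trains in well under a second at this scale, which is part of why it suits a 300-entry DB better than a deep net would.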


2) Since a random forest (like other ensemble models) already combines many models, a multi-stage setup would not improve accuracy.

However, it should not make it worse either. Since we have multiple reasons to prefer the multi-stage approach, this is good news.

The reason why the multi-stage approach performed slightly worse in our tests is probably the way we ran the test, which brings me to


3) In terms of evaluation, the 80:20 split is not a good idea: the test set is too small, which creates high variance in the precision.

It would be better to re-run the tests multiple times with a 50:50 split, and then check the mean and variance of the precision.
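This evaluation scheme could be sketched with scikit-learn's ShuffleSplit, which redraws a fresh 50:50 split on every iteration (again on synthetic placeholder data, and macro-averaged precision is just one reasonable choice of scoring):

```python
# Sketch: repeated 50:50 splits to estimate mean and variance of precision.
# Data is synthetic; each of the 10 iterations uses a freshly drawn split.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import ShuffleSplit, cross_val_score

rng = np.random.default_rng(0)
X = rng.random((300, 695))
y = rng.integers(0, 5, size=300)

splitter = ShuffleSplit(n_splits=10, test_size=0.5, random_state=0)
scores = cross_val_score(RandomForestClassifier(n_estimators=50, random_state=0),
                         X, y, cv=splitter, scoring="precision_macro")
print(len(scores), round(scores.mean(), 3), round(scores.std(), 3))
```

Reporting both mean and standard deviation over the 10 runs is exactly what the single 80:20 split cannot give us.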

Also, for the multi-stage approach, it would be interesting to analyze whether wrong classifications are already misclassified in stage 1,

or only in stage 2. This brings me to


4) We could reconsider our choice of stage-1 classifier. For the current first stage, we took the 4 main operating systems plus

a group "others". It could make more sense to create groups based on similar behavior instead.
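One way to derive such groups would be to cluster the fingerprints by their feature vectors instead of assigning OS families by hand. A minimal sketch, again with synthetic placeholder features and an arbitrary choice of 5 groups:

```python
# Sketch: forming stage-1 groups by behavioral similarity via k-means clustering.
# Features are synthetic placeholders; k=5 is an arbitrary illustrative choice.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.random((300, 695))

groups = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(X)
print(len(set(groups)))
```

The cluster assignments would then replace the hand-picked "4 OS families + others" as the stage-1 labels; whether the resulting groups are interpretable would need checking against the known OS names.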


5) As we already suspected, having 695 features is quite a lot. Approaches to reduce the number of features include, for example,

neural networks or principal component analysis (PCA). We played around with such things a bit before, but it might be interesting

to have another look.
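For reference, the PCA route is a one-liner in scikit-learn. The data here is a synthetic stand-in and 50 components is an arbitrary target dimensionality; in practice we would pick it from the explained-variance curve:

```python
# Sketch: compressing the 695 features with PCA before classification.
# Synthetic data; n_components=50 is an arbitrary choice for illustration.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.random((300, 695))

X_reduced = PCA(n_components=50).fit_transform(X)
print(X_reduced.shape)   # (300, 50)
```

The reduced matrix would then feed into the random forest in place of the raw 695-column one.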


6) I also learned that ML might not always be the best solution when it comes to finding exactly one perfect match. ML is good at

providing the top k results, among which there is a high probability that one is correct. So this might also be something to consider

in the future for our tests (evaluating the top k results will give a better overview of how the model performs), and also when OS

detection is performed, we could give the user the top k OS guesses, or at least offer this option.
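Getting the top k guesses out of a random forest is straightforward via its class probabilities. A minimal sketch on synthetic placeholder data, with k=3 as an arbitrary example:

```python
# Sketch: returning the top-k OS guesses from class probabilities instead of
# a single best match. Model and data are synthetic placeholders; k=3.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.random((300, 695))
y = rng.integers(0, 5, size=300)

clf = RandomForestClassifier(random_state=0).fit(X, y)
proba = clf.predict_proba(X[:1])[0]          # one probability per known class
top_k = np.argsort(proba)[::-1][:3]          # indices of the 3 best guesses
print(top_k.shape)                           # (3,)
```

The same ranking would serve both purposes mentioned above: a top-k accuracy metric for our tests, and a ranked list of OS guesses for the user.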


7) And finally, I've been told that we could also try the non-ML approach of signature-based checking.
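In its simplest form, signature-based checking is just an exact lookup against known fingerprints. The signatures and probe values below are invented for illustration, not real Nmap fingerprint fields:

```python
# Sketch: a trivial exact-match signature lookup, the non-ML baseline.
# Signature keys and OS names are hypothetical, not real Nmap fingerprints.
signatures = {
    ("ttl=64", "win=65535"): "Linux 4.x",
    ("ttl=128", "win=8192"): "Windows 10",
}

observed = ("ttl=64", "win=65535")
print(signatures.get(observed, "unknown"))   # Linux 4.x
```

A real implementation would of course need fuzzy matching to tolerate small deviations, which is exactly the gap the ML approach is meant to fill.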


So that's it for the feedback. I hope we can increase accuracy even further with this information!


Cheers,

Mathias

_______________________________________________
Sent through the dev mailing list
https://nmap.org/mailman/listinfo/dev
Archived at http://seclists.org/nmap-dev/

