Nmap Development mailing list archives

Re: Request for Comments: New IPv6 OS detection machine learning engine


From: Mathias Morbitzer <m.morbitzer () runbox com>
Date: Thu, 2 Mar 2017 17:42:28 +0100

On 02/21/2017 05:34 AM, Varunram Ganesh wrote:
>     5) As we already thought, having 695 features is quite a lot.
>     Approaches to reduce the number of features could be, for example,
>     using neural networks or principal component analysis (PCA). We
>     did play around with such things a bit before, but it might be
>     interesting to have another look.
>
> I'm not exactly familiar with these either, but it definitely sounds
> like it's worth a look!

> Unfortunately, me neither. Anyone who is familiar with them is welcome
> to apply to the next Google Summer of Code!

I think PCA might be the better choice: we have a dataset of slightly more than 300 fingerprints, and for neural networks to work well we would need at least as many fingerprints as features. That being said, as you mentioned, it'd be helpful to have more fingerprints to make the algorithm better (and maybe implement neural networks in the future).

Agreed. We will hopefully have a closer look at PCA at this year's GSoC.
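
For anyone on the list who has not worked with PCA before, here is a rough sketch of the idea: center the feature matrix, find the directions of largest variance, and keep only the projections onto the first few of them. This is only an illustration under my own assumptions, not code from the Nmap tree. The function names (center, leading_direction, pca) are made up for this mail, the toy data stands in for the real 695-feature fingerprints, and power iteration with deflation is simply the easiest method to write down without a linear algebra library. A real experiment would more likely use an existing PCA implementation.

```python
import math
import random


def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def center(rows):
    """Subtract the per-feature mean from every row."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def leading_direction(rows, iterations=200, seed=0):
    """Estimate the direction of largest variance by power iteration.

    Each step computes X^T (X v), so the full feature-by-feature
    covariance matrix is never built explicitly.
    """
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in rows[0]]
    start_norm = math.sqrt(dot(v, v))
    v = [x / start_norm for x in v]
    for _ in range(iterations):
        scores = [dot(row, v) for row in rows]        # X v
        w = [dot(scores, col) for col in zip(*rows)]  # X^T (X v)
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:  # no variance left in the data
            break
        v = [x / norm for x in w]
    return v


def pca(rows, k):
    """Return (components, reduced) for the first k principal components."""
    centered = center(rows)
    residual = [row[:] for row in centered]
    components = []
    for _ in range(k):
        v = leading_direction(residual)
        components.append(v)
        # Deflation: remove the part of each row explained by v.
        deflated = []
        for row in residual:
            score = dot(row, v)
            deflated.append([x - score * c for x, c in zip(row, v)])
        residual = deflated
    reduced = [[dot(row, c) for c in components] for row in centered]
    return components, reduced


# Toy data: 10 "fingerprints" with 3 features. The second feature is just
# twice the first and the third is constant, so one component is enough.
fingerprints = [[float(t), 2.0 * t, 1.0] for t in range(10)]
components, reduced = pca(fingerprints, k=1)
print(components[0])
print(reduced[:3])
```

With the real data, the rows would be the roughly 300 fingerprints and k would be chosen much smaller than 695, for example by checking how much classification accuracy is lost as k shrinks.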

Last summer, we also tried to use the OpenCV function getVarImportance(). [1] According to its description, it should "return the variable importance vector, computed at the training stage", so we were hoping to get an indication of which features are most helpful for classification. However, the numbers we received didn't really help us. Maybe we should have another look at this.
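
One way to cross-check such importance numbers is permutation importance: shuffle one feature at a time and measure how much the accuracy drops. To be clear, this is a different technique from whatever OpenCV's random trees compute internally, and the sketch below is only an illustration under my own assumptions. The nearest-centroid classifier, the function names, and the toy data are all made up for this mail and have nothing to do with our actual model.

```python
import random


def fit_centroids(X, y):
    """Return {label: mean feature vector} for a nearest-centroid model."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc]
            for label, acc in sums.items()}


def predict(centroids, row):
    """Pick the label whose centroid is closest to the row."""
    def sqdist(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(centroids, key=lambda label: sqdist(centroids[label]))


def accuracy(centroids, X, y):
    """Fraction of rows that are classified correctly."""
    hits = sum(predict(centroids, row) == label for row, label in zip(X, y))
    return hits / len(y)


def permutation_importance(centroids, X, y, repeats=20, seed=0):
    """Mean accuracy drop per feature when that feature is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(centroids, X, y)
    importances = []
    for j in range(len(X[0])):
        total_drop = 0.0
        for _ in range(repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            shuffled = [row[:j] + [value] + row[j + 1:]
                        for row, value in zip(X, column)]
            total_drop += baseline - accuracy(centroids, shuffled, y)
        importances.append(total_drop / repeats)
    return importances


# Toy data: feature 0 separates the two classes, feature 1 is constant.
X = [[0.0, 5.0], [1.0, 5.0], [0.0, 5.0], [1.0, 5.0],
     [10.0, 5.0], [11.0, 5.0], [10.0, 5.0], [11.0, 5.0]]
y = ["a", "a", "a", "a", "b", "b", "b", "b"]

model = fit_centroids(X, y)
importances = permutation_importance(model, X, y)
print(importances)
```

A feature whose shuffling barely changes the accuracy is a candidate for removal. For a fair estimate the accuracy should be measured on held-out fingerprints rather than on the training set, which the toy example skips for brevity.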

[1] http://docs.opencv.org/3.0.0/d0/d65/classcv_1_1ml_1_1RTrees.html#afc0c6f0c53e11d04ec1d1e176b3ba07f
_______________________________________________
Sent through the dev mailing list
https://nmap.org/mailman/listinfo/dev
Archived at http://seclists.org/nmap-dev/
