Nmap Development mailing list archives

Prabhjyot's Status Report: #9 of 17


From: Prabhjyot Singh Sodhi <prabhjyotsingh95 () gmail com>
Date: Tue, 28 Jun 2016 10:33:28 +0530

Hey list,

This week went by quickly, but I think I'm finally closing in on the random
forest experimentation.
So, here's what was done:

Accomplishments:
- As reported last week, I have been trying to convert the OpenCV
implementation to return the top 3 predictions instead of just one. This
basically involves collecting the votes cast by the individual trees of the
random forest. I have posted questions to the OpenCV forum and their IRC
channel, but haven't had any success yet. (A rough sketch of the idea
appears after this list.)

- For testing the new db, I tried two methods (as of now, both use accuracy
as the measure):
1) Stratified sampling: I formed a class-wise 80:20 split, where 80% of the
prints are used for training and 20% for testing. This method yielded a
total of 68 prints in the testing set. The Liblinear model was able to
predict 40 prints correctly (64.4%), while the random forest was able to
predict 52 prints correctly (81%). (A sketch of this evaluation setup also
appears after this list.)
2) Randomized testing: In this case I selected a set of either 20 or 40
prints randomly (instead of class-wise). I repeated this test around 15
times, and the random forest model performed better each time (in terms of
accuracy). (I will update nmap-private-dev with these test results.)

- This week I also worked on documenting the changes made to the db and the
model that is being used. I hope to push this to nmap-private-dev within a
week for feedback.

- Lastly, I passed GSoC's midterm evaluation, yay! I'd also like to
congratulate the other participants who passed.
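
For the record, here is a rough sketch of the per-tree voting idea from the
first item. It uses scikit-learn instead of OpenCV (since scikit-learn
exposes the individual trees directly) and dummy data in place of the real
prints, so treat it as an illustration of the idea rather than the actual
implementation:

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    def top3_by_tree_votes(forest, sample):
        """Return the 3 classes receiving the most votes from individual trees."""
        sample = np.asarray(sample).reshape(1, -1)
        # Each tree casts one vote for the sample; the sub-estimators return
        # class *indices*, which map back to labels via forest.classes_.
        idx_votes = [int(tree.predict(sample)[0]) for tree in forest.estimators_]
        counts = np.bincount(idx_votes, minlength=len(forest.classes_))
        top = np.argsort(counts)[::-1][:3]          # most-voted classes first
        return [(forest.classes_[i], int(counts[i])) for i in top]

    # Dummy stand-ins: X = fingerprint feature vectors, y = class labels.
    X = np.random.rand(200, 10)
    y = np.random.randint(0, 5, size=200)
    forest = RandomForestClassifier(n_estimators=100).fit(X, y)
    print(top3_by_tree_votes(forest, X[0]))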

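And a similar sketch of the stratified 80:20 evaluation from method 1, again
with scikit-learn stand-ins for the Liblinear and random forest models and
dummy data instead of the real prints:

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression   # liblinear-backed
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    # Dummy stand-ins: X = fingerprint feature vectors, y = class labels.
    X = np.random.rand(340, 10)
    y = np.random.randint(0, 5, size=340)

    # stratify=y gives the class-wise 80:20 split described in method 1.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=0)

    linear = LogisticRegression(solver='liblinear').fit(X_tr, y_tr)
    forest = RandomForestClassifier(n_estimators=100).fit(X_tr, y_tr)

    print("linear model accuracy:  ", accuracy_score(y_te, linear.predict(X_te)))
    print("random forest accuracy: ", accuracy_score(y_te, forest.predict(X_te)))
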
Priorities:
- Work on documentation
- Experiment with using specificity and sensitivity (or precision and
recall) in addition to accuracy while testing models (a sketch is included
after this list).
- Start working on the multi-stage classifier
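
Roughly what I have in mind for the metrics item, sketched with scikit-learn
and dummy labels: in the multi-class case, recall is the per-class
sensitivity, and specificity can be read off the confusion matrix.

    import numpy as np
    from sklearn.metrics import confusion_matrix, precision_score, recall_score

    # Dummy true/predicted labels in place of real test results.
    y_true = np.array([0, 0, 1, 1, 2, 2, 2])
    y_pred = np.array([0, 1, 1, 1, 2, 0, 2])

    print("precision (macro):", precision_score(y_true, y_pred, average='macro'))
    print("recall/sensitivity (macro):", recall_score(y_true, y_pred, average='macro'))

    # Per-class specificity = TN / (TN + FP), computed from the confusion matrix.
    cm = confusion_matrix(y_true, y_pred)
    for c in range(cm.shape[0]):
        tn = cm.sum() - cm[c, :].sum() - cm[:, c].sum() + cm[c, c]
        fp = cm[:, c].sum() - cm[c, c]
        print("class", c, "specificity:", tn / (tn + fp))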

Cheers,
Prabhjyot
_______________________________________________
Sent through the dev mailing list
https://nmap.org/mailman/listinfo/dev
Archived at http://seclists.org/nmap-dev/
