Dailydave mailing list archives
Re: Assymetry
From: Josh Saxe <josh.saxe () invincea com>
Date: Mon, 11 Apr 2016 14:04:37 -0400
I figured I'd chime in as someone who builds security machine learning models as part of his day job. A few hopefully not-too-incongruous observations:

1) Most security problems are not machine learning problems. Like encryption, two-factor authentication, taint analysis, or hand-crafted IOCs, machine learning is just one of many security tools. But somehow people outside of machine learning seem to think a) machine learning can be applied everywhere and replace every other approach, or b) machine learning can be applied nowhere, always underperforms, and is marketing snake oil. The people who believe a) are bound to be disappointed, and the people who believe b) are bound to be blindsided when they wake up and realize machine learning has become an important ingredient in the network defense landscape.

2) For a working security data scientist, much of the ingenuity in developing a successful machine learning product lies in picking problems that *are* good machine learning problems and not going down the rabbit hole of problems that aren't. Unsupervised clustering of malware to help identify new malware families or link threat actors -- that's a good problem, and systems that do this are currently deployed to good effect, but can probably be improved upon. Detecting and classifying malware is another good one that's already been productized but merits continued research. Setting firewall policy or predicting which users on a network will commit treason or sell your trade secrets is not a good machine learning problem and probably won't be in the foreseeable future, even though I'll bet there are products on the market that claim to do these things.

3) For a problem to be a good security machine learning problem you need a continuously replenished source of good data, because security models go out of date as adversaries evolve if the models don't evolve along with them.
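[A minimal sketch of the malware-clustering idea in point 2: group samples into candidate families by feature-set overlap. The sample names, API-call feature sets, and the 0.5 threshold below are all hypothetical, chosen purely for illustration; deployed systems use far richer features and more sophisticated clustering.]

```python
# Sketch: grouping malware samples into candidate families by feature overlap.
# Sample names and feature sets are hypothetical, for illustration only.

def jaccard(a, b):
    """Jaccard similarity between two feature sets."""
    return len(a & b) / len(a | b) if (a or b) else 0.0

def cluster(samples, threshold=0.5):
    """Greedy single-link clustering: a sample joins the first cluster
    containing any member within `threshold` similarity of it."""
    clusters = []
    for name, feats in samples.items():
        for c in clusters:
            if any(jaccard(feats, samples[m]) >= threshold for m in c):
                c.append(name)
                break
        else:
            clusters.append([name])
    return clusters

samples = {
    "sample_a": {"CreateRemoteThread", "WriteProcessMemory", "VirtualAllocEx"},
    "sample_b": {"CreateRemoteThread", "WriteProcessMemory", "OpenProcess"},
    "sample_c": {"InternetOpenUrlA", "URLDownloadToFileA"},
}
print(cluster(samples))  # the two process-injection samples group together
```

[The same structure supports the actor-linking use case Josh mentions: shared clusters across campaigns suggest shared tooling.]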
If you don't have good data at scale (and this includes *ground truth* for that data), machine learning is the wrong approach. For example, because we don't have thousands of examples of employees going rogue and selling trade secrets (at least I don't), a machine learning approach to detecting such employees doesn't make sense.

4) To echo what Sven said, custom modeling for a given security application -- mostly either feature engineering or custom crafting of deep learning models that automate a portion of the feature engineering process -- is the main work of a security data scientist. In my experience, wholesale adoption of approaches from other fields never works. For one, the statistics of the problem are totally different: in the detection use case, we tend to care only about a model's performance in the extremely low false positive rate region, which sets the modeling goals apart from those of many non-security applications. And secondly, security is just different from computer vision, text mining, etc., and in my experience requires custom solutions to perform well.

Best,
Josh

On Fri, Apr 1, 2016 at 9:59 PM, <Robin.Lowe () forces gc ca> wrote:
Good day all,

Just a couple of things I thought of while reading the earlier discussion on AI and this follow-up email. Just some, as Chris so eloquently put it earlier, conversation fodder.

I think one thing we have to keep in mind is that the underlying framework behind machine learning is still a machine. An issue I can see here is: who is accountable if it fails? If we're talking about national security, what's the risk someone will be willing to take on in order to prove that their new machine learning intrusion detection system works 100% of the time? The number of hours required to amass the amount of data needed to seed the system would be substantial on its own. There's also the possibility of false positives being generated by erroneous data. Sure, a listening meterpreter shell on port 4444 is pretty damn obvious, but what about, say, Cobalt Strike's Beacon system? Will the people developing the IDS need to spend thousands of dollars throwing all of these expensive network auditing programs at it in order to generate the data necessary to make it accurate even 90% of the time? Also, the budget just for personnel would be pretty high. You'd need people in R&D, maintenance, actually checking flagged intrusion attempts, etc.

One last thing before I start in on the possible positives is that the machine itself might be prone to exploitation. Similar to how getting into domain controllers and hypervisors is pretty much an endgame state, what if you broke into the IDS itself and started messing with its signatures? Seems like a few things to think about.

However, one cost-reducing factor is that it's always looking, and faster than a person can. Sure, there are some blue teams that are basically machines at this point, but I can definitely see a time where machines take over that facet of security. You don't have to pay it a salary; just keep the machine happy with electricity and known behaviours and it'll chug along.
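[One way an IDS can get at the Beacon case above is timing: periodic callbacks are far more regular than interactive traffic. A minimal sketch, with hypothetical timestamps and an illustrative threshold -- real Beacon deployments add configurable jitter and sleep times, so a fixed cutoff like this is not production logic:]

```python
# Sketch: flagging beacon-like traffic by the regularity of its callbacks.
# Timestamps and the 0.1 cutoff are hypothetical, for illustration only.
from statistics import mean, pstdev

def looks_like_beacon(timestamps, max_cv=0.1):
    """Return True if inter-arrival times are suspiciously regular.

    cv = stdev / mean of the gaps between connections; periodic callbacks
    have a low cv, while interactive human traffic is bursty (high cv)."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    if len(gaps) < 3:
        return False  # not enough evidence to judge
    return pstdev(gaps) / mean(gaps) <= max_cv

periodic = [0, 60, 121, 180, 241, 300]  # ~60 s callbacks with slight jitter
bursty = [0, 2, 3, 90, 91, 400]         # human-like browsing pattern
print(looks_like_beacon(periodic), looks_like_beacon(bursty))  # True False
```

[Which loops back to the cost question: a rule like this is cheap to run continuously, but tuning it to a tolerable false positive rate is exactly the data-hungry part.]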
Kind of starting to sound like an antivirus program, but one that looks at networks instead of files. New to this sort of thing, so sorry if I mentioned something that would be considered common knowledge or just plain nonsense.

Cheers,

Leading Seaman / Matelot de 1re classe Robin Lowe
Naval Communicator, HMCS EDMONTON
Department of National Defence / Government of Canada
Robin.Lowe () forces gc ca / Tel: 250-363-7940

Communicateur Naval, NCSM EDMONTON
Ministère de la Défense nationale / Gouvernement du Canada
Robin.Lowe () forces gc ca / Tel: 250-363-7940

"The quieter you are, the more you are able to hear."

From: dailydave-bounces () lists immunityinc com [mailto:dailydave-bounces () lists immunityinc com] On Behalf Of Dave Aitel
Sent: April-01-16 11:36 AM
To: dailydave () lists immunityinc com
Subject: [Dailydave] Assymetry

One possible long-lasting cause of the "asymmetry" everyone talks about is that US defenders get quite high salaries compared to Chinese attackers (I assume; not being a Chinese attacker, it's hard to know for sure). Just in pure "dollars spent vs dollars spent" it seems like it would be three times cheaper to be a Chinese attacker at that rate? But I think it's still a question whether or not machine learning techniques make surveillance cheaper than intrusion as a rule. What if it does? What would that change about our national strategy? (And if it DOESN'T, then why bother?)

-dave

_______________________________________________
Dailydave mailing list
Dailydave () lists immunityinc com
https://lists.immunityinc.com/mailman/listinfo/dailydave
Current thread:
- Assymetry Dave Aitel (Apr 01)
- Re: Assymetry Sven Krasser (Apr 01)
- Re: Assymetry Robin.Lowe (Apr 11)
- Message not available
- Re: Assymetry Josh Saxe (Apr 12)