Dailydave mailing list archives

Re: You cannot use IDS data to talk about 0days or attacks.


From: Mara Tam <marasawr () gmail com>
Date: Sat, 7 May 2016 16:30:04 -0400

But when you hear me go on and on about how Academia has completely lost its way in security, it's because of papers 
like the one at the top of this email.

Since when is Symantec Research Labs an academic institution? That is a vendor paper submitted to and presented at a 
professional association's conference.[1][2] How did you get from hypothetical Symantec data falsification scenarios to 
‘academia’ being responsible for everything from the misuse of IDS data in ‘threat analysis’ to shitty corporate and 
public policy?
_________
[1] https://www.sigsac.org/ccs/CCS2012/
[2] http://www.acm.org/


On 6 May 2016, at 08:50, dave aitel <dave () immunityinc com> wrote:


This paper is bad in many ways, but in particular it confuses binaries with 0day (which are more related to 
vulnerabilities), uses a simplistic "windows of vulnerability" model, and tries to derive real data from the 
Symantec WINE dataset.
https://users.ece.cmu.edu/~tdumitra/public_documents/bilge12_zero_day.pdf

A brief word about the WINE dataset and datasets like it: It is impossible to remove massive observer bias from them. 
All I want you to do is read the above paper and ask yourself "If the most used 0day on the market was in Symantec's 
endpoint protection, what would this paper look like?"  A good rule of thumb is that if someone is talking about 
"Windows of vulnerability" they have oversimplified the problem beyond recognition.

What you get with people who rely on IDS data to talk about 0days is a bizarre level of cognitive dissonance when it 
comes down to how bad their data is for the conclusions they are trying to draw. The only valid thing you can say 
from that kind of data is "sometimes we get lucky and find an 0day". And the same thing is true when looking at the 
Verizon data to try to understand attacks. Their conclusions this year are demonstrably nonsensical, but every year 
has been the same basic methodology...

This is a must read: http://blog.trailofbits.com/2016/05/05/the-dbirs-forest-of-exploit-signatures/

But when you hear me go on and on about how Academia has completely lost its way in security, it's because of papers 
like the one at the top of this email. When you don't have the data you need to make a conclusion, but you are forced 
to publish something, you get shit results. And then we make government and corporate policy decisions based on those 
results.

-dave
(P.S. The Windows emulator WINE is great, and not related to the Symantec WINE dataset: 
https://www.caida.org/workshops/telescope/slides/telescope1103_wine.pdf)
(P.P.S. A behavioral Windows dataset would actually be of great value. Maybe Crowdstrike could drop one out?)


_______________________________________________
Dailydave mailing list
Dailydave () lists immunityinc com
https://lists.immunityinc.com/mailman/listinfo/dailydave
