IDS mailing list archives
Re: Anomaly Based Network IDS
From: Thomas Ptacek <tqbf () arbor net>
Date: Thu, 24 Jun 2004 19:16:14 -0400
On Jun 18, 2004, at 9:09 AM, Joe Dauncey wrote:
I suppose my definition of anomaly based is that it discovers attacks based on sampling and analysing the network traffic and identifying anomalies on the norm, rather than relying on a specific external signature to tell it what to look for.
Interesting thread. Some interesting points from Drew, Adam, and Sasha, leading to some things I'd like to clear up. Obviously, I'm biased (my company leads the market in anomaly detection systems). I also compete directly with Adam's company. But I think my take on this is different than most IDS enthusiasts expect.
First, "anomaly detection" is one of the least useful terms in information security.
- The term has been completely hijacked by mainstream IDS, which uses "protocol anomaly detection" to describe something that is basically "meta-signatures" or "preemptive signatures" --- having a rulebase that prescribes "good" and "bad" traffic is the dictionary definition of "misuse detection".
- Even after you refine "real" anomaly detection to refer to "systems that learn patterns of normal behavior", you're still left with tens of different approaches, defined by different threat models, strengths, and weaknesses.
For the same reason, terms that describe where the raw information comes from aren't useful. You can write a "flow-based" misuse detector. Roesch's target-based work comes dangerously close to being a misuse-based anomaly detector. The mind reels. And nevertheless, I'm going to continue using the term "anomaly detection" throughout this post, just to add to the confusion.
There are two basic problems with this whole thread:
- It's a discussion about "detection systems" that abstracts away the threat model --- what are we trying to detect --- and substitutes a straw-man --- "signature-free misuse detector".
- It conflates a fundamental technological approach --- modeling networks --- with a product goal: detection. We're not being very creative if, based on a complete understanding of the normal usage patterns of an entire network, the best thing we can come up with is "unusually long-lived flow!".
I'm going to take a whack at these problems in order.
First, my sympathies are with Drew in the zero-day argument, and not just because of my vast admiration for eEye's research team. I had Drew's job at Secure Networks, have run reasonably large networks, and was a partisan on Focus-IDS when Vern Paxson was arguing that anomaly detection couldn't solve the zero-day detection problem for precisely the reasons Drew is bringing up. A clumsy swipe at some of the subtext of Drew's argument:
- Number of new vulnerabilities discovered by research teams like eEye: hundreds.
- Number of new vulnerabilities discovered by next-gen IDS systems like Lancope: zero.
Certainly, rate-driven anomaly models have some day-to-day value for perimeter security analysts; backdoor ports and sudden behavior changes are canary signals that machines have been compromised. There's an argument to be had that on a signal/noise basis, this gives anomaly detection a value comparable to typical IDS. But as a front-line replacement for signature IDS at the perimeter, the argument against is pretty strong: anomaly alerts are abstract and still prone to false positives.
I question the value of systems like Lancope, versus a good signature-driven system, for immediate detection of perimeter security threats.
I think there are two threat models where good anomaly-driven systems do have quite a bit of value:
- Network worm outbreaks: a good anomaly model can give a rapid, coherent early-warning signal of a network worm outbreak, and more importantly can be extremely valuable in mitigating and recovering from outbreaks.
- Insider misuse: the subtle attacks that can't be accounted for in signatures, like "improper web access to payroll resource".
On the worm issue, I'll argue that we have a pretty good claim to "zero-day" (and even "pre-zero-day") defense. Our system has a model that statistically detects propagating behavior. We do a good job of detecting worm outbreaks without signatures, but more importantly, we characterize, scope, and trace the behavior we've flagged. What protocol is the worm vectoring over? Which hosts have been tapped? Who's actively propagating the traffic? How much traffic is there? How is the rate of infection changing? These are important questions during an outbreak, and they are not questions that many signature systems do a good job of answering.
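To make "detecting propagating behavior" concrete, here is a minimal sketch of one possible heuristic. It is an illustration only and not a description of any vendor's model. It assumes flow records arrive as time-ordered `(timestamp, src, dst, dst_port)` tuples; the function name `find_propagators` and the `fanout_threshold` parameter are hypothetical. A host is flagged when it is first contacted on a port and afterwards contacts several distinct hosts on that same port.

```python
from collections import defaultdict

def find_propagators(flows, fanout_threshold=3):
    """Flag hosts that were contacted on a port and then fanned out on it.

    flows: iterable of (timestamp, src, dst, dst_port), sorted by timestamp.
    Returns {(host, port): {"infected_by": host, "targets": [hosts]}}.
    """
    contacted_on = {}          # (host, port) -> first host that contacted it
    fanout = defaultdict(set)  # (host, port) -> hosts it contacted afterwards
    for _ts, src, dst, port in flows:
        # Only count outbound connections made after src was itself contacted.
        if (src, port) in contacted_on:
            fanout[(src, port)].add(dst)
        contacted_on.setdefault((dst, port), src)
    return {
        key: {"infected_by": contacted_on[key], "targets": sorted(targets)}
        for key, targets in fanout.items()
        if len(targets) >= fanout_threshold
    }
```

The same bookkeeping goes some way toward the scoping questions above: the dictionary keys name the protocol and the propagating hosts, and the `infected_by` links give a rough trace of who was tapped by whom. Note that a host which starts an outbreak without ever being contacted is not flagged by this particular heuristic.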
At the same time, it's on the insider misuse threat that I think anomaly detection gets really interesting. Dan Farmer gained instant notoriety in the early 90's for SATAN and his "hack your system to secure it" paper. Go re-read that paper: most of the attacks he described (and tested for with SATAN) are textbook misconfiguration and misuse problems. Ten subsequent years of buffer overflow hell and massive investment in perimeter security have clouded the fact that, on most large networks, Dan Farmer's SATAN problems remain unsolved. The only thing that changed was the protocol: NFS turned into SMB. Wonderful.
The key point here is that the internal security threat model is different from the perimeter security threat model. You aren't speaking to the internal threat model when you talk about hacker attacks and zero-day exploits. Insider attacks are carried out using Internet Explorer and the Windows file browser, and they are often carried out by attackers who don't know what a TTL is, let alone how to tunnel SSH in an HTTP connection. When an attacker has both insider access and deep technical knowledge, we are worried less about whether he has an exploit for the LSASS bug, or a DNS tunneling proxy, and a lot more concerned about whether he can write a 10 line perl script to disrupt a trading feed protocol or capture patient health information. We use anomaly methods to capture attacks that can't be accounted for with signatures, not to replace signatures for attacks that they already handle well.
As I mentioned previously, there are a variety of interesting anomaly models. We've implemented several of them. To detect worms, model behavior propagation. To detect a flooding attack, baseline traffic rates over time. To find a helpdesk worker mucking with the payroll server, model relationships between hosts. To have a reasonable discussion about anomaly models versus signatures, pick a threat model, and talk about a specific anomaly model, and then we'll get somewhere.
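As one illustration of "baseline traffic rates over time", the following is a minimal sketch under stated assumptions: traffic has already been reduced to one count per fixed interval (say, packets per minute), and an interval is treated as anomalous when it exceeds the mean of the preceding window by `k` standard deviations. The function name `rate_anomalies` and the parameters `window` and `k` are hypothetical, and real rate models are likely to be considerably more involved (time-of-day seasonality, for example).

```python
from statistics import mean, stdev

def rate_anomalies(samples, window=6, k=3.0):
    """Return indices of intervals whose count exceeds the rolling baseline.

    samples: list of per-interval traffic counts.
    An interval is flagged if it is greater than mean + k * stdev of the
    `window` intervals immediately before it.
    """
    flagged = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        threshold = mean(history) + k * stdev(history)
        if samples[i] > threshold:
            flagged.append(i)
    return flagged
```

One design caveat: a flagged spike enters the history for the following intervals and inflates the baseline, so a sustained flood would be flagged at its onset but not necessarily for its whole duration. Excluding flagged intervals from the history is one way to address that.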
The problem with "signature versus anomaly" arguments that annoys me even more is the tunnel-vision around the "detection" problem. What can a network model do besides detect a worm outbreak?
- It can keep track of infected hosts on the network to facilitate cleanup.
- It can track dependencies on infected hosts, so we know which hosts to pull the plug on and which hosts to triage patch.
- It can tell us whether we use the afflicted protocol at all (remember Slammer), because if we don't, we might as well just ACL the whole service off at the core.
- It can tell us which pairs of hosts legitimately use the afflicted protocol, so we can halt the outbreak with filters without disrupting a $10,000/second HR system.
- From the moment eEye gave that webinar for LSASS where they said "this vulnerability will definitely become a worm", it can lock the network down by answering that same "who's talking to who" question.
- For a variety of protocols on eEye's hit-list, it can track and prepare us to lock down the network even in advance of advisory publication.
The ability to provide an instant answer to the "who's talking to who" question is interesting and useful. Find latent vulnerabilities (services that are never used, but are installed). Generate firewall rules to create DMZs in front of critical resources. Provide forensic information for incident investigations. I'm pretty sure we haven't scratched the surface yet.
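The "who's talking to who" idea can be sketched in a few lines. This is an assumption-laden illustration, not any product's implementation: flows are taken to be `(client, server, port)` tuples observed during normal operation, the function names are hypothetical, and the rule strings are a made-up notation rather than real ACL syntax for any firewall or router.

```python
from collections import defaultdict

def build_model(flows):
    """Map each port to the set of (client, server) pairs seen using it."""
    model = defaultdict(set)
    for client, server, port in flows:
        model[port].add((client, server))
    return model

def lockdown_rules(model, port):
    """Permit host pairs known to use `port`, then deny everything else on it."""
    pairs = sorted(model.get(port, ()))
    rules = [f"permit {c} -> {s} port {port}" for c, s in pairs]
    rules.append(f"deny any -> any port {port}")
    return rules

def unused_services(model, listening):
    """Return installed (server, port) services that no observed flow used.

    listening: set of (server, port) pairs known to be installed.
    """
    used = {(s, port) for port, pairs in model.items() for _, s in pairs}
    return sorted(listening - used)
```

If a port has no observed users, `lockdown_rules` yields only the deny rule, which corresponds to the "ACL the whole service off at the core" case. `unused_services` corresponds to the latent-vulnerability case: services that are installed but never used.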
But it definitely doesn't help to tilt at windmills and try to position anomaly detection as a hapless replacement for signature IDS.
---
Thomas H. Ptacek // Product Manager, Arbor Networks
(734) 327-0000