IDS mailing list archives

Re: TippingPoint Releases Open Source Code for First Intrusion Prevention Test Tool, Tomahawk


From: ADT <synfinatic () gmail com>
Date: Mon, 8 Nov 2004 20:58:28 -0800

Hey Paul,

You make a number of good points, but I honestly think most of the
issues you raise with IPS testing are very solvable using replay
technology. That's not to say that replay technology is the best
solution in all cases, but let's go down the list...

On Mon, 8 Nov 2004 21:01:37 -0500, Paul Palmer <b.paul.palmer () gmail com> wrote:

[snip]

First, once the IPS responds, the remainder of the packets replayed
are likely no longer an accurate reflection of what would have
happened with live traffic. For example, if the IPS drops packets,
does the capture show retransmissions at the key points? If not, how
do you know that naturally occurring retransmissions wouldn't get
forwarded? Does the tool compensate for this by synthesizing
retransmissions on "dropped" packets?

Yeah, I think retransmitting the last "appropriate" packets would be
quite reasonable and not too painful to implement.  While I haven't
looked carefully enough to be sure, this seems to be the method
tomahawk implements.  It's also on the todo list for tcpreplay v3.
Please let me know if you think otherwise.
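
To make the idea concrete, here is a minimal sketch of a replay loop that
synthesizes retransmissions for packets the device drops. It is not how
tomahawk or tcpreplay actually do it; the function name, the `forward`
callback (standing in for "send on one interface, see whether it shows up
on the other") and `max_retries` are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple


def replay_with_retransmit(
    packets: List[bytes],
    forward: Callable[[bytes], bool],
    max_retries: int = 3,
) -> List[Tuple[int, int, bool]]:
    """Replay packets in order, re-sending any packet the device drops.

    `forward` is a hypothetical callback that sends one packet through the
    device under test and returns True if it came out the other side.
    Returns one (index, attempts, delivered) tuple per packet.
    """
    results = []
    for index, packet in enumerate(packets):
        attempts = 0
        delivered = False
        # One initial send plus up to max_retries synthesized retransmissions.
        while attempts <= max_retries and not delivered:
            attempts += 1
            delivered = forward(packet)
        results.append((index, attempts, delivered))
    return results
```

A packet that is still undelivered after every retry was consistently
blocked; one that gets through on a later attempt shows the device let a
retransmission slip past.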

What if the IPS inserts TCP RST packets ahead of the undesired packet
instead of dropping it? Does the tool account for this? How does the
tool know whether the RST would have been effective? Does the tool
understand the protocol stacks it is simulating to be able to answer
that question?

Well, I'd argue that RSTs alone aren't effective in the real world;
there are enough single-packet exploits out there that sending a
RST after the fact does you no good, not to mention that RSTs are
completely useless for non-TCP protocols. But, assuming you still want
to test a RST-only solution, yeah, this is probably where you want to
have a real IP stack on the victim end to verify whether or not the
IPS/IDS was able to inject the RST packet inside the window and meet
any other criteria that might be implemented.
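
The "inside the window" part can be sketched as below. This is only the
classic RFC 793 acceptability test for a segment's sequence number, with
hypothetical function and parameter names. Stacks that follow RFC 5961 are
stricter (they expect the RST's sequence number to match exactly), which
is one reason a real victim stack remains the better judge.

```python
SEQ_MOD = 2 ** 32  # TCP sequence numbers wrap at 32 bits


def rst_in_window(rst_seq: int, rcv_nxt: int, rcv_wnd: int) -> bool:
    """Return True if a RST with sequence number rst_seq falls inside the
    receiver's window [rcv_nxt, rcv_nxt + rcv_wnd), modulo 2**32."""
    if rcv_wnd == 0:
        # With a zero window only an exact match is acceptable.
        return rst_seq % SEQ_MOD == rcv_nxt % SEQ_MOD
    # Distance from the left edge of the window, handling wraparound.
    offset = (rst_seq - rcv_nxt) % SEQ_MOD
    return offset < rcv_wnd
```

For example, `rst_in_window(5, 2**32 - 10, 100)` is true because the
window wraps past zero, while a RST one byte behind `rcv_nxt` is rejected.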

Of course, this doesn't negate the possibility of a replay tool which
reads pcap files and creates real socket sessions with another device.
It won't work very well with multi-flow protocols like FTP (without
protocol-specific logic), but should be fine for HTTP, SMTP, etc.
Flowreplay (ships with tcpreplay) will hopefully someday do this
better than it does now.
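
As a rough illustration of the socket-session idea (not of how flowreplay
works), the sketch below assumes the client-side payloads have already
been extracted from a pcap into a list of byte strings, and that the
protocol is a simple request/response exchange. The function name and
parameters are hypothetical.

```python
import socket
from typing import List


def replay_client_side(
    host: str,
    port: int,
    client_payloads: List[bytes],
    timeout: float = 2.0,
) -> List[bytes]:
    """Send each captured client payload over a real TCP connection and
    collect whatever the server sends back after each one."""
    replies = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        for payload in client_payloads:
            conn.sendall(payload)
            try:
                # A single recv() may return only part of a reply; a real
                # tool would need protocol-aware framing here.
                replies.append(conn.recv(65535))
            except socket.timeout:
                replies.append(b"")  # nothing came back in time
    return replies
```

Because the kernel's stack handles sequence numbers, retransmissions and
RSTs, an inline device that drops or resets the session shows up as a
timeout or a connection error rather than as something the tool has to
guess at.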

What if the IPS doesn't respond by dropping packets? What if it 
rewrites the attack out of the packet before forwarding it in some
cases (because simply dropping or resetting an SMTP connection tends
to make matters worse rather than better for example)? How does the
tool know whether this rewriting would be effective?

Obviously the tool can "diff" packets it sends against those that it
receives at the other end to see if changes have been made (again on
the todo list for tcpreplay v3).  Whether or not these changes are
sufficient to neuter an exploit would have to be up to the user.
Honestly, while this has some serious geek coolness to it, I doubt it
will ever be more than a corner-case solution.
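
A minimal sketch of such a diff, assuming the tool has already paired
each sent packet with what arrived on the far side (None when nothing
arrived). The names are hypothetical, and a real tool would probably have
to ignore header fields that can legitimately change in transit.

```python
from typing import List, Optional, Tuple


def diff_packets(
    sent: List[bytes],
    received: List[Optional[bytes]],
) -> List[Tuple[int, str]]:
    """Report which packets were dropped or rewritten by the device.

    received[i] is the packet seen on the far side for sent[i], or None
    if it never arrived. Unchanged packets are not reported.
    """
    if len(sent) != len(received):
        raise ValueError("sent and received must be the same length")
    report = []
    for index, (before, after) in enumerate(zip(sent, received)):
        if after is None:
            report.append((index, "dropped"))
        elif after != before:
            report.append((index, "rewritten"))
    return report
```

The report says only that something changed; deciding whether the change
neutered the exploit is still left to whoever reads it.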

Of course the question is, what is the test?  If you are doing
regression testing, then doing a simple diff and testing it against a
known good result is completely viable.  If you're looking to use it
as a sales tool, you probably should look elsewhere unless you've got
some examples where it's blatantly obvious to anyone with half a
brain.
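
For the regression case, the comparison against a known good result can
be as simple as the sketch below. The capture names and the
"blocked"/"forwarded" labels are made up for illustration; the only
assumption is that each replayed capture yields one recorded outcome.

```python
from typing import Dict, List


def regression_failures(
    expected: Dict[str, str],
    actual: Dict[str, str],
) -> List[str]:
    """Compare this run's outcomes with a known-good baseline.

    Both dicts map a capture name to an outcome such as "blocked" or
    "forwarded". Returns one message per capture whose outcome differs.
    """
    failures = []
    for name in sorted(expected):
        got = actual.get(name, "missing")
        if got != expected[name]:
            failures.append(f"{name}: expected {expected[name]}, got {got}")
    return failures
```

An empty list means the build behaves like the baseline; anything else
points at the capture to look into.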

Ultimately, you get into a situation of role-reversal. How do you test
these tools? Well you exercise them in "real world" scenarios to see
how they behave. That is, you place them in a test bed with an IPS.
You launch "attacks" and see what happens. If the test fails, you look
to see if the tool failed or the IPS failed. Very rapidly you get into
the situation where the IPS is used as the "test tool" for your test
tool. That is, the IPS measures and heavily influences "proper" tool
behavior. 

I think this over-simplifies things a bit.  There are three major uses
for these kinds of tools:
1) Internal testing (mostly regression testing)
2) Marketing/Sales calls 
3) Internal comparative testing

For the first case, you know what the "correct" result is and you can
verify it in the test result.  For the second case, yeah, use a real
exploit.  People want to see "real exploits".

Now for internal comparative testing, I would definitely use something
like tcpreplay or tomahawk since this kind of testing is a lot more
scalable than running legit attacks. Again, I'd argue that dropping
evil packets is the best defense 90+% of the time, and a valid solution
the remainder of the time.

Tomahawk works well at testing Tippingpoint's products
because that is how it was tested. It is horrible at testing ISS
products (the vendor I happen to work for) because Tippingpoint has no
vested interest in investing time and energy to make sure that ISS,
Netscreen, McAfee, etc (I apologize to all those I left out) are
adequately measured by their tools. I do not see how placing the tool
in the public domain solves this problem.

Well, rather than complaining that tcpreplay or tomahawk don't test ISS
products "properly", why not submit a patch which adds the
intelligence to do so?  Or at least provide constructive criticism
which explains how those tools could be improved to better meet your
needs.  I would love to hear any insight you might have.

So who is going to invest $1,000,000 in IPS equipment from various
vendors to properly test this tool and make sure it gives every vendor
fair representation? Who gets to decide what is fair? The vendors? The
public domain developers? Do you want to be one of those developers?

I'm not sure if that was directed at me, but let me answer it:  Yes,
actually I am one of those developers.  And if a vendor doesn't like
how I show their IDS/IPS, they're more than welcome to let me know
how I can improve it.  Even better, they can send me a patch and
I'll even give them credit for helping out.

I can just smell the defamation lawsuits from whichever vendors feel
they are not being fairly represented by your code changes...

Pure FUD.

FYI, I work for a vendor. We have a tool very similar to Tomahawk. It
works well at testing our products. However, other vendors' products
do not stand up as well to our testing :)

Great.  I'm sure everyone here would love to get access to your tool
(and more importantly its code).

The only reliable means of testing IPS product effectiveness I have
discovered so far is live fire testing. Setup an isolated LAN. Place
vulnerable systems on one side of the IPS and launch attacks from the
other side. If the vulnerable system is compromised, the IPS failed. I
strongly recommend modifying all of the exploits to only DoS the
victim so that nothing more than a reboot is ever necessary to prepare
for the next test.

I'd say that's a reasonably accurate statement for current replay
technology.  I don't think tcpreplay, tomahawk, etc. have matured to
the point where they can completely replace live exploits for IPS
product effectiveness testing.  The technology clearly is good enough
for pure detection testing, but it's only beginning to verify an IPS's
ability to protect other systems.

Regards,
Aaron

-- 
http://synfin.net
 
On Sat, 6 Nov 2004 13:16:02 -0800, ADT <synfinatic () gmail com> wrote:
(thread is getting long, so just going to snip the whole thing,
hopefully you kept a local copy)

Hey Greg/Marty,

I don't think anyone would argue that tcpreplay or tomahawk are
written for performance testing of IDS or IPS.  I'm sure some people
do that, but both have rather limited use in that regard (you want to
generate background traffic using *your* network's traffic).  What
tcpreplay and tomahawk do rather well is provide the means to safely
reproduce malicious traffic for testing detection capabilities.

Unlike "live tests", tcpreplay/tomahawk don't require people to
distribute working exploit code or attack an actual host which, due to
the nature of exploits, will likely have to be "fixed" in some manner
afterwards.  Unlike exploit code, there is no risk that a pcap will
re-format your hard drive, and no need to install and configure a wide
variety of operating systems and applications to attack.

Of course, unlike a "live test" there is some trust involved that the
pcap contains packets which are relevant for the test you are running.
Whether or not this precludes someone evaluating an IDS/IPS from using
either tool probably depends on how much they trust the pcaps.
For those people who don't want to trust pcaps and don't have the
means to get a library of working exploits, I'm sure Blade will be
more than happy to sell you IDS Informer (of course, now you have to
trust Blade, so you're just shifting your trust).

Of course, if you already have a repository of valid pcaps (maybe
something the OSVDB guys could do?) with known attacks, then using
these tools probably makes a lot of sense for certain kinds of tests.

Aaron, the tcpreplay guy

--
http://synfin.net/









