nanog mailing list archives

Re: We hit half-million: The Cidr Report


From: Robert Drake <rdrake () direcpath com>
Date: Thu, 1 May 2014 18:00:45 -0400


On 4/29/2014 10:54 PM, Jeff Kell wrote:
> Yeah, just when we thought Slammer / Blaster / Nachi / Welchia / etc /
> etc  had been eliminated by process of "can't get there from here"... we
> expose millions more endpoints...
>
> /me ducks too (but you know *I* had to say it)

Slammer actually caused many firewalls to fall over because of the high pps and the need to track state. I thought about posting in the super-large anti-NAT/stateful firewall thread a few weeks ago but decided it wasn't worth stirring up trouble.

Here is some trivia though:

Back when Slammer hit I was working for a major NSP. I had gotten a late dinner with a friend and was at his work chatting with him, since he worked the night shift by himself. It became apparent that something was badly wrong with the Internet. I decided to drive to my office and do what I could to fix the issues.

This was a mistake. For corporate reasons, my office was in a different city from the POP I connected to. I was 3 hops away from our corporate firewall, and one of those hops was a T1.

We had access lists on all the routers preventing people from getting to them from the Internet, so I thought my office was the only place from which I could fix the issue. Well, someone had put a SQL server in front of or behind the firewall, somewhere where it would cause fun. That DoS'd the firewall. It took 3-4 hours of hacking things to get to the inside and outside routers and put in an access list blocking SQL. Once that was done the firewall instantly got better, and I was able to push changes to every 7500 in the network blocking SQL on the uplink ports.

This didn't stop it everywhere, because we had 12000s in the core and they didn't support ACLs on most of the interfaces we had. The access lists had to stick around for at least 6 months while the Internet patched and cleaned things up.

Fun fact: the office network I was using pre-dated RFC1918, so we were using public IPs. The software firewall that fell over either did so because stateful rules were included for SQL even when they weren't needed, or it died from pure packets/sec. Regardless, all of the switching and routing hardware around it was fine.

This isn't an argument against firewalls; I'm just saying that people tend to put stock in them even when they're just adding complexity. If you have access lists blocking everything the firewall would block, then you might think having both gives you defense in depth, but it also gives you a second place where typos or human error might cause problems. It also gives you a second point of failure and (if state synchronization and load-balance/failover are added) compounded complexity. It depends on the goals you're trying to achieve. Sometimes redundant duties performed by two different groups give you peace of mind; sometimes it's just added frustration.

