nanog mailing list archives

Re: Securing the BGP or controlling it?


From: Marshall Eubanks <tme () americafree tv>
Date: Tue, 11 May 2010 11:08:12 -0400

Dear Danny;

On May 11, 2010, at 10:13 AM, Danny McPherson wrote:


On May 11, 2010, at 7:32 AM, Nick Hilliard wrote:

Risk analysis is ass covering without the theatre. You collect data, make a judgement based on that data, and if it turns out that the judgement says that signed bgp updates constitute more of a stability risk to network
operations than the occasional shock problem

So apply the risk management analogy here.  We all know that
pretty much anyone can assert reachability for anyone else's
address space inter-domain on the Internet, in particular the
closer you get to 'the core' the easier this gets.  We also
know that route "leaks" commonly occur that result in outages
and the potential for intercept or other nefarious activity.
Additionally, we know that deaggregation, and similar events
result in wide-scale systemic effects.  We also know that
topologically localized events occur that can impact our reachability,
whether we're party to the actual fault or not.  We have a slew of
empirical data to support all of these things, some more high profile
than others, with route leaks likely occurring at the highest
frequency (every single day).

I would suspect that the probability of fire affecting your
network availability is very low, as you can fail over to a
new facility.  OTOH, if you have a route hijack (intentional
or not) failover to a new facility with that address space
isn't going to help, and hijacks can be topologically localized
- the same applies for DDoS.  Yet I suspect your organization
has invested reasonably in fire suppression systems, but the
asset that matters most that enables the substrate of some
applications and services that you care about - the availability
of your address space within the global routing system, has no
safeguards whatsoever, and can be impacted from anywhere in the
world.

I'd also venture a guess that we've had more routing issues that
have resulted in network downtime of critical sites than we have had
fires (if someone disproves that, _nice_ dinner on me!).

But there is also recovery time, which you don't mention in your bet.

If the building I am sitting in right now were to burn down to the
ground, the client I am at would be affected for months and months.
Yes, they have backups, and redundancy, but this is their HQ.

If they (say) fat finger their BGP, well, it would be bad, but if they fix it this
afternoon, everything will go back to normal shortly thereafter.

So, sure, network outages may be more frequent than catastrophic fires,
but that doesn't mean that the aggregated duration of disruption from
network outages is greater than the aggregated disruption duration
from fires.
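
To make the frequency-versus-duration point concrete, here is a minimal
sketch of the arithmetic: expected disruption per year is roughly the
event rate times the mean recovery time. The rates and recovery times
below are made-up placeholders, not measurements, and the function name
is purely illustrative; only the structure of the calculation matters.

```python
# Expected annual disruption = events per year * mean recovery time (hours).
# All numbers below are hypothetical placeholders, not measured data.

def expected_annual_disruption_hours(events_per_year: float,
                                     mean_recovery_hours: float) -> float:
    """Aggregate expected hours of disruption per year for one event class."""
    return events_per_year * mean_recovery_hours

# Hypothetical: routing incidents are frequent but fixed within hours.
routing = expected_annual_disruption_hours(events_per_year=2.0,
                                           mean_recovery_hours=4.0)

# Hypothetical: a catastrophic HQ fire is rare but takes months to recover from
# (six 30-day months, expressed in hours).
fire = expected_annual_disruption_hours(events_per_year=0.01,
                                        mean_recovery_hours=6 * 30 * 24)

print(f"routing: {routing:.1f} h/yr, fire: {fire:.1f} h/yr")
```

With these particular placeholders the rare event dominates the total,
but different inputs would flip the result. The comparison cannot be
settled by event counts alone; you need both factors.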

Regards
Marshall


We've got empirical data, we understand the vulnerability and the
risk (probability of a threat being used).  Put that in your risk
management equation and consider which of your organization's assets
are most vulnerable - I'd venture it's something to do with network,
and if routing ain't working, network ain't working...

-danny




