nanog mailing list archives

Re: Max Prefixes Configured on Customer BGP


From: Chris Woodfield <rekoil () semihuman com>
Date: Fri, 16 Aug 2002 12:58:41 -0400

That's why you make sure that any incident where max-prefix is tripped is 
caught by a syslog watcher and brought to the immediate attention of whoever's 
sitting in your NOC. Honestly, if all you're dealing with is customer BGP 
sessions, I would guess that 90% of them don't advertise more than 10 prefixes, 
so a max-prefix limit of, say, 100 should do for most cases. And for 
that last 10%, max-prefix is a per-session configuration, so that number can always 
be set higher. IMO, advertising 100 routes for 30 seconds is far less damaging 
than advertising 8000 routes.

Also, don't forget about the warn option - if a customer's organic growth puts 
them close to the prefix limit, you should get a heads-up in most cases.
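
A rough sketch of the syslog watcher idea, for what it's worth. The two message
patterns are assumptions modeled on Cisco IOS-style BGP log lines, and the
severity labels ("heads-up", "page-noc") are made up for illustration; match
them against whatever your routers actually log before relying on this.

```python
import re

# Assumed log formats (not taken from the thread): a warning when the
# threshold is reached, and an error when the limit is exceeded.
WARN_RE = re.compile(r"%BGP-4-MAXPFX: .*?from (\S+) .*?reaches (\d+), max (\d+)")
EXCEED_RE = re.compile(r"%BGP-3-MAXPFXEXCEED: .*?from (\S+).*?(\d+) exceeds? limit (\d+)")

def classify(line):
    """Return (severity, neighbor, count, limit), or None for unrelated lines."""
    m = EXCEED_RE.search(line)
    if m:
        # Limit tripped: this is the one the NOC needs to see right away.
        return ("page-noc", m.group(1), int(m.group(2)), int(m.group(3)))
    m = WARN_RE.search(line)
    if m:
        # Warning threshold crossed: organic growth, raise the limit soon.
        return ("heads-up", m.group(1), int(m.group(2)), int(m.group(3)))
    return None
```

Feed it each line from the syslog stream and route "page-noc" results to
whatever pages your NOC; "heads-up" results can go to a ticket queue.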

I recall an incident where we brought up a customer advertising around 600 
routes, and sent the prefix list to our upstream, who dutifully added all 
600 routes to the prefix list, but neglected to raise their maximum-prefix limit 
from 300. This, of course, had predictable results. Doh.
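
That kind of mismatch is easy to catch mechanically when the filter is updated.
A minimal sketch, assuming you can count the entries in the prefix list and
read the configured limit; check_limit and the 75 percent warning threshold
are hypothetical names and values, not anything a router provides.

```python
def check_limit(filter_entries, max_prefix, warn_percent=75):
    """Compare the number of prefixes a filter permits against the session limit.

    Returns "trip" if a customer announcing everything the filter permits
    would exceed the limit, "warn" if it would reach the warning threshold,
    and "ok" otherwise.
    """
    if filter_entries > max_prefix:
        return "trip"
    if filter_entries * 100 >= max_prefix * warn_percent:
        return "warn"
    return "ok"
```

With the numbers from the incident, check_limit(600, 300) reports "trip"
before the session ever comes up.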

-C

This isn't a terribly cisco-specific reply so I'll keep it here.

The problem with restart systems (btw thank you Cisco for finally adding
this) is this: think about how much damage can be done by announcing 8k routes
for the 30 seconds (or 5-10 minutes if there is a Foundry in the mix :P)
before you get to the limit and kill the session. Now add in the damage 
caused by this happening every 15 minutes, and the dampening. Or even 
worse, someone who turns up more routes and happens to hit right around 
the exact number or close to it. Imagine a session which goes over by 1 
route, trips, stays down for 15 minutes, comes back up and this time has 1 
fewer route, and no one notices the prefix limit needs to be raised. You 
should make sure that the restart time exceeds the number/length of flaps 
necessary to trigger dampening, which on a connection you provide transit for 
is pretty darn hard to accurately guess.
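
To put rough numbers on the restart-versus-dampening interaction, here is a
small simulation. Everything in it is an assumption for illustration: 1000
penalty points per flap, a suppress limit of 2000 and a 15 minute half-life
(commonly cited defaults, but every network along the path may use different
values), and each trip of the limit counted as exactly one flap.

```python
def flaps_until_suppressed(restart_minutes, half_life=15.0, per_flap=1000.0,
                           suppress=2000.0, max_flaps=20):
    """Return the flap number at which the penalty first exceeds `suppress`,
    or None if it never does within `max_flaps` trips of the limit."""
    # Exponential decay of the penalty over one restart interval.
    decay = 2.0 ** (-restart_minutes / half_life)
    penalty = 0.0
    for flap in range(1, max_flaps + 1):
        # Decay what has accumulated so far, then add the new flap.
        penalty = penalty * decay + per_flap
        if penalty > suppress:
            return flap
    return None
```

Under these assumed parameters a 5 minute restart gets the routes suppressed on
the third trip, while a 15 minute restart creeps toward the suppress limit
without crossing it. That is uncomfortably close to the edge, and a remote
network with a longer half-life or a lower limit would still dampen you.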

IMHO, using only prefix limits on a customer is actually doing them (and
the rest of the internet that listens to your announcements) a disservice.

A better system might be where the session is kept up (or periodically
polled, if you want to make it obvious to the other party that there is a
problem) without installing the routes, and kept in a "quarantine" state
for X amount of time to make sure that things stay below a configured
number. This would be at least a slightly better way of recovering quickly
once the "problem" has passed, without mucking things up every 15 minutes 
in the process.
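
A minimal sketch of what that quarantine logic could look like. This is purely
hypothetical: QuarantinedSession, its state names, and the hold timer are
invented here to illustrate the idea and do not describe any vendor's
implementation.

```python
class QuarantinedSession:
    """Keep the session up, but only install routes once the prefix count
    has stayed at or below the limit for `hold_seconds`."""

    def __init__(self, limit, hold_seconds):
        self.limit = limit
        self.hold_seconds = hold_seconds
        self.state = "installed"
        self.clean_since = None  # when the count last dropped back under the limit

    def observe(self, now, prefix_count):
        """Record the prefix count seen at time `now` (seconds); return the state."""
        if prefix_count > self.limit:
            # Over the limit: stop installing routes and reset the timer.
            self.state = "quarantine"
            self.clean_since = None
        elif self.state == "quarantine":
            if self.clean_since is None:
                self.clean_since = now
            if now - self.clean_since >= self.hold_seconds:
                # Stayed clean for the whole hold period: install routes again.
                self.state = "installed"
                self.clean_since = None
        return self.state
```

The point of the design is that recovery depends on observed behavior rather
than on a blind restart timer, so a customer who is still over the limit never
gets re-announced to the rest of the world.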

-- 
Richard A Steenbergen <ras () e-gerbil net>       http://www.e-gerbil.net/ras
PGP Key ID: 0x138EA177  (67 29 D7 BC E8 18 3E DA  B2 46 B3 D8 14 36 FE B6)
