nanog mailing list archives

RE: how is cold-potato done?


From: "Daniel Golding" <dgolding () sockeye com>
Date: Thu, 27 Jun 2002 13:37:57 -0400


Andre,

What Avi meant is that when you use routing policy (like routemaps or the
equivalent) to set additive MEDs between POPs, only do it on egress from all
POPs or ingress to all POPs. Don't do it on routes both ways. Look at slide
35 - it has all the MEDs being added as "from" routemaps, as opposed to both
"from" and "to".

Here is an example:

I have POPs in NYC, Chicago, and Seattle. I have routes in BGP being
announced from NYC, with a MED of +100 being tacked on as they leave the NYC
POP. I then add an additional MED of +200 when they leave the Chicago POP,
heading for Seattle. This is a cost metric, so higher is "worse". If I also
had routemaps adding more MED cost upon ingress to the Chicago and Seattle
POPs, in addition to on egress from the NYC and Chicago POPs, I would be
adding twice as much to the metrics. That doesn't make much sense, and it
doubles the number of values to control when adjusting them.
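
To put numbers on that, here is a minimal Python sketch of the example
above. The +100 and +200 egress values come from the example; the ingress
values and the function name are made up purely for illustration, and on a
real network this would live in routemap policy, not in code.

```python
# Hypothetical sketch: accumulate additive MEDs along NYC -> Chicago -> Seattle.
EGRESS_ADD = {"NYC": 100, "Chicago": 200}        # values from the example above
INGRESS_ADD = {"Chicago": 100, "Seattle": 200}   # assumed values, illustration only


def med_along_path(path, egress_add, ingress_add=None, base_med=0):
    """Return the MED seen at each POP as a route propagates along `path`."""
    ingress_add = ingress_add or {}
    med = base_med
    seen = {path[0]: med}
    for src, dst in zip(path, path[1:]):
        med += egress_add.get(src, 0)    # added as the route leaves src
        med += ingress_add.get(dst, 0)   # added as the route enters dst, if configured
        seen[dst] = med
    return seen


path = ["NYC", "Chicago", "Seattle"]
one_way = med_along_path(path, EGRESS_ADD)                 # egress only
both_ways = med_along_path(path, EGRESS_ADD, INGRESS_ADD)  # egress and ingress
print(one_way)
print(both_ways)
```

With egress-only policy the route arrives in Seattle with a MED of 300. With
the assumed ingress values layered on top it arrives with 600, which carries
no extra information but leaves twice as many knobs to keep consistent.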

Of course, this is all about generating meaningful MEDs on your own network
for your own purposes, and for those of your customers and peers. It doesn't
really have to do with cold potato routing of others' traffic on your
network (although it does let people cold-potato route your traffic on THEIR
networks.)

Another valid approach for doing this sort of thing is setting your MEDs to
be the same as your IGP metrics to the next hops of the BGP routes - there
are "shortcut" commands for doing this. Of course, your mileage may vary.
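
As a rough illustration of that second approach, the Python sketch below
copies the IGP metric toward each route's BGP next hop into the advertised
MED. The prefixes, next hops, costs, and function name are all assumptions
for the sake of the example; it is not a stand-in for any vendor's command.

```python
# Hypothetical sketch: derive the advertised MED from the IGP cost to the BGP next hop.
def med_from_igp(routes, igp_cost):
    """routes maps prefix -> BGP next hop; igp_cost maps next hop -> IGP metric."""
    return {prefix: igp_cost[next_hop] for prefix, next_hop in routes.items()}


# Assumed sample data (documentation prefixes, invented IGP costs).
routes = {"192.0.2.0/24": "10.0.0.1", "198.51.100.0/24": "10.0.0.2"}
igp_cost = {"10.0.0.1": 15, "10.0.0.2": 40}

meds = med_from_igp(routes, igp_cost)
print(meds)
```

The appeal is that the MED then tracks the IGP on its own: a prefix that is
farther away inside the network is announced with a higher MED at that exit,
with no per-POP values to maintain by hand.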

- Daniel Golding

-----Original Message-----
From: owner-nanog () merit edu [mailto:owner-nanog () merit edu]On Behalf Of
dre
Sent: Wednesday, June 26, 2002 3:22 PM
To: nanog () merit edu
Subject: Re: how is cold-potato done?




Shortest-exit is the default because of the BGP decision process.
This tends to favor heavy-content providers because the bulk of
the data travels a shorter distance inside the AS sending the
content before it is handed off to the AS that delivers it to the
eyeballs.

Shortest-exit is caused by IGP metrics (which shouldn't ever be
the same for two paths, unless you actually want that to happen).
IGP metrics are generally set by length of fiber paths or delay
values.  Provider backbones set these manually with ISIS or OSPF
costs.

There are many ways to do best-exit.  People are always coming
up with strange ways to do routing (ToS routing, MPLS-TE, DS-TE),
and they can sometimes apply these techniques to best-exit.

For those looking for something simple and standard, the two
ways were made known in the first email -> outbound MEDs and
delay-based routing from `traceroute' information.  There are
quite a few problems with these approaches as well, documented
in various papers on the matter, e.g.:
http://www.ietf.org/internet-drafts/draft-ietf-idr-route-oscillation-01.txt

For MEDs, Avi spoke to the methods used in the following talks:
http://www.nanog.org/mtg-9901/ppt/bgp102/index.htm
http://www.nanog.org/mtg-9811/ppt/avi/index.htm

One thing Avi mentioned here, I never quite understood..
http://www.nanog.org/mtg-9811/ppt/avi/sld031.htm
He says "set MED's in one direction only", but he doesn't say
which direction or why.

As to solving the aggregation problem that makes outbound MEDs
insignificant, there is some work trying to solve it using
Communities (NO-PEER, supercommunities, redistribution, cost
communities, link-bw, et al).  Some of this is believed (and
probably rightly so) to be overcomplicated and possibly even
oscillatory, just like the other methods.

I enjoy the simple approach that RFC 3272 takes (surprisingly
simple Inter-Domain traffic engineering coming from the super
complex Intra-Domain TE based on MPLS/etc that the authors
recommend).  They have some suggestions on setting local_pref
and inbound MEDs that I found to be very clueful.
http://www.ietf.org/rfc/rfc3272.txt (Section 7.0)

  "Inter-domain TE is inherently more difficult than intra-domain TE
   under the current Internet architecture.  The reasons for this are
   both technical and administrative."

So maybe best practice today for doing best-exit is simply having
the technical data (communities, tags, traffic, etc) and talking
directly with the administrators of your peer-AS to find a solution
(or reading their minds without their data, or inferring it, or
guessing).

I guess the final question is -- why is anyone concerned about
best-exit at all?  Doesn't shortest-exit still get the traffic
there?  I'm willing to bet there are a lot of different answers
to all these questions.

-dre

On Wed, Jun 26, 2002 at 02:35:55PM -0400, Leo Bicknell wrote:

In a message written on Wed, Jun 26, 2002 at 01:52:08PM -0400,
Ralph Doncaster wrote:
If I peer with network X in cities A and B, and receive the
same route in

Wow, I'm amazed at the wrong answers here.  The vendors even document
this, as do the RFC's, see
http://www.cisco.com/univercd/cc/td/doc/cisintwk/ito_doc/bgp.htm

More to your question, cold-potato uses MEDs to determine the best exit.
Generally MEDs do not work for a peer's large aggregates, since those
are spread out across the network.  Clueful peers set the outgoing MEDs
on their aggregates to all the same value.

If the MEDs are set to the same value, clobbered on inbound, or absent
altogether, then the routers inside your network will choose the closest
exit based on your IGP cost.  This is "hot potato" routing.

If, by strange chance, you have equal IGP costs to two peering points
with equal MEDs, then BGP will choose the one with the lower router ID.

As you can see, there are many other steps to the selection process,
as documented in the link above.
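
The three tie-breakers described above can be sketched in a few lines of
Python. This is a deliberately simplified model, not the full decision
process: it assumes every candidate path comes from the same neighbor AS and
already ties on local preference, AS-path length, and origin. The class,
field names, and sample values are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Path:
    exit_pop: str    # peering point where traffic would leave the network
    med: int         # MED received from the peer at that exit
    igp_cost: int    # IGP cost from this router to that exit
    router_id: str   # dotted-quad router ID of the advertising router


def rid_key(router_id):
    """Compare router IDs numerically, octet by octet."""
    return tuple(int(octet) for octet in router_id.split("."))


def best_exit(paths, honor_med=True):
    """Pick an exit: lowest MED, then lowest IGP cost, then lowest router ID.

    honor_med=False models MEDs that are equal, missing, or clobbered on inbound.
    """
    def key(p):
        med = p.med if honor_med else 0
        return (med, p.igp_cost, rid_key(p.router_id))
    return min(paths, key=key)


# Assumed sample data: exit A is close in the IGP, exit B has the better MED.
paths = [
    Path("A", med=50, igp_cost=10, router_id="10.0.0.2"),
    Path("B", med=10, igp_cost=30, router_id="10.0.0.1"),
]
cold = best_exit(paths).exit_pop                   # MED honored
hot = best_exit(paths, honor_med=False).exit_pop   # MED ignored, IGP decides
print(cold, hot)
```

With the MEDs honored the router carries the traffic to exit B, the one the
peer asked for (cold potato). With the MEDs out of the picture it hands the
traffic off at exit A, the closest one by IGP cost (hot potato).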


