nanog mailing list archives

Re: Slashdot: Providers Ignoring DNS TTL? PPLB is not a good thing


From: "Robert M. Enger" <enger () comcast net>
Date: Sat, 23 Apr 2005 15:29:18 -0400



Per-packet load balancing (PPLB) is not TCP-friendly.  (This discussion is orthogonal to DNS.)
PPLB leads to packet reordering.

Quite a few empirical and theoretical papers have been published (in peer reviewed fora and elsewhere)
that discuss the negative consequences of packet reordering.  A Google search finds many references.

On the downside for those attempting to maximize use of their circuits:
packet reordering can lead to unnecessary retransmissions (squandering capacity).

On the downside for users:
packet reordering can lead to lower performance.
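
To make the retransmission point concrete, here is a minimal sketch of a simplified TCP-style receiver.  It assumes one cumulative ACK per arriving segment and the usual fast-retransmit threshold of three duplicate ACKs; the function names and the sample arrival order are made up for illustration and are not taken from any of the papers mentioned above.

```python
def cumulative_acks(arrival_order):
    """Return the cumulative ACK sent after each arriving segment.

    Segments are numbered 0, 1, 2, ... and the ACK value is the next
    segment number the receiver is still waiting for.
    """
    received = set()
    next_expected = 0
    acks = []
    for seq in arrival_order:
        received.add(seq)
        # Advance past every segment that is now contiguous.
        while next_expected in received:
            next_expected += 1
        acks.append(next_expected)
    return acks


def fast_retransmit_triggers(acks, threshold=3):
    """Count how many times `threshold` duplicate ACKs arrive in a row."""
    triggers = 0
    duplicates = 0
    last_ack = None
    for ack in acks:
        if ack == last_ack:
            duplicates += 1
            if duplicates == threshold:
                triggers += 1
        else:
            duplicates = 0
            last_ack = ack
    return triggers


in_order = [0, 1, 2, 3, 4, 5]
reordered = [0, 2, 3, 4, 1, 5]  # segment 1 is delayed, nothing is lost

print(fast_retransmit_triggers(cumulative_acks(in_order)))
print(fast_retransmit_triggers(cumulative_acks(reordered)))
```

With the in-order stream no duplicate ACKs are generated.  With the reordered stream the receiver repeats the same ACK three times while it waits for segment 1, which is enough to make a sender retransmit a segment that was never lost.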


Macroscopically:
There is some movement (finally) towards providing the consumer
with higher speed access to the Internet (e.g. FiOS at 30 Mbps and other FTTH and VDSL services).
Consumer adoption of such services would result in an upsurge of traffic:
the need for larger backbones, enhanced server farms, more acolytes to service all of it.
Adoption (and consequent resurgence of the Internet industry) will fail
if consumers do not actually obtain improved performance from their new higher speed connections.

PPLB only benefits those who are milking the last available profits out of a decaying industry.
It is not a forward looking approach.






At 01:51 AM 4/23/2005, Steve Gibbard wrote:

On Sat, 23 Apr 2005, Christopher L. Morrow wrote:

oh well, I tried to stay quiet :) Probably the PPLB problem isn't quite as
simple as: "you have pplb you can't do anycast". I'd imagine that you have
to have some substantial difference in the paths that the PPLB follows,
yes? Like links to differing ISPs or perhaps extremely diverse links
inside the same ISP. Correct?

For anybody who's confused by this thread, this is a quick explanation, after which I'm really hoping the thread will die:

The "PPLB" Dean mentions is "per packet load balancing" in which you have two or more circuits, and packets to the 
same destination alternate which circuit they go down.  In every case in which I've seen this used, it's been to 
combine multiple circuits taking the same path between the same pair of routers, to in effect create a bigger circuit. 
In theory, PPLB could also be used to split traffic between circuits going to different routers, perhaps even in 
different places.  I've never seen anybody actually use the latter setup, and it seems to be universally regarded as 
something that would break things.  I suppose it's possible that somebody's using it somewhere, probably with 
"interesting" results.  It's the latter, theoretically possible, setup that Dean is talking about.

Anycast is a technique in which two or more servers, generally in different locations, announce the same address 
space.  Those sending traffic into a network via one POP or exchange point will have their traffic go to the server 
close to that entry point, while those sending traffic into a network via another POP or exchange point will have 
their traffic go to the server close to that point.  To an outside network, it looks the same as regular peering -- 
you see the same route at each peering point and can hand off traffic.  The only difference is that the packets may 
not have to travel as far once they enter the other network.

So, just as a fun theoretical exercise, let's examine what happens in the PPLB to multiple locations scenario that 
Dean imagines:

Let's say somebody is in the Midwest, and has T1s to Network A and Network B.  And let's say that their network 
administrator read on the NANOG list that per packet load balancing was the trendy thing to do, so they turn on per 
packet load balancing between the two T1s.  Now they want to send some packets to a unicast host on network C, 
somewhere in California.

They start with UDP DNS queries, each consisting of a single packet. Half go via network A, which peers with Network C 
in California. Responses come back with a 40 ms RTT.  The other half go through network B, which has its closest 
peering point with Network C in Virginia.  The packets go to Virginia and then to California, and the replies come 
back 80 ms later.  Everything works fine.

Then they try to set up a more persistent connection, and again half their packets are taking the 40 ms path while the 
others are taking the 80 ms path.  Now things get interesting, because the packets are arriving out of order.  Some 
applications may do OK with this, since they'll take the sequence numbers and reorder the packets, with some buffering 
and processing delay.  But remember, the latency amounts here are numbers I just made up, and there's no reason why it 
couldn't be 40 ms vs. 1 second in some parts of the world.  In either case, I suppose it's possible that you'd get an 
HTTP connection to sort of work, and an SSH session might just seem mildly painful.  But good luck getting a VoIP call 
or anything of the sort to function over such a connection.
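
The reordering in this scenario can be sketched in a few lines.  The assumptions here are mine: packets alternate strictly between the two paths, the paths are symmetric so the one-way delay is half the made-up RTT (20 ms and 40 ms), and the sender emits one packet every 5 ms.  The function name and parameters are hypothetical.

```python
def arrival_order(num_packets, send_interval_ms, path_delays_ms):
    """Return packet sequence numbers in the order the far end receives them.

    Packet i is sent at i * send_interval_ms and goes down path
    i % len(path_delays_ms), which is per-packet round-robin.
    """
    arrivals = []
    for seq in range(num_packets):
        delay = path_delays_ms[seq % len(path_delays_ms)]
        arrivals.append((seq * send_interval_ms + delay, seq))
    # Sort by arrival time; ties are broken by sequence number.
    arrivals.sort()
    return [seq for _, seq in arrivals]


# Two paths with the same delay: packets arrive in the order sent.
print(arrival_order(8, 5, [20, 20]))

# One 20 ms path and one 40 ms path: packets arrive interleaved.
print(arrival_order(8, 5, [20, 40]))
```

With equal delays the order is 0 through 7.  With the 20 ms and 40 ms paths the order becomes 0, 2, 4, 1, 6, 3, 5, 7: every packet on the slow path is overtaken by later packets on the fast path.  Widening the gap toward the 40 ms vs. 1 second case only lets more packets overtake.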

Dean is correct that this setup would fall apart even further when anycast is thrown into the mix.  In the anycast 
example, Network A hands off the packets to Network C in California, where they get sunk into a local server.  Network 
B hands off the packets to Network C in Virginia, where they get sunk into a local server.  Each server only sees half 
the packets, and half the retransmits, and is probably never going to get enough of the connection to put it all back 
together in a way that works.
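
As a rough sketch of the anycast case, assume strict alternation between the two networks and ignore retransmits and connection setup entirely.  The site names come from the example above; the function names are hypothetical.

```python
def segments_seen_per_site(num_segments, sites):
    """Distribute segments round-robin across anycast sites.

    Returns a dict mapping each site to the segment numbers it receives.
    """
    seen = {site: [] for site in sites}
    for seq in range(num_segments):
        site = sites[seq % len(sites)]
        seen[site].append(seq)
    return seen


def has_complete_stream(segments, num_segments):
    """True only if a site received every segment of the connection."""
    return set(segments) == set(range(num_segments))


seen = segments_seen_per_site(6, ["California", "Virginia"])
for site, segments in seen.items():
    print(site, segments, has_complete_stream(segments, 6))
```

California ends up with segments 0, 2 and 4, Virginia with 1, 3 and 5, and neither holds a complete stream.  Unlike the unicast case, no amount of buffering at either server can fix this, because the missing segments went to a different machine.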

So, there are a couple of different conclusions that could be drawn from this.  The conclusion I come to is that there 
are enough problems doing per packet load balancing on non-identical paths that nobody would actually do it.  I'm made 
more comfortable in this conclusion by having been through this discussion several times without finding anybody who 
claims to actually do that sort of per packet load balancing.  I, therefore, declare the PPLB thing to be a non-issue.

It may also be valid to declare that PPLB over non-identical paths is important to allow people to use every last bit 
of bandwidth they're paying for, and that we shouldn't make their already painful predicament worse.  But that's an 
argument I continue to be skeptical of.

