nanog mailing list archives
Re: Shady areas of TCP window autotuning?
From: Marian Ďurkovič <md () bts sk>
Date: Tue, 17 Mar 2009 09:47:39 +0100
On Mon, Mar 16, 2009 at 09:09:35AM -0500, Leo Bicknell wrote:
Many edge devices have queues that are way too large. What appears to happen is vendors don't auto-size queues. Something like a cable or DSL modem may be designed for a maximum speed of 10Mbps, and the vendor sizes the queue appropriately. The service provider then deploys the device at 2.5Mbps, which means roughly (as it can be more complex) the queue should be 1/4th the size. However the software doesn't auto-size the buffer to the link speed, and the operator doesn't adjust the buffer size in their config. The result is that if the vendor targeted 100ms of buffer you now have 400ms of buffer, and really bad lag.
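The arithmetic in the quoted paragraph can be checked with a minimal sketch. It assumes a plain FIFO buffer sized in bytes and drained at line rate; the function name `queue_delay_ms` and the constants are illustrative and do not come from any vendor's software.

```python
def queue_delay_ms(buffer_bytes: int, link_bps: int) -> float:
    """Worst-case time (ms) to drain a full FIFO buffer at the link rate."""
    return buffer_bytes * 8 * 1000 / link_bps


# Assumption: the vendor sized the buffer for 100 ms at a 10 Mbps design speed.
DESIGN_BPS = 10_000_000
BUFFER_BYTES = DESIGN_BPS * 100 // 1000 // 8  # 125000 bytes

if __name__ == "__main__":
    for bps in (10_000_000, 2_500_000):
        print(f"{bps / 1e6:.1f} Mbps -> {queue_delay_ms(BUFFER_BYTES, bps):.0f} ms")
```

With the same 125000-byte buffer, the worst-case queueing delay grows from 100 ms at 10 Mbps to 400 ms at 2.5 Mbps, which matches the figures in the quoted text.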
This is a very good point. Let me add that the same thing happens on every autosensing 10/100/1000Base-T Ethernet port, which typically does not shrink its buffers when the negotiated speed is below 1 Gbps.
As network operators we have to get out of the mind set that "packet drops are bad". While that may be true in planning the backbone to have sufficient bandwidth, it's the exact opposite of true when managing congestion at the edge. Reducing the buffer to be ~50ms of bandwidth makes the users a lot happier, and allows TCP to work. TCP needs drops to manage to the right speed. My wish is for the vendors to step up. I would love to be able to configure my router/cable modem/dsl box with "queue-size 50ms" and have it compute, for the current link speed, 50ms of buffer.
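What a "queue-size 50ms" knob would have to compute can be sketched as follows. This is only an illustration of the idea: `buffer_bytes_for` is a hypothetical helper, not an existing router command or API, and it ignores per-packet overhead and any minimum queue depth a real device might enforce.

```python
def buffer_bytes_for(target_ms: int, link_bps: int) -> int:
    """Bytes of buffer that hold `target_ms` of traffic at the current link rate."""
    return link_bps * target_ms // 1000 // 8


if __name__ == "__main__":
    # Negotiated speeds of an autosensing 10/100/1000Base-T port
    for mbps in (10, 100, 1000):
        size = buffer_bytes_for(50, mbps * 1_000_000)
        print(f"{mbps:>4} Mbps: {size} bytes for 50 ms")
```

The required buffer scales linearly with link speed, so a buffer sized for 50 ms at 1 Gbps (6250000 bytes) would hold 5 seconds of traffic if the port negotiated 10 Mbps and the buffer were left unchanged.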
Reducing buffers to 50 msec clearly avoids excessive queueing delays, but let's look at this from a wider perspective:

1) Initially we had a system where hosts were using fixed 64 kB buffers. This was unable to achieve good performance over high-BDP paths.
2) OS maintainers fixed this by means of buffer autotuning, so the host buffer size is no longer the problem.
3) The above fix introduces unacceptable delays into networks, and users are complaining, especially if autotuning approach #2 is used.
4) Network operators will fix the problem by reducing buffers to e.g. 50 msec.

So at the end of the day, we'll again have a system which is unable to achieve good performance over high-BDP paths, since with reduced buffers we'll have an underbuffered bottleneck in the path, which will prevent full link utilization if RTT > 50 msec. Thus all the above exercises will leave us in almost the same situation as before (of course YMMV). Something is seriously wrong, isn't it?

And yes, I opened this topic last week on the Linux netdev mailing list and tried hard to persuade those people that a less aggressive approach is probably necessary to achieve a good balance between the requirements for fastest possible throughput and fairness in the network. But the maintainers simply didn't want to listen :-(

M.
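The two limits discussed above can be put into numbers with a rough sketch. It rests on textbook assumptions that are not stated in the message: a single long-lived flow, a window-limited rate of window/RTT, and Reno-style halving of the congestion window on loss. Both function names are illustrative.

```python
def max_throughput_bps(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound on throughput for a fixed window: one window per RTT."""
    return window_bytes * 8 * 1000 / rtt_ms


def post_loss_link_fraction(rtt_ms: float, buffer_ms: float) -> float:
    """Fraction of the bottleneck one flow can fill right after a loss.

    Assumed model: the window peaks at BDP + buffer, then halves.
    Expressed in time units, that is (RTT + buffer) / (2 * RTT), capped at 1.
    """
    return min(1.0, (rtt_ms + buffer_ms) / (2 * rtt_ms))


if __name__ == "__main__":
    # Fixed 64 kB host buffer over a 100 ms path
    print(f"{max_throughput_bps(65536, 100) / 1e6:.2f} Mbps")
    # 50 ms bottleneck buffer at various RTTs
    for rtt in (20, 50, 100, 200):
        print(f"RTT {rtt} ms: {post_loss_link_fraction(rtt, 50):.3f}")
```

Under these assumptions a 64 kB window caps a 100 ms path at about 5.24 Mbps, and a 50 ms buffer keeps the link full after a loss only while RTT is at most 50 ms; at 200 ms the flow drops to 62.5% of the link rate before ramping up again. Many concurrent flows or a different congestion control algorithm would change these numbers.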