nanog mailing list archives

Re: Software router state of the art


From: "Kevin Oberman" <oberman () es net>
Date: Wed, 23 Jul 2008 14:23:18 -0700

Date: Wed, 23 Jul 2008 16:51:50 -0400
From: "William Herrin" <herrin-nanog () dirtside com>
Sender: wherrin () gmail com

On Wed, Jul 23, 2008 at 3:59 PM, Kevin Oberman <oberman () es net> wrote:
The first bottleneck is the interrupts from the NIC. With a generic
Intel NIC under Linux, you start to lose a non-trivial number of
packets around 700 Mbps of "normal" traffic because it can't service
the interrupts quickly enough.

Most modern high-performance network cards support MSI (Message Signaled
Interrupts), which generates real interrupts only on an intelligent
basis and only at a controlled rate. Windows, Solaris and FreeBSD
support MSI, and I think Linux does, too. It requires both hardware
and software support.

"ethtool -c". Thanks Sargun for putting me on to "I/O Coalescing."

But cards like the Intel Pro/1000 have 64k of memory for buffering
packets, both in and out. Few have very much more than 64k. 64k means
32k to tx and 32k to rx. Means you darn well better generate an
interrupt when you get near 16k so that you don't fill the buffer
before the 16k you generated the interrupt for has been cleared. Means
you're generating an interrupt at least for every 10 or so 1500 byte
packets.
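
The arithmetic above can be checked with a short sketch. It assumes the
32k receive half of a 64k buffer, an interrupt threshold at half of that
(16k), and full-size 1500-byte packets on a 1 Gbps link; the function
names are made up for illustration and do not come from any driver.

```python
# Rough check of the interrupt arithmetic for a small NIC receive buffer.
# All figures are assumptions taken from the discussion: 32 KiB receive
# buffer, interrupt raised at the half-full mark, 1500-byte packets.

def packets_per_interrupt(threshold_bytes: int, packet_bytes: int) -> float:
    """How many packets arrive before the threshold is reached."""
    return threshold_bytes / packet_bytes


def interrupts_per_second(link_bps: float, threshold_bytes: int) -> float:
    """Interrupt rate if the link runs flat out and every threshold
    crossing raises one interrupt."""
    return link_bps / (threshold_bytes * 8)


RX_BUFFER_BYTES = 32 * 1024          # receive half of a 64 KiB buffer
THRESHOLD_BYTES = RX_BUFFER_BYTES // 2  # interrupt near 16 KiB

if __name__ == "__main__":
    print(packets_per_interrupt(THRESHOLD_BYTES, 1500))   # a little under 11
    print(interrupts_per_second(1e9, THRESHOLD_BYTES))    # several thousand
```

With these assumptions the threshold is reached after roughly ten to
eleven full-size packets, which matches the "every 10 or so" estimate.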

You have just hit on a huge problem with most (all?) 1G and 10G
hardware. The buffers are way too small for optimal performance. In any
case where the RTT is anything more than half a millisecond, you exhaust
the window and stall the stream.

I need to move multi-gigabit streams across the country and between the
US and Europe. Those are a bit too far apart for those tiny buffers to
be of any use at all. This would require 3 GB of buffers. This same
problem also makes TCP off-load of no use at all.
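
The buffer-versus-RTT relationship here is the bandwidth-delay product:
a sender can keep a path full only if it can hold link rate times RTT
bytes in flight. The sketch below is illustrative only. The 64 KiB and
1 Gbps figures come from the thread; the 10 Gbps rate and 100 ms RTT
are assumed example values for a long-haul path and are not the inputs
behind the 3 GB figure, which the message does not state.

```python
# Bandwidth-delay product sketch. Function names are hypothetical.

def bdp_bytes(link_bps: float, rtt_seconds: float) -> float:
    """Bytes that must be in flight to keep the path full."""
    return link_bps * rtt_seconds / 8


def max_rtt_seconds(buffer_bytes: int, link_bps: float) -> float:
    """Longest RTT a given buffer can cover at full link rate."""
    return buffer_bytes * 8 / link_bps


if __name__ == "__main__":
    # A 64 KiB buffer at 1 Gbps covers an RTT of about half a millisecond.
    print(max_rtt_seconds(64 * 1024, 1e9))

    # Assumed example: 10 Gbps with a 100 ms RTT needs about 125 MB.
    print(bdp_bytes(10e9, 0.100))
```

Under these assumptions a 64 KiB buffer at 1 Gbps is exhausted once the
RTT passes roughly 0.52 ms, which is consistent with the
half-millisecond limit mentioned above.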
-- 
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: oberman () es net                       Phone: +1 510 486-8634
Key fingerprint:059B 2DDF 031C 9BA3 14A4  EADA 927D EBB3 987B 3751
