nanog mailing list archives

Re: 400G forwarding - how does it work?


From: Masataka Ohta <mohta () necom830 hpcl titech ac jp>
Date: Sun, 7 Aug 2022 20:16:07 +0900

Saku Ytti wrote:

I'm afraid you imply too much buffer bloat only to cause
unnecessary and unpleasant delay.

With 99% load M/M/1, 500 packets (750kB for 1500B MTU) of
buffer is enough to make packet drop probability less than
1%. With 98% load, the probability is 0.0041%.

I feel like I'll live to regret asking. Which congestion control
algorithm are you thinking of?

I'm not assuming LAN environment, for which paced TCP may
be desirable (if bandwidth requirement is tight, which is
unlikely in LAN).

But Cubic and Reno will burst tcp window growth at sender rate, which
may be much more than receiver rate, someone has to store that growth
and pace it out at receiver rate, otherwise window won't grow, and
receiver rate won't be achieved.

When many TCPs are running, burst is averaged and traffic
is poisson.

So in an ideal scenario, no we don't need a lot of buffer, in
practical situations today, yes we need quite a bit of buffer.

That is an old theory known to be invalid (Ethernet switches with
small buffer is enough for IXes) and theoretically denied by:

        Sizing router buffers
        https://dl.acm.org/doi/10.1145/1030194.1015499

after which paced TCP was developed for unimportant exceptional
cases of LAN.

> Now add to this multiple logical interfaces, each having 4-8 queues,
> it adds up.

Having so may queues requires sorting of queues to properly
prioritize them, which costs a lot of computation (and
performance loss) for no benefit and is a bad idea.

> Also the shallow ingress buffers discussed in the thread are not delay
> buffers and the problem is complex because no device is marketable
> that can accept wire rate of minimum packet size, so what trade-offs
> do we carry, when we get bad traffic at wire rate at small packet
> size? We can't empty the ingress buffers fast enough, do we have
> physical memory for each port, do we share, how do we share?

People who use irrationally small packets will suffer, which is
not a problem for the rest of us.

                                                Masataka Ohta



Current thread: