nanog mailing list archives

Re: Comcast - Significant v4 vs v6 throughput differences, almost stateful.


From: Matthew Petach <mpetach () netflight com>
Date: Thu, 23 Apr 2020 14:26:31 -0700

On Thu, Apr 23, 2020 at 12:45 PM Sabri Berisha <sabri () cluecentral net>
wrote:

----- On Apr 23, 2020, at 8:06 AM, Nick Zurku <nzurku () teraswitch com>
wrote:

We’re having serious throughput issues with our AS20326 pushing packets to
Comcast over v4. Our transfers are either the full line-speed of the
Comcast customer modem, or they’re seemingly capped at 200-300KB/s. This
behavior appears to be almost stateful, as if the speed is decided when the
connection starts. As long as it starts fast it will remain fast for the
length of the transfer and slow if it starts slow. Traces seem reasonable
and currently we’ve influenced the path onto GTT both ways. If we prepend
and reroute on our side, the same exact issue will happen on another
transit provider.

Have you tried running a test to see if there may be ECMP issues? I wrote
a rudimentary script once, https://pastebin.com/TTWEj12T, that might help
here. This script is written to detect packet loss on multiple ECMP paths,
but you might be able to modify it for throughput.

The rationale behind my thinking is that if you have certain ECMP links
that are oversubscribed, the TCP sessions following that path will stay
"low" bandwidth. Sessions that win the ECMP lottery and pass through a
non-congested ECMP path may show better performance.
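
To make the "ECMP lottery" concrete, here is a minimal sketch of per-flow
hashing. It assumes the router hashes the 5-tuple and takes the result modulo
the number of links; real devices use vendor-specific hash functions and
inputs, so the CRC32 hash, the function names, the addresses, and the port
numbers below are all illustrative assumptions. The point it shows is that a
flow's link is fixed for the life of the connection, and that varying the
source port is one way to land probes on different links.

```python
import zlib

def ecmp_link(src_ip, dst_ip, proto, src_port, dst_port, n_links):
    """Pick an egress link index from the flow 5-tuple.

    Illustrative hash only: real routers use their own hash functions.
    """
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    return zlib.crc32(key) % n_links

def ports_by_link(src_ip, dst_ip, dst_port, n_links, port_range):
    """Group candidate source ports by the link their flow would hash onto."""
    buckets = {link: [] for link in range(n_links)}
    for sport in port_range:
        # Protocol 17 is UDP; only the source port changes between probes.
        link = ecmp_link(src_ip, dst_ip, 17, sport, dst_port, n_links)
        buckets[link].append(sport)
    return buckets

if __name__ == "__main__":
    # Documentation-range addresses and an arbitrary destination port.
    buckets = ports_by_link("192.0.2.10", "198.51.100.20", 33434, 4,
                            range(40000, 40064))
    for link, ports in buckets.items():
        print(f"link {link}: {len(ports)} flows, e.g. source ports {ports[:3]}")
```

Because the same 5-tuple always hashes to the same link, a TCP session that
lands on an oversubscribed link stays there, which would match the "fast stays
fast, slow stays slow" behavior described above.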

Thanks,

Sabri



And for a slightly more formal package to do this,
there's UDPing, developed by the amazing networking
team at Yahoo; it was written to identify intermittent
issues affecting a single link in an ECMP or L2-hashed
aggregate link pathway.

https://github.com/yahoo/UDPing

It does have the disadvantage of being designed for
one-way measurement in each direction; that decision
was intentional, to ensure each direction was measuring
a completely known, deterministic pathway based on the
hash values in the packets, without the return trip potentially
obscuring or complicating identification of problematic links.

But if you have access to both the source and destination ends
of the connection, it's a wonderful tool to narrow down exactly
where the underlying problem on a hashed ECMP/aggregate
link is.
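
As a rough illustration of why one-way measurement helps, the sketch below
tallies loss per source port from records collected at the receiving end. This
is not UDPing's actual implementation or packet format; the function name, the
(source port, sequence number) record shape, and the sample numbers are
assumptions made for the example. Since only the forward path is involved, a
source port that shows loss points at the link its hash selects in that
direction.

```python
from collections import defaultdict

def loss_by_source_port(sent_per_port, received):
    """Return {source_port: fraction_lost} from one-way probe records.

    sent_per_port: number of probes the sender emitted on each source port
                   (each count is assumed to be greater than zero).
    received: iterable of (source_port, sequence_number) seen by the receiver.
    """
    seen = defaultdict(set)
    for sport, seq in received:
        seen[sport].add(seq)  # a set ignores duplicated packets
    return {
        sport: 1.0 - len(seen[sport]) / sent
        for sport, sent in sent_per_port.items()
    }

if __name__ == "__main__":
    # Hypothetical sample: four probes sent on each of two source ports.
    sent = {40000: 4, 40001: 4}
    got = [(40000, 0), (40000, 1), (40000, 2), (40000, 3),
           (40001, 0), (40001, 3)]
    for sport, loss in sorted(loss_by_source_port(sent, got).items()):
        print(f"source port {sport}: {loss:.0%} loss")
```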

Matt
