nanog mailing list archives

Re: QUIC traffic throttled on AT&T residential


From: Mark Andrews <marka () isc org>
Date: Wed, 19 Feb 2020 16:26:09 +1100



On 19 Feb 2020, at 15:47, Daniel Sterling <sterling.daniel () gmail com> wrote:

On Tue, Feb 18, 2020 at 8:05 PM Michael Brown <michael () supermathie net> wrote:
Blocking a (for you) undesirable option when an established fallback
exists is a much better end user experience than introducing breakage
into that option

Or: I no longer use my ISP's IPv6 access (via 6rd) since it would cause
terrible slowdowns due to packet loss when it broke

All the +1s.


I have one goal for my home internet: it should work better than my cell phone!

In our house, "screen time" is "all the time". Everyone is on their
phone non-stop. So you'd think getting fiber, installing UBNT APs and
swapping out AT&T's CPE for a Core i5 Linux box would provide a better
internet experience than a tree-obstructed cell tower a mile or two
down the road.

But you'd be wrong.


Everyone in the house was, on a daily basis, turning off wifi in favor
of AT&T LTE!

Why was everyone switching off wifi? I couldn't blame them -- I was
toggling it on and off myself to get the occasional website or IM
conversation to load. Why was my network broken?? How was it possible
that a fairly high-latency mobile connection could provide a better
experience than 802.11ac to an AP in the same room backed by a gigabit
PON?


I banged against this for *years*. I punted on using my own router and
tried just AT&T's CPE (reset to factory defaults). That does work
decently, but there are some maddening quirks, not least of which is
insanely high jitter.

I tried SOHO devices running vendor stock firmware driving hardware
NAT. Those also work well -- until they inevitably crash.


I tried DD-WRT and OpenWrt. I tried AQM; I tried QoS. I tried NUCs
running upstream kernels; downstream kernels; I tried custom patches.
I tried HFSC and CoDel; I compiled iproute2 so I could have some
tc_cake.


On the link-layer (wifi) side I tried one AP; two APs; three APs. I
tested any number of combinations of SSID name, channel frequency and
width. I tried with IPv6; no IPv6; I put all 2.4 GHz devices on their own
AP; in desperation I even tried dedicating one AP and an entire 5 GHz
frequency range to just one phone.

But nothing mattered until I finally figured it out:

It was DNS. Of course it was DNS. It's always DNS.


When DNS was solid and fast, everything else fell into place. Toggling
wifi worked because it was the same as re-querying DNS! And the DNS
service on mobile works well -- better than the average CPE.

*** I cannot stress this enough. No manner of tuning or tweaking to my
home network stopped users from fleeing it, until I had solid DNS. ***


For fast DNS, you of course need fast UDP. And, as we've empirically
discovered, well-behaved UDP is by no means guaranteed on residential
connections.

It turns out dnsmasq has a couple of tunables that can make a huge
difference to home internet DNS performance. First, you need to be
querying the DNS servers AT&T tells you to via DHCP. They're the least
likely to be throttled, unfortunately. But I've found even that alone
isn't enough.

You need to set dnsmasq's "query-port" option. By default dnsmasq
randomizes the source port to protect against CVE-2008-1447, but
apparently sending a ton of random-source-port UDP traffic does not
impress the AT&T network flow control systems, and your DNS traffic
becomes unbearably slow (or is simply dropped entirely).

If dnsmasq supports DNS COOKIE, turn it on.  DNS COOKIE provides protection
against CVE-2008-1447, provided the other end supports DNS COOKIE, without
having to play games with ports.

AT&T gives you two DNS servers via DHCP. You can query more --
8.8.8.8, 4.2.2.2, 2606:4700:4700::1111 -- but if you do, you'll want
to enable dnsmasq's "all-servers" option. Packets are cheap -- send a
query to every server on your list and use whatever comes back first.
If you've angered the UDP flow restrictor, no matter -- with luck at
least one of your packets is going to a server that's up and not
throttled or overloaded!
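
For readers who want to see the idea rather than the dnsmasq option, here
is a minimal Python sketch of "ask every server, take the first answer".
It is not how dnsmasq implements "all-servers"; the function names, the
source_port parameter and the 2-second timeout are assumptions made up for
illustration. It is IPv4-only, so a v6 resolver such as
2606:4700:4700::1111 would need a second AF_INET6 socket. Passing a
nonzero source_port mimics a fixed "query-port" and carries the same
CVE-2008-1447 trade-off mentioned above.

```python
import secrets
import select
import socket
import struct
import time


def build_query(name, txid, qtype=1):
    """Build a minimal RFC 1035 query for `name` (qtype 1 = A record)."""
    # Header: ID, flags (only RD set), QDCOUNT=1, AN/NS/AR counts = 0.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(label)]) + label.encode("ascii") for label in labels)
    # Root label terminator, then QTYPE and QCLASS=IN.
    return header + qname + b"\x00" + struct.pack("!HH", qtype, 1)


def first_answer(name, servers, source_port=0, timeout=2.0):
    """Send one query to every IPv4 server; return (server, raw_reply) for
    the first matching reply, or None if nothing arrives before the timeout.

    source_port=0 lets the OS pick the port; a nonzero value pins it
    (hypothetical stand-in for a fixed query port).
    """
    txid = secrets.randbelow(1 << 16)
    packet = build_query(name, txid)
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind(("0.0.0.0", source_port))
        for server in servers:
            sock.sendto(packet, (server, 53))
        deadline = time.monotonic() + timeout
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return None
            ready, _, _ = select.select([sock], [], [], remaining)
            if not ready:
                return None
            reply, (addr, _port) = sock.recvfrom(4096)
            # Ignore strays: the reply must come from a server we asked
            # and echo our transaction ID.
            if addr in servers and reply[:2] == packet[:2]:
                return addr, reply
    finally:
        sock.close()
```

A call such as first_answer("example.com", ["8.8.8.8", "4.2.2.2"]) returns
whichever resolver replied first along with the unparsed reply; the slower
or throttled servers are simply never waited for.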


Of course DNS is just the beginning -- you still need a proper gateway
device with a good NAT stack and/or firewall; you still need a strong
wifi signal; you still need tc_cake so everyone can watch Netflix at
the same time.

But DNS is the *core*. Nothing works well until DNS works well. That
means nothing works well unless UDP works well. And if I have learned
anything about AS7018, it's that UDP -- especially its v4 UDP -- Does.
Not. Work. Well.


Enter QUIC. It may be the perfect transport-layer protocol; but by
putting it on top of UDP it's hobbled. It breaks extant v4 internet in
a way that nothing else we've yet seen does -- it takes what would be
your TCP traffic and gives it inconsistent and intermittently poor
performance. Maybe it's sometimes fast. Maybe it is. But I can tell
you, it sometimes Is Very Much Not So.


As much as I would on principle rather not stick to a legacy, TCP-only
home network --

I can say that right now, my home internet, blocking UDP 443, and
making tons of insecure DNS queries -- is the most stable, fastest,
most usable and enjoyable home internet I've ever had. And my users
agree -- they no longer turn off wifi.


May I naively ask if Google staff have considered scrapping the use of UDP
and instead proposing a new, first-class transport protocol that OSes
can implement on top of IP? UDP certainly helped speed testing and
iteration for QUIC in real-world scenarios, but I fear UDP is too
brittle ground upon which to build the next generation of internet
transport. Committing to UDP now with HTTP/3 may be a mistake.

And if that doesn't convince you, consider that even I was smart
enough to figure out how to block it :)

-- Dan

-- 
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742              INTERNET: marka () isc org

