nanog mailing list archives

Re: NDAA passed: Internet and Online Streaming Services Emergency Alert Study


From: Andy Brezinsky <andy () mbrez com>
Date: Sun, 3 Jan 2021 16:01:03 -0600

At this point I would assume that nearly every device is persisting at least one long-lived TCP connection.  Whether it's for telemetry or command and control, everything these days seems to have this capability.  As an example, I can hit a button in the Nintendo Switch parent app on my phone and my kid's Switch reflects the change a second later.  That's not even a platform I would have expected to have that capability.

If they already have an existing connection, then there are plenty of high-connection-count solutions in the IoT space that could easily handle this number of connections.  A single 12-core, 32 GB box running emqttd could handle 1.3M connections.  Picking a random AWS EC2 instance size, m5.4xlarge, keeping that connection open and passing data would run you about $0.003/year per device.  I assume you could drive that down significantly from there.
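For a sense of what that looks like on the device side, here's a minimal sketch of a long-lived subscriber using the Python paho-mqtt client.  The broker host, port, topic, and keepalive interval below are placeholder assumptions, not anything any vendor actually uses:

    import paho.mqtt.client as mqtt

    # Hypothetical broker and topic names, for illustration only.
    BROKER_HOST = "broker.example.net"
    BROKER_PORT = 1883
    ALERT_TOPIC = "alerts/us/national"

    def on_connect(client, userdata, flags, rc):
        # (Re)subscribe on every connect so a broker restart doesn't drop us.
        client.subscribe(ALERT_TOPIC, qos=1)

    def on_message(client, userdata, msg):
        # On a real device this would wake the UI and display the alert.
        print(msg.topic, msg.payload.decode())

    client = mqtt.Client()
    client.on_connect = on_connect
    client.on_message = on_message
    # keepalive=300 means one tiny PINGREQ/PINGRESP exchange every ~5 minutes,
    # which is what keeps the idle connection alive through NATs and firewalls.
    client.connect(BROKER_HOST, BROKER_PORT, keepalive=300)
    # loop_forever() blocks for the life of the device and handles reconnects.
    client.loop_forever()

The connection sits almost completely idle; the broker only pushes bytes when there's actually something to say, which is what makes the per-device cost so small.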


On 01/03/2021 03:35 PM, Brandon Martin wrote:
On 1/3/21 4:22 PM, Mark Delany wrote:
Creating quiescent sockets has certainly been discussed in the context of RSS where you might want to server-notify a large number of long-held client connections very
infrequently.

While a kernel could quiesce a TCP socket down to maybe 100 bytes or so (endpoint tuples, sequence numbers, window sizes and a few other odds and sods), a big residual cost is
application state - in particular TLS state.

Even with a participating application, quiescing in-memory state to something less than, say, 1KB is probably hard but might be doable with a participating TLS library. If so, a million quiescent connections could conceivably be stashed in a coupla GB of memory. And of course if you're prepared to wear a disk read to recover quiescent state, your in-memory cost could be less than 100 bytes allowing many millions of quiescent
connections per server.
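As a rough back-of-the-envelope illustration of the ~100 byte figure, here's a sketch of one possible minimal record; the field choices and sizes are assumptions for illustration, not any real kernel's layout:

    import struct

    # Hypothetical minimal quiescent-TCP record (fields are assumptions):
    #   16s 16s  local/remote IPv6 address     H H  local/remote port
    #   I I      send/receive sequence number  I I  send/receive window
    #   B B      window scale factors          H    negotiated MSS
    #   I        last-activity timestamp       B    option flags (SACK etc.)
    QUIESCENT_TCP_FMT = "!16s16sHHIIIIBBHIB"
    print(struct.calcsize(QUIESCENT_TCP_FMT))  # 61 bytes, comfortably under 100

The TLS side (negotiated cipher suite, traffic keys, record sequence numbers) is what pushes the per-connection footprint toward the 1KB mark.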

Having said all that, as far as I understand it, none of the DNS-over-TCP systems imply centralization; that's just how a few applications have chosen to deploy. We deploy DOH to a private self-managed server pool which consumes a measly 10-20 concurrent TCP sessions.

I was thinking more in the original context of this thread w.r.t. potential distribution of emergency alerts.  That could, if semi-centralized, easily result in hundreds of millions of connections to juggle across a single service just for the USA.  While it presumably wouldn't be quite that centralized, it's a sizable problem to manage.

Obviously you could distribute it out a la the CDN model that the content providers use, but then you're potentially devoting a sizable chunk of hardware resources to something that really doesn't otherwise require it.

The nice thing is that such emergency alerts don't require confidentiality and can relatively easily bear in-band, application-level authentication (in fact, that seems preferable to relying only on session-level authentication).  That means you could easily carry them over plain HTTP or similar, which removes the TLS overhead you mention.
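As a sketch of what that in-band authentication could look like, assuming a detached Ed25519 signature carried alongside the alert body and a pre-distributed authority key (both of which are just assumptions for illustration):

    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    # Stand-in for the alert authority's key pair; in practice only the
    # public key would be baked into devices.
    authority_key = Ed25519PrivateKey.generate()
    device_trusted_pubkey = authority_key.public_key()

    def alert_is_authentic(body: bytes, signature: bytes) -> bool:
        """Verify the detached signature carried alongside the alert."""
        try:
            device_trusted_pubkey.verify(signature, body)
            return True
        except InvalidSignature:
            return False

    # The authority signs the alert body; the device checks it on receipt.
    body = b"EMERGENCY ALERT: example payload"
    assert alert_is_authentic(body, authority_key.sign(body))

Since the signature travels with the payload, the transport underneath can be plain HTTP (or anything else) without weakening the authenticity check.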

Several GB of RAM is nothing for a modern server, of course.  It sounds like you'd probably run into other scaling issues well before you hit the memory limits of juggling legitimate TCP connection state.

