nanog mailing list archives

Re: free collaborative tools for low BW and lossy connections


From: Nick Hilliard <nick () foobar org>
Date: Sun, 29 Mar 2020 19:46:28 +0100

Joe Greco wrote on 29/03/2020 15:56:
> On Sun, Mar 29, 2020 at 03:01:04PM +0100, Nick Hilliard wrote:
>> because it uses flooding and can't guarantee reliable message
>> distribution, particularly at higher traffic levels.

> That's so hideously wrong.  It's like claiming web forums don't
> work because IP packet delivery isn't reliable.

Really, it's nothing like that.

> Usenet message delivery at higher levels works just fine, except that
> on the public backbone, it is generally implemented as "best effort"
> rather than a concerted effort to deliver reliably.

If you can explain the bit of the protocol that guarantees that all nodes have received all postings, then let's discuss it.

> The concept of flooding isn't problematic by itself.

Flooding often works fine until you attempt to scale it; then it breaks, just as Bjørn admitted. Flooding is inherently problematic at scale.
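
As a rough sketch of why that is (a full mesh of peers is an assumption here, purely for illustration): with flooding, every node offers every article to every peer, so the number of offers grows with the square of the peer count, while the number of copies anyone actually needs only grows linearly.

    # Toy model: N peers in a full mesh flooding a single article.
    # Every node offers the article to every other node, so offers grow
    # as N*(N-1) even though only N copies are needed in total.
    def flood_cost(num_peers: int) -> tuple[int, int]:
        offers = num_peers * (num_peers - 1)
        needed = num_peers
        return offers, needed

    for n in (3, 10, 50, 200):
        offers, needed = flood_cost(n)
        print(f"{n:4d} peers: {offers:6d} offers for {needed:3d} needed copies")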

> If you wanted to
> implement a collaborative system, you could easily run a private
> hierarchy and run a separate feed for it, which you could then monitor
> for backlogs or issues.  You do not need to dump your local traffic on
> the public Usenet.  This can happily coexist alongside public traffic
> on your server.  It is easy to make it 100% reliable if that is a goal.

For sure, you can operate mostly reliable self-contained systems with limited distribution. We're all in agreement about this.
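
To be fair, the monitoring side Joe describes is straightforward. Something like this minimal sketch would do, assuming a Diablo/INN-style spool where unsent articles queue up as backlog files; the path and size limit below are made up for illustration, not real defaults of any particular server:

    # Minimal backlog check for a dedicated outgoing feed.  BACKLOG_DIR
    # and the limit are hypothetical; adjust to wherever your feed queues
    # its unsent articles.
    from pathlib import Path

    BACKLOG_DIR = Path("/var/spool/news/backlog")    # assumed location
    MAX_BACKLOG_BYTES = 50 * 1024 * 1024             # arbitrary 50 MB limit

    def check_backlogs() -> None:
        if not BACKLOG_DIR.is_dir():
            print(f"no backlog directory at {BACKLOG_DIR}")
            return
        for f in sorted(BACKLOG_DIR.iterdir()):
            size = f.stat().st_size
            state = "OK" if size <= MAX_BACKLOG_BYTES else "ALERT: feed falling behind"
            print(f"{f.name}: {size} bytes  {state}")

    if __name__ == "__main__":
        check_backlogs()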

>> The fact that it ended up having to implement TAKETHIS is only one
>> indication of what a truly awful protocol it is.

> No, the fact that it ended up having to implement TAKETHIS is a nod to
> the problem of RTT.

TAKETHIS was necessary to keep things running because of the dual problem of RTT and lack of pipelining. Taken together, these two problems made it impossible to optimise incoming feeds, because of ... well, flooding, which meant that even if you attempted an IHAVE, by the time you delivered the article, some other feed might already have delivered it. TAKETHIS managed to sweep these problems under the carpet, but it's a horrible, awful protocol hack.
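
To put rough numbers on the RTT point (assumed figures, not measurements): a lock-step IHAVE exchange pays a couple of round trips per article, while a streamed CHECK/TAKETHIS feed (RFC 4644) pipelines the offers, so the latency is paid roughly once rather than per article.

    # Assumed numbers, purely illustrative: 200 ms RTT, 10,000 articles,
    # transfer time ignored in both cases.
    RTT = 0.2
    ARTICLES = 10_000

    # Lock-step IHAVE: offer, wait for 335, send the body, wait for 235.
    # Roughly two round trips per article, one article at a time.
    ihave_wait = ARTICLES * 2 * RTT

    # Streamed CHECK/TAKETHIS: commands and bodies are pipelined, so the
    # sender pays the round-trip latency roughly once, not per article.
    streaming_wait = 2 * RTT

    print(f"lock-step IHAVE : ~{ihave_wait / 3600:.1f} hours lost to RTT")
    print(f"streamed feed   : ~{streaming_wait:.1f} seconds lost to RTT")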

> It did and has.  The large scale binaries sites are still doing a
> great job of propagating binaries with very close to 100% reliability.

which is mostly because there are so few large binary sites these days, i.e. it's a limited distribution model.

> I was there.

So was I, and probably so were lots of other people on nanog-l. We all played our part trying to keep the thing hanging together.

> I'm the maintainer of Diablo.  It's fair to say I had a
> large influence on this issue as it was Diablo's distributed backend
> capability that really instigated retention competition, and a number
> of optimizations that I made helped make it practical.

Diablo was great - I used it for years after INN-related head-bleeding. Afterwards, Typhoon improved things even more.

> The problem for smaller sites is simply the immense traffic volume.
> If you want to carry binaries, you need double-digit Gbps.  If you
> filter them out, the load is actually quite trivial.

Right, so you've put your finger on the other major problem with flooding, the one that isn't the distribution synchronisation / optimisation problem: all sites get all posts for all the groups they're configured to carry. This is a profound waste of resources, and it doesn't scale in any meaningful way.
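
Back-of-the-envelope, just to put a number on it (the daily volume is an assumed figure for illustration, not a measurement of any particular feed):

    # Assumed full-feed volume; real numbers vary and peak well above
    # the average.
    TB_PER_DAY = 150
    SECONDS_PER_DAY = 86_400

    average_gbps = TB_PER_DAY * 1e12 * 8 / SECONDS_PER_DAY / 1e9
    print(f"{TB_PER_DAY} TB/day is ~{average_gbps:.1f} Gbit/s sustained, "
          f"before peaks, redundant feeds or reader traffic")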

Nick

