nanog mailing list archives

Re: Facebook post-mortems...


From: Masataka Ohta <mohta () necom830 hpcl titech ac jp>
Date: Wed, 6 Oct 2021 17:51:25 +0900

Hank Nussbacher wrote:

- "it was not possible to access our data centers through our normal
means because their networks were down, and second, the total loss
of DNS broke many of the internal tools we'd normally use to
investigate and resolve outages like this. Our primary and
out-of-band network access was down..."

Does this mean that FB acknowledges that the loss of DNS broke their
OOB access?

It means FB still does not understand what happened.

A lack of BGP announcements does not mean "total loss" of DNS. The
name servers should still be accessible to internal tools.

But withdrawing the route (in BGP and, maybe, the IGP) of a failing
anycast server is bad engineering, seemingly derived from the common
misunderstanding that anycast can provide redundancy.

Redundancy of DNS is maintained by multiple (unicast or anycast)
name servers with different addresses, for which withdrawal of a
failing route is an unnecessary complication.
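
As a rough illustration of that point, here is a minimal sketch of how
a client gets redundancy from several name server addresses, with no
route withdrawal involved. Everything in it is an assumption made for
illustration: the function names, the injected query callable, and the
documentation-range addresses are hypothetical, and a real resolver
also handles retries, timeouts and server selection in more detail.

```python
from typing import Callable, Iterable, Optional


def resolve_with_failover(
    name: str,
    server_addrs: Iterable[str],
    query: Callable[[str, str], str],
) -> Optional[str]:
    """Ask each name server address in turn; return the first answer."""
    for addr in server_addrs:
        try:
            return query(addr, name)
        except (TimeoutError, ConnectionError):
            # This address (unicast, or whichever anycast instance the
            # network routed us to) did not answer: try the next one.
            continue
    # No listed server answered.
    return None


def fake_query(addr: str, name: str) -> str:
    """Hypothetical stand-in for a real DNS query, so the sketch runs offline."""
    if addr == "192.0.2.1":  # pretend this server is failing
        raise TimeoutError(addr)
    return "203.0.113.10"


if __name__ == "__main__":
    servers = ["192.0.2.1", "198.51.100.1"]
    print(resolve_with_failover("example.com", servers, fake_query))
```

The failing server at the first address simply times out and the client
moves on to the second address; nothing in the routing system has to
change for the lookup to succeed.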

                                                Masataka Ohta

