nanog mailing list archives

Re: Facebook post-mortems...


From: Tom Beecher <beecher () beecher cc>
Date: Tue, 5 Oct 2021 08:33:00 -0400


Maybe withdrawing those routes to their NS could have been mitigated by
having NS in separate entities.


Assuming they had such a thing in place, it would not have helped.

Facebook stopped announcing the vast majority of their IP space to the DFZ
during this. So even if they did have an offnet DNS server that could have
provided answers to clients, those same clients probably wouldn't have been
able to connect to the IPs returned anyways.

If you are running your own auths like they are, you likely view your
public network reachability as almost bulletproof and assume it will never
disappear. Which is probably true most of the time. Until yesterday happens
and the 9's in your reliability percentage change to 7's.
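
To put rough numbers on the "nines" remark, here is a minimal Ruby sketch of
the availability arithmetic. The six-hour outage length is an assumption used
for illustration only (the thread gives a start time, not a duration), and the
helper names are made up for this example.

```ruby
# Hours in a non-leap year; leap years are ignored for simplicity.
HOURS_PER_YEAR = 365 * 24

# Yearly availability (in percent) after a given number of outage hours.
def availability_percent(outage_hours)
  100.0 * (HOURS_PER_YEAR - outage_hours) / HOURS_PER_YEAR
end

# Downtime budget (in minutes per year) implied by an availability target.
def allowed_downtime_minutes(target_percent)
  HOURS_PER_YEAR * 60 * (100.0 - target_percent) / 100.0
end

# ASSUMPTION: a single six-hour outage, and no other downtime that year.
puts format("availability after 6h outage: %.3f%%", availability_percent(6))
puts format("budget at 99.99%%: %.1f minutes/year", allowed_downtime_minutes(99.99))
```

Under that assumption a single six-hour event lands around 99.93% for the
year, while a four-nines target allows roughly 52.6 minutes in total.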

On Tue, Oct 5, 2021 at 8:10 AM Jean St-Laurent via NANOG <nanog () nanog org>
wrote:

Maybe withdrawing those routes to their NS could have been mitigated by
having NS in separate entities.

Let's check how these big companies are spreading their NS's.

$ dig +short facebook.com NS
d.ns.facebook.com.
b.ns.facebook.com.
c.ns.facebook.com.
a.ns.facebook.com.

$ dig +short google.com NS
ns1.google.com.
ns4.google.com.
ns2.google.com.
ns3.google.com.

$ dig +short apple.com NS
a.ns.apple.com.
b.ns.apple.com.
c.ns.apple.com.
d.ns.apple.com.

$ dig +short amazon.com NS
ns4.p31.dynect.net.
ns3.p31.dynect.net.
ns1.p31.dynect.net.
ns2.p31.dynect.net.
pdns6.ultradns.co.uk.
pdns1.ultradns.net.

$ dig +short netflix.com NS
ns-1372.awsdns-43.org.
ns-1984.awsdns-56.co.uk.
ns-659.awsdns-18.net.
ns-81.awsdns-10.com.

Amazon and Netflix seem to not keep their eggs in the same basket. From a
first look, they seem more resilient than facebook.com, google.com and
apple.com.
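
The eyeball comparison above can be roughed out in Ruby by grouping each NS
name under its parent domain. This is only a sketch: the hostnames are copied
from the dig output above, the helper names are made up, and the one-entry
suffix list is a stand-in assumption for a real Public Suffix List lookup.

```ruby
# ASSUMPTION: tiny hand-written suffix list; real code should consult the
# Public Suffix List instead.
MULTI_LABEL_SUFFIXES = %w[co.uk].freeze

# Reduce a nameserver hostname to its registrable parent domain.
def registrable_domain(hostname)
  labels = hostname.downcase.chomp(".").split(".")
  suffix_len = MULTI_LABEL_SUFFIXES.include?(labels.last(2).join(".")) ? 2 : 1
  labels.last(suffix_len + 1).join(".")
end

# Distinct parent domains behind a zone's NS set, sorted for stable output.
def ns_domains(nameservers)
  nameservers.map { |ns| registrable_domain(ns) }.uniq.sort
end

# NS sets as they appear in the dig output quoted in this message.
FACEBOOK_NS = %w[d.ns.facebook.com. b.ns.facebook.com. c.ns.facebook.com. a.ns.facebook.com.].freeze
AMAZON_NS   = %w[ns4.p31.dynect.net. ns3.p31.dynect.net. ns1.p31.dynect.net.
                 ns2.p31.dynect.net. pdns6.ultradns.co.uk. pdns1.ultradns.net.].freeze
NETFLIX_NS  = %w[ns-1372.awsdns-43.org. ns-1984.awsdns-56.co.uk.
                 ns-659.awsdns-18.net. ns-81.awsdns-10.com.].freeze

{ "facebook.com" => FACEBOOK_NS, "amazon.com" => AMAZON_NS, "netflix.com" => NETFLIX_NS }.each do |zone, ns|
  puts "#{zone}: #{ns_domains(ns).join(', ')}"
end
```

Counting parent domains is a weak proxy for resilience. The four awsdns-*
names Netflix uses sit under four different TLDs but belong to one DNS
provider, and none of this says anything about whether the addresses behind
the names stay routed.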

Jean

-----Original Message-----
From: NANOG <nanog-bounces+jean=ddostest.me () nanog org> On Behalf Of Jeff
Tantsura
Sent: October 5, 2021 2:18 AM
To: William Herrin <bill () herrin us>
Cc: nanog () nanog org
Subject: Re: Facebook post-mortems...

129.134.30.0/23, 129.134.30.0/24, 129.134.31.0/24. The specific routes
covering all 4 nameservers (a-d) were withdrawn from all FB peering at
approximately 15:40 UTC.
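
A short Ruby sketch of what "covering" means here: an address is covered by
a prefix when it falls inside that prefix's range. The three prefixes are the
ones listed above. The host addresses are illustrative assumptions, not
values taken from this thread, and the helper name is made up.

```ruby
require "ipaddr"

# Prefixes named in the message above.
WITHDRAWN_PREFIXES = %w[129.134.30.0/23 129.134.30.0/24 129.134.31.0/24].freeze

# Return every listed prefix that contains the given host address.
def covering_prefixes(address, prefixes = WITHDRAWN_PREFIXES)
  ip = IPAddr.new(address)
  prefixes.select { |cidr| IPAddr.new(cidr).include?(ip) }
end

# ASSUMPTION: example host addresses chosen for illustration only.
# 192.0.2.53 is from a documentation range and is outside all three prefixes.
%w[129.134.30.12 129.134.31.12 192.0.2.53].each do |addr|
  puts "#{addr}: #{covering_prefixes(addr).inspect}"
end
```

Any host in 129.134.30.0/24 or 129.134.31.0/24 is also inside the /23, so a
resolver needs both the more-specific /24 and the covering /23 gone before it
loses its route to that address.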

Cheers,
Jeff

On Oct 4, 2021, at 22:45, William Herrin <bill () herrin us> wrote:

On Mon, Oct 4, 2021 at 6:15 PM Michael Thomas <mike () mtcc com> wrote:
They have a monkey patch subsystem. Lol.

Yes, actually, they do. They use Chef extensively to configure
operating systems. Chef is written in Ruby. Ruby has something called
Monkey Patches. This is where at an arbitrary location in the code you
re-open an object defined elsewhere and change its methods.

Chef doesn't always do the right thing. You tell Chef to remove an RPM
and it does. Even if it has to remove half the operating system to
satisfy the dependencies. If you want it to do something reasonable,
say throw an error because you didn't actually tell it to remove half
the operating system, you have a choice: spin up a fork of Chef with a
couple patches to the Chef-RPM interaction or just monkey-patch it in
one of your Chef recipes.
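
A minimal Ruby sketch of the kind of monkey patch described above. Everything
in it is hypothetical: PackageRemover, its dependency table, and the package
names are stand-ins invented for illustration and are not Chef's classes or
API.

```ruby
# Hypothetical class "defined elsewhere", standing in for a package resource.
class PackageRemover
  # Made-up dependency table: package name => packages that depend on it.
  DEPENDENTS = { "libfoo" => %w[bar baz], "leaf" => [] }.freeze

  def dependents_of(name)
    DEPENDENTS.fetch(name, [])
  end

  # Original behavior: remove the package plus everything depending on it.
  def remove(name)
    [name] + dependents_of(name)
  end
end

# The monkey patch: re-open the same class later and replace `remove`.
class PackageRemover
  # Keep a handle on the original implementation under a new name.
  alias_method :remove_without_guard, :remove

  # Refuse to remove a package that would drag dependents along with it.
  def remove(name)
    dependents = dependents_of(name)
    unless dependents.empty?
      raise ArgumentError, "refusing to remove #{name}: required by #{dependents.join(', ')}"
    end
    remove_without_guard(name)
  end
end

remover = PackageRemover.new
puts remover.remove("leaf").inspect
begin
  remover.remove("libfoo")
rescue ArgumentError => e
  puts e.message
end
```

The second class block does not define a new class. It changes the existing
one for every caller in the process, which is what makes the approach quick
to apply and easy to lose track of compared with maintaining a fork.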

Regards,
Bill Herrin

--
William Herrin
bill () herrin us
https://bill.herrin.us/


