nanog mailing list archives
Re: Facebook post-mortems...
From: Warren Kumari <warren () kumari net>
Date: Tue, 5 Oct 2021 14:07:46 -0400
On Tue, Oct 5, 2021 at 1:47 PM Miles Fidelman <mfidelman () meetinghouse net> wrote:
> jcurran () istaff org wrote:
>> Fairly abstract - Facebook Engineering -
>> https://m.facebook.com/nt/screen/?params=%7B%22note_id%22%3A10158791436142200%7D&path=%2Fnotes%2Fnote%2F&_rdr
>>
>> Also, Cloudflare’s take on the outage -
>> https://blog.cloudflare.com/october-2021-facebook-outage/
>>
>> FYI,
>> /John
>
> This may be a dumb question, but does this suggest that Facebook
> publishes rather short TTLs for their DNS records? Otherwise, why would
> an internal failure make them unreachable so quickly?
Looks like 60 seconds:

$ dig +norec star-mini.c10r.facebook.com. @d.ns.c10r.facebook.com.

; <<>> DiG 9.10.6 <<>> +norec star-mini.c10r.facebook.com. @d.ns.c10r.facebook.com.
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 25582
;; flags: qr aa; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;star-mini.c10r.facebook.com. IN A

;; ANSWER SECTION:
star-mini.c10r.facebook.com. 60 IN A 157.240.229.35

;; Query time: 42 msec
;; SERVER: 185.89.219.11#53(185.89.219.11)
;; WHEN: Tue Oct 05 14:01:06 EDT 2021
;; MSG SIZE rcvd: 72

... and cue the "Bwahahhaha! If *I* ran Facebook I'd make the TTL be [2 sec|30sec|5min|1h|6h+3sec|1day|6months|maxint32]" threads....

Choosing the TTL is a balancing act between stability, agility, load, politeness, renewal latency, etc -- but I'm sure NANOG can boil it down to "They did it wrong!..."

W
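To put a rough number on "unreachable so quickly", here is a minimal Python sketch. It pulls the TTL out of the answer line above and estimates what share of resolver caches would still hold the record some seconds after the authoritative servers stop answering. The function names are made up for illustration, and the model is an assumption rather than a measurement: it supposes cache refresh times are spread uniformly over one TTL window, and it ignores serve-stale behaviour and resolvers that clamp TTLs.

```python
def parse_answer_ttl(line: str) -> int:
    """Return the TTL in seconds from a dig answer-section line.

    Assumes the usual layout: name, TTL, class, type, rdata.
    """
    fields = line.split()
    return int(fields[1])


def fraction_still_cached(ttl: int, seconds_since_outage: float) -> float:
    """Estimate the share of resolver caches still holding the record.

    Assumption (not from the thread): each cache last refreshed at a
    time spread uniformly over the TTL window before the outage began.
    """
    if seconds_since_outage <= 0:
        return 1.0
    return max(0.0, 1.0 - seconds_since_outage / ttl)


# Answer line copied from the dig output in the message.
answer = "star-mini.c10r.facebook.com. 60 IN A 157.240.229.35"
ttl = parse_answer_ttl(answer)

for t in (0, 15, 30, 60, 300):
    share = fraction_still_cached(ttl, t)
    print(f"{t:>4}s after the authoritative servers vanish: "
          f"{share:.0%} of caches still answer")
```

Under those assumptions a 60 second TTL means every cache has expired the record within a minute, which fits how fast the names stopped resolving. A longer TTL would stretch that window at the cost of slower planned changes, which is the trade-off described above.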
> Miles Fidelman
>
> --
> In theory, there is no difference between theory and practice.
> In practice, there is. .... Yogi Berra
>
> Theory is when you know everything but nothing works.
> Practice is when everything works but no one knows why.
> In our lab, theory and practice are combined:
> nothing works and no one knows why. ... unknown
-- The computing scientist’s main challenge is not to get confused by the complexities of his own making. -- E. W. Dijkstra