nanog mailing list archives

Re: Facebook post-mortems...


From: Randy Monroe via NANOG <nanog () nanog org>
Date: Tue, 5 Oct 2021 14:11:21 -0400

Updated:
https://engineering.fb.com/2021/10/05/networking-traffic/outage-details/

On Tue, Oct 5, 2021 at 1:26 PM Michael Thomas <mike () mtcc com> wrote:


On 10/5/21 12:17 AM, Carsten Bormann wrote:
> On 5. Oct 2021, at 07:42, William Herrin <bill () herrin us> wrote:
>> On Mon, Oct 4, 2021 at 6:15 PM Michael Thomas <mike () mtcc com> wrote:
>>> They have a monkey patch subsystem. Lol.
>> Yes, actually, they do. They use Chef extensively to configure
>> operating systems. Chef is written in Ruby. Ruby has something called
>> Monkey Patches.
> While Ruby indeed has a chain-saw (read: powerful, dangerous, still the
> tool of choice in certain cases) in its toolkit that is generally called
> “monkey-patching”, I think Michael was actually thinking about the “chaos
> monkey”,
> https://en.wikipedia.org/wiki/Chaos_engineering#Chaos_Monkey
> https://netflix.github.io/chaosmonkey/

No, chaos monkey is a purposeful thing to induce corner case errors so
they can be fixed. The earlier outage involved a config sanitizer that
screwed up and then pushed it out. I can't get my head around why
anybody thought that was a good idea vs rejecting it and making somebody
fix the config.
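
To make the distinction concrete, here is a rough sketch of the two approaches:
rejecting a bad config versus silently "repairing" it before it is pushed. This
is only an illustration and not anyone's actual tooling. Every name in it
(ConfigError, REQUIRED_KEYS, validate_strict, sanitize_lenient) and the MTU
bounds are made up for the example.

```python
class ConfigError(ValueError):
    """Raised when a config fails validation (hypothetical error type)."""


# Hypothetical schema: the keys and MTU bounds are assumptions for illustration.
REQUIRED_KEYS = {"hostname", "mtu"}
MTU_MIN, MTU_MAX = 576, 9216


def validate_strict(config: dict) -> dict:
    """Reject a bad config outright so that somebody has to fix it."""
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise ConfigError(f"missing keys: {sorted(missing)}")
    if not MTU_MIN <= config["mtu"] <= MTU_MAX:
        raise ConfigError(f"mtu out of range: {config['mtu']}")
    return config


def sanitize_lenient(config: dict) -> dict:
    """Silently patch up a bad config and return something deployable.

    The caller never learns that the input was wrong, which is the risk:
    the "fixed" values may not be what the author intended.
    """
    fixed = dict(config)
    fixed.setdefault("hostname", "unknown")
    mtu = fixed.get("mtu", 1500)
    fixed["mtu"] = min(max(mtu, MTU_MIN), MTU_MAX)
    return fixed
```

With validate_strict, a config such as {"hostname": "r1", "mtu": 100} stops the
push with an error. With sanitize_lenient, the same input is quietly changed
and goes out anyway.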

Mike




-- 

Randy Monroe

Network Engineering

Uber <https://uber.com/>
