nanog mailing list archives

Re: CloudFlare issues?


From: Ross Tajvar <ross () tajvar io>
Date: Mon, 24 Jun 2019 20:50:39 -0400

Maybe I'm in the minority here, but I have higher standards for a T1 than
any of the other players involved. Clearly several entities failed to do
what they should have done, but Verizon is not a small or inexperienced
operation. Taking 8+ hours to respond to a critical operational problem is
what stood out to me as unacceptable.

And really - does it matter if the protection *was* there but something
broke it? I don't think it does. Ultimately, Verizon failed to implement
correct protections on their network. And then failed to respond when it
became a problem.

On Mon, Jun 24, 2019, 8:06 PM Tom Beecher <beecher () beecher cc> wrote:

Disclaimer: I am a Verizon employee via the Yahoo acquisition. I do not
work on 701.  My comments are my own opinions only.

Respectfully, I believe Cloudflare’s public comments today have been a
real disservice. This blog post, and your CEO on Twitter today, took every
opportunity to say “DAMN THOSE MORONS AT 701!”. They’re not.

You are 100% right that 701 should have had some sort of protection
mechanism in place to prevent this. But do we know they didn’t? Do we know
it was there and just set up wrong? Did another change at another time break
what was there? I used 701 many jobs ago and they absolutely had filtering
in place; it saved my bacon when I screwed up once and started
readvertising a full table from a 2nd provider. They smacked my session
down and I got a nice call about it.

You guys have repeatedly accused them of being dumb without even speaking
to anyone yet from the sounds of it. Shouldn’t we be working on facts?

Should they have been easier to reach once an issue was detected?
Probably. They’re certainly not the first vendor to have a slow response
time though. Seems like when an APAC carrier takes 18 hours to get back to
us, we write it off as the cost of doing business.

It also would have been nice, in my opinion, to take a harder stance on
the BGP optimizer that generated the bogus routes, and the steel company
that failed BGP 101 and just gladly reannounced one upstream to another.
701 is culpable for their mistakes, but there doesn’t seem to be
much appetite to shame the other contributors.

You’re right to use this as a lever to push for proper filtering, RPKI,
best practices. I’m 100% behind that. We can all be a hell of a lot better
at what we do. This stuff happens more than it should, but less than it
could.

But this industry is one big ass glass house. What’s that thing about
stones again?

On Mon, Jun 24, 2019 at 18:06 Justin Paine via NANOG <nanog () nanog org>
wrote:

FYI for the group -- we just published this:
https://blog.cloudflare.com/how-verizon-and-a-bgp-optimizer-knocked-large-parts-of-the-internet-offline-today/


_________________
*Justin Paine*
Director of Trust & Safety
PGP: BBAA 6BCE 3305 7FD6 6452 7115 57B6 0114 DE0B 314D
101 Townsend St., San Francisco, CA 94107



On Mon, Jun 24, 2019 at 2:25 PM Mark Tinka <mark.tinka () seacom mu> wrote:



On 24/Jun/19 18:09, Pavel Lunin wrote:


Hehe, I haven't seen this text before. Can't agree more.

Get your tie back on Job, nobody listened again.

More seriously, I see no difference between prefix hijacking and the
so-called BGP optimisation based on completely fake announcements on
behalf of other people.

If ever your upstream or any other party who your company pays money
to does this dirty thing, now it's just the right moment to go explain to
them that you consider this dangerous for your business and are
looking for better partners among those who know how to run internet
without breaking it.

We struggled with a number of networks using these over eBGP sessions
they had with networks that shared their routing data with BGPmon. It
sent off all sorts of alarms, and troubleshooting it was hard when a
network thinks you are de-aggregating massively, and yet you know you
aren't.

Each case took nearly 3 weeks to figure out.

BGP optimizers are the bane of my existence.

Mark.


