nanog mailing list archives

Re: CDN Overload?


From: Mike Hammett <nanog () ics-il net>
Date: Tue, 20 Sep 2016 14:18:14 -0500 (CDT)

This is what I'm asking of them: 


===== 
Have you seen a CDN overloading a customer? Help me gather information on the issue. 

What CDN? 
What have you identified the traffic to be? 
What is the access network? 
Where is the rate limiting done? 
How is the rate limiting done (policing vs. queueing, SFQ, PFIFO, etc.)? 
What is doing the rate limiting? 
What is the rate-limit set to? 
Upstream of the rate-limiter, what are you seeing for inbound traffic? 
One connection or many? 
How much traffic? 
How does other traffic behave when exceeding the rate limit? 
Where is NAT performed? 
What is doing NAT? 
Shared NAT or isolated to that customer? 
Have you done a packet capture before and after the rate limiter? Before and after the NAT device? 
Would you be willing to send a filtered packet capture (only the frames that relate to this CDN) to the CDN if they 
want it? 



There have been reports of CDNs sending more traffic than the customer can handle and ignoring the TCP convention to 
slow down. I'm trying to investigate this thoroughly so we can get the CDNs to fix their systems. Multiple CDNs have 
been shown to do this. 
===== 
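
For the filtered-capture question above, one way to cut a capture down to only the CDN's frames is to match addresses
against the CDN's prefixes. A minimal Python sketch, assuming the capture has already been parsed into records with
`src` and `dst` fields. The record layout, the function names, and the prefix (an RFC 5737 documentation range, not
real CDN address space) are all hypothetical:

```python
import ipaddress

# Hypothetical CDN prefix list. 203.0.113.0/24 is a documentation range,
# used here only as a stand-in for whatever prefixes the CDN really uses.
CDN_PREFIXES = [ipaddress.ip_network("203.0.113.0/24")]


def is_cdn_address(addr, prefixes=CDN_PREFIXES):
    """Return True if addr falls inside any of the given prefixes."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in prefixes)


def filter_cdn_frames(frames, prefixes=CDN_PREFIXES):
    """Keep only frames whose source or destination is a CDN address."""
    return [
        f for f in frames
        if is_cdn_address(f["src"], prefixes) or is_cdn_address(f["dst"], prefixes)
    ]


# Made-up sample records: two involve the stand-in CDN prefix, one does not.
frames = [
    {"src": "203.0.113.10", "dst": "192.0.2.50", "length": 1500},
    {"src": "198.51.100.7", "dst": "192.0.2.50", "length": 1500},
    {"src": "192.0.2.50", "dst": "203.0.113.10", "length": 64},
]
cdn_only = filter_cdn_frames(frames)
```

Matching on both source and destination keeps the customer's ACKs along with the CDN's data, which is what someone at
the CDN would likely need in order to see how their sender reacted.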




----- 
Mike Hammett 
Intelligent Computing Solutions 

Midwest Internet Exchange 

The Brothers WISP 

----- Original Message -----

From: "Mike Hammett" <nanog () ics-il net> 
To: "NANOG" <nanog () nanog org> 
Sent: Monday, September 19, 2016 12:34:48 PM 
Subject: CDN Overload? 

I participate on a few other mailing lists focused on eyeball networks. For a couple of years I've been hearing 
complaints that this CDN or that CDN was behaving badly. The complaints have been ramping up severely over the past few 
months. There have been some wild allegations, but I would like to develop a more standardized way of collecting 
evidence. Initially LimeLight was the only culprit, but recently it has been Microsoft as well. I'm not sure if there 
have been any others. 

The principal complaint is that upstream of whatever is doing the rate limiting for a given customer there is 
significantly more capacity being utilized than the customer has purchased. This could happen briefly as TCP adjusts to 
the capacity limitation, but in some situations this has persisted for days at a time. I'll list out a few situations 
as best as I can recall them. Some of these may even be merges of a couple situations. The point is to show the general 
issue and develop a better process for collecting what exactly is happening at the time and how to address it. 

One situation had approximately 45 megabit/s of capacity being used up by a customer that had a 1.5 megabit/s plan. All 
other traffic normally held itself within the 1.5 megabit/s, but this particular CDN sent far more than that for 
extended periods of time. 
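
To tell a brief TCP adjustment apart from a sustained overload, the upstream rate can be compared against the plan rate
over time. A rough Python sketch, assuming per-interval throughput samples (in megabit/s) taken upstream of the rate
limiter are available. The function name and the sample values are hypothetical; only the 45 and 1.5 megabit/s figures
come from the situation above:

```python
def overshoot_report(samples_mbps, plan_mbps):
    """Summarize how far, and for how long, upstream traffic exceeds the plan rate."""
    if not samples_mbps or plan_mbps <= 0:
        raise ValueError("need at least one sample and a positive plan rate")

    ratios = [s / plan_mbps for s in samples_mbps]

    # Longest run of consecutive samples strictly above the plan rate.
    longest = current = 0
    for s in samples_mbps:
        if s > plan_mbps:
            current += 1
            longest = max(longest, current)
        else:
            current = 0

    return {
        "peak_ratio": max(ratios),
        "mean_ratio": sum(ratios) / len(ratios),
        "longest_run_over": longest,
    }


# Made-up samples around the 45 megabit/s vs. 1.5 megabit/s case.
report = overshoot_report([1.4, 45.0, 44.0, 46.0, 1.5, 1.2], plan_mbps=1.5)
```

A short run over the limit is what TCP adjusting to the capacity limitation would look like. A run that spans hours or
days of samples is the behavior being reported here.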

A common occurrence has someone with a single-digit megabit/s limit consuming 2x - 3x their plan rate on the other side 
of the rate limiter. 

Last month on my own network I saw someone with 2x - 3x their plan rate being consumed upstream, and they had *190* 
connections downloading that data from Microsoft. 

Over the past week or two I've been hearing of people with only a single connection downloading at more than their plan 
rate. 


These situations effectively shut out all other Internet traffic to that customer, or even to a portion of the network 
in low-capacity NLOS areas. It's a DoS caused by downloads. What happened to the days of MS BITS, when you didn't even 
notice the download happening? A lot of these guys think that the CDNs are just a pile of dicks looking to ruin 
everyone's day and I'm certain that there are at least a couple people at each CDN that aren't that way. ;-) 




Lots of rambling, sure. What do I need to have these guys collect as evidence of a problem and who should they send it 
to? 






