nanog mailing list archives

Re: Do ISP's collect and analyze traffic of users?


From: Dave Taht <dave.taht () gmail com>
Date: Sat, 10 Jun 2023 10:03:09 -0600

On Sat, Jun 10, 2023 at 9:46 AM John van Oppen <john () vanoppen com> wrote:

As a decent-sized North American ISP I think I have to totally agree with this post. There simply is not any
economically justifiable reason to collect customer data; doing so is expensive, and unless you are trying to traffic
shape like a cell carrier

They shape? News to me...

has zero economic benefit. In our case we do 1:4000 netflow sampling and that is literally it; we use that data for
peering analytics and failure modeling.

This is true for both large ISPs I've been involved with, and in both cases I would have overseen the policy.

What I see in this thread is a bunch of folks guessing who clearly have not been involved in large eyeball ISP
operations.

The smaller (mostly rural) WISPs I work with do not have the time or
desire to monetize traffic either! Pretty much all of them have their
hands full just solving tech support problems.

They do collect extensive metrics on bandwidth, packet loss, latency,
SNMP stats of all sorts, airtime, interference, CPU stats, and routing
info (common tools are things like UISP, Splynx, OpenNMS), and keep
amazingly good (LIDAR, even) maps of the local terrain. If the bigger
ISPs are only doing netflow once in a while, no wonder the little
WISPs survive.

The ones shaping via libreqos.io now are totally in love[1] with our
in-band RTT metrics, as that is giving them insight into their backhaul
behaviors in rain and snow and sleet, instead of out-of-band SNMP, as
well as insight into when it is the customer wifi that is the
real problem. It is the combination of all these metrics that helps
narrow down problems.

But the only monetization that happens is the monthly bill. Most of
these cats are actually rather ornery and *very* insistent about
protecting their customers' privacy, from all comers, and resistant to
cloud-based applications in general.

There are some bad apples in the WISP world that do want to rate limit
(via DPI) Netflix above all else when running low on backhaul,
but they are not in my customer base.

[1] we (and they) *are* passionately interested in identifying the
characteristics of multiple traffic types and mitigating attacks, and
a couple are publishing some anonymized movies of what traffic looks
like: https://www.youtube.com/@trendaltoews7143/videos




-----Original Message-----
From: NANOG <nanog-bounces+john=vanoppen.com () nanog org> On Behalf Of Saku Ytti
Sent: Tuesday, May 16, 2023 7:56 AM
To: Tom Beecher <beecher () beecher cc>
Cc: nanog () nanog org
Subject: Re: Do ISP's collect and analyze traffic of users?

I can't tell what large is. But I've worked for an enterprise ISP and consumer ISPs, and none of the shops I worked for
had the capability to monetise the information they had. And the information they had was increasingly low resolution.
Infrastructure providers are notoriously bad at even monetising their infra.

I'm sure some do monetise. But generally service providers are not interesting to, or do not have, active shareholders, so there is very little
pressure to make more money; hence firesales happen all the time, as infrastructure is increasingly seen as a liability,
not an asset. They are generally boring companies, and internally no one has an incentive to monetise data, as it
wouldn't improve their personal compensation. And regulations like GDPR create problems people would rather not solve,
unless pressured.

Technically, most people started 20 years ago with some netflow sampling ratio, and they still use the same sampling
ratio despite many orders of magnitude more packets. Meaning that previously the share of flows captured was magnitudes
higher than today; today only very few flows are seen in typical applications, and netflow is largely for
volumetric DDoS and high-level ingressAS=>egressAS metrics.

Hardware offered increasingly does IPFIX as if it were sflow, that is,
with no cache, exported immediately after being sampled, because you'd need something like
1:100 or higher resolution to have any significant luck in hitting the same flow twice. PTX has stopped supporting
flow-cache entirely because of this: at the sampling rate where the cache would do something, the cache would overflow.
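The flow-cache point can be sanity-checked with a quick back-of-the-envelope calculation (mine, not from the post; the 100-packet flow size is an illustrative assumption): under independent 1-in-N packet sampling, the chance of sampling the same flow twice collapses as N grows.

```python
def prob_at_least_two_samples(flow_packets: int, sampling_ratio: int) -> float:
    """Probability that a flow of flow_packets packets is sampled at
    least twice under independent 1-in-sampling_ratio packet sampling."""
    p = 1.0 / sampling_ratio
    p0 = (1 - p) ** flow_packets                           # zero packets sampled
    p1 = flow_packets * p * (1 - p) ** (flow_packets - 1)  # exactly one sampled
    return 1.0 - p0 - p1

# A typical short flow (~100 packets) under a few sampling ratios:
for ratio in (100, 1000, 4000):
    print(f"1:{ratio} sampling -> P(>=2 samples) = "
          f"{prob_at_least_two_samples(100, ratio):.4f}")
```

At 1:100 a 100-packet flow gets a second sample roughly a quarter of the time; at 1:1000 and beyond it almost never does, which is why a per-flow cache stops earning its keep at the sampling ratios in real deployments.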

Of course there are other monetisation opportunities via mechanisms other than data-in-the-wire, like DNS


On Tue, 16 May 2023 at 15:57, Tom Beecher <beecher () beecher cc> wrote:

Two simple rules for most large ISPs.

1. If they can see it, as long as they are not legally prohibited, they'll collect it.
2. If they can legally profit from that information, in any way, they will.

Now, their privacy policies will always include lots of nice-sounding clauses, such as 'We don't see your personally
identifiable information'. This of course allows them to sell 'anonymized' sets of that data, which sounds great,
except, as researchers have proven, it's pretty trivial to scoop up multiple discrete anonymized data sets and
cross-reference them to identify individuals. Netflow data may not be as directly 'valuable' as other types of data, but
it can be used in the blender too.

Information is the currency of the realm.



On Mon, May 15, 2023 at 7:00 PM Michael Thomas <mike () mtcc com> wrote:


And maybe try to monetize it? I'm pretty sure that they can be
compelled to do that, but do they do it for their own reasons too? Or
is this way too much overhead to be doing en masse? (I vaguely recall
that netflow, for example, can make routers unhappy if there is too much "flow".)

Obviously this is likely to depend on local laws but since this is
NANOG we can limit it to here.

Mike



--
  ++ytti



-- 
Podcast: https://www.linkedin.com/feed/update/urn:li:activity:7058793910227111937/
Dave Täht CSO, LibreQos

