nanog mailing list archives

Re: Linux BNG


From: Denys Fedoryshchenko <denys () visp net lb>
Date: Sun, 15 Jul 2018 17:52:21 +0300

On 2018-07-15 06:09, Jérôme Nicolle wrote:
Hi Baldur,

Le 14/07/2018 à 14:13, Baldur Norddahl a écrit :
I am investigating Linux as a BNG

As we say in France, it's like you're trying to buttfuck flies (a local
saying standing for "reinventing the wheel for no practical reason").
You can say that about the whole open-source ecosystem: why bother if
*proprietary solution name* exists? It is an endless flamewar topic.


Linux' kernel networking stack is not made for this kind of job. 6WIND
or fd.io may be right on the spot, but it's still a lot of dark magic
for something that has been done over and over for the past 20 years by
most vendors.

And it just works.
Linux developers are working continuously to improve this. For example,
the latest feature, XDP, can process several Mpps on a <$1000 server.
Ask yourself why Cloudflare "buttfucks flies" instead of buying from some
proprietary vendor that has been doing filtering in hardware for 20 years:
https://blog.cloudflare.com/how-to-drop-10-million-packets/
I am experimenting with XDP as well, to terminate PPPoE, and it handles
that quite well.
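
To make "terminating PPPoE" concrete, here is a minimal sketch in plain Python (not XDP code, and not the implementation mentioned above) of the per-packet decapsulation step a terminator has to perform on session traffic. The header layout and constants come from RFC 2516; the function name pppoe_decap and the sample frame are made up for illustration.

```python
import struct

ETH_P_PPP_SES = 0x8864   # EtherType of the PPPoE session stage (RFC 2516)
PPP_IPV4 = 0x0021        # PPP protocol number for IPv4


def pppoe_decap(frame: bytes):
    """Return (session_id, ipv4_packet), or None if the frame is not
    a well-formed PPPoE session frame carrying IPv4."""
    # 14 bytes Ethernet + 6 bytes PPPoE header + 2 bytes PPP protocol field
    if len(frame) < 22:
        return None
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETH_P_PPP_SES:
        return None
    ver_type, code, session_id, length = struct.unpack_from("!BBHH", frame, 14)
    # Version 1 / type 1, and code 0x00 for session data
    if ver_type != 0x11 or code != 0x00:
        return None
    # LENGTH counts the PPP protocol field plus the payload
    if length < 2 or 20 + length > len(frame):
        return None
    (proto,) = struct.unpack_from("!H", frame, 20)
    if proto != PPP_IPV4:
        return None
    # Slice by LENGTH so that Ethernet padding is dropped
    return session_id, frame[22:20 + length]


# Hypothetical sample: zeroed MAC addresses, session 0x1234, dummy 20-byte IPv4 header
ip_packet = bytes([0x45]) + bytes(19)
sample = (bytes(12) + struct.pack("!H", ETH_P_PPP_SES)
          + struct.pack("!BBHH", 0x11, 0x00, 0x1234, len(ip_packet) + 2)
          + struct.pack("!H", PPP_IPV4) + ip_packet)
print(pppoe_decap(sample))
```

With IPoE the same IPv4 packet sits directly behind the Ethernet (or VLAN) header, so none of these checks are needed on the fast path.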


DHCP (implying straight L2 from the CPE to the BNG) may be an option
but most codebases are still young. PPP, on the other hand, is
field-tested for extremely large scale deployments with most vendors.
DHCP has been around at least since RFC 2131 was published in March 1997.
Quite old, isn't it?
When you stick to PPPoE, you tie yourself to extra layers of
encapsulation/decapsulation, and this seriously degrades performance, at least at the _user_ level. Having some experience developing firmware for routers,
I can tell you that hardware offloading of IPv4 routing (DHCP) is obviously
much easier and cheaper than offloading PPPoE encap/decap plus IPv4 routing.
Vendors also keep screwing up PPP on their routers; for example,
one of them failed to process PADO properly in its newest firmware revision.
Another problem: with PPPoE you sign up for the headache called reduced MTU, which
will also cost ISP support a lot of unpleasant hours.
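
The reduced MTU is simple arithmetic, sketched below in Python. The 6-byte PPPoE header and 2-byte PPP protocol field are from RFC 2516; the function names are made up for illustration, and the sketch assumes a standard 1500-byte Ethernet payload with IPv4 and TCP headers without options.

```python
ETH_PAYLOAD = 1500        # standard Ethernet payload size, in bytes
PPPOE_HEADER = 6          # ver/type, code, session ID, length
PPP_PROTOCOL_FIELD = 2    # PPP protocol number, e.g. 0x0021 for IPv4
IPV4_HEADER = 20          # IPv4 header without options
TCP_HEADER = 20           # TCP header without options


def pppoe_mtu(eth_payload: int = ETH_PAYLOAD) -> int:
    """IP MTU left inside a PPPoE session over the given Ethernet payload."""
    return eth_payload - PPPOE_HEADER - PPP_PROTOCOL_FIELD


def tcp_mss(mtu: int) -> int:
    """Largest TCP segment that fits in one IPv4 packet of the given MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


print(pppoe_mtu())            # 1492
print(tcp_mss(pppoe_mtu()))   # 1452
print(tcp_mss(ETH_PAYLOAD))   # 1460, the IPoE case
```

Those 8 missing bytes are why PPPoE setups usually depend on working path MTU discovery or on MSS clamping at the BNG or CPE.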


If I were in your shoes, and I don't say I'd want to (my BNGs are scaled
to fewer than a few thousand subscribers, with 1-4 concurrent sessions
each), I'd stick to plain old bitstream (PPP) model, with a decent
subscriber framework on my BNGs (I mostly use Juniper MXs, but I also
like Nokia's and Cisco's for some features).
I consult for operators ranging from a few hundred to hundreds of thousands of subscribers.
It is very rare that a Linux BNG doesn't suit them.


But let's say we would want to go forward and ditch legacy / proprietary
code to surf on the NFV bullshit-wave. What would you actually need ?

Linux does soft-recirculation at every encapsulation level by memory
copy. You can't scale anything with that. You need to streamline
decapsulation with 6wind's turborouter or fd.io frameworks. It'll cost
you a few thousand man-hours to implement your first prototype.
6WIND and fd.io are great solutions, but not suitable for the task in question.
They are mostly built for very tailor-made tasks, or even as the core of some vendor's solution. Implementing your BNG on such frameworks, or on DPDK, really is reinventing the wheel, unless you are going to sell it or can save millions
of US$ by doing so.


Let's say you got a working framework to treat subsequent headers on the
fly (because decapsulation is not really needed, what you want is just
to forward the payload, right ?)… Well, you'd need to address
provisioning protocols on the same layers. Who would want to rebase a
DHCP server with alien packet forms incoming ? I guess no one.
accel-ppp does all that, precisely for IPoE termination, and there is no black magic
involved.


Well, I could dissert on the topic for hours, because I've already spent
months to address such design issues in scalable ISP networks, and the
conclusion is :

- PPPoE is simple and proven. Its rigid structure alleviates most of the
dual-stack issues. It is well supported and largely deployed.
PPPoE has VERY serious flaws.
1) PPPoE security sucks big time. Anybody who runs a rogue PPPoE server in your network will create a significant headache for you, while with DHCP you at least have "DHCP snooping". DHCP snooping is supported on switches from very many vendors, while for PPPoE most of them
have nothing, except... putting each user in his own VLAN.
Why bother with PPPoX then?
2) DHCP can carry circuit information in Option 82, which is very useful for
billing and very cost-efficient on the last stage of access switches.
3) Modern FTTx (GPON) solutions are built with QinQ in mind, so IPoE fits there flawlessly.
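
As an illustration of point 2, here is a minimal Python sketch of how a billing or AAA backend could read the circuit information out of Option 82. The sub-option layout (code, length, value) and the codes 1 = Agent Circuit ID and 2 = Agent Remote ID are from RFC 3046; the function name and the sample values are made up, since real switches encode these fields in vendor-specific ways.

```python
CIRCUIT_ID = 1   # Agent Circuit ID sub-option (RFC 3046)
REMOTE_ID = 2    # Agent Remote ID sub-option (RFC 3046)


def parse_option82(data: bytes) -> dict:
    """Parse the value of DHCP option 82 into {sub-option code: raw bytes}."""
    subopts = {}
    i = 0
    while i + 2 <= len(data):
        code, length = data[i], data[i + 1]
        value = data[i + 2:i + 2 + length]
        if len(value) != length:
            raise ValueError("truncated sub-option %d" % code)
        subopts[code] = value
        i += 2 + length
    if i != len(data):
        raise ValueError("trailing byte after last sub-option")
    return subopts


# Hypothetical values: the access port as circuit ID, the switch name as remote ID
circuit = b"eth 0/1/3:100"
remote = b"sw-access-17"
raw = (bytes([CIRCUIT_ID, len(circuit)]) + circuit
       + bytes([REMOTE_ID, len(remote)]) + remote)

info = parse_option82(raw)
print(info[CIRCUIT_ID].decode(), info[REMOTE_ID].decode())
```

The access switch inserts this option itself, so the subscriber is identified by port rather than by credentials stored on the CPE.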

- DHCP requires hacks (in the form of undocumented options from several vendors) to seemingly work on IPv4, but the multicast boundaries for NDP
are a PITA to handle, so no one implemented that properly yet. So it is
to be avoided for now.
While you can do multicast on the "native" layer (DHCP), mostly for IPTV (yes, it is not easy and needs some vendor magic), with PPP you can forget about multicast
entirely.

- Subscriber frameworks, be it Juniper's, Cisco's or Nokia's, are at the core of the largest residential ISPs out there. It Just Works. Trust them.
Sticking to "It Just Works" means "zero innovation" as well.
For example, while everybody was saying so, the Mikrotik guys appeared, and quite possibly their solutions now serve more users in total than Cisco's or Nokia's,
at a much lower cost.
But sure, they have their own market niche; Mikrotik doesn't fit large deployments well.


That being said, I love the idea of NFV-ing all the things, let it be
BNGs first because those bricks in the wall are the most fragile we have
to maintain.

But I clearly won't stand for an alternative to traditional offerings
just yet : it's too critical, and it's a PITA to build from scratch and
scale.

Best regards,
A lot of people who just sit in their warm monthly-salary chair and care only about their
personal stability (but not the employer's profit)
will ask the employer to pay for an expensive *vendor name* solution, as it's the safest bet for them. So if the person implementing the solution is just a "corporate screw", he will say *vendor*. Nassim Nicholas Taleb's book "Skin in the Game" explains perfectly why they will make the worst choice.
If the person is an entrepreneur, he will start a feasibility study.

