tcpdump mailing list archives

Re: BPF Extended: addressing BPF's shortcomings


From: Darren Reed <darrenr () netbsd org>
Date: Thu, 11 Jun 2015 21:05:20 +1000

On 11/06/2015 9:31 AM, Mindaugas Rasiukevicius wrote:
Darren Reed <darrenr () netbsd org> wrote:
Extending BPF
=============

Introduction
------------
BPF was originally designed to provide very fast packet matching
capabilities for IPv4 but as a result of its generic nature, is
capable of being used for just about any protocol. With IPv6 the
limitations of BPF became apparent.

...
Conceptually, I like the idea of an extended BPF instruction set.  There
are several important questions here.  First, what is the exact problem we
want to solve with a new instruction set?  Is it just the IPv6 handling?

I do find the BPF byte-code useful as a general purpose instruction set
i.e. for the use as a universal virtual machine.  This was also one of the
driving forces behind the Linux eBPF.  They use it beyond packet filtering.
Note that LLVM has recently gained support for the eBPF backend.  If that
is the objective, then there is a wider spectrum of the requirements.

Specifically, I would like to see:

- Capability to jump backwards.  Basically, the general purpose instruction
set ought be Turing-complete.  Obviously, with a way to enable/disable this
depending whether the user needs bpf_validate().

Do you have any thoughts on what sort of conditions this should be allowed in?

Guy's suggestion is intriguing and suggests to me something like this:

    If condition is true, jump (either forward or backward?) AND
    move the start of packet pointerforward by # > 0 bytes.

Rather than try and detect the move of the start of packet pointer,
force it to occur as part of the loop instruction?


- 32-bit jump offsets.  Currently, they are 16-bit which is quite limiting
if you have a larger BPF program.

Yes, agreed.


- Opcode extended to 32-bits.  It seems we agree on this, although this
can be debatable.  The classic BPF byte-code has a simple, minimalistic
RISC-like instruction set (with the exception of BPF_MSH hack).  I would
be inclined to keep it that way instead of polluting the, quite limited,
instruction space with various arbitrary mechanisms, but this is somewhat
philosophical RISC vs CISC debate.  Nevertheless, if the general feeling
is to go with complex instructions, then we could at least dedicate a wide
range for them.

What are RISC and CISC these days but a layer over microcode?

Consider that when BPF was developed, 32MB of RAM was a lot of memoryand
the use of long options with getopt was almost unheard of.Today thesize
of theinstruction might almost be considered to be a hindrance to performance
asaccessing the individual bytes is no longer performance friendly andwill
often result in the CPU fetching a complete word anyway. Thus the compact
size of the BPF instruction is no longer the benefit it once was.



- Support for 64-bit words, but not quite convinced about 128-bit words.
Do you want to add them just to accommodate IPv6?  Why not to leave this
for the byte-code generator/compiler?

Considering Michael's comments about concentrating on what opcodes the compiler should generate, it may be a better idea to focus on 128bit words rather than 64bit words because it is rather elegant to have the compiler produce instructions that allow for the address to be used in a single instruction rather than needing to do
several.

As a for example, it is easier to mentally inspect and verify the following:

ldq  [32]
jeqq #0x01    jt 2    jf 3

than to try and follow half a dozen instructions. Now when it comes to making
changes to those values, e.g. to apply a netmask:

ldq  [32]
andq #0xfffffffffffffff00000000000000000
jeqq #0x2002000100000a900000000000000000....

How many instructions does BPF need to emit today to do that?
Which is easier to understand?

Not only that but if you are writing manual instructions to do math
(such as addition or subtraction), it becomes harder again to ensure
what's happening is correct.Do you really want to do multiple 32bit
instructions with BPF to represent adding together two 128bit values?

I would rather have instructions with larger operands that are easier
for the parser to generate and let the interpreter (or JIT) worry about
how to execute them.



- External scratch memory store or a way to initialise it before calling
the BPF program.  Also, potentially arbitrary (dynamic) BPF_MEMWORDS size
rather than hardcoded size.  Basically, the user/caller should be able to
provide arbitrary data through the memory store.

What's the end goal here?

Thinking on the fly here, what if the bpf program had a data section on the
end of it or at the beginning? Although with JIT this may be starting to wander down the road of inventing a object file format and that might be a bit too far.



- BPF_COP and BPF_CALL.  When I added BPF_COP to NetBSD, I thought about
the generic BPF_CALL to invoke *arbitrary* functions.  It requires solving
some of the above problems first, but there is an important difference:
BPF_COP allows program to invoke a *predetermined* set of functions, while
the capability to invoke arbitrary functions through BPF_CALL can have
security implications (yet it is more powerful).  I would like to see both,
but with the ability to disable, at least, BPF_CALL.

My take on BPF_COP and BPF_CALL is that they are almost identical, except that it ispossible to implement BPF_COP with BPF_CALL and not the other way around.
Is thata fair assessment? My examination of eBPF is that BPF_CALL doesn't
allow just any function or address to be called.


On 11/06/2015 3:32 AM, Michael Richardson wrote:
Darren thank for taking a lead on this.
The IPv6 header chasing problem is I think, the more important part of this.

While I'm not against a re-architecture of BPF, I wonder if it will be
widely accepted.

Carrots will be needed :)

 From what I can see, it appears that BPF can be machine translated to BFPX,
so it would permit one to deploy BPFX in such a way that would permit
backwards compatibility, that's a good thing.

Yes, that was a design goal.


One thing I learnt (too late) at a fabless semi company that was made a
ram-intensive loop-less packet processing engine (simpler than BPF) was
not to architect the instruction set and then the compiler.  We sound up
creating instructions we never knew how to use, and we also wound up
optimizing encodings of instructions that were seldom used.

Strange as it may seem, this almost tells me that doing 128bit instructions for
IPv6 work is more important than 64bit ones for the same or other things.


So my advice is not to worry about the encoding, but rather to work on the
BPF compiler (in libpcap), and see how you would use each instruction.

Ok, thanks for the tip.

Cheers,
Darren

_______________________________________________
tcpdump-workers mailing list
tcpdump-workers () lists tcpdump org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers


Current thread: