tcpdump mailing list archives
Re: BPF Extended: addressing BPF's shortcomings
From: Darren Reed <darrenr () netbsd org>
Date: Thu, 11 Jun 2015 21:05:20 +1000
On 11/06/2015 9:31 AM, Mindaugas Rasiukevicius wrote:
Darren Reed <darrenr () netbsd org> wrote:Extending BPF ============= Introduction ------------ BPF was originally designed to provide very fast packet matching capabilities for IPv4 but as a result of its generic nature, is capable of being used for just about any protocol. With IPv6 the limitations of BPF became apparent. ...Conceptually, I like the idea of an extended BPF instruction set. There are several important questions here. First, what is the exact problem we want to solve with a new instruction set? Is it just the IPv6 handling? I do find the BPF byte-code useful as a general purpose instruction set i.e. for the use as a universal virtual machine. This was also one of the driving forces behind the Linux eBPF. They use it beyond packet filtering. Note that LLVM has recently gained support for the eBPF backend. If that is the objective, then there is a wider spectrum of the requirements. Specifically, I would like to see: - Capability to jump backwards. Basically, the general purpose instruction set ought be Turing-complete. Obviously, with a way to enable/disable this depending whether the user needs bpf_validate().
Do you have any thoughts on what sort of conditions this should be allowed in?
Guy's suggestion is intriguing and suggests to me something like this: If condition is true, jump (either forward or backward?) AND move the start of packet pointerforward by # > 0 bytes. Rather than try and detect the move of the start of packet pointer, force it to occur as part of the loop instruction?
- 32-bit jump offsets. Currently, they are 16-bit which is quite limiting if you have a larger BPF program.
Yes, agreed.
- Opcode extended to 32-bits. It seems we agree on this, although this can be debatable. The classic BPF byte-code has a simple, minimalistic RISC-like instruction set (with the exception of BPF_MSH hack). I would be inclined to keep it that way instead of polluting the, quite limited, instruction space with various arbitrary mechanisms, but this is somewhat philosophical RISC vs CISC debate. Nevertheless, if the general feeling is to go with complex instructions, then we could at least dedicate a wide range for them.
What are RISC and CISC these days but a layer over microcode? Consider that when BPF was developed, 32MB of RAM was a lot of memoryand the use of long options with getopt was almost unheard of.Today thesizeof theinstruction might almost be considered to be a hindrance to performance
asaccessing the individual bytes is no longer performance friendly andwill often result in the CPU fetching a complete word anyway. Thus the compact size of the BPF instruction is no longer the benefit it once was.
- Support for 64-bit words, but not quite convinced about 128-bit words. Do you want to add them just to accommodate IPv6? Why not to leave this for the byte-code generator/compiler?
Considering Michael's comments about concentrating on what opcodes the compiler should generate, it may be a better idea to focus on 128bit words rather than 64bit words because it is rather elegant to have the compiler produce instructions that allow for the address to be used in a single instruction rather than needing to do
several. As a for example, it is easier to mentally inspect and verify the following: ldq [32] jeqq #0x01 jt 2 jf 3than to try and follow half a dozen instructions. Now when it comes to making
changes to those values, e.g. to apply a netmask: ldq [32] andq #0xfffffffffffffff00000000000000000 jeqq #0x2002000100000a900000000000000000.... How many instructions does BPF need to emit today to do that? Which is easier to understand? Not only that but if you are writing manual instructions to do math (such as addition or subtraction), it becomes harder again to ensure what's happening is correct.Do you really want to do multiple 32bit instructions with BPF to represent adding together two 128bit values? I would rather have instructions with larger operands that are easier for the parser to generate and let the interpreter (or JIT) worry about how to execute them.
- External scratch memory store or a way to initialise it before calling the BPF program. Also, potentially arbitrary (dynamic) BPF_MEMWORDS size rather than hardcoded size. Basically, the user/caller should be able to provide arbitrary data through the memory store.
What's the end goal here? Thinking on the fly here, what if the bpf program had a data section on theend of it or at the beginning? Although with JIT this may be starting to wander down the road of inventing a object file format and that might be a bit too far.
- BPF_COP and BPF_CALL. When I added BPF_COP to NetBSD, I thought about the generic BPF_CALL to invoke *arbitrary* functions. It requires solving some of the above problems first, but there is an important difference: BPF_COP allows program to invoke a *predetermined* set of functions, while the capability to invoke arbitrary functions through BPF_CALL can have security implications (yet it is more powerful). I would like to see both, but with the ability to disable, at least, BPF_CALL.
My take on BPF_COP and BPF_CALL is that they are almost identical, except that it ispossible to implement BPF_COP with BPF_CALL and not the other way around.
Is thata fair assessment? My examination of eBPF is that BPF_CALL doesn't allow just any function or address to be called. On 11/06/2015 3:32 AM, Michael Richardson wrote:
Darren thank for taking a lead on this. The IPv6 header chasing problem is I think, the more important part of this. While I'm not against a re-architecture of BPF, I wonder if it will be widely accepted.
Carrots will be needed :)
From what I can see, it appears that BPF can be machine translated to BFPX, so it would permit one to deploy BPFX in such a way that would permit backwards compatibility, that's a good thing.
Yes, that was a design goal.
One thing I learnt (too late) at a fabless semi company that was made a ram-intensive loop-less packet processing engine (simpler than BPF) was not to architect the instruction set and then the compiler. We sound up creating instructions we never knew how to use, and we also wound up optimizing encodings of instructions that were seldom used.
Strange as it may seem, this almost tells me that doing 128bit instructions for
IPv6 work is more important than 64bit ones for the same or other things.
So my advice is not to worry about the encoding, but rather to work on the BPF compiler (in libpcap), and see how you would use each instruction.
Ok, thanks for the tip. Cheers, Darren _______________________________________________ tcpdump-workers mailing list tcpdump-workers () lists tcpdump org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Current thread:
- BPF Extended: addressing BPF's shortcomings Darren Reed (Jun 10)
- Re: BPF Extended: addressing BPF's shortcomings Paul "LeoNerd" Evans (Jun 10)
- Re: BPF Extended: addressing BPF's shortcomings Darren Reed (Jun 11)
- Re: BPF Extended: addressing BPF's shortcomings Paul "LeoNerd" Evans (Jun 11)
- Re: BPF Extended: addressing BPF's shortcomings Michael Richardson (Jun 11)
- Re: BPF Extended: addressing BPF's shortcomings Darren Reed (Jun 11)
- Re: BPF Extended: addressing BPF's shortcomings Paul "LeoNerd" Evans (Jun 10)
- Re: BPF Extended: addressing BPF's shortcomings Mindaugas Rasiukevicius (Jun 10)
- Re: BPF Extended: addressing BPF's shortcomings Guy Harris (Jun 10)
- Re: BPF Extended: addressing BPF's shortcomings Paul "LeoNerd" Evans (Jun 11)
- Re: BPF Extended: addressing BPF's shortcomings Darren Reed (Jun 11)
- Re: BPF Extended: addressing BPF's shortcomings Paul "LeoNerd" Evans (Jun 11)
- Re: BPF Extended: addressing BPF's shortcomings Guy Harris (Jun 10)