tcpdump mailing list archives
Re: endianness of portable BPF bytecode (DRAFT revision 3)
From: Denis Ovsienko via tcpdump-workers <tcpdump-workers () lists tcpdump org>
Date: Sat, 25 Jun 2022 13:01:52 +0100
--- Begin Message --- From: Denis Ovsienko <denis () ovsienko info>
Hello list.

Below you can find the next draft revision. It incorporates some
feedback received from Guy and Michael; it also reorders the TLVs and
adds the SnapLen TLV, Netmask TLV and EOF TLV. The text has also been
converted to a man page, so it could live next to pcap-savefile(5) when
it is ready enough.

One thing that does not look quite right to me in this revision is
that, for example, the LinkTypeValue TLV could be a part of the fixed
header rather than an optional TLV, because the DLT is a meaningful bit
of information and could (and possibly should) be checked against the
DLT where the bytecode is applied. (This is exactly what
https://github.com/the-tcpdump-group/libpcap/issues/211 does.)

tc-bpf(8) on Linux says: "Since libpcap does not support all Linux'
specific cBPF extensions in its compiler..." If that's true, then the
header would need another field to indicate the cBPF dialect, so that
if anybody/anything were to validate the bytecode, the result would
always be conclusive. Could anybody give an example of such cBPF
differences, if those indeed exist?

If you see an issue with terminology or style, let me know.

----------------------------------------------------------------------
CBPF-SAVEFILE(5)              File Formats Manual             CBPF-SAVEFILE(5)

NAME
       cbpf-savefile - cBPF savefile format (DRAFT revision 3)

DESCRIPTION
       This man page discusses a file format for cBPF, which is the
       "classic" (and for a long time the only) Berkeley Packet Filter.
       It does NOT apply to the newer eBPF variety of BPF.

       The main purpose of this file format is to store cBPF bytecode,
       most commonly compiled from a BPF filter expression using libpcap.
       Besides that, the format allows encoding some information about
       the context in which the compilation was done. This meta-data can
       make it easier to reproduce the compilation later if required.

       In the following specification integer fields are big-endian and
       unsigned. String fields do not use the NUL character for
       termination or padding.

FILE FORMAT
       A cBPF savefile consists of a fixed-size header and a variable-size
       body as follows:

        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |     0xA1      |     0xB2      |     0xC3      |     0xCB      |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |      'c'      |      'B'      |      'P'      |      'F'      |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |   MajorVer    |   MinorVer    |      InstructionCount=n       |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
       |                         instruction 1                         |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
       |                         instruction 2                         |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
       ~                                                               ~
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
       |                         instruction n                         |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
       |                  optional trailing TLV space                  |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

       The first four bytes contain a fixed signature, also known as a
       magic number, to make it easy to identify the file type
       automatically. The next four bytes contain the ASCII string "cBPF"
       to provide a hint for manual identification.

       MajorVer and MinorVer contain the major and the minor version
       numbers of this format respectively. The current major version is 1
       and the current minor version is 0. Format changes that do not
       impact compatibility (e.g., new TLV types) increment the minor
       version only. Other format changes increment the major version and
       reset the minor version to 0.

       InstructionCount is the last field of the fixed header, it contains
       the number of bytecode instructions following the header. By
       convention, valid BPF bytecode must consist of at least one
       instruction, so in a valid cBPF savefile this field value is at
       least 1.

       The file format thus far minimizes the overhead for software that
       only writes or reads cBPF bytecode.

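As a rough illustration of the fixed header described above, the following Python sketch packs and validates it. The field widths (one byte each for MajorVer and MinorVer, 16 bits for InstructionCount) are an assumption read off the header diagram, and the names `pack_header` and `parse_header` are hypothetical helpers, not part of libpcap or any other library.

```python
import struct

# ASSUMPTION: field widths are read off the draft's header diagram:
# 4 signature bytes, 4 ASCII bytes "cBPF", MajorVer and MinorVer as one
# byte each, InstructionCount as a 16-bit big-endian unsigned integer.
MAGIC = bytes([0xA1, 0xB2, 0xC3, 0xCB])
TAG = b"cBPF"
HEADER = struct.Struct(">4s4sBBH")  # 12 bytes in total


def pack_header(instruction_count, major=1, minor=0):
    """Build the fixed header (hypothetical helper)."""
    if not 1 <= instruction_count <= 0xFFFF:
        raise ValueError("InstructionCount must be between 1 and 65535")
    return HEADER.pack(MAGIC, TAG, major, minor, instruction_count)


def parse_header(data):
    """Validate the fixed header and return (major, minor, count)."""
    if len(data) < HEADER.size:
        raise ValueError("truncated header")
    magic, tag, major, minor, count = HEADER.unpack_from(data, 0)
    if magic != MAGIC or tag != TAG:
        raise ValueError("not a cBPF savefile")
    # Design choice of this sketch: refuse major versions it does not
    # know, since those may change the layout incompatibly.
    if major != 1:
        raise ValueError(f"unsupported major version {major}")
    if count < 1:
        raise ValueError("InstructionCount must be at least 1")
    return major, minor, count
```

A reader that only cares about the bytecode could stop after reading `count` instructions and ignore everything that follows.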
       If there is any data after the last instruction, it is the trailing
       TLV space, which mostly contains meta-data for human
       interpretation. It contains TLVs in the format specified below.

INSTRUCTION FORMAT
        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |            opcode             |      jt       |      jf       |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                               k                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

       This is the traditional encoding of a cBPF instruction. Note that
       usually its endianness depends on the machine, but in this format
       it is fixed.

TLV FORMAT
        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |             Type              |           Length=m            |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
       ~                        Value (m bytes)                        ~
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

       All TLVs are optional. Every TLV may appear in the same cBPF
       savefile at most once. The Length value does not include Type and
       Length. Code points for Type and the associated Length constraints
       are defined below.

   EOF TLV
       Marks the end of the TLV space (hence of the file) explicitly to
       make it clear that the file is not truncated. If this TLV is
       present in the TLV space, it must be the last TLV. Type is 0,
       Length is 0, Value is empty.

   LinkTypeValue TLV
       Records the link-layer header type value used for the compilation;
       usually this is either the linktype input argument to
       pcap_open_dead(3PCAP) or the dlt input argument to
       pcap_set_datalink(3PCAP). By convention link-layer header type
       values are limited to 16 bits. Type is 1, Length is 2, Value
       contains an integer.

   LinkTypeName TLV
       Records the input argument to pcap_datalink_name_to_val(3PCAP) if
       the latter was used to translate a DLT name into the DLT value (the
       same name can sometimes produce different values in different
       contexts). Type is 2, Length is variable, Value contains an ASCII
       string.

   SnapLen TLV
       Records the snapshot length used for the compilation; usually this
       is the snaplen input argument to pcap_open_dead() or
       pcap_set_snaplen(3PCAP). Type is 3, Length is 4, Value contains an
       integer.

   Filter TLV
       Records the filter expression that was compiled into the bytecode;
       usually this is the str input argument to pcap_compile(3PCAP).
       Type is 4, Length is variable, Value contains an ASCII string.

   OptReq TLV
       Records whether optimization was requested for the compilation or
       not; usually this is the optimize input argument to
       pcap_compile(). Note that some link-layer header types and filter
       keywords disable the optimization automatically in libpcap. Type
       is 5, Length is 1, Value contains 1 or 0.

   Netmask TLV
       Records the value of the netmask input argument to pcap_compile().
       Type is 6, Length is 4, Value contains a 32-bit IPv4 netmask.

   Comment TLV
       Records free-form text, for example, the name and version of the
       program that generated the file. Type is 7, Length is variable,
       Value contains a UTF-8 string.

   Timestamp TLV
       Records when the compilation was performed. Type is 8, Length is 8,
       Value contains a 64-bit Unix timestamp.

SOFTWARE SUPPORT
       None at the time of this writing.

SEE ALSO
       pcap-savefile(5)

                                 24 June 2022                 CBPF-SAVEFILE(5)

-- 
Denis Ovsienko
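To make the body layout in the draft above more concrete, here is a minimal Python sketch that splits the bytes following the fixed header into instructions and TLVs and enforces the constraints the draft states (each TLV at most once, fixed Length values, EOF last). It assumes that Type and Length are 16 bits each, which the draft does not state in prose; `parse_body` and `encode_tlv` are hypothetical names.

```python
import struct

INSN = struct.Struct(">HBBI")    # opcode, jt, jf, k: 8 bytes, big-endian
TLV_HEAD = struct.Struct(">HH")  # ASSUMPTION: 16-bit Type, 16-bit Length

# Type code points and Length constraints from the draft
# (None means the Length is variable).
TLV_TYPES = {
    0: ("EOF", 0),
    1: ("LinkTypeValue", 2),
    2: ("LinkTypeName", None),
    3: ("SnapLen", 4),
    4: ("Filter", None),
    5: ("OptReq", 1),
    6: ("Netmask", 4),
    7: ("Comment", None),
    8: ("Timestamp", 8),
}


def encode_tlv(tlv_type, value=b""):
    """Serialize one TLV; Length counts the Value bytes only."""
    return TLV_HEAD.pack(tlv_type, len(value)) + value


def parse_body(body, instruction_count):
    """Return (instructions, tlvs) for the bytes after the fixed header."""
    insn_bytes = instruction_count * INSN.size
    if len(body) < insn_bytes:
        raise ValueError("truncated instruction block")
    insns = [INSN.unpack_from(body, i * INSN.size)
             for i in range(instruction_count)]

    tlvs = {}
    pos = insn_bytes
    while pos < len(body):
        if 0 in tlvs:
            raise ValueError("data after EOF TLV")
        if len(body) - pos < TLV_HEAD.size:
            raise ValueError("truncated TLV header")
        tlv_type, length = TLV_HEAD.unpack_from(body, pos)
        pos += TLV_HEAD.size
        value = body[pos:pos + length]
        if len(value) != length:
            raise ValueError("truncated TLV value")
        pos += length
        if tlv_type in tlvs:
            raise ValueError(f"TLV type {tlv_type} appears more than once")
        # Unknown types are kept as raw bytes, because a later minor
        # version may define new TLV types.
        name, fixed_len = TLV_TYPES.get(tlv_type, (None, None))
        if fixed_len is not None and length != fixed_len:
            raise ValueError(f"{name} TLV must have Length {fixed_len}")
        tlvs[tlv_type] = value
    return insns, tlvs
```

Keeping the values as raw bytes leaves their interpretation (integer, ASCII, UTF-8) to the caller, which matches the draft's view that the TLV space is mostly meta-data for human interpretation.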
--- End Message ---