tcpdump mailing list archives

RFC: Add multicast reception API to libpcap


From: Bruce M Simpson <bms () incunabulum net>
Date: Sat, 25 Aug 2007 00:30:50 +0100

Hi all,

I'd like to field a Request for Comments for a new pcap feature: receive link-layer multicast packets.

I attach Pavlin's summary of XORP's requirements, and draw your attention to the varying methods used to enable this behaviour across operating systems.

First of all some background.

The XORP team intend to import support for the IS-IS routing protocol. Being an ISO-OSI derived routing protocol, IS-IS normally uses an 802.3 LLC type encapsulation. 802.3 multicast addresses are used for certain IS-IS link-local protocol traffic. XORP has recently grown support in its I/O libraries to do raw packet I/O using libpcap, in order to facilitate a portable implementation of IS-IS.

Recall also that when acting as an IP forwarder, the last thing we want to do is to put the interface into promiscuous mode, as this will generate an unacceptable level of redundant reception interrupts, and possibly introduce a routing loop in some network stacks.

As such we'd like to be able to request the reception of multicast traffic via libpcap, using a portable API. I propose the following function prototype and description:

%%%
int pcap_set_multi(pcap_t *p, int enable, int lladdrlen, u_char *lladdr);

Enable or disable link-layer multicast reception for the given address. May be used to receive such traffic without putting the interface into promiscuous mode, if and only if the underlying network driver supports it. If enable is set to zero, reception for the link-layer multicast address *lladdr is disabled; if non-zero, reception is enabled. p must point to a live network device descriptor obtained by calling pcap_open_live().
%%%
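
To make the argument contract above a little more concrete, here is a minimal sketch of the sanity check an Ethernet backend might apply to (lladdrlen, lladdr) before touching the driver. The helper name check_ether_multi() is hypothetical and is not part of libpcap; pcap_set_multi() itself is only the proposal above and does not exist yet. The two constants are the IS-IS AllL1ISs and AllL2ISs group addresses.

```c
#include <stddef.h>

#define ETHER_ADDR_BYTES 6

/*
 * Hypothetical helper (not a libpcap function): returns 0 if lladdr is a
 * 6-byte Ethernet group address, -1 otherwise.
 */
static int check_ether_multi(int lladdrlen, const unsigned char *lladdr)
{
    if (lladdr == NULL || lladdrlen != ETHER_ADDR_BYTES)
        return -1;
    /*
     * The I/G bit is the least significant bit of the first octet; it is
     * set for multicast (and broadcast) destination addresses.
     */
    if ((lladdr[0] & 0x01) == 0)
        return -1;
    return 0;
}

/* IS-IS group addresses used on broadcast networks. */
static const unsigned char all_l1_iss[ETHER_ADDR_BYTES] =
    { 0x01, 0x80, 0xc2, 0x00, 0x00, 0x14 };
static const unsigned char all_l2_iss[ETHER_ADDR_BYTES] =
    { 0x01, 0x80, 0xc2, 0x00, 0x00, 0x15 };

/*
 * Intended call pattern with the proposed function (assumption only):
 *
 *     if (check_ether_multi(6, all_l1_iss) == 0)
 *         pcap_set_multi(p, 1, 6, (u_char *)all_l1_iss);
 */
```

Note that this check also accepts the broadcast address, since it has the group bit set; a real backend might want to reject that separately.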

I think the above definition is fairly portable. I don't think it makes sense to use pcap_addr_t, as this is simply a stash of pointers to struct sockaddr. Windows folks may note that it should be possible to manipulate the multicast filter list using an NDIS OID from user mode, with sufficient privilege.

Whilst I may not get around to adding this feature myself for a while, it is something which is needed for XORP IS-IS, and other folks may wish to consider it too.

There is a 3rd party motivation for this - I recently changed the SIOCADDMULTI ioctl in FreeBSD to allow one and only one invocation per group from userland, because of other changes in the network stack to do with address reference counting.

It is likely that this functionality would be moved to BPF in order to benefit from the reference counting, as the kernel can then track the refcounted link-layer group membership from the BPF descriptor.

pcap is therefore a more accessible API for it.

Comments very much solicited and appreciated.

regards,
BMS
--- Begin Message ---
From: Pavlin Radoslavov <pavlin () icir org>
Date: Wed, 22 Aug 2007 16:16:27 -0700
Bruce M Simpson <bms () incunabulum net> wrote:

(Cc:ing Pavlin as he did the XORP pcap socket support to facilitate IS-IS)

Sam Leffler wrote:

Tapping BPF in-kernel does not automatically enable multicast filters
unless we specifically call if_addmulti(); the BPF code does not examine
BPF programs which examine the link-layer destination address and enable
filters correspondingly.

Seems like it might be worthwhile for it to do that under certain
circumstances if there's no other mechanism to do what you want.
Doing it this way, i.e. inside BPF, would avoid having to code up an API for
the other zillion platforms which pcap supports. [Windows in particular is a
pain, it requires a Winsock socket handle to tweak multicast filters.]

It is however very intrusive - it would require parsing enough of the BPF
program being plumbed in by BIOCSETF to detect that the header fields for that
data-link type (DLT) are being compared with a multicast address, and that
itself is DLT specific.

You are quite right that it isn't strictly needed for SEEMesh with RA-OLSR,
which can rely on kernel to userland upcalls.

However, it is something which protocols like IS-IS need for their control
plane traffic. Putting a forwarding interface into promiscuous mode may not be
acceptable. Last time I looked at Quagga, I believe it was doing just that.

I am missing the beginning of the discussion and the original
question, but (FWIW) here is what we have in XORP re. pcap support
and IS-IS. Note that not everything is tested and might be refined
once we start working on a protocol that will use that interface.

* For each interface on which we need to run a protocol like IS-IS,
  we create a pcap descriptor that is associated with that interface,
  and we also have a select()-able file descriptor (for I/O purposes).

* When the pcap descriptor is initialized, the associated filter
  program is a combination of the pre-defined ether type and the
  (optional) filter program. I.e.,:
  if (ether_type > 0)
    filter_program = (ether proto <ether_type>) and (<filter_program>)

  This will be sufficient for receiving unicast packets with
  destination the router itself.

* For receiving multicast packets as well, we explicitly join a
  link-layer multicast group by using ioctl(SIOCADDMULTI).
  If I remember correctly, IS-IS defines two distinct multicast
  addresses for forming adjacencies on broadcast networks.
  Hence, we need to explicitly join one or both addresses per
  interface. And, no, we don't want to put the interface in
  promiscuous mode just to get those (low bandwidth) control
  packets.
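
As an illustration of the filter combination in the second bullet, the following is a minimal sketch in C, not the actual XORP code. The helper name build_filter() and its return convention are assumptions made for the example; the resulting string is the kind of expression one would hand to pcap_compile().

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write the combined filter expression into buf.
 * Returns 0 on success, -1 if the result did not fit or formatting failed.
 */
static int build_filter(char *buf, size_t buflen, unsigned int ether_type,
                        const char *user_filter)
{
    int n;
    int have_user = (user_filter != NULL && strlen(user_filter) > 0);

    if (ether_type > 0 && have_user) {
        /* Both parts present: AND them together, parenthesised. */
        n = snprintf(buf, buflen, "(ether proto 0x%04x) and (%s)",
                     ether_type, user_filter);
    } else if (ether_type > 0) {
        /* Only the pre-defined ether type. */
        n = snprintf(buf, buflen, "ether proto 0x%04x", ether_type);
    } else {
        /* No ether type: pass the optional filter through unchanged. */
        n = snprintf(buf, buflen, "%s", have_user ? user_filter : "");
    }
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

For example, an ether type of 0x0800 combined with the filter "udp port 520" gives "(ether proto 0x0800) and (udp port 520)".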

The downside of the above approach is that
ioctl(SIOCADDMULTI/SIOCDELMULTI) is pretty much OS-specific. The
systems I have checked so far seem to require the following:

* On Linux we need to use ifreq.ifr_hwaddr with sa_family of
  AF_UNSPEC.
* On FreeBSD and DragonFlyBSD we need to use ifreq.ifr_addr with
  sa_family of AF_LINK.
* On NetBSD and OpenBSD we need to use ifreq.ifr_addr with sa_family
  of AF_UNSPEC.
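
To show what this difference looks like in code, here is a rough sketch that fills in the ifreq for the Linux case only. The helper names fill_multi_ifreq() and l2_multi() are hypothetical. The BSD variants listed above are deliberately left as comments rather than guessed at, since they use a different union member and address family and I have not verified the details here.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* expose struct ifreq under strict -std= on glibc */
#endif
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/ioctl.h>
#include <net/if.h>

/*
 * Hypothetical helper: prepare ifr for SIOCADDMULTI/SIOCDELMULTI with a
 * 6-byte link-layer group address. Returns 0 on success, -1 if this
 * sketch has no branch for the platform.
 */
static int fill_multi_ifreq(struct ifreq *ifr, const char *ifname,
                            const unsigned char mac[6])
{
    memset(ifr, 0, sizeof(*ifr));
    strncpy(ifr->ifr_name, ifname, IFNAMSIZ - 1);  /* stays NUL-terminated */
#if defined(__linux__)
    /* Linux: ifr_hwaddr with sa_family of AF_UNSPEC. */
    ifr->ifr_hwaddr.sa_family = AF_UNSPEC;
    memcpy(ifr->ifr_hwaddr.sa_data, mac, 6);
    return 0;
#else
    /*
     * Not implemented in this sketch:
     *   FreeBSD, DragonFlyBSD: ifr_addr with sa_family of AF_LINK
     *   NetBSD, OpenBSD:       ifr_addr with sa_family of AF_UNSPEC
     */
    (void)mac;
    return -1;
#endif
}

/*
 * Join (join != 0) or leave (join == 0) the group on ifname. sock can be
 * any open socket, e.g. socket(AF_INET, SOCK_DGRAM, 0). Returns the
 * ioctl() result, or -1 if the platform is not covered above.
 */
static int l2_multi(int sock, const char *ifname,
                    const unsigned char mac[6], int join)
{
    struct ifreq ifr;

    if (fill_multi_ifreq(&ifr, ifname, mac) != 0)
        return -1;
    return ioctl(sock, join ? SIOCADDMULTI : SIOCDELMULTI, &ifr);
}
```

The ioctl() itself normally needs elevated privilege (CAP_NET_ADMIN on Linux), so l2_multi() is expected to fail with EPERM for an unprivileged process.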

Personally, I'd prefer to deal exclusively with a pcap-only API so
I don't have to mess with the BPF filters. Also, I'd prefer that the
join/leave L2 multicast group support becomes part of the pcap
API, because the above ioctl() solution is not really portable.
E.g., something along the lines of pcap_join_group() and
pcap_leave_group().
Thus, the underlying implementation doesn't need to explicitly parse
the pcap program to figure out which multicast groups it should
join.

Thanks,
Pavlin

--- End Message ---
-
This is the tcpdump-workers list.
Visit https://cod.sandelman.ca/ to unsubscribe.
