tcpdump mailing list archives

Re: Proposed new pcap format


From: Darren Reed <darrenr () reed wattle id au>
Date: Tue, 13 Apr 2004 23:58:18 +1000 (EST)

In some email I received from Michael Richardson, sie wrote:

"Darren" == Darren Reed <darrenr () reed wattle id au> writes:
    Darren> Today, some people might want MD-5, others SHA-1 and in the
    Darren> future, there may be other hashing algorithms that are
    Darren> better to use.  And there are times when we might want it
    Darren> off (algorithm 0, for example.)

  okay, meta-data.
  I think that one might want to emit the meta-data header, but not fill
it in in some cases, and calculate the hash later on, poking it in.

    Darren> As such, I believe this option should be a (type,value)
    Darren> pair, if we can agree that the hash value in the option
    Darren> header is a hash over the entire record returned by the
    Darren> kernel (with the value of the hash set to 0.)  And yes, the
    Darren> kernel computes the hash.

  Huh?  really. You want the hash over the entire packet, or just the
part that was received by pcap?

  I wondered about that part. This makes the hash very interesting.
  But, the kernel boundary is abstracted from the point of view of the
pcap file format.

What I'd like to see hashed, by the kernel, is the data it provides
to the user application.  Depending on the purpose, this has better
trustworthiness, I feel. libpcap may decide to throw away that hash
and include its own in the dump file.

I'm not suggesting this just from a quick-comparison point of view
(as some others are) but from a data reliability perspective.  If
you have a multithreaded application interacting with libpcap, it
would be nice if the pcap data that you considered sensitive could
be hashed by the provider (the kernel), as is the case with other
data streams in life.

Hmmm, having said that, I think a hash coming from the kernel would
need to cover two pieces of data: the timestamp and the portion of
the packet being returned. 
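
To make the coverage concrete, here is a minimal sketch of a digest
computed over the timestamp plus the captured portion of the packet
only.  The function name record_digest, the field layout (two unsigned
32-bit big-endian values, seconds then microseconds) and the default
algorithm are assumptions for illustration; nothing here is an agreed
format or an existing kernel or libpcap interface.

```python
import hashlib
import struct


def record_digest(ts_sec, ts_usec, captured, algo="sha1"):
    """Digest over the timestamp and the captured bytes only."""
    # Assumed layout: seconds then microseconds, each an unsigned
    # 32-bit big-endian integer. The real record layout is undecided.
    header = struct.pack("!II", ts_sec, ts_usec)
    h = hashlib.new(algo)
    h.update(header)
    h.update(captured)
    return h.digest()


# Hypothetical packet and snapshot length, for illustration only.
packet = bytes(range(64))
snaplen = 40

d1 = record_digest(1000000000, 0, packet[:snaplen])
d2 = record_digest(1000000000, 1, packet[:snaplen])
```

Bytes beyond the snapshot length never reach the digest, so two
packets that differ only past that point hash identically, while a
change in the timestamp alone changes the result.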

Now whether this hash/checksum is cryptographically strong (SHA-1)
or weak (32-bit XOR, say) should be up to me to decide, depending on
where I choose to draw the line in the sand for performance.

  So, if we are including anything other than the packet data, we need
to define things.

  I can see some people wanting a hash over the layer-3 only, with
mutable fields set to zero (a la IPsec AH), such that they can compare
captures from different points.  Is this your desire?

No, I don't think anyone here is looking for that.  What has been
expressed as desirable is the means to do fast comparisons on packets
(4/8 bytes vs 40+) and data integrity.
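
A rough sketch of the fast-comparison use: compare the short stored
value first and only look at the packet bytes when the short values
match.  CRC32 is used here purely as an example of a 4-byte value, and
the helper names short_digest and same_packet are hypothetical.

```python
import zlib


def short_digest(data):
    """4-byte check value for a packet (CRC32 as an example choice)."""
    return zlib.crc32(data) & 0xFFFFFFFF


def same_packet(a, a_digest, b, b_digest):
    """Compare two packets, using the stored digests as a fast path."""
    # Different digests prove the packets differ, so stop early.
    if a_digest != b_digest:
        return False
    # Equal digests only suggest equality (collisions are possible),
    # so confirm with a full byte comparison.
    return a == b


pkt_a = b"\x45\x00\x00\x28" + b"\x00" * 36
pkt_b = b"\x45\x00\x00\x28" + b"\x01" * 36
```

The weaker the checksum, the more often the full comparison runs on
packets that are in fact different, which is the performance versus
strength trade-off being discussed.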

And some other comments:

  a) how strong do we need to make this?
     8-byte implies it won't be CRC32. A longer CRC? MD4? MD5? SHA1?

I think it needs to support variable possibilities.
Maybe even 2's complement checksum, XOR, CRC32, MD4, MD5, SHA1, etc.
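
As a sketch of what a selectable algorithm could look like, the table
below maps an algorithm number to a function, with 0 meaning "off",
and returns a (type, value) pair.  The numbering, the names ALGORITHMS
and hash_option, and the zero-padding rule in xor32 are all assumptions
made for this example; MD4 is left out only because Python's hashlib
does not always provide it.

```python
import hashlib
import struct
import zlib


def xor32(data):
    """XOR of the data taken as 32-bit big-endian words."""
    # Assumed rule: pad with zero bytes up to a multiple of 4.
    padded = data + b"\x00" * (-len(data) % 4)
    acc = 0
    for (word,) in struct.iter_unpack("!I", padded):
        acc ^= word
    return struct.pack("!I", acc)


# Hypothetical algorithm numbers; 0 means hashing is turned off.
ALGORITHMS = {
    0: lambda data: b"",
    1: xor32,
    2: lambda data: struct.pack("!I", zlib.crc32(data) & 0xFFFFFFFF),
    3: lambda data: hashlib.md5(data).digest(),
    4: lambda data: hashlib.sha1(data).digest(),
}


def hash_option(algo_id, data):
    """Return the option as a (type, value) pair."""
    return (algo_id, ALGORITHMS[algo_id](data))
```

With this shape the value length follows from the type (0, 4, 4, 16
and 20 bytes here), so a reader that does not recognise a type would
still need an explicit length to skip it.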

  b) how much performance can we afford?
     (clearly, it could be left as 0 and filled in later on)

Surely, if you can select from a number of different hashes then this
is a choice for the user to make.

  c) do we include this in every packet header?  Or as an extra
     meta-attribute?

Every packet.

Cheers,
Darren
-
This is the tcpdump-workers list.
Visit https://lists.sandelman.ca/ to unsubscribe.

