Wireshark mailing list archives

Re: match packets at sender and receiver


From: Andrej van der Zee <andrejvanderzee () gmail com>
Date: Tue, 20 Apr 2010 17:20:02 +0900

Hi Ian,

Thank you again for your method. I used the following fields for
identifying a packet:

* src ip
* dst ip
* src port
* dst port
* tcp seq nr
* tcp ack seq nr
* ip id
* packet length

In a single 1.7GB cap-file with 17424367 TCP packets I get 27
identical packets based on the above id.
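
A minimal sketch of how such a duplicate count could be computed, assuming
the eight fields above have already been extracted into an array (IPv4 only).
The names pkt_key, pkt_key_cmp and count_duplicate_keys are hypothetical and
are not taken from any tool mentioned in this thread:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical key type holding the eight fields listed above (IPv4 only). */
struct pkt_key {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
    uint32_t seq;
    uint32_t ack_seq;
    uint16_t ip_id;
    uint16_t length;
};

/* Compare one field and return early if the two keys differ on it. */
#define CMP_FIELD(f) \
    do { if (a->f != b->f) return (a->f < b->f) ? -1 : 1; } while (0)

/* Total order over keys, usable with qsort(). Compares field by field so
 * that struct padding never influences the result. */
static int pkt_key_cmp(const void *pa, const void *pb)
{
    const struct pkt_key *a = pa;
    const struct pkt_key *b = pb;

    CMP_FIELD(src_ip);
    CMP_FIELD(dst_ip);
    CMP_FIELD(src_port);
    CMP_FIELD(dst_port);
    CMP_FIELD(seq);
    CMP_FIELD(ack_seq);
    CMP_FIELD(ip_id);
    CMP_FIELD(length);
    return 0;
}

/* Sorts the keys in place and returns how many entries repeat a key that
 * already occurred (three identical keys count as two duplicates). */
static size_t count_duplicate_keys(struct pkt_key *keys, size_t n)
{
    size_t dups = 0;

    if (n < 2)
        return 0;
    qsort(keys, n, sizeof keys[0], pkt_key_cmp);
    for (size_t i = 1; i < n; i++) {
        if (pkt_key_cmp(&keys[i - 1], &keys[i]) == 0)
            dups++;
    }
    return dups;
}
```

Sorting keeps the memory overhead low for a capture with millions of
packets; a hash table would work as well.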

Are there any other fields in the tcphdr struct that I could use? I am
not sure what they all mean:

struct tcphdr {
        unsigned short source;
        unsigned short dest;
        unsigned long seq;
        unsigned long ack_seq;  
        #  if __BYTE_ORDER == __LITTLE_ENDIAN
        unsigned short res1:4;
        unsigned short doff:4;
        unsigned short fin:1;
        unsigned short syn:1;
        unsigned short rst:1;
        unsigned short psh:1;
        unsigned short ack:1;
        unsigned short urg:1;
        unsigned short res2:2;
        #  elif __BYTE_ORDER == __BIG_ENDIAN
        unsigned short doff:4;
        unsigned short res1:4;
        unsigned short res2:2;
        unsigned short urg:1;
        unsigned short ack:1;
        unsigned short psh:1;
        unsigned short rst:1;
        unsigned short syn:1;
        unsigned short fin:1;
        #  endif
        unsigned short window;  
        unsigned short check;
        unsigned short urg_ptr;
};
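
As a reference for what each field holds, here is a minimal sketch that
reads the fixed 20-byte TCP header straight from the raw bytes instead of
through bitfields. It assumes buf points at the first byte of the TCP
header; tcp_fields, rd16, rd32 and parse_tcp_header are hypothetical names.
Fixed-width types are used because unsigned long is 64 bits wide on many
64-bit systems, which would not match the 32-bit seq and ack_seq fields.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded view of the fixed part of a TCP header. */
struct tcp_fields {
    uint16_t src_port;    /* "source" in struct tcphdr */
    uint16_t dst_port;    /* "dest" */
    uint32_t seq;         /* sequence number of the first payload byte */
    uint32_t ack_seq;     /* next sequence number expected from the peer */
    uint8_t  header_len;  /* header length in bytes (doff * 4) */
    uint8_t  flags;       /* low 6 bits: URG ACK PSH RST SYN FIN */
    uint16_t window;      /* receive window advertised by the sender */
    uint16_t checksum;    /* "check": covers header, payload, pseudo-header */
    uint16_t urg_ptr;     /* urgent pointer, meaningful only if URG is set */
};

/* Read big-endian (network order) integers byte by byte, so the code
 * behaves the same on little-endian and big-endian hosts. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t rd32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns 0 on success, -1 if fewer than 20 bytes are available. */
static int parse_tcp_header(const uint8_t *buf, size_t len,
                            struct tcp_fields *out)
{
    if (len < 20)
        return -1;
    out->src_port   = rd16(buf);
    out->dst_port   = rd16(buf + 2);
    out->seq        = rd32(buf + 4);
    out->ack_seq    = rd32(buf + 8);
    out->header_len = (uint8_t)((buf[12] >> 4) * 4);
    out->flags      = (uint8_t)(buf[13] & 0x3f);
    out->window     = rd16(buf + 14);
    out->checksum   = rd16(buf + 16);
    out->urg_ptr    = rd16(buf + 18);
    return 0;
}
```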

Thank you,
Andrej



On Wed, Apr 7, 2010 at 8:54 AM, Ian Schorr <ian.schorr () gmail com> wrote:
On Tue, Apr 6, 2010 at 10:45 PM, Andrej van der Zee
<andrejvanderzee () gmail com> wrote:
What I would like to know is how to match packets on both ends of the
line, provided that I have the IP numbers. Are there any unique packet
identifiers that appear in the cap-files on both ends? What should I
use? For example, when I study the cap-file in Wireshark, I see under
"Internet Protocol" an "Identification" number that seems to be
incremented for packets over the same connection (or conversation?).
Is this Identification number generated by Wireshark or is it really
in the packet headers? Does it appear in both cap files? In that case,
I could use a tuple <IP, Identification> to match packets on both
ends.

IP IDs are actually in the packet header.  Two ways to know:  1) Click
on the field.  Notice how 2 bytes (containing the 16-bit IP ID value)
are highlighted in the bottom pane, the raw bytes of the packet.  2)
Wireshark-generated fields are generally enclosed in square brackets.
(Neither rule is guaranteed to hold in every case, but both are
generally true and are SUPPOSED to always be true.)

You could use an ID field like IP ID to identify your packets.
However, IP ID is not only not guaranteed to be unique within your
capture, but it's one of the most likely fields to not be unique.  For
one thing, it's a relatively small field (16-bit) and even if a host
increments the ID steadily (i.e. it doesn't re-use IDs more often than
it has to), it will re-use an ID after sending only 65536 packets.  A
reasonably busy system is going to wrap IDs pretty quickly.
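
To put a rough number on "pretty quickly", here is a back-of-the-envelope
sketch. It assumes a host that uses one global counter incremented once per
packet sent; ip_id_wrap_seconds is a hypothetical helper name:

```c
/* Seconds until a 16-bit IP ID counter wraps around, assuming the host
 * increments a single global counter once per packet sent (an assumption;
 * real hosts vary). */
static double ip_id_wrap_seconds(double packets_per_second)
{
    return 65536.0 / packets_per_second;
}
```

Under that assumption, a host sending 10,000 packets per second reuses an
ID roughly every 6.5 seconds.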

   Incidentally, different hosts increment the IDs differently.  Some
   increment it globally - once per packet they send.  Some have an
   incrementing counter for each host they're talking to.  One host
   that I was looking at yesterday does some weird things - it seems
   to be aware at IP-level what packets are still in-flight on the
   network (based on being unacknowledged at TCP level).  It only
   generated unique IDs for each in-flight packet, but once they'd
   been acknowledged, it'd reuse that IP ID.  So it tended to use IDs
   0, 1, and maybe 2 ALL the time.  Weird.

And really, that's the tricky part.  There aren't really any fields in
TCP/IP packets that are guaranteed to be unique.  There's always SOME
chance of miscorrelating two packets that share the same properties
that you're checking for.

Anyway, if you were going to go that route, you'd get better results
by comparing the TCP sequence number (tcp.seq), plus the 4-tuple
(source and destination IPs and ports).  Although that combination is
still not 100% guaranteed to be unique, it's much, much more likely to
be.  The sequence numbers will wrap, but only after about 4 billion
bytes (2^32) have been transferred in one direction of a connection.
There's a chance that if connections are being opened and closed
frequently, then port numbers will be reused and that the host will
also start TCP sequence numbers at the same point, but that's also
extremely unlikely (and a big no-no - to avoid attacks, hosts should
pick a random initial sequence number).

Now compare the IP IDs as well, and it's now very, very, VERY unlikely
that two packets you compare match criteria but aren't actually the
same.  Theoretically there is a chance of misidentification, but
practically, for your purposes, that's probably plenty accurate.
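
A minimal sketch of that comparison, assuming the fields have already been
extracted from both captures into arrays and that nothing on the path
rewrites them. match_key, match_key_equal and match_captures are
hypothetical names, and the quadratic scan is only meant to show the idea:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical key: 4-tuple plus TCP sequence number plus IP ID. */
struct match_key {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
    uint32_t seq;
    uint16_t ip_id;
};

static int match_key_equal(const struct match_key *a,
                           const struct match_key *b)
{
    return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
           a->src_port == b->src_port && a->dst_port == b->dst_port &&
           a->seq == b->seq && a->ip_id == b->ip_id;
}

/* For each sender-side packet i, stores in match_idx[i] the index of the
 * first receiver-side packet with an equal key, or -1 if there is none.
 * Returns the number of sender packets that found a match.
 * O(n_snd * n_rcv): fine for a sketch, too slow for millions of packets,
 * where sorting or hashing the receiver side would be needed. */
static size_t match_captures(const struct match_key *snd, size_t n_snd,
                             const struct match_key *rcv, size_t n_rcv,
                             long *match_idx)
{
    size_t matched = 0;

    for (size_t i = 0; i < n_snd; i++) {
        match_idx[i] = -1;
        for (size_t j = 0; j < n_rcv; j++) {
            if (match_key_equal(&snd[i], &rcv[j])) {
                match_idx[i] = (long)j;
                matched++;
                break;
            }
        }
    }
    return matched;
}
```

A sender packet left at -1 was either lost, modified on the way, or
captured outside the receiver's capture window.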


Keep in mind also that if the network is modifying your packets
en route, you'll have trouble.  If there is a TCP proxy of some kind,
a NAT/PAT device, or even a router that fragments packets, it may
seriously impact what you're comparing.
___________________________________________________________________________
Sent via:    Wireshark-users mailing list <wireshark-users () wireshark org>
Archives:    http://www.wireshark.org/lists/wireshark-users
Unsubscribe: https://wireshark.org/mailman/options/wireshark-users
            mailto:wireshark-users-request () wireshark org?subject=unsubscribe




-- 
Andrej van der Zee
Koenji-minami 2-40-19A
Suginami-ku, Tokyo
166-0003 JAPAN
Mobile: +81-(0)80-65251092
Phone/Fax: +81-(0)3-3318-3155