Dailydave mailing list archives

Re: Unknown Application Protocol Analysis


From: William McVey <wam () cisco com>
Date: Wed, 06 Sep 2006 15:06:12 -0500

On Wed, 2006-09-06 at 22:59 +0800, Rhys Kidd wrote: 

Hi List,

I've been thinking about a problem faced when approaching an unknown traffic
flow, and think this list probably contains an expert or two in this area. 


Q. How do you run a quick one-pass analysis of some proprietary application
protocol?


I know it's fairly easy to look at small subsets of traffic manually,
looking for \x00 bytes and slowly guesstimating where fields begin and end,
what constitutes a record, which offsets are static, etc., but I'm imagining
a tool that would take in a batch of traffic and work out roughly what's
what, seeing the big picture.

I'd imagine this tool would run a first check, looking for what might
constitute discrete units of information (possibly all those bounded by
\x00).
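
A minimal sketch of that first pass, assuming NUL bytes really are the
delimiter (many binary protocols use \x00 as padding or as part of an
integer, so treat the result as a hint only). The function name and the
sample payload are made up for illustration.

```python
def split_units(payload: bytes, delimiter: bytes = b"\x00"):
    """Return (offset, chunk) pairs for the non-empty runs between delimiters."""
    units = []
    offset = 0
    for chunk in payload.split(delimiter):
        if chunk:
            units.append((offset, chunk))
        # Advance past this chunk and the delimiter that followed it.
        offset += len(chunk) + len(delimiter)
    return units


# Hypothetical payload, not taken from any real protocol.
sample = b"HELO\x00\x00WORKSTATION1\x00\x05\x01admin\x00"
for offset, chunk in split_units(sample):
    print(f"{offset:4d}  {chunk!r}")
```

Keeping the offsets alongside the chunks matters for the later steps, since
field positions are what get compared across packets.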

I'd imagine this tool would then look for some of the basic layouts of TLV
protocols (which seem most common IMHO) by working out the lengths of what
appear to be strings and looking for integers holding those lengths just
before or after them. Maybe even looking for MD5 or SHA-1 hashes that
correspond to other data fields. Then look for repeating byte patterns etc.
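
As a rough sketch of the length-prefix and hash heuristics, assuming
strings are runs of printable ASCII and that the length sits immediately
before the string as a 1-, 2- or 4-byte integer. The function names and the
minimum run length of 4 are arbitrary choices for illustration, and a
printable length byte would be absorbed into the run and missed.

```python
import hashlib
import re
import struct

# Runs of at least four printable ASCII bytes are treated as candidate strings.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")


def find_length_prefixed_strings(payload: bytes):
    """Return (prefix_offset, struct_format, string) for each run whose
    length is stored in the bytes immediately before it."""
    hits = []
    for match in PRINTABLE_RUN.finditer(payload):
        start, text = match.start(), match.group()
        for fmt in ("B", ">H", "<H", ">I", "<I"):
            size = struct.calcsize(fmt)
            if start < size:
                continue  # not enough bytes before the string
            (value,) = struct.unpack_from(fmt, payload, start - size)
            if value == len(text):
                hits.append((start - size, fmt, text))
    return hits


def find_digest_of(field: bytes, payload: bytes):
    """Return (algorithm, offset) pairs where a raw digest of field occurs."""
    hits = []
    for name in ("md5", "sha1"):
        digest = hashlib.new(name, field).digest()
        pos = payload.find(digest)
        if pos != -1:
            hits.append((name, pos))
    return hits


# Hypothetical type / 2-byte length / value records.
sample = b"\x01\x00\x0cWORKSTATION1\x02\x00\x05admin"
for hit in find_length_prefixed_strings(sample):
    print(hit)
```

On this sample both the 1-byte and the big-endian 2-byte interpretation
match each string, which is exactly the kind of ambiguity that only a
larger set of packets can resolve.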

Once it understands the structure of a single packet, it would compare it
over time with other packets between similar hosts, looking for which fields
are constant, which ones change randomly (signifying GUIDs or message IDs),
and which change only slightly (perhaps timing fields). This would be where
the real knowledge would lie, as assumptions made about individual packets
(e.g. what is really static or dynamic) could be corrected over a larger
data set.
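
One way to sketch the cross-packet comparison, assuming the packets are
already in capture order and that fields sit at fixed offsets. It works per
byte rather than per field, and the labels, the "small_step" threshold and
the sample packets are all invented for illustration.

```python
from itertools import groupby


def classify_offsets(packets, small_step=4):
    """Label each byte offset by how its value behaves across packets."""
    width = min(len(p) for p in packets)  # only compare the common prefix
    labels = []
    for i in range(width):
        column = [p[i] for p in packets]
        if len(set(column)) == 1:
            labels.append("constant")
            continue
        # Size of the change between consecutive packets at this offset.
        steps = [abs(b - a) for a, b in zip(column, column[1:])]
        labels.append("slow" if max(steps) <= small_step else "variable")
    return labels


def summarize(labels):
    """Collapse runs of identical labels into a <label xN> layout string."""
    return "".join(
        f"<{label} x{len(list(run))}>" for label, run in groupby(labels)
    )


# Hypothetical packets: two static bytes, a slow counter, one noisy byte.
packets = [
    bytes([0xAA, 0x01, 0x10, 0x7F]),
    bytes([0xAA, 0x01, 0x11, 0x03]),
    bytes([0xAA, 0x01, 0x13, 0xC4]),
]
print(summarize(classify_offsets(packets)))
```

With only a handful of packets a random field can look constant or slow by
chance, so the labels get more trustworthy as the data set grows.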

Then print the result in a form like:

<static header><record 1><length><Unicode content><\x88\x88\x88>
<record 2><length><COMPUTER_NAME><record 3><CURRENT_TIME><unknown static crud>

Producing an Ethereal protocol definition file at the end would be icing on
the cake!

I've had a look at:
[1]
http://research.microsoft.com/workshops/sysml/papers/sysml-Gopalratnam.pdf
[2] http://www.ub.utwente.nl/webdocs/ctit/1/000000ef.pdf

But I can't seem to find any public code that attempts to solve the same
problem.
Has anyone else thought about this, or does anyone know of code I should
look at?

There have been a couple of papers on a technique dubbed Protocol
Informatics. Marshall Beddoe wrote a proof-of-concept implementation and
some whitepapers/presentations that used to be available at
http://www.baselineresearch.net/PI/ (now a dead domain, though copies may
survive in the Google cache or the Wayback Machine). The code appears to
live on at PacketStorm:
        File Name: PI.tgz http://packetstormsecurity.org/sniffers/PI.tgz
        Description:
                The protocol informatics project is a software
                framework that allows for advanced sequence and
                protocol stream analysis by utilizing
                bioinformatics algorithms. The sole purpose of
                this software is to identify protocol fields in
                unknown or poorly documented network protocol
                formats. The algorithms that are utilized
                perform comparative analysis on a series of
                samples to better understand the underlying
                structure of the otherwise random-looking data.
                The PI framework was designed for
                experimentation through the use of a
                widget-based component set.
        Author: Marshall Beddoe
        Homepage: http://www.baselineresearch.net/PI
        MD5 Checksum: 26b4efae961542718a9208bca030a7e7
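
To give a feel for what "bioinformatics algorithms" can mean here, below is
a sketch of Needleman-Wunsch global alignment applied to two messages
treated as byte sequences. This is a textbook sequence-alignment algorithm,
not code from PI, and nothing above says which algorithms PI actually uses.
The scoring values and sample messages are arbitrary assumptions.

```python
def align(a: bytes, b: bytes, match=1, mismatch=-1, gap=-1):
    """Globally align two byte strings; None marks an inserted gap."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
    # Walk back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append(None)
            i -= 1
        else:
            out_a.append(None)
            out_b.append(b[j - 1])
            j -= 1
    return out_a[::-1], out_b[::-1]


def render(row):
    """Show aligned bytes as hex, with '--' where a gap was inserted."""
    return " ".join("--" if x is None else f"{x:02x}" for x in row)


# Hypothetical messages with a shared prefix and a variable-length field.
top, bottom = align(b"USER bob\n", b"USER alice\n")
print(render(top))
print(render(bottom))
```

Columns where the aligned samples agree suggest static protocol text, while
gapped or mismatched columns suggest variable-length or dynamic fields.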
        
I seem to recall another app doing automated field boundary detection,
posted fairly recently; but I'm afraid I can't find it right now. :-(

  -- William
        
_______________________________________________
Dailydave mailing list
Dailydave () lists immunitysec com
http://lists.immunitysec.com/mailman/listinfo/dailydave

