Bugtraq mailing list archives

Re: [Full-disclosure] [Tool] DeepToad 1.1.0


From: Joxean Koret <joxeankoret () yahoo es>
Date: Tue, 5 Jan 2010 15:00:12 +0000 (GMT)

Yes. It isn't designed to search for the differences between 2 binary files but to search for similar files, 
_independently_ of the format, and group them.

This tool can be used, in example, to search for similar "crapwares" or to search for similar image files (not similar 
looking, but similar files), similar office documents, etc...

--- El mar, 5/1/10, T Biehn <tbiehn () gmail com> escribió:

De: T Biehn <tbiehn () gmail com>
Asunto: Re: [Full-disclosure] [Tool] DeepToad 1.1.0
Para: "Dan Kaminsky" <dan () doxpara com>
CC: "Joxean Koret" <joxeankoret () yahoo es>, "Full Disclosure" <full-disclosure () lists grok org uk>, bugtraq () 
securityfocus com
Fecha: martes, 5 de enero, 2010 15:56
I can see what you're saying, it
could be useful for finding
differences in different versions of the same binary but
from what I
can see Joxean's app is meant to group files of the same
'type,' not
provide 'diff' capabilities.

-Travis

On Tue, Jan 5, 2010 at 9:51 AM, Dan Kaminsky <dan () doxpara com>
wrote:
I looked into a fair amount of this sort of
normalization back when I was
playing with dotplots.  The idea was to upgrade from
simple Levenshtein
string comparison (with no knowledge of variable
length x86 instructions,
pointers that shift from compile to compile, etc) to
something with at least
some domain specific knowledge.  What I found,
somewhat surprisingly, was
that dumb string comparison was more than enough.  In
fact, when I compared
pre-patch and post-patch builds, it was easy to
directly see when content
was added, removed, shifted in location, etc. 
Joxean's going to have much
the same result -- as basic as his similarity metric
is, he'll get the broad
strokes just fine.

Ultimately the best approach is to build a graph of
how functions interact
and measure graph isomorphism, but of course Halvar
figured that out years
ago :)

On Tue, Jan 5, 2010 at 3:41 PM, T Biehn <tbiehn () gmail com>
wrote:

Hmm,
Wouldn't it be more useful to the sec community to
have a algorithm
that abstracts at the -interpreted- content level?
That is when
analyzing binaries I wouldn't think that this
would classify two with
near identical functionality together, even though
it is removing a
significant chunk of information during the hash
pass.

I would largely assume that your algorithm, as is,
works best on
uncompressed bitmaps. Is there something I'm
missing?

-Travis

On Sun, Jan 3, 2010 at 6:37 AM, Joxean Koret
<joxeankoret () yahoo es>
wrote:
Hi all,

I'm happy to announce the very first public
release of the open source
project DeepToad, a tool for computing fuzzy
hashes from files.

DeepToad can generate signatures, clusterize
files and/or directories
and compare them. It's inspired in the very
good tool ssdeep [1] and, in
fact, both projects are very similar.

The complete project is written in pure
python and is distributed under
the LGPL license [2].

Links:
Project's Web Page http://code.google.com/p/deeptoad/
Download Web Page http://code.google.com/p/deeptoad/downloads/list
Wiki http://code.google.com/p/deeptoad/w/list

References:
[1] http://ssdeep.sourceforge.net/
[2] http://www.gnu.org/licenses/lgpl.html

Regards && Happy new year!
Joxean Koret



_______________________________________________
Full-Disclosure - We believe in it.
Charter: http://lists.grok.org.uk/full-disclosure-charter.html
Hosted and sponsored by Secunia - http://secunia.com/




--
FD1D E574 6CAB 2FAF 2921  F22E B8B7 9D0D 99FF
A73C
http://pgp.mit.edu:11371/pks/lookup?search=tbiehn&op=index&fingerprint=on
http://pastebin.com/f6fd606da

_______________________________________________
Full-Disclosure - We believe in it.
Charter: http://lists.grok.org.uk/full-disclosure-charter.html
Hosted and sponsored by Secunia - http://secunia.com/





-- 
FD1D E574 6CAB 2FAF 2921  F22E B8B7 9D0D 99FF A73C
http://pgp.mit.edu:11371/pks/lookup?search=tbiehn&op=index&fingerprint=on
http://pastebin.com/f6fd606da






Current thread: