Dailydave mailing list archives
Re: approximate string matching - Bloom filters
From: "Fausett, Mark (US SSA)" <mark.fausett () baesystems com>
Date: Fri, 1 Sep 2006 13:23:43 -0400
Bloom filters are approximate in a different sense though -- Think of them as space efficient, but lossy token sets; you put a bunch of tokens in, and subsequently can query whether a particular token was placed into the set; to some degree of confidence. Bloom filters are subject to false positives -- they'll sometimes incorrectly tell you that a token is in the set -- but not false negatives. Because hashing functions are used to insert tokens into the bloom filter, the false positives have nothing to do with approximate string matches. Cheers, Mark Fausett -----Original Message----- From: dailydave-bounces () lists immunitysec com [mailto:dailydave-bounces () lists immunitysec com] On Behalf Of Martin Roesch Sent: Friday, September 01, 2006 10:06 AM To: Mateusz Berezecki Cc: dailydave () lists immunitysec com Subject: Re: [Dailydave] approximate string matching -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Have you taken a look at Bloom Filters? I'm not sure what your exact requirements are but Bloom Filters are fast and do probabilistic pattern matching. Check out: http://en.wikipedia.org/wiki/Bloom_filter http://www.arl.wustl.edu/~sarang/analysis.pdf They're easy to implement and fast, so maybe that will work for you. Hope that helps! -Marty ...SNIP...
Attachment:
Fausett, Mark (US SSA).vcf
Description: Fausett, Mark (US SSA).vcf
_______________________________________________ Dailydave mailing list Dailydave () lists immunitysec com http://lists.immunitysec.com/mailman/listinfo/dailydave
Current thread:
- Re: approximate string matching - Bloom filters Fausett, Mark (US SSA) (Sep 01)
- Re: approximate string matching - Bloom filters Mateusz Berezecki (Sep 01)
- Re: approximate string matching - Bloom filters Martin Roesch (Sep 02)
- Re: approximate string matching - Bloom filters Mateusz Berezecki (Sep 01)