Firewall Wizards mailing list archives
Re: regarding spam...
From: ant () notatla demon co uk (Antonomasia)
Date: Fri, 29 Mar 2002 23:55:16 +0000 (GMT)
From: Ryan Russell <ryan () securityfocus com>
I think the hard part becomes how do you tell if one piece of mail is the same as another? If they were absolutely identical, you could ship MD5 hashes around, and everything would be great. One problem is that many spam messages are unique in some small way to the recipient, i.e. they contain tracking info. Perhaps you then have an algorithm that can produce a percentage match when two emails are compared?
Before you leap into this, consider possible litigation from people who say (truly or not) that you prevented their mail from arriving. Public distribution of the spam characteristics may not be the best plan.

Suggestion for similarity checking:

1) Feed all docs being processed through a formatter to remove all whitespace (space, tab, nl, cr). Maybe squash case.
2) Use some form of chunk recognition to slice the doc into moderate-sized chunks. Something like paragraphs, but based on the data after step 1. This could be done by delimiting chunks with short strings kept in a local file. The strings might be a few characters long and selected at random from past postings to represent what is found in real traffic.
2b) If this is one-party replay detection the chunk delimiters can be secret and can undergo gradual change.
3) Hash each chunk and store the document description as the list of hashes, with an expiry period (say from now to now + 1 year).
4) Comparison of a new doc to the records of previous docs would result in rejection if some high-ish fraction of the chunks matched those of a previous doc. (Order should probably matter.)

I downgrade incoming mail based on Received: headers (whole country codes, whole ISPs and non-resolving IPs) and on content (including HTML). Frequently the downgrade is total - bouncing it unseen.

--
##############################################################
# Antonomasia   ant notatla.demon.co.uk                      #
# See http://www.notatla.demon.co.uk/                        #
##############################################################
_______________________________________________
firewall-wizards mailing list
firewall-wizards () nfr com
http://list.nfr.com/mailman/listinfo/firewall-wizards
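
The similarity-checking steps suggested in the message above (1 to 4) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the poster's implementation: the delimiter strings, the 0.8 threshold, the use of SHA-256 as the chunk hash, and the names normalize, chunk, fingerprint, match_fraction and ReplayStore are all hypothetical choices made for the example. Order sensitivity is approximated with difflib.SequenceMatcher, which only counts chunks that match in the same relative order.

```python
import hashlib
import re
import time
from difflib import SequenceMatcher

# Hypothetical delimiter strings; the message suggests sampling short
# strings from real traffic and keeping them in a local (possibly secret) file.
DELIMITERS = ["the", "ing", "you"]
EXPIRY_SECONDS = 365 * 24 * 3600   # "now + 1 year"
MATCH_THRESHOLD = 0.8              # assumed value for a "high-ish fraction"


def normalize(text):
    """Step 1: remove all whitespace and squash case."""
    return re.sub(r"\s+", "", text).lower()


def chunk(normalized):
    """Step 2: slice the normalized text at the delimiter strings."""
    pattern = "|".join(re.escape(d) for d in DELIMITERS)
    return [c for c in re.split(pattern, normalized) if c]


def fingerprint(text):
    """Step 3: describe a document as the ordered list of its chunk hashes."""
    return [hashlib.sha256(c.encode("utf-8")).hexdigest()
            for c in chunk(normalize(text))]


def match_fraction(new_hashes, old_hashes):
    """Step 4: fraction of the new doc's chunks matched, in order, by an old doc."""
    if not new_hashes:
        return 0.0
    matcher = SequenceMatcher(None, new_hashes, old_hashes, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(new_hashes)


class ReplayStore:
    """Keeps fingerprints of previous docs, each with an expiry time."""

    def __init__(self):
        self.records = []  # list of (expires_at, hashes)

    def add(self, text, now=None):
        now = time.time() if now is None else now
        self.records.append((now + EXPIRY_SECONDS, fingerprint(text)))

    def is_replay(self, text, now=None):
        now = time.time() if now is None else now
        # Drop expired records before comparing.
        self.records = [r for r in self.records if r[0] > now]
        hashes = fingerprint(text)
        return any(match_fraction(hashes, old) >= MATCH_THRESHOLD
                   for _, old in self.records)
```

With this sketch, two copies of a message that differ only in a trailing tracking token share all but one chunk hash, so the second is flagged as a replay, while an unrelated message shares none. A real deployment would need far more delimiters and some tuning of chunk size and threshold.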