funsec mailing list archives

Re: Comment Spam: new trends, failing counter-measures and why it's a big deal


From: Dude VanWinkle <dudevanwinkle () gmail com>
Date: Mon, 13 Feb 2006 12:09:19 -0500

On 2/13/06, Stephen J. Smoogen <smooge () gmail com> wrote:
Doing automated searches through whois for obviously fake entries

I was thinking more along the lines of:
You get spammed by someone who promotes their (customer's?) site.
Record the IP of the poster and the site being promoted.
Find the netblock of the poster's IP and the whois info of the site
being promoted (if it's Comcast/etc., ignore it).
Say the spammer registered the domain with this info:
Registrant Name: Drazen Harauzek
Registrant State/Province: Virovitica

You could then crawl the .info whois database for domains registered
with matching information and blacklist all domains/IPs (netblocks)
belonging to Mr. Harauzek, finding that he registered 50 domains en
masse with duplicate whois info. It might be nicer to mirror the
whois DB rather than bog down their servers, but I am not familiar
with whois netiquette.
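
For instance, a rough sketch of that crawl, assuming you've already
collected candidate domains from spammed comments and using the stock
whois CLI (registry output formats vary wildly, so the regex is a guess):

import re
import subprocess
from collections import defaultdict

# Hypothetical list of domains pulled out of spammed comments.
CANDIDATE_DOMAINS = ["example1.info", "example2.info"]

def registrant_name(domain):
    """Query the stock `whois` CLI and pull out the registrant name."""
    out = subprocess.run(["whois", domain], capture_output=True,
                         text=True, timeout=30).stdout
    m = re.search(r"Registrant Name:\s*(.+)", out)
    return m.group(1).strip() if m else None

# Group domains by registrant; a name holding many domains is suspect.
by_registrant = defaultdict(list)
for domain in CANDIDATE_DOMAINS:
    name = registrant_name(domain)
    if name:
        by_registrant[name].append(domain)

for name, domains in by_registrant.items():
    if len(domains) >= 10:  # arbitrary threshold for "registered en masse"
        print(f"blacklist candidate: {name} -> {domains}")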

This, in conjunction with a CAPTCHA (the picture of randomly
generated letters that you have to read and then enter, used for
defeating bots), in addition to the normal spam-filtering techniques
and to smooge's idea below, should help lessen the spam that gets
through.

One thing I learned from my years of fighting spam is that you have to
throw every technology available at it. Each new technology might only
catch 1% or 50%, but each adds a little to the mix.
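
The arithmetic bears this out: if you treat the layers as independent
(optimistic, but illustrative), spam has to slip past every one of them:

# Fraction of spam each layer catches (illustrative numbers).
layers = [0.50, 0.30, 0.10, 0.01]

slip = 1.0
for p in layers:
    slip *= 1.0 - p  # spam must get past every layer in turn

print(f"combined catch rate: {1.0 - slip:.1%}")  # ~68.8%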

-JP

going from there to search-and-verify messages, to confirming that
email addresses are correct. Greylisting/whitelisting software might
also have some effect (if one can legally share that data). Say, in
this way:

Being goes to blog.
 Being decides to post to blog.
 Being is given a EULA which basically says "Here are our posting
guidelines. You give up your right to anonymity if you wish to post
here."
 Being is sent a cookie with certain data in it, and is put on the greylist.
 Server stores data on the IP address, the post data, and the IP
addresses in the post.
 Greylisted items are posted after a delay (or after moderation).
 Server sends aggregated greylist data to a central server (so
pattern-matching AI can flag that, hey, this same URL/IP address
block was embedded in 200 blogs today).
 Posting X times moves one up or down from the greylist to the
whitelist or blacklist, using a Bayesian scoring technique based
partly on keywords and partly on non-whitelisted URLs (a rough
sketch follows below).
 Client servers regularly poll the central server for data to be added
to the black/grey/white keyword-URL lists.

The central server is mainly to help multiple private blogs clear out
bad nets in short order; it would not be needed on a large central
blog aggregator, which could act as the central server itself.
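
This isn't Smoogen's design spelled out, just a rough sketch of how
that grey-to-white/black movement might be scored; the thresholds,
keywords, and weights are all made up, and the keyword score is a
naive stand-in for a real Bayesian filter:

# Made-up thresholds; a real deployment would tune these.
WHITELIST_AT = 5.0
BLACKLIST_AT = -5.0
SPAM_KEYWORDS = {"viagra", "casino", "mortgage"}  # illustrative
WHITELISTED_URLS = {"example.org"}                # illustrative

scores = {}  # poster id (cookie/IP) -> running score

def score_post(poster, words, urls):
    """Adjust a greylisted poster's score and report which list they land on."""
    delta = 1.0  # a clean post nudges the poster toward the whitelist
    delta -= 2.0 * sum(1 for w in words if w.lower() in SPAM_KEYWORDS)
    delta -= 1.0 * sum(1 for u in urls if u not in WHITELISTED_URLS)
    scores[poster] = scores.get(poster, 0.0) + delta
    s = scores[poster]
    if s >= WHITELIST_AT:
        return "whitelist"   # post immediately
    if s <= BLACKLIST_AT:
        return "blacklist"   # drop, and report to the central server
    return "greylist"        # hold for delay/moderation

print(score_post("1.2.3.4", ["cheap", "viagra"], ["spam.example.info"]))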

--
Stephen J Smoogen.
CSIRT/Linux System Administrator

_______________________________________________
Fun and Misc security discussion for OT posts.
https://linuxbox.org/cgi-bin/mailman/listinfo/funsec
Note: funsec is a public and open mailing list.

