Interesting People mailing list archives

IP: SIMSON SAYS: An End to Spam With SpamAssassin


From: Dave Farber <dave () farber net>
Date: Thu, 30 May 2002 10:56:53 -0400


------ Forwarded Message
From: "the terminal of Geoff Goodfellow" <geoff () iconia com>
Date: Thu, 30 May 2002 13:58:15 +0200
To: "Dave E-mail Pamphleteer Farber" <farber () cis upenn edu>
Subject: SIMSON SAYS: An End to Spam With SpamAssassin

SIMSON SAYS: An End to Spam With SpamAssassin
Simson L. Garfinkel

[This column is the first in a series of columns that Simson L.
Garfinkel will be syndicating this summer. Previously, the "SIMSON SAYS"
columns ran in the Boston Globe from June 1995 through April 2000. I've
been unable to place a new tech column, so I've decided to try
self-syndicating a few columns and see what happens. ]


Earlier this year my email inbox was overflowing with spam --- junk
email advertising everything from bolts made in China to pornographic
websites. Although it seems hard to believe now, I was actually getting
more than 70 pieces of spam every day. There was so much spam, in fact,
that I had given up reading messages sent to an email address that I had
used since 1995. And because a few business associates didn't know that
I had stopped using that old email address, the decision ended up
costing me thousands of dollars in missed opportunities.

Spam is not democratic: some people get hardly any, while others get
tons. If you post messages to popular mailing lists or put your email
address on web pages, you dramatically increase the chances that you'll
get a lot of spam. You can also get a lot of spam if you simply have an
email address that's predictable --- an address that a spammer might
reasonably guess, like frank () aol com. I get a lot of spam because my
email address has been widely published on web pages and, even worse, in
online directories.

All of that spam is now in my past: today my inbox is virtually spam free.
Even better, I've been able to reclaim that old email account. Of
course, the spammers haven't stopped sending me their missives. But now
that mail is being filtered out by an ingenious piece of software called
SpamAssassin.

In the past 45 days, SpamAssassin has removed 3357 messages from my
inbox and put them in a separate box called "Spam," where I'm free to
either ignore them or review them at my leisure. This is a service for
which I would have happily paid. As it turns out, there's no need:
unlike other anti-spam systems out there today, SpamAssassin is free.

The underlying SpamAssassin technology was invented in April 2001 by
Justin Mason, an Irish computer programmer living in Australia. Mason
created a rule-based system that scores email messages according to a
variety of rules. For example, an invalid time zone in the header gives
an email message 2 points; a subject that is all capital letters gives
the message another 2 points; and a link at the bottom of the message
with the word "remove" in it gives the message 4.1 points. Any message
with more than 5 points total is considered spam.
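
The scoring scheme described above can be sketched in a few lines of
Python. This is only an illustration, not SpamAssassin's own code: the
three scores and the 5-point threshold come from the column, while the
function names and the way each rule is detected are assumptions made
for the example.

```python
import re

THRESHOLD = 5.0  # "more than 5 points total is considered spam"


def invalid_time_zone(headers, body):
    # Assumption: a valid Date header ends in a numeric offset such as -0400,
    # with hours up to 14 and minutes below 60. A missing or named zone
    # (e.g. "GMT") is treated as invalid in this simplified check.
    match = re.search(r"[+-](\d{2})(\d{2})\s*$", headers.get("Date", ""))
    if not match:
        return True
    hours, minutes = int(match.group(1)), int(match.group(2))
    return hours > 14 or minutes > 59


def all_caps_subject(headers, body):
    subject = headers.get("Subject", "")
    return any(c.isalpha() for c in subject) and subject == subject.upper()


def remove_link_at_bottom(headers, body):
    # Assumption: "the bottom" means the last five lines of the body.
    tail = "\n".join(body.splitlines()[-5:])
    return re.search(r"(?:https?://|mailto:)\S*remove", tail, re.IGNORECASE) is not None


# (rule, points) pairs; the point values are the ones quoted in the column.
RULES = [
    (invalid_time_zone, 2.0),
    (all_caps_subject, 2.0),
    (remove_link_at_bottom, 4.1),
]


def score_message(headers, body):
    # Add up the points of every rule that fires on this message.
    return sum(points for rule, points in RULES if rule(headers, body))


def is_spam(headers, body):
    return score_message(headers, body) > THRESHOLD
```

A message with an all-capitals subject and a "remove" link would trip
two rules here and land above the threshold, while either rule alone
would not.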

Mason's spam-detection engine was incredibly accurate. Unfortunately, it
was also quite slow, sometimes taking more than 10 seconds on each
message that it attempted to identify. Fortunately Mason published his
program on the Internet for anyone to use. Six months later a programmer
in California named Craig Hughes came up with a trick for making
SpamAssassin run dramatically faster.

Since then, SpamAssassin has steadily grown in popularity. According to
Hughes, more than 11,000 copies of the program were downloaded this past
April. "People have downloaded it from addresses at IBM, RedHat,
TicketMaster, Yahoo, FedEx, Amazon, Salon, Sun, Informix, Ikea, Nortel,
Cisco, AIG, Dell, Apple, and Network Solutions, among thousands of
others," says Hughes, who is now one of the volunteers coordinating the
project.

Today SpamAssassin has more than 300 rules and a dictionary of 10,000
phrases it uses for spam detection. SpamAssassin also hooks in to
several anti-spam networks, including the Mail Abuse Prevention System,
better known as MAPS, and Vipul's Razor.

MAPS is a simple blacklist of companies or Internet Service Providers
that have been caught sending spam in the past. The service, which
carries a subscription fee, has been the target of criticism and the
occasional lawsuit in the past. That's because once an organization has
been added to the MAPS blacklist, it suddenly finds that there are
thousands of ISPs who will no longer accept its email.

Vipul's Razor applies an approach called "collaborative filtering" to
the task of fighting spam. Developed by Vipul Ved Prakash, another
California-based programmer, Razor relies on a technique for
fingerprinting email messages and a network of volunteers around the
world who report spam the instant they receive it.

Reporting spam is easier than you might imagine: many ISPs lose between
10% and 30% of their customers every year. (One of the leading reasons
for this churn, apparently, is that the customers are getting too much
spam!) After an account is turned off for six or twelve months, some
ISPs turn the accounts back on and point them at the Razor reporting
network. These email addresses become, in effect, spam traps. Any email
message that gets sent to them is automatically fingerprinted and
reported as spam.

"Spam is email broadcast, so everyone on the recipient list gets the
same spam message," says Prakash. "If the first receiver shares the
information identifying the contents of spam with the rest of the
intended recipients, they could refuse to accept the message before it
hits their mailbox. That's the basic idea behind Vipul's Razor. Given
enough identifiers, every spam attack is surmountable."
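
The idea Prakash describes can be sketched as a shared set of message
fingerprints. The column does not say how Razor actually computes its
fingerprints or how its network is organized, so everything below is
an assumption for illustration: a SHA-1 hash of the whitespace- and
case-normalized body stands in for the fingerprint, and an in-memory
set (the hypothetical ReportNetwork class) stands in for the reporting
network.

```python
import hashlib


def fingerprint(body):
    # Assumption: collapse whitespace and case so trivially reformatted
    # copies of the same message produce the same fingerprint.
    normalized = " ".join(body.lower().split())
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


class ReportNetwork:
    """Hypothetical stand-in for a shared spam-reporting service."""

    def __init__(self):
        self._reported = set()

    def report(self, body):
        # Called by the first receiver (or a spam-trap address).
        self._reported.add(fingerprint(body))

    def is_known_spam(self, body):
        # Called by later recipients before accepting the message.
        return fingerprint(body) in self._reported
```

Once one recipient calls report() on a message, every later recipient
who checks the same text with is_known_spam() gets a match. An exact
hash like this one would miss spam that is personalized per recipient.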

SpamAssassin doesn't use either MAPS or the Razor network as
all-or-nothing tests; instead, the scores from these systems are merely
added to the scores from SpamAssassin's other rules. This limits the damage that occurs
when an entire ISP gets blacklisted by MAPS for one or two bad
customers --- or when a mail message for a popular mailing list gets
erroneously sent to the Razor network.
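
A small sketch shows why adding scores is gentler than an
all-or-nothing test. The 5-point threshold is the one given earlier in
the column; the two network weights are hypothetical values chosen for
the example, since the column does not say how many points a MAPS
listing or a Razor match is worth.

```python
THRESHOLD = 5.0        # from the column
BLACKLIST_SCORE = 3.0  # hypothetical weight for a MAPS listing
RAZOR_SCORE = 3.0      # hypothetical weight for a Razor match


def combined_score(rule_score, sender_blacklisted, reported_to_razor):
    # Network verdicts are just more points on top of the content rules.
    total = rule_score
    if sender_blacklisted:
        total += BLACKLIST_SCORE
    if reported_to_razor:
        total += RAZOR_SCORE
    return total


def is_spam(rule_score, sender_blacklisted, reported_to_razor):
    return combined_score(rule_score, sender_blacklisted, reported_to_razor) > THRESHOLD
```

With these weights, an otherwise clean message from a blacklisted ISP
scores 3.0 and is still delivered; it takes a second signal to push it
over the threshold.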

Occasionally SpamAssassin makes mistakes. Last week, for example, I
missed some messages from a mailing list that I'm on because
SpamAssassin misidentified them and put them into my "spam" box.
Once I realized the problem, all I had to do was add the sender of
those mail messages to my "whitelist." Now, when SpamAssassin sees those
messages, it will pass them through without delay.
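
The effect of a whitelist can be sketched as a check that runs before
the score is consulted. The column does not describe how SpamAssassin
implements its whitelist, so this is an assumption: a set of sender
addresses, compared case-insensitively, and a hypothetical deliver_to
function that picks the mailbox.

```python
def deliver_to(sender, score, whitelist, threshold=5.0):
    # Whitelisted senders go straight to the inbox, whatever the score.
    if sender.lower() in whitelist:
        return "inbox"
    return "spam" if score > threshold else "inbox"
```

For example, deliver_to("list@example.org", 9.0, {"list@example.org"})
returns "inbox" even though 9.0 is well above the threshold.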

Despite the minor mishap, I've become a SpamAssassin evangelist. One
recent convert: University of Pennsylvania professor David Farber, who
runs an influential mailing list and spent a year being the Chief
Technologist at the Federal Communications Commission. As you can
imagine, Farber gets a ton of spam --- or at least he did, before he
turned on SpamAssassin. Today he hardly gets any. "The spam stuff works
like a charm," he told me in an email message.

Unfortunately, there is one catch with SpamAssassin: it only runs on
UNIX-based email systems. If you are a typical home computer user who
downloads your email from an Internet Service Provider, you can't run
SpamAssassin --- you need to have your ISP run it for you. Many ISPs
have in fact started to do so. If your ISP has not, drop them a note.
Meanwhile, Hughes and a few of his compatriots are working on a
commercial version of SpamAssassin that will run on Windows and cost
under $30.

"It's only recently that end-users have become concerned with spam
levels --- system administrators have been concerned for much longer,"
says Hughes, noting Hotmail and other ISPs are now receiving between 4
and 20 pieces of spam mail for every genuine email message.

============
Simson L. Garfinkel is a journalist, computer columnist, and the author
of 11 books. His book Web Security, Privacy and Commerce was published
last November by O'Reilly & Associates. Garfinkel is the part owner of
Vineyard.NET, a small Internet Service Provider that serves the island
of Martha's Vineyard.

More information about SpamAssassin can be found at
http://www.spamassassin.org/

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
geoff.goodfellow () iconia com, Prague CZ * tel/mobil +420 (0)603 706 558
"success is getting what you want & happiness is wanting what you get"
http://www.nytimes.com/library/tech/99/01/biztech/articles/17drop.html
 


------ End of Forwarded Message

For archives see:
http://www.interesting-people.org/archives/interesting-people/

