Interesting People mailing list archives

Reflections on the 25th anniversary of Spam

From: Dave Farber <dave () farber net>
Date: Mon, 28 Apr 2003 16:51:57 -0400

------ Forwarded Message
From: Brad Templeton <brad () templetons com>
Organization: http://www.templetons.com/brad
Date: Mon, 28 Apr 2003 11:43:59 -0700
To: Dave Farber <dave () farber net>
Subject: Reflections on the 25th anniversary of Spam


Dave...

This Saturday marks the 25th anniversary of the first Spam I have found in
my research.   I wrote up an article with reflections on it.   There is
a brief mention of you.  Also note the recently passed 10th anniversary of
the first time the word Spam was used to refer to a USENET mass posting.

There are links, so it is perhaps best read at:
    
    http://www.templetons.com/brad/spam/spam25.html

However, I include text below



              Reflections on the 25th Anniversary of Spam


While many only encountered spam (junk e-mail or junk newsgroup
postings) in the mid 1990s, my research has found it goes back much
further than that.

In fact, the earliest documented junk e-mailing I've uncovered was sent
May 3, 1978 -- 25 years ago this Saturday.  And in a surprising
coincidence, just a month ago marked the 10th anniversary of March 31,
1993, the first time a USENET posting got named a spam.

I learned of that first spam through a report from Einar Stefferud who
read {link /brad/spamterm.html}{ltext a history I prepared of the term
"spam"} and how the name of the canned ham became our name for junk
e-mail.  I had originally set out to research the history of the term,
but it became impossible not to research a bit of the history of the
act.

That first spam was sent by a salesman for DEC - Digital Equipment
Corporation.  Today, you may not know DEC, since it was bought by
Compaq and is now a unit of HP, but in those days it was the leading
minicomputer maker, and its computers provided the platform for the
development of Unix, C and much of the internet, to cite just a few
minor events.

By 1978 the Arpanet (as the internet was then known) had already
provided network E-mail to a large number of folks at universities and
government institutions for over 6 years.  E-mail was
the biggest source of traffic on the Arpanet.  A few years prior, Dave
Farber had created "MsgGroup," the first network mailing list.  (Though
Plato and other timesharing systems had laid the foundations for online
community and conferencing some years before that.)

The DEC salesman, identified only as "THUERK at DEC-MARLBORO" (there
were no dots or dot-coms in those days, and the at-sign was often
spelled out), decided to send a notice to everybody on the ARPANET on
the west coast.  It trumpeted an open house to show off new models of
the DEC-20 computer, a hot minicomputer of the day.

This was a spam, though the term would not be used to refer to it for
another 15 years.  The spammer didn't do a very good job.  He (I
presume his sex as male) simply typed addresses into his mail program,
or possibly included them from a file.  The mail program would only
take 320 addresses.  The rest got simply shoved into the top of the
body of the message.

As you can guess there was quite a response, with -- not unusually --
far more volume of debate than actual spam.  It's amusing to see that
one future celebrity -- a young free software guru Richard Stallman --
at first wondered why people were so upset about the message.  He later
said the mistaken placement of all the addresses into the body did
bother him, but he gets the dubious honour of being perhaps the first
spam defender.  Of course like all of us he was 25 years younger and
the problem was brand new.

In those days the Arpanet had an official "acceptable use policy" which
limited it to use in support of research and education.  So this
message was a pretty clear violation, and one presumes the salesman was
given a fairly stern education on the matter.  The policy was well
enough known over time that we would not see significant spam for many
years to come after that.

                         More detailed history


You can read my {link /brad/spamterm.html}{ltext history of the term
spam} and how it came to mean abuse of the net.

You can also just go directly to {link /brad/spamreact.html}{ltext the
spam and the reaction to it} or even {link /brad/spamreact.html#rms}{ltext
Stallman's defence} of it.

This site contains a number of essays on the spam problem, which I have
been studying for many years, trying to find solutions which don't
destroy the core values embodied in the mail system.  In spite of what
some may feel, we wanted an extremely cheap e-mail system where anybody
could mail anybody, which protected anonymous communication and
fostered values like free speech and the ability to do unsolicited
communication.  Those are not bugs, so fighting spam while keeping
those values, along with other core social goals, is a delicate task.

You can read my {link endspam.html}{ltext current best plan to end
spam} if your interests lie that way.  Other essays can be found at my
{link /brad/spam/}{ltext spam essay page.}

                        Escalation of the battle


Spam fascinates me because it sits at the intersection of three
important rights -- free speech, private property and privacy.  It's
also the first major internet governance issue (possibly in tandem with
DNS) that the members of the internet community have been so deeply
concerned with.

The reaction to it has been remarkable.  By attacking something we hold
dear, and goading us by using our own tools and resources to do it,
spam generates emotion far beyond its actual harm, even though that
actual harm is quite considerable.

Spam pushes people who would proudly (and correctly) trumpet how we
shouldn't blame ISPs for offensive web sites, copyright violations
and/or MP3 trading done by downstream customers to suddenly call for
blacklisting of all the innocent users at an ISP if a spammer is to be
found among them.  People who would defend the end-to-end principle of
internet design eagerly hunt for mechanisms of centralized control to
stop it.  Those who would never agree with punishing the innocent to
find the guilty in any other field happily advocate it to stop spam.
Some conclude even entire nations must be blacklisted from sending
E-mail.  Onetime defenders of an open net with anonymous participation
call for authentication certificates on every E-mail.  Former champions
of flat-fee unlimited net access who railed against proposals for
per-packet internet pricing propose per-message usage fees on E-mail.
On USENET, where the idea of canceling another's article to retroactively
moderate a group was highly reviled, people now find they couldn't use
the net without it.  Those who railed at any attempt by the government
to regulate internet traffic loudly petition their legislators
for some law, any law it almost seems, against spam.  Software
engineers who would be fired for building a system that drops traffic
on the floor without reporting the error change their mail systems to
silently discard mail after mail.

It's amazing.

Dozens of anti-spam companies have sprung up in the past few years,
offering a range of solutions including content-based filtering,
blacklists, collaborative filtering, spamtrap detection and removal,
e-stamps and some bulk detection.  Remarkably, one new company called
Habeas (trying an {link
http://groups.google.com/groups?q=g:thl3346077286d&dq=&hl=en&lr=&ie=UTF-8&oe=UTF-8&safe=off&selm=5an2om%24f61%40fugue.clari.net}{ltext
old idea of mine}) is selling not a spam-blocking service, but a magic
trademarked term that will let legitimate mailing list owners get past
the collateral damage caused by existing spam-blocking tools.  _Their
product is to get you past the spam filters._

Attempts to nail down a definition of spam seem to always end in
quagmire.  Each party to the debate seems determined to make sure that
everything they think is spam be included in the definition, lest one
spam slip through, but of course also keen that nothing they don't
think is spam be blocked.  Reconciliation seems near impossible.

                             The solutions


Here's a brief summary of some of the current active methods and
proposals and how effective they are.

    Content-based filters

Filters have a big advantage because they only need to be installed at
the receiver.  Some of the latest filtering tools, like SpamAssassin
and the latest Bayesian algorithms are doing quite well in terms of the
amount of spam they stop.  However, they all have "false positives"
which means they falsely identify real mail as spam, and block it.  Most
filters have no way to identify that mail was sent in bulk (the core
requirement to spot a spam) and thus must rely on finding common
patterns used by spammers.

The hand-tuned filters need regular updating by people.  The learning
filters adjust automatically but only by letting some spam through.

In terms of effectiveness, these are second only to challenge/response
tools.
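
As a rough illustration of a learning filter, here is a minimal naive
Bayes word scorer in Python.  It is a sketch only, not how SpamAssassin
or any particular product works: the class name TinyBayesFilter, the
training messages and the add-one smoothing are all assumptions made
for the example.

```python
import math
import re
from collections import Counter


class TinyBayesFilter:
    """Toy naive Bayes spam scorer (hypothetical, for illustration only)."""

    def __init__(self):
        # Number of training messages of each kind that contain a word.
        self.words = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    @staticmethod
    def tokens(text):
        # Each distinct word counts once per message.
        return set(re.findall(r"[a-z']+", text.lower()))

    def train(self, text, label):
        self.messages[label] += 1
        self.words[label].update(self.tokens(text))

    def spam_probability(self, text):
        # Work in log-odds, starting from the (smoothed) prior.
        log_odds = math.log((self.messages["spam"] + 1) /
                            (self.messages["ham"] + 1))
        for word in self.tokens(text):
            # Add-one smoothing keeps unseen words from zeroing things out.
            p_spam = (self.words["spam"][word] + 1) / (self.messages["spam"] + 2)
            p_ham = (self.words["ham"][word] + 1) / (self.messages["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        log_odds = max(-30.0, min(30.0, log_odds))  # avoid overflow in exp()
        return 1 / (1 + math.exp(-log_odds))


spam_filter = TinyBayesFilter()
spam_filter.train("cheap pills buy now", "spam")
spam_filter.train("buy cheap watches now", "spam")
spam_filter.train("lunch meeting moved to noon", "ham")
spam_filter.train("draft of the paper attached", "ham")

looks_spammy = spam_filter.spam_probability("buy cheap pills")
looks_real = spam_filter.spam_probability("meeting about the paper")
# A legitimate note that happens to share words with the spam:
false_positive = spam_filter.spam_probability("can you buy cheap tickets now")
```

With this tiny training set the last message scores above 0.5 even
though a friend might well have sent it, which is the false positive
problem in miniature.  Note also that nothing in the scorer knows
whether a message was sent in bulk.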

    Blacklists

There are many competing blacklists, some of strong ethics, others more
dubious.  Nonetheless all rely on blocking mail from accused or
confirmed spammers, with debate over the standards of proof and the
definitions of spam.  Some have gone so far as to blacklist entire ISPs
or even nations.  Almost everybody who runs a mail server, it seems,
has a story about getting on a blacklist and having to figure out how
to get off, if they were able to.

Blacklists certainly do scare ISPs, and the blacklisting of open relay
servers has, over the course of many years, done a lot to get people to
close up their relays (at the cost of making it harder for roaming
users to send E-mail).

    Collaborative filters

These filters, such as Vipul's Razor (now via CloudMark) rely on the
first poor souls who get a spam reporting it to a central server.  As
the reports come in, the spam can be identified and rules can be
written to block it.  These are reasonably effective, and go after
bulk, which is good.  They have fewer false positives if done well.
They are very similar to...

    Spamtrap filters

These are primarily used by Brightmail Inc., which is probably the
largest commercial anti-spam operation.  Brightmail maintains huge
numbers of addresses seeded onto spammer lists.  When messages arrive,
they are almost surely spam, and human beings look at them to derive
rules to filter out and retroactively delete the messages.  Very few
false positives, but unfortunately reportedly only about 60-70%
effective.
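
Collaborative and spamtrap filters share one mechanism: reports of the
same message collect at a central point, and once enough arrive the
message is treated as spam.  The Python sketch below shows that idea
under loud assumptions.  The ReportServer class, the threshold and the
exact-match SHA-256 fingerprint are hypothetical; real systems use
fuzzier signatures, since an exact hash is defeated by changing a
single character.

```python
import hashlib
from collections import defaultdict


class ReportServer:
    """Toy central server that counts distinct reporters per message."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        # fingerprint -> set of reporter ids (users or spamtrap addresses)
        self.reporters = defaultdict(set)

    @staticmethod
    def fingerprint(body):
        # Assumption: normalise case and whitespace, then hash exactly.
        normalized = " ".join(body.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def report(self, reporter, body):
        self.reporters[self.fingerprint(body)].add(reporter)

    def is_spam(self, body):
        seen_by = self.reporters.get(self.fingerprint(body), ())
        return len(seen_by) >= self.threshold


server = ReportServer(threshold=2)
server.report("alice", "Buy   NOW")
server.report("alice", "buy now")      # same reporter, counted once
first_check = server.is_spam("buy now")
server.report("bob", "buy now")
second_check = server.is_spam("BUY now")
```

Counting distinct reporters rather than raw reports is a design choice
in this sketch: it keeps one annoyed user from condemning a message
alone.  A spamtrap address is simply a reporter that never receives
legitimate mail.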

    Challenge/Response

Dear to my heart because to the best of my knowledge, I wrote the first
of these, a never-productized program called {link
http://www.templetons.com/tech/vikuser.html}{ltext Viking-12}.  These
tools know all your existing contacts, and when they receive mail from
a new correspondent, they send out a "challenge" E-mail that asks the
mailer to do something to confirm they are a real human being and not a
spammer.  When they do, the held mail is automatically delivered and
they are on the good list from then on.

These tools are extremely effective; only a few spammers ever respond
to the challenges.  However, for various reasons some legitimate
correspondents also don't respond, so it is necessary to browse the
list of messages that never got a response to quickly search for the
real messages.  However, they are few, and they usually have low spam
scores when used in combination with filtering tools.  This can get the
false positive rate extremely low.

Challenge/response without scanning the non-respondents blocks
anonymous mail.

Today several companies offer them, and there are free software
projects like TMDA which perform this function.  A number of research
projects have developed what could be termed "Turing tests" for the
challenges, to assure that the respondent is a human being.
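
The challenge/response flow described above can be sketched in a few
lines of Python.  This is not Viking-12 or TMDA; the class and method
names are hypothetical, the "challenge" is just a random token, and
actually sending the challenge e-mail is left out.

```python
import secrets


class ChallengeResponseInbox:
    """Toy challenge/response mailbox (illustrative assumptions only)."""

    def __init__(self, known_senders):
        self.whitelist = set(known_senders)
        self.held = {}        # token -> (sender, message) awaiting a reply
        self.delivered = []   # (sender, message) pairs shown to the user

    def receive(self, sender, message):
        """Deliver mail from known senders; hold the rest.

        Returns None if delivered, else the challenge token that a real
        tool would e-mail back to the sender.
        """
        if sender in self.whitelist:
            self.delivered.append((sender, message))
            return None
        token = secrets.token_hex(8)
        self.held[token] = (sender, message)
        return token

    def confirm(self, token):
        """Sender answered the challenge: release the mail, remember them."""
        entry = self.held.pop(token, None)
        if entry is None:
            return False
        sender, message = entry
        self.whitelist.add(sender)
        self.delivered.append((sender, message))
        return True

    def unanswered(self):
        # The list a user should skim for real mail that never responded.
        return list(self.held.values())


inbox = ChallengeResponseInbox({"dave@example.org"})
inbox.receive("dave@example.org", "hi")             # known: delivered
token = inbox.receive("new@example.org", "hello")   # unknown: held
inbox.confirm(token)                                # now on the good list
inbox.receive("new@example.org", "again")           # delivered directly
inbox.receive("bulk@example.com", "buy now")        # never answers
```

The unanswered() list is the part that matters for anonymous mail: if
nobody ever looks at it, anything from a sender who cannot or will not
respond is effectively blocked.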

    ISP bulk detection

A number of large ISPs, AOL in particular, have their own spam
detectors which rely on the fact that due to their size, they have so
many addresses that any spam attack is sure to arrive multiple times.
They can thus detect these and get rid of them.  A good approach, but
past history shows some nasty false positives, with ordinary mailing
lists being blocked for their volume.  One notorious case involved AOL
blocking acceptance letters from Harvard, which were sent out as a
highly desired mass mailing.

This is a worthwhile technique but needs to be done with more care.
Today's collateral damage is too high.
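
To show where the collateral damage comes from, here is a hedged Python
sketch of bulk detection at a large receiver.  It does not describe
AOL's or anyone else's real system.  The BulkDetector class, the
threshold and the trusted-sender list are assumptions; the point is
only that volume alone cannot tell a spam run from a wanted mass
mailing.

```python
import hashlib
from collections import defaultdict


class BulkDetector:
    """Toy detector: flag a body once it reaches many local mailboxes."""

    def __init__(self, threshold=100, trusted_senders=()):
        self.threshold = threshold
        self.trusted = set(trusted_senders)   # hypothetical exemption list
        self.recipients = defaultdict(set)    # body hash -> local recipients

    def observe(self, sender, recipient, body):
        """Record one delivery; return True if it now looks like bulk."""
        if sender in self.trusted:
            return False
        key = hashlib.sha256(body.encode("utf-8")).hexdigest()
        self.recipients[key].add(recipient)
        return len(self.recipients[key]) >= self.threshold


detector = BulkDetector(threshold=3, trusted_senders={"list@example.edu"})
spam_run = [detector.observe("x@example.com", f"user{i}@isp.example", "offer")
            for i in range(3)]
trusted_run = [detector.observe("list@example.edu", f"user{i}@isp.example",
                                "You are admitted") for i in range(10)]
untrusted_run = [detector.observe("admit@example.edu", f"user{i}@isp.example",
                                  "You are admitted") for i in range(3)]
```

The last run is the false positive: a desired mass mailing from a
sender nobody thought to exempt trips the same threshold as the spam.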

    Spam-banning laws

There have been many proposed anti-spam laws, and indeed around half of
all U.S. states have such laws -- California has two!  While most of
these state laws will eventually be declared unconstitutional since it
is important that states not have the power to regulate something as
geography independent as E-mail, what can't be disputed is that they
are having essentially no effect.  There have been very few
prosecutions under them, and spam levels continue to increase
tremendously.  Some hold out more hope for a U.S. federal law, but
an increasing percentage of spam comes from overseas.  Advocates hope
that even overseas spam can be stopped by a federal law if a U.S.
connection can be found.  Fellow EFF board member Larry Lessig
advocates that a law which pays a bounty to those who hunt down the
U.S.  connection on any spam without a mandatory label could do the
trick.

    Torts

There's been a fair bit of success against big institutional spammers
in tort law.  AOL and other companies have sued spamming companies
using a variety of torts to shut them down.  So far, alas, like
Whack-a-moles, other spammers keep coming up.  However, there have also
been disturbing trends in the tort area.  For example, Intel has sued
an ex-employee who spammed Intel's entire employee base with his
grievances against the company using a legal doctrine called "trespass
to chattels."  Unfortunately, the consequences of declaring E-mail to
be trespass are even nastier than spam.

A large number of spams are already illegal, of course, amounting to
confidence tricks or illegal selling of prescription drugs.  Some of
those laws are being used against the spammers too.

    Opt-out lists

Recently, a federal do-not-call list was instituted in the USA to stop
phone spam.  Unfortunately, {link globout.html}{ltext doing the same
for E-mail is difficult} and faces the same problems all laws would.

    Hiding your address

The most common technique today seems to be hiding your E-mail address
so that it can't be harvested by spammers.  Unfortunately, by using
dictionary attacks, they are managing to spam people who have never
exposed their E-mail in public.  I consider this desire to never reveal
your E-mail one of the greatest damages done by spammers, so I don't
view hiding as a great solution to the problem.

    Vigilante attack

Some anti-spammers have resorted to harassment and even extra-legal
efforts against spammers.  They make a great tale to tell, but so far
do not seem to be stemming the tide.  And they have all sorts of nasty
backbite, since they amount to sinking to or below the level of the
enemy.

                        Up and coming solutions


    E-Stamps

This idea is regularly re-generated.  I first {link estamps.html}{ltext
proposed it myself back in 1995} and later came to reject it.  The idea
is to put some low (or routinely not collected) cost on sending an
E-mail that does not bother ordinary senders, but stops spam from being
cost-effective.  It has many advocates, and might work if it could be
universally adopted.  However, it suffers from a "you can't get there
from here" problem.  Until people are offering stamps with their
E-mail, you can't demand them, and they have little incentive to offer
them if nobody is demanding them.  This technique could only be built
by piggybacking on other techniques, such as doing challenge/response
and offering stamps as a means to bypass the challenge.

    Throttling bulk volume from unaccountable addresses

My current favoured proposal, detailed {link endspam.html}{ltext here}.

                             Authentication


A number of people on anti-spam lists propose putting an authentication
regime into E-mail, to the extent that one could refuse mail that was
not digitally signed or otherwise verifiable.  This would stop forging
return addresses or the use of non-existent return addresses.  A number
of laws also address this.

Such schemes unfortunately abandon the longtime goal of an open E-mail
system without central management (such as a certificate authority)
which allows anonymous speech.

                               The Future


The spam problem will get worse before it gets better.  Spammers will
try new and nastier techniques to get around the blockers, and the
blockers will try new and improved technologies.  Spammers are already
moving to even nastier techniques, such as using worm programs, or
exploiting windows in widely deployed software systems to take over
other people's machines and get them to do the spamming.  It is
rumoured that some spammers are using some of the wide number of open
wireless networks to drive up to a building and spam using the network
inside.  Such tactics can't be countered with blacklists, for example,
though they are fortunately highly illegal.

However the spam problem is solved, or partially solved, it will remain
fascinating as the internet community grapples with its first serious
abuse issue from within.  Most other abuse issues have involved
outsiders trying to regulate the net, ranging from the religious
conservatives trying to ban smut to the RIAA trying to stop file-sharing.
Spam has caused the network insiders themselves to seek to regulate it.

This is important because it will, of course, not be the last such
issue.  How we manage ourselves here will be an indicator of things to
come.

I hope that as we do this we will remember the principles that make
free societies free, and the principles upon which the internet was
built.  End-to-end, open designs.  The ability for anybody to
communicate with anybody, even without an invitation.  Ubiquitous,
deliberately low-cost communications that are not accounted for on a
packet-by-packet basis.

In addition, we must realize that though all internet traffic flows
over private property, this does not mean we should forget constitutional
principles like the U.S.  1st amendment.  As I view it, the 1st
amendment isn't just the law, it's a good idea.  We owe a duty to
preserve the values it contains -- and the long history of how to
protect them that is embodied in 1st amendment jurisprudence -- as we
architect the communications systems of the future.  For in building
and regulating the internet, we are doing no less than creating the
primary platform for speech and the press of the new century.

That is not a task to be taken lightly.


------ End of Forwarded Message
