funsec mailing list archives

Re: Anti-Virus Testing and Consumer Reports


From: David Dagon <dagon () cc gatech edu>
Date: Wed, 30 Aug 2006 09:53:48 -0400

On Wed, Aug 30, 2006 at 02:14:58AM +0100, Drsolly wrote:

> I'm really surprised that neither Paul nor David knew that this
> repository already exists, and is shared by the AV vendors, on a
> vetted basis.

There are many such repositories, but that's another thread
altogether.  I've mentioned only a few of them in my public talks on
the new repository.  This thread reminds me to do a more complete job
of explaining that we do not "compete" with other repositories.  (I'm
worried that such a perception will be a starting point for
discussions.)

> So how is this new wheel different from all the other wheels?

   1.  I think the key term in Drsolly's post is "vetted".  I've been
through numerous such vettings, many of which decayed into suggestions
that I or my fellow researchers sign work-for-hire agreements, 15-page
one-way NDAs, or IP waivers, all in exchange for a handful of samples.
In some cases, people in academia hold their noses, and sign such
agreements.  In other cases they write academic papers using just the
test samples they find in their own mbox--strong on theory, and weak
on testing.  I've found that academic researchers often get
more/better samples from the blackhats themselves.  (Surely, that's a
symptom of a problem.)  I've even met academics who cannot keep
samples of viruses anywhere in their (non-networked) labs, because of
zealous university policies.

Vetting creates members and non-members, and therefore creates an
information differential.  Inevitably, the repository will create its
own differential.  We'll get it right, hopefully.  But we'll probably
erroneously exclude someone for being seven degrees away from Kevin
Bacon/$SOME_PKI_NODE, instead of the requisite six degrees we require.
(Hopefully, the cornucopia of repositories means that such
individuals will eventually get the data they need for their
research.)

I'll argue that our new wheel is "rounder than the old wheels",
because:

  a) The repository will not sell AV samples/feeds.  Some
     security companies do this; we will not.

  b) To make sure we don't buckle on point a, we will have no
     commercial interest in the collection.  (The entity taking
     ownership of the submitted samples is a 501(c)(3) non-profit
     organization.)  All the code used by the repository will have a
     BSD-style license.  (We of course do not want
     $RANDOM_VIRUS_WRITER to use these resources, however, as is
     the unavoidable case with many commercial AV offerings.)

  c) Users are encouraged, but not required, to share their own
     samples.  Unlike many repositories, there are no up/down
     quotas.

  d) We will take great pains to NOT have our honeyfarms and
     sensors illuminated.  Blackhats have their own (highly
     sophisticated) IP reputation system, and know where many
     of the world's honey* collection systems are found.
     (E.g., mazafaka had listed many security companies' sensors.
     The problem is *so* extensive that even academics are writing
     papers on this topic. Surely that's another symptom
     of a problem!)
    
There are many repositories that embrace one or two of these points.
Some of these points are discrete characteristics (e.g., a and b);
some are continuous (perhaps c, and of course d).  We'll make our own,
different repository, because such a thing doesn't exist, but should.
At the very least, we'll serve a different community.  And of course,
all the existing repositories are welcome to obtain feeds and samples.

   2.  A second difference is the service-oriented nature of the
repository.  We'll analyze and unpack samples, and let (properly
vetted) members use the analysis.  (In a paper to appear at the
upcoming ACSAC conference, we've found a tremendous lift in AV
detection.  I.e., AV tools green-light the packed samples, but can
recognize malware in the unpacked versions.)

  The intent here is to equalize another information differential.
Often, operations engineers are quite capable of remediating (even
new) malware, but lack the time/experience needed to efficiently
analyze (often malformed) PE32 executables.  I've had many MX
engineers say, essentially: "If only I could run 'strings' on the
malware, I could delay or stop most of my users from having their
private data mailed to a random blackhat's {g,hot,yahoo,goowy,AIM,
care2,lycos}-mail account".  This information differential is not
completely solved by the offering of whitepapers and writeups (many of
which, e.g., Joe's, are quite excellent).
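
As a rough illustration of the "run 'strings' on the malware" idea
above, here is a minimal Python sketch.  It is not the repository's
code: the function names, the minimum run length of 6, and the loose
e-mail pattern are all assumptions made for the example.  It also only
helps on unpacked samples, since packed binaries hide their strings.

```python
import re

# Runs of 6 or more printable ASCII bytes, similar in spirit to the
# Unix `strings` tool (the length threshold is an arbitrary assumption).
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{6,}")

# Deliberately loose e-mail pattern; an illustration, not an RFC 5322 parser.
EMAIL = re.compile(rb"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_strings(data: bytes) -> list[str]:
    """Return every printable ASCII run found in a binary blob."""
    return [m.group().decode("ascii") for m in PRINTABLE_RUN.finditer(data)]


def extract_emails(data: bytes) -> list[str]:
    """Return unique e-mail-like strings, in order of first appearance."""
    found: list[str] = []
    for run in PRINTABLE_RUN.finditer(data):
        for m in EMAIL.finditer(run.group()):
            addr = m.group().decode("ascii")
            if addr not in found:
                found.append(addr)
    return found


if __name__ == "__main__":
    # Hypothetical bytes standing in for an unpacked sample.
    sample = (b"\x00\x01MZ\x90\x00mailto:drop@example.com\x00\xff"
              b"abc\x00smtp.example.net\x00")
    print(extract_strings(sample))
    print(extract_emails(sample))
```

An MX engineer could feed the resulting addresses into whatever mail
filtering is already in place; that step is site-specific and is left
out here.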

  By offering the malware in a canonical form (and providing other
analysis), we will provide information feeds that are operationally
relevant.  With any luck, this will facilitate the type of operational
self-help that may result in new innovations.  (E.g., give a
snort/bro wizard a list of malicious domains, and watch what he does.
Give an MX guru a set of bogus e-mail accounts, and watch her save the
day.  Whatever you do, don't tell them the solution is to call a sales
engineer.)  This can only happen if these communities have access to
information and analysis.  Likely, some ops people do get the
information they need (perhaps indirectly) from existing repositories.
Others do not, and I usually only meet the latter.

                         *     *     *

There's a public library in my town, and it probably has many of the
books Drsolly reads at the library in his town.  If I had my way,
Drsolly would be a welcomed guest in my town's library.  But he
doesn't pay taxes here, and I'll bet neither of us knows the right
people to get him in the door.  Worse still, the librarians have never
heard of Drsolly or any of his network of friends.  It would be even
more hopeless if I visited his library.

I think both libraries should exist, and that we need to build more,
since they all serve different communities.  The new library we're
building will consider and welcome those who could get borrowing
privileges at existing libraries.  (We're human, so we'll also end up
turning away honest researchers.  I'll try not to.)  My solution to
this problem is to build more libraries.

I take Drsolly's point to be a reminder that I need to properly
acknowledge the other good work that people are doing.

-- 
David Dagon              /"\                          "When cryptography
dagon () cc gatech edu      \ /  ASCII RIBBON CAMPAIGN    is outlawed, bayl
Ph.D. Student             X     AGAINST HTML MAIL      bhgynjf jvyy unir
Georgia Inst. of Tech.   / \                           cevinpl."
_______________________________________________
Fun and Misc security discussion for OT posts.
https://linuxbox.org/cgi-bin/mailman/listinfo/funsec
Note: funsec is a public and open mailing list.