funsec mailing list archives

Re: another VX site?


From: Nick FitzGerald <nick () virus-l demon co uk>
Date: Sun, 08 Jan 2006 15:51:56 +1300

Drsolly to dudevanwinkle () gmail com:

Ja, no offense to the AV industry, or Dr Solomon in general ;-) , but
attempting to come up with unique names for variants of 65,000 known
viri is kind of a hopeless task,

Pretty easy, actually. We already agreed a naming scheme that's a bit like 
the scientific system for naming flora and fauna, where the problem is 
much bigger. Read the CARO naming document. Google "CARO naming".

Yep, and it's an incredibly simple scheme -- for every variant of an 
existing named malware family, take the next unused sub-variant 
ascription from the .A ... .Z, .AA ... .ZZ, .AAA ... .ZZZ, .AAAA ... 
counting system and append that to the family name.  This way the need 
to devise "unique" names for each variant is trivially taken care of.
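
As a rough illustration of the counting system just described, here is a minimal Python sketch. The function names are invented for this example and are not part of the CARO scheme; the only thing taken from the text above is the .A ... .Z, .AA ... .ZZ, .AAA ... ordering.

```python
def variant_suffix(n: int) -> str:
    """Return the n-th sub-variant suffix: 1 -> 'A', 26 -> 'Z', 27 -> 'AA'."""
    if n < 1:
        raise ValueError("variant index must be >= 1")
    letters = []
    while n > 0:
        # Bijective base-26: there is no zero digit, so shift by one first.
        n, rem = divmod(n - 1, 26)
        letters.append(chr(ord("A") + rem))
    return "".join(reversed(letters))


def next_variant_name(family: str, known_variants: int) -> str:
    """Name the next variant of a family that already has known_variants members."""
    return f"{family}.{variant_suffix(known_variants + 1)}"


if __name__ == "__main__":
    # Hypothetical family with three variants already named (.A, .B, .C).
    print(next_variant_name("FooBar", 3))
```

This only covers one developer's own numbering; it does nothing about keeping suffixes or family names in step between developers.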

(Of course, if you want to synchronize those sub-variant ascriptions 
among different developers, you have a different problem, and if you 
want to synchronize the family names between different developers you 
have another slightly different problem again...)

and even if names were contrived, those
of us without the benefit of photographic memories would soon lose
track. 

I'm lucky enough to have a memory like a goldfish; that's why I use a
computer to remember stuff like this.

8-)

Actually, memorability _is_ one of the criteria that drives the design 
of the naming scheme.  Hard as it may be to keep the differences 
between FooBar.AAB and FooBar.ABA clear in your mind, that's much 
better (for nearly all humans) than trying to remember 
6F49434D7E4532520372A4721A7A9AEC and 3018E99857F31A59E0777396AE634A8F 
(those are the MD5s of the standard form of two very common
self-mailers, Netsky.D and Netsky.P respectively).
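
For anyone who wants to see why a digest makes a poor name, a small sketch follows. It assumes the malware is a static file whose bytes can be hashed whole; the sample byte strings are placeholders invented for the example, not real malware.

```python
import hashlib


def md5_name(data: bytes) -> str:
    """Return the uppercase hex MD5 digest of a static blob of bytes."""
    return hashlib.md5(data).hexdigest().upper()


if __name__ == "__main__":
    # Two placeholder "samples" that differ in a single byte.
    sample_one = b"placeholder sample body 1"
    sample_two = b"placeholder sample body 2"
    print(md5_name(sample_one))
    print(md5_name(sample_two))
```

The two digests are each 32 hex characters long and share no visible relationship, so unlike FooBar.AAB and FooBar.ABA they say nothing about the samples being closely related.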

Shoot even the AV industry has given up, calling everything
Sober, MyDoom and Klez.

Ah, you've spotted the familial-type naming system, whereby all the 
malware that's very similar to Sober is called Sober.something, which 
makes the naming system possible. 

I wasn't there, so I'd love to hear Alan's take on this if my 
assumptions are wrong, BUT familial grouping by code similarity was 
seen as an important feature of the naming scheme by _most_ AV 
researchers in the early days as grouping similar things in a 
classification system seems to be a natural process for humans and it 
helps reduce the potential overload of a classification system that 
does NOT have such a function.

I would suggest (as I would guess others have before) that we name the
viri by their md5sum or some such naming signature. Maybe if our
numbering scheme is successful (maybe an md5 of the malicious payload,
followed by the md5 of the exploit(s) it uses to propagate, followed by
the md5 of the "schlock", e.g. "greetz to my diapers") then we could even
have a DNS-esque scheme for mapping those nasty long numbers to nifty
short names based on autovariant detection. One would hope the viri DNS
system would base the naming convention on points of entry or payload
sections of viri rather than the schlock part.

I am assuming that this has already been discussed and dismissed, does
anyone know why?

To calculate an md5, you have to specify which bytes you're going to
include in the summation. If you think about viruses, ...

That is a very important thing to recall too -- when all this started, 
virtually ALL malware that was of interest to the nascent AV industry 
was _parasitically infectious_.  Nowadays probably 99+% of the malware 
_files_ handled by AV, IDS, etc, etc systems are static _by design_. 
They may still be viral -- what I call "monolithic replicators"; think 
network share crawlers, self-mailers, etc -- but this was an almost 
unseen category back when Alan, Vess and Frisk were cutting their teeth 
on Lehigh, Stoned, Jerusalem, etc, etc and talking about standardizing 
a naming scheme.  And that was a good thing, as the scheme we have 
today is flexible and extensible enough to fairly easily deal with the 
vagaries of malware development we have seen in the last 15+ years...

... for example, you'll
recollect that each instance of a virus-infected file, will have bytes in
the virus part that are variable, and depend on the conditions of the
computer at the moment of infection.

http://vx.netlux.org/lib/avb01.html (look under 4.Classification)

Two different analysts will come up with two different decisions as to
what to include, and what not to include. That's called a "virus map".

So, the AV industry would have to agree on which bytes to include, and
which to exclude, and to have this discussion, they have to start off by
being sure that everyone is referring to the same virus, which isn't as 
easy as you might think, since they're starting off without a way to 
exactly identify the virus.
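
To make the byte-selection problem concrete, here is a minimal Python sketch. It assumes a "virus map" can be modelled as a list of (offset, length) ranges to include in the hash; that representation, the name mapped_md5, and the sample bytes are all hypothetical and are not how any particular AV product works.

```python
import hashlib


def mapped_md5(body: bytes, include_ranges) -> str:
    """Hash only the byte ranges listed in the (hypothetical) virus map."""
    digest = hashlib.md5()
    for start, length in include_ranges:
        chunk = body[start:start + length]
        if len(chunk) != length:
            raise ValueError("map range falls outside the virus body")
        digest.update(chunk)
    return digest.hexdigest()


# Two made-up instances of the "same" virus: bytes 4-5 vary per infection.
instance_a = b"CODE" + b"\x01\x02" + b"MORECODE"
instance_b = b"CODE" + b"\xfe\xff" + b"MORECODE"

# Two analysts agree on skipping the variable bytes but not on the rest.
analyst_map_1 = [(0, 4), (6, 8)]
analyst_map_2 = [(0, 4), (6, 4)]

if __name__ == "__main__":
    print(mapped_md5(instance_a, analyst_map_1) == mapped_md5(instance_b, analyst_map_1))
    print(mapped_md5(instance_a, analyst_map_1) == mapped_md5(instance_a, analyst_map_2))
```

With one shared map the two instances hash identically; with two different maps the same instance gets two different "names", which is the disagreement described above.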

In the past, very few AV products tried to apply a virus map; working out 
a virus map is quite time-consuming for the analyst. And, as of 1995, 
Findvirus was the only product that used virus maps to do exact 
identification (the situation might be different now).

I had an idea that Frisk has been using something very similar for a 
very long time (ever since _before_ the major engine revision at v2.0)?

By the way, there's no such word as "viri", and people who refer to "viri" 
put themselves firmly in a group that you possibly don't want to be seen 
as being a member of.

Indeed -- very good advice (and before you ask Dude -- "virii" is 
worser...).

You really do not want the "what is the correct plural of virus" 
discussion here, but if you're at all interested go search out the 
many, many times it was discussed ad nauseam in Virus-L/comp.virus and 
in alt.comp.[anti-]virus since.


Regards,

Nick FitzGerald

_______________________________________________
Fun and Misc security discussion for OT posts.
https://linuxbox.org/cgi-bin/mailman/listinfo/funsec
Note: funsec is a public and open mailing list.

