Nmap Development mailing list archives

Re: Using Nmap + NSE create an embedded scanning botnet (Carna)

From: Fyodor <fyodor () nmap org>
Date: Wed, 20 Mar 2013 01:47:33 -0700

On Mon, Mar 18, 2013 at 3:35 PM, Brandon Enright <
bmenrigh () brandonenright net> wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> I just came across a very interesting page / paper:
>
> http://internetcensus2012.github.com/InternetCensus2012/paper.html

Yeah, that's quite a "hack"!  I'm not going to focus on the legal and
ethical problems with the methodology in this email, but that's not to
diminish their importance.  I'll also use "he" to describe the researcher,
though I suppose it could be a woman.  And if it is, I want to marry her.

It's definitely nice that he put the data out in the public domain.  And
what a data set it is.  Nine terabytes!  Even ZPAQ compressed it is more
than half a terabyte.  I have a machine with a gigabit Internet
connection at a datacenter in the Netherlands which has been downloading
this torrent for the last 36 hours and I've still got 103 GB to go. And
once I get it downloaded, I don't have nearly enough space to decompress
much of it.  And decompressing this ZPAQ stuff apparently requires
significant CPU resources too, considering that the researcher also
distributed a tool for spreading the decompressing workload across a
network of computers.  At least it is already split up into some smaller
but still humongous compressed files by data type.

I'll just have to deal with it piece by piece, decompressing the files to
stdout and trying to filter in just the data I need.  In some ways, the
researcher was a packrat, including terabytes of data which may seem
insignificant (like that a certain IP address failed to respond to a
certain type of ping probe at a certain time), but in other cases the
extreme abundance of data may help us.  For example, one might expect he
would only include the open ports.  But that's not enough for us to augment
Nmap's port frequency tables.  We need to know how often the port was
discovered closed too, and it looks like this data contains that
information.
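
As a rough sketch of that piece-by-piece approach: a filter like the one below could read decompressed records from stdin and tally, per port, how often it was seen open versus closed.  The record layout here (tab-separated "ip, port, state") is purely an assumption for illustration, since I haven't confirmed the real format of the dump, and the function name is made up.

```python
from collections import Counter


def port_open_ratios(lines):
    """Return {port: open / (open + closed)} for a stream of records.

    Assumed (hypothetical) record layout, one per line:
        ip <TAB> port <TAB> state
    where state is "open" or "closed". Other states are ignored.
    """
    open_counts = Counter()
    closed_counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 3:
            continue  # skip malformed records rather than abort a long run
        _ip, port, state = fields
        if state == "open":
            open_counts[port] += 1
        elif state == "closed":
            closed_counts[port] += 1

    ratios = {}
    for port in set(open_counts) | set(closed_counts):
        total = open_counts[port] + closed_counts[port]
        ratios[port] = open_counts[port] / total
    return ratios
```

Because it only iterates over lines, it can be handed sys.stdin directly while the decompressor writes to stdout, so nothing needs to be extracted to disk first.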

Of course the researcher is anonymous, but that didn't stop CNET from
misunderstanding the concept of a mailing list archive and labeling me as the
hacker, as I mentioned yesterday.  Fortunately, after about 24 hours of
pestering, they updated the article and issued a small "correction" at the
bottom.  Oh well, if I'm going to be accused by the media of a crime I
didn't commit, there are far worse ones than this I suppose :).

While I could (but won't) guess at the actual researcher's identity, we
know that he's clearly an Nmap fan :).  The very first sentence of the
paper references Nmap, and it is constantly used throughout the research.
The service detection and OS detection probes used are all from Nmap.  And
the auxiliary tools package he published[1] includes a modified version of
Nmap and a LICENSE file explicitly granting us copyright to incorporate any
of his changes into Nmap.  Unfortunately for us, I think those changes are
just to allow matching the textual representation of service
fingerprints (the type you submit) with nmap-service-probes, and we already
have a separate program for doing that.  The researcher is probably on this
list.  Heck, lots of us can relate to this quote from the paper:

"We would also like to mention that building and running a gigantic botnet
and then watching it as it scans nothing less than the whole Internet at
rates of billions of IPs per hour over and over again is really as much fun
as it sounds like."

I'm not sure if he had the bots download Nmap directly from nmap.org, but
if so that would explain some of the strange download patterns we have
seen.  We even had to restrict downloads of certain Nmap packages and
shuffle others around last year when we were seeing huge numbers of
suspicious downloads from all over the world.  At least in some cases, the
author of this paper was using the then-latest Nmap version 6.01, which was
released last June.

A lot of articles have mentioned the 420,000 devices which made up this
botnet, but it's worth noting that the researcher actually compromised
(thanks to pathetic default passwords) 1.2 million hosts.  He determined
uniqueness by grabbing the MAC addresses with ifconfig.  But he only
installed the botnet software on about a third of them for various reasons.
Who knew so many systems would have such terrible passwords?  And he only
tried four default credentials (root, admin, etc.) and just the telnet
protocol.  Imagine if he had let the bots loose with Nmap's NSE brute force
system (http://nmap.org/nsedoc/categories/brute.html)!

So there is a lot of data here, and one question is how it could be useful
to Nmap, assuming some of us manage to download and extract and process
this giant 9 TB dump.  I'm looking for ideas, but here are a couple which
come to mind:
 - The port scan data seems like a great chance to update our port
frequency lists.  What we currently have is from years ago, and that scan
wasn't as comprehensive as this new one.  Nmap scans the 1,000 "most
popular" ports by default, and this should allow us to choose a more
up-to-date set based on current service trends.
 - The reverse DNS data is interesting.  One thing we could do is look at
the top hostnames and use them for something like dns-brute, although rDNS
is not entirely representative of the sort of forward DNS entries that
dns-brute is after.  We certainly can't ship the whole list with Nmap, but
individual Nmap users could use the list of rDNS names and corresponding IP
addresses for scans against organizations that have widely distributed
networks.  Or they could use it to scan specific networks more quickly by
only going after, say, /24's with at least one rDNS name found.
 - The list of service fingerprints could be used to identify fingerprints
which don't quite match Nmap's existing fingerprints.  For example, we
could find all of the fingerprints that Nmap doesn't match, sort them by
frequency, and look at the most common failures to match and see if we can
identify them.
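
To make the last two ideas a bit more concrete, here is a minimal sketch.  The input shapes are assumptions: (ip, hostname) pairs for the rDNS data, and plain fingerprint strings plus a caller-supplied is_matched() predicate standing in for whatever matching tool is used.  Both function names are hypothetical.

```python
import ipaddress
from collections import Counter


def slash24s_with_rdns(records):
    """Return the sorted /24 networks containing at least one rDNS name.

    records: iterable of (ip, hostname) pairs; an empty hostname means
    no rDNS entry was found for that address.
    """
    networks = set()
    for ip, hostname in records:
        if hostname:
            # strict=False masks off the host bits, giving the enclosing /24
            networks.add(ipaddress.ip_network(f"{ip}/24", strict=False))
    return sorted(networks)


def most_common_unmatched(fingerprints, is_matched, n=10):
    """Return the n most frequent fingerprints that is_matched() rejects.

    is_matched is a placeholder for a real matcher against
    nmap-service-probes; it takes a fingerprint and returns True/False.
    """
    counts = Counter(fp for fp in fingerprints if not is_matched(fp))
    return counts.most_common(n)
```

The /24 list could be written out one network per line and fed to a scan as a target list, and the unmatched-fingerprint ranking gives a work queue ordered by how many hosts each new signature would cover.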

Those are the ideas I have offhand.  Anyone have others?

It turns out that some attackers were already exploiting these insecure
devices to perform mischief, but the publicity behind this is likely to
make the matter worse.  We should recommend that folks run telnet-brute
against their devices, and probably give them a specific command like we
did with Conficker.  I'm pretty sure telnet-brute by default tries the
four credentials used in this paper ("root:root", "root:[blank]", "admin:admin",
"admin:[blank]") early on, but I suppose we should verify to be sure.
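
One possible shape for such a command, as an unverified sketch: the helper below only assembles the argument list and does not run anything.  It assumes a user-supplied username file (users.txt here, a made-up name containing "root" and "admin") passed through the userdb script argument, relies on the brute library trying the username as the password by default, and turns on brute.emptypass for the blank-password cases.  Those argument behaviors should be checked against the installed Nmap before we recommend anything.

```python
import shlex


def telnet_brute_command(targets, userdb_path="users.txt"):
    """Build (but do not run) an nmap command line for telnet-brute.

    userdb_path is a hypothetical file listing one username per line,
    e.g. "root" and "admin". brute.emptypass=true asks the brute library
    to also try an empty password for each user.
    """
    script_args = f"userdb={userdb_path},brute.emptypass=true"
    return [
        "nmap",
        "-p", "23",
        "--script", "telnet-brute",
        "--script-args", script_args,
        *targets,
    ]


# Render the command as a copy-and-paste shell line for one example target.
command_line = shlex.join(telnet_brute_command(["192.0.2.10"]))
```

With only a userdb supplied, the script would still fall back to the default password list after these guesses, so this checks a superset of the four credentials from the paper.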

Anyway, we certainly live in interesting times :).

Cheers,
Fyodor

[1]
http://internetcensus2012.github.com/InternetCensus2012/download/code.tar.bz2
_______________________________________________
Sent through the dev mailing list
http://nmap.org/mailman/listinfo/dev
Archived at http://seclists.org/nmap-dev/
