Nmap Development mailing list archives

Analysis of using CPE for Nmap OS signatures


From: David Fifield <david () bamsoftware com>
Date: Sat, 7 Aug 2010 00:02:18 -0600

Fyodor asked me to investigate what would be required to support Common
Platform Enumeration (CPE) in OS and version detection. I started with
OS detection because I thought it would be easier. For this I have
assumed that the only requirement is that Nmap be able to print the CPE
names for OSes that it finds. Here is what I have found.

CPE is a naming system for hardware, operating systems, and
applications. A CPE name is a URI with the form

cpe:/{part}:{vendor}:{product}:{version}:{update}:{edition}:{language}
(http://cpe.mitre.org/specification/diagram.html)

All of the components are optional. The {part} may be "a" for
application, "h" for hardware, or "o" for operating system. For the OS
database, we are only interested in "h" and "o". Examples of CPE names
are

cpe:/h:cisco:2100_wireless_lan_controller
cpe:/o:microsoft:windows_xp:::pro
cpe:/o:linux:kernel:2.6.35
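
As a rough illustration of the name format above, here is a minimal
Python sketch that splits a CPE 2.2 URI into its seven components and
joins them back together. The names CpeName, parse_cpe, and format_cpe
are made up for this example, and the sketch assumes the name contains
no percent-encoded characters.

```python
from typing import NamedTuple


class CpeName(NamedTuple):
    """The seven CPE 2.2 components; omitted ones are empty strings."""
    part: str = ""
    vendor: str = ""
    product: str = ""
    version: str = ""
    update: str = ""
    edition: str = ""
    language: str = ""


def parse_cpe(uri: str) -> CpeName:
    """Split a cpe:/ URI into components (no percent-decoding is done)."""
    prefix = "cpe:/"
    if not uri.startswith(prefix):
        raise ValueError("not a CPE 2.2 URI: %r" % uri)
    components = uri[len(prefix):].split(":")
    if len(components) > 7:
        raise ValueError("too many components: %r" % uri)
    if components[0] not in ("", "a", "h", "o"):
        raise ValueError("unknown part: %r" % components[0])
    return CpeName(*components)


def format_cpe(name: CpeName) -> str:
    """Join components back into a URI, dropping trailing empty ones."""
    return "cpe:/" + ":".join(name).rstrip(":")


if __name__ == "__main__":
    for uri in ("cpe:/h:cisco:2100_wireless_lan_controller",
                "cpe:/o:microsoft:windows_xp:::pro",
                "cpe:/o:linux:kernel:2.6.35"):
        print(parse_cpe(uri))
```

Empty components in the middle of a name (as in the windows_xp example)
are kept as empty strings, so a parsed name formats back to the same
URI.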

The CPE specification is at

http://cpe.mitre.org/files/cpe-specification_2.2.pdf

For OS detection we can ignore some of the complexity of the
specification, namely the CPE Language for combining names and the
matching algorithm. Along with the specification you need a copy of the
CPE dictionary,

http://cpe.mitre.org/dictionary/index.html
http://static.nvd.nist.gov/feeds/xml/cpe/dictionary/official-cpe-dictionary_v2.2.xml

The dictionary has a long list of all known CPE names plus
human-readable equivalents for them. A sample of the file is

    <cpe-item name="cpe:/o:freebsd:freebsd:0.4_1">
        <title xml:lang="en-US">FreeBSD FreeBSD 0.4_1</title>
        <meta:item-metadata modification-date="2007-09-14T13:36:49.090-04:00" status="DRAFT" nvd-id="26781" />
    </cpe-item>
    <cpe-item name="cpe:/o:freebsd:freebsd:1.0">
        <title xml:lang="en-US">FreeBSD FreeBSD 1.0</title>
        <meta:item-metadata modification-date="2007-09-14T13:36:49.090-04:00" status="DRAFT" nvd-id="26782" />
    </cpe-item>
    <cpe-item name="cpe:/o:freebsd:freebsd:1.1">
        <title xml:lang="en-US">FreeBSD FreeBSD 1.1</title>
        <meta:item-metadata modification-date="2007-09-14T13:36:49.090-04:00" status="DRAFT" nvd-id="26783" />
    </cpe-item>

You use the dictionary to make sure your names match up with existing
names.
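
A minimal sketch of reading the dictionary with Python's standard
xml.etree.ElementTree might look like the following. It matches on the
local tag name so that it does not depend on the exact XML namespace;
the namespace URI in the embedded sample is an assumption. The
cpe:/a:example:app:1.0 entry and the Cisco title are invented for
illustration, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Small stand-in for the real dictionary file. The default namespace URI
# is an assumption; the code below does not rely on it.
SAMPLE = """<cpe-list xmlns="http://cpe.mitre.org/dictionary/2.0">
    <cpe-item name="cpe:/o:freebsd:freebsd:1.0">
        <title xml:lang="en-US">FreeBSD FreeBSD 1.0</title>
    </cpe-item>
    <cpe-item name="cpe:/o:freebsd:freebsd:1.1">
        <title xml:lang="en-US">FreeBSD FreeBSD 1.1</title>
    </cpe-item>
    <cpe-item name="cpe:/h:cisco:2100_wireless_lan_controller">
        <title xml:lang="en-US">Cisco 2100 Wireless LAN Controller</title>
    </cpe-item>
    <cpe-item name="cpe:/a:example:app:1.0">
        <title xml:lang="en-US">Example App 1.0</title>
    </cpe-item>
</cpe-list>"""


def local(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def read_dictionary(xml_text):
    """Return a dict mapping each CPE name to its human-readable title."""
    names = {}
    for item in ET.fromstring(xml_text).iter():
        if local(item.tag) != "cpe-item":
            continue
        title = next((c.text for c in item if local(c.tag) == "title"), "")
        names[item.get("name")] = title
    return names


def os_hw_vendors(names):
    """Collect vendors of names whose part is "h" or "o"."""
    vendors = set()
    for name in names:
        fields = name[len("cpe:/"):].split(":")
        if fields[0] in ("h", "o") and len(fields) > 1 and fields[1]:
            vendors.add(fields[1])
    return vendors


if __name__ == "__main__":
    names = read_dictionary(SAMPLE)
    print(len(names), "names;", sorted(os_hw_vendors(names)))
```

For the full dictionary file, ET.parse or ET.iterparse on the file would
be a better fit than loading the text into a string.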

An Nmap OS fingerprint looks like

Fingerprint Microsoft Windows 2000 SP4
Class Microsoft | Windows | 2000 | general purpose

In order, the five fields are a freeform description, vendor, family,
generation, and device type. The vendor maps easily to the CPE {vendor}.
In most cases the family maps directly to {product}, and the generation
mostly maps to {version}. There are exceptions, like Microsoft Windows,
where the family and generation are combined into the {product} as in
cpe:/o:microsoft:windows_2000::sp4. CPE has nothing like our device type
field. That would have to be represented separately in order to continue
to be supported. CPE directly encodes some information, such as the
service pack, that we only have in the freeform description and not in
the vendor/family/generation/devicetype quadruple.

CPE also doesn't have an equivalent of the "embedded" family we use to
represent an unknown OS on a known hardware platform. In these cases we
would use the "h" part instead of "o" to represent a hardware type.

Our database is far broader than the CPE dictionary in terms of
coverage. Counting only entries where the part is "h" or "o", the CPE
dictionary has 97 vendors, while nmap-os-db has 384. There is a standard
way to submit new names for the dictionary, and we would have to use it
often.

CPE cannot easily represent a range of versions like "Linux 2.6.11 -
2.6.15", but then again, neither can our Class lines. We relegate that
information to the freeform description. The recommended way to
represent a line such as

Class Linux | Linux | 2.6.X | general purpose

is like this:

cpe:/o:linux:kernel:2.6

In order to use CPE in the OS database, we would need to convert our
Class lines to CPE. I recommend storing the vendor, family, and
generation in CPE, and the device type elsewhere, to avoid having to
maintain concurrent copies of the same data. I have written and attached
a small program, cpeify-os.py, that reads Class lines and prints their
CPE equivalent. This translation is imperfect, because some essential
information is stored only in the freeform description, which is not
designed to be parseable. A sample of its output is

Class Cisco | Linux | 2.6.X | firewall
cpe:/o:cisco:linux:2.6
Class D-Link | embedded || WAP
cpe:/h:d-link
Class OpenBSD | OpenBSD | 4.X | general purpose
cpe:/o:openbsd:openbsd:4

Note that for the "embedded" OS, results are especially bad. That is
because the model number and other information are only stored in the
freeform description. Even if we extract the model number from the
description, it still has to be canonicalized to fit dictionary forms
exactly, where they already exist. Also things like service packs need
to be extracted from the description. So the translation to CPE can be
automated to some extent, but manual work is still needed. Roughly 40%
of our 2,800 OS fingerprints are "embedded", so that's a lower bound on
how many will have to be done manually.

Changing the database will be the biggest job. Modifying Nmap and the OS
integration tools will be comparatively easy. We will want a tool that
can run on nmap-os-db and the CPE dictionary, and print out any database
entries that are not in the dictionary. It should be capable of directly
outputting the format for submission of new CPE names. I think we would
still need to keep the freeform description field, both to represent
version number ranges and to store additional information such as CPU
architecture that doesn't fit in CPE.
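
The comparison step of such a tool could be sketched roughly as below,
assuming the CPE names have already been extracted from nmap-os-db and
from the dictionary. The function names and the three report buckets
are hypothetical. The sketch does not attempt the submission format,
which is not described here.

```python
def components(name):
    """Split a cpe:/ name into a tuple of its components."""
    return tuple(name[len("cpe:/"):].split(":"))


def classify(db_names, dictionary_names):
    """Sort database names into exact, prefix, and missing buckets.

    "prefix" means the dictionary has no identical name but does have a
    more specific name that starts with the same components.
    """
    exact = set(dictionary_names)
    prefixes = set()
    for name in exact:
        comps = components(name)
        for i in range(1, len(comps)):
            prefixes.add(comps[:i])
    report = {"exact": [], "prefix": [], "missing": []}
    for name in dict.fromkeys(db_names):  # drop duplicates, keep order
        if name in exact:
            report["exact"].append(name)
        elif components(name) in prefixes:
            report["prefix"].append(name)
        else:
            report["missing"].append(name)
    return report


if __name__ == "__main__":
    dictionary = ["cpe:/o:freebsd:freebsd:1.0",
                  "cpe:/o:linux:kernel:2.6.35"]
    database = ["cpe:/o:freebsd:freebsd:1.0",
                "cpe:/o:linux:kernel",
                "cpe:/h:d-link",
                "cpe:/h:d-link"]
    for bucket, names in classify(database, dictionary).items():
        print(bucket, names)
```

Names in the "missing" bucket are the candidates for submission to the
dictionary. Names in the "prefix" bucket probably deserve a manual look
first.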

David Fifield

Attachment: cpeify-os.py

_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://seclists.org/nmap-dev/
