Nmap Development mailing list archives

Re: [NSE] Vulnerability Scan based on osvdb


From: Marc Ruef <marc.ruef () computec ch>
Date: Fri, 21 May 2010 08:55:25 +0200

Hello David,

At the moment I am using the following code to find the best product
name match:

--- cut ---

local products_words = explode(" ", product)

for x=#products_words, 1, -1 do
     -- Generate a best match string for the product name
     for y=1, x, 1 do
         if products_wordsearch == "" then
             products_wordsearch = products_words[y]
         else
             products_wordsearch = products_wordsearch .. " " .. products_words[y]
         end
     end
end

--- cut ---

For explode, use our standard function stdnse.strsplit.

Oh thanks, I will do that.

This code was mysterious to me but I see now what it does. If you
initialize products_wordsearch = "" at each iteration, then the x loop
takes a string like "Apache Tomcat httpd" and generates the successive
values

"Apache Tomcat httpd"
"Apache Tomcat"
"Apache"

Yes, this is correct :)
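
A minimal sketch of that loop with the string rebuilt on every iteration, assuming plain Lua outside of NSE. The helper split_words is only a stand-in for stdnse.strsplit, and both function names are hypothetical:

```lua
-- Split a string on whitespace (stand-in for stdnse.strsplit in this sketch).
local function split_words(s)
  local words = {}
  for w in s:gmatch("%S+") do
    words[#words + 1] = w
  end
  return words
end

-- Return the word prefixes of a product name, longest first.
local function product_prefixes(product)
  local words = split_words(product)
  local result = {}
  for x = #words, 1, -1 do
    -- table.concat builds a fresh string for words 1..x on each pass,
    -- so no state leaks from one iteration into the next.
    result[#result + 1] = table.concat(words, " ", 1, x)
  end
  return result
end

for _, candidate in ipairs(product_prefixes("Apache Tomcat httpd")) do
  print(candidate)
end
```

Using table.concat with a start and end index avoids the inner y loop and the empty-string check altogether.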

That seems reasonable.

Your Apache example shows that the approach is reasonable. But when Nmap detects "Microsoft IIS httpd 7.0", I run into a new problem. Because the vendor name "Microsoft" comes first, I would also have to strip leading words, which increases the number of iterations. In the case of IIS I would have to try the following candidates:

1 Microsoft IIS httpd 7.0  [vendor + product + (human info) + version] => no match
2 Microsoft IIS httpd      [vendor + product + (human info)]           => no match
3 Microsoft IIS            [vendor + product]                          => no match
4 Microsoft                [vendor]                                    => match (false-positive)
-
5 IIS httpd 7.0            [product + (human info) + version]          => no match
6 IIS httpd                [product + (human info)]                    => no match
7 IIS                      [product]                                   => best match
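
The seven candidates above could be generated roughly as follows. This is only a sketch: the function name lookup_candidates is hypothetical, and it simply assumes that the first word is the vendor, which is exactly the assumption that fails for multi-word vendor names:

```lua
-- Build lookup candidates: all word prefixes of the full string (longest
-- first), then all prefixes after dropping the first word, which is
-- assumed (not known) to be the vendor.
local function lookup_candidates(banner)
  local words = {}
  for w in banner:gmatch("%S+") do
    words[#words + 1] = w
  end
  local candidates = {}
  for first = 1, math.min(2, #words) do
    for last = #words, first, -1 do
      candidates[#candidates + 1] = table.concat(words, " ", first, last)
    end
  end
  return candidates
end

for i, c in ipairs(lookup_candidates("Microsoft IIS httpd 7.0")) do
  print(i, c)
end
```

Each candidate would then be looked up in turn. Choosing the best match among the hits is a separate step that this sketch does not cover.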

As you can see, this algorithm makes sense so far. But there might be two cases in which we will get wrong results:

1. If a vendor name consists of two or more words.
   => false-negative
      iterations 5 and following do not help anymore
      trivia: object_vendors has 2494 entries with two or more words

2. If a product has no vendor prefix but a two-word product name whose second word is a common term (e.g. "webserver" or "httpd").
   => false-positive
      in iteration 7 (or even 6)
      example: "Apache httpd" => "httpd"

A possible solution would be to strip known vendor names before the proposed iteration. However, other inconsistencies will remain, so not all false positives and false negatives can be eliminated.
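
Such a vendor replacement step might look like the sketch below. The vendor list and the example banners are made up for illustration; in practice the names would come from the vulnerability database, and the comparison here is case-sensitive:

```lua
-- Hypothetical vendor list; a real one would be loaded from the
-- vulnerability database rather than hard-coded.
local known_vendors = { "Microsoft", "Sun Microsystems", "Sun" }

-- Remove the longest known vendor name that prefixes the banner.
-- Returns the remaining string and the vendor (or nil if none matched).
local function strip_vendor(banner)
  local best = nil
  for _, vendor in ipairs(known_vendors) do
    local prefix = vendor .. " "
    if banner:sub(1, #prefix) == prefix and (best == nil or #vendor > #best) then
      best = vendor
    end
  end
  if best then
    -- Skip the vendor name plus the single space that follows it.
    return banner:sub(#best + 2), best
  end
  return banner, nil
end

print(strip_vendor("Microsoft IIS httpd 7.0"))
print(strip_vendor("Sun Microsystems Java Web Server"))
print(strip_vendor("Apache httpd"))
```

Preferring the longest match handles vendor names with several words (case 1), but it does nothing for case 2: "Apache httpd" passes through unchanged and can still be reduced to "httpd" later.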

This means a lookup with high confidence isn't possible anyway. Either way

* I have a high confidence but not all matches or

* I just grep the title strings, get "all" the matches but with a very
limited amount of confidence.

I think I prefer the second option. I don't know about "very limited."

I don't like using LIKE statements to search text fields; they tend to be fickle. We would then have two sources of imprecision: the version detection and the lookup process.

Software names tend to be pretty distinct. It should be possible to get
good confidence with just pattern matching and maybe some
canonicalization. We do strive for consistency in nmap-service-probes,
but it's a big database and has had several maintainers, which I'm sure
is true of OSVDB as well.
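
As a rough illustration of pattern matching with some canonicalization, here is a sketch in plain Lua. The function names and the example titles are hypothetical, and the canonicalization is deliberately minimal (lower case plus whitespace cleanup):

```lua
-- Lower-case a string, collapse runs of whitespace, and trim both ends.
local function canonicalize(s)
  s = s:lower()
  s = s:gsub("%s+", " ")
  s = s:gsub("^ ", "")
  s = s:gsub(" $", "")
  return s
end

-- True if the canonical product name occurs in the canonical title.
-- The fourth argument to find (true) requests a plain substring search,
-- so characters such as "." or "-" in product names are not treated as
-- Lua pattern magic.
local function title_mentions(title, product)
  local needle = canonicalize(product)
  if needle == "" then
    return false
  end
  return canonicalize(title):find(needle, 1, true) ~= nil
end

print(title_mentions("Microsoft IIS  WebDAV Authentication Bypass", "iis"))
print(title_mentions("Apache Tomcat Directory Listing", "IIS"))
```

A plain substring test also matches inside longer words, so a real lookup would probably want word boundaries or a vendor/product mapping on top of this.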

I am going to run some more experiments, which should reveal the best approach. In the long term, supporting CPE still seems to be the best decision.

Regards,

Marc

--
Marc Ruef | marc.ruef () computec ch | http://www.computec.ch/mruef/
_________________________________________________________________
My latest publication: "Facebook Anwendungen Design-Schwachstelle" http://www.scip.ch/?labs.20100521
_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://seclists.org/nmap-dev/

