Nmap Development mailing list archives

Re: favicon survey script


From: Brandon Enright <bmenrigh () ucsd edu>
Date: Wed, 5 Aug 2009 06:22:43 +0000

On Tue, 4 Aug 2009 21:57:06 -0600 or thereabouts David Fifield
<david () bamsoftware com> wrote:

Hi,

There was a project to build an NSE script that would identify web
server software by hashing favicon.ico and looking it up in a
database. In fact the script exists, but the database is small and
the relevance of its entries is not known.

Last year Vlatko Kosturjak did large Internet scans and cataloged the
frequency of favicons. However, for some reason this was never built
into a database and a script, as far as I know.

http://seclists.org/nmap-dev/2008/q4/0397.html
http://seclists.org/nmap-dev/2008/q4/0586.html
http://kost.com.hr/favicon.php

I think the script is a great idea, so I wrote a script to try to
duplicate Vlatko's results. The script simply downloads /favicon.ico,
hashes it, then stores the icon itself and a list of hosts using it in
files named after the hash. To give you an idea:

$ cd ~/favicon
$ ls icon/
17F03417CBF92B80992B7CA7A566FB0C.ico
C89ECD7675567625E5755A7A9C31632D.ico
379A65BEB4D412765FCF9FBBDEECD416.ico
C8BFCB5728998AC6C3DA90EA5CD2340A.ico
7131EF7073ED685BF2987B9061C65D36.ico
CB5AA723DDDB0734CEC459F2B9C3B1C4.ico
88733EE53676A47FC354A61C32516E82.ico
D16A0DA12074DAE41980A6918D33F031.ico
$ ls hash/
17F03417CBF92B80992B7CA7A566FB0C  C89ECD7675567625E5755A7A9C31632D
379A65BEB4D412765FCF9FBBDEECD416  C8BFCB5728998AC6C3DA90EA5CD2340A
7131EF7073ED685BF2987B9061C65D36  CB5AA723DDDB0734CEC459F2B9C3B1C4
88733EE53676A47FC354A61C32516E82  D16A0DA12074DAE41980A6918D33F031
$ cat hash/D16A0DA12074DAE41980A6918D33F031
190.166.207.187:80
125.25.91.250:80
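
For readers who want the storage scheme spelled out, here is a minimal
Python sketch of it. It is not the NSE script itself. The function name
record_favicon and the base_dir parameter are made up for illustration,
and MD5 is an assumption inferred from the 32-hex-digit file names.

```python
import hashlib
from pathlib import Path


def record_favicon(base_dir, host, port, data):
    """Store one downloaded favicon body and note which host served it.

    Returns the uppercase hex digest used to name both files.
    """
    base = Path(base_dir)
    # Assumption: MD5, inferred from the 32-hex-digit file names.
    digest = hashlib.md5(data).hexdigest().upper()

    icon_dir = base / "icon"
    hash_dir = base / "hash"
    icon_dir.mkdir(parents=True, exist_ok=True)
    hash_dir.mkdir(parents=True, exist_ok=True)

    # Keep one copy of the icon per distinct hash.
    icon_path = icon_dir / f"{digest}.ico"
    if not icon_path.exists():
        icon_path.write_bytes(data)

    # Append "host:port" to the list of hosts sharing this hash.
    with open(hash_dir / digest, "a") as f:
        f.write(f"{host}:{port}\n")

    return digest
```

Calling record_favicon("favicon", "190.166.207.187", 80, body) for each
response would build up the icon/ and hash/ directories in the layout
shown above.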

./nmap --datadir . -n -PN -d --script=favicon -p 80 -iR 20000 \
    -oN favicon-%Y%m%d-%H%M%S.nmap

I scanned port 80 of 20,000 random IP addresses (took about 16
minutes) and got these results:

$ wc -l hash/* | sort -n
  1 hash/17F03417CBF92B80992B7CA7A566FB0C
  1 hash/379A65BEB4D412765FCF9FBBDEECD416
  1 hash/7131EF7073ED685BF2987B9061C65D36
  1 hash/88733EE53676A47FC354A61C32516E82
  1 hash/A3C7BE1BCF382EA413C30453A4ACF638
  1 hash/B6141EFEE8D8E64DBC23539F99F7238E
  1 hash/C3FB27F0BF8AC3171C8105726D61380A
  1 hash/C89ECD7675567625E5755A7A9C31632D
  1 hash/C8BFCB5728998AC6C3DA90EA5CD2340A
  1 hash/CB5AA723DDDB0734CEC459F2B9C3B1C4
  1 hash/D4DA62A788942AAB81D033C9E49D57CB
  1 hash/ECF508711C226CCDA02D58853B31D7A7
  2 hash/D16A0DA12074DAE41980A6918D33F031
  4 hash/D41D8CD98F00B204E9800998ECF8427E
 19 hash/A8FE5B8AE2C445A33AC41B33CCC9A120
 37 total

Already with this tiny scan there are some promising results. Roughly
half of the hosts that had a favicon (19 of 37) had one with the hash
A8FE5B8AE2C445A33AC41B33CCC9A120 (it's actually an HTML error
message), which makes it a good candidate for fingerprinting.
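
The same tally can be computed programmatically. The sketch below is
Python, with tally, share and EMPTY_MD5 as illustrative names; it counts
lines per file the way the wc pipeline does and reports what fraction of
hosts share a hash. It assumes the names are MD5 digests, which fits
D41D8CD98F00B204E9800998ECF8427E in the listing: that is the MD5 of
zero bytes, so those four hosts probably returned an empty body.

```python
import hashlib
from pathlib import Path


def tally(hash_dir):
    """Return (count, hash) pairs sorted by count, like `wc -l hash/* | sort -n`."""
    counts = []
    for path in Path(hash_dir).iterdir():
        with open(path) as f:
            counts.append((sum(1 for _ in f), path.name))
    return sorted(counts)


def share(counts, digest):
    """Fraction of all recorded hosts that served the given hash."""
    total = sum(n for n, _ in counts)
    hits = sum(n for n, name in counts if name == digest)
    return hits / total if total else 0.0


# The MD5 of zero bytes, for spotting empty responses in the listing.
EMPTY_MD5 = hashlib.md5(b"").hexdigest().upper()
```

With the counts listed above, share(tally("hash"),
"A8FE5B8AE2C445A33AC41B33CCC9A120") comes to 19/37, a little over half.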

The idea is to find out the most common favicons and make a user
script containing the database. João Correa volunteered to do the
large-scale scanning. He's also going to investigate ways to make the
script more effective, such as parsing HTML to find the real location
of the favicon file. I suggest he use some of Vlatko's scripts to
visit all the web sites in the DMOZ directory, in addition to random
scanning.
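
One way the HTML parsing could work is sketched below in Python, using
only the standard library. This is an illustration, not the planned NSE
code: the names _IconLinkParser and favicon_url are hypothetical, and
the sketch assumes the page advertises its icon through a <link> element
whose rel attribute contains "icon", falling back to /favicon.ico when
it does not.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class _IconLinkParser(HTMLParser):
    """Collect the href of the first <link> whose rel includes "icon"."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.href is not None:
            return
        attr = dict(attrs)
        # rel is a space-separated token list, e.g. "shortcut icon".
        rel = (attr.get("rel") or "").lower().split()
        if "icon" in rel and attr.get("href"):
            self.href = attr["href"]


def favicon_url(page_url, html):
    """Resolve the favicon location for a page, defaulting to /favicon.ico."""
    parser = _IconLinkParser()
    parser.feed(html)
    # urljoin handles relative, absolute and scheme-relative hrefs.
    return urljoin(page_url, parser.href or "/favicon.ico")
```

The returned URL would then be fetched and hashed in place of the fixed
/favicon.ico path.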

David Fifield


I decided to give this a big scan, but this script is really pushing the
limits of what Nmap on Linux can do.

For those of you trying to do big scans on Linux, make sure you set:

net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse = 1
fs.file-max = 65535

Otherwise you'll be out of sockets or file handles very quickly.
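
To see where a box currently stands before changing anything, the three
keys can be read back from /proc/sys. A minimal Python sketch follows.
read_sysctl and report are names made up for illustration, and a key
that the running kernel does not expose comes back as None.

```python
from pathlib import Path

# Values recommended in the message above.
WANTED = {
    "net.ipv4.tcp_tw_recycle": "1",
    "net.ipv4.tcp_tw_reuse": "1",
    "fs.file-max": "65535",
}


def read_sysctl(key, proc_sys="/proc/sys"):
    """Return the current value of a sysctl key, or None if the kernel lacks it."""
    # A key such as fs.file-max lives at /proc/sys/fs/file-max.
    path = Path(proc_sys, *key.split("."))
    try:
        return path.read_text().strip()
    except OSError:
        return None


def report(wanted=WANTED, proc_sys="/proc/sys"):
    """Map each key to (current, wanted) so differences are easy to spot."""
    return {key: (read_sysctl(key, proc_sys), value) for key, value in wanted.items()}
```

Printing report() lists each key with its current and desired value;
any pair that differs still needs to be set.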

Second, make sure Nmap is only using a parallelism of around 800-900.
Otherwise the list corruption in Nsock from the 1024-socket maximum
kills the scan pretty quickly.

Because of these limitations, the scan is going a lot slower than
expected.  I'll try to get results posted in the morning.

Brandon


_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://SecLists.Org
