Nmap Development mailing list archives
[Proof of Concept] Efficient, ASCII-safe port compression
From: doug () hcsw org
Date: Fri, 6 Jul 2007 03:06:30 -0700
Hi nmap-dev!

I was thinking this evening about the problem of encoding long lists of ports efficiently and reliably. When I have done large-scale scans in the past, I have run up against all sorts of scalability problems, many of them caused by not having an efficient, transportable encoding for sets of ports.

Also, when you want to keep complete information on all the ports in large scans, you often end up listing out massive ASCII lists in XML/greppable output. Consider a host with 20 open ports, but 30000-some closed and 30000-some filtered: Nmap can only collapse one of those lists without throwing out information. Insignificant, you say? Well, perhaps, but remember that for a large-scale distributed scanning effort you want to be able to make use of tiny 1 MB shell accounts for scanning as well as your comfy terabyte servers.

Finally, I was considering the problem of not being able to know which ports were scanned from the XML output of any given scan, because we don't encode the contents of the nmap-services file in the scan itself.

Let me introduce you to portcompress:

  http://hcsw.org/downloads/portcompress.c

This simple, portable C file is a program that runs in 2 modes:

  Usage: portcompress [-e|-d]

In encode mode (-e) it takes whitespace-separated decimal port numbers until EOF and prints out a compressed port list.

In decode mode (-d) it reads in a compressed port list and prints out the corresponding ports separated by newlines.
Given a list of integers, it encodes them in an efficient, run-length encoded, ASCII-armoured format:

  $ echo "1 2 3 4 5 6 7 8 9 10" | ./portcompress -e
  JZ**xA

  $ echo "1 2 3 4 5 6 7 8 9 10" | ./portcompress -e | ./portcompress -d
  1
  2
  3
  4
  5
  6
  7
  8
  9
  10

  $ echo "1 2 3 4 5 6 7 8 9 10 65533 65534 65535" | ./portcompress -e
  JZ**u*A

  $ echo "1 2 3 4 5 6 7 8 9 10 9876 65533 65534 65535" | ./portcompress -e
  JZyaF32WT8A

  $ cat ~/nmap/svn/nmap/nmap-services |perl -ne 'print "$1\n" if m/^[\w-_]*\s*(\d+)/;'|sort|uniq|./portcompress -e|wc -c
  696

That's right, all the TCP and UDP port numbers in the services file can be enumerated in 696 bytes, ASCII-safe! It would be something like 4 times longer (and not ASCII-safe) if we just listed the 2-byte integers back-to-back.

The secret is an efficient run-length encoding algorithm I developed. It encodes runs of length 4 or more as a simple count of the length of the run. For maximum efficiency, the run length is itself encoded with a variable number of bits. (This is very similar to an algorithm a professor of mine, Dr. Paeth, invented. Dr. Paeth's algorithms are also used in PNG, JPEG, etc.)

Here is the bit-stream protocol, from the source:

  Protocol:

  Either
    00 = 0
    11 = 1
  or
    01 = RLE string of 0s
    10 = RLE string of 1s
  followed by one of
    00 = 2 bits
    01 = 4 bits
    10 = 8 bits
    11 = 16 bits
  followed by (run length - 4) encoded in a binary number
  of the previously specified bits

  Examples:
    "101"   => "110011"
    "111"   => "111111"
    "1111"  => "100000"
    "11111" => "100001"

Anyways, this was a very quick hack-job, so please let me know if you find any bugs or have other suggestions! I took code from at least 2 other bits of Hardcore Software: ASCII armour and nuff. :)

Enjoy,

Doug
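For readers who want to experiment with the bit-stream protocol quoted above without building the C file, here is a minimal Python sketch. It is not taken from portcompress.c. It assumes the port set is treated as a bitmap in which bit i stands for port i (port 0 first, trailing zeros dropped), that the encoder always picks the smallest length field that fits, and it works on strings of "0" and "1" characters. The function names are made up for illustration. It leaves out the ASCII armour entirely, so it will not reproduce the exact output strings shown in the message.

```python
from itertools import groupby

# Width selector -> number of bits used to store (run length - 4).
# Kept in ascending order so the first fit is the smallest field.
WIDTHS = {"00": 2, "01": 4, "10": 8, "11": 16}
MAX_RUN = (1 << 16) - 1 + 4  # longest run a single RLE token can describe


def ports_to_bits(ports):
    """Assumed mapping: bit i is '1' when port i is in the set."""
    present = set(ports)
    if not present:
        return ""
    return "".join("1" if p in present else "0" for p in range(max(present) + 1))


def bits_to_ports(bits):
    """Inverse of ports_to_bits: indexes of the '1' bits."""
    return [i for i, b in enumerate(bits) if b == "1"]


def rle_encode(bits):
    """Encode a '0'/'1' string using the protocol from the message."""
    out = []
    for bit, group in groupby(bits):
        run = len(list(group))
        while run > 0:
            chunk = min(run, MAX_RUN)
            if chunk >= 4:
                # RLE token: tag, width selector, then (length - 4).
                value = chunk - 4
                selector, width = next(
                    (s, w) for s, w in WIDTHS.items() if value < (1 << w)
                )
                tag = "10" if bit == "1" else "01"
                out.append(tag + selector + format(value, f"0{width}b"))
            else:
                # Short runs are cheaper as literal bits (2 stream bits each).
                out.append(("11" if bit == "1" else "00") * chunk)
            run -= chunk
    return "".join(out)


def rle_decode(stream):
    """Decode a stream produced by rle_encode back to a '0'/'1' string."""
    out = []
    pos = 0
    while pos < len(stream):
        tag = stream[pos:pos + 2]
        pos += 2
        if tag == "00":
            out.append("0")
        elif tag == "11":
            out.append("1")
        else:
            width = WIDTHS[stream[pos:pos + 2]]
            pos += 2
            run = int(stream[pos:pos + width], 2) + 4
            pos += width
            out.append(("1" if tag == "10" else "0") * run)
    return "".join(out)


if __name__ == "__main__":
    print(rle_encode("11111"))  # 100001, matching the last protocol example
    ports = list(range(1, 11)) + [65533, 65534, 65535]
    stream = rle_encode(ports_to_bits(ports))
    print(len(stream), "bits before any ASCII armour")
    print(bits_to_ports(rle_decode(stream)) == ports)
```

Under these assumptions a run of 4 costs 6 stream bits instead of the 8 it would take as literals, which is consistent with the threshold of 4 given in the message.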