Nmap Development mailing list archives
Re: [NSE] http.lua and delimiters
From: David Fifield <david () bamsoftware com>
Date: Tue, 30 Sep 2008 18:32:27 -0600
On Wed, Sep 24, 2008 at 03:43:21AM +0100, jah wrote:
> I noticed a few issues with showHTMLTitle.nse and whilst I was working through these I found that http.request() was not always returning an HTTP response correctly. Specifically, the call to stdnse.make_buffer() uses "\r\n" as its pattern to delimit lines in the response. This pattern was changed from "\r?\n" when the ability to dechunk chunked encoding was added [1], in tandem with a change to the second argument to table.concat() when putting the body of the response back together again (from "\n" to "\r\n"), to avoid modifying the body and messing up the dechunking process.
>
> I decided to knock up a quick script which sends an HTTP request, uses socket.receive() in a loop to collect the response as an unmolested string, and then detects the characters used to delimit the header and body and the characters used to delimit lines in both the header and the body. I then ran this script against a few hundred thousand random hosts and extracted the following info from the results.
>
> 3902 hosts had port 80 open, but only 2770 hosts responded to the GET request.
>
> 2451 (~88.5%) used \r\n\r\n to separate header and body.
>   Of these, 2374 delimited header values with \r\n, 5 used \n and 72 were single-value headers containing no delimiters.
>   Of the same 2451 hosts, 335 were header-only responses, 937 delimited lines in the body of the response with \r\n and 1179 with \n.
>
> 165 (~6%) used \n\n to separate header and body.
>   Of these, 7 delimited header values with \r\n, 17 used \n and 141 were single-value headers containing no delimiters.
>   Of the same 165 hosts, 3 were header-only responses, not one delimited lines in the body of the response with \r\n and the remaining 162 used \n.
>
> 154 (~5.5%) responded with a header and a body not separated by a double newline. These were all headerless responses which were dealt with in a previous patch [2].
Thanks for doing this great research! I was able to reproduce your results using the nl.nse script. I ran

    nmap -iR 10000 -PN -p 80 -sC --script=nl.nse -n -T4 -v

125 hosts had port 80 open, but 30 of them returned no data. Of the remaining 95,

    82  86.3%  used a \r\n\r\n delimiter
     5   5.3%  used a \n\n delimiter
     8   8.4%  used neither of the above delimiters

I think using a heuristic to get the header delimiter is fine. Wget does it: it splits the header from the body by looking for \n\n or \n\r\n, and splits header lines on either \n or \r\n. cURL does it: it ends header lines on \n and the header on a line beginning with \r or \n.

The only thing I would do differently is this bit of code:

    -- try and separate the head from the body
    if response:match( "\r\n\r\n" ) then
      header, body = response:match( "^(.-)\r\n\r\n(.*)$" )
    elseif response:match( "\n\n" ) then
      header, body = response:match( "^(.-)\n\n(.*)$" )

This would fail if the header uses \n delimiters but there is an \r\n\r\n somewhere in the body; the first match would succeed and grab part of the body with the header. What you want is whichever of those two matches gives you a shorter header.

Guessing the line ending for the header is fine, but we shouldn't do anything like that for the body. We don't even know that the body is made up of "lines"; it should be treated as a block of data, in other words not split up and rejoined.

In the case of chunked encoding, cURL is strict about requiring \r\n, so we should be safe doing the same. Wget doesn't do chunked. Maybe you can modify your script to report servers that send chunked encoding but delimit the chunks with bare \n? It's hard to imagine any kind of reliable delivery of chunked encoding if you have to guess at the line endings.

Does anyone else have more experience dealing with HTTP? My feeling is that there should be no guessing of delimiters in the body.
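As a rough illustration of the "shorter header" rule described above, here is a minimal Lua sketch. The names split_head_body and header_lines are hypothetical and are not part of http.lua, and returning a nil header when no blank line is found is an assumption about how headerless responses might be passed along. It picks whichever of the two delimiters occurs first, which is the same as picking the shorter header, and leaves the body untouched.

```lua
-- Hypothetical helper (not part of http.lua): split a raw response at
-- whichever blank-line delimiter occurs first, i.e. the shorter header.
local function split_head_body(response)
  local best_start, best_stop
  for _, delim in ipairs({ "\r\n\r\n", "\n\n" }) do
    -- Fourth argument true: plain substring search, no pattern matching.
    local s, e = response:find(delim, 1, true)
    if s and (not best_start or s < best_start) then
      best_start, best_stop = s, e
    end
  end
  if not best_start then
    -- Assumption: no blank line means a headerless response; the caller
    -- decides what to do with it.
    return nil, response
  end
  -- The body is returned as an opaque block; it is never split or rejoined.
  return response:sub(1, best_start - 1), response:sub(best_stop + 1)
end

-- Hypothetical helper: split the header (only) on either "\r\n" or "\n".
local function header_lines(header)
  local lines = {}
  for line in (header .. "\n"):gmatch("(.-)\r?\n") do
    lines[#lines + 1] = line
  end
  return lines
end

-- Usage: a "\n"-delimited header followed by a body containing "\r\n\r\n".
local header, body = split_head_body("HTTP/1.0 200 OK\nServer: x\n\nline1\r\n\r\nline2")
-- header is "HTTP/1.0 200 OK\nServer: x"; body is "line1\r\n\r\nline2"
```

With the if/elseif version quoted above, the same input would have put "line1" into the header, because the "\r\n\r\n" test is tried first.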
The technique of splitting on \r\n for the purposes of dechunking is fine, because that's at a higher layer, above the raw body.

David Fifield

_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://SecLists.Org
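For reference, strict dechunking of the kind discussed in this message could look roughly like the Lua sketch below. The function name dechunk is hypothetical and this is not the code in http.lua. It assumes the whole body has already been received as one string, accepts only "\r\n" after chunk-size lines and chunk data, and ignores any trailers after the final zero-size chunk.

```lua
-- Hypothetical helper (not the actual http.lua code): decode a chunked
-- body, accepting only "\r\n" as the delimiter. Returns the decoded body,
-- or nil plus an error message.
local function dechunk(body)
  local parts, pos = {}, 1
  while true do
    -- Chunk-size line: hex digits, optional extensions, then a strict CRLF.
    -- The leading "^" anchors the match at position pos.
    local s, e, hex = body:find("^(%x+)[^\r\n]*\r\n", pos)
    if not s then
      return nil, "malformed chunk-size line"
    end
    local size = tonumber(hex, 16)
    pos = e + 1
    if size == 0 then
      break -- last chunk; trailers, if any, are ignored in this sketch
    end
    -- Chunk data is copied by length, so newlines inside it are untouched.
    local data = body:sub(pos, pos + size - 1)
    if #data < size then
      return nil, "truncated chunk"
    end
    if body:sub(pos + size, pos + size + 1) ~= "\r\n" then
      return nil, "chunk data not followed by CRLF"
    end
    parts[#parts + 1] = data
    pos = pos + size + 2
  end
  return table.concat(parts)
end

-- Usage: two chunks followed by the terminating zero-size chunk.
local decoded = dechunk("4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n")
-- decoded is "Wikipedia"
```

A body whose chunks are delimited with bare \n makes this sketch return nil and an error message rather than a guess, which is the strict behaviour argued for above.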