Nmap Development mailing list archives
[NSE] Auto pipelining for http library
From: Peter O <perdo.olma () gmail com>
Date: Tue, 21 Aug 2012 01:29:33 +0200
Hi all, during the last few weeks I've been working on adding auto-pipelining features to the http library. This post summarizes what I've been able to accomplish. The library can be found in my nmap-dev directory, which can be checked out with:

svn co https://svn.nmap.org/nmap-exp/peter/nse-auto-pipeline

Intro
------

The new design aims to make pipelining fully automatic and transparent to script writers. It also makes use of persistent connections by keeping sockets open for a short time after sending a request and reading a response. The new library should speed up large scans: the nsock library binding allows only 20 scripts to hold sockets at a time, so many scripts would otherwise block. With the new library, all scripts targeting a host can use the same socket. This is accomplished by having a thread in the http library that 'owns' a socket for a particular host and keeps it open for scripts to use. Unlike the old pipelining API, the new one supports caching.

More details
----------------

o The old API

http.{get|head} in the current API works by first creating a socket, issuing a request on that socket, reading the response, and finally closing the socket. This works fine if a moderate number of scripts is running against a host, since they will not block on the 20-socket limit that nsock imposes. However, it would be nice to take advantage of HTTP/1.1 persistent connections, which let us issue several requests on the same socket without reconnecting. That allows many scripts to use the same socket without blocking. Another problem with the current API is that it doesn't use pipelining, which is basically sending multiple requests at once rather than waiting for each response before sending the next.

o The new API

As mentioned before, there are threads in the http library that own sockets connected to hosts. Scripts issue requests via a routine that communicates with those threads.
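From a script's point of view the interface stays the same; a plain http.get call is transparently routed through the shared per-host connection. A minimal consumer script, using only the standard NSE APIs (this is illustrative boilerplate, not code from the branch):

```lua
local http = require "http"
local shortport = require "shortport"

portrule = shortport.http

action = function(host, port)
  -- Same call as with the old API; the library now sends it through a
  -- shared, persistent (and possibly pipelined) connection to this host.
  local response = http.get(host, port, "/robots.txt")
  if response and response.status == 200 then
    return response.body
  end
end
```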
Connections are closed only when one of the following occurs: there have been no requests for ~1 second for a particular host, the server closed the connection, or we hit some kind of limit - for example, we exceeded the number of requests the server allows per connection.

This image: http://s8.postimage.org/iyt8nmgv9/pipe_simple.png (also attached to this message) is a graphical representation of how things work. The steps listed below correspond to those in the image.

- Step 1

http.{get|head} is a blocking call: a script sends a request and blocks waiting for a response.

- Step 2

The http library manages library worker threads, which keep connections open and pipeline requests for a host. On the first http.{get|head}, such a worker is created by a routine in http.lua. The worker waits on a condition variable while its request queue is empty. Scripts communicate with those threads through a routine in the http library, which adds requests to the worker's queue and then signals the worker.

- Step 3

The worker reads the request queue, pipelines all the requests, sends them out, and collects the responses.

- Steps 4, 5

The responses are saved and made available to the routine, which reads them and sends them back to the requesting scripts.

Summing up, the main difference between the current http.{get|head} and the new one is that instead of the script's http.{get|head} call issuing a request on a socket the script controls, the call passes the request to an http worker thread that does the pipelining for that host, and waits for the response. Connections are no longer owned by scripts; the worker threads in the http library own them. After a request is issued and a response received, the connection is left open, and it is closed only when one of the aforementioned conditions occurs. If we decide to enable pipelining by default, scripts will be allowed to specify a no_pipeline option, which results in the script issuing a regular, single request the way http.{get|head} currently works.
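The per-host worker loop from Steps 2-3 can be sketched roughly as follows. This is only a sketch, not the branch code: the queue layout and the read_response helper are made up for illustration, while nmap.condvar, nmap.new_socket, and socket:send are the standard NSE APIs:

```lua
local nmap = require "nmap"

-- Rough sketch of a per-host library worker (Steps 2-3).
-- 'read_response' and the shape of 'queue'/'responses' are hypothetical.
local function http_worker(host, port, queue, responses)
  local condvar = nmap.condvar(queue)  -- NSE condition variable on the queue
  local socket = nmap.new_socket()
  socket:connect(host, port)
  while true do
    while #queue == 0 do
      condvar("wait")                  -- sleep until a script queues a request
    end
    local batch = {}
    while #queue > 0 do                -- drain everything currently queued
      batch[#batch + 1] = table.remove(queue, 1)
    end
    for _, req in ipairs(batch) do     -- pipeline: send all requests first...
      socket:send(req.data)
    end
    for _, req in ipairs(batch) do     -- ...then collect responses in order
      responses[req.id] = read_response(socket)  -- hypothetical helper
    end
    condvar("broadcast")               -- wake scripts waiting on responses
  end
end
```

In the real library the loop also has to handle the shutdown conditions listed above (idle timeout, server close, per-connection request limit).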
The mechanism described above pipelines requests that multiple scripts make to the same host. Because http.{get|head} is a blocking call, a script that wants to take advantage of pipelining for its own requests must spawn its own worker threads. You can see how to accomplish this here: https://svn.nmap.org/nmap-exp/peter/pipe.nse (this script is also attached to this message). The worker-pool code in that script should eventually be exported into a library to make it easier for scripts to pipeline http requests.

Testing
---------

The run times listed below are usually the arithmetic mean of a few runs. Tests were run against a Linode server provided by Patrick (thanks!). As you'll probably notice, some tests were run with --max-parallelism 1 or 2. This is an attempt to model a scenario where a large scan is running and most of the sockets are occupied. Also, we don't want socket parallelism to weight the results in favor of the serial GET case: if 20 sockets are available for serial GETs, pipelining will be beaten in most circumstances. The new API is not affected by --max-parallelism as much as the old one, because it reuses the same socket and pipelines the requests; in the old API we create a socket for each request, so blocking occurs when the number of sockets is limited.

The setups for the following tests are:

- new-pipeline == the new pipelining API, with script workers; that is, a script creates a thread for each http.get it issues
- parallel-request == the old API, used the same way: the script creates a thread for each http.get it issues

I. The following results are against an Apache server with default settings (pipelining enabled and 100 allowed requests/connection).

1a) One script making N requests. This models a situation where a user just wants to run a single script (like http-sql-injection, for example) against a target.
                   120 reqs   320 reqs
new-pipeline       3s         5.5s
parallel-request   4s         8.2s

1b) Same as 1a, but with --max-parallelism 1.

                   120 reqs   320 reqs
new-pipeline       3s         5.5s
parallel-request   21s        58s

2a) 100 scripts, each making 1 request. This models a situation where a lot of small http scripts (http-title, http-robots, etc.) each issue a request to get some info from the host.

new-pipeline       3s
parallel-request   3s

2b) Same as 2a, but with --max-parallelism 1.

new-pipeline       3s
parallel-request   18s

3) 2 scripts, each making N requests. This is, for example, a case where two spiders are running.

                   20 req/each   120 req/each   200 req/each
new-pipeline       2.15s         4.5s           6.1s
parallel-request   1.5s          6.3s           10.2s

4) 2 scripts, each making 200 requests, with --max-parallelism 1.

new-pipeline       6.2s
parallel-request   74s

5) --script "http-* and not http-slowloris and not http-form-fuzzer and not http-enum". Note: these were single runs.

--max-parallelism  20     10     2
new-pipeline       272s   421s   1114s
parallel-request   294s   427s   1143s

II. The following results are against a server that doesn't support pipelining (in this case, the http.server module that ships with Python 3). The setups are the same as above.

1a)
                   120 reqs   320 reqs
new-pipeline       5s         9.4s
parallel-request   5s         9s

1b)
                   120 reqs   320 reqs
new-pipeline       21s        64s
parallel-request   21s        64s

2a)
new-pipeline       4s
parallel-request   4s

2b)
new-pipeline       20s
parallel-request   20s

3)
                   20 req/each   120 req/each   200 req/each
new-pipeline       2.3s          8.2s           12s
parallel-request   1.9s          7.5s           12s

4)
new-pipeline       85s
parallel-request   84s

III. Comparison of the old pipelining API with the new one, using the http-enum script.

new-pipeline       26s
parallel-request   22s

Options for integrating the API
--------------------------------------

In the next section you can find some links that discuss drawbacks of pipelining. That's why we need to decide whether we'd like pipelining enabled by default, or whether we should just let users specify a 'pipeline' option (while still taking advantage of persistent connections).
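If we go the opt-in route, the script-facing change could be as small as an extra field in the request options table. A purely hypothetical sketch of what that might look like; the 'pipeline' option name is taken from the discussion above and nothing is finalized:

```lua
-- Hypothetical, not a finalized API: opting in to pipelining per request.
local http = require "http"
local shortport = require "shortport"

portrule = shortport.http

action = function(host, port)
  -- Default behaviour: a plain single request over a persistent connection.
  local r1 = http.get(host, port, "/")
  -- Explicitly ask the library to batch this request into a pipeline.
  local r2 = http.get(host, port, "/robots.txt", { pipeline = true })
  return ("/: %d, /robots.txt: %d"):format(r1.status, r2.status)
end
```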
Some info on pipelining
------------------------------

For further reading on why enabling pipelining by default might cause problems (and on pipelining in general), I suggest following these links:

http://www.guypo.com/technical/http-pipelining-not-so-fast-nor-slow/
http://www.guypo.com/mobile/http-pipelining-big-in-mobile/
https://bugzilla.mozilla.org/show_bug.cgi?id=264354
https://bugzilla.mozilla.org/show_bug.cgi?id=329977

One of the articles above notes that web browsers usually use small pipelines of 2-3 requests. The current implementation allows specifying the pipeline size, but with smaller pipelines I got worse results.

Future work
---------------

Firstly, add a library that manages a pool of workers for a script. This would make pipelining easier for scripts to use. Secondly, allow more than one thread to be connected to a host: if one thread's pipeline is full, use another. This should bring some speedup. Thirdly, review the API's code once again; I'm sure there are some areas that can be written better. Finally, run a lot more tests. The main focus should be on large scans of 20+ hosts, because I haven't had a chance to run many of those yet.

---------------------------------------------

I'd really appreciate any comments/tests/suggestions.

- Peter
Attachment:
pipe.nse
Description:
_______________________________________________ Sent through the nmap-dev mailing list http://cgi.insecure.org/mailman/listinfo/nmap-dev Archived at http://seclists.org/nmap-dev/