Nmap Development mailing list archives

Re: [NSE][patch] More httpspider blacklist extensions, revamp function


From: Patrik Karlsson <patrik () cqure net>
Date: Fri, 15 Jun 2012 12:17:35 +0200

On Wed, Jun 13, 2012 at 10:57 PM, Daniel Miller <bonsaiviking () gmail com> wrote:

Hi list,

I was running into a problem with my XenServer instances, which host an MSI
installer for XenCenter on a simple web server. Running any of the scripts
that involve spidering resulted in downloading this 43MB file multiple
times. I added "msi" to the list of default blacklisted extensions in
httpspider.lua, and this solved the problem.

Of course, I couldn't stop there. I added more executable extensions
("msi", "bin"), archive extensions ("tgz", "tar.bz", "tar", "iso"), and a
new category, document extensions (pdf, {doc,xls,ppt}{,x,m}, od[fsp], ps,
xps).

I also noticed that the blacklist function being created in
Crawler:addDefaultBlacklist() was bloated, containing 4 local table
declarations, nested for loops, and string concatenation in the innermost
loop. I converted it into a closure over a new table which only requires
one level of for loop, and already contains the properly formatted match
patterns. Also, I moved the url:getPath() call out of the loop, added a
string.lower(), and cached the result in a local variable for doing the
string.match(). Previously, uppercase extensions in a URL would not have
been matched.
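
To illustrate the shape of the change described above, here is a minimal
stand-alone sketch in plain Lua. It is not the code from the attached patch:
make_blacklist and is_blacklisted are hypothetical names, the path is passed
as a plain string instead of being read from a URL object via url:getPath(),
and the extension list is only a subset. The patterns are built once, the
returned closure keeps them, and the lowercased path is computed a single time
before the loop.

```lua
-- Hypothetical sketch of a closure-based extension blacklist.
-- Not the httpspider.lua implementation; names and structure are assumptions.
local function make_blacklist(extensions)
  local patterns = {}
  for _, ext in ipairs(extensions) do
    -- Escape Lua pattern magic characters (such as the dot in "tar.bz"),
    -- then anchor the pattern at the end of the path.
    local escaped = ext:gsub("%p", "%%%0")
    patterns[#patterns + 1] = "%." .. escaped .. "$"
  end

  -- The closure captures `patterns`, so they are formatted only once.
  return function(path)
    -- Lowercase once, outside the loop, so "FILE.MSI" matches too.
    local p = string.lower(path)
    for _, pat in ipairs(patterns) do
      if string.match(p, pat) then
        return true
      end
    end
    return false
  end
end

local is_blacklisted = make_blacklist({
  "msi", "bin", "tgz", "tar.bz", "tar", "iso", "pdf",
})

print(is_blacklisted("/downloads/XenCenter.MSI"))  --> true
print(is_blacklisted("/index.html"))               --> false
```

With this layout the matching loop is a single level deep and does no string
concatenation per URL.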

Patch attached.

Dan

_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://seclists.org/nmap-dev/


Thanks. Nice work!
Applied as r28944.

//Patrik
-- 
Patrik Karlsson
http://www.cqure.net
http://twitter.com/nevdull77