Nmap Development mailing list archives
Re: Thoughts about http-spider.nse
From: Ron <ron () skullsecurity net>
Date: Fri, 22 Oct 2010 09:31:08 -0500
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hey,

I'm not sure about leaving the domain at all. I think I'd feel better not letting it leave the domain, no matter which options are given. Or, at the very loosest, only target domains that are on the same server (ie, same ip).

Forms, I'm not parsing yet. But I'm thinking of doing GET forms by default, and requiring an argument for POST forms.

I've already added a 'depth' argument, as well as 'breadth' (how many links it visits on each page). I defaulted depth to 4 and breadth to 20 for now, but that means a maximum of 20**4 or 160,000 pages. That's definitely too much, but I'm not sure what the best option will be so I'm going to have to do some experimenting.

Ron

On Fri, 22 Oct 2010 09:00:48 -0400 Ryan Giobbi <ryan () tgbemail com> wrote:
I think it'd be a great feature to allow the user to control where the spider goes and if it fills out forms. Something like

--script-args=spider-domain=[0,1,2],spider-forms=[0,1],spider-depth=[0,1,2,3,4]

spider-domain options
0 follow only links that are on the original domain and protocol of the first request. http://www.foo.com will spider to www.foo.com/foo but not foo.com or www2.foo.com or https://www.foo.com
1 follow only links that are on the original domain. http://www.foo.com will spider to www.foo.com/foo or https://www.foo.com
2 follow all links.

spider-forms options
0 don't fill out forms
1 fill out and submit forms

spider-depth
0 don't spider beyond first page
1 spider one link
2 spider two links
3 spider three links
4 spider until out of links

The options for burp pro might be a good reference:
http://portswigger.net/burp/help/spider.html#engine

On Mon, Oct 18, 2010 at 11:22 PM, Ron <ron () skullsecurity net> wrote:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hey all,

I've been putting some work this week into a spider script. Right now it works with very basic functionality (basically finds all the href='..' and src='...' arguments, parses them, and stores them in the registry), but I'm hoping to get some comments.

First, at the moment, the script has no output of its own. I think that's a good thing, because of the amount of information that people may or may not want to see, having other scripts that display the results might be a better plan. This leads us to required dependencies, though -- right now, dependencies only modify the order that requests happen, but for this to work we need dependencies that actually turn on other scripts.

Second, I'm making heavy use of the registry to store information. I'm trying to give later scripts as much information as possible. So, I'm storing the same data in many different ways to make it easy to find exactly what you want. Here are some types of data, and some short examples (mostly from nmap.org). I've attached a full registry dump of scanning nmap.org at a depth of 2.

* All pages, and all pages with their full querystring
| all_pages:
|   1: "/"
|   2: "/shared/css/insecdb.css"
|   3: "/book/man.html"
|   4: "/book/install.html"
|   5: "/download.html"

* All directories, and all files
| directories:
|   1: "/"
|   2: "/book/"
|   3: "/presentations/BHDC10/"
|   4: "/nsedoc/"
| files:
|   1: "/shared/css/insecdb.css"
|   2: "/book/man.html"
|   3: "/book/install.html"
|   4: "/download.html"

* All files indexed by extension
| extensions:
|   html:
|     1: "/book/man.html"
|     2: "/book/install.html"
|     3: "/download.html"
|     4: "/changelog.html"
|     5: "/docs.html"
|     6: "/book/nse.html"
|     7: "/movies.html"
|     8: "/book/man-bugs.html"
|   css:
|     1: "/shared/css/insecdb.css"

* All pages that have arguments, as well as their arguments (can have multiple copies for pages we see linked with different arguments)
| cgi_args:
|   /index.cfm:
|     1:
|       pageID: "12"
|     2:
|       pageID: "13"
|     3:
|       pageID: "249"
|     4:
|       pageID: "1"
| cgi_querystring:
|   /index.cfm:
|     1: "pageID=12"
|     2: "pageID=13"
|     3: "pageID=249"
|     4: "pageID=1"
|     5: "pageID=2"
| cgi_full_query:
|   1: "/index.cfm?pageID=12"
|   2: "/index.cfm?pageID=13"
|   3: "/index.cfm?pageID=249"
|   4: "/index.cfm?pageID=1"
| cgi:
|   1: "/index.cfm"
|   2: "/display.cfm"

* All pages we've seen a specific page link to (or linked from)
| links_to:
|   /docs.html:
|     1: "/shared/css/insecdb.css"
|     2: "http://g.adspeed.net/ad.php?do=clk&zid=14678&wd=728&ht=90&pair=as"
|     3: "http://nmap.org/"
|     4: "http://nmap.org/book/man.html"
|     5: "http://nmap.org/book/install.html"
|     6: "http://nmap.org/download.html"
|     7: "http://nmap.org/changelog.html"
|   /book/nse.html:
|     1: "/shared/css/insecdb.css"
|     2: "/book/toc.html"
|     3: "/book/osdetect-unidentified.html"
|     4: "/book/nse-usage.html"
|     5: "/book/preface.html"
|     6: "/book/intro.html"

* All pages indexed by content-type
| content-type:
|   text/html; charset=UTF-8:
|     1: "/"
|     2: "/book/man.html"
|     3: "/book/install.html"
|     4: "/download.html"
|     5: "/changelog.html"
|     6: "/book/"
|   text/css:
|     1: "/shared/css/insecdb.css"
|   text/html; charset=iso-8859-1:
|     1: "/favicon"
|     2: "/data/COPYING"
|     3: "/fb"

Opinions would be great!

Ron

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.16 (GNU/Linux)

iEYEARECAAYFAky9DuIACgkQ2t2zxlt4g/RPRACfSfK2Kgh4zRLsjmTNu+LGDxn9
+F0AoLBkFZ2EzOnW+BXuSndp8zP0N1A3
=t0DG
-----END PGP SIGNATURE-----
_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://seclists.org/nmap-dev/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.16 (GNU/Linux)

iEYEARECAAYFAkzBoDAACgkQ2t2zxlt4g/RHZwCgzwP6ldCi6Opf4bpUD/1AwBRx
yS4An22EaVUZ+d/DYlf4JIwpXXxoWRST
=s+Fh
-----END PGP SIGNATURE-----
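To make the scope and size discussion in this thread concrete, here is a minimal Python sketch (not the NSE script itself, which would be written in Lua). The function names `in_scope` and `worst_case_pages` are hypothetical. It assumes links have already been resolved to absolute URLs, that "original domain" in the proposed spider-domain modes means the exact hostname, and that depth counts link hops from the start page with no page visited twice.

```python
from urllib.parse import urlsplit


def in_scope(start_url, link, mode):
    """Apply the proposed spider-domain modes to an absolute link.

    mode 0: same hostname and same protocol as the first request
    mode 1: same hostname, any protocol
    mode 2: follow everything
    (Exact-hostname matching is an assumption of this sketch.)
    """
    if mode == 2:
        return True
    start, target = urlsplit(start_url), urlsplit(link)
    same_host = (start.hostname or "") == (target.hostname or "")
    if mode == 1:
        return same_host
    if mode == 0:
        return same_host and start.scheme == target.scheme
    raise ValueError("mode must be 0, 1, or 2")


def worst_case_pages(depth, breadth):
    """Upper bound on pages fetched: the start page plus up to
    breadth**k pages at each level k, for k = 1..depth."""
    return sum(breadth ** level for level in range(depth + 1))


if __name__ == "__main__":
    start = "http://www.foo.com"
    for link in ("http://www.foo.com/foo", "https://www.foo.com", "http://www2.foo.com"):
        print(link, [in_scope(start, link, m) for m in (0, 1, 2)])
    print(worst_case_pages(4, 20))
```

Under these assumptions the deepest level alone accounts for 20**4 = 160,000 pages, the figure quoted above; counting the shallower levels and the start page as well raises the bound slightly. A real crawl would normally fetch far fewer pages, since many links repeat or fall out of scope.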
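The idea in the quoted message of storing the same crawl results under several keys can be sketched as follows. This is an illustration in Python rather than the script's actual Lua code; the helper name `index_urls` is hypothetical, the key names are taken from the registry dump, and treating every file's parent path as a directory is an assumption of this sketch.

```python
from collections import defaultdict
from urllib.parse import urlsplit, parse_qsl
import posixpath


def index_urls(urls):
    """Build several views of one list of site-relative URLs."""
    reg = {
        "all_pages": [],
        "directories": [],
        "files": [],
        "extensions": defaultdict(list),
        "cgi": [],
        "cgi_querystring": defaultdict(list),
        "cgi_args": defaultdict(list),
        "cgi_full_query": [],
    }

    def add_unique(items, value):
        # Keep first-seen order, skip duplicates.
        if value not in items:
            items.append(value)

    for url in urls:
        parts = urlsplit(url)
        path = parts.path or "/"
        add_unique(reg["all_pages"], path)
        if path.endswith("/"):
            add_unique(reg["directories"], path)
        else:
            add_unique(reg["files"], path)
            # Assumption: the parent of every file counts as a directory.
            parent = posixpath.dirname(path).rstrip("/") + "/"
            add_unique(reg["directories"], parent)
            ext = posixpath.splitext(path)[1].lstrip(".")
            if ext:
                add_unique(reg["extensions"][ext], path)
        if parts.query:
            add_unique(reg["cgi"], path)
            add_unique(reg["cgi_querystring"][path], parts.query)
            add_unique(reg["cgi_args"][path], dict(parse_qsl(parts.query)))
            add_unique(reg["cgi_full_query"], path + "?" + parts.query)
    return reg


if __name__ == "__main__":
    reg = index_urls(["/", "/book/man.html", "/index.cfm?pageID=12", "/index.cfm?pageID=13"])
    for key, value in reg.items():
        print(key, dict(value) if isinstance(value, defaultdict) else value)
```

Keeping redundant indexes costs memory but lets a later consumer look up, say, every page with a given extension without re-parsing URLs, which matches the stated goal of making it easy to find exactly what you want.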
Current thread:
- Thoughts about http-spider.nse Ron (Oct 18)
- Re: Thoughts about http-spider.nse Ryan Giobbi (Oct 22)
- Re: Thoughts about http-spider.nse Ron (Oct 22)
- Re: Thoughts about http-spider.nse Ryan Giobbi (Oct 22)