Nmap Development mailing list archives

Re: Thoughts about http-spider.nse


From: Ron <ron () skullsecurity net>
Date: Fri, 22 Oct 2010 09:31:08 -0500

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hey,

I'm not sure about leaving the domain at all. I think I'd feel better not letting it leave the domain, no matter which
options are given. Or, at the very loosest, only target domains that are on the same server (i.e., same IP).

Forms, I'm not parsing yet. But I'm thinking of doing GET forms by default, and requiring an argument for POST forms. 

I've already added a 'depth' argument, as well as 'breadth' (how many links it visits on each page). I defaulted depth 
to 4 and breadth to 20 for now, but that means a maximum of 20**4 or 160,000 pages. That's definitely too much, but I'm 
not sure what the best option will be so I'm going to have to do some experimenting. 
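
For reference, the arithmetic behind that bound can be checked with a short sketch. It assumes the worst case, where every page yields exactly `breadth` links that have not been seen before, which real sites will not do. The function name is illustrative and is not part of the script. Note that 20**4 counts only the deepest level; adding the shallower levels gives a slightly larger figure.

```python
def max_pages(depth: int, breadth: int) -> int:
    """Upper bound on pages fetched below the start page.

    Assumes each page contributes exactly `breadth` links that have not
    been seen before (a worst case, not a prediction).
    """
    # Level k (1-based) holds at most breadth**k pages.
    return sum(breadth ** level for level in range(1, depth + 1))


deepest_level = 20 ** 4          # pages at depth 4 only
all_levels = max_pages(4, 20)    # pages at depths 1 through 4
print(deepest_level, all_levels)  # prints: 160000 168420
```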

Ron

On Fri, 22 Oct 2010 09:00:48 -0400 Ryan Giobbi <ryan () tgbemail com> wrote:
I think it'd be a great feature to allow the user to control where the
spider goes and if it fills out forms.


Something like

--script-args=spider-domain=[0,1,2],spider-forms=[0,1],spider-depth=[0,1,2,3,4]

spider-domain options
0 follow only links that are on the original domain and protocol of
the first request. http://www.foo.com will spider to www.foo.com/foo
but not foo.com or www2.foo.com or https://www.foo.com
1 follow only links that are on the original domain. http://www.foo.com
will spider to www.foo.com/foo or https://www.foo.com
2 follow all links.
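
The three spider-domain levels above could be expressed as a single scope check. This is a minimal Python sketch, not code from http-spider.nse. The name `in_scope` is hypothetical. It assumes "original domain" means an exact hostname match, so foo.com and www2.foo.com stay out of scope in modes 0 and 1, and it assumes non-HTTP links such as mailto: are skipped even in mode 2.

```python
from urllib.parse import urljoin, urlsplit


def in_scope(start_url: str, link: str, mode: int) -> bool:
    """Decide whether `link` may be followed under spider-domain `mode`.

    mode 0: same hostname and same protocol as the start URL
    mode 1: same hostname, any protocol (http or https)
    mode 2: follow everything
    """
    start = urlsplit(start_url)
    # Resolve relative links such as "/foo" against the start URL.
    target = urlsplit(urljoin(start_url, link))
    if target.scheme not in ("http", "https"):
        return False  # assumption: skip mailto:, javascript:, and similar
    if mode == 2:
        return True
    same_host = target.hostname == start.hostname
    if mode == 1:
        return same_host
    return same_host and target.scheme == start.scheme
```

With a start URL of http://www.foo.com, mode 0 accepts "/foo" and rejects https://www.foo.com/, while mode 1 accepts both.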

spider-forms options
0 don't fill out forms
1 fill out and submit forms

spider-depth
0 don't spider beyond first page
1 spider one link
2 spider two links
3 spider three links
4 spider until out of links
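
To make the depth options concrete, here is a hedged sketch of a depth-limited, breadth-first walk. It runs over an in-memory dictionary of links instead of fetching pages, so it only illustrates the bookkeeping. The names `crawl`, `max_depth`, and `max_breadth` are hypothetical; `max_depth=None` stands in for "spider until out of links", and the breadth cap (an assumption here) takes the first N links on each page, whether or not they were already seen.

```python
from collections import deque
from typing import Dict, List, Optional


def crawl(links: Dict[str, List[str]], start: str,
          max_depth: Optional[int],
          max_breadth: Optional[int] = None) -> List[str]:
    """Breadth-first walk of `links`, a stand-in for fetching pages.

    max_depth=None means no depth limit; max_depth=0 visits only the
    start page. max_breadth caps how many links are taken per page.
    """
    visited = [start]
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        page, depth = queue.popleft()
        if max_depth is not None and depth >= max_depth:
            continue  # do not follow links from pages at the depth limit
        outgoing = links.get(page, [])
        if max_breadth is not None:
            outgoing = outgoing[:max_breadth]
        for target in outgoing:
            if target not in seen:
                seen.add(target)
                visited.append(target)
                queue.append((target, depth + 1))
    return visited
```

The `seen` set is what keeps the walk from looping when pages link back to each other.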


The options for Burp Pro might be a good reference:
http://portswigger.net/burp/help/spider.html#engine


On Mon, Oct 18, 2010 at 11:22 PM, Ron <ron () skullsecurity net> wrote:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hey all,

I've been putting some work this week into a spider script. Right
now it works with very basic functionality (basically finds all the
href='..' and src='...' arguments, parses them, and stores them in
the registry), but I'm hoping to get some comments.
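
As an illustration of the extraction step described above, the following Python sketch collects href and src attribute values with the standard library's HTML parser. It is not the script's own code (NSE scripts are written in Lua), and the names `LinkCollector` and `extract_links` are hypothetical.

```python
from html.parser import HTMLParser
from typing import List


class LinkCollector(HTMLParser):
    """Collect the values of href and src attributes, in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names arrive lowercased.
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def extract_links(html: str) -> List[str]:
    """Return every href/src value found in `html`."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links
```

The returned values are raw attribute strings; resolving relative links and storing them would be separate steps.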

First, at the moment, the script has no output of its own. I think
that's a good thing: given the amount of information that people
may or may not want to see, having other scripts display the
results might be a better plan. This leads us to required
dependencies, though -- right now, dependencies only modify the
order in which requests happen, but for this to work we need
dependencies that actually turn on other scripts.

Second, I'm making heavy use of the registry to store information.
I'm trying to give later scripts as much information as possible.
So, I'm storing the same data in many different ways to make it
easy to find exactly what you want. Here are some types of data,
and some short examples (mostly from nmap.org). I've attached a
full registry dump of scanning nmap.org at a depth of 2.

* All pages, and all pages with their full querystring
|       all_pages:
|         1: "/"
|         2: "/shared/css/insecdb.css"
|         3: "/book/man.html"
|         4: "/book/install.html"
|         5: "/download.html"

* All directories, and all files
|       directories:
|         1: "/"
|         2: "/book/"
|         3: "/presentations/BHDC10/"
|         4: "/nsedoc/"
|       files:
|         1: "/shared/css/insecdb.css"
|         2: "/book/man.html"
|         3: "/book/install.html"
|         4: "/download.html"

* All files indexed by extension
|       extensions:
|         html:
|           1: "/book/man.html"
|           2: "/book/install.html"
|           3: "/download.html"
|           4: "/changelog.html"
|           5: "/docs.html"
|           6: "/book/nse.html"
|           7: "/movies.html"
|           8: "/book/man-bugs.html"
|         css:
|           1: "/shared/css/insecdb.css"

* All pages that have arguments, as well as their arguments (can
have multiple copies for pages we see linked with different
arguments)
|       cgi_args:
|         /index.cfm:
|           1:
|             pageID: "12"
|           2:
|             pageID: "13"
|           3:
|             pageID: "249"
|           4:
|             pageID: "1"
|       cgi_querystring:
|         /index.cfm:
|           1: "pageID=12"
|           2: "pageID=13"
|           3: "pageID=249"
|           4: "pageID=1"
|           5: "pageID=2"
|       cgi_full_query:
|         1: "/index.cfm?pageID=12"
|         2: "/index.cfm?pageID=13"
|         3: "/index.cfm?pageID=249"
|         4: "/index.cfm?pageID=1"
|       cgi:
|         1: "/index.cfm"
|         2: "/display.cfm"

* All pages we've seen a specific page link to (or linked from)
|       links_to:
|         /docs.html:
|           1: "/shared/css/insecdb.css"
|           2: "http://g.adspeed.net/ad.php?do=clk&amp;zid=14678&amp;wd=728&amp;ht=90&amp;pair=as"
|           3: "http://nmap.org/"
|           4: "http://nmap.org/book/man.html"
|           5: "http://nmap.org/book/install.html"
|           6: "http://nmap.org/download.html"
|           7: "http://nmap.org/changelog.html"
|         /book/nse.html:
|           1: "/shared/css/insecdb.css"
|           2: "/book/toc.html"
|           3: "/book/osdetect-unidentified.html"
|           4: "/book/nse-usage.html"
|           5: "/book/preface.html"
|           6: "/book/intro.html"

* All pages indexed by content-type
|       content-type:
|         text/html; charset=UTF-8:
|           1: "/"
|           2: "/book/man.html"
|           3: "/book/install.html"
|           4: "/download.html"
|           5: "/changelog.html"
|           6: "/book/"
|         text/css:
|           1: "/shared/css/insecdb.css"
|         text/html; charset=iso-8859-1:
|           1: "/favicon"
|           2: "/data/COPYING"
|           3: "/fb"

Opinions would be great!

Ron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.16 (GNU/Linux)

iEYEARECAAYFAky9DuIACgkQ2t2zxlt4g/RPRACfSfK2Kgh4zRLsjmTNu+LGDxn9
+F0AoLBkFZ2EzOnW+BXuSndp8zP0N1A3
=t0DG
-----END PGP SIGNATURE-----

_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://seclists.org/nmap-dev/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.16 (GNU/Linux)

iEYEARECAAYFAkzBoDAACgkQ2t2zxlt4g/RHZwCgzwP6ldCi6Opf4bpUD/1AwBRx
yS4An22EaVUZ+d/DYlf4JIwpXXxoWRST
=s+Fh
-----END PGP SIGNATURE-----