Nmap Development mailing list archives

Re: [NSE] New NSE Script http-mirror


From: Gyanendra Mishra <anomaly.the () gmail com>
Date: Wed, 8 Jul 2015 23:20:25 +0530

Hi list,

After discussion with my mentor Daniel, I made some changes to http-mirror
that have made it faster and more accurate.
I removed a couple of gsub calls and gmatch iterators and replaced them with
faster alternatives.
I removed the dependence on the os library and now use lfs instead to create
directories.
I added URL validation based on RFC 1738 so that improper URLs, mostly
picked up from the JavaScript on web pages, don't get replaced.
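
To illustrate the kind of check meant by the URL validation, here is a
minimal Python sketch. It is not the script's Lua code; the pattern is
simplified from the "httpurl" grammar in RFC 1738, the name
is_valid_http_url is hypothetical, and accepting https is my own assumption
(RFC 1738 only defines http).

```python
import re

# Simplified from the "httpurl" production in RFC 1738 (section 5).
# hostname = *[ domainlabel "." ] toplabel, where toplabel starts with a letter.
_HOST = (r"(?:[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?\.)*"
         r"[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?")
_IPV4 = r"\d{1,3}(?:\.\d{1,3}){3}"
# Characters allowed in a path segment or query, plus %XX escapes.
_UCHAR = r"(?:[A-Za-z0-9$\-_.+!*'(),;:@&=]|%[0-9A-Fa-f]{2})"

_HTTP_URL = re.compile(
    rf"^https?://(?:{_HOST}|{_IPV4})(?::\d+)?"           # scheme, host, optional port
    rf"(?:/{_UCHAR}*(?:/{_UCHAR}*)*(?:\?{_UCHAR}*)?)?$"  # optional path and query
)


def is_valid_http_url(url: str) -> bool:
    """Return True if url fits the simplified RFC 1738 http URL shape."""
    return _HTTP_URL.match(url) is not None
```

A string such as "http://' + host + '/path", which a spider might pick up
from inline JavaScript, fails this check and would be left untouched.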

The "mirror" argument turns mirroring on.

The "preserve" argument switches all the absolute URLs back to relative
URLs. A page may contain a relative URL such as /file that our spider never
downloads, so every relative URL is first converted to an absolute one. If
preserve is set to true, the absolute URLs that were downloaded are then
switched back to relative URLs.
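
The two-step rewrite described for "preserve" can be sketched in Python as
follows. This is an illustration only, not the script's code: the function
names are hypothetical, and I am assuming that "relative" means
root-relative (for example /book/install.html).

```python
from urllib.parse import urljoin, urlsplit


def to_absolute(page_url, link):
    """Resolve a link found on page_url to an absolute URL."""
    return urljoin(page_url, link)


def to_root_relative(absolute_url):
    """Drop scheme and host, keeping the path and any query string."""
    parts = urlsplit(absolute_url)
    path = parts.path or "/"
    return path + ("?" + parts.query if parts.query else "")


def rewrite_links(page_url, links, downloaded):
    """Make every link absolute, then switch back only those we mirrored."""
    result = []
    for link in links:
        absolute = to_absolute(page_url, link)
        if absolute in downloaded:
            result.append(to_root_relative(absolute))
        else:
            # Never downloaded: keep it absolute so it still resolves.
            result.append(absolute)
    return result
```

With this approach a link to a page that was never fetched keeps pointing at
the original site instead of at a missing local file.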

The "localize" argument turns URLs like http://nmap.org/osdetect.html into
/home/user/Documents/mirror/osdetect.html. Copies created with the
localize argument are therefore not portable, but on the other hand they
don't require the directory to be served on localhost to be browsed
properly, because the paths are hard-coded.
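
As a rough Python sketch of the mapping that "localize" performs (not the
script's actual code): the function name is hypothetical, and mapping a
directory URL to index.html is my assumption.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def localize(url, destination):
    """Map a mirrored URL to a hard-coded path under destination."""
    path = urlsplit(url).path
    if path == "" or path.endswith("/"):
        # Assumption: directory URLs are stored as index.html.
        path += "index.html"
    # Note: this sketch does not sanitize ".." segments in the URL path.
    return str(PurePosixPath(destination) / path.lstrip("/"))
```

For example, localize("http://nmap.org/osdetect.html",
"/home/user/Documents/mirror") gives
"/home/user/Documents/mirror/osdetect.html".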

The best way to test it is to run it as follows:
nmap --script http-fetch --script-args
"destination='/home/user/Documents/mirror',mirror=true,preserve=true"
nmap.org -p 80 -d
To test the mirror, please run a server that serves the mirror directory so
that visiting localhost automatically opens
"/home/user/Documents/mirror/index.html".
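
One possible way to serve the mirror directory is Python's built-in
http.server module. This is just a sketch; the helper name
make_mirror_server and the default port 8000 are my own choices, and any
other static file server would do.

```python
import functools
from http.server import HTTPServer, SimpleHTTPRequestHandler


def make_mirror_server(directory, port=8000):
    """Build a server for directory, bound to localhost only."""
    # The directory keyword needs Python 3.7 or later.
    handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
    return HTTPServer(("127.0.0.1", port), handler)
```

Calling make_mirror_server("/home/user/Documents/mirror").serve_forever()
and then visiting http://localhost:8000/ should open index.html from the
mirror directory, since SimpleHTTPRequestHandler serves index.html for a
directory request when that file exists.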

Currently maxdepth and maxpagecount are set to -1 (I may change that
soon). You might want to change them to, say, 5 and 50 for testing purposes.
No blacklist is set to true, which means that non-web documents, such as
images, are downloaded by default.

You can find the script here[1].

Gyani

On Sun, Mar 1, 2015 at 1:03 AM, Gyanendra Mishra <anomaly.the () gmail com>
wrote:



Hi,

I have been working on the http-mirror script and have attached it. All
the limitations, to-dos, and documentation are in the script itself.
Comments and suggestions on how many pages should be downloaded by
default, what depth should be downloaded by default, and whether the
extensions blocked in httpspider should be downloaded, along with any other
general criticism, will be appreciated.



Thanks!

Gyanendra Mishra



_______________________________________________
Sent through the dev mailing list
https://nmap.org/mailman/listinfo/dev
Archived at http://seclists.org/nmap-dev/
