Nmap Development mailing list archives

Re: html page extensions


From: Michael Pattrick <mpattrick () rhinovirus org>
Date: Mon, 14 Sep 2009 00:47:07 -0400

Don't forget about aspx and jhtml.

However, wouldn't it just be easier to check MIME types? Many sites
use non-standard or seemingly random extensions: WebCT uses .dowebct,
and .do is sometimes used for CGI scripts such as
'http://www.ic.gc.ca/app/opic-cipo/trdmrks/srch/tmSrch.do?lang=eng'.

I suppose that it requires more overhead, as you would have to
download the header of every link as opposed to just specific ones...
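
A minimal sketch of the MIME type check, assuming the spider already
has the raw Content-Type header value from a HEAD (or GET) response.
How that header is fetched is deliberately left out. The names
html_mime_types and is_html_content_type are hypothetical and do not
come from any existing script:

```lua
-- Hypothetical helper: decide from a Content-Type header value
-- whether the response should be treated as an HTML page.
local html_mime_types = {
  ["text/html"] = true,
  ["application/xhtml+xml"] = true,
}

local function is_html_content_type(header_value)
  -- A missing header gives us nothing to go on.
  if type(header_value) ~= "string" then
    return false
  end
  -- Keep only the media type: skip leading whitespace and stop at the
  -- first ';' or space, which drops parameters such as "charset=UTF-8".
  local mime = header_value:match("^%s*([^;%s]+)")
  if not mime then
    return false
  end
  -- Media types are case-insensitive, so compare in lowercase.
  return html_mime_types[mime:lower()] == true
end
```

With this, a link such as the .do URL above would be followed or
skipped based on what the server reports rather than on its
extension, at the cost of one extra request per link.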

-M

On Sun, Sep 13, 2009 at 11:48 PM, Patrick Donnelly <batrick () batbytes com> wrote:
Hi nmap-dev,

I'm working on an http spider script and need to know what file
extensions are common for html pages. Here's a list I have so far (in
Lua patterns):

 local html_page_extensions = {
  "%.html$", -- regular html page
  "%.htm$", -- regular html page
  "%.shtml$", -- regular html page
  "%.phtml$", -- regular html page
  "%.php$", -- php
  "%.pl$", -- perl
  "%.cgi$", -- cgi
  "%.jsp$", -- Java Server Pages
  "%.asp$", -- Active Server Pages (Microsoft)
 };

I'm also checking pages that have no extension (as that is apparently
very common). Does anyone have more to add?
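
One way the list above might be applied, as a rough sketch only: the
function name looks_like_html_page is hypothetical, and stripping the
query string and lowercasing the path are assumptions about what the
spider would want, not something stated in this message.

```lua
-- Same patterns as in the list above, repeated so the sketch is
-- self-contained.
local html_page_extensions = {
  "%.html$", "%.htm$", "%.shtml$", "%.phtml$",
  "%.php$", "%.pl$", "%.cgi$", "%.jsp$", "%.asp$",
}

-- Hypothetical helper: return true if a URL path looks like an HTML
-- page, judged by its extension alone.
local function looks_like_html_page(path)
  -- Drop any query string or fragment so that "$" anchors on the
  -- end of the path itself.
  path = path:gsub("[?#].*$", "")
  -- Assumption: extensions are compared case-insensitively.
  path = path:lower()
  -- Pages whose last path segment has no dot have no extension and
  -- are treated as candidates too.
  local last_segment = path:match("([^/]*)$")
  if not last_segment:find(".", 1, true) then
    return true
  end
  for _, pattern in ipairs(html_page_extensions) do
    if path:find(pattern) then
      return true
    end
  end
  return false
end
```

Because every pattern is anchored with "$", "%.asp$" does not match a
path ending in ".aspx"; each additional extension needs its own entry.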

--
-Patrick Donnelly

"Let all men know thee, but no man know thee thoroughly: Men freely
ford that see the shallows."

- Benjamin Franklin

_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://SecLists.Org

