Nmap Development mailing list archives
Re: html page extensions
From: jah <jah () zadkiel plus com>
Date: Mon, 14 Sep 2009 11:30:47 +0100
On 14/09/2009 04:48, Patrick Donnelly wrote:
Hi nmap-dev,

I'm working on an http spider script and need to know what file extensions are common for html pages. Here's a list I have so far (in Lua regular expressions):

    local html_page_extensions = {
      "%.html$",  -- regular html page
      "%.htm$",   -- regular html page
      "%.shtml$", -- regular html page
      "%.phtml$", -- regular html page
      "%.php$",   -- php
      "%.pl$",    -- perl
      "%.cgi$",   -- cgi
      "%.jsp$",   -- Java Server Pages
      "%.asp$",   -- Active Server Pages (Microsoft)
    };

I'm also checking pages that have no extension (as that is apparently very common). Does anyone have more to add?
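As a rough illustration of how the list above might be applied, here is a minimal sketch that tests a URL path against the patterns and also accepts paths with no extension. The helper name looks_like_html_page, the lowercasing and the query-string stripping are assumptions made for this example and are not taken from the actual spider script.

```lua
-- Patterns from the list above; each is anchored at the end of the path.
local html_page_extensions = {
  "%.html$", "%.htm$", "%.shtml$", "%.phtml$",
  "%.php$", "%.pl$", "%.cgi$", "%.jsp$", "%.asp$",
}

-- Hypothetical helper: true if the path plausibly refers to an HTML page.
local function looks_like_html_page(path)
  -- Drop any query string or fragment before looking at the extension.
  path = path:gsub("[?#].*$", "")
  -- Take the final path segment (the text after the last "/").
  local last = path:match("([^/]*)$")
  -- No dot in the last segment means no extension; treat it as a page.
  if not last:find(".", 1, true) then
    return true
  end
  -- Compare case-insensitively against each known extension pattern.
  local lower = last:lower()
  for _, pattern in ipairs(html_page_extensions) do
    if lower:find(pattern) then
      return true
    end
  end
  return false
end

print(looks_like_html_page("/index.html"))
print(looks_like_html_page("/style.css"))
```

Treating every extensionless path as a page is a deliberately loose choice here; a spider would probably still want to confirm the type from the server's response.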
.xht, .xhtml, .htmls, .hta, .cfml, .adp, .aht, .ahtm, .ahtml, .mht, .mhtm, .mhtml, .jht, .jhtm, .jhtml

You could probably write a script that performs a filetype:<some_random_extension> Google search, sends a HEAD request for the first result and checks for Content-Type: text/html. You'll likely end up with a list as long as your arm.

.php3, .php4 and .php5 are common (frequently used to distinguish between versions of PHP), but php followed by 'some number' is likely. I did a few Google searches for "filetype:phpN" for different values of N:

        N    results
        0        542
        1      22500
        2       1610
        3   57000000
        4    5190000
        5    3040000
        6       6490
        7        245
        8        143
        9        568
       10         56
      ...
       29         23
       30         36
      ...
      200          3
      ...
     9999          1
    10000          0

So it seems that for any non-negative integer below 10000 there's a possibility that the filetype is in use. (The single result at 9999 is a bit of an oddity since the 9999 is a parameter to the php script.)

jah
_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://SecLists.Org
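The suggestions in the reply could be folded into the pattern list roughly as follows. This is only a sketch: the single pattern "%.php%d*$" is one assumed way to cover .php, .php3, .php4 and any other numbered variant, and the helper names has_html_extension and is_html_content_type are hypothetical. The Content-Type check works on a header value that is assumed to have been fetched already (for example by a HEAD request); no HTTP library calls are shown.

```lua
-- Extensions named in the reply, without the leading dot.
local extra_extensions = {
  "xht", "xhtml", "htmls", "hta", "cfml", "adp",
  "aht", "ahtm", "ahtml", "mht", "mhtm", "mhtml",
  "jht", "jhtm", "jhtml",
}

-- "%.php%d*$" matches .php followed by zero or more digits (assumed approach).
local html_page_extensions = { "%.php%d*$" }
for _, ext in ipairs(extra_extensions) do
  html_page_extensions[#html_page_extensions + 1] = "%." .. ext .. "$"
end

-- Hypothetical helper: does the path end in one of the extensions above?
-- The path is assumed to carry no query string or fragment.
local function has_html_extension(path)
  local lower = path:lower()
  for _, pattern in ipairs(html_page_extensions) do
    if lower:find(pattern) then
      return true
    end
  end
  return false
end

-- Hypothetical helper: is a Content-Type header value text/html?
-- Parameters such as "; charset=UTF-8" are ignored.
local function is_html_content_type(value)
  if type(value) ~= "string" then
    return false
  end
  local media_type = value:lower():match("^%s*([^;%s]+)")
  return media_type == "text/html"
end

print(has_html_extension("/index.php5"))
print(is_html_content_type("text/html; charset=UTF-8"))
```

Checking the Content-Type of the response is the more reliable signal of the two; the extension list is mainly useful for deciding which links are worth requesting in the first place.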
Current thread:
- html page extensions Patrick Donnelly (Sep 13)
- Re: html page extensions Michael Pattrick (Sep 13)
- Re: html page extensions jah (Sep 14)