Nmap Development mailing list archives
Re: [RFC] Username/Password NSE library
From: "Andrew J. Bennieston" <harriergr7 () gmail com>
Date: Wed, 18 Jun 2008 15:34:09 +0100
The following are some thoughts on the statistics of password guessing. I haven't researched this in any detail whatsoever, but I'm a physicist during the day, and one of my interests lies in statistical modelling.

Following on from the comment that a list of the 1000 most common passwords performs barely any better than one of the 500 most common, I was immediately reminded of the Central Limit Theorem. This applies to anything which has a Gaussian (or near-Gaussian) distribution, and states two things which are of potential interest here.

The first is obvious: the most efficient outcome is to get the password right on the first guess; this provides the "zero centred" peak for our Gaussian distribution of efficiency vs. list size.

The second is the statement that the width of the distribution scales as the inverse square root of N, the number of items in the list. Interpreting this for password lists, we can say that as the length N of a list increases, the increase in effectiveness we get by virtue of it including more passwords, and thus being more likely to contain the correct one, should scale at most as the square root of N (i.e. the reciprocal of the "distribution width").

For example, if we normalise our measure of effectiveness so that the constant of proportionality is 1, then the effectiveness of a list of 500 passwords is given the (dimensionless) value

    Sqrt[500] = 22.4

The effectiveness of 1000 passwords, a list twice as long, is

    Sqrt[1000] = 31.6

This clearly demonstrates that, in this model, the effectiveness is not doubled by doubling the list length; it grows only by a factor of Sqrt[2], about 1.41. If, as was hinted in earlier posts, the fall-off of effectiveness is much steeper than this, i.e. 1000 passwords are barely any more effective than 500, then this could be accommodated using a prefactor proportional to N to some (negative) power.

Of course, most people on this list aren't interested in the slightest in the statistical models of such things, but in how to choose the optimal list length.
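As a quick numerical check of the square-root model described above, the following Python sketch evaluates the normalised effectiveness for both list sizes. The function name `effectiveness` and its `prefactor` argument are illustrative assumptions, not part of any Nmap or NSE API.

```python
import math


def effectiveness(n: int, prefactor: float = 1.0) -> float:
    """Model effectiveness of an n-entry password list as prefactor * sqrt(n).

    This is only the square-root model from the message; the prefactor is
    normalised to 1 by default, as in the worked example.
    """
    return prefactor * math.sqrt(n)


e500 = effectiveness(500)
e1000 = effectiveness(1000)

print(f"E(500)  = {e500:.1f}")
print(f"E(1000) = {e1000:.1f}")
# Doubling the list length multiplies the model effectiveness by sqrt(2).
print(f"ratio   = {e1000 / e500:.3f}")
```

Under this model the ratio is independent of the starting length, so any doubling yields the same relative gain of roughly 41%.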
Based on the model postulated above, we can compare the effectiveness one might expect from a list of doubled length, E2 = 2*E1 = 2*Sqrt[N1], with the true effectiveness in this model, E2' = Sqrt[N2]. The discrepancy between the two can be written:

    dE(N1, N2) = 2*Sqrt[N1] - Sqrt[N2]

where N2 = 2*N1, so that

    dE(N1) = 2*Sqrt[N1] - Sqrt[2*N1]

Setting an arbitrary threshold dE <= 10, we can find the value of N1 above which the discrepancy between the expected effectiveness (from doubling the list size) and the true effectiveness (from the Central Limit Theorem) exceeds 10:

    dE(N1) = 2*Sqrt[N1] - Sqrt[2*N1] <= 10
    Sqrt[N1]*(2 - Sqrt[2]) <= 10
    Sqrt[N1] <= 10/(2 - Sqrt[2])
    N1 <= 291.4

In other words, a password list of size ~250 to 300 keeps the discrepancy between the true effectiveness of the list and the perceived effectiveness due to the presence of more words below the value of 10 (which I chose arbitrarily, but it seems to have provided a reasonable value for N!).

Anyway, I hope some of that helped to get you guys thinking about the effect of list size on brute-force password guessing; there is definitely a diminishing return, and while the distribution may not be exactly Gaussian, most independent random variables fit some kind of Gaussian, and thus obey the central limit theorem.

Andrew J. Bennieston

Kris Katterjohn wrote:
> Brandon Enright wrote:
> > On Tue, 17 Jun 2008 15:46:09 -0500
> > Kris Katterjohn <katterjohn () gmail com> wrote:
> > > Hey everyone,
> > >
> > > I've started working on a username and password NSE library. This
> > > library will separately hand out usernames and/or passwords to
> > > scripts for use with brute forcing or what have you. I'll probably
> > > have one set of functions return a closure to return the usernames
> > > or passwords one at a time, and possibly another set of functions
> > > to return the whole username or password table.
> >
> > Username-specific passwords would be _really_ nice. I'm thinking for
> > root the password list would be a few hundred long. For other users
> > the list would probably be something like:
> >
> > <username>
> > <Username>
> > <USERNAME>
> > <blank>
> > password
> > pass
> > changeme
> > Changeme
> > ChangeMe
> > guest
> > qwerty
> > asdf
> > abc123
>
> These are interesting ideas, especially the overall user-specific
> passwords.
>
> > > Now I need opinions on good username and password lists to ship
> > > and use by default. There is an ordered password list shipped with
> > > John the Ripper which has 3107 entries. The license[1] pretty much
> > > says we can distribute it if we give credit and also ship the
> > > license. Are there any ideas on a better list?
> >
> > It has been my experience, both from UCSD being on the victim side
> > of constant password guessing, and from me being on the
> > audit-our-passwords side, that more passwords is _not_ better. If
> > you don't guess the password in the first hundred tries or so, it is
> > very unlikely that continued guessing will help much. Guessing
> > passwords over the network is expensive and there is a diminishing
> > return. The value of trying an additional password is roughly
> > inversely proportional to the number you have already tried. We've
> > found that a list of the 1000 most commonly guessed passwords
> > performs almost no better than 500 but takes twice as long.
>
> Interesting! I've never been a brute-forcer, so I had no idea what a
> good number of guesses would be.
>
> Of course, it is up to the script how many attempts it makes: the
> library will only provide it with the data. This library is
> specifically for giving scripts usernames and passwords, so that's a
> good reason to have a whole bunch. On one hand, I don't want Nmap's
> list to end up being too small because somebody's script does want to
> do a lot of guessing; but, on the other hand, if a user wants a
> massive list to use, they can always select their own.
>
> > > What about a good username list?
> >
> > Besides the obvious root, webadmin, guest, admin, test, mysql, web,
> > oracle, student, staff, etc. we should only use first names. Nearly
> > 100% of the SSH brute force compromises we fall to are just
> > first-name usernames like:
> >
> > joe
> > bob
> > john
> > danielle
> > matt
> > david
> > mark
> >
> > you get the idea
>
> Good idea. Maybe there can be an option given to the username function
> to return only "administrator" usernames like root, admin, etc. But
> thinking about it for a second, it wouldn't be easy to do just reading
> from a list. Of course we could just have the administrator names at
> the top of the list, which is probably best anyway.
>
> > > Any other comments are appreciated.
> >
> > I think the best way to gather the root list is to collect
> > real-world honeypot data. I have data I can provide and I'm sure
> > hundreds of others on this list also have data. We should probably
> >
> > cat * | sort | uniq -c | sort -nr | head -n 500
> >
> > to make our list.
>
> That would be awesome, though the actual number of entries in the list
> is still negotiable.
>
> > Overall I think this is a very good idea, Kris. I look forward to
> > the result.
>
> Ah, if only I could take credit for good ideas. This, like many things
> I work on, was handed to me from people in the Thinking Stuff Up
> Dept. ;)
>
> It really should be cool when it's complete because bruteTelnet will
> be ported to it and it should make the creation of other brute-force
> scripts a bit easier.
>
> > Brandon
>
> Thanks,
> Kris Katterjohn
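The shell pipeline Brandon suggests for building a list from honeypot data can also be sketched in Python. This is only an illustration under stated assumptions: the input is taken to be one attempted password per line, the function name `top_passwords` is hypothetical, and ties are ordered by first appearance rather than the way `sort -nr` would order them.

```python
from collections import Counter
from typing import Iterable, List


def top_passwords(attempts: Iterable[str], limit: int = 500) -> List[str]:
    """Return the `limit` most frequently attempted passwords, most common first.

    Each item in `attempts` is one logged guess; a trailing newline is
    stripped so that lines read straight from a file can be passed in.
    """
    counts = Counter(line.rstrip("\n") for line in attempts)
    return [password for password, _count in counts.most_common(limit)]


# Tiny made-up sample standing in for real honeypot logs.
sample = ["123456\n", "password\n", "123456\n", "root\n", "password\n", "123456\n"]
result = top_passwords(sample, limit=2)
print(result)
```

Keeping `limit` as a parameter reflects the point made in the thread that the actual number of entries to ship is still open.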
_______________________________________________ Sent through the nmap-dev mailing list http://cgi.insecure.org/mailman/listinfo/nmap-dev Archived at http://SecLists.Org
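The threshold arithmetic in Andrew's message (dE <= 10 leading to N1 <= 291.4) can be reproduced with a few lines of Python. This is a sketch of that square-root model only; the function names `doubling_shortfall` and `max_list_size` are made up for illustration, and the threshold of 10 is the arbitrary value chosen in the message.

```python
import math


def doubling_shortfall(n1: float) -> float:
    """dE(N1) = 2*sqrt(N1) - sqrt(2*N1): naive doubled effectiveness minus model value."""
    return 2 * math.sqrt(n1) - math.sqrt(2 * n1)


def max_list_size(threshold: float) -> float:
    """Largest N1 satisfying dE(N1) <= threshold.

    Follows from sqrt(N1) * (2 - sqrt(2)) <= threshold.
    """
    return (threshold / (2 - math.sqrt(2))) ** 2


n1 = max_list_size(10)
print(f"N1 <= {n1:.1f}")
print(f"dE at that N1 = {doubling_shortfall(n1):.3f}")
```

Because the bound grows with the square of the threshold, a different arbitrary choice moves the suggested list size considerably; for instance a threshold of 20 gives a bound four times as large.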
Current thread:
- [RFC] Username/Password NSE library Kris Katterjohn (Jun 17)
- Re: [RFC] Username/Password NSE library Brandon Enright (Jun 17)
- Re: [RFC] Username/Password NSE library Kris Katterjohn (Jun 17)
- Re: [RFC] Username/Password NSE library Andrew J. Bennieston (Jun 18)
- Re: [RFC] Username/Password NSE library Kris Katterjohn (Jun 17)
- Re: [RFC] Username/Password NSE library Tom Sellers (Jun 17)
- Re: [RFC] Username/Password NSE library Kris Katterjohn (Jun 17)
- Re: [RFC] Username/Password NSE library Fyodor (Jun 18)
- Re: [RFC] Username/Password NSE library Kris Katterjohn (Jun 18)
- Re: [RFC] Username/Password NSE library Kris Katterjohn (Jun 19)
- Re: [RFC] Username/Password NSE library Fyodor (Jun 19)
- Re: [RFC] Username/Password NSE library Kris Katterjohn (Jun 19)
- Re: [RFC] Username/Password NSE library Kris Katterjohn (Jun 23)
- RE: [RFC] Username/Password NSE library Thomas Buchanan (Jun 24)
- Re: [RFC] Username/Password NSE library Kris Katterjohn (Jun 24)
- Re: [RFC] Username/Password NSE library Kris Katterjohn (Jun 17)
- Re: [RFC] Username/Password NSE library Brandon Enright (Jun 17)