Nmap Development mailing list archives
Re: Replacing passwords.lst
From: Fyodor <fyodor () insecure org>
Date: Tue, 16 Mar 2010 23:18:28 -0700
On Wed, Mar 17, 2010 at 01:16:29AM +0000, Brandon Enright wrote:
Well each is a pretty biased sample of a really huge password population. If our lists were truly random samples from that population then no amount of weighting for sample size would be better than just summing up counts and ordering them. Since we don't know how biased each list is we should just treat them equally. If our goal is to sum the counts up while keeping them equal we have to normalize those counts.
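A minimal sketch of the two merging strategies under discussion: summing raw counts, and summing counts after normalizing each list by its own size. The function names and the toy counts are hypothetical, made up for illustration; they are not taken from the real lists or from David's script.

```python
from collections import Counter


def merge_raw(samples):
    """Sum raw counts across all lists, then order by total (ties broken alphabetically)."""
    total = Counter()
    for counts in samples.values():
        total.update(counts)  # Counter.update adds counts rather than replacing them
    return [pw for pw, _ in sorted(total.items(), key=lambda kv: (-kv[1], kv[0]))]


def merge_normalized(samples):
    """Turn each list's counts into fractions of that list, then sum the fractions."""
    total = {}
    for counts in samples.values():
        size = sum(counts.values())
        for pw, n in counts.items():
            total[pw] = total.get(pw, 0.0) + n / size
    return [pw for pw, _ in sorted(total.items(), key=lambda kv: (-kv[1], kv[0]))]


# Made-up toy data: one large list and one very small one.
samples = {
    "big": {"123456": 9000, "password": 600, "iloveyou": 400},
    "small": {"godisgood": 7, "123456": 2, "password": 1},
}

raw_order = merge_raw(samples)
normalized_order = merge_normalized(samples)
```

With this toy data, "godisgood" is last in the raw ordering but second in the normalized ordering, because seven entries make up most of the small list.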
I agree with you that each list is a bit biased and that RockYou is so huge that it dominates the other lists. But as you note, "we don't know how biased each list is", so I think treating them exactly equally is completely arbitrary. And it introduces its own biases. It would mean that the seven people from the religious site faithwriters who chose "godisgood" as their password could count as much as passwords that hundreds or thousands of people chose on RockYou. After all, RockYou has almost 2,000 times as many passwords as Faithwriters, so I think we'd be terribly discounting that huge and valuable sample size if we treated it the same way as the cheesy little lists.

I agree that we could probably make the lists a bit better now with some weighting of the files. But I'm definitely skeptical of the approach, as it seems quite subjective. Comparing with Brandon's password list is a neat idea (and I like it), but it also has the risk of just finding a solution which is closest to the biases in that file. After all, if that file was perfect we'd use it directly.

Another thing which might help, but I'm also a bit skeptical of, is assigning counts to the files which don't have them. For example, we could look at the distribution of counts in the first 3,000 entries of RockYou or one of the others, and then assign counts to files like john.txt in those proportions. Of course that would also require us to basically subjectively decide how much to weigh the john.txt file, so it is even more problematic than the issue of weighting the individual counted files.

I don't deny that a little bit of weighting to reduce the RockYou dominance would probably help, but it would be very subjective since we don't really have a good way to decide how much weight to give each list. I do think any manipulations we do should try to be as simple as possible. And we may decide to (as we do now) just sum up the counts and order them.
That also makes it very easy to add new lists, which I hope we will be doing. If RockYou's 14 million passwords is overly dominant, let's fix that by finding some more password files. Come on guys! Get to hacking! I'll send a free signed copy of Nmap Network Scanning to whoever gets me the Facebook or Twitter password list first :). OK, that's a bad joke, but I do think we'll be able to collect more password lists over time. I even have a lead on a couple now. And I think that would be the best way to remove the biases.

BTW, we currently do a little bit of subjective massaging. David's script automatically takes out a handful of terribly biased results such as the "rockyou" password which is found more than 20,000 times in the rockyou DB.

Cheers,
-F

_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://seclists.org/nmap-dev/
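The idea raised in the message of assigning counts to files that lack them could be sketched roughly as follows. This is only an illustration: `assign_counts` and its `weight` parameter are hypothetical names, the sketch assumes the uncounted file is already ordered from most to least common, and the numbers are invented. The `weight` argument is exactly the subjective knob the message is skeptical about.

```python
def assign_counts(ordered_passwords, reference_counts, weight=1.0):
    """Give the i-th password of an uncounted list the i-th largest reference count.

    ordered_passwords: passwords from a file without counts, assumed to be
        ordered from most to least common (an assumption of this sketch).
    reference_counts: counts taken from the top of a counted list.
    weight: subjective scaling factor for how much the uncounted file matters.
    """
    ranked = sorted(reference_counts, reverse=True)
    # zip stops at the shorter input, so extra passwords or counts are ignored.
    return {pw: n * weight for pw, n in zip(ordered_passwords, ranked)}


# Invented example values, not real data from any list.
assigned = assign_counts(["alpha", "beta", "gamma"], [250, 1000, 100, 500], weight=0.5)
```

The resulting dictionary could then be summed together with the counted lists in the usual way.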