Educause Security Discussion mailing list archives

Re: Password entropy


From: Harold Winshel <winshel () CAMDEN RUTGERS EDU>
Date: Mon, 24 Jul 2006 14:05:13 -0400

Valdis,

Thanks for your detailed answer.  I'm not sure if I'm following all
of it but I did note your suggestion that using a famous phrase - or
maybe any common phrase - may not be a good idea.

Harold

At 11:30 AM 7/24/2006, Valdis Kletnieks wrote:
On Mon, 24 Jul 2006 07:02:02 EDT, Harold Winshel said:
> Is there a relatively easy way to calculate the entropy of a passphrase?

Not really - it's a statistical thing, and trying to do statistics on one
sample is, of course, lunacy at best.  What you *can* do is say that "this
type of character usually has N bits" etc, and conclude "a passphrase of
this length and these characteristics will have about this many bits..."

Fortunately, for most applications, a back-of-envelope calculation is
sufficient - unless you're a cryptanalyst, the only question you really
care about is "Does this set of criteria give us a 'sufficiently difficult'
passphrase for breaking?"

> I am not familiar with this and am not sure what "3 bits of entropy
> per character" means?

"3 bits of entropy" means that, on average, it takes 3 bits to enumerate
the possible values for this character. Note this *doesn't* mean that
there's only 8 possible values - it's the *average* number of bits needed.
So for instance, if we're looking at a common character sequence such as
'th', we might use 3 2-bit values for 'a','e', and 'i', and use the 4th
2-bit value for "something else", followed by 4 more bits to cover 16 other
characters. If 75% of the time we use 2 bits, and 25% of the time we need
all 6 bits, that's an average of 3 bits...
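
As a rough check on that arithmetic, here is a minimal Python sketch.  The
exact probabilities (25% each for 'a', 'e' and 'i', with the remaining 25%
spread evenly over 16 other characters) are assumed purely for illustration
and are not measured English statistics; average_bits and shannon_entropy
are hypothetical helper names.

```python
import math

# Assumed distribution for the character following 'th' (illustration only):
# 'a', 'e', 'i' each occur 25% of the time; the last 25% is split evenly
# across 16 other characters.
common = {"a": 0.25, "e": 0.25, "i": 0.25}
others = {f"other{i}": 0.25 / 16 for i in range(16)}
probs = {**common, **others}

# Code lengths from the scheme described above: 2 bits for a common letter,
# or the 2-bit "something else" marker plus 4 more bits (6 total).
code_len = {sym: 2 for sym in common}
code_len.update({sym: 6 for sym in others})


def average_bits(probs, code_len):
    """Expected number of bits per character under the given code."""
    return sum(p * code_len[sym] for sym, p in probs.items())


def shannon_entropy(probs):
    """Shannon entropy in bits: -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)


print(average_bits(probs, code_len))  # 3.0  (0.75*2 + 0.25*6)
print(shannon_entropy(probs))
```

With these assumed numbers the average code length and the Shannon entropy
both come out to 3 bits, because each code length equals -log2 of that
character's probability.  A real distribution would not line up so neatly.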

The relationship to trying to break the password is, of course, that
if it takes 3 bits to enumerate the likely choices, that character adds
3 bits to the space the cracker has to search.  Of course, this is
statistical - sometimes the cracker will get it right on the first try,
other times it will have to iterate through 10 or 15 unlikely possibilities.
But on average, it means they're going to have to work 2^3 = 8 times harder
to guess it...

> About all I understand right now is that a higher numeric value of
> entropy is better than a lower value (i.e., 4 bits of entropy is
> better than 3 bits of entropy), but I don't know what that means or
> how you do a calculation.

OK... some handy numbers:

Running English text has about 3.5 bits of entropy per character.
Totally random A-Za-z0-9 has almost exactly 6 bits (there's 62 values,
and 6 bits can cover 64).  Totally random across the 96 printable ascii
has 6.5 bits, and random across the 32 or so "special" chars has 5 bits.
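
For the "totally random" cases, the per-character figure is just log2 of the
alphabet size.  A minimal Python sketch of that, where bits_per_char is a
hypothetical helper name and the alphabet sizes are the rough ones quoted
above:

```python
import math


def bits_per_char(alphabet_size):
    """Entropy in bits of one character drawn uniformly from the alphabet."""
    return math.log2(alphabet_size)


# Sizes as quoted in the message. 96 is the message's round figure; strictly,
# ASCII has 95 printable characters counting space, which barely changes
# the estimate.
alphabets = [("A-Za-z0-9", 62), ("printable ASCII", 96), ("special chars", 32)]

for label, size in alphabets:
    print(f"{label:16s} {size:3d} values -> {bits_per_char(size):.2f} bits")
```

This gives roughly 5.95, 6.58 and 5.00 bits, which matches the rounded
figures of 6, 6.5 and 5.  Note it only applies when every character is
equally likely; English text is far from that, hence the lower 3.5.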

"Obvious" substitutions (O->0, etc) add about 1 bit.

Now you take a passphrase... let's say it's 21 characters long, and
contains 16 chars of normal English words, 3 places the user has "leet'ed"
a character, and 2 totally random special characters.  So we get...

16*3.5 + 3*4.5 + 2*5 = 79.5 bits or so of entropy.

(Note that's a rough "back of envelope" calculation. Also, that's an *average*
expected value for a passphrase with those characteristics, not a guarantee
that each one has exactly 79.5).
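
The same back-of-envelope estimate as a minimal Python sketch.  The
per-character values are the rough averages quoted above, not measurements,
and BITS and estimate_bits are hypothetical names chosen for illustration.

```python
# Rough per-character estimates quoted in the message (averages, not
# guarantees). A "leet'ed" character is an English character (3.5 bits)
# plus about 1 bit for the obvious substitution.
BITS = {"english": 3.5, "leet": 3.5 + 1.0, "special": 5.0}


def estimate_bits(counts):
    """Sum the estimated entropy over the character classes in a passphrase."""
    return sum(BITS[kind] * n for kind, n in counts.items())


example = {"english": 16, "leet": 3, "special": 2}
print(estimate_bits(example))  # 79.5
```

Changing the counts shows how little the "leet" substitutions buy: swapping
those 3 characters for 3 more random special characters would add only
1.5 bits in this model.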

One interesting corollary is that using an unmodified "famous phrase" has a
much lower entropy than you'd otherwise expect - because if you've gotten as
far as "To be or not to be, that is the", it's *highly* likely the next 5
letters are 'quest', and unlikely to be 'zebra'. And most "smart" password
breaking programs *do* in fact try things in "entropy order" - the first
things they try are short and low-disorder, and they only start trying the
higher-entropy brute force efforts if the dictionary-attack and similar
heuristics fail. So "amount of entropy" *is* directly related to "time to
break".

Now, an interesting cryptographic result is that if you're using 256-bit
crypto, but your key (password, passphrase, etc) only has 40 bits of entropy,
you only have *effectively* 40-bit crypto (because a brute-force of all the
keys that have 40 bits of entropy is highly likely to find the actual key).
That's not good at all in today's environment.  The EFF's DES cracker
basically said that *all* 56-bit crypto is dead (breakable in a day or so
given *current* technology).  80 bits is still strong enough (about 16
million times harder to break than the 56-bit), and anything 90 or above is
probably only going to be broken via other means.
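
The two numeric claims in that paragraph can be checked with a minimal
Python sketch.  It assumes a plain brute-force model in which each extra bit
doubles the work; work_factor and effective_bits are hypothetical helper
names.

```python
def work_factor(bits_a, bits_b):
    """How many times larger a bits_a keyspace is than a bits_b keyspace."""
    return 2 ** (bits_a - bits_b)


def effective_bits(cipher_bits, passphrase_bits):
    """Brute-force strength is capped by the weaker of cipher and passphrase."""
    return min(cipher_bits, passphrase_bits)


print(work_factor(80, 56))      # 16777216, i.e. about 16 million
print(effective_bits(256, 40))  # 40
```

This ignores everything except exhaustive search, so it says nothing about
the "other means" listed next.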

Other means include keystroke loggers, microphones(*), shoulder surfing,
rubber-hose cryptography, and so on.  Note that most of these attacks become
a *lot* harder if you deploy 2 or 3 factor authentication....

(*) Yes, microphones.  Some interesting research from UC Berkeley:
http://www.cs.berkeley.edu/~tygar/papers/Keyboard_Acoustic_Emanations_Revisited/ccs.pdf


Harold Winshel
Computing and Instructional Technologies
Faculty of Arts & Sciences
Rutgers University, Camden Campus
311 N. 5th Street, Room B36 Armitage Hall
Camden NJ 08102
(856) 225-6669 (O)
