Educause Security Discussion mailing list archives

Re: Password entropy


From: Harold Winshel <winshel () CAMDEN RUTGERS EDU>
Date: Mon, 24 Jul 2006 07:02:02 -0400

Is there a relatively easy way to calculate the entropy of a passphrase?

I am not familiar with this and am not sure what "3 bits of entropy
per character" means.

About all I understand right now is that a higher numeric value of
entropy is better than a lower value (i.e., 4 bits of entropy is
better than 3 bits of entropy), but I don't know what that means or
how you do a calculation.

Thanks.

At 02:34 AM 7/24/2006, Valdis Kletnieks wrote:
(Replying to several related notes)

On Sun, 23 Jul 2006 18:45:12 CDT, Roger Safian said:

> Does only English suffer from this problem, and would it make
> a stronger passphrase to use one non-English word in your phrase?

Human languages in general suffer from this problem to one extent or
another. This is based on the fact that not all sequences of phonemes are
easily pronounceable.  'xxxggzp' is going to be hard for your tongue no
matter what language is being used.  Getting an accurate value for the
average entropy of any given language requires a lot of statistical work
compiling letter, letter-sequence, and word-sequence frequencies over
large amounts of text in the language.  However, for the purposes of this
discussion,
throwing in a gratuitous Spanish 'hacienda' or a Japanese 'Domo arigato'
is *not* going to make a large difference - you're still restricted by
things like vowel/consonant mix.  Using a random 'xap4*' wins you more,
but is less rememberable.
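
As a rough illustration of the "statistical work" mentioned above, here is a
minimal Python sketch that computes the Shannon entropy of single-letter
frequencies in a sample. The function name and the sample string are
made up for illustration. It only looks at individual letters, not letter
or word sequences, and a one-line sample is far too small, so treat the
number it prints as an overestimate of the real per-character entropy.

```python
import math
from collections import Counter

def unigram_entropy_bits(text: str) -> float:
    """Shannon entropy, in bits per letter, of single-letter frequencies.

    Ignores letter-sequence and word-sequence effects, so it overstates
    the true entropy of natural-language text.
    """
    letters = [c for c in text.lower() if c.isalpha()]
    total = len(letters)
    if total == 0:
        return 0.0
    counts = Counter(letters)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Illustrative sample only; a real estimate needs a large corpus.
sample = "throwing in a gratuitous spanish hacienda is not going to make a large difference"
print(round(unigram_entropy_bits(sample), 2))
```

A serious estimate would repeat the same counting over letter pairs,
triples, and words across a large body of text.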

What makes a *bigger* difference (but is almost never technically
feasible) is if you entered 'Domo arigato' as the appropriate kanji
and/or hiragana characters.  Random kanji would win even more, but
for most of us gaijin, that is even more mental stress than 'xap4*'.

On Sun, 23 Jul 2006 22:07:19 EDT, Paul Russell said:
> Much of this discussion seems to have focused on the lack of entropy in
> English-language words and phrases. Both suffer from the predictability of
> letter sequences. Does entropy increase if the 'word' consists of the first
> (or last) letters of a phrase? Does it increase further if non-alphabetic
> characters are substituted for letters?

Using just first/last letters boosts the entropy per character
*somewhat*, but you end up fighting a losing battle.  Let's
take the paragraph of yours I quoted. 57 words, 337 characters.  At 3
bits of entropy per character, you have just about 1K bits of entropy.
Taking just the first characters is probably going to get you (rough
guess) 4 or maybe even 4.5 bits per character - but now you're only
getting it for 57 characters - and even at a generous 5 bits each,
57*5 = 285 is a lot less than 337*3 = 1011.
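
The comparison above is just length times bits per character. A small
Python sketch of that arithmetic follows; the per-character figures are the
rough guesses from the text, not measured values, and the function name is
only illustrative.

```python
def total_bits(chars: int, bits_per_char: float) -> float:
    """Estimated total entropy: character count times bits per character."""
    return chars * bits_per_char

full_phrase = total_bits(337, 3)       # whole paragraph typed out
initials_guess = total_bits(57, 4.5)   # first letters only, rough guess
initials_upper = total_bits(57, 5)     # first letters only, generous bound

print(full_phrase)     # 1011
print(initials_guess)  # 256.5
print(initials_upper)  # 285
```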

5ubst1tuting 0ther characters (1ike th1s) helps somewhat, but not as
much as you'd think.  Since most of the common substitutions are fairly
easily predictable, they don't add all THAT much.  Beating up on the
character sequence 'tion' from a previous mail, let's throw an i->1 and
o->0 into the mix.  You only need 2-3 bits more to represent it
(basically, for each one, add a "leeted" bit).  Now, if you have the
mental agility to use an *odd* substitution, such as e->(, you will add
to th( (ntropy consid(rably mor(, but it g(ts a lot hard(r to typ(.
(Actually, to *r0ally* add to th1 2ntropy, you n34d to us5 a diff6r7nt
odd charact8r 9ach tim+).
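
To make the "leeted bit" idea concrete, here is a Python sketch that
assumes the attacker already knows the substitution table. Each letter that
can be swapped then contributes at most one yes/no choice, i.e. one bit.
The table and function names are hypothetical, chosen only for this
example.

```python
from itertools import product

# Hypothetical table holding just the two predictable swaps discussed above.
LEET = {"i": "1", "o": "0"}

def extra_leet_bits(word: str, table=LEET) -> int:
    """Upper bound on added bits: one binary choice per swappable letter."""
    return sum(1 for c in word if c in table)

def leet_variants(word: str, table=LEET) -> list:
    """Every spelling an attacker who knows the table would have to try."""
    options = [(c, table[c]) if c in table else (c,) for c in word]
    return ["".join(p) for p in product(*options)]

print(extra_leet_bits("tion"))  # 2
print(leet_variants("tion"))    # ['tion', 'ti0n', 't1on', 't10n']
```

Two swappable letters give 2**2 = 4 candidates, which is why predictable
substitutions buy so little. If you always apply the swap, the real gain is
even smaller than this bound.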

On Sun, 23 Jul 2006 22:42:27 EDT, James H Moore said:
> I am reading this, because I want to know what to advise users too.  One
> thing as far as predictability, I have sometimes used a "first letter"
> formula for everything from quotes, to books/movies/music.  EG "MoonPie:
> Biography of an Out-of-This-World Snack" by David Magee (a recent book
> with a great deal of relevance) would become 1DM<>MP:boaootws .

What helps this particular passphrase the most is the inclusion of the '<>:'
characters.  Other than that, it's 16 characters. Admittedly, the 16 are
much more random than they'd be if it was just 2-3 words totaling 16
characters, but still less *total* entropy than if you had typed the
phrase out.

The *quickest* way to drive up entropy is to start *inserting* totally
gratuitous characters - midd<ling has a lot more entropy than middling, for
instance.  However, keep in mind that insertion of a random character from the
96 printables gets you 6.5 bits of entropy, while restricting it to the
'special' characters only gets you about 5.  So for maximum benefit, insert
ajdded letters and numbers too. ;)
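
The per-character figures above come from log2 of the number of equally
likely choices. A short Python sketch, assuming the inserted character is
picked uniformly at random; the size of 32 for the 'special' set is an
assumption made here to match "about 5", and the function name is
illustrative.

```python
import math

def bits_per_random_char(alphabet_size: int) -> float:
    """Entropy in bits of one character chosen uniformly from the alphabet."""
    return math.log2(alphabet_size)

print(round(bits_per_random_char(96), 2))  # 6.58  (the 96 printables)
print(round(bits_per_random_char(32), 2))  # 5.0   (assumed 32 'special' characters)
```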

And to combine a few things - *if* your system is able to technically handle
it, sticking the Mandarin glyph for 'cat' or several random kanji or Thai
characters into the middle is a "home run" as far as adding entropy.  However,
you still need to remember what control-alt-cokebottle key sequence
you need to use to enter it...




Harold Winshel
Computing and Instructional Technologies
Faculty of Arts & Sciences
Rutgers University, Camden Campus
311 N. 5th Street, Room B36 Armitage Hall
Camden NJ 08102
(856) 225-6669 (O)
