Salting and stretching a password

This post will look at a progression of ways to store passwords, from naive to sophisticated.

Most naive: clear text

Storing passwords in plain text is least secure thing a server could do. If this list is leaked, someone knows all the passwords with no effort.

Better: hash values

A better approach would be to run passwords through a secure hash function before storing them. The server would never store a password per se, only its hashed value. When someone enters a password, the server would check whether the hash of the password matches the stored hashed value. If the list of hashed values is leaked, it will take some effort to recover the original passwords, though maybe not much effort, as I wrote about in my previous post.

Better still: add salt

The next step in sophistication is to append a random value, called a salt, to the password before applying the hash function. The server would store each user’s salt value and the hash of the password with the salt added.

Now if the user has a naive password like “qwerty” you couldn’t crack the password by doing a simple Google search. You can find the hash value of “qwerty” by searching, but not the hash value of qwerty + random salt, because although the former is common the latter is probably unique. You could still crack the password if the hash is insecure, but it would take a little effort.

Better still: key stretching

Suppose an attacker has a list of salt values and corresponding hash values for salt + password. They could guess passwords, hashing each with a salt value, to see if any hash values match. They would start with the most common passwords and probably get some matches.

Key stretching is a way to make this brute force search more time consuming by requiring repeated hashing. In the following stretching algorithm, p is the password, s is the salt, h is the hash function, and || means string concatenation.

$\begin{align*} x_0 &= \phi \\ x_i &= h(x_{i-1} || p || s) \mbox{ for i = 1, \ldots, }r \\ K &= x_r \end{align*}$

Now the time required to test each password has been multiplied by r. The idea is to pick a value of r that is affordable for legitimate use but expensive for attacks.

10 thoughts on “Salting and stretching a password”

Diego

25 January 2019 at 22:57

Next step in sophistication is to avoid sending the plaintext password to the server altogether, by using a protocol like OPAQUE. See https://blog.cryptographyengineering.com/2018/10/19/lets-talk-about-pake/

Joe Taber

26 January 2019 at 01:18

This key stretching algorithm has an unfortunate runtime complexity of O(|p| * r) which forces websites to artificially limit password length to avoid ddos / very slow hashing. There’s no other real reason to limit password length except network bandwidth, and even that could easily allow passwords of 10k+ characters without issue.

So I propose a new algorithm:
x[0] = h(theta||p||s), x[n] = h(x[n-1]||s) for i from 1 to r.

I.e. the password is only read once to initialize the hash, and then it’s iterated without the password again. This changes the complexity to O(|p| + r), and makes the iteration time constant for any length password (and thus easier to tune prececely). This is much more reasonable from a performance perspective and should allow for much larger passwords.

As far as I can tell this should have the same security as the “normal” algorithm but with a significant algorithmic complexity improvement. Is there something that I’m not seeing? Perhaps a security trade-off? (If there’s a worry about “losing” entropy from the password over repeated iterations, you could add one more round with p: K = h(x[r]||p||s), which doesn’t change the better complexity).

Dave Pawson

26 January 2019 at 02:42

But for me, a password is seldom enough? With 100+ logins, I often need 5 or 6 pieces of information? I store them (plain text) as
key:{key:value [,key:value]*}
e.g. url: , login: , mmn: etc.

Could this be used for a more practical approach? I’d be interested.

John

26 January 2019 at 08:09

@Dave: An individual could do things that wouldn’t be practical for a server. You might consider a password manager or a YubiKey.

@Joe: I don’t think password length matters because password lengths will be less than hash lengths. Hardly anybody would use a 256-bit password, for example.

Steve

26 January 2019 at 09:08

Is this true?
“Now the time required to test each password has been multiplied by r. ”
As an attacker, I don’t know how many iterations you’ve used (r). Therefore I need to pick a test password and either iterate until I get a match (which may never happen if that password isn’t in the list,) or give up after some predetermined number of iterations. If that number is be less than r, I miss when I should have hit.
Also, I could make this harder by using a list of salts s_i; i = 1,…n and making each iteration
x_i+1 = h(x_i || p || s_i) i = 1,…,r
Now as an attacker, I need to know all the salts and the correct order in which to apply them, which multiplies the attack cost by s! at no additional encryption cost.

Steve

26 January 2019 at 09:10

* said s!, meant n!

John

26 January 2019 at 09:25

I’ve implicitly assumed that the attacker knows r. In practice they may or may not. Kerckhoff’s principle would say that it’s safest to assume any fixed parameters such as r are known.

Joe Taber

27 January 2019 at 00:05

@John: If you use a password generator you can get lots of entropy in 32 bytes, but most people should use passphrases — which are much better than passwords for real people — (and much less entropy dense) then suddenly 32 bytes start to seem very restrictive. 32 bytes is just a couple words and an ok-ish passphrase. I hate running up against these kinds of arbitrary password length limits, and I don’t like the idea of restricting the way a user of my service is allowed to secure themselves — who am I to say your scheme of typing a unique little story as your password is “wrong”? I would prefer the limit to be based on something *real*, like bandwidth or memory limits, UX reasons, or security reasons, not “I chose a bad algorithm so I have to punish my users with the consequences.”

Glenn Vanderburg

28 January 2019 at 08:39

From this post I deduce that you type on Dvorak or some other alternative keyboard layout, for how else does one misspell “qwerty”?

John

28 January 2019 at 08:46

Dvorak is so mainstream. I use my own artisanal keyboard layout. :)

Comments are closed.