Secure password hashing: why we need it

commonpwds

Nobody likes passwords. Especially when you receive your password in clear text after hitting “forgot my password”—evidence that the server stores the password either in clear or encrypted, which is generally a bad idea.

For servers storing passwords in clear verification is just a comparison of two strings, the submitted password and the one stored. This is simple but risky and irresponsible, because an attacker who gains access to the database directly obtains the password of each user. The attacker may then impersonate users on the service attacked or on another service where this user is registered, since most people reuse a same password accross services. Even if passwords are encrypted, an attacker who gained privileged access to the server is likely to have access to the decryption key, unless (say) a HSM is used to compute a keyed hash of the passwords with a protected secret key.

Some servers store a hash of your password. A hash is computed by applying a one-way function transforming a string of arbitrary length to a random-looking string of short fixed length (for example, 20 bytes with SHA-1). The goal of this approach is to prevent an attacker to read your passwords even if she gains access to the database. However, if the attacker knows the hash function she can try many candidate passwords until one matches the hash value observed (typically, using a dictionary of the most commonly used passwords). The degree of protection varies greatly with the hash function used:

  • Cryptographic hash functions such as MD5, SHA-1, or SHA-256: these are very fast (several hundreds of megabytes per second on a desktop CPU), which is undesirable when hashing passwords (because attackers can they try passwords at a higher rate). Furthermore, a given password is always hashed to the same value regardless of the user, which exposes the system to time-memory tradeoffs (much faster than dictionary attacks). SHA-3 is even a worse choice than SHA-1 or SHA-256 due to its excellent performance in hardware.
  • Cryptographic hash functions with a salt: a salt is an auxiliary input to the hash function selected randomly at password reset time. Two users with a same password will then have different salts, and thus different hash values. This prevents time-memory trade-off attacks, because an attacker does not know in advance the salt used. However dictionary attacks remain as fast as with unsalted hash functions.
  • Password-hashing functions, also called password-based key derivation functions: these mitigate dictionary attacks by being significantly slower, and sometimes requiring a significant amount of memory (to migitate the speed-up from GPUs or FPGAs). However, among the handful of constructions proposed (PBKDF2, bcrypt, and scrypt are the most common) none is fully satisfying, and as a result they are rarely used.

Note that in theory, servers don’t need to know a password to verify it, but only some information that allows them to verify that the client knows the password (this information is sometimes call the verifier). For example, the SRP protocol uses public-key cryptography to ensure that

  1. The messages exchanged between the client and the server don’t leak information on the password, and that
  2. A fully compromised server does not reveal the password without a bruteforce search (such as a dictionary search).

This kind of method is known as a zero-knowledge password proof. But practice is not theory, and in most password-authenticated web services the messages exchanged include the password in clear (although the session may be encrypted through HTTPS), and the verifier information is a hash of the password.

Password hashing is not only needed in web services, but everywhere a password is used, be it operating systems’ login, PINs or “passcodes” on mobile phones, key derivation in full-disk encryption schemes, protection of PGP or SSH private keys, etc. But secure password hashing is critical to internet security, because

  1. Passwords (or hashes) are often compromised, typically through database injection on the server (published “dumps” like the 50 million LivingSocial salted SHA-1 hashes, the 32 million RockYou clear passwords, the 6.5 million LinkedIn unsalted SHA-1 hashes, the million of Sony clear passwords are just the tip of the iceberg).
  2. Password-cracking from hashes has become very effective, with optimized methods and tools for general-purpose CPUs or GPUs, as well as academic research and professional services.
  3. Passwords are often reused accross web services, thus one compromised password generally leads to several compromised accounts.

About 15 years after bcrypt, we have a much better understanding of password hashing (see for example this presentation by Solar Designer and the talks from Passwords^12 conference), and a real need for better—simpler, easier to integrate, etc.—password hashing schemes. It’s the perfect time to develop state-of-the-art methods to reduce the harm done by leaks or access to password hashes, which will improve the security of many applications and in particular of internet users.

We thus initiated the Password Hashing Competition (PHC), a project organized by a panel including leading experts from industry, academia, and government on the model of crypto competitions (as AES, eSTREAM, or SHA-3). We published the call for submissions in February 2013 ending in January 2014, and plan to announce a portfolio of schemes in Q2 2015 based on public discussions and evaluation. Associated events are the PasswordsCon conferences, held on July 30-31 in Las Vegas, NV, USA and in December 2013.

The next post in this series will be more technical, and will focus on the challenges of designing and analyzing a password hashing scheme, from low-level software engineering to provable security questions.

3 comments

Leave a Reply