Password Security Part 1: By The Numbers
I don’t claim to be a security expert, and especially not a cryptography expert, but I’m pretty confident in my password security. Why? Because I’m relying on the work of actual experts instead of implementing it myself. I cannot stress how important this is as a general principle when it comes to complex topics like information security and cryptography.
When talking about passwords best practices have changed a lot over just a short amount of time. We very quickly went from plain text, to MD5, to SHA-1, then maybe SHA-1 with a salt. But if you’re still doing any of that, you are putting your users at risk.
The current best practice is bcrypt. An algorithm based on the blowfish cipher, specifically designed for password hashing. Other good options include scrypt, and PBKDF2. However, bcrypt is considered superior for passwords.
In this three-part series we will take a close look at bcrypt, comparing it to other common solutions, how we can implement it, and the role it plays in keeping your users safe.
Attacking Passwords
There are two main attack vectors we need to mitigate against:
- Dictionary attacks
- Brute Force attacks
The way they work is very simple: use a pre-generated list of passwords hashes and do a simple comparison to find a string that creates the hash you want. For non-salted passwords, there are huge pre-generated lists of passwords you can download. All it takes then is a simple lookup. This list is known as a rainbow table.
If you are using a salt, but not using a unique one for each password, then all an attacker has to do is generate a rainbow table manually, using your salt for every combination, and then do the lookups. However, if you use a unique salt for every password, then they need to generate a list for every combination, for every individual password. This is where the other type of attack, brute forcing, comes into play..
Brute force attacks work by trying to guess the password, over and over. Unlike slow hashes (like bcrypt, scrypt and pbkdf2), MD5 (and it’s siblings, MD2-6) or SHA-1 (or SHA-256, SHA-512, etc.), password hashes are meant to be message digest algorithms. This means they are meant to quickly verify that a given message has not been tampered with.
A common example is verifying downloaded files. You download the file, run MD5 on the file and compare the hash to the one supplied by the original site.
The keyword here, however, is quickly. Quick is the enemy of security when it comes to brute force. These algorithms are not designed for passwords. While we typically want to make our applications run as quickly as possible, this is the one case where things can (and should) be made intentionally slower.
When using bcrypt, we can supply a cost to ensure that it takes longer to compute the hash. This is known as key stretching. By doing so, we can make brute forcing too expensive (in terms of time or computational resources) for an attacker.
Hashing Speed Compared
In his 2012 talk, Jeremi M. Gosney uses commodity hardware to compare the performance of various hashing algorithms. The setup uses just five servers, with 18 consumer graphics cards:
- 10x HD 7970
- 4x HD 5970 (dual GPU)
- 3x HD 6990 (dual GPU)
- 1x HD 5870
His results are as follows:
Algorithm | Hashes/Second | MD5 | SHA-1 | SHA-512 | Bcrypt |
---|---|---|---|---|---|
MD5 | 180 Billion/second | 65% Faster | 99.9997% Faster | 99.9999996% Faster | |
SHA-1 | 63 Billion/second | 185% Slower | 99.9994% Faster | 99.999887% Faster | |
SHA-512 | 364,000/second | 49.5M% Slower | 17.3M% Slower | 80.49% Faster | |
Bcrypt | 71,000/second | 253.5M% Slower | 88.7M% Slower | 412% Slower |
Given the advances in hardware, we should expect to see a huge drop in cost using the same hardware today, or approximately a 2-4X increase in performance using the equivalent of today’s hardware.
These numbers are so astounding that I feel it bears repeating: if you are still using unsalted (or non-uniquely) salted MD5 for your passwords, an attacker with this setup can make 180,000,000,000 guesses per second. Even the best SHA algorithm (SHA-512), while being 49,000,000 times slower than MD5, allows 364,000 guesses per second.
Bcrypt on the other hand, with a cost factor of five, only allows 71,000 guesses per second. This is 253,500,000% slower than MD5, 88,700,000% slower than SHA-1, and even 412% slower than SHA-512 (on consumer hardware).
It’s important to note that we’re achieving this by utlizing a built-in secure mechanism of the algorithm. We’re not rolling our own solution.
Conclusion
I have seen many attempts at password security that simply do things like performing five iterations of MD5 in an attempt to slow down brute force attempts. That still only brings the number of attempts down to 36,000,000,000—only twice as slow as a single SHA-1, and still 99.99898% faster than SHA-512—compared to 443,000,000% slower with bcrypt.
Bcrypt is a great solution for hashing passwords, as it is effective in combatting both brute force and dictionary attacks.
In the next installment we will look at how to actually use bcrypt in your PHP and Ruby projects.
P.S. What is your current password hashing solution? What language are you implementing it in? Why are you not using bcrypt?
Share your thoughts with @engineyard on Twitter
OR
Talk about it on reddit