Separate password breaches last week at LinkedIn, eHarmony and Last.fm exposed millions of credentials, and once again raised the question of whether any company can get password security right. To understand more about why companies keep making the same mistakes and what they might do differently to prevent future password debacles, I interviewed Thomas H. Ptacek, a security researcher with Matasano Security.
Ptacek is just one of several extremely smart researchers I’ve been speaking with about this topic. Below are some snippets from a conversation we had last week.
BK: I was just reading an article by Eric Chabrow, which pointed out that LinkedIn — a multi-billion dollar company that holds personal information on some of world’s most important executives — has neither a chief information officer nor a chief information security officer. Is it too much to ask for a company like this to take security seriously enough to do a better job protecting and securing their users’ passwords?
Ptacek: There is no correlation between how much money a company or service has or takes in — or whether it’s free or not free — and how good their password storage practices are. Nobody gets this right. I think it’s a problem of generalist developers writing password storage systems. They may be good developers, but they’re almost never security domain specialists. There are very few good developers who are also security domain specialists. So if you’re a smart and talented developer but not a security domain specialist, and you’ve got to put together a password storage system, even if it’s just MD5, to you that’s a space alien technology. That’s a cryptographic algorithm. You can tell your boss that’s a cryptographic hash. You feel good about the fact that you’re storing passwords in a cryptographic hash. But you have to be a domain expert to know that the term cryptographic hash doesn’t really mean much.
BK: Why doesn’t cryptographic hash mean much? Maybe LinkedIn shouldn’t have been using a plain old SHA-1 cryptographic hash function, but shouldn’t developers be seeking to secure their passwords with a solid cryptographic algorithm?
Ptacek: The basic mechanism by which SHA-1 passwords are cracked, or MD5 or SHA-512 — it doesn’t matter what algorithm you use — hasn’t changed since the early 1990s. As soon as code to implement SHA-1 came out, it was also available to John the Ripper and other password cracking tools. It’s a really common misconception — including among security people — that the problem here is using SHA-1. It would not have mattered at all if they had used SHA-512, they would be no better off at all.
BK: I’ve heard people say, you know this probably would not have happened if LinkedIn and others had salted the passwords — or added some randomness to each of the passwords, thus forcing attackers to expend more resources to crack the password hashes. Do you agree with that?
Ptacek: That’s actually another misconception, the idea that the problem is that the passwords were unsalted. UNIX passwords, and they’ve been salted forever, since the 70s, and they have been cracked forever. The idea of a salt in your password is a 70s solution. Back in the 90s, when people broke into UNIX servers, they would steal the shadow password file and would crack that. Invariably when you lost the server, you lost the passwords on that server.
BK: Okay. So if the weakness isn’t with the strength of the cryptographic algorithm, and not with the lack of salt added to the hashed passwords, what’s the answer?
Ptacek: In LinkedIn’s case, and with many other sites, the problem is they’re using the wrong kind of algorithm. They use a cryptographic hash, when they need to use a password hash.
BK: I’ll bite: What’s the difference?
Ptacek: The difference between a cryptographic hash and a password storage hash is that a cryptographic hash is designed to be very, very fast. And it has to be because it’s designed to be used in things like IP-sec. On a packet-by-packet basis, every time a packet hits an Ethernet card, these are things that have to run fast enough to add no discernible latencies to traffic going through Internet routers and things like that. And so the core design goal for cryptographic hashes is to make them lightning fast.
Well, that’s the opposite of what you want with a password hash. You want a password hash to be very slow. The reason for that is a normal user logs in once or twice a day if that — maybe they mistype their password, and have to log in twice or whatever. But in most cases, there are very few interactions the normal user has with a web site with a password hash. Very little of the overhead in running a Web application comes from your password hashing. But if you think about what an attacker has to do, they have a file full of hashes, and they have to try zillions of password combinations against every one of those hashes. For them, if you make a password hash take longer, that’s murder on them.
So, if you use a modern password hash — even if you are hardware accelerated, even if you designed your own circuits to do password hashing, there are modern, secure password hashes that would take hundreds or thousands of years to test passwords on.
BK: Can you give me an example of the difference between the time and resources it would take to crack, say, a SHA-1 password versus a good password hash?
Ptacek: Let’s say you have a reasonably okay password and you’re using salted, randomized SHA-1 hash. I can, off-the-shelf, build a system that will try many, many tens of thousands of possible passwords per second. I can go into Best Buy or whatever and buy a card off the shelf that will do that, and the software to do it is open source.
If we were instead using something like Bcrypt, which would have just been the difference of using a different [software] library when I built my application, I can set Bcrypt up so that a single attempt — to test whether my password was “password” or “himom” — just that one test could take a hundred milliseconds. So all of a sudden you’re talking about tens of tries per second, instead of tens of thousands.
What you do with a password hash is you design it in the opposite way you would design a standard cryptographic hash. A cryptographic hash wants to do the minimum amount of work possible in order to arrive at a secure result. But a password hash wants to deliberately be designed to do the maximum amount of work.
Can you explain in layman’s terms what it is that makes a password hash like Bcrypt take so much longer to crack?
Ptacek: It’s similar to if you said, let’s take SHA-1 password, but instead of just using SHA-1, let’s run SHA-1 on itself thousands of times. So, we’ll take the output of SHA-1 and feed it back to SHA-1, and we’ll do that thousands and thousands of times, and you’ll only know if your password hash is right, when you look at the result of that 1,000th SHA-1 run. So, in order to make that password hash work, you have to run the algorithm 1,000 times for each guess. That’s roughly the tactic that the modern, secure password hashes take. These are algorithms that are designed so that you can’t arrive at the result without lots and lots of work. I mean, we’re talking about 100 milliseconds [one-tenth of a second] worth of work on modern hardware to get the results of a single password attempt.
BK: So what’s the catch here? Are these password hashing systems any harder or more expensive to deploy?
Ptacek: It’s a knowledge issue. This is easier to deploy that building it yourself. If you build it yourself, it takes a fair amount of effort. Whereas the libraries to secure password hashes, just where you drop them in it’s like one function call. And they’re all free. There are no patents to worry about or anything like that.
The Web application very, very minorly increases its costs. Some imperceptible amount. But cleverly, they’ve done it in a way that drastically explodes the costs for attackers. If a normal password try takes a microsecond [one millionth of a second]– but a bcrypt password trial takes 10 milliseconds [a hundredth of a second] — you might as well give up. It will take years and years.
Let’s say we have a Web application that lost 10 million password hashes, but they’re bcrypted passwords. So, I think in that last dump we saw from the LinkedIn dump, about like three million of those passwords had been cracked. It was really quick and easy to see if your password had been in there. In this case, if we lost 10 million bcrypted passwords, instead of 3 million of them being published and compromised, you might be looking at tens or hundreds of user passwords being compromised. Because you could maybe effectively look for every password that was “password,” but you certainly couldn’t run a dictionary against every password there anymore.
BK: Isn’t it likely that lots of companies that are relying on cryptographic hashes haven’t switched to password hashes because it might be disruptive, expensive or hard to do? Do you agree, or are there other issues getting in the way here?
Ptacek: At a certain point, the cost of migrating that is incredibly expensive. And securing Web applications that are as complex as LinkedIn is an incredibly hard problem. At same time, trying to find talent to make sure they’re building it correctly…there’s just not enough application security talent out there. It’s hard to hire people who know what they’re talking about. There’s so much bulk knowledge that gets substituted for real expertise. You can find a lot of general developers sharing wisdom about how to develop secure applications, but these are by and large just general developers trying to be smart, they’re not people who really know what they’re talking about.
To me, the fundamental problem is passwords. People should use better storage algorithms if they’re going to inflict passwords on their users. They should do a better job of it. But the real answer is things like two-factor authentication with smart phones. Two factor authentication seems like the answer to me. I think ten years from now this is going to be a common approach.