Password security is always a hot topic of conversation. Passwords are often your first line of defense against intruders, attackers, and all sorts of nasties. What is involved in ensuring password security, though? With something so essential, so vital, why don't we have standards about the correct length, strength, and storage to ensure that passwords are kept secure?
The first step is to decide on the strength rules for passwords. What you decide depends largely on the sensitivity of the data in your system, the target user, and a number of other small considerations. If you are building a secure banking system, then allowing 'cat' as a password probably isn't advisable; conversely insisting on three numbers, a mix of upper and lower case letters, as well as at least two other characters may well be overkill for a simple forum about knitting.
As you start to enforce stronger rules, three things will happen.
Personally I believe in setting only length limits, but informing users of the strength of their password. If extra security is required, then set the password for the user, allowing them to change it later. Ultimately I feel that the responsibility for securing an account at this stage should lie with the user.
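As a rough illustration of "enforce only length, but report strength", here is a minimal sketch. The limits, the function names, and the scoring rule are all my own assumptions for the example, not a standard; the strength label is advisory only and never rejects a password.

```php
<?php
// Hypothetical limits; pick values that suit your own system.
const MIN_PASSWORD_LENGTH = 8;
const MAX_PASSWORD_LENGTH = 128;

// The only hard rule: the password must fall within the length limits.
function password_length_ok(string $password): bool
{
    $length = strlen($password);
    return $length >= MIN_PASSWORD_LENGTH && $length <= MAX_PASSWORD_LENGTH;
}

// Advisory label: one point per character class used, plus one for 12+ characters.
function password_strength(string $password): string
{
    $score = 0;
    $score += (int) preg_match('/[a-z]/', $password);
    $score += (int) preg_match('/[A-Z]/', $password);
    $score += (int) preg_match('/[0-9]/', $password);
    $score += (int) preg_match('/[^a-zA-Z0-9]/', $password);
    if (strlen($password) >= 12) {
        $score++;
    }
    if ($score <= 2) {
        return 'weak';
    }
    if ($score === 3) {
        return 'fair';
    }
    return 'strong';
}
```

The form would reject a password only when `password_length_ok()` returns false, and would simply display the result of `password_strength()` next to the field so the user can decide for themselves.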
The next step in securing your passwords is deciding how to store them. The three largely acknowledged methods are to store them as plain text (unmodified, in a database or file), to encrypt them (using a reversible cipher), or to hash them (a one-way, repeatable obfuscation of the password). There are other options (including hard-coding passwords into the system), but even then, the password will likely be secured using one of these methods.
Plain text passwords are, in theory, relatively harmless. If they are stored in a secure location, then only the application should have access to them. Passwords should be stored in a database, which protects them from the casual prying eye, search engine spider, or simple directory listing. This isn't always the case, though. Many applications will store a password not just in plain text, but in a standard file which is world-readable. This means that anyone with access to your system (e.g. system administrators, other programmers, etc.) also has access to your password. Considering that many people use the same password across multiple applications, this could be a rather serious security problem.
Storing plain-text passwords in a database adds a slightly higher level of security, as a malicious user needs to be able to access your database in order to view them. The problem is that your application has access to the database, and if you haven't properly secured your code, it's possible for a malicious user to access them through an SQL injection attack. If this is a web application, then you may still have the problem that system administrators and other programmers will be able to access your passwords.
The advantage of storing passwords as plain text is that they are easily retrieved. Users can send you an email complaining that they forgot their password, and you can whip off to the database, look it up, and inform them with ease that it is 'password1' (they may claim that the '1' makes this secure). If a user is complaining that something "doesn't look right" you can quickly find their password, log into their account, and walk them through the issue. This all sounds brilliant until you realise that other people may have access to your password, too.
This method of storing passwords is, sadly, all too common. It makes life convenient, but far from secure.
Encrypted passwords hold some clear advantages over plain text passwords. If someone manages to access your database through an SQL injection, all they will retrieve is a large pile of what looks like utter gibberish. As encryption is reversible, it is still possible to look up a user's password (although it may require a few extra steps), but the stored values aren't understandable if you simply look at them. While there are some algorithms (Blowfish for example) which are considered secure, they are all reversible; they have to be. In my eyes, this in itself constitutes a security flaw if someone manages to compromise your server, because the key needed to reverse them lives there too.
Even if you fight off an intruder, and change your algorithm, the passwords themselves are still compromised. If you have 100, or 1000, or 10,000 users, getting everyone to change their password would be a logistical nightmare. Many of them would insist on keeping their old password, some might leave for another service, and you can be sure that there will be grave concern from users in the future about the security of your software.
If you are writing software for a single user (and even then), or if you have a legitimate reason as to why you absolutely need to be able to retrieve a password (such as using the password to authenticate with another, third party system), then this method is greatly advantageous over storing plain text passwords, and should be considered. If, however, you are writing software that will be accessible through the web, or even simultaneously by multiple users on a network, then perhaps this isn't the method for you, either.
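To make "reversible" concrete, here is a minimal sketch of storing a password encrypted rather than hashed. It is only an illustration under my own assumptions: I have used AES-256-CBC through PHP's OpenSSL extension rather than Blowfish, the function names are hypothetical, and the 32-byte key is assumed to be kept somewhere other than the database.

```php
<?php
// Encrypt a password for storage. $key must be 32 random bytes (assumed to be
// loaded from configuration, not from the database that holds the result).
function encrypt_password(string $plain, string $key): string
{
    $iv = random_bytes(16); // AES block size; a fresh IV for every password
    $cipher = openssl_encrypt($plain, 'aes-256-cbc', $key, OPENSSL_RAW_DATA, $iv);
    if ($cipher === false) {
        throw new RuntimeException('Encryption failed');
    }
    return base64_encode($iv . $cipher); // the IV is stored alongside the ciphertext
}

// Reverse the process: anyone holding $key can recover the original password.
function decrypt_password(string $stored, string $key): string
{
    $raw = base64_decode($stored, true);
    if ($raw === false || strlen($raw) <= 16) {
        throw new RuntimeException('Stored value is malformed');
    }
    $plain = openssl_decrypt(substr($raw, 16), 'aes-256-cbc', $key, OPENSSL_RAW_DATA, substr($raw, 0, 16));
    if ($plain === false) {
        throw new RuntimeException('Decryption failed');
    }
    return $plain;
}
```

The second function is the whole point: it is what lets you hand the password to a third-party system, and it is also what an intruder gets to call if they obtain both the database and the key.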
Hashing passwords can be highly secure, and is almost always my method of choice. A hash is a one-way, repeatable algorithm. This means that you can't simply run the algorithm backwards on a hashed password to find out what it actually is. Like encrypted passwords, hashed passwords cannot be read and understood by people. Unlike encrypted passwords, they're not reversible, so even if your server is compromised, there is no key an intruder can use to decrypt your stored passwords. As hashing is repeatable, you can use the same algorithm to prepare user input, and can then compare the output with the stored password.
This also helps to reduce the likelihood of an SQL injection occurring through the password field, because the user input is transformed into a fixed-format hexadecimal string before it goes anywhere near a query (although this is no substitute for escaping or parameterising the rest of your input). A common algorithm is MD5, which produces a nice, 32-digit hexadecimal number. In PHP particularly, this really is as easy as:
<?php $hash = md5($_POST['password']); ?>
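The "hash the input, then compare with the stored value" step might look like the following. This is a minimal sketch: the function names are my own, the stored hash would normally come from your database, and `hash_equals()` is used for the comparison simply because it compares in constant time.

```php
<?php
// Hash a password for storage, using MD5 as in the example above.
function make_password_hash(string $password): string
{
    return md5($password);
}

// Hash the login input the same way and compare it with the stored hash.
function password_matches(string $input, string $storedHash): bool
{
    return hash_equals($storedHash, make_password_hash($input));
}

$stored = make_password_hash('password'); // saved when the account was created
var_dump(password_matches('password', $stored)); // bool(true)
var_dump(password_matches('Password', $stored)); // bool(false)
```

Note that the original password is never stored or needed again; only the two hashes are ever compared.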
So what are the problems with hashing passwords? For one, you lose the ability to retrieve passwords for users. Personally, I don't see this as a flaw. I feel that users should have to reset a lost password. This may be preferable anyway, as they obviously forgot their current password at least once. This is also a better security practice, as the sole responsibility for remembering a password now lies on the user (as it should. It is, after all, their password.)
The other problem is that hashes, such as MD5, are usually finite. This means that technically, it is possible to find a translatable reverse value for every hash. Fortunately for MD5, the number of possible hashes is 2^128, or 340,282,366,920,938,463,463,374,607,431,768,211,456. Some valiant efforts have been made to create a reliable rainbow table, but the numbers are so large that both storage and computation start to become a rather expensive exercise. Even so, the hashes of common passwords are easy to look up, as shown below. Luckily, this is easily circumvented through either a double-hash, or by using a salt.
Using a salt greatly reduces the likelihood that a rainbow table of hash values would find a match, as the match has to include the salt, also. For example:
<?php
// a simple password
$pass = "password"; // from the user
$hash = md5($pass);
echo $hash; // 5f4dcc3b5aa765d61d8327deb882cf99
?>
Now run this through an MD5 reverser: http://md5.gromweb.com/?md5=5f4dcc3b5aa765d61d8327deb882cf99 and you get 'password' back.
However, when you use a salt, you append another string to the end of the user input, which makes something just as repeatable, but far more secure.
<?php
$pass = "password";
$salt = "pretzel"; // this would be set in your code
// concatenate the salt onto the password. This makes it 'passwordpretzel'
$hash = md5($pass.$salt);
echo $hash; // ddf551bda86d5cbf69f4e5b1e55983f0 - vastly different
?>
Now run this through an MD5 reverser: http://md5.gromweb.com/?md5=ddf551bda86d5cbf69f4e5b1e55983f0 and it can't find anything!
Even if a user's password, and your salt come up with a reversible word, the fact that you have salted the input means that the discovered 'password' can't be used.
<?php
$pass = "pass"; // this is the correct password, as input from the user
$salt = "word"; // this would be set in your code
// concatenate the salt onto the password. This makes it 'password'
$hash = md5($pass.$salt);
echo $hash; // 5f4dcc3b5aa765d61d8327deb882cf99 - this can be reversed, but never fear:

$pass = "password"; // a malicious user uses 'password' which they got from reversing the hash
$salt = "word";
// concatenate the salt onto the password. This makes it 'passwordword'
$hash = md5($pass.$salt);
echo $hash; // 4594e41c9841aef79e064bcd75c9b7ee - this is different, so it wouldn't authenticate.
?>
This is a simple example, and in reality you should probably not use a word for your salt, but the point is clear. Adding a salt to the password increases the security. In some instances, I have used another piece of user input (the username) as the salt, which means that it is different for each user, and even harder to guess with a brute-force attack. Perhaps the best method in this situation is to use a mixture of both a static and a dynamic (but consistently repeatable) salt. The methods at this point are merely limited by your own imagination.
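A mixture of a static and a dynamic salt could be sketched like this. The constant, its value, and the function name are hypothetical; in practice the static salt would be a long random string kept in your configuration. I have lowercased the username on the assumption that usernames are case-insensitive, so that the dynamic salt is the same at every login.

```php
<?php
// Hypothetical static salt, shared by every account.
const STATIC_SALT = 'x9$Lq2!vR7#pT4';

// Combine the password with the static salt and a per-user dynamic salt.
function salted_hash(string $username, string $password): string
{
    $dynamicSalt = strtolower($username); // repeatable, but different for each user
    return md5($password . STATIC_SALT . $dynamicSalt);
}

// Two users with the same password end up with different stored hashes.
var_dump(salted_hash('alice', 'password') === salted_hash('bob', 'password')); // bool(false)
```

One consequence of using the username as a salt is that the hash has to be regenerated whenever the username changes, which in turn means asking the user for their password at that point.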
The other way to circumvent the use of rainbow tables is to double-hash the input.
<?php
// a simple password
$pass = "password"; // from the user
$hash = md5($pass);
echo $hash; // 5f4dcc3b5aa765d61d8327deb882cf99
$hash2 = md5($hash);
echo $hash2; // 696d29e0940a4957748fe3fc9efd22a3
?>
This helps the issue by greatly increasing the effort required to crack the original password. Due to the nature of MD5, multiple input values could potentially share a single hash. This means that even if you find something which matches the second (final) hash, it may not be what you need to reverse to find the first (original) hash. The likelihood that two different possible real-world passwords containing simple text will hash to the same value is also so infinitesimally tiny as to be virtually non-existent.
A combination of double-hashing with both dynamic and static salts should make your stored passwords far harder to crack.
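Putting the pieces together, one possible arrangement is to apply the static salt on the first pass and the dynamic salt on the second. This is only a sketch of one ordering among many; the constant and the function names are my own inventions for the example.

```php
<?php
// Hypothetical site-wide salt; keep the real one in configuration.
const SITE_SALT = 'k3#Vw8!zQ1@mN6';

// Double-hash the password: static salt on the first pass, username on the second.
function double_salted_hash(string $username, string $password): string
{
    $inner = md5($password . SITE_SALT);
    return md5($inner . strtolower($username));
}

// Repeat the same steps on the login input and compare with the stored hash.
function login_matches(string $username, string $input, string $storedHash): bool
{
    return hash_equals($storedHash, double_salted_hash($username, $input));
}
```

Whatever ordering you choose, it has to be applied identically when the password is set and every time it is checked, otherwise nobody will be able to log in.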
An important part of system security is to ensure that only the owner of a password ever knows what it is. While you can't ensure that a user will never tell someone else their password, you can reduce the chance that someone else will intercept or read it. If you set the password for a user, you should inform them of the password in the application only once (when it is generated), and tell them to take note of it. You should never email a password to a user, as you are then relying on the mail application to keep the password safe. If they use a mail client such as Outlook or Thunderbird, then anyone who has access to their computer will be able to access their password. Even web-mail clients could be left logged in, or get broken into, or email may even be read on the server by a system administrator. In short, email is not a safe place to store or send a password.
The only place a password should exist is with the user. Your software should store the password hashed or, only if absolutely necessary, encrypted. Resetting a forgotten password can easily be done by emailing the user a token, which can be used only once, allowing them to change their password through the system. This way, if someone accesses their email, their password is not visible, and any token emails they find have already been used. It is also wise to ensure that tokens only have a limited lifetime (from a half hour, to twenty-four hours at most) so that an intruder can't simply use an old token that has been generated and forgotten.
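A single-use, time-limited reset token might be handled along these lines. Everything here is an assumption made for illustration: the function names, the shape of the stored record, and the choice to store only a SHA-256 hash of the token so that the database never holds the emailed value itself. The half-hour default comes from the range suggested above.

```php
<?php
// Create a reset token. 'token' is emailed to the user; 'record' is what you store.
function create_reset_token(int $now, int $lifetimeSeconds = 1800): array
{
    $token = bin2hex(random_bytes(32)); // 64 hex characters, hard to guess
    return [
        'token'  => $token,
        'record' => [
            'token_hash' => hash('sha256', $token),
            'expires_at' => $now + $lifetimeSeconds,
            'used'       => false,
        ],
    ];
}

// Accept a token only if it matches, has not expired, and has not been used before.
function redeem_reset_token(string $token, array &$record, int $now): bool
{
    if ($record['used'] || $now > $record['expires_at']) {
        return false;
    }
    if (!hash_equals($record['token_hash'], hash('sha256', $token))) {
        return false;
    }
    $record['used'] = true; // a second attempt with the same token will now fail
    return true;
}
```

In a real system the record would live in a database row tied to the user, and a successful call to `redeem_reset_token()` would be followed by the form that lets them choose a new password.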
Preventing brute-force dictionary attacks on your system is the next step. There are many ways to do this, and most of them rely on making such an attack infeasible. Some of the common methods include limiting the number of failed authentication attempts before either suspending the user, or attempting to block the source of the attack. Others rely on making such an attack too slow to be useful (a forced wait of 2 seconds when authenticating might not be noticed by a user, but when you are trying to test several thousand passwords, 2 seconds per attempt adds up very quickly.) Some banks have even implemented dynamic UI-based keyboards which change every view in an attempt to thwart both brute-force dictionary attacks, and key-logging software. Ultimately the solution to use should be based on the requirements and user-base of your software.
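Limiting failed attempts could be sketched as below. The thresholds, the function names, and the per-account state array are hypothetical; a real application would keep this state in its database or cache, keyed by user or by source address.

```php
<?php
// Hypothetical thresholds: lock for 15 minutes after 5 consecutive failures.
const MAX_FAILED_ATTEMPTS = 5;
const LOCKOUT_SECONDS = 900;

// $state is the per-account record: ['failures' => int, 'locked_until' => int].
function login_allowed(array $state, int $now): bool
{
    return $now >= $state['locked_until'];
}

// Return the updated state after a login attempt has been checked.
function record_login_result(array $state, bool $success, int $now): array
{
    if ($success) {
        return ['failures' => 0, 'locked_until' => 0];
    }
    $state['failures']++;
    if ($state['failures'] >= MAX_FAILED_ATTEMPTS) {
        $state['failures'] = 0;
        $state['locked_until'] = $now + LOCKOUT_SECONDS;
    }
    return $state;
}
```

The forced wait mentioned above is even simpler: a call to `sleep(2)` before responding to each authentication attempt would do it, at the cost of tying up a server process for those two seconds.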
The 'password strength' debate also comes back to play here. The stronger a user's password, the more difficult it would be for a brute-force attack to crack it.
Ultimately, even with your best efforts, users may well pick an obvious password, or will get phished, or hand their password out to a friend/family/spouse. At some point, security of your software will lie in the people who use it. For this reason, I suggest these guidelines, which I myself use in my own code wherever I can.