Archive by Author
Top 15+ Best Practices for Writing Super Readable Code
Posted on 30. Mar, 2011 by Burak Guzel.
Twice a month, we revisit some of our readers’ favorite posts from throughout the history of Nettuts+.
Code readability is a universal subject in the world of computer programming. It’s one of the first things we learn as developers. This arti…
Continue Reading
Understanding Hash Functions and Keeping Passwords Safe
Posted on 18. Jan, 2011 by Burak Guzel.
Sometimes servers and databases are stolen or compromised. With this in mind, it is important to make sure that some crucial user data, such as passwords, can not be recovered. Today, we are going to learn the basics behind hashing and what it takes to protect passwords in your web application.
1. Disclaimer
Cryptology is a sufficiently complicated subject, and I am by no means an expert. There is constant research happening in this area, in many universities and security agencies.
In this article, I will try to keep things as simple as possible, while presenting to you a reasonably secure method of storing passwords in a web application.
2. What Does “Hashing” Do?
Hashing converts a piece of data (either small or large), into a relatively short piece of data such as a string or an integer.
This is accomplished by using a one-way hash function. “One-way” means that it is very difficult (or practically impossible) to reverse it.
A common example of a hash function is md5(), which is quite popular in many different languages and systems.
$data = "Hello World"; $hash = md5($data); echo $hash; // b10a8db164e0754105b7a99be72e3fe5
With md5(), the result will always be a 32 character long string. But, it contains only hexadecimal characters; technically it can also be represented as a 128-bit (16 byte) integer. You may md5() much longer strings and data, and you will still end up with a hash of this length. This fact alone might give you a hint as to why this is considered a “one-way” function.
3. Using a Hash Function for Storing Passwords
The usual process during a user registration:
- User fills out registration form, including the password field.
- The web script stores all of the information into a database.
- However, the password is run through a hash function, before being stored.
- The original version of the password has not been stored anywhere, so it is technically discarded.
And the login process:
- User enters username (or e-mail) and password.
- The script runs the password through the same hashing function.
- The script finds the user record from the database, and reads the stored hashed password.
- Both of these values are compared, and the access is granted if they match.
Once we decide on a decent method for hashing the password, we are going to implement this process later in this article.
Note that the original password has never been stored anywhere. If the database is stolen, the user logins can not be compromised, right? Well, the answer is “it depends.” Let’s look at some potential problems.
4. Problem #1: Hash Collision
A hash “collision” occurs when two different data inputs generate the same resulting hash. The likelihood of this happening depends on which function you use.
How can this be exploited?
As an example, I have seen some older scripts which used crc32() to hash passwords. This function generates a 32-bit integer as the result. This means there are only 2^32 (i.e. 4,294,967,296) possible outcomes.
Let’s hash a password:
echo crc32('supersecretpassword');
// outputs: 323322056
Now, let’s assume the role of a person who has stolen a database, and has the hash value. We may not be able to convert 323322056 into ‘supersecretpassword’, however, we can figure out another password that will convert to the same hash value, with a simple script:
set_time_limit(0);
$i = 0;
while (true) {
if (crc32(base64_encode($i)) == 323322056) {
echo base64_encode($i);
exit;
}
$i++;
}
This may run for a while, though, eventually, it should return a string. We can use this returned string — instead of ‘supersecretpassword’ — and it will allow us to successfully login into that person’s account.
For example, after running this exact script for a few moments on my computer, I was given ‘MTIxMjY5MTAwNg==‘. Let’s test it out:
echo crc32('supersecretpassword');
// outputs: 323322056
echo crc32('MTIxMjY5MTAwNg==');
// outputs: 323322056
How can this be prevented?
Nowadays, a powerful home PC can be used to run a hash function almost a billion times per second. So we need a hash function that has a very big range.
For example, md5() might be suitable, as it generates 128-bit hashes. This translates into 340,282,366,920,938,463,463,374,607,431,768,211,456 possible outcomes. It is impossible to run through so many iterations to find collisions. However some people have still found ways to do this (see here).
Sha1
Sha1() is a better alternative, and it generates an even longer 160-bit hash value.
5. Problem #2: Rainbow Tables
Even if we fix the collision issue, we’re still not safe yet.
A rainbow table is built by calculating the hash values of commonly used words and their combinations.
These tables can have as many as millions or even billions of rows.
For example, you can go through a dictionary, and generate hash values for every word. You can also start combining words together, and generate hashes for those too. That is not all; you can even start adding digits before/after/between words, and store them in the table as well.
Considering how cheap storage is nowadays, gigantic Rainbow Tables can be produced and used.
How can this be exploited?
Let’s imagine that a large database is stolen, along with 10 million password hashes. It is fairly easy to search the rainbow table for each of them. Not all of them will be found, certainly, but, nonetheless…some of them will!
How can this be prevented?
We can try adding a “salt”. Here is an example:
$password = "easypassword"; // this may be found in a rainbow table // because the password contains 2 common words echo sha1($password); // 6c94d3b42518febd4ad747801d50a8972022f956 // use bunch of random characters, and it can be longer than this $salt = "f#@V)Hu^%Hgfds"; // this will NOT be found in any pre-built rainbow table echo sha1($salt . $password); // cd56a16759623378628c0d9336af69b74d9d71a5
What we basically do is concatenate the “salt” string with the passwords before hashing them. The resulting string obviously will not be on any pre-built rainbow table. But, we’re still not safe just yet!
6. Problem #3: Rainbow Tables (again)
Remember that a Rainbow Table may be created from scratch, after the database has been stolen.
How can this be exploited?
Even if a salt was used, this may have been stolen along with the database. All they have to do is generate a new Rainbow Table from scratch, but this time they concatenate the salt to every word that they are putting in the table.
For example, in a generic Rainbow Table, “easypassword” may exist. But in this new Rainbow Table, they have “f#@V)Hu^%Hgfdseasypassword” as well. When they run all of the 10 million stolen salted hashes against this table, they will again be able to find some matches.
How can this be prevented?
We can use a “unique salt” instead, which changes for each user.
A candidate for this kind of salt is the user’s id value from the database:
$hash = sha1($user_id . $password);
This is assuming that a user’s id number never changes, which is typically the case.
We may also generate a random string for each user and use that as the unique salt. But we would need to ensure that we store that in the user record somewhere.
// generates a 22 character long random string
function unique_salt() {
return substr(sha1(mt_rand()),0,22);
}
$unique_salt = unique_salt();
$hash = sha1($unique_salt . $password);
// and save the $unique_salt with the user record
// ...
This method protects us against Rainbow Tables, because now every single password has been salted with a different value. The attacker would have to generate 10 million separate Rainbow Tables, which would be completely impractical.
7. Problem #4: Hash Speed
Most hashing functions have been designed with speed in mind, because they are often used to calculate checksum values for large data sets and files, to check for data integrity.
How can this be exploited?
As I mentioned before, a modern PC with powerful GPU’s (yes, video cards) can be programmed to calculate roughly a billion hashes per second. This way, they can use a brute force attack to try every single possible password.
You may think that requiring a minimum 8 character long password might keep it safe from a brute force attack, but let’s determine if that is, indeed, the case:
- If the password can contain lowercase, uppercase letters and number, that is 62 (26+26+10) possible characters.
- An 8 character long string has 62^8 possible versions. That is a little over 218 trillion.
- At a rate of 1 billion hashes per second, that can be solved in about 60 hours.
And for 6 character long passwords, which is also quite common, it would take under 1 minute.
Feel free to require 9 or 10 character long passwords, however you might start annoying some of your users.
How can this be prevented?
Use a slower hash function.
Imagine that you use a hash function that can only run 1 million times per second on the same hardware, instead of 1 billion times per second. It would then take the attacker 1000 times longer to brute force a hash. 60 hours would turn into nearly 7 years!
One way to do that would be to implement it yourself:
function myhash($password, $unique_salt) {
$salt = "f#@V)Hu^%Hgfds";
$hash = sha1($unique_salt . $password);
// make it take 1000 times longer
for ($i = 0; $i < 1000; $i++) {
$hash = sha1($hash);
}
return $hash;
}
Or you may use an algorithm that supports a “cost parameter,” such as BLOWFISH. In PHP, this can be done using the crypt() function.
function myhash($password, $unique_salt) {
// the salt for blowfish should be 22 characters long
return crypt($password, '$2a$10$'.$unique_salt);
}
The second parameter to the crypt() function contains some values separated by the dollar sign ($).
The first value is ‘$2a’, which indicates that we will be using the BLOWFISH algorithm.
The second value, ‘$10′ in this case, is the “cost parameter”. This is the base-2 logarithm of how many iterations it will run (10 => 2^10 = 1024 iterations.) This number can range between 04 and 31.
Let’s run an example:
function myhash($password, $unique_salt) {
return crypt($password, '$2a$10$'.$unique_salt);
}
function unique_salt() {
return substr(sha1(mt_rand()),0,22);
}
$password = "verysecret";
echo myhash($password, unique_salt());
// result: $2a$10$dfda807d832b094184faeu1elwhtR2Xhtuvs3R9J1nfRGBCudCCzC
The resulting hash contains the algorithm ($2a), the cost parameter ($10), and the 22 character salt that was used. The rest of it is the calculated hash. Let’s run a test:
// assume this was pulled from the database
$hash = '$2a$10$dfda807d832b094184faeu1elwhtR2Xhtuvs3R9J1nfRGBCudCCzC';
// assume this is the password the user entered to log back in
$password = "verysecret";
if (check_password($hash, $password)) {
echo "Access Granted!";
} else {
echo "Access Denied!";
}
function check_password($hash, $password) {
// first 29 characters include algorithm, cost and salt
// let's call it $full_salt
$full_salt = substr($hash, 0, 29);
// run the hash function on $password
$new_hash = crypt($password, $full_salt);
// returns true or false
return ($hash == $new_hash);
}
When we run this, we see “Access Granted!”
8. Putting it Together
With all of the above in mind, let’s write a utility class based on what we learned so far:
class PassHash {
// blowfish
private static $algo = '$2a';
// cost parameter
private static $cost = '$10';
// mainly for internal use
public static function unique_salt() {
return substr(sha1(mt_rand()),0,22);
}
// this will be used to generate a hash
public static function hash($password) {
return crypt($password,
self::$algo .
self::$cost .
'$' . self::unique_salt());
}
// this will be used to compare a password against a hash
public static function check_password($hash, $password) {
$full_salt = substr($hash, 0, 29);
$new_hash = crypt($password, $full_salt);
return ($hash == $new_hash);
}
}
Here is the usage during user registration:
// include the class
require ("PassHash.php");
// read all form input from $_POST
// ...
// do your regular form validation stuff
// ...
// hash the password
$pass_hash = PassHash::hash($_POST['password']);
// store all user info in the DB, excluding $_POST['password']
// store $pass_hash instead
// ...
And here is the usage during a user login process:
// include the class
require ("PassHash.php");
// read all form input from $_POST
// ...
// fetch the user record based on $_POST['username'] or similar
// ...
// check the password the user tried to login with
if (PassHash::check_password($user['pass_hash'], $_POST['password']) {
// grant access
// ...
} else {
// deny access
// ...
}
9. A Note on Blowfish Availability
The Blowfish algorithm may not be implemented in all systems, even though it is quite popular by now. You may check your system with this code:
if (CRYPT_BLOWFISH == 1) {
echo "Yes";
} else {
echo "No";
}
However, as of PHP 5.3, you do not need to worry; PHP ships with this implementation built in.
Conclusion
This method of hashing passwords should be solid enough for most web applications. That said, don’t forget: you can also require that your members use stronger passwords, by enforcing minimum lengths, mixed characters, digits & special characters.
A question to you, reader: how do you hash your passwords? Can you recommend any improvements over this implementation?
Continue Reading
CodeIgniter from Scratch: Displaying & Sorting Tabular Data
Posted on 07. Sep, 2010 by Burak Guzel.
In today’s video tutorial, we are going to use CodeIgniter to pull information from a database and display it in a sortable table structure. This is a quite common task, especially inside admin areas, for displaying database records. We’ll be utilizing a few different utilities, such as the active records library and pagination. Catch Up [...]
Continue Reading
Quick Tip: Working with MySQL and SUM
Posted on 31. Aug, 2010 by Burak Guzel.
In today’s quick tip screencast, we’re going to learn a MySQL query trick. You might be already familiar with the SUM() and IF() functions. We will be combining them to come up with some useful queries. This can reduce the number of queries you need to run for fetching certain types of summary data, and [...]
Continue Reading
CodeIgniter from Scratch: Profiling, Benchmarking & Hooks
Posted on 05. Aug, 2010 by Burak Guzel.
In this 15th episode of the series, we are going to learn about three subjects: Profiling, Benchmarking and Hooks. You can use these tools to analyze your CodeIgniter applications performance, and figure out what part of the code you need to optimize. We are also going to make even further improvements to the Profiler library [...]
Continue Reading
New Premium Tutorial: All About the “Static” Keyword
Posted on 14. Jul, 2010 by Burak Guzel.
Today, we are going to learn how to use the ’static’ keyword in PHP. We are going go over the basics and build some examples. It is time to add this interesting language feature to your web development bag of tricks. Become a Premium member today to see this and many other tutorials! Join Net [...]
Continue Reading
CodeIgniter from Scratch: Security
Posted on 30. Jun, 2010 by Burak Guzel.
In this fourteenth episode of the series, our subject is ‘Security.’ We will go over topics such as: password encryption, message encoding/decoding, XSS filtering, output filtering, sql injection, session security, private controller methods and more. Catch Up Day 1: Getting Started With the Framework Day 2: Database Selecting Methods Day 3: Sending Emails Day 4: [...]
Continue Reading
9 Useful PHP Functions and Features You Need to Know
Posted on 27. Apr, 2010 by Burak Guzel.
Even after using PHP for years, we stumble upon functions and features that we did not know about. Some of these can be very useful, yet underused. Not all of us have read the manual and the function reference from cover to cover!
1. Functions with Arbitrary Number of Arguments
You may already know that PHP allows [...]
Continue Reading
Advanced Regular Expression Tips and Techniques
Posted on 16. Apr, 2010 by Burak Guzel.
Regular Expressions are the Swiss Army knife for searching through information for certain patterns. They have a wide arsenal of tools, some of which often go undiscovered or underutilized. Today I will show you some advanced tips for working with regular expressions.
Adding Comments
Sometimes regular expressions can become complex and unreadable. A regular expression you write [...]
Continue Reading
CodeIgniter from Scratch: Extending the Framework
Posted on 02. Apr, 2010 by Burak Guzel.
In today’s screencast we are going to learn how to create our own helpers and libraries. We are also going to learn how to extend the existing helpers and libraries that come with CodeIgniter. At the end of the tutorial we will extend the Form Validation library to give it the ability to check for [...]
Continue Reading
CodeIgniter from Scratch: Shopping Cart
Posted on 25. Mar, 2010 by Burak Guzel.
Today we are going to take a look at the Shopping Cart library for CodeIgniter. This useful class allows us to add and remove items to a shopping cart, update them, and calculate prices. I will demonstrate how you can build a simple shopping cart system with the help of this library..
Catch Up
Day 1: Getting [...]
Continue Reading
CodeIgniter from Scratch: File Operations
Posted on 11. Mar, 2010 by Burak Guzel.
In today’s episode we are going to be working with several helper functions, related to files, directories, and downloads. We are going to learn how to read, write, download files, and retrieve information about both files and directories. Also at the end we will build a small file browser that utilizes jQuery as well.
Catch Up
Day [...]

