Learning Resources for Software Engineering Students ยป
Author: Jeremy Choo
Reviewers: Amrut Prabhu, Marvin Chin, Tan Zhen Yong,
Wang Junming
Many software applications use a username and password combination as user account credentials for authentication. Obviously, it is not a good idea for the software to store these credentials as
Encryption is the process of converting plaintext into
One common example of encryption is the use of shifting each letter of the alphabet to the left or right by a number of positions. This is known as the Caesar cipher For example, if we chose to shift all the letters by 3, then the encryption key (and decryption key) for this algorithm would be 3. This would result in the following encryption algorithm:
Plaintext: ABCDEFGHIJKLMNOPQRSTUVWXYZ
Ciphertext: DEFGHIJKLMNOPQRSTUVWXYZABC
This would mean that encrypting the message I love you
would result in L oryh brx
Naturally, this isn't a very good encryption method because even if one doesn't know the decryption key, the method can be easily
Another encryption method is the Pigpen Cipher, where letters are
In this case, each letter is substituted with a symbol that matches the exterior walls of where that letter is. For example, the letter W
would be encrypted to the symbol . If the letter is located on the right side instead, then a dot is placed in the middle
of the symbol to indicate that it refers to the right letter instead of the one on the left. For example, the letter P would be encrypted to the symbol
Naturally, the decryption key would be the encryption key itself, as it can be re-used to decrypt the ciphertext. Compared to the Caesar cipher, The Pigpen cipher is more resilient to brute force attacks if one doesn't know the decryption key, as it could result in one trying all possible substitution for each symbol.
I had a secret agent send me information about the secret ingredient of Mick's cheeseburgers earlier, as they were delicious and I found myself constantly eating it. I suspect it's some addictive substance to make customers keep coming back for more. To ensure that Mick didn't know their secret ingredient was being leaked, I had my agent send it in Pigpen cipher:
Encryption might seem like a good idea because the ciphertext is meaningless without the decryption key, which prevents all of the problems with storing the data directly in plaintext. However, because encryption is
Hashing is a
Some examples of cryptographic
For example, a simple hashing algorithm that acts on numbers could add up all the digits in that number. This would mean hashing the number 1013
would result in 1+0+1+3 = 5
. Hashing the number 761
would
result in 14
. Note that after hashing the number, there is no way to regain back the original number - data about the original number (such as number and position of digits) have been irrevocably lost in the process. Additionally,
many different numbers could result in the same hash. For example, the numbers 101
and 20
both result in the hash of 2
. This is called a hash collision. A good hashing algorithm attempts to minimize
the amount of hash collisions such that the probability of it happening is close to 0. In the case of the MD5 algorithm, the probability of a hash collision given any two inputs is 1 in 2128 which is 1 in 340,282,366,920,938,463,463,374,607,431,768,211,456.
A rainbow table is a precomputed table of hashes for some set of passwords. Basically, people build huge tables of hashes wherein the plaintext is already known, so that attempting to crack hashes becomes a simple problem of looking up the hash in the table and its corresponding value, instead of attempting to reverse the hash. Through this method, it is very easy to crack simple hashes by simply doing a lookup.
An example of a service that provides this is Crackstation.
Since attacks like rainbow tables exist, passwords need another layer of security.
Salting refers to appending a string of text, unique to each user, to their password before hashing them. Since each user has a unique salt, this makes rainbow tables ineffective, as the majority of the precomputed hashes won't even contain the salt, so they wouldn't even matter anyway! In this way, the salt forces the attacker to recompute the rainbow table for each password in order to be able to effectively use it. This effectively converts the attack to brute force, as each hash must be recomputed for each possible password. Additionally, the computed rainbow table would only be useful for that specific user, as each user would have a different salt. It could take years before a password is cracked!
Note that the salt should be randomly generated, as opposed to choosing a static value that is different for every user. For example, if an application used the username of the user as the salt, then attackers can pre-generate rainbow tables for common usernames, causing users with common username to be vulnerable to a rainbow table attack, even with salt applied to their passwords.
Thus, the way to store passwords properly is to use a salted hash - take the user's password, append some data to it, and hash the result. That hash is the user's hashed password. When the user attempts to log in again, perform the same procedure again. If the hash matches, then you know that the user is who he says he is, even if you don't actually know the original password that the user provided.
One question that is commonly asked by developers is where to store the salt. The salt can be stored in plaintext, along with the user in the database. Since the goal of the salt is only to prevent precomputed rainbow tables from being used, it doesn't need to be encrypted in the database.
Many good password hashing algorithms today have built-in salts, such as Argon2, SCrypt and bcrypt. A good password hashing algorithm or library will salt automatically.
A common question asked by developers is how much all of these security measures actually matter. After all, if an attacker has gained access to the entire application, does it matter if passwords are stored in plaintext or not?
If an attacker has already gained access to the entire application, then he already has all the information that he could possibly want from the server. He would have access to all of the application's data, including data from users or from any analytics software that might be running. However, by adding salt and hashing passwords, the attacker still doesn't know customer's passwords and could take years to find out. Otherwise, since 59% of people use the same password across multiple sites source, the attacker could quickly try other websites such as banks to attempt to break into those accounts, which can potentially yield great returns in terms of information and/or money.
Additionally, when a user signed up on your website and provided you a password, they implicitly trusted you to keep that information safe and secure for them. In a sense, you do have a responsibility to keep their passwords secure. By doing proper password storage, if your servers ever get breached, you can assure your customers that their passwords are properly secured, and still maintain some of their trust in you.
Here are some libraries you can use to implement password storage: