Learning about Encryption, Encoding, and Hashing

Published in InfoSec Write-ups · 6 min read · Mar 17, 2023

BASIC INTRODUCTION

We no longer live in the 90s, when the internet had only just become accessible to the public. With the amount of information now available, and the sheer number of tools that make exploiting vulnerabilities easier, proper security controls are a necessity.

Security mechanisms must protect data not only at rest but also in transit. All it takes is a network sniffer on an unprotected connection for the data to lose its confidentiality.

Passwords and security keys are often transmitted over the network. If they fall into the wrong hands, attackers can compromise the integrity of the data or knock the systems offline, which affects the availability of both the data and the systems.

In a matter of minutes, all three elements of the CIA Triad (Confidentiality, Integrity, and Availability) can be attacked and compromised.

This is where cryptographic algorithms and hashing techniques come into play to secure the information that users provide to a website or organization, with encoding mechanisms keeping that data intact during transmission and storage.

If properly implemented, these can help protect the confidentiality and the integrity of the data.

What is Encoding and Why is it being used?

Encoding and decoding are terms often used in the context of digital electronics. Encoding is the process of converting data or a signal into another form for reliable transmission or storage, and decoding reverses that conversion.

Data is often encoded so that it keeps its meaning while being transmitted. For example, if you were to search for “who am I” on Google or any other search engine, you’ll see that the URL changes to something like ?query=who+am+I. Why did the spaces get replaced by +? In an HTTP request, a literal space is not allowed inside a URL, because spaces separate the parts of the request line. Leaving the spaces in would break the request. So, we say that the query got URL encoded for safe transmission.
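The search example can be reproduced with Python's standard library. This is a minimal sketch: the parameter name `query` comes from the example above, and the second string is a made-up input added only to show how reserved characters are handled.

```python
from urllib.parse import quote_plus, unquote_plus, urlencode

query = "who am I"

# quote_plus replaces spaces with '+' and percent-encodes reserved characters.
encoded = quote_plus(query)
decoded = unquote_plus(encoded)

# urlencode builds a whole query string from key/value pairs.
query_string = "?" + urlencode({"query": query})

# Characters such as '&' and '=' have a meaning inside a query string,
# so they are percent-encoded when they appear in a value.
reserved = quote_plus("a&b=c")

print(encoded)
print(decoded)
print(query_string)
print(reserved)
```

No key or secret is involved at any point, which is why decoding needs nothing more than knowing the scheme.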

Similarly, many encoding schemes are used to prepare data for storage or transmission. The most common ones include Base64, URL encoding, and hex.

People often confuse encoding with encryption. We will get to encryption in a moment, but the key point is that encoding involves no secret: anyone can decode data after it has been encoded.

It wouldn’t be hard for you to decode “?query=who+am+I” to “who am I” once I told you that + stands for a space. The same holds for Base64, hex, and other encoding schemes.
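The same point can be shown for Base64 and hex. The sketch below uses only the Python standard library and reuses the string from the search example: anyone who sees the encoded form can reverse it without a key.

```python
import base64

text = "who am I"
raw = text.encode("utf-8")

# Encode the same bytes in two common schemes.
b64 = base64.b64encode(raw).decode("ascii")
hexed = raw.hex()

# Decoding needs no secret, only knowledge of the scheme.
from_b64 = base64.b64decode(b64).decode("utf-8")
from_hex = bytes.fromhex(hexed).decode("utf-8")

print(b64)
print(hexed)
print(from_b64, "|", from_hex)
```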

Even now, many developers encode usernames and user IDs and pass them to a backend function before sending the response to the user. If you come from a security background, you can guess the problem: an attacker can pick another user’s ID, encode it, and pass it to the same function, receiving a response that should not be visible to them.
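Here is a small sketch of that mistake. Everything in it is hypothetical and made up for illustration: the `PROFILES` records, the user IDs, and the function names. The insecure function trusts whatever ID the encoded value decodes to, while the second one also checks it against the authenticated session.

```python
import base64

# Hypothetical in-memory "database"; records are invented for this example.
PROFILES = {"1001": "alice's private profile", "1002": "bob's private profile"}

def encode_user_id(user_id: str) -> str:
    """Base64-encode a user ID. This hides nothing from an attacker."""
    return base64.urlsafe_b64encode(user_id.encode("utf-8")).decode("ascii")

def insecure_get_profile(token: str) -> str:
    """Trusts the decoded ID; encoding is the only 'protection'."""
    user_id = base64.urlsafe_b64decode(token).decode("utf-8")
    return PROFILES[user_id]

def get_profile(token: str, session_user_id: str) -> str:
    """Also checks that the decoded ID belongs to the logged-in user."""
    user_id = base64.urlsafe_b64decode(token).decode("utf-8")
    if user_id != session_user_id:
        raise PermissionError("not allowed to read another user's profile")
    return PROFILES[user_id]

# The attacker is logged in as 1001 but encodes someone else's ID.
forged = encode_user_id("1002")
leaked = insecure_get_profile(forged)
print(leaked)
```

The fix is not a cleverer encoding but an authorization check on the server, as in `get_profile`.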

What is Encryption and Why is it being used?

Encryption, in its simplest form, is the process of converting data into an unreadable form with the use of a key. As long as only you hold the key, no one else should be able to decrypt the data back to its original, readable form. This protects the confidentiality of the data.

As you can guess, encryption is the better choice when sending data over an untrusted public network or when storing it.

Think of encryption as a type of lock. Whoever holds the key gets access to the data.

Encryption comes in two types: symmetric and asymmetric. In symmetric encryption, the same key is used to both encrypt and decrypt the data.
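To make the "same key in both directions" idea concrete, here is a toy sketch using only the Python standard library. It is not a real cipher and must not be used to protect data: it XORs the message with a keystream derived from the key via SHA-256, with no nonce and no authentication. The function names and the sample key are assumptions made for this illustration. Real systems use a vetted algorithm such as AES from a maintained library.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def xor_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream. The same call encrypts and decrypts."""
    stream = keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))

key = b"shared-secret-key"          # both sides must know this key
plaintext = b"transfer 100 to bob"

ciphertext = xor_cipher(key, plaintext)     # encrypt
recovered = xor_cipher(key, ciphertext)     # decrypt with the same key
wrong = xor_cipher(b"some-other-key", ciphertext)

print(ciphertext.hex())
print(recovered)
```

Because XOR is its own inverse, one function covers both directions, which is the defining property of a symmetric scheme.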

Asymmetric encryption, on the other hand, uses a pair of mathematically related keys: one key (the public key) is used to encrypt, and a different key (the private key) is used to bring the unreadable (encrypted) data back to its original form.
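The two-key idea can be sketched with textbook RSA and deliberately tiny primes. This is only an illustration of the arithmetic, under the assumption that RSA is an acceptable stand-in for "asymmetric encryption" here. It has no padding and is trivially breakable, so real code should rely on a maintained cryptography library instead.

```python
# Textbook RSA with tiny primes (requires Python 3.8+ for pow(e, -1, phi)).
p, q = 61, 53
n = p * q                      # modulus, part of both keys
phi = (p - 1) * (q - 1)

e = 17                         # public exponent: (e, n) is the public key
d = pow(e, -1, phi)            # private exponent: (d, n) is the private key

message = 65                   # a message must be a number smaller than n

ciphertext = pow(message, e, n)      # anyone can encrypt with the public key
recovered = pow(ciphertext, d, n)    # only the private key holder can decrypt

print("public key:", (e, n))
print("private key:", (d, n))
print("ciphertext:", ciphertext)
print("recovered:", recovered)
```

The encrypting side never needs `d`, which is what lets a public key be shared openly.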

Encryption and encoding often go hand in hand. As we just read, encryption converts data to an unreadable form. That output is raw binary, which is awkward to transmit or store in text-based formats. So, the data is first encrypted, then encoded, and then sent or stored.
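A short sketch of the "encrypt, then encode" step follows. The `ciphertext` bytes here are made-up stand-ins for the output of a real cipher. The point is only that such bytes are not valid text, and Base64 turns them into plain ASCII that can travel in JSON, headers, or URLs.

```python
import base64

# Stand-in for real cipher output: arbitrary bytes, not valid UTF-8 text.
ciphertext = bytes([0x00, 0x9F, 0xFF, 0x0A, 0x1B, 0x80, 0xC3, 0x28])

# Trying to treat ciphertext as text fails.
try:
    ciphertext.decode("utf-8")
    is_text = True
except UnicodeDecodeError:
    is_text = False

# Encode for transport or storage, then decode on the receiving side.
wire_text = base64.b64encode(ciphertext).decode("ascii")
received = base64.b64decode(wire_text)

print(is_text)
print(wire_text)
print(received == ciphertext)
```

The encoding adds no security of its own. The confidentiality still comes entirely from the encryption step.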

But if encryption is more secure, why doesn’t everyone use it all the time? The answer is cost: computing time and power. Encryption comes with its own set of problems, but in layman’s terms, encrypting data takes processing power, and when the data is large, it can take a noticeable amount of time. This can add latency and degrade performance for your users.

What is hashing and its uses?

An encryption function accepts data and a key and converts the data into a different form called the ciphertext. The length of the ciphertext largely depends on the length of the input (the plaintext), and a key is always involved in turning the plaintext into the ciphertext.

Hashing, on the other hand, accepts data as input and produces a fixed-size value called a hash. A hash is a one-way value: it cannot practically be reversed to recover the input, and no key is involved. Different hashing functions produce hash values of different sizes. For example, let’s consider the MD5 hashing algorithm. No matter how much input you provide to MD5, the output will always be a 128-bit hash, written as 32 hexadecimal characters. What is special about hashing algorithms is that the hash looks random yet depends on the complete input, so changing even a single character produces a completely different hash. Hashes are not strictly unique, since two different inputs can collide. Practical collisions are known for MD5, which is why it is no longer considered safe for security purposes.
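These properties are easy to check with Python's `hashlib`. The sketch below is an illustration only: `md5_hex` is a helper name introduced here, and SHA-256 is included only to show that other hash functions have other output sizes.

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """Return the MD5 hash of `data` as a hexadecimal string."""
    return hashlib.md5(data).hexdigest()

# Fixed size: one byte or a million bytes both give 32 hex characters.
short_hash = md5_hex(b"a")
long_hash = md5_hex(b"a" * 1_000_000)

# Deterministic: the same input always gives the same hash.
same_again = md5_hex(b"a")

# A one-character change gives a completely different hash.
hash_hello = md5_hex(b"hello")
hash_hellp = md5_hex(b"hellp")

# Other hash functions have other output sizes (SHA-256: 64 hex characters).
sha256_hash = hashlib.sha256(b"a").hexdigest()

print(len(short_hash), len(long_hash), len(sha256_hash))
print(hash_hello)
print(hash_hellp)
```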

This means that if I have two files, file1.mp4 and file2.mp4, and provide them to the MD5 algorithm, I get one hash per file. Suppose the hashes produced are 12ba53ed7c3098c3d5e6d7f9821d54ca and 34bb51ed1c2801c4dce1d3c9821d54de respectively. Every time I supply the exact same files to the MD5 algorithm, the output hashes will be exactly the same.

Hashes are typically stored in place of passwords. For example, if you were to create an account on Gmail and supply my_secure_password@1 as the password, a well-designed service would not store this in its database. It would instead compute a hash of the password and store that. At login, it hashes the password you enter and compares the result with the stored hash. This ensures that even if the database is compromised, the plaintext passwords of users aren’t visible to the attackers.
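A minimal sketch of this flow is shown below. It goes slightly beyond the description above: the random per-user salt and the deliberately slow PBKDF2 function are common practice rather than something this article specifies, and the iteration count is only illustrative. The function names are assumptions made for the example.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative value; tune to current guidance

def hash_password(password: str):
    """Return (salt, digest) to store instead of the plaintext password."""
    salt = os.urandom(16)  # random salt so equal passwords hash differently
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-hash the login attempt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return hmac.compare_digest(candidate, expected)

# Registration: only the salt and digest go into the database.
salt, stored = hash_password("my_secure_password@1")

# Login attempts.
ok = verify_password("my_secure_password@1", salt, stored)
bad = verify_password("wrong-password", salt, stored)

print(ok, bad)
```

A plain fast hash such as MD5 is a poor fit here because attackers can try billions of guesses per second against a leaked database.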

Another use case of hashes is checking whether files were tampered with. The hashes of all critical files are calculated and stored in a separate database. When a host is compromised, forensic experts can detect which files were modified by recalculating the hashes of all the files and comparing them with the stored ones. If a file’s hash differs from the stored one, the file was changed; otherwise it was not.
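The integrity check described above can be sketched in a few lines. This example uses SHA-256 rather than MD5, a deliberate swap since MD5 collisions are practical. The file names, contents, and helper names are made up for the demonstration, and a dictionary stands in for the separate database.

```python
import hashlib
import tempfile
from pathlib import Path

def sha256_of_file(path: Path) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def snapshot(paths):
    """Record a baseline of {path: hash} for the given files."""
    return {str(p): sha256_of_file(p) for p in paths}

def find_tampered(baseline):
    """Return the paths whose current hash differs from the baseline."""
    return [
        name for name, old in baseline.items()
        if sha256_of_file(Path(name)) != old
    ]

with tempfile.TemporaryDirectory() as tmp:
    config = Path(tmp) / "config.txt"
    binary = Path(tmp) / "app.bin"
    config.write_text("debug=false\n")
    binary.write_bytes(b"\x00\x01\x02")

    baseline = snapshot([config, binary])   # taken while the host is clean

    config.write_text("debug=true\n")       # simulated tampering

    tampered = [Path(name).name for name in find_tampered(baseline)]

print(tampered)
```

In practice the baseline must be kept somewhere the attacker cannot modify, otherwise the stored hashes can be rewritten along with the files.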

Difference Between the Three

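Understanding the differences between encryption, encoding, and hashing is essential for anyone working with sensitive data or dealing with cybersecurity. In short, encoding can be reversed by anyone and is meant for safe transmission and storage, encryption can be reversed only with the key and protects confidentiality, and hashing is not meant to be reversed at all and is used to verify passwords and detect tampering. Each technique serves a distinct purpose and offers unique advantages when applied correctly.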

Furthermore, a strong grasp of these techniques can lead to more robust cybersecurity practices, ultimately safeguarding sensitive information and minimizing the risk of data breaches. As the digital landscape continues to evolve, staying informed about encryption, encoding, and hashing will remain crucial in maintaining data security and privacy.

Written by Security Lit Limited