The MD5 message-digest algorithm, commonly known as the md5 hash, is a cryptographic hash function mainly used to verify the integrity of files. Running the MD5 function against a file produces a 128-bit message digest, usually written as 32 hexadecimal characters.
MD5 has known weaknesses, most notably practical collision attacks, and is therefore not a good choice for security-sensitive uses such as digital signatures or password storage, but it is still well suited for basic file verification. It works by computing a checksum of a file and comparing the result to the original value. Any change to the file, however small, will in practice produce a different digest. The value stays constant no matter where or how many times it is generated, as long as the file remains unchanged.
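To make the two properties above concrete, here is a minimal sketch that hashes a short string from standard input instead of a file. The input strings are arbitrary examples, and it assumes md5sum is available on your system.

```shell
# Hash the same input twice; the digest depends only on the bytes hashed.
first=$(printf 'hello\n' | md5sum | cut -d ' ' -f 1)
second=$(printf 'hello\n' | md5sum | cut -d ' ' -f 1)

echo "first:   $first"
echo "second:  $second"

# A 128-bit digest is written as 32 hexadecimal characters.
echo "length:  ${#first}"

# Changing a single character produces a different digest.
changed=$(printf 'hellp\n' | md5sum | cut -d ' ' -f 1)
echo "changed: $changed"
```

The first two values should match, while the third should look unrelated to them.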
For this guide, we shall look at ways to generate an md5 hash value of a file. That will allow you to verify the integrity of files either from remote locations or on your local machine.
Linux and almost all major Unix and Unix-like systems come pre-installed with an md5 tool. The most common one is md5sum, which on Linux is part of GNU coreutils. By default, you should find it available in your system. You can confirm this with the which command:
$ which md5sum
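If you want to run the same check from a script, one possible approach is the shell's built-in command -v, which reports where a command would be found. This is only a sketch; the variable name md5sum_available is made up for illustration.

```shell
# command -v prints the path of md5sum if it is on PATH and fails otherwise.
if command -v md5sum >/dev/null 2>&1; then
    echo "md5sum found at: $(command -v md5sum)"
    md5sum_available=yes
else
    echo "md5sum not found; install it with your package manager" >&2
    md5sum_available=no
fi
```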
If you do not have the tool installed, you can install it with the package manager of your system. On the distributions below, md5sum is not a package of its own; it ships as part of the coreutils package.
On Ubuntu and other Debian-based distributions, use apt as:
sudo apt-get update
sudo apt-get install coreutils -y
On RHEL and CentOS, use yum as:
sudo yum update
sudo yum install coreutils
If you are on Manjaro or other Arch-based distributions, use pacman as:
sudo pacman -Sy
sudo pacman -S coreutils
Finally, on Fedora systems, use the dnf command as:
sudo dnf update
sudo dnf install coreutils
Generate Md5sum of a File
With the tool installed, we can proceed and generate an md5sum for a file. You can use any file available on your system. In my example, I am using the /etc/hosts file available on Linux systems.
To generate the md5sum of a file, simply use the md5sum command followed by the filename, as in the command below:
$ md5sum /etc/hosts
The above command prints the hash value of the file, a string of 32 hexadecimal characters, followed by the file name.
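As a small illustration of the output format, the sketch below hashes a scratch file rather than a real system file. The file name sample.txt and its contents are placeholders chosen for this example.

```shell
# Create a scratch file so the example does not depend on any real file.
workdir=$(mktemp -d)
printf 'hello world\n' > "$workdir/sample.txt"

# md5sum prints "<hash>  <file name>" for each file it is given.
line=$(md5sum "$workdir/sample.txt")
echo "$line"

# Keep only the hash by taking the first whitespace-separated field.
hash=$(echo "$line" | awk '{print $1}')
echo "hash only: $hash"

# Clean up the scratch directory.
rm -r "$workdir"
```

Extracting just the hash this way is handy when you want to store it or compare it in a script.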
Once the contents of the file change, the md5sum value becomes completely different. To see this, add an entry to the /etc/hosts file.
Any change will do, so feel free to adjust the entry as you see fit.
Now calculate the md5 value of the file with the new contents:
$ md5sum /etc/hosts
The hash value is different from the one generated earlier.
If you revert the file to its original contents, the md5sum value is identical to the original again, allowing you to know when a file has changed.
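The change-and-revert behavior can be tried safely on a scratch copy instead of the real /etc/hosts. The sketch below is one way to do it; the helper hash_of, the file name hosts.copy, and the sample host entries are all made up for illustration.

```shell
workdir=$(mktemp -d)
file="$workdir/hosts.copy"
printf '127.0.0.1 localhost\n' > "$file"

# Helper (hypothetical name): print only the hash of a file.
hash_of() { md5sum "$1" | cut -d ' ' -f 1; }

original=$(hash_of "$file")

# Keep a backup, append an illustrative entry, and hash again.
cp "$file" "$file.bak"
printf '192.168.0.10 example.local\n' >> "$file"
modified=$(hash_of "$file")

# Restore the original contents and hash once more.
mv "$file.bak" "$file"
reverted=$(hash_of "$file")

echo "original: $original"
echo "modified: $modified"
echo "reverted: $reverted"

rm -r "$workdir"
```

The original and reverted values should match, and the modified value should differ from both.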
NOTE: The md5 value stays the same even if the file gets renamed. This is because md5 is calculated from the file contents and not the filename.
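A quick way to convince yourself of this is to hash a scratch file, rename it, and hash it again. The file names below are placeholders.

```shell
workdir=$(mktemp -d)
printf 'some content\n' > "$workdir/old-name.txt"

# Hash before renaming.
before=$(md5sum "$workdir/old-name.txt" | cut -d ' ' -f 1)

# Rename the file without touching its contents, then hash again.
mv "$workdir/old-name.txt" "$workdir/new-name.txt"
after=$(md5sum "$workdir/new-name.txt" | cut -d ' ' -f 1)

if [ "$before" = "$after" ]; then
    echo "hash unchanged after rename"
fi

rm -r "$workdir"
```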
Verify Online Files
Suppose you want to verify the integrity of a downloaded file and ensure it has not been tampered with. To do this, all you need is the original md5 value published by the provider of the file. In my example, I am using a simple deb package of MySQL from the resource below:
Download the file with wget:
Once the file has downloaded, compute its md5 value using the command:
$ md5sum libmysqlclient21_8.0.25-1debian10_amd64.deb
If the file has not been modified in any way, you should get exactly the same value as the original.
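Instead of comparing the two values by eye, you can let md5sum do the comparison with its -c option, which reads lines of the form "hash  filename" and re-checks each file. The sketch below assumes a stand-in file named package.deb and generates the checksum file locally; in practice the checksum would come from the download page.

```shell
workdir=$(mktemp -d)

# Stand-in for a downloaded file (hypothetical name and contents).
printf 'pretend this is a package\n' > "$workdir/package.deb"

# Stand-in for the published checksum, recorded as "<hash>  <file name>".
(cd "$workdir" && md5sum package.deb > package.deb.md5)

# -c re-hashes each listed file and compares it with the recorded value.
if (cd "$workdir" && md5sum -c package.deb.md5); then
    result=match
else
    result=mismatch
fi
echo "result: $result"

# Simulate tampering: the same check should now fail.
printf 'extra bytes\n' >> "$workdir/package.deb"
if (cd "$workdir" && md5sum -c package.deb.md5 2>/dev/null); then
    tampered=match
else
    tampered=mismatch
fi
echo "after tampering: $tampered"

rm -r "$workdir"
```

Because md5sum -c exits with a non-zero status on a mismatch, it is easy to use in scripts.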
This tutorial looked at a simple method to generate the md5 checksum of a file and use it to check whether the file has been modified.
Here is a quick exercise for you.
Create a simple bash script that checks every 5 minutes whether the md5 value of a file has changed. If the file has changed, delete the file and shut down the system.
That should be a fun exercise!
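If you would like a starting point, here is one possible skeleton. It is only a sketch: the function names hash_of, is_unchanged, and watch_file are made up, and the delete and shutdown steps are deliberately left out so the script is safe to experiment with.

```shell
# Print only the hash of a file.
hash_of() { md5sum "$1" | cut -d ' ' -f 1; }

# Succeeds if the file exists and still matches the recorded hash.
is_unchanged() {
    [ -f "$1" ] && [ "$(hash_of "$1")" = "$2" ]
}

# Record a baseline hash, then re-check at a fixed interval.
watch_file() {
    file=$1
    interval=${2:-300}   # seconds between checks; 300 is 5 minutes
    baseline=$(hash_of "$file")
    while is_unchanged "$file" "$baseline"; do
        sleep "$interval"
    done
    echo "change detected in $file"
    # The actions from the exercise (deleting the file and shutting
    # down the system) would go here; they are omitted in this sketch.
}
```

You could start it with something like watch_file /path/to/file, where the path is whichever file you want to monitor. The loop only ends once the file changes or disappears.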