Home of WizardBlue.com
Hash - In Detail

The WizardBlue Hash program uses a set of advanced and highly efficient library routines to perform various 'finger-print' functions.  This provides you with a fast, accurate and simple tool for the job of file 'finger-printing' and verification.  Below are some of the more technical details of hashing and the Hash program.

So, what is a Hash ?

The hash of a file, also called its digest, is the fixed-length result of a mathematical function computed over the contents of the file.  It is not guaranteed to be unique, but it is statistically extremely unlikely that any two files would produce the same digest.  Two very similar files are no more likely to collide than two unrelated ones: with a good hash function, changing even a single bit of the input produces a completely different digest.
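As a rough illustration of the point about similar inputs, the sketch below hashes two strings that differ by one letter and counts how many bits of the two SHA-1 digests differ.  It uses Python's standard hashlib module rather than the WizardBlue Hash program itself, and the helper names sha1_hex and differing_bits are made up for this example.

```python
import hashlib

def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of data as a hexadecimal string."""
    return hashlib.sha1(data).hexdigest()

def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length digests differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

# Two inputs that differ by a single letter ("dog" vs "cog").
first = b"The quick brown fox jumps over the lazy dog"
second = b"The quick brown fox jumps over the lazy cog"

print(sha1_hex(first))
print(sha1_hex(second))

changed = differing_bits(hashlib.sha1(first).digest(),
                         hashlib.sha1(second).digest())
print(changed, "of 160 digest bits differ")
```

For a well-behaved hash function you would expect roughly half of the digest bits to change, even though only one letter of the input did.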

A hash is generated by scanning every byte within a file, passing these bytes through a complex mathematical function and, at the end, producing the hash digest result.  For a given hash function, the digest is always the same length.  For SHA-1 it is 160 bits (20 bytes) long, regardless of the size of the input file.  That is, a file of 1 byte and a file of 20GB both produce a 160-bit result for SHA-1.
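The scanning process described above can be sketched in a few lines.  This is only an illustration using Python's standard hashlib module, not the routine used inside the Hash program; the function name file_digest and the 64KB chunk size are assumptions chosen for the example.

```python
import hashlib
import os
import tempfile

def file_digest(path: str, algorithm: str = "sha1",
                chunk_size: int = 65536) -> bytes:
    """Feed a file through the hash function one chunk at a time."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)          # every byte passes through the function
    return h.digest()

# Hash a 1-byte file and a 1,000,000-byte file and compare digest lengths.
with tempfile.TemporaryDirectory() as tmp:
    small = os.path.join(tmp, "small.bin")
    large = os.path.join(tmp, "large.bin")
    with open(small, "wb") as f:
        f.write(b"x")
    with open(large, "wb") as f:
        f.write(b"x" * 1_000_000)
    small_len = len(file_digest(small))
    large_len = len(file_digest(large))

print(small_len, large_len)  # prints: 20 20
```

Reading in chunks means the whole file never has to fit in memory, which matters for inputs in the gigabyte range.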

The earliest form of hash was the ubiquitous CRC (Cyclic-Redundancy-Check).  In its day, it was used for all sorts of data verification and it is still in common use today.  However, it does have a shortfall:  It is possible to modify the data in such a way that the CRC result stays the same.  This means that the data can be changed without needing to change the expected CRC verification value.
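To make this shortfall concrete, here is a minimal sketch that alters a message while keeping its CRC-32 unchanged.  It relies on the fact that, for messages of equal length, the effect of bit flips on a CRC combines by XOR, so among any 33 or more single-bit flips there is always a combination whose effects cancel out.  The function name crc_preserving_patch and the sample message are invented for this example; zlib.crc32 is the standard Python CRC-32 routine.

```python
import hashlib
import zlib

def crc_preserving_patch(data: bytes, offset: int = 0) -> bytes:
    """Flip some bits in data[offset:offset+5] so that CRC-32 is unchanged."""
    if offset < 0 or offset + 5 > len(data):
        raise ValueError("need 5 patchable bytes at the given offset")
    base = zlib.crc32(data)
    basis = {}  # highest set bit of a CRC change -> (change, flipped-bit mask)
    for bit in range(40):
        buf = bytearray(data)
        buf[offset + bit // 8] ^= 1 << (bit % 8)
        delta = zlib.crc32(bytes(buf)) ^ base   # CRC change from this one flip
        mask = 1 << bit                         # which bits produce that change
        while delta:
            top = delta.bit_length() - 1
            if top not in basis:
                basis[top] = (delta, mask)      # new independent change
                break
            prev_delta, prev_mask = basis[top]
            delta ^= prev_delta                 # cancel against earlier flips
            mask ^= prev_mask
        else:
            # The changes cancelled completely: flipping every bit recorded
            # in mask leaves the CRC exactly as it was.
            patched = bytearray(data)
            for b in range(40):
                if (mask >> b) & 1:
                    patched[offset + b // 8] ^= 1 << (b % 8)
            return bytes(patched)
    raise AssertionError("unreachable: 40 changes cannot all be independent")

original = b"Pay Alice 100 dollars"
forged = crc_preserving_patch(original, offset=4)

print(forged != original)                                # True
print(zlib.crc32(forged) == zlib.crc32(original))        # True
print(hashlib.sha256(forged).digest() ==
      hashlib.sha256(original).digest())
```

The patched bytes here are not meaningful text, but the same linear-algebra trick lets an attacker change the part of the data they care about and then fix up a few spare bytes elsewhere to restore the CRC.  No comparably cheap trick is known for SHA-256.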

In today's world, where data security is at the forefront of system integrity, the CRC is ideal as a 'quick' check to make sure that the data has been transferred OK.  But it is not strong enough to detect modification or tampering of the data by a skilled attacker.  Don't get me wrong, CRCs are here to stay; they are far too important and efficient in areas such as low-level data transmission and storage.  But if you want to be sure that data has not been expertly tampered with, you need a much stronger means of verification, and this is where hashing comes in.

At a slightly higher level of 'finger-printing' is the digital signature.  This not only verifies the data content to detect tampering or corruption, but it also verifies who sent it.  Hashing sits between CRCs and digital signatures, closer to digital signatures than to CRCs.

In fact, many digital signature systems only sign the hash value of the data, not the data itself.  So with hashing you are pretty close to digital signatures with a much lower system overhead.

What are the different Hash types ?

There are three common hash functions around; these are briefly discussed below.  If you require more information, you can follow the external links provided, or research them yourself.

Hash: MD5

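This is one of the older hash algorithms and is still in common use.  It produces a 128-bit (16-byte) digest.  It is considered to be less secure than the more modern hash functions: practical collision attacks against MD5 are known, so it should not be relied on to detect deliberate tampering.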

Hash: SHA-1

This Secure Hash Algorithm (SHA-1) was published by NIST as a Federal Information Processing Standard (FIPS); details of the specification can be found here:  FIPS PUB 180-1.  Hash generates a FIPS-compatible condensed representation (digest) of the file specified.  This is a 160-bit (20-byte) digest of the file.  The maximum input length is just under 2^64 bits (that is a very long file, by the way!).

You can use this digest result to verify that the content of the file has not changed.  For greater security, you should consider using the 256-bit SHA-256 digest result, see below.

Hash: SHA-256

This Secure Hash Algorithm (SHA-256) was released recently by NIST, again as a FIPS standard, as a more secure successor to SHA-1; full specification details can be found here:  FIPS PUB 180-2 (Secure Hash Standard).  Hash generates a FIPS-compatible condensed representation (digest) of the file specified.  This is a 256-bit (32-byte) digest of the file.  The maximum input length is just under 2^64 bits (that is a very long file, by the way!).  SHA-256 is designed to provide 128-bit security against collision attacks.
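The 128-bit figure follows from the so-called birthday bound: for an n-bit digest, a generic collision search needs on the order of 2^(n/2) attempts.  The short sketch below works this out for the three hash types discussed here, using the digest sizes reported by Python's hashlib module.  The function name generic_collision_bits is invented for the example, and the result is only the design-level upper bound: the known attacks on MD5 and SHA-1 are much faster than this figure suggests.

```python
import hashlib

def generic_collision_bits(algorithm: str) -> int:
    """Birthday-bound collision security, in bits, for a named hash."""
    digest_bits = hashlib.new(algorithm).digest_size * 8
    return digest_bits // 2

for name in ("md5", "sha1", "sha256"):
    digest_bits = hashlib.new(name).digest_size * 8
    print(f"{name}: {digest_bits}-bit digest, "
          f"at most {generic_collision_bits(name)}-bit collision security")
```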

You can use this digest result to verify that the file has not changed content.
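A verification check of this kind might look like the sketch below.  It is an assumption-laden illustration in Python, not the Hash program's own code: the function name matches_digest is hypothetical, and the expected digest is assumed to be a hexadecimal string obtained from a source you trust.

```python
import hashlib
import hmac

def matches_digest(path: str, expected_hex: str,
                   algorithm: str = "sha256") -> bool:
    """Return True if the file's digest equals the expected hex digest."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # Read in 64KB chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    # Normalise case and whitespace, then compare in constant time.
    return hmac.compare_digest(h.hexdigest(), expected_hex.strip().lower())
```

A typical use would be to call matches_digest with the path of a downloaded file and the digest published alongside it, and to reject the file if the result is False.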

Note: The SHA-256 result is NOT displayed with the LITE version.  If you need to use this result, you should consider purchasing a full copy of WizardBlue Hash.