Cryptography for Network and Information Security

Message Digest Functions

Message digest functions, also called hash functions, produce digital summaries of information called message digests. Message digests (also called hashes) are commonly 128 to 160 bits long and serve as a digital identifier for a file or document. A message digest function is a mathematical function that processes information so that, in practice, each unique document yields a different message digest. Identical documents have the same message digest, but if even one bit of the document changes, the message digest changes. Figure 14.3 shows the basic message digest process.

Figure 14.3 Example of the Message Digest Process
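
The behavior described above can be sketched in a few lines of Python. This is a minimal illustration, assuming SHA-1 from the standard `hashlib` module as the digest function; the sample message and the helper name `digest` are hypothetical and chosen only for the example.

```python
import hashlib

def digest(data: bytes) -> str:
    """Return the SHA-1 message digest of data as a hexadecimal string."""
    return hashlib.sha1(data).hexdigest()

# Hypothetical sample document.
original = b"Pay Alice 100 dollars"
copy = b"Pay Alice 100 dollars"

# Flip the lowest bit of the first byte: 'P' (0x50) becomes 'Q' (0x51).
altered = bytes([original[0] ^ 0x01]) + original[1:]

print(digest(original))                   # digest of the original
print(digest(copy))                       # identical input, identical digest
print(digest(altered))                    # one bit changed, different digest
print(len(digest(original)) * 4, "bits")  # 40 hex digits = 160 bits
```

The first two printed digests are the same, and the third differs even though the inputs differ by a single bit.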

Because message digests have a fixed, finite length and are much shorter than the data from which they are generated, different data sets can produce the same message digest. Such duplicate message digests are called collisions. Good message digest functions are one-way functions: it is computationally infeasible to reverse the message digest process and discover the original data. Finding collisions for a good message digest function is also computationally infeasible, although it remains possible in principle given enough time and computational effort.

Even if an attacker discovers a collision by brute-force search, it is highly improbable that the collision would be useful. For example, assume that an English message produces a message digest with a value of n, and an attacker somehow manages to computationally generate a second set of data that also produces a message digest of n. The second set of data would have to be in the English language and form a coherent and germane message for the attacker to be able to use it for an illicit purpose, such as sending a counterfeit message in place of the original message. With a good message digest function, the probability that a randomly found second set of collision data would be in a known language or form a coherent message is minuscule.
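
To see why collisions must exist for any fixed-length digest, it can help to shrink the digest until a collision is easy to find. The sketch below is a toy under stated assumptions: it keeps only the first 16 bits of a SHA-256 digest (the truncation, the helper names, and the `message-N` inputs are all hypothetical), so at most 65,536 distinct digests exist and a search is guaranteed to find two inputs that share one. A full-length digest makes the same search computationally infeasible.

```python
import hashlib

def tiny_digest(data: bytes) -> bytes:
    """First 2 bytes (16 bits) of SHA-256: a deliberately weak toy digest."""
    return hashlib.sha256(data).digest()[:2]

def find_collision():
    """Hash message-0, message-1, ... until two inputs share a toy digest."""
    seen = {}  # maps toy digest -> first message that produced it
    counter = 0
    while True:
        message = b"message-%d" % counter
        d = tiny_digest(message)
        if d in seen:
            return seen[d], message
        seen[d] = message
        counter += 1

first, second = find_collision()
print(first, second, tiny_digest(first).hex())
```

The two colliding inputs found this way are arbitrary strings, which illustrates the point made above: a collision located by blind search is very unlikely to be a message the attacker would actually want to send.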

Message digests are commonly used in conjunction with public key technology to create digital signatures, or "digital thumbprints," that provide authentication, integrity, and nonrepudiation. Digitally signed message digests are also commonly used to provide data integrity for electronic files and documents.

For example, to provide data integrity for an e-mail message, a message digest can be generated from the completed mail message, digitally signed with the originator's private key, and then transmitted with the message. The recipient of the message can then do the following to check the integrity of the message:

Use the same message digest function to compute a digest for the message.
Use the originator's public key to verify the signed message digest.
Compare the new message digest to the original digest.

If the two message digests do not match, the recipient knows the message was altered or corrupted. Figure 14.4 shows a basic integrity check process with a digitally signed message digest.

Enlarge figure

Figure 14.4 Example of an Integrity Check with a Digitally Signed Message Digest
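
The three recipient steps above can be sketched as follows. This is a toy, not a usable signature scheme: it assumes textbook RSA with no padding, a small key built from two known Mersenne primes, and SHA-256 as the digest function. All names (`sign`, `check_integrity`, `digest_as_int`) and key values are hypothetical and exist only for illustration. Production systems rely on a vetted cryptographic library, much larger keys, and a standardized padding scheme.

```python
import hashlib

# Toy RSA key built from two known Mersenne primes. Far too small for real use.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                          # public modulus
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (requires Python 3.8+)

def digest_as_int(message: bytes) -> int:
    """SHA-256 digest of message, reduced so it fits below the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """Originator: sign the message digest with the private key."""
    return pow(digest_as_int(message), D, N)

def check_integrity(message: bytes, signature: int) -> bool:
    """Recipient: recompute the digest and compare it to the signed one."""
    new_digest = digest_as_int(message)     # step 1: compute a digest
    original_digest = pow(signature, E, N)  # step 2: apply the public key
    return new_digest == original_digest    # step 3: compare the digests

mail = b"Meet at noon."
signature = sign(mail)
print(check_integrity(mail, signature))               # unmodified message
print(check_integrity(b"Meet at nine.", signature))   # altered message
```

The first check succeeds and the second fails, because the altered message produces a digest that no longer matches the one recovered from the signature.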

Because the message digest is digitally signed with the sender's private key, it is not feasible for an intruder to intercept the message, modify it, and create a new, validly signed message digest to send to the recipient. Another method of ensuring the integrity of data is to use message digests with a Hash-based Message Authentication Code (HMAC) function, as described later in this chapter.
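
As a brief preview of the HMAC approach, the sketch below uses Python's standard `hmac` module. It assumes the sender and recipient already share a secret key and that SHA-256 is the underlying digest function; the key, the messages, and the helper names `make_tag` and `verify_tag` are hypothetical.

```python
import hashlib
import hmac

def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC tag over message using the shared secret key."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)

shared_key = b"hypothetical shared secret"
message = b"Transfer 10 units to account 42"
tag = make_tag(shared_key, message)

print(verify_tag(shared_key, message, tag))                             # intact
print(verify_tag(shared_key, b"Transfer 99 units to account 42", tag))  # altered
```

Unlike a digital signature, an HMAC tag can be produced by anyone who holds the shared key, so it provides integrity and authentication between the key holders but not nonrepudiation.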

Two widely used message digest algorithms are MD5, a 128-bit digest developed by RSA Data Security, Inc., and SHA-1, a 160-bit message digest developed by the National Security Agency. The SHA-1 algorithm is generally considered to provide stronger cryptographic security than MD5, because it uses a longer message digest and resists some attacks that can be conducted against MD5. Practical collision attacks have since been demonstrated against both algorithms, however, so newer functions such as the SHA-2 family are preferred where collision resistance matters.
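
The digest lengths quoted above can be checked directly, assuming a Python build whose `hashlib` module provides both algorithms (some restricted builds disable MD5). The helper name `digest_bits` is hypothetical.

```python
import hashlib

def digest_bits(algorithm: str) -> int:
    """Return the digest length, in bits, of a named hashlib algorithm."""
    return hashlib.new(algorithm).digest_size * 8

for name in ("md5", "sha1"):
    print(name, digest_bits(name), "bits")
```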