A complete idiot’s introduction to HMAC

Suppose Alice sends a message to Bob that says “pay Chloe $10”. Alice and Bob could be banks communicating, and Chloe is a customer. The message is sent over the Internet.

One issue with a message being sent like this is that a transmission error could change the message and there’s no way to detect that error. An error could change the message to “pay Chloe $00” and Chloe gets nothing as a result.

One solution to detect transmission errors is to calculate a hash of the message and append it at the end of the message:

Message sent = Message + Hash(Message)

The value that is appended here serves as a message integrity check. It can be generated using a simple algorithm. Let’s use CRC16 for our example (strictly speaking it is a checksum rather than a cryptographic hash, but it is good enough for catching accidental errors). By appending the CRC16 value, the updated message reads “pay Chloe $10” + 0xE7D2. On the recipient side, Bob calculates the CRC16 of the received message and compares it with the CRC16 value received along with the message. For instance, if an error altered the message to say “pay Chloe $00”, the new CRC16 value would be 0x77D3, which would not match the CRC16 value Bob received.
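Here is a rough sketch of that check in Python. It assumes the CRC-CCITT variant offered by the standard library's binascii.crc_hqx; CRC16 comes in several variants, so the values it computes will not necessarily match the ones quoted above. The helper names (crc16, append_checksum, verify_checksum) are made up for this illustration.

```python
import binascii

def crc16(data: bytes) -> int:
    # One of several CRC16 variants (CRC-CCITT polynomial, initial value 0).
    return binascii.crc_hqx(data, 0)

def append_checksum(message: bytes) -> bytes:
    # Message sent = Message + CRC16(Message), checksum stored as 2 bytes.
    return message + crc16(message).to_bytes(2, "big")

def verify_checksum(packet: bytes) -> bool:
    # Bob recomputes the checksum and compares it with the received one.
    message, received = packet[:-2], packet[-2:]
    return crc16(message).to_bytes(2, "big") == received

packet = append_checksum(b"pay Chloe $10")
print(verify_checksum(packet))

# Simulate a transmission error that changes the amount but not the checksum.
corrupted = b"pay Chloe $00" + packet[-2:]
print(verify_checksum(corrupted))
```

The first check passes and the second fails, because the checksum no longer matches the altered message.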

Even though the message is now protected from transmission errors, what if Chloe happens to be a hacker and decides to intercept and modify the message? Now the message could read “pay Chloe $99”. All she needs to do is recalculate the hash value and append this new value to the altered message, earning herself a handsome profit.
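To see why a plain checksum gives no protection against a deliberate change, here is a small sketch of Chloe's forgery. As before, it assumes the CRC variant provided by binascii.crc_hqx, and the helper names are hypothetical.

```python
import binascii

def crc16(data: bytes) -> int:
    # Assumed CRC16 variant; anyone can compute it, no secret is involved.
    return binascii.crc_hqx(data, 0)

def verify_checksum(packet: bytes) -> bool:
    # Bob's check: recompute the checksum and compare.
    message, received = packet[:-2], packet[-2:]
    return crc16(message).to_bytes(2, "big") == received

# Chloe replaces the message and simply recalculates the checksum herself.
forged_message = b"pay Chloe $99"
forged_packet = forged_message + crc16(forged_message).to_bytes(2, "big")

print(verify_checksum(forged_packet))  # True: Bob cannot tell the difference
```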

To address this, Alice and Bob can use a secret key known only to them:

Message sent = Message + Algorithm(Key + Message)

The cryptographic algorithm employed here calculates a fixed-length value commonly referred to as a message authentication code or MAC. Since only someone who holds the key can produce a matching MAC, it lets Bob verify both the authenticity of the sender and the integrity of the message.
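As a minimal sketch of the idea, the snippet below plugs SHA-256 in as the "Algorithm" and hashes the key followed by the message. This is only the naive construction from the formula above, shown to illustrate how a shared key changes things; it is not a recommended way to build a MAC. The names naive_mac and bob_accepts are invented for the example.

```python
import hashlib

KEY = b"secretkey"  # illustrative shared secret known only to Alice and Bob

def naive_mac(key: bytes, message: bytes) -> bytes:
    # Message sent = Message + Algorithm(Key + Message), with SHA-256 as Algorithm.
    return hashlib.sha256(key + message).digest()

def bob_accepts(key: bytes, message: bytes, mac: bytes) -> bool:
    # Bob recomputes the MAC with his copy of the key and compares.
    return naive_mac(key, message) == mac

message = b"pay Chloe $10"
mac = naive_mac(KEY, message)
print(bob_accepts(KEY, message, mac))  # True

# Chloe does not know the key, so recalculating a plain hash no longer works.
forged = b"pay Chloe $99"
chloe_guess = hashlib.sha256(forged).digest()
print(bob_accepts(KEY, forged, chloe_guess))  # False
```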

Note that I used the word “algorithm” instead of “hash”. This is because simply hashing Key + Message is not safe with every hash function. Certain hash functions output their entire internal state as the hash value, and this can be exploited by a hacker. Starting from a valid hash, Chloe could append additional content to the original message and calculate the resulting hash without knowing the secret key or even the contents of the original message; she only needs to know (or guess) the combined length of the key and the message. This is known as a length extension attack. Popular hash functions such as MD5, SHA-1, and SHA-256 are susceptible.
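The following is a sketch of a length extension attack against SHA256(Key + Message), written for illustration only. Python's hashlib does not let you resume from a given digest, so the sketch carries its own small SHA-256 compression function; all function names are made up for the example. It assumes Chloe knows (or has guessed) that the key is 9 bytes long. The forged message contains the original padding bytes in the middle, which is typical of this attack.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF

def _icbrt(n):
    # Integer cube root (rounded down), found by binary search.
    lo, hi = 0, 1 << (n.bit_length() // 3 + 2)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

def _first_primes(count):
    primes, n = [], 2
    while len(primes) < count:
        if all(n % p for p in primes):
            primes.append(n)
        n += 1
    return primes

# SHA-256 round constants: first 32 fractional bits of the cube roots
# of the first 64 primes.
K = [_icbrt(p << 96) & MASK for p in _first_primes(64)]

def _rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK

def compress(state, block):
    # One SHA-256 compression step over a 64-byte block.
    w = list(struct.unpack(">16I", block))
    for i in range(16, 64):
        s0 = _rotr(w[i - 15], 7) ^ _rotr(w[i - 15], 18) ^ (w[i - 15] >> 3)
        s1 = _rotr(w[i - 2], 17) ^ _rotr(w[i - 2], 19) ^ (w[i - 2] >> 10)
        w.append((w[i - 16] + s0 + w[i - 7] + s1) & MASK)
    a, b, c, d, e, f, g, h = state
    for i in range(64):
        big_s1 = _rotr(e, 6) ^ _rotr(e, 11) ^ _rotr(e, 25)
        ch = (e & f) ^ (~e & g)
        t1 = (h + big_s1 + ch + K[i] + w[i]) & MASK
        big_s0 = _rotr(a, 2) ^ _rotr(a, 13) ^ _rotr(a, 22)
        maj = (a & b) ^ (a & c) ^ (b & c)
        t2 = (big_s0 + maj) & MASK
        a, b, c, d, e, f, g, h = (t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g
    return [(x + y) & MASK for x, y in zip(state, (a, b, c, d, e, f, g, h))]

def padding(length):
    # Standard SHA-256 padding for an input of `length` bytes.
    return b"\x80" + b"\x00" * ((55 - length) % 64) + struct.pack(">Q", length * 8)

def extend(digest, original_len, extension):
    # Resume hashing from the state encoded in `digest`; no key required.
    state = list(struct.unpack(">8I", digest))
    glue = padding(original_len)
    tail = extension + padding(original_len + len(glue) + len(extension))
    for i in range(0, len(tail), 64):
        state = compress(state, tail[i:i + 64])
    return glue, struct.pack(">8I", *state)

# Alice's side: the naive MAC over Key + Message.
KEY = b"secretkey"
message = b"pay Chloe $10"
mac = hashlib.sha256(KEY + message).digest()

# Chloe's side: she uses only the MAC, the message and the assumed key length.
extension = b" and pay Chloe $99"
glue, forged_mac = extend(mac, 9 + len(message), extension)
forged_message = message + glue + extension

# Bob's check succeeds even though Chloe never saw the key.
print(hashlib.sha256(KEY + forged_message).digest() == forged_mac)
```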

One way to address this is by using a hash-based message authentication code (HMAC) that is appended to the original message. The HMAC algorithm works in two passes utilising the shared secret key to generate two new keys (inner and outer). It then creates the initial hash by combining the inner key with the original message. Next, it computes the hash of the inner hash combined with the outer key:

Pass 1: Inner_Hash = SHA256((Key XOR ipad) + Message)
Pass 2: HMAC = SHA256((Key XOR opad) + Inner_Hash)

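Here ipad and opad are padding values consisting of the bytes 0x36 and 0x5c, respectively, repeated to fill one hash block (64 bytes for SHA-256); the key is first padded with zeros to the same length. Because the outer hash hides the internal state of the inner hash, this double hash is no longer susceptible to the length extension attack, so Chloe, the attacker, can no longer modify the message without Bob noticing. Note that in real-life scenarios, the message is usually encrypted as well, so that an attacker can neither read the message nor alter it undetected. Read up more on HMAC here.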
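The two passes can be written out by hand in a few lines of Python. This is a sketch for understanding rather than something to use in production (the standard hmac module already does this); the function name manual_hmac_sha256 is invented here. It also shows a detail the formulas gloss over: the key is zero-padded to the 64-byte block size of SHA-256, and hashed first if it is longer than that.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # SHA-256 block size in bytes

def manual_hmac_sha256(key: bytes, message: bytes) -> bytes:
    # Keys longer than one block are hashed first, then zero-padded to a block.
    if len(key) > BLOCK_SIZE:
        key = hashlib.sha256(key).digest()
    key = key.ljust(BLOCK_SIZE, b"\x00")

    inner_key = bytes(b ^ 0x36 for b in key)  # Key XOR ipad
    outer_key = bytes(b ^ 0x5C for b in key)  # Key XOR opad

    # Pass 1: hash the inner key followed by the message.
    inner_hash = hashlib.sha256(inner_key + message).digest()
    # Pass 2: hash the outer key followed by the inner hash.
    return hashlib.sha256(outer_key + inner_hash).digest()

key = b"secretkey"
message = b"pay Chloe $10"
ours = manual_hmac_sha256(key, message)
reference = hmac.new(key, message, hashlib.sha256).digest()
print(ours == reference)  # True
```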

Here’s an example of the HMAC-SHA256 algorithm implemented in PHP:

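<?php
// Compute the HMAC-SHA256 of the message using the shared secret key.
$hash = hash_hmac('sha256', 'pay Chloe $10', 'secretkey');
var_dump($hash); // dumps the MAC, a 64-character hex string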

Here’s the same code in Python:

import hashlib
import hmac

key = b'secretkey'                   # shared secret key
message = 'pay Chloe $10'.encode()   # hmac works on bytes, not str
h = hmac.new(key, message, hashlib.sha256)
print(h.hexdigest())                 # MAC as a 64-character hex string

Both produce the same HMAC value.
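For completeness, here is a sketch of what Bob's side might look like in Python. The function names sign and verify are made up for the example. It uses hmac.compare_digest from the standard library, which compares the two values in a way that is designed not to leak timing information.

```python
import hashlib
import hmac

KEY = b"secretkey"  # illustrative shared secret known only to Alice and Bob

def sign(message: bytes) -> bytes:
    # Alice's side: compute the HMAC-SHA256 tag to append to the message.
    return hmac.new(KEY, message, hashlib.sha256).digest()

def verify(message: bytes, tag: bytes) -> bool:
    # Bob's side: recompute the tag and compare it with the received one.
    return hmac.compare_digest(sign(message), tag)

message = b"pay Chloe $10"
tag = sign(message)

print(verify(message, tag))           # True: message is untouched
print(verify(b"pay Chloe $99", tag))  # False: message was altered
```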

