Some notes on SHA-3

What is SHA-3?¶

SHA-3 is a family of cryptographic hash functions (also see SHA-3 on Wikipedia). NIST selected Keccak as implementation for SHA-3 in October 2012, and released a draft specifying the details in April 2014. I think the algorithm details are final.

How does it work?¶

Keccak consists of two components:

Permutation functions Keccak-f for specific state sizes (1600, 800, 400, 200, 100, 50 and 25 bits); for SHA-3 only Keccak-f[1600] is actually used.
A “sponge” construction: a sponge requires a transformation function (like Keccak-f[1600]) and a capacity parameter.

The sponge maintains a state (as large as the transformation function handles and initialized to zero); the state is divided into a “rate” and a “capacity” part; the rate size is state size - capacity.

Each input and output block is of rate size; when a block comes in it is bitwise XORed into the “rate” part of the state, and then the transformation is run (this is called “absorbing” a block). To output a block the “rate” part is simply copied into the output block (“squeezing”), followed by a transformation run on the state.

As the input data is usually not aligned to the block size, it is always padded by appending a 1 bit, then 0 bits until the next block is full but for one bit, and a final 1 bit again, leading to a minimum padding of two bits ("11") and a maximum of (rate + 1) bits; this padding is named pad10*1.

To generate output of a certain length one just squeezes blocks until the length is reached and truncates if necessary.

This results in the basic Keccak primitive for a capacity of c bits and a certain output length:

Keccak[c](msg, output length) := Sponge[Keccak-f[1600], c](pad10*1(msg), output length)

The output length could be “infinite”, resulting in an endless bit stream (for use in a stream cipher or as PRNG).

Security¶

An algorithm is said to provide n bits of security (for a specific property) if an attack requires 2^n steps (i.e. requires bruteforcing n bits).

To provide n bits of security (preimage, second preimage and collision) with a sponge construction (assuming the transformation function is secure) the capacity has to be 2*n bits, and the output has to be of length 2*n bits. If the output is only n bits, preimage and second preimage are still n bit secure, but collision security drops to n/2 bits (“birthday paradox”).

# || denotes concatenation of bit strings
SHA3-224(msg) := Keccak[448](msg || 01, 224)
SHA3-256(msg) := Keccak[512](msg || 01, 256)
SHA3-384(msg) := Keccak[768](msg || 01, 384)
SHA3-512(msg) := Keccak[1024](msg || 01, 512)
SHAKE128(msg, output length) := Keccak[256](msg || 1111, output length)
SHAKE256(msg, output length) := Keccak[512](msg || 1111, output length)

All functions use a capacity for a security of n bits, where n is the number in the function name. However the output length of the SHA3-n functions is only n bits, restricting collision resistance to n/2 bits - there is no way around this, as the SHA3-n functions are intended to be used as replacements for the SHA-n functions from the SHA-2 family.

The different suffixes appended to the message before padding are used to distinguish different uses of the same Keccak[c] function (domain separation): the suffix "01" represents the domain of SHA3-n functions which take a simple message. The suffix "11" represents the SHAKE functions which take a Sakura padded message; in Sakura the suffix "11" represents a single message (Sakura also supports tree hashing).

Possible NIST manipulations / History¶

The original Keccak submission used the same capacity for the SHA3-n functions as are now in the draft. For the variable output length function it used a capacity of 576 bits, leading to a rate of 1024 bits. Also the suffixes were not part of the proposal.

In 2013 NIST proposed to use capacity 256 for SHA3-224, SHA3-256 and SHAKE256, and capacity 512 for SHA3-384, SHA-512 and SHAKE512.

While the parameter selection itself would have been more consistent (the proposed SHA3-512(msg) := Keccak[512](msg, 512) would have been 256-bit secure against everything), this would have reduced preimage and second preimage resistance below what the corresponding functions in the SHA-2 family provided.

Due to the negative feedback in the crypto community NIST announced in November 2013 to pick the parameters from the original submission, and renamed the SHAKE functions to reflect the bits of security instead of capacity.

So there are two changes remaining since the final submission:¶

The suffixes for domain separation (this was proposed by the Keccak team after the submission, not by NIST itself).
SHAKE256 (512 bit capacity) is a little bit less secure than the original proposed Keccak[] with a capacity of 576; on the other hand most people agree that aiming for more than 256 bit security is pointless.

The Keccak authors wrote:

In the Keccak design philosophy, safety margin comes from the number of rounds in Keccak-f, whereas the security level comes from the selected capacity.

As far as I can see SHA-3 is as trustworthy as the original Keccak submission.

Which SHA-3 hash function to select¶

If you are not bound to be compatible with SHA-2 functions I’d recommend to use SHAKE256 with 512 bits of output to get 256-bit security instead of SHA3-512 - this is nearly twice as fast (rate of 1088 bit vs 576), and more than 256-bit security is just wasting resources.

If you are only interested in 128-bit collision resistance (because you don’t want to “waste” more space storing the output) SHA3-256 is a good choice - SHAKE128 with 256 bits of output is not much faster (rate of 1344 bits vs 1088), but having 256 bit security for preimage could be worth the cost.