Some notes on SHA-3
What is SHA-3?
SHA-3 is a family of cryptographic hash functions (also see SHA-3 on Wikipedia). NIST selected Keccak as implementation for SHA-3 in October 2012, and released a draft specifying the details in April 2014. I think the algorithm details are final.
How does it work?
Keccak consists of two components:
- Permutation functions
Keccak-f
for specific state sizes (1600, 800, 400, 200, 100, 50 and 25 bits); for SHA-3 onlyKeccak-f[1600]
is actually used. -
A “sponge” construction: a sponge requires a transformation function (like
Keccak-f[1600]
) and a capacity parameter.The sponge maintains a state (as large as the transformation function handles and initialized to zero); the state is divided into a “rate” and a “capacity” part; the rate size is
state size - capacity
.Each input and output block is of rate size; when a block comes in it is bitwise XORed into the “rate” part of the state, and then the transformation is run (this is called “absorbing” a block). To output a block the “rate” part is simply copied into the output block (“squeezing”), followed by a transformation run on the state.
As the input data is usually not aligned to the block size, it is always padded by appending a
1
bit, then0
bits until the next block is full but for one bit, and a final1
bit again, leading to a minimum padding of two bits ("11"
) and a maximum of(rate + 1)
bits; this padding is namedpad10*1
.To generate output of a certain length one just squeezes blocks until the length is reached and truncates if necessary.
This results in the basic Keccak
primitive for a capacity of c
bits and a
certain output length:
Keccak[c](msg, output length) := Sponge[Keccak-f[1600], c](pad10*1(msg), output length)
The output length could be “infinite”, resulting in an endless bit stream (for use in a stream cipher or as PRNG).
Security
An algorithm is said to provide n
bits of security (for a specific property)
if an attack requires 2^n
steps (i.e. requires bruteforcing n
bits).
To provide n
bits of security (preimage, second preimage and
collision) with a sponge construction (assuming the transformation function
is secure) the capacity has to be 2*n
bits, and the output has to be of
length 2*n
bits. If the output is only n
bits, preimage and second
preimage are still n
bit secure, but collision security drops to n/2
bits
(“birthday paradox”).
# || denotes concatenation of bit strings
SHA3-224(msg) := Keccak[448](msg || 01, 224)
SHA3-256(msg) := Keccak[512](msg || 01, 256)
SHA3-384(msg) := Keccak[768](msg || 01, 384)
SHA3-512(msg) := Keccak[1024](msg || 01, 512)
SHAKE128(msg, output length) := Keccak[256](msg || 1111, output length)
SHAKE256(msg, output length) := Keccak[512](msg || 1111, output length)
All functions use a capacity for a security of n
bits, where n
is the
number in the function name. However the output length of the SHA3-n
functions is only n
bits, restricting collision resistance to n/2
bits -
there is no way around this, as the SHA3-n
functions are intended to be used
as replacements for the SHA-n
functions from the SHA-2 family.
The different suffixes appended to the message before padding are used to
distinguish different uses of the same Keccak[c]
function (domain
separation): the suffix "01"
represents the domain of SHA3-n
functions
which take a simple message. The suffix "11"
represents the SHAKE functions
which take a Sakura padded message; in Sakura the suffix "11"
represents a
single message (Sakura also supports tree hashing).
Possible NIST manipulations / History
The original Keccak submission used the same capacity for the SHA3-n
functions as are now in the draft. For the variable output length function it
used a capacity of 576 bits, leading to a rate of 1024 bits. Also the suffixes
were not part of the proposal.
In 2013 NIST proposed to use capacity 256 for SHA3-224
, SHA3-256
and
SHAKE256
, and capacity 512 for SHA3-384
, SHA-512
and SHAKE512
.
While the parameter selection itself would have been more consistent (the
proposed SHA3-512(msg) := Keccak[512](msg, 512)
would have been 256-bit
secure against everything), this would have reduced preimage and second
preimage resistance below what the corresponding functions in the SHA-2 family
provided.
Due to the negative feedback in the crypto community NIST announced in
November 2013 to pick the parameters from the original submission, and renamed
the SHAKE
functions to reflect the bits of security instead of capacity.
So there are two changes remaining since the final submission:
- The suffixes for domain separation (this was proposed by the Keccak team after the submission, not by NIST itself).
-
SHAKE256
(512 bit capacity) is a little bit less secure than the original proposedKeccak[]
with a capacity of 576; on the other hand most people agree that aiming for more than 256 bit security is pointless.The Keccak authors wrote:
In the Keccak design philosophy, safety margin comes from the number of rounds in
Keccak-f
, whereas the security level comes from the selected capacity.
As far as I can see SHA-3 is as trustworthy as the original Keccak submission.
Which SHA-3 hash function to select
If you are not bound to be compatible with SHA-2 functions I’d recommend to
use SHAKE256
with 512 bits of output to get 256-bit security instead of
SHA3-512
- this is nearly twice as fast (rate of 1088 bit vs 576), and more
than 256-bit security is just wasting resources.
If you are only interested in 128-bit collision resistance (because you don’t
want to “waste” more space storing the output) SHA3-256
is a good choice -
SHAKE128
with 256 bits of output is not much faster (rate of 1344 bits vs
1088), but having 256 bit security for preimage could be worth the cost.