Properties

Following are the desired properties of a strong hash function:

  1. Pre-image resistance: given , it is difficult to find the pre-image of the hash .
  2. Second pre-image resistance: given message , it is difficult to find another message such that .
  3. Collision resistance: It is difficult to find two distinct messages and such that .

Collision resistance and Second pre-image resistance might seem similar but on closer look, Collision resistance hash-functions are more harder to construct as adversary has more control over choice of message .

collision resistance implies second pre-image resistance. Prove this. 1

Hash function is defined by two deterministic algorithms:

  • : outputs key . Note that is not secret here, and adversary has access to it.
  • : , outputs a binary string of length .

When input length , then is called fixed-length hash function, and a compression function as well because collision resistance is achieved trivially by setting if .

Define experiment :

  • is generated by
  • outputs
  • output 1 if and

Merkle-Damgård Constructions

Used in MD5, SHA1, SHA2.

Construct a arbitrary length hash function from fixed length compression function, i.e. use , where to create H such that .

flowchart LR
x1-->hs
IV-->hs
hs--z1-->hs2
x2-->hs2
hs2--"……"-->hs3["hs_{b-1}"]
hs3--z_{b-1}-->hs4
xb["x_{b-1}"]-->hs4
hs4-->Hs("Hs(x)")

Prove by reduction that this is collision resistant if is collision resistant.

Sponge constructions

Used in SHA3 (Keccak).

HAIFA construction

Used in Blake2.

Birthday attack

According to literature, most symmetric cryptographic primitives like encryption, block ciphers, MAC need an bit secret keys to achieve security against attackers running in time. But using birthday attacks, a hash function with output size has pre-image resistance, and second pre-image resistance and collision resistance.

Let be a hash function. Most trivial attack to find collision is to take different strings and compute hash of all of those, can we find two that are equal?

Birthday attacks

if you have q inputs , then what is the probability that any two of are equal? Assume to be a random function (because this yields the worst-case probability and all other real-world functions are worse than this).

explain birthday attack probability.

It is explained in above proof that any attacker can find a collision using inputs with more than 1/2 probability, Putting commonly occurring 512 and 256 bits output size in the above statement: 2

  • 512 bits output gives bits of pre-image resistance and bits of collision resistance
  • 256 bits output gives bits of pre, and 128 bits of col.

Small-space birthday attack: label hash outputs in a graph, and then, it’s the algorithm of finding node that forms cycle in a graph, can be done in time and no extra memory.

flowchart LR
x1-->x2-->x3-->x4-->x5
x5-->x2

Let there be a sequence of values:

  • loop: , and calculate . This means
    • if , then there is a collision between
  • again run a loop with
    • check if , if yes, output
    • else

Prove that first loop iteration halts, i.e. there exists i such that .

since , then cycle length: , then let be a multiple of , so, such that . So, if .

We can use small-space birthday attacks to find meaningful collisions in a set of two different strings, by extending the bit string to , where first bit defines string belongs to which set. Then, we can find a collision in the set, so such that , and belong to different set with probability 1/2.

Inverting hash function with time/space tradeoff

let be a hash function mapping to bit outputs, such that length of range = .

Basic idea is to generate a sequence of and then, distribute the sequence into a table of independent elements. Store .

Now, to find pre-image of , just calculate . If one of these equal end point of a tuple, then calculate and x where should lie in that sequence.

This is done more effectively by choosing two parameters and choosing the starting points uniformly. But there are two problems:

  • possibility of sequence lying in none of the row of element table.
  • false positive: where one of lies in one of the row but doesn’t. This can be estimated by probability that a point in a sequence equal some point in the table: .
    • False positives can be mediated by running the check of finding in a sequence that matches each i,j in .
  • To estimate lower bound whether lies in the table:
    • estimate probability of finding mutually exclusive elements in the table. So, a row’s starting should be new and subsequent hash of row’s starting element should be new.
    • This comes out to be
    • Thus, the probability that element lies in the table = when .
    • So, almost 4t independent tables need to be generated in order have higher probability of generating a table that contains x.

RO model

random-oracle-model

ZK Friendly hash functions

It’s natural to put cryptographic hash functions on protocols that boasts ZK properties like SNARKs or STARKs.

  • Poseidon
  • Rescue Prime
  • Vision

Questions

6.1:

  • TODO: prove that if H is second-preimage, then is preimage resistant.
  • can be given by again creating a reduction proof and proving that probability calculated of breaking pre-image resistant means breaking second-preimage, but since prob of latter is negligible, former’s prob is also negligible.

6.4: model two PPT adversaries where attacks H, and attacks h. is running as a subroutine and assumes that can break , then after querying for , receives . Using this, and a similar logic used in the Theorem 6.4, can prove that would succeed in breaking collision resistance of , but since this isn’t possible, is collision resistant.

6.5: two questions: what should be key length? is any padding needed? what if we keep key length as n, and k bit as input. then, the proof can be formulated similar to 6.4.

6.7: let h be such that , then take two inputs and .

6.8: design a non-collision resistant hash function using a collision resistant hash function such that and , where . TODO, didn’t understand the solution from manual.

6.9,6.10:

  • design a reduction proof that allows to break ‘s preimage, and second-preimage resistance using ‘s attack on .
  • Correction, assumption was wrong. both these statements are actually false.

Resources

Footnotes

  1. Second pre-image resistance vs Collision resistance

  2. 512 bits vs 256 bits