## Properties

Following are the desired properties of a strong hash function: $f$

- Pre-image resistance: given $f(m)=h$, it is difficult to find the pre-image $m$ of the hash $h$.
- Second pre-image resistance: given message $m_{1}$, it is difficult to find another message $m_{2}=m_{1}$ such that $f(m_{1})=f(m_{2})$.
- Collision resistance: It is difficult to find two distinct messages $m_{1}$ and $m_{2}$ such that $f(m_{1})=f(m_{2})$.

Collision resistance and Second pre-image resistance might seem similar but on closer look, Collision resistance hash-functions are more harder to construct as adversary has more control over choice of message $m_{1},m_{2}$.

collision resistance implies second pre-image resistance. Prove this. ^{1}

Hash function $H$ is defined by two deterministic algorithms:

- $Gen(1_{n})→s$: outputs key $s$. Note that $s$ is not secret here, and adversary has access to it.
- $H(s,x)→{0,1}_{l(n)}$: $x∈{0,1}_{∗}$, outputs a binary string of length $l(n)$.

When input length $l_{′}(n)>l(n)$, then $H$ is called fixed-length hash function, and a *compression function* as well because collision resistance is achieved trivially by setting $H(x)=x$ if $l_{′}<l$.

Define experiment $Hash-Coll_{A,H}(n)$:

- $s$ is generated by $Gen(1_{n})$
- $A$ outputs $x,x_{′}∈{0,1}_{l_{′}(n)}$
- output 1 if $x=x_{′}$ and $H(s)=H(s_{′})$

$Pr[Hash-Coll_{A,H}(n)=1]≤negl(n)$

## Merkle-Damgård Constructions

Used in MD5, SHA1, SHA2.

Construct a arbitrary length hash function from fixed length compression function, i.e. use $Gen,h$, where $h:{0,1}_{n_{′}}×{0,1}_{n}→{0,1}_{n}$ to create H such that ${0,1}_{∗}→{0,1}_{n}$.

```
flowchart LR
x1-->hs
IV-->hs
hs--z1-->hs2
x2-->hs2
hs2--"……"-->hs3["hs_{b-1}"]
hs3--z_{b-1}-->hs4
xb["x_{b-1}"]-->hs4
hs4-->Hs("Hs(x)")
```

Prove by reduction that this is collision resistant if $h$ is collision resistant.

## Sponge constructions

Used in SHA3 (Keccak).

## HAIFA construction

Used in Blake2.

## Birthday attack

According to literature, most symmetric cryptographic primitives like encryption, block ciphers, MAC need an $n−$bit secret keys to achieve security against attackers running in $2_{n}$ time. But using birthday attacks, a hash function with output size $n$ has $O(2_{n})$ pre-image resistance, and second pre-image resistance and $O(2_{n} )$ collision resistance.

Let $H:{0,1}_{∗}→{0,1}_{l}$ be a hash function. Most trivial attack to find collision is to take $2_{l}+1$ different strings and compute hash of all of those, can we find two $y_{i}$ that are equal?

Birthday attacksif you have q inputs $x_{1},…,x_{q}$, then what is the probability that any two of $H(x_{1}),…,H(x_{q})$ are equal? Assume $H$ to be a random function (because this yields the worst-case probability and all other real-world functions are worse than this).

explainbirthday attack probability.

It is explained in above proof that any attacker can find a collision using $Θ(2_{n/2})$ inputs with more than 1/2 probability, Putting commonly occurring 512 and 256 bits output size in the above statement: ^{2}

- 512 bits output gives $512$ bits of pre-image resistance and $256$ bits of collision resistance
- 256 bits output gives $256$ bits of pre, and 128 bits of col.

Small-space birthday attack: label hash outputs in a graph, and then, it’s the algorithm of finding node that forms cycle in a graph, can be done in $Θ(2_{l/2})$ time and no extra memory.

```
flowchart LR
x1-->x2-->x3-->x4-->x5
x5-->x2
```

Let there be a sequence of values: $x_{0},…,x_{q}$

- loop: $x=x_{0},x_{′}=x_{0}$, and calculate $x=H(x),x_{′}=H(H(x_{′}))$. This means $x=H_{(i)}(x_{0}),x_{′}=H_{(2i)}(x_{0})$
- if $x=x_{′}$, then there is a collision between $0,2i−1$

- again run a loop with $x=x_{0},x_{′}=x_{i}$
- check if $H(x)=H(x_{′})$, if yes, output $x,x_{′}$
- else $x=H(x),x_{′}=H(x_{′})$

Prove that first loop iteration halts, i.e. there exists i such that $x_{i}=x_{2i}$.

since $x_{I}=x_{J}$, then cycle length: $l=J−I$, then let $i$ be a multiple of $l$, so, $x_{i}=x_{i+k⋅l}$ such that $i>I,i<J$. So, $x_{i}=x_{2i}$ if $2i=i+k.l⟹k=li $.

We can use **small-space birthday attacks** to find meaningful collisions in a set of two different strings, by extending the $l−$bit string to $l+1$, where first bit defines string belongs to which set. Then, we can find a collision in the set, so $x=x_{′}$ such that $H(x)=H(x_{′})$, and $x,x_{′}$ belong to different set with probability 1/2.

### Inverting hash function with time/space tradeoff

let $H:{0,1}_{∗}→{0,1}_{l}$ be a hash function mapping to $l−$bit outputs, such that length of range = $2_{l}=N$.

Basic idea is to generate a sequence of $x_{0},H(x_{0}),…,H_{(i)}(x_{0})$ and then, distribute the sequence into a table of $N ×N $ independent elements. Store $SP_{i},SP_{i}=EP_{i}$.

Now, to find pre-image of $y$, just calculate $y,H(y),…,H_{(t)}(y)$. If one of these equal end point of a tuple, then calculate $SP_{i}$ and x where $y=H(x)$ should lie in that sequence.

This is done more effectively by choosing two parameters $s,t$ and choosing the starting points uniformly. But there are two problems:

- possibility of sequence $y,H(y),…,H_{(i)}(y)$ lying in none of the row of $s.t$ element table.
- false positive: where one of $H_{(i)}(y)$ lies in one of the row but $H_{(i−1)}(y)$ doesn’t. This can be estimated by probability that a point in a sequence equal some point in the table: $O(Nst_{2} )$.
- False positives can be mediated by running the check of finding $y$ in a sequence that matches each i,j in $H_{(j)}(y)=EP_{i}$.

- To estimate lower bound whether $x=H_{−1}(y)$ lies in the table:
- estimate probability of finding mutually exclusive elements in the table. So, a row’s starting should be new and subsequent hash of row’s starting element should be new.
- This comes out to be $∑_{i=1}∑_{j=0}Pr[H_{(j)}(SP_{i})is new]≥2st $
- Thus, the probability that element $x$ lies in the table = $2Nst =4t1 $ when $st_{2}=2N $.
- So, almost 4t independent tables need to be generated in order have higher probability of generating a table that contains x.

## RO model

## ZK Friendly hash functions

It’s natural to put cryptographic hash functions on protocols that boasts ZK properties like SNARKs or STARKs.

- Poseidon
- Rescue Prime
- Vision

## Questions

6.1:

- TODO: prove that if H is second-preimage, then $H$ is preimage resistant.
- can be given by again creating a reduction proof and proving that probability calculated of $A$ breaking pre-image resistant means $A_{′}$ breaking second-preimage, but since prob of latter is negligible, former’s prob is also negligible.

6.4: model two PPT adversaries $A,A_{′}$ where $A$ attacks H, and $A_{′}$ attacks h. $A_{′}$ is running $A$ as a subroutine and assumes that $A_{′}$ can break $H$, then after querying for $y$, receives $x,x_{′}$. Using this, and a similar logic used in the Theorem 6.4, can prove that $A$ would succeed in breaking collision resistance of $h$, but since this isn’t possible, $H$ is collision resistant.

6.5: two questions: what should be key length? is any padding needed? what if we keep key length as n, and k bit as input. then, the proof can be formulated similar to 6.4.

6.7: let h be such that $h(0_{n},0_{n_{′}})=0_{n}$, then take two inputs $x=1$ and $x_{′}=0_{n}∥1$.

6.8: design a non-collision resistant hash function using a collision resistant hash function such that $h(0∥x)=h^(x)∥0$ and $h(1∥x)=1_{n}$, where $x∈{0,1}_{2n−1}$. TODO, didn’t understand the solution from manual.

6.9,6.10:

- design a reduction proof that allows to break $h$‘s preimage, and second-preimage resistance using $A$‘s attack on $H$.
- Correction, assumption was wrong. both these statements are actually false.

## Resources

- Intro to Modern Cryptography: Chapter 6
- A graduate course in applied cryptography by Dan Boneh, Victor Shoup: Chapter 8
- ZK friendly hash functions
- Ingonyama’s ZK friendly hash functions