What is the correct mathematical satement for a cryptographic hash function as in digial currencies?

by stressed out   Last Updated July 12, 2019 09:20 AM - source

Most tutorials about digital currencies (like BitCoin) say that a hash function is a cryptographic function that takes an input and always return a binary string of length $$256$$ such that a small change in the input would result in a totally different hash.

Obviously, in the simplest interpretation, the hash function must be injective. However, since the length of input messages is unlimited while there are only $$2^{256}$$ binary strings, it is obvious that an injective function cannot exist.

So, what is the correct interpretation? I have tried to formulate a possible interpretation, and I did google it to see if there's an actual mathematical definition of the hash function as in BitCoin tutorials, but I didn't have any luck.

Anyway, here's my interpretation:

Let $$\mathcal{M}$$ denote the set of messages. We can think of this set as binary strings of unlimited length, i.e. $$\mathcal{M} = \Large\{ \large(a_1,a_2,a_3,\cdots,a_n,\cdots) \large\mid a_i=0,1 (\forall i\in \mathbb{N})\Large\}$$ We can define a distance on this set as follows:

$$d(m_1,m_2) = \sum_{i=1}^{\infty} \frac{a_i}{2^i}$$

Then the statement becomes something like this:

$$\exists \delta >0, \forall m_1\in \mathcal{M}: \,\,\mathbb{P}(\{{f(m_2)\, |\, d(m_1,m_2) > \delta \implies f(m_1) \neq f(m_2)\}}) > 1 - \frac{c(m_1)}{2^{256}}$$

Where $$\mathbb{P}$$ is the uniform distribution on the set of binary strings of length $$256$$ and $$c$$ is a constant such that $$c(m_1) \ll 2^{256}$$. For example, $$c \leq 2^{10}$$.

Are there any alternative interpretations? A formal definition perhaps?

Tags :