What is the correct mathematical satement for a cryptographic hash function as in digial currencies?

by stressed out   Last Updated July 12, 2019 09:20 AM - source

Most tutorials about digital currencies (like BitCoin) say that a hash function is a cryptographic function that takes an input and always return a binary string of length $256$ such that a small change in the input would result in a totally different hash.

Obviously, in the simplest interpretation, the hash function must be injective. However, since the length of input messages is unlimited while there are only $2^{256}$ binary strings, it is obvious that an injective function cannot exist.

So, what is the correct interpretation? I have tried to formulate a possible interpretation, and I did google it to see if there's an actual mathematical definition of the hash function as in BitCoin tutorials, but I didn't have any luck.

Anyway, here's my interpretation:

Let $\mathcal{M}$ denote the set of messages. We can think of this set as binary strings of unlimited length, i.e. $$\mathcal{M} = \Large\{ \large(a_1,a_2,a_3,\cdots,a_n,\cdots) \large\mid a_i=0,1 (\forall i\in \mathbb{N})\Large\}$$ We can define a distance on this set as follows:

$$d(m_1,m_2) = \sum_{i=1}^{\infty} \frac{a_i}{2^i}$$

Then the statement becomes something like this:

$$\exists \delta >0, \forall m_1\in \mathcal{M}: \,\,\mathbb{P}(\{{f(m_2)\, |\, d(m_1,m_2) > \delta \implies f(m_1) \neq f(m_2)\}}) > 1 - \frac{c(m_1)}{2^{256}}$$

Where $\mathbb{P}$ is the uniform distribution on the set of binary strings of length $256$ and $c$ is a constant such that $c(m_1) \ll 2^{256}$. For example, $c \leq 2^{10}$.

Are there any alternative interpretations? A formal definition perhaps?



Answers 1


The relevant cryptographic property of hash functions is that there is no feasible way of finding an input that gives a desired output other than pure brute force.

The fact that even a minor change in the input ought to give a complete change in the output is a consequence of this (rather than being a fundamental property of its own). Otherwise, if you got an output that was close to what you wanted, you could make small changes to the input, expect small changes to the output, and thus not have to brute force as much.

I have heard this property described by the term "irreversible".

Also, there is nothing special about 256 bit outputs. It's just that the currently most used cryptographic hash function (SHA-256) works that way.

Arthur
Arthur
July 12, 2019 09:04 AM

Related Questions




Exponentiation for hash function & associativity

Updated April 06, 2015 18:08 PM

Finding a path in a graph by its hash value

Updated July 27, 2015 18:08 PM