by stressed out
Last Updated July 12, 2019 09:20 AM - source

Most tutorials about digital currencies (like BitCoin) say that a hash function is a cryptographic function that takes an input and always return a binary string of length $256$ such that a small change in the input would result in a totally different hash.

Obviously, in the simplest interpretation, the hash function must be injective. However, since the length of input messages is unlimited while there are only $2^{256}$ binary strings, it is obvious that an injective function cannot exist.

So, what is the correct interpretation? I have tried to formulate a possible interpretation, and I did google it to see if there's an actual mathematical definition of the hash function as in BitCoin tutorials, but I didn't have any luck.

Anyway, here's my interpretation:

Let $\mathcal{M}$ denote the set of messages. We can think of this set as binary strings of unlimited length, i.e. $$\mathcal{M} = \Large\{ \large(a_1,a_2,a_3,\cdots,a_n,\cdots) \large\mid a_i=0,1 (\forall i\in \mathbb{N})\Large\}$$ We can define a distance on this set as follows:

$$d(m_1,m_2) = \sum_{i=1}^{\infty} \frac{a_i}{2^i}$$

Then the statement becomes something like this:

$$\exists \delta >0, \forall m_1\in \mathcal{M}: \,\,\mathbb{P}(\{{f(m_2)\, |\, d(m_1,m_2) > \delta \implies f(m_1) \neq f(m_2)\}}) > 1 - \frac{c(m_1)}{2^{256}}$$

Where $\mathbb{P}$ is the uniform distribution on the set of binary strings of length $256$ and $c$ is a constant such that $c(m_1) \ll 2^{256}$. For example, $c \leq 2^{10}$.

Are there any alternative interpretations? A formal definition perhaps?

The relevant cryptographic property of hash functions is that there is no feasible way of finding an input that gives a desired output other than pure brute force.

The fact that even a minor change in the input ought to give a complete change in the output is a consequence of this (rather than being a fundamental property of its own). Otherwise, if you got an output that was close to what you wanted, you could make small changes to the input, expect small changes to the output, and thus not have to brute force as much.

I have heard this property described by the term "irreversible".

Also, there is nothing special about 256 bit outputs. It's just that the currently most used cryptographic hash function (SHA-256) works that way.

July 12, 2019 09:04 AM

- Serverfault Help
- Superuser Help
- Ubuntu Help
- Webapps Help
- Webmasters Help
- Programmers Help
- Dba Help
- Drupal Help
- Wordpress Help
- Magento Help
- Joomla Help
- Android Help
- Apple Help
- Game Help
- Gaming Help
- Blender Help
- Ux Help
- Cooking Help
- Photo Help
- Stats Help
- Math Help
- Diy Help
- Gis Help
- Tex Help
- Meta Help
- Electronics Help
- Stackoverflow Help
- Bitcoin Help
- Ethereum Help