# Sum of i.i.d. discrete random variables, anything better than the CLT?

by Changyu Dong   Last Updated August 14, 2019 09:20 AM

I know that by the CLT the (suitably normalized) sum of i.i.d. random variables converges to the normal distribution, but is there any way to get a more refined distribution if we know the distribution of the individual random variable? The problem I have is the following:

We have a randomized function $$f: \mathbb{Z}^*\rightarrow \mathbb{Z}^*$$, and we know the probability mass function of its output $$f(n)$$ (supported on $$\{0,\dots,w-1\}$$ for some parameter $$w$$) is: $$\left\{ \begin{array}{ll} Pr[f(n)=0]=(1-\frac{1}{2})^n &\\ Pr[f(n)=i]=(1-\frac{1}{2^{i+1}})^n \prod_{j=0}^{i-1}(1-(1-\frac{1}{2^{j+1}})^n), &1\le i\le w-1\end{array} \right.$$
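For concreteness, the PMF above can be evaluated numerically. Here is a minimal sketch; the function name and the example values of `n` and `w` are my own choices, not part of the problem:

```python
import numpy as np

def pmf_f(n, w):
    """PMF of f(n) on {0, ..., w-1}, following the formulas above."""
    p = np.empty(w)
    p[0] = (1 - 0.5) ** n
    for i in range(1, w):
        # Pr[f(n)=i] = (1 - 2^{-(i+1)})^n * prod_{j=0}^{i-1} (1 - (1 - 2^{-(j+1)})^n)
        prod = 1.0
        for j in range(i):
            prod *= 1.0 - (1.0 - 2.0 ** -(j + 1)) ** n
        p[i] = (1.0 - 2.0 ** -(i + 1)) ** n * prod
    return p
```

Writing $$q_i=(1-\frac{1}{2^{i+1}})^n$$, the PMF has the form $$q_i\prod_{j<i}(1-q_j)$$, i.e. the probability that the first "success" occurs at trial $$i$$; since $$q_i\rightarrow 1$$, the probabilities sum to 1 once $$w$$ is large enough.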

Now let $$m\in \mathbb{N}$$ and let $$z_1,\dots,z_m$$ be independent samples of $$f(n)$$. What is the PMF of $$\mathcal{Z}=\sum_{i=1}^m z_i$$?
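Since the $$z_i$$ are i.i.d. with a small finite support, the exact PMF of $$\mathcal{Z}$$ is the $$m$$-fold convolution of the PMF of $$f(n)$$ with itself, which is cheap to compute directly. A sketch using exponentiation by squaring, so only $$O(\log m)$$ convolutions are needed (the function name is mine):

```python
import numpy as np

def pmf_sum(p, m):
    """Exact PMF of the sum of m i.i.d. copies of a discrete r.v.
    with PMF p on {0, 1, ..., len(p)-1}, via repeated convolution."""
    result = np.array([1.0])          # PMF of the empty sum: point mass at 0
    base = np.asarray(p, dtype=float)
    while m > 0:
        if m & 1:
            result = np.convolve(result, base)
        base = np.convolve(base, base)  # square the current power
        m >>= 1
    return result
```

For $$m\approx 10^4$$ and a support of a few dozen values this runs in well under a second, so the exact distribution of $$\mathcal{Z}$$ is available without any normal approximation at all.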

The reason I cannot simply use the CLT is that our $$m$$ is limited ($$10^3$$ to $$10^4$$), so the question is how close the distribution of $$\mathcal{Z}$$ is to the normal distribution. Currently, the only bound I know is the Berry–Esseen theorem, which measures the approximation error in the Kolmogorov–Smirnov distance and gives a rate of $$O(\frac{1}{\sqrt{m}})$$. This is too big for me. However, the Berry–Esseen bound is a worst-case bound (we don't care too much about the worst case, TBH) and is loose, so maybe we can get a more precise result.
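One way to see how loose Berry–Esseen is for this particular distribution is to compute the Kolmogorov distance between the exact lattice CDF and the fitted normal CDF numerically. A sketch, assuming the exact PMF of $$\mathcal{Z}$$ has already been computed (e.g. by repeated convolution); the function name is mine:

```python
import math
import numpy as np

def kolmogorov_distance(pmf, mu, sigma):
    """sup_x |F(x) - Phi(x)| between the CDF F of a PMF on {0,...,K}
    and the N(mu, sigma^2) CDF Phi. F only jumps at integers, so the
    supremum is approached at a jump point, from the left or the right."""
    cdf = np.cumsum(pmf)
    ks = 0.0
    prev = 0.0  # left limit of F at the current jump point
    for k, F in enumerate(cdf):
        phi = 0.5 * (1.0 + math.erf((k - mu) / (sigma * math.sqrt(2.0))))
        ks = max(ks, abs(F - phi), abs(prev - phi))
        prev = F
    return ks
```

Comparing this observed distance to the Berry–Esseen bound $$\frac{C\,\rho}{\sigma^3\sqrt{m}}$$ for the actual $$m$$ shows directly how much slack the worst-case constant leaves for this distribution.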
