Let's say I have a time series with N > 300 observations, but with values like [0, 1, 2, 1, 2, 0, 2, ...], representing the number of visitors to a website per day.
Since there are only a few visitors and each of them can be considered an individual, is there a way to "prove", or some statement in the literature, that these values are too low for, e.g., prediction with a random forest or simply a correlation with other, better-performing websites? For example, if there is a high correlation driven by higher visitor counts on Mondays, can this correlation actually be considered valid?
Or more specifically: can p < 0.05 as returned by scipy.stats.pearsonr actually be considered valid, even if the values in one of the input arrays are low?
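To make the question concrete, here is a minimal sketch of one way the p-value could be cross-checked: a permutation test that does not rely on the distributional assumptions behind the parametric p-value. The data are simulated (Poisson counts with made-up means), and the names `small_site` and `big_site` are hypothetical. Note that plain shuffling assumes the days are exchangeable, which a weekly pattern such as the Monday effect would violate.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_days = 300

# Hypothetical data: a low-traffic site and a high-traffic site.
# Both Poisson means are assumptions chosen for illustration only.
small_site = rng.poisson(1.0, size=n_days)
big_site = rng.poisson(200.0, size=n_days)

# Parametric result as returned by scipy.stats.pearsonr
r_obs, p_param = stats.pearsonr(small_site, big_site)

# Permutation test: shuffle one series to break any association,
# then see how often a correlation at least as extreme appears by chance.
n_perm = 2000
perm_r = np.empty(n_perm)
for i in range(n_perm):
    shuffled = rng.permutation(small_site)
    perm_r[i] = np.corrcoef(shuffled, big_site)[0, 1]

# Two-sided permutation p-value with the usual +1 correction
p_perm = (np.sum(np.abs(perm_r) >= abs(r_obs)) + 1) / (n_perm + 1)

print(f"r = {r_obs:.3f}, parametric p = {p_param:.3f}, permutation p = {p_perm:.3f}")
```

If the two p-values roughly agree on real data, that would suggest the low counts alone are not distorting the parametric test, although it says nothing about autocorrelation or seasonality.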
Additionally, let's say I did some SEO and my mean visitor count improves by 400%: the actual values will still be low and could still be due to random effects, or am I getting this wrong?
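As a rough illustration of the SEO case, the sketch below compares the daily rate before and after a change under the assumption that the daily counts are independent Poisson draws. Conditional on the total number of visits, the "after" count is then binomial under the null hypothesis of equal rates. The period lengths and means are invented for the example, and the variable names are hypothetical.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical data: 60 days before SEO, 30 days after.
# Means of 0.5 and 2.5 visitors/day (a 400% increase) are assumptions.
before = rng.poisson(0.5, size=60)
after = rng.poisson(2.5, size=30)

k_before, k_after = int(before.sum()), int(after.sum())
total = k_before + k_after

# Under H0 (same daily rate), each visit lands in the "after" period
# with probability proportional to that period's length.
p_null = len(after) / (len(before) + len(after))

# One-sided exact p-value: P(X >= k_after) for X ~ Binomial(total, p_null)
p_value = stats.binom.sf(k_after - 1, total, p_null)

rate_before = k_before / len(before)
rate_after = k_after / len(after)
print(f"rate before = {rate_before:.2f}/day, after = {rate_after:.2f}/day, p = {p_value:.3g}")
```

The point of the sketch is that the evidence depends on the total number of visits observed, not on how large each daily value is, so low daily counts can still support a conclusion if enough days are collected.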