Let's say I have a time series with N > 300 observations, but with values like [0, 1, 2, 1, 2, 0, 2, ...] representing a website's visitor count per day.

Since there are only a few visitors and each of them can be considered an individual, is there a way to "prove", or some statement in the literature, that these values are too low for, e.g., prediction with a random forest or a simple correlation with other, better-performing websites? For example, if there is a high correlation driven by higher visitor counts on Mondays, can this correlation actually be considered valid?

Or, more specifically: can p < 0.05 as returned by scipy.stats.pearsonr actually be considered valid, even if the values of one of the input arrays are low?
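To make the question concrete: as far as I understand, the p-value from scipy.stats.pearsonr is derived under a normality assumption, which small counts clearly violate. One way to cross-check it might be a permutation test, which shuffles one series to build the null distribution empirically. The sketch below uses synthetic Poisson data with a made-up Monday effect (the rates, the seed, and the helper name `permutation_pvalue` are my own assumptions, not real traffic). Note that shuffling treats the days as exchangeable, so it ignores any autocorrelation in the series.

```python
import numpy as np
from scipy import stats


def permutation_pvalue(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the Pearson correlation of x and y.

    Hypothetical helper: shuffles x to break any association with y and
    counts how often the shuffled |r| is at least as large as the observed |r|.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r_obs = stats.pearsonr(x, y)[0]
    hits = 0
    for _ in range(n_perm):
        r_perm = np.corrcoef(rng.permutation(x), y)[0, 1]
        if abs(r_perm) >= abs(r_obs):
            hits += 1
    # The +1 correction keeps the estimated p-value strictly above zero.
    return r_obs, (hits + 1) / (n_perm + 1)


# Synthetic data (assumption): 350 days, Mondays have twice the expected traffic.
rng = np.random.default_rng(42)
days = 350
weekday = np.arange(days) % 7
small_site = rng.poisson(np.where(weekday == 0, 2.0, 1.0))      # low counts
large_site = rng.poisson(np.where(weekday == 0, 200.0, 100.0))  # high counts

r, p_perm = permutation_pvalue(small_site, large_site)
p_param = stats.pearsonr(small_site, large_site)[1]
print(f"r = {r:.3f}, permutation p = {p_perm:.4f}, pearsonr p = {p_param:.4g}")
```

If the two p-values roughly agree, the low counts alone would not seem to invalidate the test; whether the day-to-day dependence of a time series does is a separate issue.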

Additionally, let's say I did some SEO and my mean visitor count improves by 400%. The actual values will still be low and the change could still be due to random effects, or am I getting this wrong?
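To illustrate what I mean by "random effects": if the daily counts are treated as independent Poisson draws with a constant rate (an assumption that ignores weekday patterns and trends), then under equal rates the share of visits falling in the "after" period is binomial, and the two periods can be compared with an exact test. The sketch below uses scipy.stats.binomtest (SciPy 1.7 or newer); the helper name `poisson_rate_pvalue` and all counts are made up for illustration.

```python
from scipy import stats


def poisson_rate_pvalue(count_before, days_before, count_after, days_after):
    """One-sided exact test that the daily rate after is higher than before.

    Hypothetical helper. Under equal Poisson rates, count_after given the
    total count is Binomial(total, days_after / (days_before + days_after)).
    """
    total = count_before + count_after
    share = days_after / (days_before + days_after)
    result = stats.binomtest(count_after, n=total, p=share, alternative="greater")
    return result.pvalue


# Made-up numbers: the same fivefold increase, observed over different windows.
p_one_week = poisson_rate_pvalue(1, 7, 5, 7)      # 1 visit vs 5 visits
p_one_month = poisson_rate_pvalue(15, 30, 75, 30)  # 15 visits vs 75 visits

print(f"one week:  p = {p_one_week:.4f}")
print(f"one month: p = {p_one_month:.2e}")
```

With these invented numbers, the same relative increase is compatible with chance over one week but not over a month, which suggests the total number of observed visits matters more than the size of the individual daily values.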

Kind regards,

Pascal
