I want to compare retention between two groups in an A/B test. The number of users is large. Currently, I run a t-test between the groups for each day since install. This can be problematic because retention on day n depends on retention on day n-1. How should I interpret these results correctly? Is there a better way to calculate significance?
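For concreteness, here is a minimal sketch of the kind of per-day comparison described above. It is not my exact code: the counts are made up, the function name `per_day_p_values` is hypothetical, and it uses the large-sample normal approximation (an unpooled two-proportion z-test) in place of the t distribution, which should be nearly equivalent for 0/1 retention indicators with many users.

```python
import math

def per_day_p_values(retained_a, n_a, retained_b, n_b):
    """Two-sided p-value per day for the difference in retention rates.

    retained_a / retained_b: retained-user counts per day since install.
    n_a / n_b: number of installed users in each group.
    Assumption: normal approximation instead of the t distribution.
    """
    p_values = []
    for ra, rb in zip(retained_a, retained_b):
        pa, pb = ra / n_a, rb / n_b
        # Unpooled standard error of the difference in proportions.
        se = math.sqrt(pa * (1 - pa) / n_a + pb * (1 - pb) / n_b)
        if se == 0:
            # Both rates are exactly 0 or 1: no evidence of a difference.
            p_values.append(1.0)
            continue
        z = (pa - pb) / se
        # P(|Z| > |z|) for a standard normal Z.
        p_values.append(math.erfc(abs(z) / math.sqrt(2)))
    return p_values

# Hypothetical counts for illustration only (not real data).
n_a = n_b = 10_000
retained_a = [4200, 3100, 2600, 2300]
retained_b = [4100, 3000, 2500, 2250]

for day, p in enumerate(per_day_p_values(retained_a, n_a, retained_b, n_b), start=1):
    print(f"day {day}: p = {p:.3f}")
```

Each day is tested separately here, so nothing in the calculation accounts for the fact that the same users appear on consecutive days.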
My approach and way of thinking:
My thinking is that if only a few days are significant but one group is above the other (almost) all the time, then retention really is higher in that group. In my opinion, many non-significant differences in the same direction add up to a significant difference. The opposite argument is that a small (random) difference on the first day could carry over to the following days. But then again, if the difference is random, it probably wouldn't affect the following days, precisely because it's random.
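One way to probe the opposite argument is a small A/A simulation: both groups get exactly the same retention behaviour, and we count how often one group still ends up ahead on every single day. The sketch below rests on assumptions that are mine and not taken from any real data: a churned user never comes back, every user has the same daily probability of staying (`daily_stay_prob`), and all function names and parameter values are hypothetical.

```python
import random

def simulate_retained_counts(n_users, daily_stay_prob, n_days, rng):
    """Retained-user counts per day, assuming a churned user never returns."""
    counts = []
    retained = n_users
    for _ in range(n_days):
        # Only users retained yesterday can be retained today.
        retained = sum(1 for _ in range(retained) if rng.random() < daily_stay_prob)
        counts.append(retained)
    return counts

def share_one_group_always_ahead(n_sims=300, n_users=1000,
                                 daily_stay_prob=0.7, n_days=7, seed=0):
    """Share of A/A simulations in which one group leads on every day."""
    rng = random.Random(seed)
    always_ahead = 0
    for _ in range(n_sims):
        a = simulate_retained_counts(n_users, daily_stay_prob, n_days, rng)
        b = simulate_retained_counts(n_users, daily_stay_prob, n_days, rng)
        a_leads = all(x > y for x, y in zip(a, b))
        b_leads = all(x < y for x, y in zip(a, b))
        if a_leads or b_leads:
            always_ahead += 1
    return always_ahead / n_sims

observed = share_one_group_always_ahead()
# If the 7 daily comparisons were independent coin flips, one group would
# lead on all 7 days with probability 2 * 0.5**7.
independent_days = 2 * 0.5 ** 7

print(f"share with one group always ahead: {observed:.3f}")
print(f"expected if days were independent: {independent_days:.3f}")
```

Under these assumptions there is no true difference between the groups, so comparing the two printed numbers gives a rough sense of how much a random first-day gap can persist purely because each day's retained users are a subset of the previous day's.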