I want to compare retention between two groups in an A/B test. The number of users is large enough. Currently, I run a t-test between the groups for each day since install. This can be problematic because retention on day `n` depends on retention on day `n-1`. How should I interpret these results correctly? Is there a better way to assess significance?
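For concreteness, here is a minimal sketch of the per-day comparison described above, run on simulated data. Everything in it is an assumption made for illustration: the function names, the group sizes, the daily survival probabilities, and the 14-day window are all hypothetical. Since retention per user is a 0/1 indicator and the groups are large, the sketch uses a two-proportion z-test with a normal approximation in place of a t-test; with this many users the two should give nearly the same p-values.

```python
import math
import random


def simulate_lifetimes(n_users, daily_survival, rng, max_days=30):
    """Simulate how many days each user stays active (hypothetical model).

    A user retained on day d was necessarily retained on day d-1,
    which is exactly the day-to-day dependence in question.
    """
    lifetimes = []
    for _ in range(n_users):
        days = 0
        while days < max_days and rng.random() < daily_survival:
            days += 1
        lifetimes.append(days)
    return lifetimes


def retention(lifetimes, day):
    """Share of users still active on the given day since install."""
    return sum(1 for t in lifetimes if t >= day) / len(lifetimes)


def two_proportion_p_value(p_a, n_a, p_b, n_b):
    """Two-sided p-value of a pooled two-proportion z-test."""
    pooled = (p_a * n_a + p_b * n_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (p_a - p_b) / se
    return math.erfc(abs(z) / math.sqrt(2))


rng = random.Random(0)  # fixed seed so the run is repeatable
group_a = simulate_lifetimes(5000, 0.80, rng)  # assumed control group
group_b = simulate_lifetimes(5000, 0.82, rng)  # assumed treatment group

results = []
for day in range(1, 15):
    r_a = retention(group_a, day)
    r_b = retention(group_b, day)
    p = two_proportion_p_value(r_a, len(group_a), r_b, len(group_b))
    results.append((day, r_a, r_b, p))
    print(f"day {day:2d}  A={r_a:.3f}  B={r_b:.3f}  p={p:.4f}")
```

Each row is a separate test on the same users, so the 14 p-values are not independent of one another.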

My approach and way of thinking:

My thinking is that if only a few points are significant but one group is above the other (almost) all the time, then its retention is higher. In my opinion, many non-significant differences in the same direction add up to a significant difference. The counterargument is that a small (random) change on the first day could carry over to the following days. But then again, if the change is random, it probably wouldn't affect the following days, precisely because it's random.

The plot on which I base my decision looks like this (the data is simulated; my question is not about this exact plot!). Points mark the days on which the difference between groups is statistically significant.

