
Power and Sample Size

DS-Lee 2021. 1. 1. 11:24

Effect Size

The minimum size of the effect that you hope to be able to detect in a statistical test, such as "a 20% improvement in click rates".

Power

= Probability of correctly rejecting a null hypothesis $H_0$ when it is in fact false.

= Probability of detecting a given effect size with a given sample size. Here, $H_0$ is that there is no improvement (= no difference between no-treatment and treatment), and $H_1$ is the alternative hypothesis that the treatment does produce the effect. Power is the probability of rejecting $H_0$ when $H_1$ is true.
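
In symbols, the two statements above can be written as follows. The symbol $\beta$ (the probability of a Type II error, i.e. failing to reject $H_0$ when $H_1$ is true) is notation introduced here for illustration and does not appear in the original text:

$$\text{Power} = 1 - \beta = P(\text{reject } H_0 \mid H_1 \text{ is true})$$

For comparison, the significance level is the probability of rejecting $H_0$ when it is actually true: $\alpha = P(\text{reject } H_0 \mid H_0 \text{ is true})$.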

Power Calculation / Power Analysis

A power calculation serves two main purposes: 1) calculating the power of a test, and 2) estimating how large a sample you will need. In either case, there are four moving parts:

  1. Sample size
  2. Effect size you want to detect
  3. Significance level $\alpha$ at which the test will be conducted (commonly, 0.05)
  4. Power (commonly, 0.8)

Specify any three of them, and the fourth can be calculated. You can use statistical software for the power calculation (it is widely available). Most commonly, you would want to calculate the sample size, so you must specify the other three.
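
To make "specify three, calculate the fourth" concrete, here is a minimal sketch using only the Python standard library. It assumes a two-sided two-proportion z-test with the usual normal approximation, which is one common choice and not necessarily what a given statistics package uses. The 10% baseline click rate and the function names `sample_size_per_group` and `achieved_power` are hypothetical and chosen purely for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def sample_size_per_group(p_control, p_treatment, alpha=0.05, power=0.8):
    """Approximate sample size per group for a two-sided two-proportion z-test."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_power = _STD_NORMAL.inv_cdf(power)          # quantile for the desired power
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    n = (z_alpha + z_power) ** 2 * variance / (p_treatment - p_control) ** 2
    return ceil(n)


def achieved_power(p_control, p_treatment, n_per_group, alpha=0.05):
    """Approximate power of the same test for a given sample size per group."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    std_error = sqrt(
        (p_control * (1 - p_control) + p_treatment * (1 - p_treatment)) / n_per_group
    )
    # The negligible probability of rejecting in the wrong direction is ignored.
    return _STD_NORMAL.cdf(abs(p_treatment - p_control) / std_error - z_alpha)


# Hypothetical scenario: 10% baseline click rate, 20% relative improvement.
baseline = 0.10
improved = baseline * 1.20

# Fix effect size, alpha, and power; solve for the sample size.
n_needed = sample_size_per_group(baseline, improved, alpha=0.05, power=0.8)
print("required sample size per group:", n_needed)

# Fix effect size, alpha, and sample size; solve for the power.
print("power at that sample size:", round(achieved_power(baseline, improved, n_needed), 3))
```

The two functions are inverses of each other under this approximation: feeding the computed sample size back into `achieved_power` returns a value at or just above the requested power. Dedicated power-analysis routines in statistical packages may apply different approximations or corrections and can give slightly different numbers.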

Note the following things:

  • The more power you want, the greater the sample size you will need.
  • The greater the effect size is (= the bigger the improvement you expect), the smaller the sample size you need. And vice versa: the smaller the effect size is (= the smaller the improvement you expect), the greater the sample size you need. The following figures illustrate this:

 

Fig. 1: a smaller improvement is expected (a larger sample size is required). Fig. 2: a larger improvement is expected (a smaller sample size is enough to reject the null hypothesis).
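
The two relationships noted above can also be checked numerically. The following is a rough sketch, again assuming a two-sided two-proportion z-test under the normal approximation. The 10% baseline rate, the list of relative improvements, and the helper name `sample_size_per_group` are hypothetical values picked for illustration.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(p_control, p_treatment, alpha=0.05, power=0.8):
    """Approximate sample size per group for a two-sided two-proportion z-test."""
    normal = NormalDist()
    z_alpha = normal.inv_cdf(1 - alpha / 2)
    z_power = normal.inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    return ceil((z_alpha + z_power) ** 2 * variance / (p_treatment - p_control) ** 2)


baseline = 0.10  # hypothetical baseline click rate

# Vary the effect size (relative improvement) at fixed alpha = 0.05, power = 0.8.
relative_lifts = [0.05, 0.10, 0.20, 0.40]
sizes_by_lift = {
    lift: sample_size_per_group(baseline, baseline * (1 + lift))
    for lift in relative_lifts
}
for lift, n in sizes_by_lift.items():
    print(f"improvement {lift:.0%}: about {n} per group")

# Vary the desired power at a fixed 20% relative improvement.
powers = [0.7, 0.8, 0.9]
sizes_by_power = {
    p: sample_size_per_group(baseline, baseline * 1.20, power=p) for p in powers
}
for p, n in sizes_by_power.items():
    print(f"power {p}: about {n} per group")
```

Because the required sample size scales with the inverse square of the difference between the two rates, halving the expected improvement roughly quadruples the sample size, which is why small effects are expensive to detect.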

 

References:
1) Peter Bruce and Andrew Bruce, "Practical Statistics for Data Scientists", O'Reilly, p. 112
2) YouTube tutorial video