Hypothesis testing helps decide whether an observed difference reflects a real effect or could easily be explained by random variation. This matters in product experiments, business reporting, and model comparison.

## Core Terms

- Null hypothesis ($H_0$): no effect or no difference
- Alternative hypothesis ($H_1$): the competing claim
- p-value: the probability, assuming $H_0$ is true, of seeing data at least as extreme as what was observed; a small p-value means the data would be surprising under $H_0$
- $\alpha$: the significance level, a cutoff chosen before looking at the data; $H_0$ is rejected when the p-value falls below it

```mermaid
flowchart LR
    A[Define H0 and H1] --> B[Collect data] --> C[Compute statistic] --> D[Compute p-value] --> E[Decision]
```

Alt text: Hypothesis testing moves from a claim to data, then to a statistic, a p-value, and a decision.

## Worked Example 1: Coin Flip

If a coin shows 8 heads in 10 flips, we can test whether that is surprisingly far from what a fair coin would produce.

## Worked Example 2: A/B Test

If page A converts 50 of 1000 visitors (5.0%) and page B converts 62 of 1000 (6.2%), we can test whether page B's higher rate is distinguishable from sampling noise.

:::{admonition} Important caution
:class: tip
A significant result is not automatically a large or valuable result. Practical importance still matters.
:::

## Guided Practice

1. What does the p-value tell you?
2. Why is rejecting $H_0$ not the same as proving $H_1$?
3. Why should a business team still care about effect size?

Answers
1. How surprising the observed data would be if the null hypothesis were true.
2. Statistical evidence is not absolute proof: rejecting $H_0$ only means the data would be unlikely if it were true.
3. Because a tiny effect can be statistically detectable but economically irrelevant.
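
Answer 3 can be made concrete with a small sketch. The rates below (5.0% versus 5.1%) and the sample sizes are hypothetical, chosen only for illustration, and `two_sided_p` is a helper name introduced here. It assumes equal-sized arms and uses a pooled two-proportion z-test with a normal approximation.

```python
import math


def two_sided_p(p_a, p_b, n):
    """Two-sided p-value of a pooled two-proportion z-test with n visitors per arm."""
    p_pool = (p_a + p_b) / 2  # equal arm sizes, so the pooled rate is the mean
    se = math.sqrt(p_pool * (1 - p_pool) * (2 / n))
    z = (p_b - p_a) / se
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical rates: a fixed lift of 0.1 percentage points.
rate_a, rate_b = 0.050, 0.051
for n in (1_000, 100_000, 10_000_000):
    print(f"n per arm = {n:>10,}  p-value = {two_sided_p(rate_a, rate_b, n):.4g}")
```

The lift never changes, yet the p-value shrinks as the sample grows. Significance alone therefore does not say whether the change is worth acting on.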

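```python
import math

# Worked Example 1: exact two-sided binomial test for 8 heads in 10 flips.
n = 10    # number of flips
k = 8     # observed heads
p0 = 0.5  # probability of heads under H0 (fair coin)


def binom_prob(n, k, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * (p ** k) * ((1 - p) ** (n - k))


# Outcomes at least as extreme as k heads, in both directions.
# The mirrored lower tail assumes p0 = 0.5 and k > n / 2.
upper_tail = sum(binom_prob(n, i, p0) for i in range(k, n + 1))
lower_tail = sum(binom_prob(n, i, p0) for i in range(0, n - k + 1))
p_value = upper_tail + lower_tail
print('coin flip p-value =', round(p_value, 5))
```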
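
One way to build intuition for the coin-flip p-value is to simulate many fair-coin experiments and count how often the result is at least as far from 5 heads as the observed 8. This is a rough sketch rather than a replacement for the exact calculation; the seed and the number of trials (`n_trials`) are arbitrary assumptions. The estimate should land close to the exact value of 112/1024 ≈ 0.109.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

n, k = 10, 8        # flips and observed heads from Worked Example 1
n_trials = 50_000   # assumed number of simulated experiments
expected = n / 2    # expected heads for a fair coin

extreme = 0
for _ in range(n_trials):
    # One experiment: n flips of a fair coin.
    heads = sum(random.random() < 0.5 for _ in range(n))
    # Count experiments at least as far from the expected count as the observed k.
    if abs(heads - expected) >= abs(k - expected):
        extreme += 1

simulated_p = extreme / n_trials
print('simulated p-value (approx) =', round(simulated_p, 3))
```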
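```python
import math

# Worked Example 2: pooled two-proportion z-test for the A/B data.
x_a, n_a = 50, 1000  # conversions and visitors, page A
x_b, n_b = 62, 1000  # conversions and visitors, page B
p_a = x_a / n_a
p_b = x_b / n_b

# Under H0 both pages share one rate, so pool the data to estimate it.
p_pool = (x_a + x_b) / (n_a + n_b)
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
z = (p_b - p_a) / se

# Normal-approximation p-values; the one-tailed test asks whether B beats A.
two_tailed_p = math.erfc(abs(z) / math.sqrt(2))
one_tailed_p = two_tailed_p / 2 if z > 0 else 1 - two_tailed_p / 2
print('z-statistic =', round(z, 4))
print('one-tailed p-value =', round(one_tailed_p, 5))
```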
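
Because practical importance depends on the size of the effect, it can help to report the lift and an interval around it alongside the p-value. The sketch below is one possible approach, not part of the original examples. It assumes a normal-approximation (Wald) interval with an unpooled standard error and uses 1.96 as the approximate critical value for a 95% interval.

```python
import math

x_a, n_a = 50, 1000  # conversions and visitors, page A
x_b, n_b = 62, 1000  # conversions and visitors, page B
p_a, p_b = x_a / n_a, x_b / n_b

diff = p_b - p_a  # absolute lift in conversion rate
# Unpooled standard error: an interval does not assume the two rates are equal.
se_diff = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)

z_crit = 1.96  # approximate 97.5th percentile of the standard normal
ci_low = diff - z_crit * se_diff
ci_high = diff + z_crit * se_diff

print('absolute lift =', round(diff, 4))
print('relative lift =', round(diff / p_a, 4))
print('95% CI for absolute lift =', (round(ci_low, 4), round(ci_high, 4)))
```

Here the interval runs from roughly -0.008 to 0.032, so it includes zero. The data are compatible with no lift at all as well as with a lift of about three percentage points.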