Hypothesis testing helps decide whether an observed difference reflects a real effect or could easily be explained by random variation. This matters in product experiments, business reporting, and model comparison.

## Core Terms

- Null hypothesis ($H_0$): no effect or no difference
- Alternative hypothesis ($H_1$): the competing claim
- p-value: how surprising the data would be if $H_0$ were true, measured as the probability of a result at least as extreme as the one observed
- $\alpha$: the cutoff used to judge significance

```mermaid
flowchart LR
    A[Define H0 and H1] --> B[Collect data] --> C[Compute statistic] --> D[Compute p-value] --> E[Decision]
```

Alt text: Hypothesis testing moves from a claim to data, then to a statistic, a p-value, and a decision.

## Worked Example 1: Coin Flip

If a coin shows 8 heads in 10 flips, we can test whether that is surprisingly far from what a fair coin would produce.

## Worked Example 2: A/B Test

If page A converts 50 of 1000 visitors and page B converts 62 of 1000, we can estimate whether page B is meaningfully better or whether the difference could be sampling noise.

:::{admonition} Important caution
:class: tip
A significant result is not automatically a large or valuable result. Practical importance still matters.
:::

## Guided Practice

1. What does the p-value tell you?
2. Why is rejecting $H_0$ not the same as proving $H_1$?
3. Why should a business team still care about effect size?

Answers
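```python
# Worked Example 1: exact two-sided binomial test for 8 heads in 10 flips
import math

n = 10    # number of flips
k = 8     # observed heads
p0 = 0.5  # probability of heads under H0 (fair coin)


def binom_prob(n, k, p):
    """Probability of exactly k successes in n trials with success rate p."""
    return math.comb(n, k) * (p ** k) * ((1 - p) ** (n - k))


# Outcomes at least as extreme as the observed one, in both directions
upper_tail = sum(binom_prob(n, i, p0) for i in range(k, n + 1))      # 8 or more heads
lower_tail = sum(binom_prob(n, i, p0) for i in range(0, n - k + 1))  # 2 or fewer heads
p_value = upper_tail + lower_tail
print('coin flip p-value =', round(p_value, 5))
```

```python
# Worked Example 2: pooled two-proportion z-test for the A/B conversion counts
import math

x_a, n_a = 50, 1000  # page A: conversions, visitors
x_b, n_b = 62, 1000  # page B: conversions, visitors

p_a = x_a / n_a
p_b = x_b / n_b

# Under H0 both pages share one conversion rate, so pool the counts
p_pool = (x_a + x_b) / (n_a + n_b)
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
z = (p_b - p_a) / se

# Normal tail areas via the complementary error function
two_tailed_p = math.erfc(abs(z) / math.sqrt(2))
# One-tailed test of H1: page B converts better than page A
one_tailed_p = two_tailed_p / 2 if z > 0 else 1 - two_tailed_p / 2

print('z-statistic =', round(z, 4))
print('one-tailed p-value =', round(one_tailed_p, 5))
```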
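
The caution about practical importance can be made concrete by reporting an effect size next to the p-value. The sketch below reuses the A/B counts from Worked Example 2, compares the one-tailed p-value against a cutoff, and reports the absolute lift with an approximate 95% confidence interval. The cutoff `alpha = 0.05`, the helper name `ab_summary`, and the choice of a Wald (unpooled) interval are assumptions made for illustration; the text does not prescribe any of them.

```python
import math


def ab_summary(x_a, n_a, x_b, n_b, alpha=0.05):
    """Hypothetical helper: one-tailed pooled z-test plus an effect-size summary.

    Tests H1: page B converts better than page A.
    alpha=0.05 is an assumed cutoff, not a value taken from the text.
    """
    p_a, p_b = x_a / n_a, x_b / n_b

    # Test statistic: pooled standard error, valid under H0 (equal rates)
    p_pool = (x_a + x_b) / (n_a + n_b)
    se_pool = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se_pool

    # Upper-tail area of the standard normal, P(Z >= z)
    p_value = 0.5 * math.erfc(z / math.sqrt(2))

    # Effect size: absolute lift with a Wald 95% interval (unpooled standard error)
    lift = p_b - p_a
    se_unpooled = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z_crit = 1.959964  # 97.5th percentile of the standard normal
    ci_95 = (lift - z_crit * se_unpooled, lift + z_crit * se_unpooled)

    return {
        "z": z,
        "p_value": p_value,
        "reject_h0": p_value < alpha,
        "lift": lift,
        "ci_95": ci_95,
    }


result = ab_summary(50, 1000, 62, 1000)
print("reject H0 at alpha=0.05:", result["reject_h0"])
print("absolute lift:", round(result["lift"], 4))
print("approx. 95% CI for lift:", tuple(round(v, 4) for v in result["ci_95"]))
```

With these counts the interval for the lift includes zero, so the data are compatible with no improvement as well as with a gain of a few percentage points. Whether a gain of that size would matter is a business question that the p-value alone cannot answer.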