Hypothesis Testing Basics - Machine Learning for Business

Hypothesis testing helps decide whether an observed difference is probably real or whether it could easily be explained by randomness. This matters in product experiments, business reporting, and model comparison.

## Core Terms

- Null hypothesis ($H_0$): no effect or no difference
- Alternative hypothesis ($H_1$): the competing claim
- p-value: the probability of seeing data at least as extreme as what was observed if $H_0$ were true; a small p-value means the data would be surprising under $H_0$
- $\alpha$: the cutoff used to judge significance, chosen before looking at the data

```mermaid
flowchart LR
  A[Define H0 and H1] --> B[Collect data] --> C[Compute statistic] --> D[Compute p-value] --> E[Decision]
```

Alt text: Hypothesis testing moves from a claim to data, then to a statistic, a p-value, and a decision.

## Worked Example 1: Coin Flip

If a coin shows 8 heads in 10 flips, we can test whether that is surprisingly far from what a fair coin would produce.

## Worked Example 2: A/B Test

If page A converts 50 of 1000 visitors and page B converts 62 of 1000, we can estimate whether page B is meaningfully better or whether the difference could be sampling noise.

:::{admonition} Important caution
:class: tip
A significant result is not automatically a large or valuable result. Practical importance still matters.
:::

## Guided Practice

1. What does the p-value tell you?
2. Why is rejecting $H_0$ not the same as proving $H_1$?
3. Why should a business team still care about effect size?
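The last step of the flow, turning a p-value into a decision, can be sketched as a small helper. This is a minimal illustration, not a prescribed procedure: the function name `decide`, the default `alpha=0.05`, and the two p-values in the usage lines are assumptions chosen for the example. Some texts use `p_value <= alpha` rather than a strict inequality.

```python
def decide(p_value, alpha=0.05):
    """Compare a p-value with the cutoff alpha and return the decision.

    alpha=0.05 is only a common convention (an assumption here);
    the cutoff should be chosen before looking at the data.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if p_value < alpha:
        return "reject H0"
    # Not rejecting H0 is not the same as showing H0 is true.
    return "fail to reject H0"


# Illustrative p-values only; they are not results from this page.
print(decide(0.03))
print(decide(0.20))
```

Note that the helper never returns "accept H0" or "H1 is proven": the test only reports whether the data are surprising enough under $H_0$ to reject it.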

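```python
import math

n = 10    # number of flips
k = 8     # observed heads
p0 = 0.5  # probability of heads under H0 (fair coin)


def binom_prob(n, k, p):
    """Probability of exactly k successes in n trials with success rate p."""
    return math.comb(n, k) * (p ** k) * ((1 - p) ** (n - k))


# Upper tail: outcomes with at least k heads.
upper_tail = sum(binom_prob(n, i, p0) for i in range(k, n + 1))
# Lower tail: the mirror-image outcomes (at most n - k heads).
# Mirroring like this relies on p0 = 0.5 and k > n / 2.
lower_tail = sum(binom_prob(n, i, p0) for i in range(0, n - k + 1))

# Two-sided p-value: results at least as far from n / 2 as the observed one.
p_value = upper_tail + lower_tail
print('coin flip p-value =', round(p_value, 5))
```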
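One way to build intuition for the coin-flip p-value is to simulate many sets of 10 fair flips and count how often the result is at least as far from 5 heads as the observed 8. The sketch below is a Monte Carlo approximation rather than the exact calculation; the number of simulations, the seed, and the variable names are assumptions made for this illustration.

```python
import random

n_flips = 10
observed_heads = 8
n_sims = 100_000          # assumed simulation size, chosen for illustration
rng = random.Random(0)    # fixed seed so the run is reproducible

expected_heads = n_flips / 2
observed_gap = abs(observed_heads - expected_heads)

extreme = 0
for _ in range(n_sims):
    # Each of the n_flips random bits is one fair flip; count the 1s as heads.
    heads = bin(rng.getrandbits(n_flips)).count("1")
    # Two-sided: count results at least as far from 5 heads as the observed 8.
    if abs(heads - expected_heads) >= observed_gap:
        extreme += 1

p_sim = extreme / n_sims
print("simulated two-sided p-value =", round(p_sim, 4))
```

The simulated share should land close to the exact two-sided binomial value of 112/1024 (about 0.109), which is a direct reading of "how surprising the data would be if $H_0$ were true".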

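```python
import math

x_a, n_a = 50, 1000  # page A: conversions, visitors
x_b, n_b = 62, 1000  # page B: conversions, visitors

p_a = x_a / n_a
p_b = x_b / n_b

# Under H0 both pages share one conversion rate, so pool the data.
p_pool = (x_a + x_b) / (n_a + n_b)
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))

# Two-proportion z-statistic (normal approximation).
z = (p_b - p_a) / se

# Two-tailed p-value from the standard normal distribution.
two_tailed_p = math.erfc(abs(z) / math.sqrt(2))
# One-tailed p-value for H1: page B converts better than page A.
one_tailed_p = two_tailed_p / 2 if z > 0 else 1 - two_tailed_p / 2

print('z-statistic =', round(z, 4))
print('one-tailed p-value =', round(one_tailed_p, 5))
```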
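Because a significant result is not automatically a large one, it can help to report the size of the A/B difference alongside the p-value. The sketch below computes the absolute difference, the relative lift, and an approximate 95% confidence interval for the difference using a Wald (normal-approximation) interval with an unpooled standard error. The choice of interval, the 95% level, and the variable names are assumptions for this illustration, not something the worked example prescribes.

```python
import math
from statistics import NormalDist

x_a, n_a = 50, 1000  # page A: conversions, visitors
x_b, n_b = 62, 1000  # page B: conversions, visitors

p_a = x_a / n_a
p_b = x_b / n_b

# Effect size: absolute difference and relative lift of B over A.
diff = p_b - p_a
relative_lift = diff / p_a

# Unpooled standard error, the usual choice for an interval on the
# difference (the pooled version is used for the test itself).
se_unpooled = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)

# Critical value for an assumed 95% confidence level.
z_crit = NormalDist().inv_cdf(0.975)
ci_low = diff - z_crit * se_unpooled
ci_high = diff + z_crit * se_unpooled

print("absolute difference =", round(diff, 4))
print("relative lift =", round(relative_lift, 4))
print("approx. 95% CI =", (round(ci_low, 4), round(ci_high, 4)))
```

With these counts the interval spans zero, so the data are compatible both with no improvement and with a lift of a few percentage points. Whether a lift of that size would matter is a business question that the p-value alone cannot answer.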