Plotting Loyalty Curves
“The Kaplan–Meier curve: because sometimes you need to visualize how fast your customers ghost you.”
🧠 What Is the Kaplan–Meier Estimator?
The Kaplan–Meier estimator (a.k.a. the KM curve) helps us estimate the probability that something (or someone 👀) survives beyond a given time.
In business, that “something” is usually:
A customer staying subscribed,
A product lasting before it breaks, or
An employee staying before they update their LinkedIn headline to “Open to Work.”
💡 Core Idea
Instead of guessing when people will churn, KM helps us say:
“What’s the chance that a customer is still active after X days?”
The KM survival function is calculated step by step:
\[ S(t) = \prod_{t_i \leq t} \left(1 - \frac{d_i}{n_i}\right) \]
Where:
\(t_i\): time points where events (churns) happen
\(d_i\): number of churns at time \(t_i\)
\(n_i\): number of customers still “at risk” just before \(t_i\)
Translation in business-speak:
Each time a customer churns, the survival probability takes a small hit — like your morale when you check monthly retention numbers.
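To make the "step by step" part explicit, the product can be unrolled into a recursion. This is the same estimator, just rewritten; the symbol \(c_i\) is notation introduced here (it is not in the formula above) for the number of customers censored at or after \(t_i\) but before the next churn time \(t_{i+1}\):

\[
S(t_0) = 1, \qquad
S(t_i) = S(t_{i-1}) \left(1 - \frac{d_i}{n_i}\right), \qquad
n_{i+1} = n_i - d_i - c_i
\]

Censored customers never reduce \(S(t)\) directly. They only shrink the at-risk count \(n_i\), which makes each later churn cost a bigger slice of the remaining survival probability.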
🧾 Example Time!
Let’s track 5 customers:
| Customer | Time (days) | Event (1=Churned, 0=Active) |
|---|---|---|
| A | 10 | 1 |
| B | 20 | 0 |
| C | 20 | 1 |
| D | 30 | 1 |
| E | 40 | 0 |
Now, step through:
| Time | At Risk | Events | Survival Probability |
|---|---|---|---|
| 10 | 5 | 1 | (1 - 1/5) = 0.8 |
| 20 | 4 | 1 | 0.8 × (1 - 1/4) = 0.6 |
| 30 | 2 | 1 | 0.6 × (1 - 1/2) = 0.3 |
| 40 | 1 | 0 | 0.3 × 1 = 0.3 |
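🎯 Interpretation: From day 30 onward, the estimated chance that a customer is still active is about 30%. The curve first drops to 0.5 or below at day 30, so the estimated median lifetime is 30 days. In other words, your product’s half-life is basically one billing cycle.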
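If you want to check the table by hand, here is a minimal from-scratch sketch using only the standard library. It is not how lifelines is implemented; the function names `kaplan_meier` and `median_survival_time` are made up for this illustration. It assumes the usual convention that a customer censored at time t still counts as at risk at t.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return (time, at_risk, churned, survival) rows, one per distinct time."""
    churned = Counter(t for t, e in zip(durations, events) if e == 1)
    # Everyone, churned or censored, leaves the risk set at their own time.
    leaving = Counter(durations)
    at_risk = len(durations)
    survival = 1.0
    rows = []
    for t in sorted(leaving):
        d = churned.get(t, 0)
        survival *= 1 - d / at_risk  # one factor of the KM product
        rows.append((t, at_risk, d, survival))
        at_risk -= leaving[t]
    return rows


def median_survival_time(rows):
    """First time the survival estimate is 0.5 or lower (None if it never is)."""
    for t, _, _, s in rows:
        if s <= 0.5:
            return t
    return None


# The five customers from the table above.
durations = [10, 20, 20, 30, 40]
events = [1, 0, 1, 1, 0]

rows = kaplan_meier(durations, events)
for t, n, d, s in rows:
    print(f"t={t:>2}  at risk={n}  churned={d}  S(t)={s:.2f}")
print("median survival time:", median_survival_time(rows))
```

Note how customer B (censored at day 20) lowers the at-risk count from 4 to 2 after day 20 without adding a factor to the product.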
📊 Plot It Like a Pro
```python
from lifelines import KaplanMeierFitter
import pandas as pd
import matplotlib.pyplot as plt

# T = days observed, E = 1 if the customer churned, 0 if still active (censored)
df = pd.DataFrame({
    "T": [10, 20, 20, 30, 40],
    "E": [1, 0, 1, 1, 0],
})

kmf = KaplanMeierFitter()
kmf.fit(durations=df["T"], event_observed=df["E"], label="Customer Retention")

# show_censors=True draws a mark on the curve at each censored observation
kmf.plot_survival_function(show_censors=True)
plt.title("Kaplan–Meier Curve: How Loyal Are Your Customers?")
plt.xlabel("Days Since Subscription")
plt.ylabel("Probability of Staying Subscribed")
plt.show()
```

📈 What the Curve Tells You
A steep drop early on → Customers are ghosting faster than your follow-up emails.
A flat curve → You’ve found the loyal ones. They’ll probably name their Wi-Fi after you.
Censoring marks (small ticks on the curve, drawn when you pass show_censors=True to plot_survival_function) → Customers who haven’t churned yet. They’re still in the game.
🎯 Business Applications
| Use Case | Description |
|---|---|
| Subscription Retention | Estimate the median lifetime of a paying customer. |
| Product Warranty | Predict how long a product lasts before failure. |
| Employee Turnover | Visualize “time until resignation” (HR horror story). |
| Campaign Effectiveness | Compare survival curves of two marketing groups. |
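For the last row, eyeballing two curves only gets you so far; the usual companion is a log-rank test. Below is a rough from-scratch sketch of the two-group version, standard library only. The function name `logrank_sketch` and the toy data are hypothetical, invented for this illustration. For real work, lifelines ships its own `lifelines.statistics.logrank_test`.

```python
import math


def logrank_sketch(durations_a, events_a, durations_b, events_b):
    """Two-group log-rank test. Returns (chi_square, p_value), 1 degree of freedom."""
    # Distinct times at which at least one churn happened in either group.
    event_times = sorted(
        {t for t, e in zip(durations_a, events_a) if e}
        | {t for t, e in zip(durations_b, events_b) if e}
    )
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in durations_a if x >= t)  # at risk in A just before t
        n_b = sum(1 for x in durations_b if x >= t)  # at risk in B just before t
        d_a = sum(1 for x, e in zip(durations_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(durations_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        # Under "no difference", group A expects its share of the d churns.
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi_square = observed_minus_expected ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value


# Hypothetical toy data: (days observed, 1 = churned / 0 = still active).
promo_days, promo_churned = [30, 45, 60, 60, 90, 90], [1, 1, 0, 1, 0, 0]
control_days, control_churned = [10, 15, 20, 30, 40, 60], [1, 1, 1, 1, 1, 0]

chi_square, p_value = logrank_sketch(
    promo_days, promo_churned, control_days, control_churned
)
print(f"chi-square = {chi_square:.3f}, p = {p_value:.3f}")
```

A small p-value suggests the two survival curves really differ. With only six customers per group, treat any result as a toy.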
🧩 Practice Exercise
Simulate 100 customers with random churn times.
Use lifelines.KaplanMeierFitter() to plot their survival curve.
Split into Group A (promo emails) and Group B (no emails).
See which group survives longer. (Hint: Don’t bet on the “no emails” group.)
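If you're not sure how to start the simulation, here is one possible starter, not the only reasonable setup. It assumes exponentially distributed churn times and a fixed observation window after which still-active customers are censored. The helper name `simulate_customers`, the mean lifetimes, and the 180-day window are all made-up values for illustration.

```python
import random


def simulate_customers(n, mean_lifetime_days, observation_days, seed=0):
    """Simulate n customers; return (durations, events) with right-censoring."""
    rng = random.Random(seed)
    durations, events = [], []
    for _ in range(n):
        churn_time = rng.expovariate(1 / mean_lifetime_days)
        if churn_time <= observation_days:
            durations.append(churn_time)        # churn observed
            events.append(1)
        else:
            durations.append(observation_days)  # still active when we stop watching
            events.append(0)
    return durations, events


# Assumption for the exercise: promo emails double the mean lifetime.
days_a, churned_a = simulate_customers(50, mean_lifetime_days=120, observation_days=180, seed=1)
days_b, churned_b = simulate_customers(50, mean_lifetime_days=60, observation_days=180, seed=2)

print("Group A churned:", sum(churned_a), "of", len(churned_a))
print("Group B churned:", sum(churned_b), "of", len(churned_b))
```

From here, fit one KaplanMeierFitter per group (passing each group's durations and events, with a different label) and draw both curves on the same axes to compare them.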
🤹 Fun Thought
If your KM curve stays above 0.8 after 3 months, you’ve achieved business immortality. 🧙
If it hits 0.1 after two weeks, consider changing your pricing… or your product.
🧭 Next Stop
➡️ Cox Proportional Hazards Model – where we stop pretending all customers are equal and start quantifying who’s most likely to churn next.