
Welcome to Calibration & Class Imbalance — the part of machine learning where your model says:

“I’m 99% confident... and 99% wrong.” 😬

Don’t worry — we’ll teach it humility.


🎯 The Business Reality Check

In most real-world business data:

  • 95% of customers don’t churn

  • 98% of transactions aren’t fraud

  • 99% of emails aren’t spam

So your model could predict “no churn” every time and still boast 95% accuracy. But that’s like saying:

“I never predict rain — and I’m right most of the year!” ☀️

High accuracy, zero usefulness.


🧩 Understanding Class Imbalance

Class imbalance happens when one class dominates the other.

| Example | Majority | Minority |
|---|---|---|
| Churn Prediction | Loyal Customers | Churners |
| Fraud Detection | Legit Transactions | Frauds |
| Medical Diagnosis | Healthy Patients | Sick Patients |

The model ends up being biased — not because it’s evil, but because math told it “majority wins!” 🧮👑
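To make that concrete, here's a minimal sketch that builds a 95/5 dataset with scikit-learn's make_classification (the weights parameter controls the class proportions). The X and y it produces are what the later snippets on this page assume:

from sklearn.datasets import make_classification
import numpy as np

# A 95/5 split: weights sets the approximate class proportions
X, y = make_classification(
    n_samples=1000,
    n_classes=2,
    weights=[0.95, 0.05],
    random_state=42,
)
print("Class counts:", np.bincount(y))  # roughly [950, 50]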


📉 The Deceptive Accuracy Trap

Let’s simulate the tragedy:

from sklearn.metrics import accuracy_score

y_true = [0]*95 + [1]*5  # 5% churn
y_pred = [0]*100          # predicts 'no churn' for everyone

print("Accuracy:", accuracy_score(y_true, y_pred))

Output:

Accuracy: 0.95

Looks amazing — until your CEO asks:

“So which 5 customers are actually leaving?” …and you just smile nervously. 😅


🧠 Solutions That Save the Day

🩹 1. Resampling

  • Oversampling: Clone the minority class (e.g., churners).

  • Undersampling: Downsize the majority class.

  • SMOTE: Synthetic Minority Over-sampling Technique — a fancy way of saying “make fake churners” 🤖.

from imblearn.over_sampling import SMOTE

# X, y = your feature matrix and labels; SMOTE synthesizes new minority samples
X_res, y_res = SMOTE(random_state=42).fit_resample(X, y)

Now your model trains on a fairer playground.
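SMOTE isn't the only option: imbalanced-learn also ships plain random over- and under-samplers with the same fit_resample API. A quick sketch covering the other two bullets above:

from imblearn.over_sampling import RandomOverSampler
from imblearn.under_sampling import RandomUnderSampler

# Oversampling: duplicate minority rows until the classes balance
X_over, y_over = RandomOverSampler(random_state=42).fit_resample(X, y)

# Undersampling: drop majority rows until the classes balance
X_under, y_under = RandomUnderSampler(random_state=42).fit_resample(X, y)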


⚙️ 2. Class Weights

Tell your model:

“Pay more attention to minority classes!”

from sklearn.linear_model import LogisticRegression

# 'balanced' weights each class by n_samples / (n_classes * class_count),
# so errors on the rare class cost proportionally more
model = LogisticRegression(class_weight='balanced')
model.fit(X, y)

It’s like giving your model sensitivity training. 🧘
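If you're curious what 'balanced' actually computes, scikit-learn exposes the formula through compute_class_weight; this sketch prints the per-class weights it would use:

from sklearn.utils.class_weight import compute_class_weight
import numpy as np

# balanced weight per class = n_samples / (n_classes * class_count)
weights = compute_class_weight(class_weight='balanced', classes=np.unique(y), y=y)
print(dict(zip(np.unique(y), weights)))  # the minority class gets the bigger weight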


📊 3. Better Metrics

Forget accuracy. Instead, use:

  • Precision → Of predicted positives, how many are right?

  • Recall → Of actual positives, how many did we catch?

  • F1-score → The harmonic lovechild of both ❤️

from sklearn.metrics import classification_report

# zero_division=0 silences the warning when a class is never predicted
print(classification_report(y_true, y_pred, zero_division=0))

Now we talk business results, not vanity metrics.
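Run the individual scores on the lazy "no churn" model from earlier and the vanity evaporates:

from sklearn.metrics import precision_score, recall_score, f1_score

# The lazy model never predicts churn, so it catches zero churners
print("Precision:", precision_score(y_true, y_pred, zero_division=0))  # 0.0
print("Recall:   ", recall_score(y_true, y_pred, zero_division=0))     # 0.0
print("F1:       ", f1_score(y_true, y_pred, zero_division=0))         # 0.0

95% accuracy, 0% recall: not one of those 5 churners was caught.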


🎯 Calibration — Teaching Models Humility

Calibration ensures that when your model says “0.8 probability,” it actually means “8 out of 10 times, this happens.”

Many models are overconfident — especially tree-based and neural ones.

Let’s fix that with a calibration curve:

from sklearn.calibration import calibration_curve
import matplotlib.pyplot as plt

# y_pred_prob = model.predict_proba(X)[:, 1]  (positive-class probabilities)
prob_true, prob_pred = calibration_curve(y_true, y_pred_prob, n_bins=10)
plt.plot(prob_pred, prob_true, marker='o', label='Model')
plt.plot([0, 1], [0, 1], '--', color='gray', label='Perfect calibration')
plt.title("Calibration Curve – Confidence vs Reality")
plt.xlabel("Predicted Probability")
plt.ylabel("Observed Frequency of Positives")
plt.legend()
plt.show()

A perfectly calibrated model hugs the diagonal. If yours doesn’t — don’t worry, nobody’s does. 😅
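To actually recalibrate an overconfident model, one common route is scikit-learn's CalibratedClassifierCV, which wraps your estimator and fits Platt scaling ('sigmoid') on held-out folds. A minimal sketch, assuming the X and y from earlier:

from sklearn.calibration import CalibratedClassifierCV
from sklearn.ensemble import RandomForestClassifier

# Wrap an (often overconfident) model; 'sigmoid' = Platt scaling fit per CV fold
calibrated = CalibratedClassifierCV(
    RandomForestClassifier(random_state=42), method='sigmoid', cv=5
)
calibrated.fit(X, y)
y_pred_prob = calibrated.predict_proba(X)[:, 1]  # feed this into the curve above

(In practice, plot the curve on held-out data; 'isotonic' is the non-parametric alternative when you have plenty of samples.)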


🧩 Practice Challenge

  1. Create a highly imbalanced dataset using make_classification.

  2. Train two models: one with and one without class_weight='balanced'.

  3. Compare precision, recall, and F1.

  4. Plot a calibration curve.

  5. Write a “CEO report” explaining why 95% accuracy is misleading — bonus points for sarcasm.
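A minimal starter for steps 1 and 2, assuming nothing beyond scikit-learn (the comparison and the sarcasm are yours to finish):

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Step 1: a 97/3 imbalanced dataset
X, y = make_classification(n_samples=2000, weights=[0.97, 0.03], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Step 2: one naive model, one class-weighted model
for cw in (None, 'balanced'):
    model = LogisticRegression(class_weight=cw, max_iter=1000).fit(X_tr, y_tr)
    print(f"class_weight={cw}")
    print(classification_report(y_te, model.predict(X_te), zero_division=0))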


🧮 Quick Recap

| Concept | What It Means |
|---|---|
| Imbalanced Data | One class vastly outnumbers the other |
| Resampling | Adjust class ratios in the training data |
| Class Weights | Make minority-class mistakes cost more |
| Calibration | Align predicted probabilities with real outcomes |
| Better Metrics | Precision, Recall, F1 over Accuracy |

💬 “A model without calibration is like a confident intern — always sure, sometimes right, never humble.” 😎


🔗 Next Up: Lab – Churn Prediction

Time to build your own churn model and see if you can spot the quitters before they quit! 🧠💼
