Calibration & Class Imbalance#

Welcome to Calibration & Class Imbalance — the part of machine learning where your model says:

“I’m 99% confident… and 99% wrong.” 😬

Don’t worry — we’ll teach it humility.


🎯 The Business Reality Check#

In most real-world business data:

  • 95% of customers don’t churn

  • 98% of transactions aren’t fraud

  • 99% of emails aren’t spam

So your model could predict “no churn” every time and still boast 95% accuracy. But that’s like saying:

“I never predict rain — and I’m right most of the year!” ☀️

High accuracy, zero usefulness.


🧩 Understanding Class Imbalance#

Class imbalance happens when one class dominates the other.

| Example | Majority Class | Minority Class |
|---|---|---|
| Churn Prediction | Loyal Customers | Churners |
| Fraud Detection | Legit Transactions | Frauds |
| Medical Diagnosis | Healthy Patients | Sick Patients |

The model ends up being biased — not because it’s evil, but because math told it “majority wins!” 🧮👑
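Before fixing imbalance, it helps to measure it. A minimal sketch, assuming your labels live in an array called y (the toy numbers below are made up for illustration):

import numpy as np

y = np.array([0]*950 + [1]*50)   # toy labels: 95% majority, 5% minority
counts = np.bincount(y)          # samples per class
print("Class counts:", counts)                 # [950  50]
print("Minority share:", counts[1] / len(y))   # 0.05

If the minority share sits in the low single digits, accuracy alone is about to lie to you.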


📉 The Deceptive Accuracy Trap#

Let’s simulate the tragedy:

from sklearn.metrics import accuracy_score

y_true = [0]*95 + [1]*5  # 5% churn
y_pred = [0]*100          # predicts 'no churn' for everyone

print("Accuracy:", accuracy_score(y_true, y_pred))

Output:

Accuracy: 0.95

Looks amazing — until your CEO asks:

“So which 5 customers are actually leaving?” …and you just smile nervously. 😅


🧠 Solutions That Save the Day#

🩹 1. Resampling#

  • Oversampling: Clone (duplicate) the minority class, e.g., churners (sketch below).

  • Undersampling: Downsize the majority class by dropping some of its samples.

  • SMOTE: Synthetic Minority Over-sampling Technique — it interpolates between existing minority samples, a fancy way of saying “make fake churners” 🤖.
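
The first two come ready-made in imbalanced-learn. A minimal sketch, assuming a feature matrix X and labels y already exist (the variable names here are placeholders):

from imblearn.over_sampling import RandomOverSampler
from imblearn.under_sampling import RandomUnderSampler

# Duplicate minority samples until both classes are the same size
X_over, y_over = RandomOverSampler(random_state=42).fit_resample(X, y)

# Randomly drop majority samples until both classes are the same size
X_under, y_under = RandomUnderSampler(random_state=42).fit_resample(X, y)

And SMOTE itself is a one-liner: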

from imblearn.over_sampling import SMOTE
X_res, y_res = SMOTE().fit_resample(X, y)

Now your model trains on a fairer playground.


⚙️ 2. Class Weights#

Tell your model:

“Pay more attention to minority classes!”

from sklearn.linear_model import LogisticRegression

model = LogisticRegression(class_weight='balanced')
model.fit(X, y)

It’s like giving your model sensitivity training. 🧘
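
Under the hood, 'balanced' sets each class weight to n_samples / (n_classes * count), i.e., inversely proportional to how often the class appears. You can peek at the exact numbers with sklearn's helper (reusing the 5%-churn toy labels from earlier):

import numpy as np
from sklearn.utils.class_weight import compute_class_weight

y = np.array([0]*95 + [1]*5)   # 95 loyal customers, 5 churners

weights = compute_class_weight(class_weight='balanced', classes=np.array([0, 1]), y=y)
print(dict(zip([0, 1], weights)))   # roughly {0: 0.53, 1: 10.0}

So a single mistake on a churner now costs the model about as much as nineteen mistakes on loyal customers.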


📊 3. Better Metrics#

Forget accuracy. Instead, use:

  • Precision → Of predicted positives, how many are right?

  • Recall → Of actual positives, how many did we catch?

  • F1-score → The harmonic lovechild of both ❤️

from sklearn.metrics import classification_report
print(classification_report(y_true, y_pred))

Now we talk business results, not vanity metrics.
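
Run the individual scores on the "predict no churn for everyone" model from the accuracy trap above and the 95%-accuracy illusion collapses (zero_division=0 just silences the warning sklearn raises when no positives are predicted):

from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [0]*95 + [1]*5
y_pred = [0]*100   # the "never predict churn" model

print("Precision:", precision_score(y_true, y_pred, zero_division=0))  # 0.0
print("Recall:   ", recall_score(y_true, y_pred, zero_division=0))     # 0.0 (caught none of the 5 churners)
print("F1:       ", f1_score(y_true, y_pred, zero_division=0))         # 0.0

95% accuracy, 0% of everything the business actually cares about.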


🎯 Calibration — Teaching Models Humility#

Calibration ensures that when your model says “0.8 probability,” it actually means “8 out of 10 times, this happens.”

Many models are over- or under-confident: boosted trees and SVMs tend to squash probabilities toward the middle, while modern neural networks are famously overconfident.

Let’s diagnose the damage with a calibration curve (also called a reliability diagram):

from sklearn.calibration import calibration_curve
import matplotlib.pyplot as plt

# y_pred_prob = your model's positive-class probabilities,
# e.g. y_pred_prob = model.predict_proba(X)[:, 1]
prob_true, prob_pred = calibration_curve(y_true, y_pred_prob, n_bins=10)
plt.plot(prob_pred, prob_true, marker='o')
plt.plot([0, 1], [0, 1], '--', color='gray')  # the "perfectly honest" diagonal
plt.title("Calibration Curve – Confidence vs Reality")
plt.xlabel("Mean Predicted Probability")
plt.ylabel("Observed Fraction of Positives")
plt.show()

A perfectly calibrated model hugs the diagonal. If yours doesn’t — don’t worry, nobody’s does. 😅
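
Diagnosing is half the battle; you can also recalibrate. One common recipe is sklearn's CalibratedClassifierCV, which fits Platt scaling (method='sigmoid') or isotonic regression on cross-validation folds. A minimal sketch on a synthetic imbalanced dataset (the dataset sizes and model settings below are illustrative, not a recommendation):

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.calibration import CalibratedClassifierCV
from sklearn.metrics import brier_score_loss

# Synthetic dataset with ~5% positives
X, y = make_classification(n_samples=5000, weights=[0.95, 0.05], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=42)

# Uncalibrated model
raw = RandomForestClassifier(n_estimators=200, random_state=42).fit(X_train, y_train)

# Same model wrapped in cross-validated calibration
calibrated = CalibratedClassifierCV(
    RandomForestClassifier(n_estimators=200, random_state=42),
    method='sigmoid',
    cv=5,
).fit(X_train, y_train)

# Brier score = mean squared error of the predicted probabilities (lower is better)
print("Raw        :", brier_score_loss(y_test, raw.predict_proba(X_test)[:, 1]))
print("Calibrated :", brier_score_loss(y_test, calibrated.predict_proba(X_test)[:, 1]))

Rule of thumb: 'sigmoid' is the safer choice on small or heavily imbalanced data, while 'isotonic' is more flexible but needs plenty of samples to avoid overfitting.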


🧩 Practice Challenge#

  1. Create a highly imbalanced dataset using make_classification (see the starter sketch after this list).

  2. Train two models: one with and one without class_weight='balanced'.

  3. Compare precision, recall, and F1.

  4. Plot a calibration curve.

  5. Write a “CEO report” explaining why 95% accuracy is misleading — bonus points for sarcasm.
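
To get you rolling on step 1, here is one possible starting point (the sizes and weights below are arbitrary choices, so tune them to taste):

import numpy as np
from sklearn.datasets import make_classification

# ~2% positives: rare enough to make accuracy look heroic while being useless
X, y = make_classification(
    n_samples=10_000,
    n_features=20,
    weights=[0.98, 0.02],
    random_state=0,
)

print("Class counts:", np.bincount(y))                      # roughly [9800, 200]
print("Accuracy of always predicting 0:", (y == 0).mean())  # roughly 0.98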


🧮 Quick Recap#

| Concept | What It Means |
|---|---|
| Imbalanced Data | One class dominates the other |
| Resampling | Adjust class ratios in the training data |
| Class Weights | Make mistakes on the minority class cost more |
| Calibration | Align predicted probabilities with real outcomes |
| Better Metrics | Precision, Recall, F1 over Accuracy |


💬 “A model without calibration is like a confident intern — always sure, sometimes right, never humble.” 😎


🔗 Next Up: Lab – Churn Prediction. Time to build your own churn model and see if you can spot the quitters before they quit! 🧠💼
