
Welcome to the crime scene of machine learning, where we don’t just predict numbers — we predict categories: who buys, who churns, who clicks, and who ghosted your email campaign 👻.

This is where probability meets persuasion — and your model learns to choose sides.


🧠 What You’ll Learn

In this chapter, you’ll meet the two great clans of classifiers:

| Clan | Motto | Members |
| --- | --- | --- |
| Probabilistic Models | “Everything is probability.” | Naive Bayes |
| Discriminative Models | “Just tell me where the line is.” | Logistic Regression |

You’ll master:

  • 🧮 Logistic Regression — when you want both simplicity and interpretability.

  • 🎲 Naive Bayes — when you believe features are “innocent until correlated.”

  • ⚖️ Calibration & Class Imbalance — when your model’s overconfident or data’s unfair.

  • 🧪 Churn Prediction Lab — when you finally turn your math into business money 💸.
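To see the two clans side by side, here is a minimal sketch (on hypothetical toy data) that trains one member of each: a discriminative Logistic Regression and a probabilistic Gaussian Naive Bayes, then asks both to classify the same new point.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB

# Hypothetical toy data: two features, binary label
X = np.array([[1.0, 0.5], [2.0, 1.0], [3.0, 1.5], [4.0, 3.5], [5.0, 4.0]])
y = np.array([0, 0, 0, 1, 1])

clf_disc = LogisticRegression().fit(X, y)  # discriminative clan: learns the boundary
clf_prob = GaussianNB().fit(X, y)          # probabilistic clan: models each class

new_point = [[2.5, 1.2]]
print("Logistic Regression says:", clf_disc.predict(new_point))
print("Naive Bayes says:", clf_prob.predict(new_point))
```

On cleanly separated data like this, both clans usually agree; they start to disagree when features are correlated, which is exactly what the Naive Bayes section digs into.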


💬 Business Translation

| ML Concept | Business Analogy |
| --- | --- |
| Logistic Regression | A polite salesperson who says, “There’s a 70% chance this customer will buy.” |
| Naive Bayes | A manager who assumes all employees work independently (spoiler: they don’t). |
| Class Imbalance | Your sales team chasing 1 big client while ignoring 99 small ones. |
| Calibration | Making sure your confidence matches reality — no fake bravado here. 😎 |

📈 Why Classification Matters in Business

  • Marketing: Predict which leads will convert.

  • Finance: Detect fraudulent transactions.

  • Operations: Flag products likely to fail.

  • Customer Success: Identify churn risks early.

In short, classification turns “gut feeling” into data-backed decisions.


⚙️ The Math Behind the Magic

A classifier predicts:

$$P(y = 1 \mid x) = \sigma(w^T x + b)$$

where:

  • $\sigma$ is the sigmoid — it squashes any real number into a probability (the 0–1 range).

  • $w$ and $b$ are parameters learned from the data.

When $P(y = 1 \mid x) > 0.5$ → Class 1 (Yes); otherwise → Class 0 (No).

Simple, elegant, and endlessly useful.
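You can compute that formula by hand in a few lines. This sketch assumes made-up parameter values for $w$ and $b$ (in practice they are learned, as in the lab below) and just walks the sigmoid-then-threshold pipeline:

```python
import numpy as np

def sigmoid(z):
    # squashes any real number into the open interval (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical "learned" parameters, chosen only for illustration
w, b = 1.2, -3.0

for x in [1.0, 2.5, 4.0]:
    p = sigmoid(w * x + b)          # P(y = 1 | x)
    label = 1 if p > 0.5 else 0     # threshold at 0.5
    print(f"x={x}: P(y=1|x)={p:.2f} -> class {label}")
```

Note that $\sigma(0) = 0.5$ exactly, so the decision boundary sits wherever $w^T x + b = 0$.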


🧩 Practice Starter: Mini Experiment

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy data: one feature, binary label
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([0, 0, 0, 1, 1])

model = LogisticRegression()
model.fit(X, y)

print("Predicted probabilities:", model.predict_proba([[2.5], [4.5]]))
```

You’ll get one row per input, each holding [P(class 0), P(class 1)] — something like [0.3, 0.7]: a classifier saying, “I’m 70% sure this customer is leaving, maybe send them a discount?” 💌
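Those probabilities are more useful than hard yes/no labels, because the business gets to pick the threshold. A minimal sketch, reusing the toy model above, with a hypothetical lower threshold of 0.3 for a team that would rather over-contact customers than miss a churner:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Same toy data as the mini experiment
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([0, 0, 0, 1, 1])

model = LogisticRegression().fit(X, y)
probs = model.predict_proba(X)[:, 1]  # P(churn) for each customer

default_flags = probs > 0.5   # the standard cutoff
eager_flags = probs > 0.3     # hypothetical business-chosen cutoff

print("P(churn):", np.round(probs, 2))
print("Flagged at 0.5:", default_flags.sum(), "| at 0.3:", eager_flags.sum())
```

Lowering the threshold can only flag more (never fewer) customers — the trade-off between the two error types is exactly what the Calibration & Class Imbalance section covers.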


🎓 Learning Map

| Section | Topic | Vibe |
| --- | --- | --- |
| Logistic Regression | Linear boundaries meet probability. | 📈 Elegant math & smooth curves |
| Naive Bayes | Probabilistic reasoning with independence assumptions. | 🎲 Old-school Bayesian cool |
| Calibration & Class Imbalance | How to make fair and honest models. | ⚖️ Responsible ML |
| Lab – Churn Prediction | Predict who’s about to cancel. | 💸 Real business impact |

💬 “Regression predicts numbers. Classification predicts destiny.” 🌌


👉 Next Up: Logistic Regression. Let’s start with the most elegant liar in ML — it gives you probabilities that sound confident, but only sometimes tell the truth 😅.
