Lab – Customer Segmentation#
Welcome to the Customer Segmentation Lab — where we use KNN to group similar customers faster than a marketing intern sorting spreadsheets on caffeine. ☕💻
In this lab, you’ll:
Use distance-based similarity to find customer groups
Compare different K values
Visualize segment boundaries
Interpret business insights
Let’s turn data into marketing magic. 🪄📈
🧰 Setup#
You can run this notebook directly in:
🧠 JupyterLite (Run above)
🧩 Google Colab
🪄 Step 1: Load the Data#
Let’s create a mock dataset of customers, each with a known segment label (KNN learns from labelled examples), described by two features:
Annual Income
Spending Score
Yes, it’s inspired by the famous “Mall Customers” dataset — because malls and marketing never go out of style. 🛍️
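A minimal sketch of such a mock dataset, using only the standard library. The segment centers, the spread, and the field names `Income`, `SpendingScore`, and `Segment` are illustrative assumptions, not values from the real Mall Customers data.

```python
import random

# Hypothetical segment centers: (annual income in k$, spending score 1-100).
# These numbers are made up for illustration.
SEGMENT_CENTERS = {
    0: (25.0, 20.0),   # low income, low spending
    1: (90.0, 80.0),   # high income, high spending
    2: (90.0, 20.0),   # high income, low spending
    3: (25.0, 80.0),   # low income, high spending
}

def make_customers(n_per_segment=50, spread=8.0, seed=42):
    """Return a shuffled list of dicts with Income, SpendingScore, and Segment."""
    rng = random.Random(seed)
    customers = []
    for segment, (income_mu, score_mu) in SEGMENT_CENTERS.items():
        for _ in range(n_per_segment):
            customers.append({
                # Clamp values so they stay in a plausible range.
                "Income": max(5.0, rng.gauss(income_mu, spread)),
                "SpendingScore": min(100.0, max(1.0, rng.gauss(score_mu, spread))),
                "Segment": segment,
            })
    rng.shuffle(customers)
    return customers

customers = make_customers()
print(len(customers), "customers; first row:", customers[0])
```

Fixing the seed keeps the dataset identical between runs, which makes the later steps easier to compare.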
🧼 Step 2: Clean & Scale Data#
Scaling matters here — otherwise “Income” might bully “SpendingScore” in distance calculations. 💰💪
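To see the bullying in action, here is a small sketch that z-scores each column by hand and compares distances before and after. The six customers are hypothetical, and income is given in dollars on purpose so that its range dwarfs the 1-100 spending score.

```python
import math
import statistics

# Hypothetical customers: (income in dollars, spending score 1-100).
rows = [(15000, 39), (16000, 81), (54000, 47), (60000, 52), (120000, 18), (126000, 79)]

def standardize(rows):
    """Z-score each column: subtract its mean, divide by its population std dev."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    stdevs = [statistics.pstdev(col) for col in columns]
    return [tuple((v - m) / s for v, m, s in zip(row, means, stdevs)) for row in rows]

scaled = standardize(rows)

# Customers 0 and 1 earn about the same but spend very differently;
# customers 2 and 3 are fairly similar on both features.
print("raw    d(0,1) =", round(math.dist(rows[0], rows[1]), 1),
      " d(2,3) =", round(math.dist(rows[2], rows[3]), 1))
print("scaled d(0,1) =", round(math.dist(scaled[0], scaled[1]), 2),
      " d(2,3) =", round(math.dist(scaled[2], scaled[3]), 2))
```

On the raw values, customers 0 and 1 look like the closer pair because only the income gap counts. After scaling, customers 2 and 3 become the closer pair, since the large difference in spending between 0 and 1 finally gets a say.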
🧮 Step 3: Train KNN#
Let’s start simple with K=5. For each new customer, the model finds the 5 closest customers in the training data and assigns the segment that most of them belong to.
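The K=5 vote can be sketched in a few lines of plain Python. This is a hand-rolled illustration rather than a library implementation, and the customers, segment centers, and 80/20 split are assumptions made for the example.

```python
import math
import random
from collections import Counter

def knn_predict(train_X, train_y, query, k=5):
    """Return the most common label among the k training points nearest to query."""
    by_distance = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical labelled customers in already-scaled units (income, spending).
rng = random.Random(0)
centers = {0: (-1.0, -1.0), 1: (1.0, 1.0), 2: (1.0, -1.0)}
data = [((rng.gauss(cx, 0.3), rng.gauss(cy, 0.3)), seg)
        for seg, (cx, cy) in centers.items() for _ in range(40)]
rng.shuffle(data)

split = int(0.8 * len(data))           # 80% train, 20% held out
train, test = data[:split], data[split:]
train_X, train_y = zip(*train)

new_customer = (0.9, 1.1)              # high income, high spending
print("predicted segment:", knn_predict(train_X, train_y, new_customer, k=5))

accuracy = sum(knn_predict(train_X, train_y, x, k=5) == y for x, y in test) / len(test)
print("held-out accuracy:", round(accuracy, 2))
```

There is no training step beyond storing the data: all the work happens at prediction time, when distances to every stored customer are computed and sorted.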
🎨 Step 4: Visualize the Segmentation#
Let’s see how KNN “draws” its decision boundaries — like a business strategist armed with crayons.
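A dependency-free sketch of the idea: classify every point on a grid and print the predicted segment as a character map. A plotting library would normally color the grid instead; the data, centers, and grid size here are assumptions chosen to keep the example small.

```python
import math
import random
from collections import Counter

rng = random.Random(1)
# Hypothetical segment centers in already-scaled units (income, spending).
centers = {0: (-1.0, -1.0), 1: (1.0, 1.0), 2: (1.0, -1.0)}
points = [((rng.gauss(cx, 0.4), rng.gauss(cy, 0.4)), seg)
          for seg, (cx, cy) in centers.items() for _ in range(30)]

def knn_predict(query, k=5):
    """Majority segment among the k stored customers nearest to query."""
    nearest = sorted(points, key=lambda p: math.dist(p[0], query))[:k]
    return Counter(seg for _, seg in nearest).most_common(1)[0][0]

def region_map(steps=21, lo=-2.0, hi=2.0):
    """Classify each cell of a steps x steps grid; rows run from top to bottom."""
    axis = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    return ["".join(str(knn_predict((x, y))) for x in axis) for y in reversed(axis)]

grid = region_map()
# x axis: scaled income (left to right); y axis: scaled spending (bottom to top).
print("\n".join(grid))
```

Each digit is the segment KNN would assign to a customer at that position, so the places where the digits change trace the decision boundaries.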
💡 You should see regions showing different segments of customers: our KNN just drew a segmentation map based on who earns and spends how much. 💳✨
🧪 Step 5: Tuning K (aka “How many friends to trust?”)#
The number of neighbors K controls how smooth or chaotic your decision boundary becomes.
Let’s test different K values.
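One way to run the comparison is to score several K values on held-out customers. The sketch below uses made-up, deliberately overlapping segments so that the choice of K actually matters; the centers, sizes, and list of K values are assumptions.

```python
import math
import random
from collections import Counter

rng = random.Random(7)
# Hypothetical, deliberately overlapping segments in scaled units.
centers = {0: (-1.0, -1.0), 1: (1.0, 1.0), 2: (1.0, -1.0)}

def sample(sizes, spread=0.8):
    """Draw (point, segment) pairs; sizes maps segment -> number of customers."""
    return [((rng.gauss(cx, spread), rng.gauss(cy, spread)), seg)
            for seg, (cx, cy) in centers.items() for _ in range(sizes[seg])]

train = sample({0: 50, 1: 40, 2: 30})
test = sample({0: 20, 1: 20, 2: 20})

def knn_predict(query, k):
    """Majority segment among the k training customers nearest to query."""
    nearest = sorted(train, key=lambda p: math.dist(p[0], query))[:k]
    return Counter(seg for _, seg in nearest).most_common(1)[0][0]

def accuracy(k):
    """Fraction of held-out customers assigned to their true segment."""
    return sum(knn_predict(x, k) == seg for x, seg in test) / len(test)

# The last K equals the training-set size: every vote includes everyone.
results = {k: accuracy(k) for k in (1, 3, 5, 15, 45, len(train))}
for k, acc in results.items():
    print(f"K={k:>3}  test accuracy={acc:.2f}")
```

The extreme case is instructive: when K equals the size of the training set, every prediction is simply the most common training segment, so the model ignores where the customer actually sits.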
🧠 Try interpreting:
Low K: Very reactive, may overfit (believes the nearest gossip).
High K: Too smooth, may underfit (trusts everyone too much).
Choose your K wisely — business strategy meets social dynamics.
💼 Step 6: Business Interpretation#
Now, how does this matter in real life?
| Segment | Description | Example Business Use |
|---|---|---|
| 0 | Low income / low spenders | Offer discount coupons or loyalty rewards |
| 1 | High income / high spenders | Upsell luxury products or premium memberships |
| 2+ | Other combinations | Personalized cross-sells |
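Descriptions like the ones in the table can be derived by profiling each segment. A small sketch, assuming a handful of hypothetical customers that already carry a predicted segment:

```python
from collections import defaultdict
from statistics import fmean

# Hypothetical customers: (income in k$, spending score, predicted segment).
customers = [
    (18, 12, 0), (24, 25, 0), (30, 18, 0),
    (88, 82, 1), (95, 90, 1), (102, 76, 1),
    (92, 15, 2), (85, 22, 2),
]

def profile_segments(rows):
    """Return {segment: (count, mean income, mean spending score)}."""
    grouped = defaultdict(list)
    for income, score, segment in rows:
        grouped[segment].append((income, score))
    return {
        seg: (len(members), fmean(i for i, _ in members), fmean(s for _, s in members))
        for seg, members in sorted(grouped.items())
    }

profile = profile_segments(customers)
for seg, (count, income, score) in profile.items():
    print(f"segment {seg}: n={count}, mean income={income:.1f}k, mean score={score:.1f}")
```

Reading the averages side by side is what turns a segment number into a label such as "high income / high spenders".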
✨ You just did customer segmentation using distance-based reasoning. The same nearest-neighbor idea shows up in recommender systems, marketing analytics, and churn prediction!
🧩 TL;DR Summary#
| Step | Concept | Business Angle |
|---|---|---|
| Data Prep | Scaling & splitting | Make sure income ≠ everything |
| Model | KNN | Lazy learner, smart pattern finder |
| K Tuning | Finding best K | Trust right number of neighbors |
| Visualization | Boundaries | Understand customer clusters |
| Insight | Segments | Strategy-ready grouping |
“KNN doesn’t predict with equations — it predicts with empathy.” 💬 Similar people, similar outcomes.
⏭️ Next Chapter: Unsupervised Learning – Clustering & Dimensionality Reduction
We’ll stop asking for labels altogether and let the data find its own tribe. 🧭