Lab – Customer Segmentation#


Welcome to the Customer Segmentation Lab — where we use KNN to group similar customers faster than a marketing intern sorting spreadsheets on caffeine. ☕💻

In this lab, you’ll:

  • Use distance-based similarity to find customer groups

  • Compare different K values

  • Visualize segment boundaries

  • Interpret business insights

Let’s turn data into marketing magic. 🪄📈


🧰 Setup#

You can run this notebook directly in your browser (the page loads Pyodide), or copy the cells into any local Jupyter environment.
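A minimal set of imports used throughout this lab (assuming NumPy, pandas, Matplotlib, and scikit-learn are available, as they are in most Pyodide and Jupyter setups):

```python
# Core libraries used throughout the lab
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score
```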


🪄 Step 1: Load the Data#

Let’s create a mock dataset of customers based on:

  • Annual Income

  • Spending Score

Yes, it’s inspired by the famous “Mall Customers” dataset — because malls and marketing never go out of style. 🛍️
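One way to sketch such a dataset is to generate a few synthetic customer clusters and treat the cluster IDs as known segment labels. The centers, spreads, and column names below are illustrative assumptions, not values from the original lab:

```python
from sklearn.datasets import make_blobs

# ~300 synthetic customers drawn around five segment centers (assumed values).
# Columns: AnnualIncome (k$) and SpendingScore (1-100).
X_raw, segments = make_blobs(
    n_samples=300,
    centers=[[25, 20], [25, 80], [55, 50], [85, 20], [85, 80]],
    cluster_std=8.0,
    random_state=42,
)

customers = pd.DataFrame(X_raw, columns=["AnnualIncome", "SpendingScore"])
customers["Segment"] = segments  # segment labels KNN will learn to recognize

print(customers.head())
```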



🧼 Step 2: Clean & Scale Data#

Scaling matters here — otherwise “Income” might bully “SpendingScore” in distance calculations. 💰💪
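A sketch of the cleaning and scaling step, assuming the `customers` DataFrame from Step 1:

```python
# Separate features from segment labels
X = customers[["AnnualIncome", "SpendingScore"]].values
y = customers["Segment"].values

# Hold out a test set so we can sanity-check the model later
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Standardize both features so distances are not dominated by whichever
# feature happens to have the larger numeric range
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
```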


🧮 Step 3: Train KNN#

Let’s start simple — K=5. Our KNN model will look at the 5 closest labeled customers to decide which segment each new customer belongs to.
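A minimal training sketch, assuming the scaled features and segment labels from Step 2:

```python
# K = 5: each prediction is a majority vote among the 5 nearest (scaled) customers
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train_scaled, y_train)

# Quick sanity check on the held-out customers
y_pred = knn.predict(X_test_scaled)
print(f"Test accuracy: {accuracy_score(y_test, y_pred):.2f}")
```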


🎨 Step 4: Visualize the Segmentation#

Let’s see how KNN “draws” its decision boundaries — like a business strategist armed with crayons.
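One common way to draw those boundaries is to classify every point on a fine grid over the scaled feature space and color the result; the plotting details below are one possible sketch:

```python
# Classify a dense grid of points to reveal the decision regions
h = 0.05  # grid step in scaled units
x_min, x_max = X_train_scaled[:, 0].min() - 1, X_train_scaled[:, 0].max() + 1
y_min, y_max = X_train_scaled[:, 1].min() - 1, X_train_scaled[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))

Z = knn.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

plt.figure(figsize=(8, 6))
plt.contourf(xx, yy, Z, alpha=0.3, cmap="viridis")           # segment regions
plt.scatter(X_train_scaled[:, 0], X_train_scaled[:, 1],
            c=y_train, cmap="viridis", edgecolor="k", s=30)  # training customers
plt.xlabel("Annual Income (scaled)")
plt.ylabel("Spending Score (scaled)")
plt.title("KNN customer segments (K=5)")
plt.show()
```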

💡 You should see regions showing different segments of customers — our KNN just made a segmentation strategy based on who spends how much. 💳✨


🧪 Step 5: Tuning K (aka “How many friends to trust?”)#

The number of neighbors K controls how smooth or chaotic your decision boundary becomes.

Let’s test different K values.
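A simple sketch of the comparison, reusing the train/test split from Step 2 and scoring one model per K:

```python
# Fit one KNN model per K and track test accuracy
k_values = list(range(1, 31))
scores = []
for k in k_values:
    model = KNeighborsClassifier(n_neighbors=k)
    model.fit(X_train_scaled, y_train)
    scores.append(model.score(X_test_scaled, y_test))

plt.figure(figsize=(8, 4))
plt.plot(k_values, scores, marker="o")
plt.xlabel("K (number of neighbors)")
plt.ylabel("Test accuracy")
plt.title("Choosing K")
plt.show()

best_k = k_values[int(np.argmax(scores))]
print(f"Best K on this split: {best_k}")
```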

🧠 Try interpreting:

  • Low K: Very reactive, may overfit (believes the nearest gossip).

  • High K: Too smooth, may underfit (trusts everyone too much).

Choose your K wisely — business strategy meets social dynamics.


💼 Step 6: Business Interpretation#

Now, how does this matter in real life?

| Segment | Description | Example Business Use |
|---------|-------------|----------------------|
| 0 | Low income / low spenders | Offer discount coupons or loyalty rewards |
| 1 | High income / high spenders | Upsell luxury products or premium memberships |
| 2+ | Other combinations | Personalized cross-sells |
✨ You just did customer segmentation using distance-based reasoning — the heart of recommender systems, marketing analytics, and churn prediction!


🧩 TL;DR Summary#

| Step | Concept | Business Angle |
|------|---------|----------------|
| Data Prep | Scaling & splitting | Make sure income ≠ everything |
| Model | KNN | Lazy learner, smart pattern finder |
| K Tuning | Finding best K | Trust the right number of neighbors |
| Visualization | Boundaries | Understand customer clusters |
| Insight | Segments | Strategy-ready grouping |


“KNN doesn’t predict with equations — it predicts with empathy.” 💬 Similar people, similar outcomes.


⏭️ Next Chapter: Unsupervised Learning – Clustering & Dimensionality Reduction. We’ll stop asking for labels altogether — and let the data find its own tribe. 🧭
