Welcome to the wild west of machine learning — where there are no labels, no supervision, and your model just… vibes with the data 🎶

Here, we don’t tell the algorithm what’s “right” or “wrong.” We just hand it a bunch of unlabeled points and say:

“Figure out who hangs out with who.”


🧩 What’s This Chapter About?

In this section, you’ll explore how to:

  • Discover patterns when no target variable exists

  • Reduce high-dimensional chaos to beautiful 2D visuals

  • Group similar customers like a marketing guru with a spreadsheet addiction

We’ll go from:

  1. PCA — the “Marie Kondo” of ML, helping your features declutter their lives 🧺

  2. K-Means — assigning each data point a squad to belong to 💁‍♀️

  3. GMM — K-Means’ artsy cousin that prefers probabilities over hard decisions 🎨

  4. t-SNE & UMAP — for visualizing high-dimensional data so coolly that you’ll want to frame it.

  5. Lab: Customer Segmentation — because marketing loves unsupervised chaos.


🤖 Why It Matters

Not everything in business has labels:

  • You don’t always know who your “high-value” customers are 💸

  • You might not know which products belong together 🛒

  • And your dataset might have 200+ features screaming for attention 😩

That’s where unsupervised learning comes to the rescue — it finds structure, relationships, and patterns without ever asking for help.
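To make "finds structure without asking for help" a bit more concrete, here is a minimal sketch using scikit-learn's KMeans. The data is synthetic and the two feature names (monthly spend, visits) are assumptions made up for illustration, not a dataset from this chapter. The thing to notice is that no labels are ever passed in.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic, unlabeled data (an assumption for illustration):
# two loose groups of 2D points, e.g. (monthly spend, visits).
rng = np.random.default_rng(0)
group_a = rng.normal(loc=[20.0, 2.0], scale=1.0, size=(50, 2))
group_b = rng.normal(loc=[80.0, 10.0], scale=1.0, size=(50, 2))
X = np.vstack([group_a, group_b])

# No target variable is passed: fit_predict only ever sees X.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

# Number of points assigned to each cluster.
print(np.bincount(labels))
```

The cluster ids (0 and 1) are arbitrary names the algorithm picks; what matters is which points end up together. Choosing n_clusters=2 is also an assumption here, and picking that number well is a topic of its own.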


🐍 Python Heads-Up

You’ll soon meet sklearn.decomposition (PCA), sklearn.cluster (K-Means), sklearn.mixture (GMM), sklearn.manifold (t-SNE), and the umap-learn package (imported as umap), all of which love throwing parties for your data in fewer dimensions 🎉

If Python feels rusty, warm up with 👉 Programming for Business
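As a small taste of what these imports look like in practice, here is a hedged sketch of PCA from sklearn.decomposition squeezing 10 features down to 2. The data is randomly generated, and names such as hidden and mixing are hypothetical helpers for this example only; the later sections may set things up differently.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical data: 200 rows, 10 features, where most of the variation
# really lives along 2 hidden directions, plus a little noise.
rng = np.random.default_rng(42)
hidden = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 10))
X = hidden @ mixing + 0.05 * rng.normal(size=(200, 10))

# Keep only the 2 directions that capture the most variance.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

print(X_2d.shape)  # (200, 2)
# Fraction of the total variance retained by the 2 components.
print(pca.explained_variance_ratio_.sum())
```

Because this toy data was built from only two underlying directions, two components keep almost all of the variance. Real business data is rarely that tidy, so treat the retained-variance number as something to check rather than assume.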


🧭 What’s Coming Up

| Section | What You’ll Learn |
| --- | --- |
| PCA | Dimensionality reduction: “compress your data without losing its soul.” |
| K-Means | Grouping similar points (and pretending it’s objective). |
| GMM | Probabilistic clustering with fancy math and soft edges. |
| t-SNE & UMAP | Making data visualization look like digital art. |
| Lab | Customer segmentation for marketing insights. |

🎓 Key Takeaway

Supervised learning asks “What’s the answer?” Unsupervised learning asks “What’s the question?” 🤔

It’s the data science equivalent of philosophy — except with less existential dread and more scatter plots. 🧠📊


Next up: Let’s start cleaning the high-dimensional mess with PCA – The Feature Therapist 🛋️
