Lab – Market Basket & Recommendations#

“Because your data deserves a shopping spree.” 🛒


🎯 Objective#

Build your own Market Basket Recommender — a system that tells customers:

“You bought X, you might also like Y (and probably don’t need Z… but who’s stopping you?)”

We’ll mix:

  • Collaborative Filtering (people like you bought…)

  • Content-Based Filtering (items similar to this…)

  • Association Rules (if-this-then-that magic…)


🧠 Setup#

Fire up your Jupyter Notebook and import the usual suspects:

import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules
from sklearn.metrics.pairwise import cosine_similarity

🧺 Step 1 – Create Your Mini Store#

Let’s make a pretend e-commerce dataset.

data = {
    'CustomerID': [1, 1, 2, 2, 3, 3, 4, 5],
    'Item': [
        'Laptop', 'Mouse',
        'Phone', 'Headphones',
        'Milk', 'Bread',
        'Milk',
        'Laptop'
    ]
}

df = pd.DataFrame(data)
df

| CustomerID | Item       |
|------------|------------|
| 1          | Laptop     |
| 1          | Mouse      |
| 2          | Phone      |
| 2          | Headphones |
| 3          | Milk       |
| 3          | Bread      |
| 4          | Milk       |
| 5          | Laptop     |

Nice — a perfect mix of tech geeks and breakfast lovers.


🧮 Step 2 – Association Rules Magic#

# One-hot encode items per customer, then collapse to a True/False basket matrix
# (boolean values keep apriori happy)
basket = pd.get_dummies(df.set_index('CustomerID')['Item']).groupby(level=0).sum().astype(bool)

# Mine frequent itemsets and turn them into if-this-then-that rules
frequent_items = apriori(basket, min_support=0.2, use_colnames=True)
rules = association_rules(frequent_items, metric='lift', min_threshold=1)
rules[['antecedents', 'consequents', 'support', 'confidence', 'lift']]

If your rule shows {Milk} → {Bread}, congratulations — your grocery recommender now understands carbohydrates 🍞🥛
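
Want to surface only the strongest rules? A quick sort does the trick — a tiny sketch that just reuses the rules DataFrame from above:

# Rank rules by lift, then confidence, and peek at the top few
top_rules = rules.sort_values(['lift', 'confidence'], ascending=False)
top_rules[['antecedents', 'consequents', 'confidence', 'lift']].head(5)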


🤝 Step 3 – Collaborative Filtering (Mini Edition)#

Let’s simulate user-item preferences:

import numpy as np

# Item × user rating matrix (NaN = not yet rated)
ratings = pd.DataFrame({
    'User1': [5, 4, np.nan, 3],
    'User2': [4, np.nan, 5, 2],
    'User3': [np.nan, 4, 4, np.nan]
}, index=['Laptop', 'Mouse', 'Phone', 'Headphones'])

# Item-item cosine similarity (missing ratings treated as 0)
similarity = pd.DataFrame(
    cosine_similarity(ratings.fillna(0)),
    index=ratings.index, columns=ratings.index
)

similarity

See which products are “best buddies.” If Laptop and Mouse have a high similarity score → your recommender nods wisely. 🧠💻🐭
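
To turn that matrix into actual suggestions, look up an item’s nearest neighbours. A minimal sketch — recommend_similar is just an illustrative helper built on the similarity DataFrame above:

def recommend_similar(item, n=2):
    # Rank the other items by cosine similarity to the one given
    return similarity[item].drop(item).sort_values(ascending=False).head(n)

recommend_similar('Laptop')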


🧩 Step 4 – Content-Based Filtering (Optional Spice)#

You can also compare product features instead of ratings:

features = pd.DataFrame({
    'Item': ['Laptop', 'Mouse', 'Phone', 'Headphones'],
    'Category': ['Electronics', 'Electronics', 'Electronics', 'Electronics'],
    'Wireless': [0, 1, 1, 1]
})

Compute similarities between feature vectors to recommend similar items. Your model now says:

“If you like wireless headphones, you’ll love wireless regret when they run out of battery.” 🔋😅
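
In code, that might look like this — a minimal sketch assuming that one-hot encoding the Category column plus the Wireless flag is a good-enough feature vector:

# Numeric feature vectors: one-hot Category + the Wireless flag
feature_matrix = pd.get_dummies(features.set_index('Item'), columns=['Category'], dtype=int)

content_similarity = pd.DataFrame(
    cosine_similarity(feature_matrix),
    index=feature_matrix.index, columns=feature_matrix.index
)
content_similarity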


💡 Step 5 – Combine the Insights#

Fuse the power of:

  • Collaborative Filtering: What similar users bought

  • Content-Based: What similar items exist

  • Association Rules: What items co-occur frequently

🎯 Business logic:

  1. Recommend from collaborative filtering first.

  2. Fill gaps using content-based similarity.

  3. Add “bonus” suggestions from association rules.

Boom — your Hybrid Recommender is alive! 🤖💞
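
Here’s a minimal sketch of that priority order. It assumes the similarity, rules, and (optional) content_similarity objects from the earlier steps — hybrid_recommend is purely illustrative:

def hybrid_recommend(item, n=3):
    # 1. Collaborative filtering: items users rate similarly
    recs = list(similarity[item].drop(item).sort_values(ascending=False).head(n).index)

    # 2. Content-based: top up with feature-similar items not already suggested
    if len(recs) < n:
        content = content_similarity[item].drop(item).sort_values(ascending=False)
        recs += [i for i in content.index if i not in recs][:n - len(recs)]

    # 3. Association rules: tack on "bonus" co-purchase suggestions
    bonus = rules.loc[rules['antecedents'].apply(lambda a: item in a), 'consequents']
    recs += [next(iter(c)) for c in bonus if next(iter(c)) not in recs]
    return recs

hybrid_recommend('Laptop')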


📊 Step 6 – Evaluate (a.k.a. “Does It Even Work?”)#

You can track:

  • Precision@K – how many suggested items were actually bought

  • Coverage – % of items that appear in recommendations

  • Business KPI Impact – conversion rate, AOV (Average Order Value), or repeat purchase rate
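
Precision@K is the easiest one to hand-roll. A minimal sketch with made-up inputs (swap in your real recommendations and purchases):

def precision_at_k(recommended, purchased, k=3):
    # Fraction of the top-k recommendations the customer actually bought
    hits = sum(1 for item in recommended[:k] if item in purchased)
    return hits / k

precision_at_k(['Mouse', 'Headphones', 'Bread'], {'Mouse', 'Milk'})  # 1 hit out of 3 ≈ 0.33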


🏪 Business Scenario#

Context: You’re an analyst for an online retailer. Your boss wants a dashboard that shows:

  • Top 10 frequent item pairs

  • Personalized recommendations per user

  • Average basket size increase post-recommendation

Deliver it with a smile (and maybe a PowerPoint). Suddenly you’re the office “AI wizard.” 🧙‍♂️
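
For the first bullet, the frequent_items DataFrame from Step 2 already has what you need — a quick sketch (on the toy data you’ll get far fewer than 10 pairs, obviously):

# Keep only 2-item sets and rank them by support
pairs = frequent_items[frequent_items['itemsets'].apply(len) == 2]
pairs.sort_values('support', ascending=False).head(10)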


🧍 Real-World Example#

| Platform | Technique Used                       | Example                      |
|----------|--------------------------------------|------------------------------|
| Amazon   | Hybrid (Collaborative + Association) | “Frequently Bought Together” |
| Netflix  | Collaborative Filtering              | “Because you watched…”       |
| Spotify  | Content-Based                        | “More songs like this”       |
| Walmart  | Association Rules                    | Diapers → Beer 🍼🍺          |


🐍 Python Heads-Up#

If you’re just getting started with pandas, data cleaning, or loops, warm up with 👉 Programming for Business

You’ll thank yourself when your code doesn’t scream KeyError: 'CustomerID'. 😭


🧠 TL;DR#

  • Combine Collaborative, Content-Based, and Association Rules for a hybrid system.

  • Use mlxtend for mining frequent patterns.

  • Use sklearn.metrics.pairwise for similarities.

  • Think like a marketer, code like a data scientist.


🏁 Final Thought#

A great recommender doesn’t just predict what customers want — it gently whispers:

“You deserve this… and also maybe two more.” 😏

Now go forth and make shopping addictive — ethically! 🛍️💡

# Your code here