Lab – Market Basket & Recommendations#

“Because your data deserves a shopping spree.” 🛒


🎯 Objective#

Build your own Market Basket Recommender — a system that tells customers:

“You bought X, you might also like Y (and probably don’t need Z… but who’s stopping you?)”

We’ll mix:

  • Collaborative Filtering (people like you bought…)

  • Content-Based Filtering (items similar to this…)

  • Association Rules (if-this-then-that magic…)


🧠 Setup#

Fire up your Jupyter Notebook and import the usual suspects:

import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules
from sklearn.metrics.pairwise import cosine_similarity

🧺 Step 1 – Create Your Mini Store#

Let’s make a pretend e-commerce dataset.

data = {
    'CustomerID': [1, 1, 2, 2, 3, 3, 4, 5],
    'Item': [
        'Laptop', 'Mouse',
        'Phone', 'Headphones',
        'Milk', 'Bread',
        'Milk',
        'Laptop'
    ]
}

df = pd.DataFrame(data)
df

| CustomerID | Item       |
|------------|------------|
| 1          | Laptop     |
| 1          | Mouse      |
| 2          | Phone      |
| 2          | Headphones |
| 3          | Milk       |
| 3          | Bread      |
| 4          | Milk       |
| 5          | Laptop     |

Nice — a perfect mix of tech geeks and breakfast lovers.


🧮 Step 2 – Association Rules Magic#

# One-hot encode items per customer, then collapse to a True/False basket matrix
# (boolean values keep apriori happy)
basket = pd.get_dummies(df.set_index('CustomerID')['Item']).groupby(level=0).sum().astype(bool)

# Mine frequent itemsets and turn them into if-this-then-that rules
frequent_items = apriori(basket, min_support=0.2, use_colnames=True)
rules = association_rules(frequent_items, metric='lift', min_threshold=1)
rules[['antecedents', 'consequents', 'support', 'confidence', 'lift']]

If your rule shows {Milk} → {Bread}, congratulations — your grocery recommender now understands carbohydrates 🍞🥛
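
Want to surface only the strongest rules? A quick sort does the trick — a tiny sketch that just reuses the rules DataFrame from above:

# Rank rules by lift, then confidence, and peek at the top few
top_rules = rules.sort_values(['lift', 'confidence'], ascending=False)
top_rules[['antecedents', 'consequents', 'confidence', 'lift']].head(5)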


🤝 Step 3 – Collaborative Filtering (Mini Edition)#

Let’s simulate user-item preferences:

import numpy as np

# Item × user rating matrix (NaN = not yet rated)
ratings = pd.DataFrame({
    'User1': [5, 4, np.nan, 3],
    'User2': [4, np.nan, 5, 2],
    'User3': [np.nan, 4, 4, np.nan]
}, index=['Laptop', 'Mouse', 'Phone', 'Headphones'])

# Item-item cosine similarity (missing ratings treated as 0)
similarity = pd.DataFrame(
    cosine_similarity(ratings.fillna(0)),
    index=ratings.index, columns=ratings.index
)

similarity

See which products are “best buddies.” If Laptop and Mouse have a high similarity score → your recommender nods wisely. 🧠💻🐭
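
To turn that matrix into actual suggestions, look up an item’s nearest neighbours. A minimal sketch — recommend_similar is just an illustrative helper built on the similarity DataFrame above:

def recommend_similar(item, n=2):
    # Rank the other items by cosine similarity to the one given
    return similarity[item].drop(item).sort_values(ascending=False).head(n)

recommend_similar('Laptop')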


🧩 Step 4 – Content-Based Filtering (Optional Spice)#

You can also compare product features instead of ratings:

features = pd.DataFrame({
    'Item': ['Laptop', 'Mouse', 'Phone', 'Headphones'],
    'Category': ['Electronics', 'Electronics', 'Electronics', 'Electronics'],
    'Wireless': [0, 1, 1, 1]
})

Compute similarities between feature vectors to recommend similar items. Your model now says:

“If you like wireless headphones, you’ll love wireless regret when they run out of battery.” 🔋😅
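
In code, that might look like this — a minimal sketch assuming that one-hot encoding the Category column plus the Wireless flag is a good-enough feature vector:

# Numeric feature vectors: one-hot Category + the Wireless flag
feature_matrix = pd.get_dummies(features.set_index('Item'), columns=['Category'], dtype=int)

content_similarity = pd.DataFrame(
    cosine_similarity(feature_matrix),
    index=feature_matrix.index, columns=feature_matrix.index
)
content_similarity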


💡 Step 5 – Combine the Insights#

Fuse the power of:

  • Collaborative Filtering: What similar users bought

  • Content-Based: What similar items exist

  • Association Rules: What items co-occur frequently

🎯 Business logic:

  1. Recommend from collaborative filtering first.

  2. Fill gaps using content-based similarity.

  3. Add “bonus” suggestions from association rules.

Boom — your Hybrid Recommender is alive! 🤖💞
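
Here’s a minimal sketch of that priority order. It assumes the similarity, rules, and (optional) content_similarity objects from the earlier steps — hybrid_recommend is purely illustrative:

def hybrid_recommend(item, n=3):
    # 1. Collaborative filtering: items users rate similarly
    recs = list(similarity[item].drop(item).sort_values(ascending=False).head(n).index)

    # 2. Content-based: top up with feature-similar items not already suggested
    if len(recs) < n:
        content = content_similarity[item].drop(item).sort_values(ascending=False)
        recs += [i for i in content.index if i not in recs][:n - len(recs)]

    # 3. Association rules: tack on "bonus" co-purchase suggestions
    bonus = rules.loc[rules['antecedents'].apply(lambda a: item in a), 'consequents']
    recs += [next(iter(c)) for c in bonus if next(iter(c)) not in recs]
    return recs

hybrid_recommend('Laptop')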


📊 Step 6 – Evaluate (a.k.a. “Does It Even Work?”)#

You can track:

  • Precision@K – how many suggested items were actually bought

  • Coverage – % of items that appear in recommendations

  • Business KPI Impact – conversion rate, AOV (Average Order Value), or repeat purchase rate
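
Precision@K is the easiest one to hand-roll. A minimal sketch with made-up inputs (swap in your real recommendations and purchases):

def precision_at_k(recommended, purchased, k=3):
    # Fraction of the top-k recommendations the customer actually bought
    hits = sum(1 for item in recommended[:k] if item in purchased)
    return hits / k

precision_at_k(['Mouse', 'Headphones', 'Bread'], {'Mouse', 'Milk'})  # 1 hit out of 3 ≈ 0.33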


🏪 Business Scenario#

Context: You’re an analyst for an online retailer. Your boss wants a dashboard that shows:

  • Top 10 frequent item pairs

  • Personalized recommendations per user

  • Average basket size increase post-recommendation

Deliver it with a smile (and maybe a PowerPoint). Suddenly you’re the office “AI wizard.” 🧙‍♂️
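
For the first bullet, the frequent_items DataFrame from Step 2 already has what you need — a quick sketch (on the toy data you’ll get far fewer than 10 pairs, obviously):

# Keep only 2-item sets and rank them by support
pairs = frequent_items[frequent_items['itemsets'].apply(len) == 2]
pairs.sort_values('support', ascending=False).head(10)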


🧍 Real-World Example#

| Platform | Technique Used                       | Example                      |
|----------|--------------------------------------|------------------------------|
| Amazon   | Hybrid (Collaborative + Association) | “Frequently Bought Together” |
| Netflix  | Collaborative Filtering              | “Because you watched…”       |
| Spotify  | Content-Based                        | “More songs like this”       |
| Walmart  | Association Rules                    | Diapers → Beer 🍼🍺          |


🐍 Python Heads-Up#

If you’re just getting started with pandas, data cleaning, or loops, warm up with 👉 Programming for Business

You’ll thank yourself when your code doesn’t scream KeyError: 'CustomerID'. 😭


🧠 TL;DR#

  • Combine Collaborative, Content-Based, and Association Rules for a hybrid system.

  • Use mlxtend for mining frequent patterns.

  • Use sklearn.metrics.pairwise for similarities.

  • Think like a marketer, code like a data scientist.


🏁 Final Thought#

A great recommender doesn’t just predict what customers want — it gently whispers:

“You deserve this… and also maybe two more.” 😏

Now go forth and make shopping addictive — ethically! 🛍️💡

# Your code here