“Because your shopping cart has secrets.”
## 🧠 What’s This About?
Welcome to Association Rule Mining (ARM) — the detective agency of data science 🕵️♀️
Its mission? To uncover hidden patterns in your customers’ baskets, playlists, or product combos. Think of it as:
“People who bought 🥖 bread also bought 🧈 butter… and occasionally 🍷 wine.”
It’s the backbone of:
Market Basket Analysis
Product Recommendations
Cross-selling strategies
Retail store layout optimization
## 🧩 The Business Intuition
Imagine you own a supermarket. You notice that customers who buy pasta often buy tomato sauce. 🍝 So you:
Bundle them together
Offer a discount
Move them closer on shelves
Result → 💰 more sales and happier (carb-loving) customers.
## ⚙️ How It Works
Association Rule Mining finds relationships between items in large datasets using frequent itemsets.
An association rule looks like:
$$ A \Rightarrow B $$
Meaning:
If a customer buys A, they’re likely to buy B.
For example:
{Laptop} → {Mouse}
(If you buy a laptop, you’ll probably grab a mouse too.)
## 📈 Key Metrics
| Metric | Meaning | Analogy |
|---|---|---|
| Support | How often the combo appears | Popularity score |
| Confidence | Probability of B given A | “How reliable is this rule?” |
| Lift | Strength of association vs chance | “How much better than random?” |
### Formulas
$$ \text{Support}(A \Rightarrow B) = P(A \cap B) $$

$$ \text{Confidence}(A \Rightarrow B) = P(B|A) = \frac{P(A \cap B)}{P(A)} $$

$$ \text{Lift}(A \Rightarrow B) = \frac{P(A \cap B)}{P(A) \cdot P(B)} $$
Lift > 1: Strong positive correlation 🚀
Lift = 1: Independent items 🤝
Lift < 1: Negative relationship (they rarely go together) ❌
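To make the formulas concrete, here is a hand computation of all three metrics for the rule {milk} → {bread}, using the same five toy baskets as the Quick Example below — plain Python, no libraries needed:

```python
# Hand-computing Support, Confidence, and Lift for {milk} -> {bread}
transactions = [
    {'milk', 'bread', 'butter'},
    {'beer', 'bread'},
    {'milk', 'bread', 'beer', 'butter'},
    {'bread', 'butter'},
    {'milk', 'beer'},
]
n = len(transactions)

p_a = sum('milk' in t for t in transactions) / n               # P(A) = 3/5
p_b = sum('bread' in t for t in transactions) / n              # P(B) = 4/5
p_ab = sum({'milk', 'bread'} <= t for t in transactions) / n   # P(A ∩ B) = 2/5

support = p_ab              # 0.4
confidence = p_ab / p_a     # 0.4 / 0.6 ≈ 0.667
lift = p_ab / (p_a * p_b)   # 0.4 / 0.48 ≈ 0.833 -> below 1!

print(support, confidence, lift)
```

Notice the punchline: even though milk and bread appear together in 40% of baskets, the lift is below 1 — bread is so popular on its own that milk buyers are actually slightly *less* likely than average to grab it. High support alone doesn’t make a good rule.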
## 🧪 Quick Example (with Python!)
```python
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

# Sample data: one list of items per transaction
dataset = [
    ['milk', 'bread', 'butter'],
    ['beer', 'bread'],
    ['milk', 'bread', 'beer', 'butter'],
    ['bread', 'butter'],
    ['milk', 'beer']
]

# One-hot encode: rows = transactions, columns = items, values = True/False
# (apriori expects a boolean DataFrame, so cast explicitly)
df = pd.DataFrame(dataset)
df = pd.get_dummies(df.stack()).groupby(level=0).max().astype(bool)

# Frequent itemsets: keep combos that appear in at least 30% of baskets
frequent = apriori(df, min_support=0.3, use_colnames=True)

# Generate rules whose lift is at least 1
rules = association_rules(frequent, metric="lift", min_threshold=1)
print(rules[['antecedents', 'consequents', 'support', 'confidence', 'lift']])
```

Run this and see if “milk → bread” or “beer → bread” makes the cut 🍺🍞
## 💡 Real-World Use Cases
| Industry | Example | Goal |
|---|---|---|
| Retail | Bread → Butter | Product bundling |
| E-commerce | Phone → Case | Cross-selling |
| Streaming | Movie A → Movie B | Watch recommendations |
| Banking | Credit Card → Savings Plan | Customer profiling |
| Healthcare | Symptom A → Diagnosis B | Pattern detection |
## ⚖️ Algorithm Comparison
| Algorithm | Description | Use Case |
|---|---|---|
| Apriori | Classic algorithm, simple but slower | Small to medium datasets |
| FP-Growth | Faster, more memory-efficient | Large datasets |
| Eclat | Vertical dataset version | High-dimensional data |
## 🧙♀️ Fun Tip
The term “Market Basket Analysis” literally came from grocery stores tracking which items are bought together. Legend says a famous retailer discovered that diapers and beer often appeared together… because tired parents deserve rewards too. 🍼🍺
## 🧍 Business Example
Scenario: You manage an online retail platform.
You find:
{Headphones} → {Bluetooth Adapter} (lift = 2.1)
{Phone Case} → {Screen Protector} (lift = 3.4)

You:
Bundle them into “Frequently Bought Together” offers
Increase basket value 💰
Reduce customer decision fatigue 😴
Boom! Your recommender now earns its marketing salary. 🧾
## 🐍 Python Heads-Up
You’ll use:
mlxtend for apriori() and association_rules()
pandas for preprocessing your transactions
If this code makes your Jupyter notebook cry, grab a refresher from 👉 Programming for Business
## 🧠 TL;DR
Association Rules find “if-this-then-that” item relationships
Great for cross-selling and recommendations
Key metrics: Support, Confidence, Lift
Algorithms: Apriori, FP-Growth
Next up: Time to bring it all together in the Market Basket & Recommendations Lab 🧺💡
Let’s make your data whisper:
“Hey… wanna buy this too?” 😉