“Because the only thing worse than a bad forecast is not knowing it’s bad.” 😬
🎯 Why Backtesting?
Imagine your forecasting model as a fortune teller. 🔮 You wouldn’t just trust their prediction that your Q4 sales will skyrocket without proof, right? Backtesting is how we test our fortune teller — by asking:
“Okay, smartypants, what would you have predicted last year?”
If their “forecast” doesn’t match what actually happened — we politely say, “You’re fired,” and try again.
🧠 The Big Idea
Backtesting = pretend the past is the future, make a prediction, and see how wrong you were.
In code terms:
1. Split your time series into train and test sets.
2. Train your model on the earlier part.
3. Predict the later part.
4. Compare the predictions vs. reality.
5. Cry a little. Adjust parameters. Repeat. 🌀
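Before bringing in a heavyweight model, the loop above can be sketched in plain Python with a naive "last value carries forward" forecast (the monthly numbers below are hypothetical, just to make the steps concrete):

```python
# Toy monthly sales series (hypothetical numbers, for illustration only)
sales = [100, 110, 120, 130, 125, 140, 150, 145, 160, 170, 165, 180]

# 1) Split: train on the first 9 months, test on the last 3
train, test = sales[:9], sales[9:]

# 2) "Train" the simplest model there is: predict the last observed value
naive_forecast = [train[-1]] * len(test)   # [160, 160, 160]

# 3) Compare predictions vs. reality with Mean Absolute Error
mae = sum(abs(y - p) for y, p in zip(test, naive_forecast)) / len(test)
print(f"Naive MAE: {mae:.2f}")  # → 11.67
```

Any real model you backtest should at least beat this naive baseline; if it doesn't, you've learned something important (and slightly embarrassing).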
📊 Basic Backtesting in Python
Here’s how the data therapist session goes:
from sklearn.metrics import mean_absolute_error
from prophet import Prophet
import pandas as pd

# Split: hold out the last 12 months as the test set
train = df.iloc[:-12]
test = df.iloc[-12:]

# Train the model on the earlier part only
model = Prophet()
model.fit(train)

# Forecast into the test period ('MS' = month start; match your data's frequency)
future = model.make_future_dataframe(periods=12, freq='MS')
forecast = model.predict(future)

# Compare: the last 12 rows of the forecast line up with the test set
preds = forecast['yhat'].iloc[-12:].values
mae = mean_absolute_error(test['y'], preds)
print(f"Mean Absolute Error: {mae:.2f}")

📉 Output:
Mean Absolute Error: 57.23
Translation: your model missed the target by about 57 units per month. Not catastrophic… but the CFO might still send you “that” email.
🧪 Rolling Backtest (Walk-Forward Validation)
One test isn’t enough — let’s simulate multiple points in time.
errors = []
for i in range(6, 13):
    train = df.iloc[:-i]          # history up to the cutoff
    test = df.iloc[-i:-i + 1]     # the single point right after the cutoff
    model = Prophet().fit(train)
    # Forecast one step past the training cutoff — exactly the test point
    future = model.make_future_dataframe(periods=1, freq='MS')
    forecast = model.predict(future)
    y_pred = forecast.iloc[-1]['yhat']
    errors.append(abs(test['y'].values[0] - y_pred))

print(f"Average Error: {sum(errors) / len(errors):.2f}")

🧮 It’s like asking Prophet,
“What if you had been alive in 2019, 2020, 2021… how would you have done?”
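The walk-forward idea is model-agnostic, so here is the same loop as a reusable sketch (using a naive last-value forecaster as a stand-in model, since refitting Prophet seven times is slow; the data is hypothetical):

```python
def walk_forward_errors(series, n_splits, forecast_fn):
    """Walk-forward backtest: for each cutoff, train on the past,
    predict the single next point, and record the absolute error."""
    errors = []
    for i in range(n_splits, 0, -1):
        train = series[:-i]        # history up to the cutoff
        actual = series[-i]        # the point right after the cutoff
        errors.append(abs(actual - forecast_fn(train)))
    return errors

# Stand-in model: predict the last observed value
naive = lambda history: history[-1]

sales = [100, 110, 120, 130, 125, 140, 150, 145, 160, 170, 165, 180]
errs = walk_forward_errors(sales, n_splits=6, forecast_fn=naive)
print(f"Average Error: {sum(errs) / len(errs):.2f}")  # → 10.00
```

If you do stick with Prophet, note that it ships its own automated version of this cutoff loop in `prophet.diagnostics.cross_validation`, paired with `performance_metrics` for the error summaries.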
📈 KPI Metrics You Should Know
| Metric | Stands For | Business Translation |
|---|---|---|
| MAE | Mean Absolute Error | Average “ouch” per prediction |
| RMSE | Root Mean Squared Error | Like MAE, but penalizes bigger mistakes |
| MAPE | Mean Absolute % Error | “On average, how far off was I in percentage terms?” |
| R² | Coefficient of Determination | “How much of reality did I actually explain?” |
💬 Tip: MAPE is great for business decks — it turns abstract errors into something managers understand:
“Our forecast is 7% off, not 700 widgets.”
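All four metrics are a few lines of plain Python to compute by hand (a sketch; `sklearn.metrics` offers equivalents, though its MAPE is a fraction rather than a percentage):

```python
def mae(y_true, y_pred):
    """Mean Absolute Error: average "ouch" per prediction."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root Mean Squared Error: penalizes bigger mistakes."""
    return (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)) ** 0.5

def mape(y_true, y_pred):
    """Mean Absolute Percentage Error (assumes no zeros in y_true)."""
    return 100 * sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: share of variance explained."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

actual = [100, 200, 300]       # hypothetical actuals
predicted = [110, 190, 330]    # hypothetical forecasts
print(f"MAE={mae(actual, predicted):.2f}  RMSE={rmse(actual, predicted):.2f}  "
      f"MAPE={mape(actual, predicted):.1f}%  R²={r2(actual, predicted):.3f}")
```

Note how the same three errors (10, 10, 30) give MAE ≈ 16.67 but RMSE ≈ 19.15: that gap is RMSE punishing the one big miss.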
🧮 KPI Interpretation – “How Wrong Is Acceptably Wrong?”
| Accuracy | Description | Manager’s Reaction |
|---|---|---|
| < 5% | 🔥 Excellent | “You’re getting a promotion!” |
| 5–10% | 👍 Good | “Let’s put this in the report.” |
| 10–20% | 😐 Acceptable | “Hmm, close enough for planning.” |
| > 20% | 🚨 Bad | “We’ll blame marketing again.” |
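These thresholds are rules of thumb, not gospel (15% may be heroic for daily demand and embarrassing for annual revenue), but they are trivial to encode if you want a one-line verdict in your reports:

```python
def accuracy_verdict(mape_pct):
    """Map a MAPE (in percent) to the rough rule-of-thumb verdicts above."""
    if mape_pct < 5:
        return "Excellent"
    if mape_pct < 10:
        return "Good"
    if mape_pct <= 20:
        return "Acceptable"
    return "Bad"

print(accuracy_verdict(7.0))  # → Good (the "7% off" forecast from earlier)
```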
🪞 Backtesting in Business Terms
- Forecasting sales? Backtesting tells you how much inventory you should have ordered vs. what you actually needed.
- Forecasting website traffic? It tells marketing how many ads they wasted money on. 💸
- Forecasting stock levels? It saves your warehouse from drowning in 10,000 unsold “Summer 2022” mugs.
💼 KPI Alignment with Business Goals
| Business Goal | Forecast Metric | KPI Alignment |
|---|---|---|
| Profit Planning | RMSE | Minimizing overall uncertainty |
| Inventory | MAE or MAPE | Fewer overstock/understock events |
| Marketing ROI | R² | Model explains real demand shifts |
| Finance Budgeting | MAPE | Predictability over perfection |
🧠 Practice Challenge
Try backtesting Prophet or ARIMA on:
- Monthly sales or revenue
- Customer support tickets
- Website visits
Compute MAE, RMSE, and MAPE — then answer:
“Would I trust this forecast in a board meeting?”
🧾 TL;DR
| Concept | TL;DR |
|---|---|
| Backtesting | Testing your forecast on old data |
| KPIs | Quantify how wrong (or right) you were |
| Rolling test | Multiple points of validation |
| MAPE | Business-friendly accuracy score |
| RMSE | Punishes big errors |
| Business takeaway | Forecasts are only useful when you measure their reliability |
“Backtesting is like checking your ex’s old messages — you might not like what you find, but it teaches you what to avoid next time.” 💔📉