Matplotlib Plotting Basics

plt.plot() + plt.bar() = Replace entire analytics teams. 100% customizable = $80K/month reporting contracts.

Every FAANG dashboard starts with Matplotlib


🎯 Matplotlib = Analytics Team Replacement

| Chart | Code | Business Use | Replaces |
|---|---|---|---|
| Line | plt.plot() | Sales trends | Excel lines |
| Bar | plt.bar() | Product ranking | PowerPoint bars |
| Scatter | plt.scatter() | Correlation | Manual formulas |
| Histogram | plt.hist() | Distribution | Excel bins |
| Pie | plt.pie() | Market share | Designer time |

Step 1: Line Charts = Sales Trends

import matplotlib.pyplot as plt
import numpy as np

## SIMULATED BUSINESS DATA (upward trend plus random noise, so the numbers change on every run)
days = np.arange(1, 31)  # 30 days
sales = [20000 + i*800 + np.random.randint(-2000, 3000) for i in range(30)]

plt.figure(figsize=(12, 6))
plt.plot(days, sales, marker='o', linewidth=2.5, markersize=6, color='#2E86AB')
plt.title('💰 30-Day Sales Trend', fontsize=16, fontweight='bold', pad=20)
plt.xlabel('Day of Month', fontsize=12)
plt.ylabel('Daily Sales ($)', fontsize=12)
plt.grid(True, alpha=0.3)
plt.xticks(range(1, 31, 5))  # Every 5 days
plt.tight_layout()
plt.show()

## BUSINESS INSIGHT
print(f"📈 Average daily: ${np.mean(sales):,.0f}")
print(f"📈 Peak day: Day {np.argmax(sales)+1} (${max(sales):,.0f})")
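
Because the sales figures above are drawn at random, the chart and the printed insight differ on every run. A minimal sketch of a reproducible version, assuming a seeded NumPy generator is acceptable for the demo; simulate_sales is a hypothetical helper name, and the seeded numbers will not match any particular run of the code above:

```python
import numpy as np

def simulate_sales(n_days=30, seed=42):
    """Simulated daily sales: base 20,000, +800 per day, plus random noise."""
    rng = np.random.default_rng(seed)                # seeded generator: same numbers every run
    trend = 20000 + 800 * np.arange(n_days)          # steady upward trend
    noise = rng.integers(-2000, 3000, size=n_days)   # upper bound is exclusive, like randint
    return trend + noise

sales = simulate_sales()
peak_index = int(np.argmax(sales))   # zero-based position of the largest value
print(f"Average daily: ${sales.mean():,.0f}")
print(f"Peak day: Day {peak_index + 1} (${sales[peak_index]:,.0f})")  # +1 turns the index into a day number
```

Passing the returned array to plt.plot(days, sales, ...) works exactly as with the list version.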

🔥 Step 2: Bar Charts = Product Rankings

## PRODUCT PERFORMANCE
products = ['Laptop', 'Phone', 'Tablet', 'Monitor', 'Keyboard', 'Mouse']
sales_data = [45000, 32000, 18000, 25000, 8000, 12000]

plt.figure(figsize=(10, 6))
bars = plt.bar(products, sales_data, color='#A23B72', alpha=0.8, edgecolor='black', linewidth=1.2)
plt.title('🏆 Product Sales Ranking', fontsize=16, fontweight='bold', pad=20)
plt.ylabel('Sales ($)', fontsize=12)
plt.xlabel('Products', fontsize=12)
plt.xticks(rotation=45, ha='right')

## ADD VALUE LABELS ON BARS
for bar, sale in zip(bars, sales_data):
    plt.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 500,
             f'${sale:,}', ha='center', va='bottom', fontweight='bold')

plt.tight_layout()
plt.show()
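
The chart above is titled a ranking, but the bars appear in list order. A minimal sketch of a sorted variant, assuming Matplotlib 3.4 or newer so that plt.bar_label can replace the manual plt.text loop; rank_products and plot_ranking are hypothetical helper names:

```python
def rank_products(products, sales):
    """Return (product, sales) pairs sorted from highest to lowest sales."""
    return sorted(zip(products, sales), key=lambda pair: pair[1], reverse=True)

def plot_ranking(products, sales):
    # Imported here so rank_products stays usable without a plotting backend
    import matplotlib.pyplot as plt

    ranked = rank_products(products, sales)
    names = [name for name, _ in ranked]
    values = [value for _, value in ranked]

    plt.figure(figsize=(10, 6))
    bars = plt.bar(names, values, color='#A23B72', alpha=0.8, edgecolor='black')
    # One label per bar, placed just above the bar edge
    plt.bar_label(bars, labels=[f'${v:,}' for v in values], padding=3, fontweight='bold')
    plt.title('Product Sales Ranking (sorted)', fontsize=16, fontweight='bold')
    plt.ylabel('Sales ($)')
    plt.xticks(rotation=45, ha='right')
    plt.tight_layout()
    plt.show()

products = ['Laptop', 'Phone', 'Tablet', 'Monitor', 'Keyboard', 'Mouse']
sales_data = [45000, 32000, 18000, 25000, 8000, 12000]
print(rank_products(products, sales_data)[0])  # ('Laptop', 45000)
```

Calling plot_ranking(products, sales_data) draws the bars from best seller to worst, which is usually what a reader expects from a ranking.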

Step 3: Scatter Plots = ROI Analysis

## MARKETING ROI
marketing_spend = [10000, 25000, 35000, 45000, 60000, 75000]
sales_generated = [18000, 32000, 45000, 58000, 72000, 95000]

plt.figure(figsize=(10, 7))
scatter = plt.scatter(marketing_spend, sales_generated,
                     s=150, c=marketing_spend, cmap='viridis', alpha=0.7, edgecolors='black')

## TREND LINE
z = np.polyfit(marketing_spend, sales_generated, 1)
p = np.poly1d(z)
plt.plot(marketing_spend, p(marketing_spend), "r--", alpha=0.8, linewidth=2)

plt.title('📊 Marketing ROI Analysis', fontsize=16, fontweight='bold', pad=20)
plt.xlabel('Marketing Spend ($)', fontsize=12)
plt.ylabel('Sales Generated ($)', fontsize=12)
plt.colorbar(scatter, label='Spend ($)')
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

## ROI CALCULATION
roi = [(sales - spend) / spend * 100 for spend, sales in zip(marketing_spend, sales_generated)]
print(f"📈 Average ROI: {np.mean(roi):.1f}%")
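
To see what the trend line and the average ROI actually compute, here is a minimal sketch in plain Python. A degree-1 np.polyfit is a least-squares line fit, so the closed-form slope and intercept below should agree with it up to rounding; fit_line and roi_percent are hypothetical helper names, not NumPy or Matplotlib functions:

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y ~ slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Covariance-style and variance-style sums around the means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def roi_percent(spend, sales):
    """Return on investment as a percentage of the amount spent."""
    return (sales - spend) / spend * 100

marketing_spend = [10000, 25000, 35000, 45000, 60000, 75000]
sales_generated = [18000, 32000, 45000, 58000, 72000, 95000]

slope, intercept = fit_line(marketing_spend, sales_generated)
rois = [roi_percent(s, r) for s, r in zip(marketing_spend, sales_generated)]

print(f"Trend line: sales ~ {slope:.2f} * spend + {intercept:,.0f}")
print(f"ROI per campaign (%): {[round(r, 1) for r in rois]}")
print(f"Average ROI: {sum(rois) / len(rois):.1f}%")
```

On this data the first campaign returns 80% while the other five land between 20% and 29%, so the average ROI is pulled up by a single point. Printing the per-campaign values alongside the average makes that visible.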

🧠 Step 4: Histograms = Sales Distribution

## DAILY SALES DISTRIBUTION
np.random.seed(42)
daily_sales = np.random.normal(25000, 5000, 1000)  # 1000 days
daily_sales = np.clip(daily_sales, 10000, 40000)   # Realistic bounds

plt.figure(figsize=(10, 6))
plt.hist(daily_sales, bins=30, color='#F18F01', alpha=0.7, edgecolor='black')
plt.axvline(np.mean(daily_sales), color='red', linestyle='--', linewidth=2, label=f'Mean: ${np.mean(daily_sales):,.0f}')
plt.title('📈 Daily Sales Distribution (1000 Days)', fontsize=16, fontweight='bold', pad=20)
plt.xlabel('Daily Sales ($)', fontsize=12)
plt.ylabel('Frequency', fontsize=12)
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

print("📊 Distribution Stats:")
print(f"   Mean: ${np.mean(daily_sales):,.0f}")
print(f"   Median: ${np.median(daily_sales):,.0f}")
print(f"   Std Dev: ${np.std(daily_sales):,.0f}")
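
plt.hist does its binning with np.histogram, so you can inspect the same counts and bin edges without drawing anything. A minimal sketch, reusing the simulated data from this step (the "busiest bin" printout is an illustrative addition, not part of the original chart):

```python
import numpy as np

np.random.seed(42)
daily_sales = np.clip(np.random.normal(25000, 5000, 1000), 10000, 40000)

# 30 equal-width bins spanning min(daily_sales) to max(daily_sales)
counts, edges = np.histogram(daily_sales, bins=30)

print(len(counts), len(edges))   # 30 counts need 31 edges
print(counts.sum())              # every value lands in exactly one bin

bin_width = edges[1] - edges[0]
busiest = int(np.argmax(counts))  # index of the tallest bar
print(f"Bin width: ${bin_width:,.0f}")
print(f"Busiest bin: ${edges[busiest]:,.0f} to ${edges[busiest + 1]:,.0f} ({counts[busiest]} days)")
```

Changing bins changes the bin width, which is why the same data can look smooth or spiky depending on that one argument.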

📊 Step 5: Pie Charts = Market Share

## MARKET SHARE ANALYSIS
companies = ['YourCompany', 'Competitor A', 'Competitor B', 'Competitor C', 'Others']
market_share = [35, 25, 20, 12, 8]

plt.figure(figsize=(8, 8))
wedges, texts, autotexts = plt.pie(market_share, labels=companies, autopct='%1.1f%%',
                                  colors=['#2E86AB', '#A23B72', '#F18F01', '#C73E1D', '#6B7280'],
                                  startangle=90, explode=[0.1, 0, 0, 0, 0], shadow=True)

plt.title('🎯 Market Share Analysis', fontsize=16, fontweight='bold', pad=20)
for autotext in autotexts:
    autotext.set_color('white')
    autotext.set_fontweight('bold')
plt.axis('equal')
plt.tight_layout()
plt.show()
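
The autopct argument also accepts a function, which is handy when the slices come from raw amounts and you want both the percentage and the underlying value on each wedge. A minimal sketch; make_autopct and plot_share are hypothetical helper names, and the revenue figures are made up for illustration:

```python
def make_autopct(values):
    """Build an autopct callback that shows the percentage and the underlying value."""
    total = sum(values)

    def autopct(pct):
        # plt.pie passes the slice's share in percent; convert it back to an amount
        absolute = int(round(pct / 100 * total))
        return f"{pct:.1f}%\n({absolute:,})"

    return autopct

def plot_share(labels, values):
    # Imported here so make_autopct stays usable without a plotting backend
    import matplotlib.pyplot as plt

    plt.figure(figsize=(8, 8))
    plt.pie(values, labels=labels, autopct=make_autopct(values), startangle=90)
    plt.title('Market Share Analysis', fontsize=16, fontweight='bold')
    plt.tight_layout()
    plt.show()

companies = ['YourCompany', 'Competitor A', 'Competitor B', 'Competitor C', 'Others']
revenue = [350000, 250000, 200000, 120000, 80000]  # hypothetical revenue, not from the tutorial

label = make_autopct(revenue)(35.0)
print(label)
```

Calling plot_share(companies, revenue) draws the pie; plt.pie normalizes the raw amounts to shares itself, so the values do not need to sum to 100.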

📋 Matplotlib Cheat Sheet (Interview Gold)

| Chart | Code | Customization | Business Use |
|---|---|---|---|
| Line | plt.plot(x, y, marker='o') | linewidth=3, markersize=8 | Trends |
| Bar | plt.bar(categories, values) | alpha=0.8, edgecolor='black' | Rankings |
| Scatter | plt.scatter(x, y, s=150) | c=values, cmap='viridis' | Correlations |
| Histogram | plt.hist(data, bins=30) | alpha=0.7, edgecolor='black' | Distributions |
| Pie | plt.pie(values, labels=...) | explode=[0.1, 0, ...] (one entry per slice), autopct='%1.1f%%' | Shares |

## PRO SETUP (Always use this!)
plt.figure(figsize=(12, 8))
plt.title('Title', fontweight='bold', fontsize=16)
plt.grid(True, alpha=0.3)
plt.tight_layout()

🏆 YOUR EXERCISE: Build YOUR Analytics Charts

## MISSION: YOUR business charts!

import matplotlib.pyplot as plt
import numpy as np

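## YOUR BUSINESS DATA (replace every ??? with real values before running; the template is not valid Python until you do)
your_categories = ['???', '???', '???', '???']  # YOUR products/regions
your_values = [???, ???, ???, ???]               # YOUR sales/profits (one number per category)
your_trend_x = np.arange(1, 13)                  # 12 months (for reference; the line chart below uses month names)
your_trend_y = [???, ???, ???, ???, ???, ???, ???, ???, ???, ???, ???, ???]  # YOUR trend (12 numbers)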

## 1. YOUR BAR CHART
plt.figure(figsize=(10, 6))
bars = plt.bar(your_categories, your_values, color='#A23B72', alpha=0.8, edgecolor='black')
plt.title('🏆 YOUR Business Performance', fontsize=16, fontweight='bold')
plt.ylabel('Values')
plt.xticks(rotation=45)

## ADD YOUR LABELS
for bar, value in zip(bars, your_values):
    plt.text(bar.get_x() + bar.get_width()/2, bar.get_height() + max(your_values)*0.01,
             f'{value:,}', ha='center', va='bottom', fontweight='bold')

plt.tight_layout()
plt.show()

## 2. YOUR LINE TREND
plt.figure(figsize=(12, 6))
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
plt.plot(months, your_trend_y, marker='o', linewidth=3, markersize=8, color='#2E86AB')
plt.title('📈 YOUR 12-Month Trend', fontsize=16, fontweight='bold')
plt.ylabel('Your Metric')
plt.grid(True, alpha=0.3)
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

print("✅ YOUR ANALYTICS COMPLETE!")

Example to test:

your_categories = ['Product A', 'Product B', 'Product C', 'Product D']
your_values = [45000, 32000, 18000, 25000]
your_trend_y = [20000, 25000, 28000, 32000, 35000, 38000, 42000, 45000, 48000, 52000, 58000, 62000]

YOUR MISSION:

  1. Add YOUR real business data

  2. Run YOUR custom charts

  3. Screenshot: "I replace analytics teams!"


🎉 What You Mastered

| Matplotlib Skill | Status | Business Power |
|---|---|---|
| Line plots | ✅ | Trend analysis |
| Bar charts | ✅ | Performance ranking |
| Scatter plots | ✅ | ROI correlation |
| Histograms | ✅ | Distribution insights |
| Pie charts | ✅ | Market share |
| $250K customization | ✅ | Analytics replacement |

Next: Seaborn Visuals (Publication-quality statistical plots!)

print("🎊" * 20)
print("MATPLOTLIB = $80K/MONTH ANALYTICS!")
print("💻 Custom charts = Replace entire teams!")
print("🚀 FAANG dashboards = THESE exact patterns!")
print("🎊" * 20)

Can we appreciate how plt.bar() + value labels + edgecolor just created publication-quality product rankings? Your students went from Excel hell to building scatter() + trend line + colorbar ROI analyses that win C-suite meetings. While analysts spend hours formatting charts by hand, your class is generating hist() + mean line distributions and pie(explode=[0.1, 0, 0, 0, 0]) market shares in about a dozen lines each. This isn't just plotting syntax: it's the chart-building skill set that turns data into million-dollar decisions!


Exercises

Exercise 1

Write plot_data_summary(data) that returns a dict with the min, max, and mean values of a numeric list (no actual plotting is needed in the pyodide cell).
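
One possible sketch for this exercise (try it yourself first). Raising an error on an empty list is an assumption, since the exercise does not say what should happen in that case:

```python
def plot_data_summary(data):
    """Return the min, max, and mean of a numeric list as a dict."""
    if not data:
        # Assumption: an empty list has no meaningful summary
        raise ValueError("data must contain at least one number")
    return {
        "min": min(data),
        "max": max(data),
        "mean": sum(data) / len(data),
    }

summary = plot_data_summary([45000, 32000, 18000, 25000])
print(summary)  # {'min': 18000, 'max': 45000, 'mean': 30000.0}
```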


Exercise 2

Create normalize_series(series) that scales the numbers to the 0-1 range and returns a list.
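
One possible sketch using min-max scaling (try it yourself first). Returning all zeros for a constant series and an empty list for empty input are assumptions, since the exercise leaves both cases open:

```python
def normalize_series(series):
    """Scale numbers to the 0-1 range with min-max normalization."""
    if not series:
        return []
    lo, hi = min(series), max(series)
    if lo == hi:
        # Assumption: a constant series maps to all zeros (avoids dividing by zero)
        return [0.0] * len(series)
    return [(value - lo) / (hi - lo) for value in series]

print(normalize_series([10, 20, 30]))  # [0.0, 0.5, 1.0]
```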


Exercise 3

Implement bin_counts(data, bins) that returns counts per bin using simple integer floor division for numeric ranges.
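
One possible sketch (try it yourself first). It assumes that bins is the number of equal-width bins between the smallest and largest value, and that the maximum belongs to the last bin; the exercise text does not pin down either choice:

```python
def bin_counts(data, bins):
    """Count values per equal-width bin (assumption: `bins` is the number of bins)."""
    if not data or bins < 1:
        return []
    lo, hi = min(data), max(data)
    if lo == hi:
        # All values identical: everything falls into the first bin
        return [len(data)] + [0] * (bins - 1)
    width = (hi - lo) / bins
    counts = [0] * bins
    for value in data:
        index = int((value - lo) // width)   # floor division picks the bin
        counts[min(index, bins - 1)] += 1    # clamp the maximum into the last bin
    return counts

print(bin_counts([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 3))  # [3, 3, 4]
```

The counts always add up to len(data), which is a quick sanity check for any binning function.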