Program Design Principles

“Code is like a business: the better you structure it, the longer it survives.”

🎯 Learning Outcomes

By the end of this module, you’ll:

Write clean, modular, and maintainable Python code
Apply software design patterns to ML and business applications
Use Git + GitHub like a pro for collaboration and version control
Write documentation that makes your work readable and reproducible
Build end-to-end programs ready for deployment
Test and debug code efficiently with real-world business context

🧼 Section 8.1 – Writing Clean and Modular Code

💡 Key Idea:

“Code should be written for humans first, machines second.”

✅ The Business-ML Developer’s Commandments:

Functionize Everything — If you write the same code twice → make it a function.
Name Like a CEO:
- Bad: x, df2, temp3
- Good: customer_df, monthly_sales, calc_profit_margin()
One Function = One Purpose.
- A function that does 10 things = a business meeting that never ends.
Use Comments for Why, not What.
- # clean data = ❌
- # Remove inactive customers to reduce churn bias = ✅
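
The commandments above can be sketched with a small example. The names `calc_profit_margin` and `summarize_margins`, the input structure, and the sample figures are assumptions made for illustration only.

```python
def calc_profit_margin(revenue, cost):
    """Return profit margin as a percentage of revenue."""
    if revenue == 0:
        raise ValueError("revenue must be non-zero")
    return (revenue - cost) / revenue * 100


def summarize_margins(monthly_sales):
    """Map each month to its profit margin, rounded to one decimal place.

    monthly_sales: dict of month -> (revenue, cost); this structure is an
    assumption made for the sketch.
    """
    # Round for reporting only; reuse calc_profit_margin when exact values matter.
    return {
        month: round(calc_profit_margin(revenue, cost), 1)
        for month, (revenue, cost) in monthly_sales.items()
    }


# Hypothetical sample figures
monthly_sales = {"2024-01": (12000, 9000), "2024-02": (15000, 10500)}
print(summarize_margins(monthly_sales))
```

Each function has one purpose, the names say what the values mean, and the only comment explains a decision rather than restating the code.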

Modularize your notebooks → Split code into .py utilities:

# src/utils/data_cleaning.py
def remove_nulls(df):
    """Return a new DataFrame without rows that contain missing values."""
    return df.dropna()

Structure your project like this:

business_ml_project/
├── data/
├── notebooks/
├── src/
│   ├── preprocessing.py
│   ├── model_training.py
│   └── utils/
├── tests/
├── requirements.txt
└── README.md
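
One way to create this layout reproducibly is a small scaffolding script. This is a minimal sketch using only the standard library; the function name `create_project_skeleton` is hypothetical, and the directory and file lists simply mirror the tree above.

```python
from pathlib import Path

# Directory and file names mirror the tree above; adjust them to your project.
DIRECTORIES = ["data", "notebooks", "src/utils", "tests"]
FILES = ["src/preprocessing.py", "src/model_training.py", "requirements.txt", "README.md"]


def create_project_skeleton(root):
    """Create the folder layout under root without overwriting existing files."""
    root = Path(root)
    for directory in DIRECTORIES:
        (root / directory).mkdir(parents=True, exist_ok=True)
    for file_name in FILES:
        path = root / file_name
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch(exist_ok=True)  # leaves the content of existing files untouched
    return root
```

Calling `create_project_skeleton("business_ml_project")` would create the folders and empty placeholder files in the current working directory.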

🏗️ Section 8.2 – Design Patterns for ML Applications

🤖 Why Design Patterns?

Design patterns = “proven blueprints” for recurring coding problems.

💼 Business-ML Patterns:

Pattern	Use Case	Example
Pipeline Pattern	Sequential steps for ML workflow	`sklearn.pipeline.Pipeline`
Factory Pattern	Create different models dynamically	Select model from config file
Observer Pattern	Model monitoring or callback alerts	Logging drift in production
Strategy Pattern	Switch between algorithms	Try multiple churn models
Singleton Pattern	One shared config object	Shared DB or API connector
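
To make one row of the table concrete, here is a minimal sketch of the Observer pattern applied to model monitoring. The class name `DriftMonitor`, the threshold values, and the alert message are assumptions for illustration; a production monitor would usually send alerts to a logging or paging system rather than a list.

```python
class DriftMonitor:
    """Subject: notifies registered observers when a metric drifts too far."""

    def __init__(self, baseline, tolerance):
        self.baseline = baseline
        self.tolerance = tolerance
        self._observers = []

    def subscribe(self, callback):
        """Register a callable that accepts (metric_value, drift)."""
        self._observers.append(callback)

    def check(self, metric_value):
        """Notify every observer if the metric moved beyond the tolerance."""
        drift = abs(metric_value - self.baseline)
        if drift > self.tolerance:
            for callback in self._observers:
                callback(metric_value, drift)
            return True
        return False


alerts = []
monitor = DriftMonitor(baseline=0.90, tolerance=0.05)
monitor.subscribe(lambda value, drift: alerts.append(f"accuracy changed to {value:.2f}"))
monitor.check(0.88)  # within tolerance: observers are not called
monitor.check(0.80)  # beyond tolerance: observers are notified
print(alerts)
```

The monitor does not know what its observers do with the alert, so new reactions (email, dashboard flag, retraining trigger) can be added without changing the monitoring code.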

⚙️ Example: ML Factory Pattern

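class ModelFactory:
    """Create a model object from a short name, e.g. one read from a config file."""

    def get_model(self, model_type):
        # Import lazily so only the library for the requested model must be installed
        if model_type == "xgboost":
            from xgboost import XGBClassifier
            return XGBClassifier()
        elif model_type == "logistic":
            from sklearn.linear_model import LogisticRegression
            return LogisticRegression()
        else:
            raise ValueError(f"Unknown model type: {model_type!r}")

# X_train and y_train are assumed to be an already prepared training set
model = ModelFactory().get_model("logistic")
model.fit(X_train, y_train)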
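
When the list of models grows, the if/elif chain can be replaced by a lookup table. The following is a sketch of that variation, not the only way to build a factory: `MODEL_REGISTRY`, `create_model`, and the config keys `model_type` and `params` are hypothetical names chosen for this example.

```python
import importlib

# Maps a short name to "module:ClassName". The two entries mirror the models
# used above; add your own as needed.
MODEL_REGISTRY = {
    "logistic": "sklearn.linear_model:LogisticRegression",
    "xgboost": "xgboost:XGBClassifier",
}


def create_model(config):
    """Build a model from a dict such as {"model_type": "logistic", "params": {}}."""
    model_type = config["model_type"]
    if model_type not in MODEL_REGISTRY:
        known = ", ".join(sorted(MODEL_REGISTRY))
        raise ValueError(f"Unknown model type {model_type!r}; expected one of: {known}")
    module_name, class_name = MODEL_REGISTRY[model_type].split(":")
    # Import lazily so only the library for the requested model must be installed.
    model_class = getattr(importlib.import_module(module_name), class_name)
    return model_class(**config.get("params", {}))
```

With scikit-learn installed, `create_model({"model_type": "logistic", "params": {"max_iter": 500}})` would return a `LogisticRegression` configured with `max_iter=500`. The config dict could come from a JSON or YAML file, which keeps model selection out of the code.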

📘 Section 8.3 – Documentation Best Practices

✍️ The Holy Trinity of Documentation:

Docstrings
README.md
Jupyter annotations

🧭 Example of a Perfect Function Docstring

def calculate_roi(investment, returns):
    """
    Calculate Return on Investment (ROI)

    Parameters:
        investment (float): Initial investment amount
        returns (float): Total returns received

    Returns:
        float: ROI as a percentage

    Example:
        >>> calculate_roi(1000, 1200)
        20.0
    """
    return ((returns - investment) / investment) * 100
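
Examples written in `>>>` form can double as tests through the standard library's `doctest` module. The sketch below repeats the function with a shortened docstring; the zero-investment guard and its error message are an added assumption, not part of the original function.

```python
import doctest


def calculate_roi(investment, returns):
    """Calculate Return on Investment (ROI) as a percentage.

    >>> calculate_roi(1000, 1200)
    20.0
    >>> calculate_roi(0, 100)
    Traceback (most recent call last):
        ...
    ValueError: investment must be non-zero
    """
    # Assumption for this sketch: fail with a clear message instead of ZeroDivisionError
    if investment == 0:
        raise ValueError("investment must be non-zero")
    return ((returns - investment) / investment) * 100


# Runs the examples in the docstring; prints nothing when they all pass
doctest.run_docstring_examples(calculate_roi, {"calculate_roi": calculate_roi})
```

For a whole file, `python -m doctest your_module.py` checks every docstring example in it, which keeps the documentation from drifting away from the code.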

📂 README Template

## Customer Churn Prediction
A machine learning project that predicts churn probability using customer activity data.

## Steps
1. Data Cleaning
2. Feature Engineering
3. Model Training (RandomForest, XGBoost)
4. Evaluation and Reporting

## Run
```bash
python src/train_model.py
```

> 💡 Write documentation like future-you will forget everything in 3 months.

🌳 Section 8.4 – Version Control with Git and GitHub

🚀 Git = Your Business Time Machine
You can:
- Undo mistakes by going back to any committed version
- Collaborate like a real dev team
- Track every experiment

🧠 Core Workflow:

```bash
# Create the repo and make the first commit
git init
git add .
git commit -m "Initial commit - created clean project structure"

# Link with GitHub (rename the local branch to main so the push target exists)
git remote add origin https://github.com/yourname/business-ml.git
git branch -M main
git push -u origin main
```

💪 Pro Tips

One feature = one branch
Commit often: “save game checkpoints”
Use meaningful commit messages:
- ❌ “fixed stuff”
- ✅ “added data preprocessing for missing values”

🌎 Bonus: GitHub Actions

Automate testing, training, and deployment with .github/workflows/ci.yml.

⚙️ Section 8.5 – Building End-to-End Programs for Deployment

🧱 What “End-to-End” Means:

Raw Data → Cleaned Data → Model → Dashboard/API → Monitor → Repeat

📊 Business ML Pipeline Template

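# Template: load_data, preprocess, split_data, train_model, save_model, evaluate
# and deploy_to_streamlit are placeholders for your own project functions.
def main():
    data = load_data("data/sales_forecasting.csv")
    data = preprocess(data)
    train_data, test_data = split_data(data)  # hold out data for evaluation
    model = train_model(train_data)
    save_model(model, "models/sales_model.pkl")
    evaluate(model, test_data)  # score on unseen data, not the training set
    deploy_to_streamlit("models/sales_model.pkl")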

if __name__ == "__main__":
    main()
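
To show how the template's steps fit together, here is a runnable sketch that uses only the standard library. Everything in it is an assumption for illustration: the inline sample data stands in for the CSV file, the "model" is a constant mean baseline rather than a real forecaster, `split_data` is a hypothetical helper, and the deployment step is left out.

```python
import csv
import io
import pickle
import statistics
from pathlib import Path

# Inline sample data stands in for data/sales_forecasting.csv (values are made up).
SAMPLE_CSV = """month,revenue
1,100
2,110
3,
4,130
5,125
6,140
"""


def load_data(source):
    """Read rows from a file-like object into a list of dicts."""
    return list(csv.DictReader(source))


def preprocess(rows):
    """Drop rows with a missing revenue and convert the rest to floats."""
    return [float(row["revenue"]) for row in rows if row["revenue"]]


def split_data(values, test_size=2):
    """Hold out the most recent observations for evaluation."""
    return values[:-test_size], values[-test_size:]


def train_model(train_values):
    """Baseline 'model': always predict the mean of the training revenue."""
    return {"prediction": statistics.mean(train_values)}


def evaluate(model, test_values):
    """Mean absolute error of the constant prediction on held-out data."""
    return statistics.mean(abs(value - model["prediction"]) for value in test_values)


def save_model(model, path):
    """Pickle the model, creating the target folder if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(pickle.dumps(model))


def main(model_path="models/sales_model.pkl"):
    values = preprocess(load_data(io.StringIO(SAMPLE_CSV)))
    train_values, test_values = split_data(values)
    model = train_model(train_values)
    save_model(model, model_path)
    return evaluate(model, test_values)
```

Calling `main()` writes the pickled baseline to `models/sales_model.pkl` and returns the held-out error. Swapping in a real model only requires replacing `train_model` and `evaluate`, since each step has a single responsibility.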

🚀 Deployment Examples

Method	Tool	Use Case
Web App	Streamlit / Flask	Internal dashboards
API	FastAPI	Serve ML predictions
Notebook-to-App	Gradio	Quick business demos
Production	Docker + Cloud	Full enterprise deployment

⚡ Remember: “A model isn’t useful until someone non-technical can use it.”

🧪 Section 8.6 – Testing and Debugging Business Applications

🧠 Why Test?

Because debugging in production is like fixing a plane mid-flight ✈️

🧰 Levels of Testing:

Type	Checks	Example Tool
Unit	Single functions	`pytest`
Integration	End-to-end pipelines	`unittest`
Regression	Model output drift	`pytest + CI/CD`
User Acceptance	Business KPIs	Manual review

✅ Example Test

def test_calculate_roi():
    result = calculate_roi(1000, 1200)
    assert result == 20.0, "ROI calculation failed!"
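
Exact float comparison works for this input, but many ROI values are not exactly representable, so a tolerance-based check is often safer. The sketch below copies `calculate_roi` so that it runs on its own; the extra test cases are assumptions chosen for illustration. With pytest installed, `pytest.approx` would serve the same purpose as `math.isclose`.

```python
import math


def calculate_roi(investment, returns):
    """ROI as a percentage (same formula as in Section 8.3)."""
    return ((returns - investment) / investment) * 100


def test_roi_matches_expected_values():
    # (investment, returns, expected ROI %) cases chosen for this sketch
    cases = [(1000, 1200, 20.0), (1000, 900, -10.0), (300, 400, 100 / 3)]
    for investment, returns, expected in cases:
        assert math.isclose(calculate_roi(investment, returns), expected, rel_tol=1e-9)


def test_roi_rejects_zero_investment():
    # The formula divides by the investment, so zero must raise, not return a number.
    try:
        calculate_roi(0, 100)
    except ZeroDivisionError:
        return
    raise AssertionError("expected ZeroDivisionError for a zero investment")
```

Because the function names start with `test_`, pytest would discover and run both tests automatically.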

🪄 Debugging Tricks

Use pdb: Python’s built-in debugger

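Use try/except smartly (catch a specific exception, report it, then re-raise):

try:
    model.fit(X, y)
except ValueError as e:
    print("💥 Model training failed:", e)
    raise  # re-raise so later steps do not run with an unfitted model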

Add logging instead of 100 print statements:

import logging
logging.basicConfig(level=logging.INFO)
logging.info("Model training started.")
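
The try/except and logging tips combine naturally. This is a minimal sketch: `train_with_logging`, the logger name `training`, and the `FailingModel` stand-in are hypothetical names used only to make the example self-contained.

```python
import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)
logger = logging.getLogger("training")  # the logger name is an arbitrary choice


def train_with_logging(model, X, y):
    """Fit the model, logging progress and the full traceback on failure."""
    logger.info("Model training started on %d rows.", len(X))
    try:
        model.fit(X, y)
    except ValueError:
        # logger.exception logs at ERROR level and appends the traceback
        logger.exception("Model training failed.")
        raise  # re-raise so callers do not continue with an unfitted model
    logger.info("Model training finished.")
    return model


class FailingModel:
    """Stand-in for a real estimator; always rejects its input."""

    def fit(self, X, y):
        raise ValueError("X and y have inconsistent lengths")
```

Calling `train_with_logging(FailingModel(), [1, 2], [1])` logs the start message, then the error with its traceback, and finally raises the original `ValueError`. Unlike print statements, the log level and format can be changed in one place.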

🏁 Summary

Principle	Business Impact
Clean Code	Easier maintenance, faster onboarding
Design Patterns	Scalable ML pipelines
Documentation	Saves hours for future teams
Git	Version safety + collaboration
End-to-End Design	Deployable business apps
Testing	Stable production systems

🧩 Capstone Challenge

Design a mini end-to-end ML project applying all principles:

Clean project structure
Modular functions
GitHub repo + README
Simple test suite
Streamlit app or API deployment

💼 Example: “Predict monthly revenue for a retail chain and deploy results as a Streamlit dashboard.”

# Your code here

Exercises

Exercise 1

Write calculate_roi(investment, returns) that returns the ROI percentage. Include a simple docstring and return a float.

Exercise 2

Create format_readme(project_name, steps) that returns a README string with a title and numbered steps. Test it with sample inputs.

Exercise 3

Write a tiny unit test test_calculate_roi() that asserts calculate_roi(1000, 1200) == 20.0. The test should raise an AssertionError if it fails.