Program Design Principles

“Code is like a business: the better you structure it, the longer it survives.”


🎯 Learning Outcomes

By the end of this module, you’ll:

  • Write clean, modular, and maintainable Python code

  • Apply software design patterns to ML and business applications

  • Use Git + GitHub like a pro for collaboration and version control

  • Write documentation that makes your work readable and reproducible

  • Build end-to-end programs ready for deployment

  • Test and debug code efficiently with real-world business context


🧼 Section 8.1 – Writing Clean and Modular Code

💡 Key Idea:

“Code should be written for humans first, machines second.”

✅ The Business-ML Developer’s Commandments:

  1. Functionize Everything — If you write the same code twice → make it a function.

  2. Name Like a CEO:

    • Bad: x, df2, temp3

    • Good: customer_df, monthly_sales, calc_profit_margin()

  3. One Function = One Purpose.

    • A function that does 10 things = a business meeting that never ends.

  4. Use Comments for Why, not What.

    • # clean data = ❌

    • # Remove inactive customers to reduce churn bias = ✅

  5. Modularize your notebooks → Split code into .py utilities:

    # utils/data_cleaning.py
    def remove_nulls(df):
        # Drop every row that contains at least one missing value
        return df.dropna()

  6. Structure your project like this:

    business_ml_project/
    ├── data/
    ├── notebooks/
    ├── src/
    │   ├── preprocessing.py
    │   ├── model_training.py
    │   └── utils/
    ├── tests/
    ├── requirements.txt
    └── README.md
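
The commandments above can be sketched in a few lines. This is a minimal illustration, not a prescribed implementation: the record layout (dicts with `active`, `revenue`, and `cost` keys), the sample customers, and the helper names `remove_inactive_customers` and `average_margin` are assumptions made for the example; `calc_profit_margin` borrows the name suggested above.

```python
from statistics import mean


def remove_inactive_customers(customers):
    # Inactive accounts would distort the margin estimate, so drop them first.
    return [c for c in customers if c["active"]]


def calc_profit_margin(revenue, cost):
    """Return profit margin as a percentage of revenue."""
    return (revenue - cost) / revenue * 100


def average_margin(customers):
    """Average profit margin across active customers."""
    active_customers = remove_inactive_customers(customers)
    return mean(calc_profit_margin(c["revenue"], c["cost"]) for c in active_customers)


# Made-up records used only to exercise the functions.
customers = [
    {"name": "Acme", "active": True, "revenue": 200.0, "cost": 150.0},
    {"name": "Globex", "active": False, "revenue": 500.0, "cost": 100.0},
    {"name": "Initech", "active": True, "revenue": 400.0, "cost": 200.0},
]
print(average_margin(customers))
```

Each function has one purpose and a descriptive name, and the only comment explains why a step exists rather than what the code does.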

🏗️ Section 8.2 – Design Patterns for ML Applications

🤖 Why Design Patterns?

Design patterns = “proven blueprints” for recurring coding problems.

💼 Business-ML Patterns:

| Pattern | Use Case | Example |
| --- | --- | --- |
| Pipeline Pattern | Sequential steps for ML workflow | sklearn.pipeline.Pipeline |
| Factory Pattern | Create different models dynamically | Select model from config file |
| Observer Pattern | Model monitoring or callback alerts | Logging drift in production |
| Strategy Pattern | Switch between algorithms | Try multiple churn models |
| Singleton Pattern | One shared config object | Shared DB or API connector |
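
As one illustration of the table, the Strategy pattern can be sketched with plain functions. The two scoring rules and the customer fields (`days_since_last_order`, `support_tickets`) are invented for the example and stand in for real churn models; `ChurnScorer` is a hypothetical name.

```python
from typing import Callable, Dict

# Each strategy maps a customer record to a churn score between 0 and 1.
ChurnStrategy = Callable[[Dict[str, float]], float]


def recency_strategy(customer):
    # Hypothetical rule: the score grows with inactivity and is capped at 1.0 after 90 days.
    return min(customer["days_since_last_order"] / 90, 1.0)


def support_strategy(customer):
    # Hypothetical rule: five or more support tickets means maximum risk.
    return min(customer["support_tickets"] / 5, 1.0)


class ChurnScorer:
    """Context object: the scoring algorithm can be swapped without touching callers."""

    def __init__(self, strategy: ChurnStrategy):
        self.strategy = strategy

    def score(self, customer):
        return self.strategy(customer)


customer = {"days_since_last_order": 45, "support_tickets": 5}
scorer = ChurnScorer(recency_strategy)
print(scorer.score(customer))  # scored with the recency rule
scorer.strategy = support_strategy
print(scorer.score(customer))  # same call, different algorithm
```

The calling code only ever uses `scorer.score(...)`, so trying another churn model means passing in a different function.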

⚙️ Example: ML Factory Pattern

class ModelFactory:
    def get_model(self, model_type):
        if model_type == "xgboost":
            from xgboost import XGBClassifier
            return XGBClassifier()
        elif model_type == "logistic":
            from sklearn.linear_model import LogisticRegression
            return LogisticRegression()
        else:
            raise ValueError(f"Unknown model type: {model_type!r}")

# X_train and y_train are assumed to be prepared earlier in the workflow
model = ModelFactory().get_model("logistic")
model.fit(X_train, y_train)
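
One possible variation, sketched under assumptions: as the list of supported models grows, the `if`/`elif` chain can be replaced with a registry of builder functions. `ModelRegistry`, `register`, and `create` are hypothetical names, not part of scikit-learn or XGBoost. The imports stay inside the builders, mirroring the factory above, so a library is only needed when its model is requested.

```python
class ModelRegistry:
    """Maps a model name to a zero-argument function that builds the model."""

    def __init__(self):
        self._builders = {}

    def register(self, name, builder):
        self._builders[name] = builder

    def create(self, name):
        if name not in self._builders:
            known = ", ".join(sorted(self._builders))
            raise ValueError(f"Unknown model type {name!r}; expected one of: {known}")
        return self._builders[name]()


def build_logistic():
    # Imported lazily: scikit-learn is only required if this builder is called.
    from sklearn.linear_model import LogisticRegression
    return LogisticRegression()


def build_xgboost():
    # Imported lazily: xgboost is only required if this builder is called.
    from xgboost import XGBClassifier
    return XGBClassifier()


registry = ModelRegistry()
registry.register("logistic", build_logistic)
registry.register("xgboost", build_xgboost)
```

With this in place, `registry.create("logistic")` returns a fresh `LogisticRegression`, and supporting a new model takes one `register` call instead of another branch.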

📘 Section 8.3 – Documentation Best Practices

✍️ The Holy Trinity of Documentation:

  1. Docstrings

  2. README.md

  3. Jupyter annotations

🧭 Example of a Perfect Function Docstring

def calculate_roi(investment, returns):
    """
    Calculate Return on Investment (ROI)

    Parameters:
        investment (float): Initial investment amount
        returns (float): Total returns received

    Returns:
        float: ROI as a percentage

    Example:
        >>> calculate_roi(1000, 1200)
        20.0
    """
    return ((returns - investment) / investment) * 100
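
The `Example` section of a docstring can double as a test. The sketch below repeats `calculate_roi` with a shortened docstring so that it is self-contained, then runs the example through the standard-library `doctest` module. The helper name `check_docstring_examples` is an assumption for illustration.

```python
import doctest


def calculate_roi(investment, returns):
    """
    Calculate Return on Investment (ROI) as a percentage.

    Example:
        >>> calculate_roi(1000, 1200)
        20.0
    """
    return ((returns - investment) / investment) * 100


def check_docstring_examples(func):
    """Run the >>> examples in func's docstring; return (failures, tries)."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # module=False tells the finder not to look up the defining module;
    # globs supplies the names the examples are allowed to use.
    for test in finder.find(func, name=func.__name__, module=False,
                            globs={func.__name__: func}):
        runner.run(test)
    return runner.failures, runner.tries


failures, tries = check_docstring_examples(calculate_roi)
print(f"{tries} example(s) run, {failures} failure(s)")
```

If the function's behavior and its documented example ever disagree, the failure count becomes non-zero, which keeps the docstring honest.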

📂 README Template

## Customer Churn Prediction
A machine learning project that predicts churn probability using customer activity data.

## Steps
1. Data Cleaning
2. Feature Engineering
3. Model Training (RandomForest, XGBoost)
4. Evaluation and Reporting

## Run
```bash
python src/train_model.py
```

💡 Write documentation as if future-you will have forgotten everything in three months.


🌳 Section 8.4 – Version Control with Git and GitHub

🚀 Git = Your Business Time Machine

With Git you can:

  • Undo mistakes by returning to any committed version

  • Collaborate like a real dev team

  • Track every experiment

🧠 Core Workflow:

```bash
# Create the repo
git init
git add .
git commit -m "Initial commit - created clean project structure"

# Name the branch "main" (older Git versions default to "master")
git branch -M main

# Link with GitHub
git remote add origin https://github.com/yourname/business-ml.git
git push -u origin main
```

💪 Pro Tips

  • One feature = one branch

  • Commit often: “save game checkpoints”

  • Use meaningful commit messages:

    • ❌ “fixed stuff”

    • ✅ “added data preprocessing for missing values”

🌎 Bonus: GitHub Actions

Automate testing, training, and deployment with a workflow file such as .github/workflows/ci.yml.


⚙️ Section 8.5 – Building End-to-End Programs for Deployment

🧱 What “End-to-End” Means:

Raw Data → Cleaned Data → Model → Dashboard/API → Monitor → Repeat

📊 Business ML Pipeline Template

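def main():
    # load_data, preprocess, split_data, train_model, save_model, evaluate and
    # deploy_to_streamlit are placeholders to be implemented in src/
    data = load_data("data/sales_forecasting.csv")
    data = preprocess(data)
    train_data, test_data = split_data(data)  # hold out data for evaluation
    model = train_model(train_data)
    save_model(model, "models/sales_model.pkl")
    evaluate(model, test_data)  # score on data the model has not seen
    deploy_to_streamlit("models/sales_model.pkl")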

if __name__ == "__main__":
    main()
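
The helpers in the template are placeholders. To make the flow concrete, here is a self-contained sketch that uses only the standard library. Every detail is an assumption for illustration: the inline CSV stands in for `data/sales_forecasting.csv`, the "model" is just the mean of the training rows, the last month is held out for evaluation, and the deployment step is left out.

```python
import csv
import io
import pickle
import tempfile
from pathlib import Path
from statistics import mean

# Made-up monthly sales standing in for a real CSV file.
RAW_CSV = """month,sales
1,100
2,120
3,
4,140
5,160
"""


def load_data(text):
    return list(csv.DictReader(io.StringIO(text)))


def preprocess(rows):
    # Drop rows with missing sales and convert the rest to floats.
    return [float(r["sales"]) for r in rows if r["sales"]]


def train_model(train_values):
    # Baseline "model": always predict the training mean.
    return {"prediction": mean(train_values)}


def evaluate(model, test_values):
    # Mean absolute error on data the model has not seen.
    return mean(abs(v - model["prediction"]) for v in test_values)


def main():
    values = preprocess(load_data(RAW_CSV))
    train, test = values[:-1], values[-1:]  # hold out the latest month
    model = train_model(train)
    # Save and reload the model to mimic the hand-off to a deployed app.
    with tempfile.TemporaryDirectory() as tmp:
        model_path = Path(tmp) / "sales_model.pkl"
        model_path.write_bytes(pickle.dumps(model))
        restored = pickle.loads(model_path.read_bytes())
    return evaluate(restored, test)


if __name__ == "__main__":
    print(f"MAE on held-out month: {main():.1f}")
```

Swapping the baseline for a real estimator changes only `train_model` and `evaluate`; the shape of `main()` stays the same.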

🚀 Deployment Examples

| Method | Tool | Use Case |
| --- | --- | --- |
| Web App | Streamlit / Flask | Internal dashboards |
| API | FastAPI | Serve ML predictions |
| Notebook-to-App | Gradio | Quick business demos |
| Production | Docker + Cloud | Full enterprise deployment |

⚡ Remember: “A model isn’t useful until someone non-technical can use it.”


🧪 Section 8.6 – Testing and Debugging Business Applications

🧠 Why Test?

Because debugging in production is like fixing a plane mid-flight ✈️

🧰 Levels of Testing:

| Type | Checks | Example Tool |
| --- | --- | --- |
| Unit | Single functions | pytest |
| Integration | End-to-end pipelines | unittest |
| Regression | Model output drift | pytest + CI/CD |
| User Acceptance | Business KPIs | Manual review |
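
A regression check for model output drift can be as simple as comparing fresh predictions with a stored baseline. The sketch below is one way to do it. The function name `max_prediction_drift`, the sample numbers, and the 0.05 tolerance are assumptions; in practice the baseline would be loaded from a file saved with an approved model version.

```python
def max_prediction_drift(baseline, current):
    """Largest absolute difference between two equally long prediction lists."""
    if len(baseline) != len(current):
        raise ValueError("baseline and current predictions must have the same length")
    return max(abs(b - c) for b, c in zip(baseline, current))


def test_predictions_have_not_drifted():
    baseline = [0.10, 0.80, 0.35]  # hypothetical saved churn probabilities
    current = [0.12, 0.79, 0.35]   # hypothetical output of the retrained model
    tolerance = 0.05               # assumed acceptable change per prediction
    assert max_prediction_drift(baseline, current) <= tolerance


test_predictions_have_not_drifted()
```

Run in CI, a test like this fails as soon as a retrained model moves any prediction by more than the agreed tolerance.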

✅ Example Test

def test_calculate_roi():
    result = calculate_roi(1000, 1200)
    assert result == 20.0, "ROI calculation failed!"
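
A single happy-path assertion leaves edge cases untested. The sketch below adds two more checks in the same style. It repeats `calculate_roi` so the block runs on its own, and it compares floats with `math.isclose` rather than `==`. The chosen inputs are illustrative.

```python
import math


def calculate_roi(investment, returns):
    return ((returns - investment) / investment) * 100


def test_calculate_roi_loss():
    # Getting back less than was invested should give a negative ROI.
    assert math.isclose(calculate_roi(1000, 900), -10.0)


def test_calculate_roi_zero_investment():
    # The formula divides by the investment, so zero is not a valid input.
    try:
        calculate_roi(0, 100)
    except ZeroDivisionError:
        pass
    else:
        raise AssertionError("expected ZeroDivisionError for zero investment")


test_calculate_roi_loss()
test_calculate_roi_zero_investment()
```

Under pytest, the second test is more idiomatically written with `pytest.raises(ZeroDivisionError)`.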

🪄 Debugging Tricks

  • Use pdb: Python’s built-in debugger

  • Use try/except smartly:

    try:
        model.fit(X, y)
    except ValueError as e:
        # Catch only the error you expect, report it, then re-raise it
        print("💥 Model training failed:", e)
        raise

  • Add logging instead of 100 print statements:

    import logging
    logging.basicConfig(level=logging.INFO)
    logging.info("Model training started.")
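
The last two tricks combine well: catch the error, log it with its traceback, and re-raise it so the failure is not silently swallowed. A minimal sketch follows. `train_with_logging` and `BrokenModel` are hypothetical names, and `BrokenModel` exists only to trigger the error path.

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("training")


def train_with_logging(model, X, y):
    logger.info("Model training started.")
    try:
        model.fit(X, y)
    except ValueError:
        # logger.exception logs at ERROR level and appends the traceback.
        logger.exception("Model training failed.")
        raise
    logger.info("Model training finished.")
    return model


class BrokenModel:
    """Stand-in model whose fit always fails, used to exercise the error path."""

    def fit(self, X, y):
        raise ValueError("X and y have different lengths")


try:
    train_with_logging(BrokenModel(), [[1], [2]], [0])
except ValueError as error:
    print("Caught after logging:", error)
```

To inspect the state interactively instead, a `breakpoint()` call placed just before `model.fit(X, y)` drops into pdb at that line.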

🏁 Summary

| Principle | Business Impact |
| --- | --- |
| Clean Code | Easier maintenance, faster onboarding |
| Design Patterns | Scalable ML pipelines |
| Documentation | Saves hours for future teams |
| Git | Version safety + collaboration |
| End-to-End Design | Deployable business apps |
| Testing | Stable production systems |

🧩 Capstone Challenge

Design a mini end-to-end ML project applying all principles:

  • Clean project structure

  • Modular functions

  • GitHub repo + README

  • Simple test suite

  • Streamlit app or API deployment

💼 Example: “Predict monthly revenue for a retail chain and deploy results as a Streamlit dashboard.”

# Your code here

Exercises

Exercise 1

Write calculate_roi(investment, returns) that returns the ROI percentage. Include a simple docstring and return a float.


Exercise 2

Create format_readme(project_name, steps) that returns a README string with a title and numbered steps. Test it with sample inputs.


Exercise 3

Write a tiny unit test test_calculate_roi() that asserts calculate_roi(1000, 1200) == 20.0. The test should raise an AssertionError if it fails.