Program Design Principles¶
“Code is like a business: the better you structure it, the longer it survives.”
🎯 Learning Outcomes¶
By the end of this module, you’ll:
Write clean, modular, and maintainable Python code
Apply software design patterns to ML and business applications
Use Git + GitHub like a pro for collaboration and version control
Write documentation that makes your work readable and reproducible
Build end-to-end programs ready for deployment
Test and debug code efficiently with real-world business context
🧼 Section 8.1 – Writing Clean and Modular Code¶
💡 Key Idea:¶
“Code should be written for humans first, machines second.”
✅ The Business-ML Developer’s Commandments:¶
Functionize Everything — If you write the same code twice → make it a function.
Name Like a CEO:
Bad: x, df2, temp3
Good: customer_df, monthly_sales, calc_profit_margin()
One Function = One Purpose.
A function that does 10 things = a business meeting that never ends.
Use Comments for Why, not What.
❌ # clean data
✅ # Remove inactive customers to reduce churn bias
Modularize your notebooks → Split code into .py utilities:
# utils/data_cleaning.py
def remove_nulls(df):
    return df.dropna()
Structure your project like this:
business_ml_project/
├── data/
├── notebooks/
├── src/
│   ├── preprocessing.py
│   ├── model_training.py
│   └── utils/
├── tests/
├── requirements.txt
└── README.md
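The commandments above can be sketched in a few lines of plain Python. The record layout (dictionaries with `active`, `revenue`, and `cost` keys), the sample values, and the function names are assumptions chosen for illustration, not part of any library.

```python
"""Illustrative sketch: small, single-purpose functions with descriptive names."""


def remove_inactive_customers(customers):
    """Return only the customers whose 'active' flag is truthy."""
    # Why: inactive accounts would bias churn statistics downstream.
    return [customer for customer in customers if customer["active"]]


def calc_profit_margin(revenue, cost):
    """Return profit margin as a fraction of revenue."""
    if revenue == 0:
        raise ValueError("revenue must be non-zero to compute a margin")
    return (revenue - cost) / revenue


def average_margin(customers):
    """Average the profit margin across customer records."""
    margins = [calc_profit_margin(c["revenue"], c["cost"]) for c in customers]
    return sum(margins) / len(margins)


# Hypothetical sample records, for illustration only.
sample_customers = [
    {"name": "Acme", "active": True, "revenue": 200.0, "cost": 100.0},
    {"name": "Globex", "active": False, "revenue": 50.0, "cost": 60.0},
    {"name": "Initech", "active": True, "revenue": 400.0, "cost": 300.0},
]

active_customers = remove_inactive_customers(sample_customers)
print(len(active_customers), average_margin(active_customers))
```

Each function does one job and can be tested on its own, and the only comment explains why the filter exists rather than restating what the code does.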
🏗️ Section 8.2 – Design Patterns for ML Applications¶
🤖 Why Design Patterns?¶
Design patterns = “proven blueprints” for recurring coding problems.
💼 Business-ML Patterns:¶
| Pattern | Use Case | Example |
|---|---|---|
| Pipeline Pattern | Sequential steps for ML workflow | sklearn.pipeline.Pipeline |
| Factory Pattern | Create different models dynamically | Select model from config file |
| Observer Pattern | Model monitoring or callback alerts | Logging drift in production |
| Strategy Pattern | Switch between algorithms | Try multiple churn models |
| Singleton Pattern | One shared config object | Shared DB or API connector |
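To make one row of the table concrete, here is a minimal sketch of the Strategy pattern in plain Python. The two scoring rules, the field names (`days_since_last_order`, `support_tickets`), and the thresholds are invented for illustration; a real project would plug in trained churn models instead.

```python
from typing import Callable, Dict

# A strategy is any callable that maps a customer record to a churn score in [0, 1].
ChurnStrategy = Callable[[Dict[str, float]], float]


def recency_strategy(customer: Dict[str, float]) -> float:
    """Score churn risk from inactivity alone (hypothetical rule: 90+ days = 1.0)."""
    return min(customer["days_since_last_order"] / 90, 1.0)


def support_strategy(customer: Dict[str, float]) -> float:
    """Score churn risk from support load (hypothetical rule: 5+ tickets = 1.0)."""
    return min(customer["support_tickets"] / 5, 1.0)


class ChurnScorer:
    """Context object: delegates scoring to whichever strategy it was given."""

    def __init__(self, strategy: ChurnStrategy) -> None:
        self.strategy = strategy

    def score(self, customer: Dict[str, float]) -> float:
        return self.strategy(customer)


customer = {"days_since_last_order": 45, "support_tickets": 5}

# Swap the algorithm without touching the calling code.
for strategy in (recency_strategy, support_strategy):
    scorer = ChurnScorer(strategy)
    print(strategy.__name__, scorer.score(customer))
```

The calling loop never changes when a new scoring rule is added, which is the point of the pattern.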
⚙️ Example: ML Factory Pattern¶
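class ModelFactory:
    """Create a classifier from a short model-type name."""

    def get_model(self, model_type):
        # Import lazily so only the selected library needs to be installed
        if model_type == "xgboost":
            from xgboost import XGBClassifier
            return XGBClassifier()
        elif model_type == "logistic":
            from sklearn.linear_model import LogisticRegression
            return LogisticRegression()
        else:
            raise ValueError(f"Unknown model type: {model_type!r}")

# X_train and y_train are assumed to be your prepared training features and labels
model = ModelFactory().get_model("logistic")
model.fit(X_train, y_train)
📘 Section 8.3 – Documentation Best Practices¶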
✍️ The Holy Trinity of Documentation:¶
Docstrings
README.md
Jupyter annotations
🧭 Example of a Perfect Function Docstring¶
def calculate_roi(investment, returns):
    """
    Calculate Return on Investment (ROI).

    Parameters:
        investment (float): Initial investment amount
        returns (float): Total returns received

    Returns:
        float: ROI as a percentage

    Example:
        >>> calculate_roi(1000, 1200)
        20.0
    """
    return ((returns - investment) / investment) * 100
📂 README Template¶
## Customer Churn Prediction
A machine learning project that predicts churn probability using customer activity data.
## Steps
1. Data Cleaning
2. Feature Engineering
3. Model Training (RandomForest, XGBoost)
4. Evaluation and Reporting
## Run
```bash
python src/model_training.py
```
💡 Write documentation like future-you will forget everything in 3 months.
🌳 Section 8.4 – Version Control with Git and GitHub¶
🚀 Git = Your Business Time Machine¶
You can:
- Roll back to any committed version when something breaks
- Collaborate like a real dev team
- Track every experiment
🧠 Core Workflow:¶
```bash
# Create repo
git init
git add .
git commit -m "Initial commit - created clean project structure"
# Link with GitHub (rename the local branch to main first, in case git created master)
git branch -M main
git remote add origin https://github.com/yourname/business-ml.git
git push -u origin main
```
💪 Pro Tips¶
One feature = one branch
Commit often: “save game checkpoints”
Use meaningful commit messages:
❌ “fixed stuff”
✅ “added data preprocessing for missing values”
🌎 Bonus: GitHub Actions¶
Automate testing, training, and deployment with .github/workflows/ci.yml.
⚙️ Section 8.5 – Building End-to-End Programs for Deployment¶
🧱 What “End-to-End” Means:¶
Raw Data → Cleaned → Model → Dashboard/API → Monitor → Repeat
📊 Business ML Pipeline Template¶
def main():
    # load_data, preprocess, train_model, save_model, evaluate, and
    # deploy_to_streamlit are placeholders for your own project functions
    data = load_data("data/sales_forecasting.csv")
    data = preprocess(data)
    model = train_model(data)
    save_model(model, "models/sales_model.pkl")
    evaluate(model, data)
    deploy_to_streamlit("models/sales_model.pkl")

if __name__ == "__main__":
    main()
🚀 Deployment Examples¶
| Method | Tool | Use Case |
|---|---|---|
| Web App | Streamlit / Flask | Internal dashboards |
| API | FastAPI | Serve ML predictions |
| Notebook-to-App | Gradio | Quick business demos |
| Production | Docker + Cloud | Full enterprise deployment |
⚡ Remember: “A model isn’t useful until someone non-technical can use it.”
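The helper functions in the pipeline template earlier in this section are left undefined. The sketch below fills in hypothetical versions using only the standard library so the flow can be run end to end. The CSV column names (`month`, `revenue`), the sample rows, the mean-value "model", and the temporary file paths are all assumptions, and the deployment step is left out.

```python
import csv
import pickle
import statistics
import tempfile
from pathlib import Path


def load_data(path):
    """Read a CSV file into a list of row dictionaries."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


def preprocess(rows):
    """Drop rows with a missing revenue value and convert revenue to float."""
    return [float(row["revenue"]) for row in rows if row["revenue"].strip()]


def train_model(revenues):
    """Stand-in 'model': predicts the historical mean revenue."""
    return {"mean_revenue": statistics.mean(revenues)}


def save_model(model, path):
    """Serialize the model to disk with pickle."""
    Path(path).parent.mkdir(parents=True, exist_ok=True)
    with open(path, "wb") as handle:
        pickle.dump(model, handle)


def evaluate(model, revenues):
    """Return the mean absolute error of the constant prediction."""
    errors = [abs(value - model["mean_revenue"]) for value in revenues]
    return statistics.mean(errors)


def main(workdir):
    """Run the whole flow inside workdir and return (model, error)."""
    data_path = Path(workdir) / "data" / "sales.csv"
    data_path.parent.mkdir(parents=True, exist_ok=True)
    # Hypothetical sample data; the second row has a missing value.
    data_path.write_text("month,revenue\n1,100\n2,\n3,300\n")

    revenues = preprocess(load_data(data_path))
    model = train_model(revenues)
    save_model(model, Path(workdir) / "models" / "sales_model.pkl")
    return model, evaluate(model, revenues)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(main(tmp))
```

Because each stage is its own function, any one of them can later be swapped for a pandas or scikit-learn version without changing `main`.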
🧪 Section 8.6 – Testing and Debugging Business Applications¶
🧠 Why Test?¶
Because debugging in production is like fixing a plane mid-flight ✈️
🧰 Levels of Testing:¶
| Type | Checks | Example Tool |
|---|---|---|
| Unit | Single functions | pytest |
| Integration | End-to-end pipelines | unittest |
| Regression | Model output drift | pytest + CI/CD |
| User Acceptance | Business KPIs | Manual review |
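As one possible reading of the regression row, the sketch below compares a model's current predictions against a stored baseline and fails when they drift beyond a tolerance. The `predict` function, the baseline values, and the 5% tolerance are hypothetical stand-ins; in a real project the baseline would come from a saved artifact and the test would typically run under pytest in CI.

```python
# Hypothetical baseline predictions recorded from a previously approved model.
BASELINE_PREDICTIONS = [100.0, 150.0, 200.0]


def predict(features):
    """Stand-in for a real model: revenue grows by 50 per unit of feature."""
    return [100.0 + 50.0 * value for value in features]


def max_relative_drift(current, baseline):
    """Return the largest relative difference between paired predictions."""
    return max(abs(c - b) / abs(b) for c, b in zip(current, baseline))


def test_predictions_match_baseline():
    current = predict([0, 1, 2])
    drift = max_relative_drift(current, BASELINE_PREDICTIONS)
    # Fail the build if any prediction moved more than 5% from the baseline.
    assert drift <= 0.05, f"Model output drifted by {drift:.1%}"


# pytest would discover the test by name; calling it directly also works.
test_predictions_match_baseline()
```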
✅ Example Test¶
def test_calculate_roi():
    result = calculate_roi(1000, 1200)
    assert result == 20.0, "ROI calculation failed!"
🪄 Debugging Tricks¶
Use pdb: Python’s built-in debugger
Use try/except smartly:
try:
    model.fit(X, y)
except ValueError as e:
    print("💥 Model training failed:", e)
Add logging instead of 100 print statements:
import logging
logging.basicConfig(level=logging.INFO)
logging.info("Model training started.")
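The last two tricks combine naturally: catch the specific exception, then log it with its traceback instead of printing. This is a minimal sketch; `train` and its validation rule are hypothetical stand-ins for a real `model.fit` call, and the logger name is arbitrary.

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(name)s: %(message)s")
logger = logging.getLogger("training")


def train(features, labels):
    """Hypothetical stand-in for model.fit: validates input lengths only."""
    if len(features) != len(labels):
        raise ValueError(
            f"features has {len(features)} rows but labels has {len(labels)}"
        )
    return {"n_samples": len(features)}


def safe_train(features, labels):
    """Train and return the model, or log the failure and return None."""
    logger.info("Model training started.")
    try:
        model = train(features, labels)
    except ValueError:
        # logger.exception records the message plus the full traceback.
        logger.exception("Model training failed")
        return None
    logger.info("Model training finished on %d samples.", model["n_samples"])
    return model


safe_train([[1], [2]], [0, 1])  # logs start and finish
safe_train([[1], [2]], [0])     # logs the ValueError with its traceback
```

When a logged traceback is not enough, placing Python's built-in `breakpoint()` call just before the failing line opens pdb at that point.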
🏁 Summary¶
| Principle | Business Impact |
|---|---|
| Clean Code | Easier maintenance, faster onboarding |
| Design Patterns | Scalable ML pipelines |
| Documentation | Saves hours for future teams |
| Git | Version safety + collaboration |
| End-to-End Design | Deployable business apps |
| Testing | Stable production systems |
🧩 Capstone Challenge¶
Design a mini end-to-end ML project applying all principles:
Clean project structure
Modular functions
GitHub repo + README
Simple test suite
Streamlit app or API deployment
💼 Example: “Predict monthly revenue for a retail chain and deploy results as a Streamlit dashboard.”
# Your code here
Exercises¶
Exercise 1¶
Write calculate_roi(investment, returns) that returns the ROI percentage. Include a simple docstring and return a float.
Exercise 2¶
Create format_readme(project_name, steps) that returns a README string with a title and numbered steps. Test it with sample inputs.
Exercise 3¶
Write a tiny unit test test_calculate_roi() that asserts calculate_roi(1000,1200) == 20.0. The test should raise an AssertionError if it fails.