Version Control with Git and GitHub

(a.k.a. “How to Time Travel Without Breaking the Space-Time Continuum”)

If you’ve ever:

  • Deleted your main.py by accident,

  • “Fixed” a bug that created three new ones,

  • Or spent an hour screaming “WHY WON’T IT PUSH?” at your terminal...

Then congratulations — you’ve already experienced the spiritual initiation known as Git. 🙏

Git is not just version control — it’s a multiverse manager for your code. You can go back in time, explore alternate universes (branches), and even merge timelines (hopefully without paradoxes).


🪄 1. Git in One Sentence

“Git is like a diary for your code — except it keeps receipts.” 🧾

Every time you commit, Git takes a snapshot of your project. When something breaks, you can say, “It worked yesterday, let’s time-travel back!”

Or, if you’re a machine learning engineer:

“It worked yesterday... on my machine... with that one data file I deleted.” 😅
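To make the "snapshot" idea concrete, here is a toy sketch in Python. It is not how Git actually stores data; `ToyRepo` and its methods are hypothetical names invented for illustration. It only shows the core promise: each commit freezes a copy of your files, and you can get any earlier copy back by its ID.

```python
import hashlib


class ToyRepo:
    """A toy, in-memory commit history (an illustration, not Git's real storage)."""

    def __init__(self):
        self.commits = []  # (commit_id, message, snapshot) in commit order

    def commit(self, files, message):
        """Freeze a copy of `files` (a dict of name -> text) and return its ID."""
        snapshot = dict(files)  # copy, so later edits cannot alter history
        parent = self.commits[-1][0] if self.commits else ""
        payload = parent + message + "".join(
            f"{name}\0{text}\0" for name, text in sorted(snapshot.items())
        )
        commit_id = hashlib.sha1(payload.encode("utf-8")).hexdigest()[:8]
        self.commits.append((commit_id, message, snapshot))
        return commit_id

    def checkout(self, commit_id):
        """Return a copy of the files exactly as they were at `commit_id`."""
        for cid, _message, snapshot in self.commits:
            if cid == commit_id:
                return dict(snapshot)
        raise KeyError(commit_id)


repo = ToyRepo()
working = {"main.py": "print('v1')"}
first = repo.commit(working, "Initial version")

working["main.py"] = "print('v2, now broken')"
repo.commit(working, "Refactor")

# Time travel: the first snapshot is untouched by the later edit.
print(repo.checkout(first)["main.py"])  # prints: print('v1')
```

Real Git does far more (branches, merges, efficient storage), but the mental model of "immutable snapshots you can return to" carries over.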


📦 2. The Business Case for Git

Business software (and ML pipelines) change constantly — new features, new models, new interns.

Without Git, you get chaos:

pricing_model_final_v2_really_final_THIS_ONE_TRUST_ME.py

With Git:

git commit -m "Refactored pricing model with improved elasticity factor"

Boom. Instant professionalism. Git makes your project:

  • Reproducible — You can recreate any version.

  • Collaborative — Multiple devs, one codebase.

  • Accountable — Every change has an author and a reason (or a panicked “fixes bug!!!”).


🪜 3. The Git Workflow (a.k.a. “Git Yoga”) 🧘‍♂️

The Core Moves:

| Command | What It Does | Real-Life Analogy |
| --- | --- | --- |
| git init | Starts version control | “Today I begin journaling my chaos.” |
| git add . | Stages files for commit | “These are the thoughts I want to keep.” |
| git commit -m "message" | Saves a snapshot | “Diary entry saved.” |
| git push | Uploads to GitHub | “Backing up to the cloud, just in case.” |
| git pull | Gets latest changes | “Let’s see what the others broke today.” |
| git branch | Creates a new timeline | “What if I tried a new idea… safely?” |
| git merge | Merges timelines | “Okay, let’s pretend both versions can coexist peacefully.” |

🌳 4. Branching Strategies for Real ML Projects

For large ML or business systems, branching is how you avoid coding over each other’s work like a chaotic group project.

👇 The Classic Setup:

main
│
├── dev
│   ├── feature/pricing_model
│   ├── feature/dashboard_ui
│   ├── bugfix/logging_issue
│   └── experiment/lstm_forecast

  • Main → Production-ready code

  • Dev → Where active features live

  • Feature branches → Each new addition (model, API, dashboard)

  • Experiment branches → For ML chaos: tuning, trying new models, breaking things intentionally
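A naming convention only helps if someone (or something) enforces it. Below is a minimal Python sketch that checks a branch name against the layout in the diagram. The prefixes come from the diagram; the exact rule (lowercase letters, digits, dashes, underscores after the slash) and the function name `classify_branch` are assumptions made for illustration, not a Git requirement.

```python
import re

# Prefixes taken from the branch diagram; the pattern itself is an assumed house rule.
ALLOWED_PREFIXES = ("feature", "bugfix", "experiment")
LONG_LIVED = ("main", "dev")
_NAME_RE = re.compile(r"^(?P<prefix>[a-z]+)/(?P<topic>[a-z0-9][a-z0-9_-]*)$")


def classify_branch(name):
    """Return the branch kind, or raise ValueError if the name breaks the convention."""
    if name in LONG_LIVED:
        return name
    match = _NAME_RE.match(name)
    if match is None or match.group("prefix") not in ALLOWED_PREFIXES:
        raise ValueError(f"branch name {name!r} does not follow the convention")
    return match.group("prefix")


print(classify_branch("feature/pricing_model"))     # prints: feature
print(classify_branch("experiment/lstm_forecast"))  # prints: experiment
```

A check like this could run in a pre-push hook or a CI job, so that `my-random-branch-2` gets rejected before it reaches the team.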

Never train directly on main — that’s like cooking in the server room. 🍳🔥


🧠 5. Git for Machine Learning Projects

Here’s the thing: ML work isn’t just code. You’re versioning:

  • Code

  • Data

  • Models

  • Configurations

That’s like juggling chainsaws. So you need tools that extend Git.

🔧 Use These:

  • Git LFS (Large File Storage): For big model weights (.pkl, .h5, .pt).

  • DVC (Data Version Control): Tracks datasets and model artifacts.

  • MLflow + Git: Log experiments while keeping the code version linked to commits.

Example:

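# Track the dataset with DVC first, then commit its small .dvc pointer file with the code
dvc add data/sales_2024.csv
git add data/sales_2024.csv.dvc data/.gitignore
git commit -m "Add XGBoost model training pipeline"
dvc push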

Now your repo knows which version of data and model went with that code — future you will weep tears of joy. 😭
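The trick behind data-versioning tools is simple: the big file lives outside Git, and Git only tracks a tiny pointer that records the file's content hash. The Python sketch below imitates that idea. It is not DVC's actual file format or API; `write_pointer` and the `.pointer.json` suffix are hypothetical names used only to show the mechanism.

```python
import hashlib
import json
from pathlib import Path


def write_pointer(data_path):
    """Write a small JSON pointer holding the data file's SHA-256 hash and size.

    The pointer is what you would commit to Git; the data itself stays out of the repo.
    """
    digest = hashlib.sha256()
    with data_path.open("rb") as fh:
        # Read in 1 MiB chunks so large datasets do not need to fit in memory.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    pointer = {
        "path": data_path.name,
        "sha256": digest.hexdigest(),
        "size": data_path.stat().st_size,
    }
    pointer_path = data_path.with_name(data_path.name + ".pointer.json")
    pointer_path.write_text(json.dumps(pointer, indent=2))
    return pointer_path
```

If the dataset changes by even one byte, the hash in the pointer changes, so the Git history shows exactly which commit went with which version of the data.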


🧩 6. The Sacred “Pull Request” (PR)

A Pull Request is like saying:

“Dear Team, I made changes. Please judge me.” 🙇‍♂️

It’s not just a merge — it’s a review ritual where:

  • Teammates comment, suggest, or roast your code.

  • Automated tests check if you broke production.

  • You discover your function names are “too creative.”

Pro Tip: Add humor to your PR titles.

  • “Add new forecasting module (it actually works this time)”

  • “Refactor ML pipeline to stop crying at runtime”

And always attach screenshots or metrics. Because reviewers love proof. 📊


🔍 7. GitHub for Business and ML Projects

GitHub isn’t just a storage unit — it’s your collaboration hub.

Use these features to run a real team like a pro startup:

  • 🧩 Issues → For tasks, bugs, and “remember to fix that thing we ignored.”

  • 🔀 Pull Requests → Code reviews and merge requests.

  • ⚙️ Actions (CI/CD) → Automate model tests, deployments, or notebook runs.

  • 🧾 Wiki → Document system architecture or business logic.

  • 📊 Projects → Kanban boards for “what’s cooking.”

Example: When your ML model passes all tests, GitHub Actions can automatically deploy it to an API.

You sip your coffee while bots do the boring stuff. ☕🤖
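What does "passes all tests" look like for a model? One common pattern is a quality gate: a small script the CI job runs after evaluation, which fails the job when metrics are worse than agreed limits. Here is a hedged Python sketch. The metric names, the thresholds, and the function names are all hypothetical; a real project would choose its own.

```python
# Hypothetical limits: each metric must be less than or equal to its threshold.
THRESHOLDS = {"mape": 0.15, "rmse": 250.0}


def failed_checks(metrics, thresholds=THRESHOLDS):
    """Return human-readable failures; an empty list means the gate passes."""
    failures = []
    for name, limit in thresholds.items():
        value = metrics.get(name)
        if value is None:
            failures.append(f"{name}: missing")
        elif value > limit:
            failures.append(f"{name}: {value} > {limit}")
    return failures


def main(metrics):
    """Print any failures and return an exit code (0 = pass, 1 = fail)."""
    failures = failed_checks(metrics)
    for line in failures:
        print("FAIL", line)
    return 1 if failures else 0
```

In a CI job you would end the script with something like `sys.exit(main(metrics))`, because a non-zero exit code is what makes the step (and any deployment that depends on it) stop.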


🧱 8. Best Practices for Business-Scale Version Control

✅ Do:

  • Use meaningful commit messages (“Added price elasticity model” > “stuff fixed lol”).

  • Keep feature branches small — don’t merge a 5,000-line “update.”

  • Review before merging — even if you’re reviewing your own code.

  • Tag releases (v1.0.0, v1.1.0) for clear version history.

  • Automate everything — testing, linting, data checks.

🚫 Don’t:

  • Commit large datasets (use DVC or external storage).

  • Force push to main (unless you enjoy chaos).

  • Store API keys in your repo. (Your company will not find this funny.)
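The "no API keys in the repo" rule is easy to state and easy to break at 2 a.m. A rough safety net is to scan files for lines that look like hard-coded secrets before committing. The sketch below is deliberately naive: the pattern and the name `find_secrets` are assumptions for illustration, and it will miss many real leaks, so dedicated secret-scanning tools remain the better choice.

```python
import re

# Assumed heuristic: a name containing key/token/secret/password,
# assigned a quoted literal of at least 8 characters.
_SECRET_RE = re.compile(
    r"(?i)\b\w*(key|token|secret|password)\w*\s*[:=]\s*['\"][^'\"]{8,}['\"]"
)


def find_secrets(text):
    """Return the 1-based line numbers that look like hard-coded secrets."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if _SECRET_RE.search(line)
    ]


sample = (
    'API_KEY = "sk_live_1234567890"\n'  # hard-coded: flagged
    'key = os.environ["API_KEY"]\n'     # read from the environment: fine
)
print(find_secrets(sample))  # prints: [1]
```

The second line of the sample shows the safer habit: keep the secret in an environment variable (or a secrets manager) and read it at runtime.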


🧮 9. Example: Version-Controlled ML Repository

sales_forecasting_system/
│
├── data/                  # tracked with DVC
│   ├── raw/
│   ├── processed/
│   └── dvc.yaml
│
├── models/                # tracked with Git LFS
│   ├── xgboost_model.pkl
│   └── lstm_model.pt
│
├── src/
│   ├── data_pipeline.py
│   ├── model_training.py
│   └── api_server.py
│
├── .github/
│   ├── workflows/
│   │   └── ci.yml
│
└── README.md

Now every commit = versioned code, model, and data pipeline snapshot.

If something goes wrong in production, you can literally rewind to the moment before disaster struck — like an undo button for life. 🕰️


🖱️ 10. The GitHub GUI: Version Control Without Terminal PTSD

Some developers swear by the terminal. Others just want a nice button that says “Push” and doesn’t judge them.

Enter: GitHub’s Graphical Interface (GUI) — your friendly visual assistant for all things version control.

It’s like using Excel instead of raw SQL: same power, fewer tears. 😅


🎨 GitHub Desktop: The “Click-Friendly” Git

If you prefer drag-and-drop over type-and-cry, GitHub Desktop is your best friend. It’s a free desktop app that makes Git operations as easy as sending an email.

You can:

  • Clone repositories with a click 🖱️

  • Switch and create branches visually 🌳

  • Commit changes with descriptive messages 💬

  • Undo mistakes without summoning Stack Overflow 🔄

  • Compare code side-by-side before pushing

Pro Tip: You can even see your commit graph as a timeline of your emotional state during development.

Example:

  • Calm commits: “Update README.md”

  • Chaotic commits: “pls work now FINAL_fixed2.py”

  • Panic merges: “merged main into main??? help”


💻 GitHub Web Interface: The Cloud-Based Control Room ☁️

The GitHub website isn’t just for browsing repos — it’s a full command center. From your browser, you can:

  • Edit files directly (yes, even in production — but please don’t 😬)

  • Create and review Pull Requests with rich diff views

  • Comment, tag, and assign teammates to issues

  • Visualize branch graphs

  • Manage workflows and deployments with Actions

  • Write Markdown documentation that looks like a website

Use Case: Let’s say your marketing analyst spots a typo in your ML dashboard’s docs. Instead of cloning, branching, and pushing — they can edit the README right on GitHub, submit a PR, and you can approve it before your coffee cools. ☕

That’s collaboration — with zero terminal trauma.


⚙️ The GitHub UI for Collaboration: The “Control Tower”

In large business or ML projects, GitHub’s GUI becomes your team’s command hub:

  • Pull Requests: Where code meets judgment day.

  • Issues: To track bugs, ideas, and existential questions like “why is accuracy dropping?”

  • Projects: Kanban-style task boards for your dev + data teams.

  • Actions: Your CI/CD pipelines with buttons and dashboards (no YAML panic).

  • Wiki: Internal knowledge base for onboarding new hires and confused executives.

Each of these sections in the GitHub GUI connects your code to business reality by tracking work, ownership, and progress.


🧩 GitHub GUI for ML Teams

For Machine Learning workflows, the GitHub GUI also plays a key role in:

  • Reviewing model results (attach plots to PRs 📈)

  • Discussing metrics (use comment threads like peer reviews)

  • Tracking data version files (through DVC/Git LFS integrations)

  • Managing release versions of trained models

Pro Tip: Use GitHub’s “Releases” tab to tag trained model versions:

v1.0.0 - Prophet model for Q1 demand forecasting
v1.1.0 - XGBoost model with improved holiday season accuracy

This lets business teams know which model went live — without ever opening code.
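Tags like v1.0.0 and v1.1.0 follow a major.minor.patch pattern, and computing the next one by hand is a classic source of typos. Here is a small Python sketch that does the bump. The function name `bump` and the strict `vMAJOR.MINOR.PATCH` format are assumptions; which kind of model change counts as "major" or "minor" is a team decision that this code does not settle.

```python
import re

_TAG_RE = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def bump(tag, part):
    """Return the next tag after `tag`, bumping 'major', 'minor', or 'patch'."""
    match = _TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"not a vMAJOR.MINOR.PATCH tag: {tag!r}")
    major, minor, patch = (int(group) for group in match.groups())
    if part == "major":
        return f"v{major + 1}.0.0"
    if part == "minor":
        return f"v{major}.{minor + 1}.0"
    if part == "patch":
        return f"v{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


print(bump("v1.0.0", "minor"))  # prints: v1.1.0
```

Bumping a higher part resets the lower ones to zero, which is why a major bump of v1.4.2 gives v2.0.0 rather than v2.4.2.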


🕶️ When to Use GUI vs. Terminal

| Task | GUI | Terminal |
| --- | --- | --- |
| Cloning a repo | | |
| Creating a branch | | |
| Resolving merge conflicts | 😬 (possible) | 😎 (powerful) |
| Reviewing PRs | | 🚫 |
| Setting up CI/CD | 😕 | |
| Deep Git surgery (rebases, cherry-picks) | 🚫 | 🧙‍♂️ Advanced magic only |

Think of the GUI as the “Tesla autopilot” of Git — great for 90% of the journey, but you still need manual control for the tricky turns. ⚙️


🌍 GitHub Copilot and Codespaces

GitHub is more than just version control now — it’s an AI-powered coding platform.

  • GitHub Copilot — Your AI pair programmer. Suggests code, comments, and sometimes poetry. Perfect for ML devs who type “import pandas as pd” 400 times a day.

  • GitHub Codespaces — A full VS Code environment in the browser. Launch, edit, and run your entire ML project on the cloud. No setup. No “works on my machine.” Just instant productivity. 🚀

Combine them, and you’re coding, committing, and deploying — all from the GUI. The future is now (and it has dark mode). 🌚


💬 Final Thoughts

The GitHub GUI isn’t “cheating” — it’s efficient delegation. You’re freeing your brain from remembering 20 different Git commands and focusing on building great systems.

Use the GUI for what it’s best at:

  • Clarity

  • Visualization

  • Collaboration

  • Saving your sanity

“Real developers use Git in the terminal.” — Someone who hasn’t discovered the ‘Undo’ button yet. 😏


# Your code here

Exercises

Exercise

Write format_branch_name(ticket, desc) that returns a sanitized branch name like ticket/short-desc (lowercase, replace spaces with dashes, remove illegal characters).
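Once you have tried it yourself, here is one possible approach to compare against. It is a sketch, not the official answer. It assumes "illegal characters" means anything other than lowercase letters, digits, dashes, and underscores, which is stricter than Git's own rules for branch names. The helper `_slug` is a name invented here.

```python
import re


def _slug(text):
    """Lowercase, turn whitespace runs into dashes, and drop other characters."""
    text = text.strip().lower()
    text = re.sub(r"\s+", "-", text)         # spaces -> dashes
    text = re.sub(r"[^a-z0-9_-]", "", text)  # remove everything not allowed
    text = re.sub(r"-{2,}", "-", text)       # collapse repeated dashes
    return text.strip("-")


def format_branch_name(ticket, desc):
    """Return a sanitized branch name of the form ticket/short-desc."""
    ticket_part = _slug(ticket)
    desc_part = _slug(desc)
    if not ticket_part or not desc_part:
        raise ValueError("ticket and desc must contain at least one legal character")
    return f"{ticket_part}/{desc_part}"


print(format_branch_name("ML-42", "Fix Logging Issue!"))  # prints: ml-42/fix-logging-issue
```

Raising an error on empty input is a design choice: silently returning something like `/` would create a branch name nobody can find later.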