File System Operations and Scripting¶
“Because every hero’s journey starts with: cd ~.”¶
🧭 1. The Linux Jungle¶
Welcome to the file system — a mysterious land filled with folders named after punctuation.
Here’s the lay of the land:
| Directory | Purpose | Fun Fact |
|---|---|---|
| /home/ | Where your personal mess lives | Like your desktop, but Linuxier |
| /etc/ | System config files | Stands for “et cetera”… because no one knows what’s really in there |
| /var/ | Logs, temp data, chaos | “var” stands for “variable,” as in it varies how badly this breaks |
| /tmp/ | Temporary files | Like a hotel for files: everyone checks in, nobody survives reboot |
| /bin/ | System binaries | Where ls, cp, and your fate reside |
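To see how the table maps onto a real machine, here is a minimal sketch that only reads and never writes. describe_dir is a hypothetical helper name, not a standard command. On many current distributions /bin turns out to be a symlink to usr/bin, which the sketch reports rather than hides.

```shell
# Hypothetical helper: say what a path is without changing anything.
describe_dir() {
    dir=$1
    if [ -L "$dir" ]; then
        echo "$dir -> $(readlink "$dir") (symlink)"
    elif [ -d "$dir" ]; then
        echo "$dir (directory)"
    else
        echo "$dir (not present on this system)"
    fi
}

# The directories from the table above.
for d in /home /etc /var /tmp /bin; do
    describe_dir "$d"
done
```

The output depends on the system, so treat it as a quick orientation tool rather than a definition of the layout.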
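If you ever want to feel powerful and terrified at the same time, consider what this would do (look, don’t run):
sudo rm -rf /
Congratulations: on older systems, that was enlightenment through total data loss. ☠️ Modern GNU rm refuses to touch / unless you add --no-preserve-root, but aim the same command at any other directory and everything under it is gone without a single question.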
📂 2. Basic File Operations¶
The Linux file system doesn’t care who you are — if you don’t have permissions, you’re just another mortal.
Look around:¶
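ls -lh
The -l flag gives the long listing (permissions, owner, size, modification time), and -h prints sizes in human-friendly units such as 5.0G instead of a raw byte count. (Because computers don’t care if a file is 5 GB or “Oops, too big.”)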
Move around:¶
cd /home/user/Documents
cd: the adult version of “Are we there yet?”
Make new stuff:¶
mkdir reports
touch data.csv
mkdir: makes a folder
touch: creates an empty file, or updates the timestamp of an existing one (it’s basically a polite “poke”)
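A minimal sketch of those commands in a throwaway directory, so nothing real gets poked. The names reports/archive/q1 and sales.csv are made up for illustration, and the sandbox comes from mktemp -d.

```shell
# Work in a throwaway directory so nothing real gets touched.
sandbox=$(mktemp -d)
cd "$sandbox"

mkdir reports                      # make a folder
mkdir -p reports/archive/q1        # -p also creates any missing parent folders
touch data.csv                     # create an empty file
printf 'id,amount\n1,10\n' > reports/sales.csv

touch reports/sales.csv            # existing file: only the timestamp changes
ls -lh                             # data.csv shows a size of 0
ls -lh reports                     # sales.csv keeps its contents
```

Plain mkdir fails if a parent folder is missing or the folder already exists; mkdir -p quietly handles both, which is why scripts tend to prefer it.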
🗃️ 3. Copy, Move, Rename — the Linux Shuffle¶
Copy a file:¶
cp model.pkl backup_model.pkl
Move or rename:¶
mv backup_model.pkl /opt/models/
Copy a whole folder (recursively):¶
cp -r data/ archive/
⚠️ Be careful with -r. It’s recursive, meaning it’ll dive into every subfolder like a nosy detective.
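One detail that regularly surprises people: where cp -r puts things depends on whether the destination already exists. The sketch below shows both cases in a sandbox, with made-up file names. It deliberately leaves off trailing slashes, because GNU and BSD cp treat a trailing slash on the source differently.

```shell
sandbox=$(mktemp -d)
cd "$sandbox"
mkdir -p data/raw
echo "a,b" > data/raw/one.csv

# archive does not exist yet, so it becomes a copy of data.
cp -r data archive
ls archive                 # shows raw

# archive exists now, so data is copied *inside* it as archive/data.
cp -r data archive
ls archive                 # shows data and raw

# mv renames when the target is a new name, and moves when it is an existing folder.
mv archive/raw/one.csv archive/raw/renamed.csv
mv archive/raw/renamed.csv archive/
```

If a backup keeps growing nested copies of itself, the second case is usually the culprit.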
🧨 4. Deletion: The Point of No Return¶
When you run:
rm important_file.txt
Linux doesn’t ask “Are you sure?”; it assumes you are a responsible adult. Spoiler: you’re not.
To remove things more safely:
rm -i important_file.txt
The -i makes it interactive: Linux now politely asks before nuking your data.
To delete a folder:
rm -rf old_logs/
This one means:
-r: dive deep
-f: don’t ask questions
Together: 💀 “Say goodbye forever.”
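Since rm -rf never asks, scripts often put the questions in front of it instead. Here is one possible guard, sketched under assumptions: safe_rm_dir is a hypothetical helper (not a standard command), and the list of refused targets is illustrative rather than exhaustive.

```shell
# Hypothetical helper: refuse obviously dangerous targets before calling rm -rf.
safe_rm_dir() {
    target=${1:-}
    case "$target" in
        ""|"/"|"."|"..")
            echo "refusing to delete '$target'" >&2
            return 1
            ;;
    esac
    if [ ! -d "$target" ]; then
        echo "not a directory: $target" >&2
        return 1
    fi
    # -- stops option parsing, so a name starting with - is not read as a flag.
    rm -rf -- "$target"
}

sandbox=$(mktemp -d)
mkdir -p "$sandbox/old_logs/2023"
touch "$sandbox/old_logs/2023/app.log"

safe_rm_dir "$sandbox/old_logs"                 # removes the folder and its contents
safe_rm_dir "" || echo "empty argument was rejected"
```

The empty-string check matters most in practice: an unset variable in something like rm -rf "$DIR/" is the classic way a script ends up pointed at the root.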
📜 5. Reading Files from the Command Line¶
Sometimes you just need to peek inside a file — not open a whole editor.
cat data.txt
head -n 10 data.txt
tail -f logs.txt
cat prints the whole file, head -n 10 shows the first 10 lines, and tail -f is especially cool: it lets you watch logs live, like:
“Oh look, my server crashed again… and again… and—yep, there it goes.”
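To try these without waiting for a real server to crash, the sketch below generates a hypothetical 100-line log in a temporary directory and peeks at both ends of it. The tail -f line is left commented out because it runs until you stop it.

```shell
sandbox=$(mktemp -d)
log="$sandbox/logs.txt"

# Generate a made-up 100-line log so there is something to read.
i=1
while [ "$i" -le 100 ]; do
    echo "line $i" >> "$log"
    i=$((i + 1))
done

head -n 3 "$log"              # first three lines
tail -n 2 "$log"              # last two lines
first=$(head -n 1 "$log")     # capture a single line in a variable
last=$(tail -n 1 "$log")
echo "first: $first, last: $last"

# tail -f "$log"              # keeps printing new lines as they arrive; stop with Ctrl+C
```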
🔁 6. Automating File Operations¶
Once you master file commands, you can automate your chaos with Bash scripts.
Example: A script to back up your models every morning.
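#!/bin/bash
# Stop on the first failed command, unset variable, or failed pipeline stage.
set -euo pipefail
DATE=$(date +%Y-%m-%d)          # today's date, year-month-day
SRC_DIR="/home/user/models"
DEST_DIR="/backups/$DATE"
mkdir -p "$DEST_DIR"            # -p: create parents, no error if it already exists
cp -r "$SRC_DIR" "$DEST_DIR"    # lands in /backups/<date>/models
echo "Backup completed on $DATE 🎉"
Run it:
bash backup_models.sh
And voilà: once it is scheduled (for example with cron), your 3 AM “panic about losing files” crisis is automated.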
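The same idea can be wrapped in a function so the paths are arguments instead of hard-coded values, which also makes it easy to try in a sandbox. This is only a sketch: backup_dir and its two parameters are hypothetical names, not part of the script above.

```shell
# Hypothetical variant: back up src_dir into dest_root/<today's date>/.
backup_dir() {
    src_dir=$1
    dest_root=$2
    stamp=$(date +%Y-%m-%d)
    dest_dir="$dest_root/$stamp"

    # Fail loudly instead of creating an empty backup folder.
    if [ ! -d "$src_dir" ]; then
        echo "source directory not found: $src_dir" >&2
        return 1
    fi
    mkdir -p "$dest_dir" || return 1
    cp -r "$src_dir" "$dest_dir" || return 1
    echo "$dest_dir"              # report where the backup went
}

sandbox=$(mktemp -d)
mkdir -p "$sandbox/models"
echo "weights" > "$sandbox/models/model.pkl"

dest=$(backup_dir "$sandbox/models" "$sandbox/backups")
echo "Backup written to $dest"
```

Checking that the source exists first is a design choice: without it, a typo in the path produces a dated but empty backup folder and a false sense of security.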
🕵️ 7. File Searching Like a Pro¶
Find that one rogue .csv that’s ruining your life:
find /home/user -name "*.csv"
Or look inside files:
grep "sales" data/*.csv
Combine with pipes:
grep "ERROR" /var/log/syslog | tail -n 5
Congratulations, you’re now 50% sysadmin, 50% detective.
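Here is a small sandbox version of the same three moves, with invented file names and log lines so the results are predictable. It is a sketch of the commands above, not a recipe for any particular log format.

```shell
sandbox=$(mktemp -d)
mkdir -p "$sandbox/data/old"
printf 'region,sales\nnorth,10\n' > "$sandbox/data/q1.csv"
printf 'region,returns\nsouth,2\n' > "$sandbox/data/old/q0.csv"
printf 'INFO start\nERROR disk full\nINFO retry\nERROR timeout\n' > "$sandbox/app.log"

# Every .csv below data/, at any depth. The quotes keep the shell from
# expanding *.csv before find gets to see it.
find "$sandbox/data" -name "*.csv"
csv_count=$(find "$sandbox/data" -name "*.csv" | wc -l | tr -d ' ')

# -l prints only the names of files that contain a match.
# The glob data/*.csv is not recursive, so data/old/ is not searched here.
matching=$(grep -l "sales" "$sandbox"/data/*.csv)

# Pipe: keep only ERROR lines, then take the most recent one.
last_error=$(grep "ERROR" "$sandbox/app.log" | tail -n 1)
echo "$csv_count CSV files, last error: $last_error"
```

Note the difference in reach: find walks subfolders by default, while a shell glob like data/*.csv only looks one level deep.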
🧮 8. Permissions: The Linux Hunger Games¶
Every file in Linux has permissions:
r = read
w = write
x = execute
Check them with:
ls -l
Output example:
-rwxr-xr--
Breakdown (the leading - marks a regular file; a directory would show d):
| Symbol | Meaning |
|---|---|
| rwx | Owner can do anything (read, write, execute) |
| r-x | Group can read and execute |
| r-- | Others can just look sadly |
Change permissions:
chmod +x train.sh
Now your script is executable, a.k.a. alive! ⚡
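chmod also accepts numbers, where each digit is the sum of r = 4, w = 2, x = 1 for owner, group, and others. As a sketch, the following reproduces the -rwxr-xr-- pattern from the breakdown with mode 754, using a stand-in train.sh created in a temporary directory.

```shell
sandbox=$(mktemp -d)
script="$sandbox/train.sh"
printf '#!/bin/sh\necho "training..."\n' > "$script"

chmod 644 "$script"                          # 6,4,4 = rw- r-- r--: not executable yet
before=$(ls -l "$script" | cut -c1-10)       # keep only the permission column

chmod 754 "$script"                          # 7,5,4 = rwx r-x r--
after=$(ls -l "$script" | cut -c1-10)

echo "before: $before"
echo "after:  $after"
[ -x "$script" ] && echo "train.sh is now executable"
```

Compared with chmod +x, the numeric form sets all nine bits at once, so the result does not depend on what the permissions were before.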
🧠 9. Business Use Case: Automated File Pipelines¶
Imagine you’re running an ML pipeline that:
Receives daily sales data via SFTP
Cleans and merges CSVs
Triggers model retraining
Archives old logs
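A simple Bash script + cron job can handle the core of that flow. This version assumes the SFTP upload has already landed in raw/:
#!/bin/bash
# Abort on the first failure so a broken cleaning step never triggers retraining.
set -euo pipefail
cd /home/user/sales_pipeline
python3 clean_data.py           # clean and merge the CSVs
python3 train_model.py          # retrain on the cleaned data
mv raw/*.csv archive/           # archive the processed inputs
echo "Pipeline completed at $(date)" >> pipeline.log
You’ve basically just replaced a junior data engineer.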
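The cron half of that pairing is a single line in your crontab (edit it with crontab -e). The fragment below is a sketch under assumptions: the script is saved as run_pipeline.sh inside /home/user/sales_pipeline (a hypothetical file name), it has been made executable with chmod +x, and 06:00 is an arbitrary choice of "daily".

```text
# minute hour day-of-month month day-of-week  command
0 6 * * * /home/user/sales_pipeline/run_pipeline.sh >> /home/user/sales_pipeline/cron.log 2>&1
```

Cron runs jobs with a minimal environment, so absolute paths are the safer habit, and redirecting both stdout and stderr to a log file means a failure at 6 AM leaves evidence behind.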
🎬 Final Hook¶
The Linux file system isn’t scary — it’s just… one command away from total destruction.
But with great power (sudo) comes great responsibility.
Master file ops, and you’ll:
Automate boring stuff
Keep your ML projects organized
And never again lose sleep over “where did I save that model?”
Just remember:
Friends don’t let friends
rm -rf /.
Exercises¶
Exercise 1¶
Write extract_extension(filename) that returns the file extension (without the dot) or an empty string if none.
Exercise 2¶
Implement join_paths(parts) which joins a list of path parts with ‘/’ and normalizes duplicate slashes.
Exercise 3¶
Given a list of filenames, write count_files_with_ext(files, ext) that counts how many end with the given extension.
Exercise 4¶
Write normalize_path(path) that collapses repeated slashes into single slashes.
Exercise 5¶
Create human_readable_size(n_bytes) that returns a string formatted in KB, MB, or GB (KB precision).