File System Operations and Scripting

“Because every hero’s journey starts with: cd ~.”


🧭 1. The Linux Jungle

Welcome to the file system — a mysterious land filled with folders named after punctuation.

Here’s the lay of the land:

| Directory | Purpose | Fun Fact |
|---|---|---|
| /home/ | Where your personal mess lives | Like your desktop, but Linuxier |
| /etc/ | System config files | Stands for “et cetera”… because no one knows what’s really in there |
| /var/ | Logs, temp data, chaos | “var” stands for “variable,” as in it varies how badly this breaks |
| /tmp/ | Temporary files | Like a hotel for files: everyone checks in, nobody survives reboot |
| /bin/ | System binaries | Where ls, cp, and your fate reside |

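If you ever want to feel powerful and terrified at the same time, contemplate (but never run):

sudo rm -rf /

Modern GNU rm refuses this unless you add --no-preserve-root, but don’t test that theory on a machine you love. Without the safeguard, congratulations: you’ve achieved enlightenment through total data loss. ☠️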


📂 2. Basic File Operations

The Linux file system doesn’t care who you are — if you don’t have permissions, you’re just another mortal.

Look around:

ls -lh

The -l flag gives the long listing (permissions, owner, size, date), and -h prints sizes in human-friendly units like K, M, and G. (Because computers don’t care if a file is 5 GB or “Oops, too big.”)

Move around:

cd /home/user/Documents

cd: the adult version of “Are we there yet?”

Make new stuff:

mkdir reports
touch data.csv
  • mkdir: makes a folder

  • touch: creates an empty file or updates its timestamp (it’s basically a polite “poke”)
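A couple of variations on these commands come up constantly. The sketch below runs in a throwaway directory from mktemp, so the folder and file names (reports/2024/q1, data.csv) are illustrative only, not paths from any real project.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)

# -p creates missing parent folders and does not complain if they already exist.
mkdir -p "$workdir/reports/2024/q1"
mkdir -p "$workdir/reports/2024/q1"   # running it again is harmless

# touch creates an empty file; a second touch only refreshes its timestamp.
touch "$workdir/reports/data.csv"
touch "$workdir/reports/data.csv"

# -l long format, -h human-readable sizes, -a include hidden (dot) files.
ls -lha "$workdir/reports"
```

Without -p, mkdir fails when a parent folder is missing or when the target already exists, which is why scripts tend to reach for mkdir -p by default.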


🗃️ 3. Copy, Move, Rename — the Linux Shuffle

Copy a file:

cp model.pkl backup_model.pkl

Move (or rename, if the destination is a new filename instead of an existing folder):

mv backup_model.pkl /opt/models/

Copy a whole folder (recursively):

cp -r data/ archive/

⚠️ Be careful with -r. It’s recursive — meaning it’ll dive into every subfolder like a nosy detective.
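One detail that trips people up: what cp -r does depends on whether the destination folder already exists. The sketch below demonstrates that in a scratch directory, along with the difference between mv as a rename and mv as a move. All file names here are made up for the demo.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/data"
echo "a,b" > "$workdir/data/sales.csv"

# Destination does not exist yet: archive/ becomes a copy of data/.
cp -r "$workdir/data" "$workdir/archive"

# Destination already exists: data/ is copied *inside* it (archive/data/).
cp -r "$workdir/data" "$workdir/archive"

# mv with a new name in the same folder is a rename...
mv "$workdir/archive/sales.csv" "$workdir/archive/sales_2024.csv"

# ...and mv into an existing folder is a move. -n refuses to overwrite.
mkdir -p "$workdir/models"
touch "$workdir/backup_model.pkl"
mv -n "$workdir/backup_model.pkl" "$workdir/models/"
```

Running the same cp -r twice is an easy way to end up with an accidental archive/data/ nesting, so it is worth checking the destination with ls first.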


🧨 4. Deletion: The Point of No Return

When you run:

rm important_file.txt

Linux doesn’t ask “Are you sure?” — it assumes you are a responsible adult. Spoiler: you’re not.

To remove things a little more safely:

rm -i important_file.txt

The -i makes it interactive — Linux now politely asks before nuking your data.

To delete a folder:

rm -rf old_logs/

This one means:

  • -r: dive deep

  • -f: don’t ask questions

  • Together: 💀 “Say goodbye forever.”
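In scripts, the classic disaster is rm -rf on a variable that turned out to be empty. One way to guard against that is a small wrapper. Note that safe_rm_dir below is a hypothetical helper written for this example, not a standard command, and it is only a minimal sketch of the idea.

```shell
# safe_rm_dir is a hypothetical helper, not a standard command.
# It refuses to run when the argument is empty, "/", or not a directory.
safe_rm_dir() {
    target=${1:-}
    case "$target" in
        ""|"/") echo "refusing to delete '$target'" >&2; return 1 ;;
    esac
    [ -d "$target" ] || { echo "not a directory: $target" >&2; return 1; }
    # "--" stops option parsing, so a name starting with "-" is not read as a flag.
    rm -rf -- "$target"
}

workdir=$(mktemp -d)
mkdir -p "$workdir/old_logs"
safe_rm_dir "$workdir/old_logs"                    # deletes the folder
safe_rm_dir "" || echo "empty path was rejected"   # guard kicks in
```

This does not make deletion reversible; it only filters out the most obviously catastrophic arguments before rm ever sees them.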


📜 5. Reading Files from the Command Line

Sometimes you just need to peek inside a file — not open a whole editor.

cat data.txt
head -n 10 data.txt
tail -f logs.txt

tail -f is especially cool — it lets you watch logs live, like:

“Oh look, my server crashed again… and again… and—yep, there it goes.”
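To try these without a real log handy, the sketch below generates a 100-line stand-in file with seq and peeks at both ends of it. The file name and contents are placeholders for the demo.

```shell
workdir=$(mktemp -d)
# seq prints the numbers 1..100, one per line, to stand in for a real log.
seq 1 100 > "$workdir/data.txt"

head -n 3 "$workdir/data.txt"     # first 3 lines
tail -n 3 "$workdir/data.txt"     # last 3 lines
wc -l < "$workdir/data.txt"       # number of lines in the file

# Interactive commands, left commented out so the demo does not block:
# less "$workdir/data.txt"        # page through a large file (q to quit)
# tail -f "$workdir/data.txt"     # follow new lines as they arrive (Ctrl+C to stop)
```

For anything bigger than a screenful, less is usually kinder than cat, which dumps the whole file into your terminal at once.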


🔁 6. Automating File Operations

Once you master file commands, you can automate your chaos with Bash scripts.

Example: A script to back up your models (pair it with a scheduler such as cron to run it every morning).

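#!/bin/bash
# Stop on the first error so a failed copy never reports success.
set -euo pipefail

DATE=$(date +%Y-%m-%d)   # ISO date, YYYY-MM-DD
SRC_DIR="/home/user/models"
DEST_DIR="/backups/$DATE"

# -p creates parent folders and is a no-op if the folder already exists.
mkdir -p "$DEST_DIR"
cp -r "$SRC_DIR" "$DEST_DIR"

echo "Backup completed on $DATE 🎉"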

Run it:

bash backup_models.sh

And voilà — your 3 AM “panic about losing files” crisis just got automated.
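Running the script by hand is not quite automation yet; scheduling it is what does the trick. The sketch below builds a crontab entry for it. The script location (/home/user/backup_models.sh), the log path, and the 06:00 schedule are all assumptions made for illustration.

```shell
# Assumed location of the script; adjust to wherever backup_models.sh lives.
script="/home/user/backup_models.sh"

# Five schedule fields: minute hour day-of-month month day-of-week.
# "0 6 * * *" means 06:00 every day. Output is appended to a log file.
cron_line="0 6 * * * /bin/bash $script >> /home/user/backup.log 2>&1"
echo "$cron_line"

# To install it for the current user (not run here, because it edits your crontab):
# (crontab -l 2>/dev/null; echo "$cron_line") | crontab -
```

Redirecting both stdout and stderr to a log file matters because cron jobs run without a terminal, so otherwise a failure can go unnoticed.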


🕵️ 7. File Searching Like a Pro

Find that one rogue .csv that’s ruining your life:

find /home/user -name "*.csv"

Or look inside files:

grep "sales" data/*.csv

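Combine with pipes (Debian and Ubuntu keep the system log at /var/log/syslog; other distros may use /var/log/messages or journalctl instead):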

grep "ERROR" /var/log/syslog | tail -n 5

Congratulations, you’re now 50% sysadmin, 50% detective.
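A few extra flags make both tools much sharper. The sketch below builds a tiny fake data folder in a scratch directory and searches it; the file names and contents are invented for the demo.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/data/old"
printf 'region,sales\nnorth,10\n' > "$workdir/data/q1.csv"
printf 'region,returns\nsouth,2\n' > "$workdir/data/old/q0.csv"
touch "$workdir/data/notes.txt"

# -type f limits matches to regular files; -iname ignores case.
find "$workdir" -type f -iname "*.csv"

# -r searches folders recursively, -n prints line numbers,
# -l prints only the names of files that contain a match.
grep -rn "sales" "$workdir/data"
grep -rl "sales" "$workdir/data"

# -c counts matching lines instead of printing them.
grep -c "north" "$workdir/data/q1.csv"
```

Quoting the pattern in find ("*.csv") keeps the shell from expanding it before find gets a chance to see it.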


🧮 8. Permissions: The Linux Hunger Games

Every file in Linux has permissions:

  • r = read

  • w = write

  • x = execute

Check them with:

ls -l

Output example (just the permissions column):

-rwxr-xr--

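Breakdown (the leading - means a regular file; a d in that spot would mean a directory):

| Symbol | Meaning |
|---|---|
| rwx | Owner can read, write, and execute |
| r-x | Group can read and execute |
| r-- | Others can just look sadly |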

Change permissions:

chmod +x train.sh

Now your script is executable, a.k.a. alive!
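chmod also accepts numeric modes, where each digit is the sum of read (4), write (2), and execute (1). The sketch below reproduces the -rwxr-xr-- example on a scratch copy of train.sh and then adjusts it with the symbolic form.

```shell
workdir=$(mktemp -d)
touch "$workdir/train.sh"

# Each digit is a sum: read=4, write=2, execute=1.
# 7 = rwx (owner), 5 = r-x (group), 4 = r-- (others).
chmod 754 "$workdir/train.sh"
ls -l "$workdir/train.sh"

# Symbolic form: u=user (owner), g=group, o=others; + adds, - removes.
chmod o-r "$workdir/train.sh"       # others lose read access: -rwxr-x---
chmod u+x,g-x "$workdir/train.sh"   # owner keeps x, group loses it: -rwxr-----
ls -l "$workdir/train.sh"
```

The numeric form sets all nine bits at once, while the symbolic form changes only the bits you name, which makes it the safer choice for small tweaks.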


🧠 9. Business Use Case: Automated File Pipelines

Imagine you’re running an ML pipeline that:

  • Receives daily sales data via SFTP

  • Cleans and merges CSVs

  • Triggers model retraining

  • Archives old logs

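A simple Bash script + cron job can handle the core of that flow (this one assumes the SFTP drop has already landed in raw/):

#!/bin/bash
# Abort on the first failure so a broken cleaning step never triggers training.
set -euo pipefail

cd /home/user/sales_pipeline
python3 clean_data.py    # clean and merge the CSVs
python3 train_model.py   # retrain on the cleaned data
mkdir -p archive
mv raw/*.csv archive/    # archive the processed raw files
echo "Pipeline completed at $(date)" >> pipeline.log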

You’ve basically just replaced a junior data engineer.
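The "archives old logs" step from the list above could be sketched with find and its -mtime test. Here archive_old_logs is a hypothetical helper, and the 30-day cutoff and folder layout are assumptions chosen for the demo.

```shell
# archive_old_logs is a hypothetical helper: it moves *.log files older than
# a given number of days from one folder into another.
archive_old_logs() {
    src=$1; dest=$2; days=$3
    mkdir -p "$dest"
    # -mtime +N matches files last modified more than N days ago.
    find "$src" -maxdepth 1 -type f -name "*.log" -mtime +"$days" \
        -exec mv {} "$dest"/ \;
}

workdir=$(mktemp -d)
mkdir -p "$workdir/logs"
touch "$workdir/logs/today.log"
touch -t 202001010000 "$workdir/logs/ancient.log"   # pretend it is from 2020

archive_old_logs "$workdir/logs" "$workdir/archive" 30
ls "$workdir/archive"
```

Only ancient.log gets moved; today.log stays put because it was modified within the last 30 days.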


🎬 Final Hook

The Linux file system isn’t scary — it’s just… one command away from total destruction.

But with great power (sudo) comes great responsibility. Master file ops, and you’ll:

  • Automate boring stuff

  • Keep your ML projects organized

  • And never again lose sleep over “where did I save that model?”

Just remember:

Friends don’t let friends rm -rf /.



Exercises

Exercise 1

Write extract_extension(filename) that returns the file extension (without the dot) or an empty string if none.


Exercise 2

Implement join_paths(parts) which joins a list of path parts with ‘/’ and normalizes duplicate slashes.


Exercise 3

Given a list of filenames, write count_files_with_ext(files, ext) that counts how many end with the given extension.


Exercise 4

Write normalize_path(path) that collapses repeated slashes into single slashes.


Exercise 5

Create human_readable_size(n_bytes) that returns KB/MB/GB formatted string (KB precision).