
Topological Sorting¶
Ordering Tasks with Dependencies¶
Cycle detection answered a yes-or-no question: does a dependency loop exist? Topological sorting goes one step further. If the graph is acyclic, it builds a valid order so every prerequisite appears before the task that depends on it.
Why this matters in practice
a data workflow must load data before cleaning it and clean it before training
a software build must compile dependencies before packaging the final app
a business approval chain must finish review steps before release or deployment
from collections import deque
workflow = {
"Plan Project": ["Collect Data", "Set Up Infrastructure"],
"Collect Data": ["Clean Data"],
"Clean Data": ["Engineer Features"],
"Set Up Infrastructure": ["Train Model"],
"Engineer Features": ["Train Model"],
"Train Model": ["Review Metrics"],
"Review Metrics": ["Approve Release"],
"Approve Release": ["Deploy Model"],
"Deploy Model": [],
}
def topological_sort(graph):
indegree = {node: 0 for node in graph}
for node in graph:
for neighbor in graph[node]:
indegree[neighbor] += 1
queue = deque(sorted(node for node, degree in indegree.items() if degree == 0))
order = []
while queue:
node = queue.popleft()
order.append(node)
for neighbor in graph[node]:
indegree[neighbor] -= 1
if indegree[neighbor] == 0:
queue.append(neighbor)
if len(order) != len(graph):
raise ValueError("Topological order is impossible because the graph contains a cycle.")
return order
valid_order = topological_sort(workflow)
print("One valid task order:")
for step_number, step in enumerate(valid_order, start=1):
print(f"{step_number}. {step}")One valid task order:
1. Plan Project
2. Collect Data
3. Set Up Infrastructure
4. Clean Data
5. Engineer Features
6. Train Model
7. Review Metrics
8. Approve Release
9. Deploy Model
Core Idea and Visual Intuition¶
Topological sorting applies to a directed acyclic graph. Each arrow means “must happen before.” A valid topological order respects every arrow in the graph.
Read the graph visually
Plan Projectcomes first because nothing must happen before itTrain Modelhas to wait for bothSet Up InfrastructureandEngineer FeaturesDeploy Modelappears late because it depends on the whole chain before it
The indegree of a node is how many prerequisites still point into it. If indegree is zero, that task is ready to begin.
If every remaining node still has an incoming dependency, nothing can start next. That deadlock is exactly what a cycle creates.
Cycle detection told us when dependency loops exist. Topological sorting now assumes no such loop exists and produces an executable order.
Worked Example and Practice Lab¶
What to notice
only zero-indegree tasks can appear next in a valid order
at the beginning,
Plan Projectis available because nothing points into itafter removing a task, some downstream tasks may become newly available
Expected contrast
Valid DAG order: ['Plan Project', 'Collect Data', 'Set Up Infrastructure', 'Clean Data', 'Engineer Features', 'Train Model', 'Review Metrics', 'Approve Release', 'Deploy Model']
Cycle found: no topological order exists.Guided Practice¶
When can a node appear next in a topological order?¶
Why does a cycle prevent topological sorting?¶
Which statement is true about topological order?¶
Exercises¶
Change the workflow so
Review Metricsalso depends onClean Data. Predict whether the displayed order still works.Add a new task called
Write Reportthat must happen afterReview Metricsbut beforeApprove Release.Modify the code so it collects all zero-indegree tasks at each round and prints them as processing layers.
Hint
Think in terms of prerequisites still remaining. A task becomes available only after every incoming dependency has been removed from consideration.
Key Takeaway¶
Topological sorting turns an acyclic dependency graph into a valid execution order by repeatedly choosing zero-indegree tasks. Cycle detection told us when ordering is impossible; topological sorting shows how to proceed when the graph is safe. In the next notebook, the focus shifts again from prerequisite order to route exploration in the path-finder visualizer.