System Design for AI - Programming for Machine Learning and Business

System Design for AI and Business Applications

Writing a model is only one part of building a useful product. Real systems need reliable APIs, storage, queues, monitoring, and enough resilience to survive traffic spikes, failures, and changing business requirements.

Why This Chapter Matters

A good model can still fail in production if the surrounding system is weak. In business settings, you usually need to answer questions such as:

Where will requests enter the system?
How will the application store user data, events, and model outputs?
What happens if traffic suddenly grows 10x?
How do we update models without breaking the customer experience?
How do we keep costs under control while staying reliable?

System design is the discipline that connects code, infrastructure, data, and business goals into one working solution.
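
The 10x traffic question can be made concrete with a back-of-envelope estimate. The sketch below uses Little's law (requests in flight = arrival rate * average latency) to estimate how many service instances a given load needs. The function name `instances_needed`, the 30% headroom, and all traffic numbers are illustrative assumptions, not measurements from a real system.

```python
import math


def instances_needed(requests_per_second: float,
                     avg_latency_seconds: float,
                     concurrency_per_instance: int,
                     headroom: float = 0.3) -> int:
    """Estimate instance count from load, using Little's law."""
    # Average number of requests being processed at the same moment.
    in_flight = requests_per_second * avg_latency_seconds
    # Keep some capacity in reserve instead of running instances at 100%.
    usable = concurrency_per_instance * (1.0 - headroom)
    return max(1, math.ceil(in_flight / usable))


# Hypothetical numbers: 50 requests/s today, 200 ms per request,
# 8 concurrent requests per instance.
baseline = instances_needed(50, 0.2, 8)
spike = instances_needed(500, 0.2, 8)  # the same service at 10x traffic
print(baseline, spike)  # prints: 2 18
```

An estimate like this is only a starting point. It assumes latency stays flat under load, which is rarely true once a database or model service becomes the bottleneck.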

The Big Idea

When you design a software system, you are making structured trade-offs between:

Goal	What it means in practice
Reliability	The system keeps working even when one part fails
Scalability	The system handles more users, data, or requests over time
Maintainability	Engineers can understand, debug, and change the system
Cost efficiency	You do not overpay for unused infrastructure
Security	Data access and operations are controlled and auditable
Latency	Users get responses quickly enough for the use case
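
One way to see the trade-off between reliability and cost is to put numbers on it. The sketch below assumes components fail independently, which is a simplification. Under that assumption, a request that must pass through several components succeeds only if all of them are up, while a replicated component fails only if every replica is down. The availability figures are made up for illustration.

```python
from math import prod


def serial_availability(availabilities):
    """All components in the chain must be up, so multiply them."""
    return prod(availabilities)


def redundant_availability(availability: float, replicas: int) -> float:
    """At least one of N independent replicas must be up."""
    return 1.0 - (1.0 - availability) ** replicas


# Hypothetical chain: API gateway, application service, model service.
single = serial_availability([0.999, 0.999, 0.995])

# Same chain, but the model service runs as two replicas (double the cost).
model_pair = redundant_availability(0.995, 2)
with_replica = serial_availability([0.999, 0.999, model_pair])

print(f"single model service: {single:.4%}")
print(f"two model replicas:   {with_replica:.4%}")
```

The second replica raises the availability of the whole chain, but it also doubles the cost of the model service. Whether that is worth it depends on what an outage costs the business.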

A Simple AI Product Architecture

A simple AI product can be assembled from the blocks listed below. This architecture is useful because each block has a clear job:

The client app handles user interaction.
The API gateway manages incoming traffic and routing.
The application service enforces business rules.
The database stores operational state.
The queue absorbs spikes and decouples slow work.
The model service performs prediction or ranking.
Monitoring tells you when reality differs from your expectations.
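
The division of labor above can be sketched in a few lines using only the standard library. This is a toy, single-process illustration, not a production design: `model_service` is a stand-in for a real model, the dictionary stands in for a database, and `queue.Queue` stands in for a message broker. All names are hypothetical.

```python
import queue
from dataclasses import dataclass


@dataclass
class Request:
    user_id: str
    text: str


database = {}         # stand-in for the operational database
jobs = queue.Queue()  # stand-in for a message queue


def model_service(text: str) -> float:
    """Stand-in for a real model: a toy score based on text length."""
    return min(len(text) / 100.0, 1.0)


def handle_request(req: Request) -> dict:
    """Application service: answer quickly, defer the slow work."""
    score = model_service(req.text)   # fast, synchronous prediction
    database[req.user_id] = score     # store operational state
    jobs.put((req.user_id, score))    # slow follow-up work goes to the queue
    return {"user_id": req.user_id, "score": score}


def run_worker() -> list:
    """Drain the queue. A real worker would run as a separate process."""
    done = []
    while not jobs.empty():
        user_id, score = jobs.get()
        done.append(f"report for {user_id} (score={score:.2f})")
        jobs.task_done()
    return done


response = handle_request(Request("u1", "Is this order likely fraudulent?"))
reports = run_worker()
print(response)
print(reports)
```

The caller gets its response before the worker runs. That is the point of the queue: slow work no longer sits on the request path.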

Core Building Blocks

Layer	Typical responsibility	Example business use
Presentation	What the user sees and clicks	Dashboard, mobile app, internal analytics portal
Application	Request handling and workflows	Place order, approve loan, create support ticket
Data	Persistence and retrieval	Customer records, transactions, features, logs
Intelligence	Rules or models that guide decisions	Fraud scoring, recommendation, demand forecasting
Operations	Monitoring, deployment, security	Alerts, CI/CD, IAM, audit trails
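
Keeping the layers separate pays off when the intelligence layer changes. The sketch below, built on `typing.Protocol`, shows an application-layer workflow that accepts any scorer, so a hand-written rule can later be replaced by a model without touching the workflow. The class names, the loan rule, and the threshold are hypothetical examples, not a recommended credit policy.

```python
from typing import Protocol


class Scorer(Protocol):
    """Anything in the intelligence layer that can score a loan request."""
    def score(self, amount: float, income: float) -> float: ...


class RuleScorer:
    """Version 1: a hand-written business rule."""
    def score(self, amount: float, income: float) -> float:
        return 1.0 if amount <= 0.5 * income else 0.0


class ModelScorer:
    """Version 2: stand-in for a trained model (a toy formula here)."""
    def score(self, amount: float, income: float) -> float:
        if income <= 0:
            return 0.0
        return max(0.0, min(1.0, 1.0 - amount / income))


def approve_loan(scorer: Scorer, amount: float, income: float,
                 threshold: float = 0.6) -> bool:
    """Application layer: the workflow does not care which scorer it gets."""
    return scorer.score(amount, income) >= threshold


small_rule = approve_loan(RuleScorer(), 20_000, 60_000)
small_model = approve_loan(ModelScorer(), 20_000, 60_000)
large_model = approve_loan(ModelScorer(), 50_000, 60_000)
print(small_rule, small_model, large_model)  # prints: True True False
```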

Architecture Thinking Checklist

Before drawing boxes, ask these questions:

Who are the users and what is the critical action they perform?
What requests must be fast, and what work can be delayed?
Which data is transactional, and which data is analytical?
What components are most likely to fail or become bottlenecks?
Which metrics tell you the system is healthy?
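
For the last question, two metrics that are commonly tracked for request-driven services are error rate and tail latency. The sketch below computes both from a small batch of request records, using the nearest-rank method for the 95th percentile. The sample data and the health thresholds are made up; real thresholds should come from the product's own requirements.

```python
import math


def health_summary(latencies_ms, status_codes):
    """Compute error rate and nearest-rank p95 latency from request logs."""
    if not latencies_ms or len(latencies_ms) != len(status_codes):
        raise ValueError("need one latency and one status code per request")
    # Count server-side failures (HTTP 5xx).
    errors = sum(1 for code in status_codes if code >= 500)
    ordered = sorted(latencies_ms)
    rank = math.ceil(0.95 * len(ordered))  # nearest-rank percentile
    return {
        "error_rate": errors / len(status_codes),
        "p95_ms": ordered[rank - 1],
    }


# Hypothetical sample of ten requests.
latencies = [80, 95, 110, 120, 130, 150, 170, 200, 450, 900]
codes = [200, 200, 200, 200, 200, 200, 200, 200, 500, 200]

summary = health_summary(latencies, codes)
# Assumed targets: under 1% errors and p95 below 500 ms.
healthy = summary["error_rate"] < 0.01 and summary["p95_ms"] < 500
print(summary, healthy)
```

Note that the average latency in this sample looks acceptable, while the p95 value exposes the slow requests that users actually notice.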

Practice Prompt

Sketch a system for an internal sales assistant that answers product questions, retrieves customer history, and generates follow-up email drafts. Label which parts need fast synchronous responses and which parts can be delayed in the background.

Takeaway

System design is not just drawing infrastructure diagrams. It is the practical skill of shaping a product so that business value, technical reliability, and operational reality all fit together.

Tiny Architecture Diagram

The basic flow is: request in, logic in the middle, data and async work behind it. The short script below turns that flow into a checklist: given what a product needs, it recommends which components to include.

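```python
from dataclasses import dataclass


@dataclass
class Needs:
    """Product requirements that drive the choice of components."""
    realtime: bool     # predictions are needed while the user waits
    spikes: bool       # traffic can jump suddenly
    large_media: bool  # the product stores large files
    async_jobs: bool   # some work can run in the background


def recommend(needs: Needs) -> list[str]:
    """Return the list of components suggested for the given needs."""
    # Every system gets a request handler, storage, and monitoring.
    comps = ["API service", "DB", "Monitoring"]
    if needs.realtime:
        comps.append("Model inference service")
    if needs.spikes:
        comps += ["Load balancer", "Cache"]
    if needs.large_media:
        comps.append("Object storage")
    if needs.async_jobs:
        comps += ["Queue", "Worker"]
    return comps


req = Needs(realtime=True, spikes=False, large_media=True, async_jobs=True)
print(recommend(req))
# ['API service', 'DB', 'Monitoring', 'Model inference service', 'Object storage', 'Queue', 'Worker']
```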