
Why this matters

Many business tasks are IO-bound: calling APIs, reading files, or fetching metrics. Choosing the right concurrency model (threads or asyncio for IO-bound work, processes for CPU-bound work) can shorten pipeline run times and helps avoid common pitfalls such as race conditions.

Learning objectives

  • Recognize trade-offs between threads and asyncio for IO-bound workloads.

  • Run a tiny, deterministic demo that compares ThreadPoolExecutor and asyncio for simulated IO tasks (Pyodide-safe).

  • Understand basic thread-safety pitfalls and when to prefer each model.


# Concurrency demo with safe fallbacks for in-browser execution
import time
import random
random.seed(1)  # kept for reproducibility; the delays below are already deterministic

def io_task(i):
    # Simulated blocking IO: sleeps 0.02, 0.03, or 0.04 seconds depending on i
    delay = (i % 3) * 0.01 + 0.02
    time.sleep(delay)
    return ('thread', i, delay)

async def async_io_task(i):
    # Same delays, but awaiting yields control to the event loop instead of blocking
    delay = (i % 3) * 0.01 + 0.02
    await asyncio.sleep(delay)
    return ('async', i, delay)

try:
    # Imported inside the try so a missing module triggers the sequential fallback
    import asyncio
    from concurrent.futures import ThreadPoolExecutor
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=4) as ex:
        results_threads = list(ex.map(io_task, range(8)))
    thread_time = time.perf_counter() - start

    async def run_async():
        t0 = time.perf_counter()
        res = await asyncio.gather(*(async_io_task(i) for i in range(8)))
        return time.perf_counter() - t0, res
    try:
        # asyncio.run() raises RuntimeError if an event loop is already running
        async_time, results_async = asyncio.run(run_async())
    except Exception:
        async_time = None
        results_async = []

    print(f"ThreadPool elapsed: {thread_time:.4f}s")
    if async_time is not None:
        print(f"Asyncio elapsed: {async_time:.4f}s")
    else:
        print('Asyncio not available in this environment; skipped.')
    print('\nSample thread results:', results_threads[:3])
    print('Sample async results:', results_async[:3])
except Exception as e:
    # Fallback: run the same simulated tasks one after another
    print('Concurrency demo not available in this environment:', e)
    results_seq = []
    for i in range(8):
        time.sleep((i % 3) * 0.01 + 0.02)
        results_seq.append(('seq', i))
    print('Sequential fallback results:', results_seq)
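
To make the demo's timings easier to interpret, the sketch below estimates idealized run times for the same eight delays. It assumes zero scheduling overhead and that each task goes to the first free worker. The helper name `pool_makespan` is hypothetical and is not part of the demo. Delays are kept in whole milliseconds to avoid floating-point noise.

```python
import heapq

# Delays in whole milliseconds, mirroring (i % 3) * 0.01 + 0.02 seconds.
delays_ms = [(i % 3) * 10 + 20 for i in range(8)]

def pool_makespan(delays, workers):
    """Idealized finish time when each task goes to the first free worker."""
    free_at = [0] * workers              # time at which each worker becomes free
    heapq.heapify(free_at)
    finish = 0
    for d in delays:
        start = heapq.heappop(free_at)   # earliest available worker
        end = start + d
        finish = max(finish, end)
        heapq.heappush(free_at, end)
    return finish

sequential_ms = sum(delays_ms)            # one task after another
threads_ms = pool_makespan(delays_ms, 4)  # 4 workers, as in the demo
asyncio_ms = max(delays_ms)               # all sleeps overlap on one event loop

print(sequential_ms, threads_ms, asyncio_ms)  # prints: 230 70 40
```

Measured times should land somewhat above these lower bounds because of thread start-up and event-loop overhead, but the ordering (sequential slowest, asyncio fastest for this workload) is what the estimate suggests.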

Visual intuition: Thread pool vs event loop

```mermaid
flowchart LR
  subgraph ThreadPool
    A[Tasks queued] --> B[Worker 1]
    A --> C[Worker 2]
    A --> D[Worker 3]
  end
  subgraph EventLoop
    E[Event loop] --> F[Coroutine 1]
    E --> G[Coroutine 2]
  end
  style ThreadPool fill:#f9f,stroke:#333,stroke-width:1px
  style EventLoop fill:#cff,stroke:#333,stroke-width:1px
```

*Caption:* Thread pools execute tasks on OS threads; event loops schedule coroutines cooperatively.
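
Cooperative scheduling can be seen in a small sketch: each coroutine runs until it reaches an `await`, and only then can the event loop switch to another one. The names `worker` and `main` are illustrative, and the example assumes a regular Python process where `asyncio.run()` is usable. In a notebook with an already running loop, `await main()` would be the likely substitute.

```python
import asyncio

async def worker(name, steps, log):
    for step in range(steps):
        log.append(f"{name}{step}")
        # Yield control so the event loop can run another coroutine.
        await asyncio.sleep(0)

async def main():
    log = []
    # gather() schedules both coroutines on the same single-threaded loop.
    await asyncio.gather(worker("A", 2, log), worker("B", 2, log))
    return log

if __name__ == "__main__":
    print(asyncio.run(main()))  # prints: ['A0', 'B0', 'A1', 'B1']
```

The interleaved order shows that switches happen only at `await` points. A thread pool gives no such guarantee, because the operating system may preempt a thread at almost any moment, which is why shared state needs locks there.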

Pyodide caveat

Some concurrency features (for example, creating OS threads or running a full event loop) may behave differently or be unavailable in the browser/Pyodide environment. Demos include fallbacks so examples remain runnable; for production, test code in a real Python process.
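
One way to make such fallbacks explicit is to check the environment before choosing how to run a demo. The sketch below is a best-effort illustration, not a definitive detection method. The helper names (`running_in_pyodide`, `event_loop_is_running`, `pick_strategy`) and the strategy labels are hypothetical, and the Pyodide check assumes that Pyodide reports `sys.platform == "emscripten"` or has the `pyodide` module loaded.

```python
import asyncio
import sys

def running_in_pyodide():
    """Best-effort check (assumption: Pyodide runs on the 'emscripten' platform)."""
    return sys.platform == "emscripten" or "pyodide" in sys.modules

def event_loop_is_running():
    """True when called from inside an already running asyncio event loop."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        return False
    return True

def pick_strategy():
    """Hypothetical helper: choose how a demo should execute."""
    if running_in_pyodide():
        return "sequential"     # mirror the demo's fallback; threads may be unavailable
    if event_loop_is_running():
        return "await"          # e.g. notebooks: await the coroutine directly
    return "asyncio.run"        # plain scripts can start their own event loop

print(pick_strategy())
```

Checking up front keeps the decision in one place, whereas the demo above relies on catching exceptions after an attempt has already failed. Either approach can work for small examples.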

Exercises

  1. Implement process_store(store_id) which simulates fetching store metrics (returning a dict with sales and profit). Use it with a small ThreadPoolExecutor to demonstrate concurrency and print the collected results.

  2. Replace the ThreadPool demo with an asyncio variant using asyncio.sleep() and compare runtimes (remember the Pyodide caveat).

  3. (Stretch) Implement a thread-safe counter using threading.Lock to aggregate a metric across workers.

---

Next notebook per TOC: Programming_for_Business/notebooks/apis_webscraping.ipynb.