Implementing Custom Middleware for Request Tracing in FastAPI

Debugging asynchronous API calls in production requires precise request correlation. This guide details how to implement custom middleware for request tracing within the broader Core Architecture & Routing Patterns ecosystem, ensuring every inbound request receives a unique, traceable identifier that propagates through logs, dependencies, and external service calls.

Understanding Request Tracing in Async Environments

In synchronous Python, developers historically relied on thread-local storage for request-scoped state. In an asyncio event loop, multiple coroutines share the same OS thread, making thread-locals unsafe and highly prone to cross-request contamination. Starlette’s middleware stack processes requests sequentially through a chain of callables, requiring a concurrency-safe mechanism for state isolation. Aligning your tracing strategy with established Middleware Implementation patterns ensures predictable execution order, minimal event-loop blocking, and clean separation of concerns.
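The isolation property described above can be checked with the standard library alone. The following is a minimal sketch, not part of the middleware itself; the variable name request_id and the helper handle are hypothetical and exist only for this demonstration.

```python
import asyncio
from contextvars import ContextVar
from typing import Dict

# Hypothetical variable, used only for this demonstration
request_id: ContextVar[str] = ContextVar("request_id", default="")


async def handle(name: str, results: Dict[str, str]) -> None:
    # Each task sets its own value, then yields to the event loop
    request_id.set(name)
    await asyncio.sleep(0)  # give the other tasks a chance to set theirs
    # The value read back is still the one this task set
    results[name] = request_id.get()


async def main() -> Dict[str, str]:
    results: Dict[str, str] = {}
    # gather() wraps each coroutine in a Task that owns a copy of the context
    await asyncio.gather(*(handle(n, results) for n in ("a", "b", "c")))
    return results


results = asyncio.run(main())
```

All three coroutines run on one OS thread, yet each reads back its own value, which is the behavior a thread-local cannot provide.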

Building the Custom Middleware Class

For production FastAPI applications, extending starlette.middleware.base.BaseHTTPMiddleware provides the simplest interface for request/response interception. Raw ASGI middleware has lower overhead, but BaseHTTPMiddleware hands you Request and Response objects without manual scope management. One documented limitation applies: ContextVar changes made inside an endpoint do not propagate back up to a BaseHTTPMiddleware, so set the trace ID in the middleware itself, before awaiting call_next.

The following implementation extracts an existing trace ID from upstream proxies (e.g., API gateways, load balancers) or generates a cryptographically secure UUIDv4. It binds the identifier to an async-safe context variable and injects it into the response headers.

import uuid
from contextvars import ContextVar
from typing import Callable, Awaitable
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.requests import Request
from starlette.responses import Response

# Request-scoped context variable for trace ID propagation
trace_id_ctx: ContextVar[str] = ContextVar("trace_id", default="")

class RequestTracingMiddleware(BaseHTTPMiddleware):
    async def dispatch(
        self, request: Request, call_next: Callable[[Request], Awaitable[Response]]
    ) -> Response:
        # Extract from upstream proxy or generate new UUIDv4
        incoming_trace_id = request.headers.get("X-Trace-Id")
        trace_id = incoming_trace_id or str(uuid.uuid4())

        # Bind trace ID to the async execution context
        token = trace_id_ctx.set(trace_id)
        try:
            response = await call_next(request)
            response.headers["X-Trace-Id"] = trace_id
            return response
        finally:
            # CRITICAL: Reset context to prevent state leakage across concurrent requests
            trace_id_ctx.reset(token)

Production Constraints:

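Keep the finally block. Without it, a trace ID can outlive its request whenever the surrounding context is reused, for example when dispatch() is driven directly in tests or by a server that does not isolate contexts per request. The reset also runs when call_next raises.
uuid.uuid4() draws 16 random bytes from os.urandom() and adds negligible overhead (<0.05ms).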
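Because the incoming header value is attacker-controllable and ends up in logs and response headers, it may be worth validating it before trusting it. The sketch below is one possible approach; the helper name normalize_trace_id and the accepted character set and length are assumptions, not part of the original middleware.

```python
import re
import uuid
from typing import Optional

# Assumed policy: 8 to 64 characters of letters, digits, '-' or '_'
_TRACE_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]{8,64}")


def normalize_trace_id(incoming: Optional[str]) -> str:
    """Return the upstream ID if it looks safe, otherwise a fresh UUIDv4."""
    if incoming and _TRACE_ID_PATTERN.fullmatch(incoming):
        return incoming
    # Missing, too short, too long, or containing unexpected characters
    return str(uuid.uuid4())
```

Inside dispatch(), the call would replace the `incoming_trace_id or str(uuid.uuid4())` expression, for example `trace_id = normalize_trace_id(request.headers.get("X-Trace-Id"))`.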

Injecting Trace IDs into Request Context

Once the middleware binds the trace ID to contextvars, downstream dependencies and route handlers can access it without explicit parameter passing. This eliminates signature pollution and maintains clean separation between routing logic and observability concerns.

from fastapi import Depends, FastAPI, HTTPException
from starlette.status import HTTP_200_OK

app = FastAPI()

def get_active_trace_id() -> str:
    """FastAPI dependency to safely retrieve the active request trace ID."""
    trace_id = trace_id_ctx.get()
    if not trace_id:
        raise HTTPException(status_code=500, detail="Trace context missing")
    return trace_id

@app.get("/api/v1/health", status_code=HTTP_200_OK)
async def health_check(trace_id: str = Depends(get_active_trace_id)) -> dict:
    # trace_id is automatically injected via DI
    return {"status": "healthy", "trace_id": trace_id}
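The same context variable can feed log records, so that every line written during a request carries its trace ID. The following is a minimal sketch using only the standard logging module; the filter class name, logger name, and format string are illustrative assumptions.

```python
import io
import logging
from contextvars import ContextVar

# Mirrors the variable defined alongside the middleware
trace_id_ctx: ContextVar[str] = ContextVar("trace_id", default="")


class TraceIdFilter(logging.Filter):
    """Copy the active trace ID onto every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.trace_id = trace_id_ctx.get() or "-"
        return True  # never drop the record, only annotate it


stream = io.StringIO()  # stands in for stderr or a log file
handler = logging.StreamHandler(stream)
handler.addFilter(TraceIdFilter())
handler.setFormatter(logging.Formatter("%(levelname)s [%(trace_id)s] %(message)s"))

logger = logging.getLogger("tracing_demo")  # hypothetical logger name
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False

# Simulate what the middleware does around a request
token = trace_id_ctx.set("demo-trace-1234")
try:
    logger.info("order created")
finally:
    trace_id_ctx.reset(token)
logger.info("outside any request")
```

Attaching the filter to the handler rather than to a single logger means records from any logger routed through that handler are annotated.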

Context Propagation Rules:

asyncio.create_task() copies the current context at the moment the task is created, so a child task sees the trace ID that was active then. Changes made afterwards, in either the parent or the child, are not visible to the other.
For BackgroundTasks or external thread pools, explicitly capture trace_id_ctx.get() before dispatch and pass it as an argument.
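The two rules above can be observed directly. This sketch uses only the standard library; the function names child and read_in_thread are hypothetical. It shows the snapshot taken by create_task() and one way to carry a snapshot into an executor thread with contextvars.copy_context().

```python
import asyncio
import contextvars
from concurrent.futures import ThreadPoolExecutor
from contextvars import ContextVar
from typing import Dict

trace_id_ctx: ContextVar[str] = ContextVar("trace_id", default="")


async def child() -> str:
    await asyncio.sleep(0)
    return trace_id_ctx.get()


def read_in_thread() -> str:
    return trace_id_ctx.get()


async def main() -> Dict[str, str]:
    trace_id_ctx.set("trace-A")
    task = asyncio.create_task(child())  # the context is copied here
    trace_id_ctx.set("trace-B")  # this later change is invisible to the task

    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=1) as pool:
        # run_in_executor submits the callable as-is, so hand it an
        # explicit snapshot of the current context via ctx.run
        ctx = contextvars.copy_context()
        carried = await loop.run_in_executor(pool, ctx.run, read_in_thread)
    return {"task": await task, "carried": carried}


observed = asyncio.run(main())
```

The task reports the value that was active when it was created, while the executor call reports the value captured by copy_context() just before submission.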

Configuration & Environment Overrides

Hardcoding middleware behavior violates twelve-factor principles. Use BaseSettings (provided by the pydantic-settings package in Pydantic v2) to toggle tracing, adjust header names, and configure sampling via environment variables.

from pydantic_settings import BaseSettings
from pydantic import Field

class TracingConfig(BaseSettings):
    enabled: bool = Field(default=True, description="Toggle request tracing globally")
    header_name: str = Field(default="X-Trace-Id", description="HTTP header used for trace propagation")
    sampling_rate: float = Field(default=1.0, ge=0.0, le=1.0, description="Probability of tracing a request (0.0-1.0)")

    model_config = {"env_prefix": "TRACE_"}

config = TracingConfig()

# Conditional middleware registration
if config.enabled:
    app.add_middleware(RequestTracingMiddleware)

Environment Overrides:

TRACE_ENABLED=false disables tracing in local development or staging.
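TRACE_SAMPLING_RATE=0.1 is intended to trace 10% of traffic, reducing log volume in high-throughput production environments. The middleware shown earlier does not read sampling_rate, so it must be extended to honor this setting.
If upstream proxies (NGINX, AWS ALB) inject a differently named header, set TRACE_HEADER_NAME=X-Correlation-Id to match. This only takes effect once the middleware reads config.header_name instead of the hardcoded "X-Trace-Id".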
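One way to honor header_name and sampling_rate is sketched below. It swaps in deterministic, hash-based sampling rather than a random draw, so that every service seeing the same trace ID makes the same decision; that choice, along with the names TracingOptions, is_sampled, and read_trace_header, is an assumption for illustration. A plain dataclass stands in for TracingConfig so the sketch runs without pydantic.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class TracingOptions:
    """Plain stand-in for TracingConfig (hypothetical, for this sketch)."""
    header_name: str = "X-Trace-Id"
    sampling_rate: float = 1.0


def is_sampled(trace_id: str, sampling_rate: float) -> bool:
    """Map the trace ID to a stable bucket in [0, 1) and compare to the rate."""
    if sampling_rate >= 1.0:
        return True
    if sampling_rate <= 0.0:
        return False
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < sampling_rate


def read_trace_header(headers: Dict[str, str], options: TracingOptions) -> Optional[str]:
    """Look up the configured header case-insensitively."""
    wanted = options.header_name.lower()
    for name, value in headers.items():
        if name.lower() == wanted:
            return value
    return None
```

Inside dispatch(), the middleware could call read_trace_header(dict(request.headers), options) and skip verbose logging when is_sampled(trace_id, options.sampling_rate) is false, while still setting the response header.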

Debugging & Performance Validation

1. Header Presence Testing

Use fastapi.testclient.TestClient to assert trace ID generation and propagation without spinning up a live server.

from fastapi.testclient import TestClient

def test_trace_id_generation() -> None:
    client = TestClient(app)
    response = client.get("/api/v1/health")
    assert response.status_code == 200
    assert "X-Trace-Id" in response.headers
    assert len(response.headers["X-Trace-Id"]) == 36  # UUIDv4 format

2. Latency Profiling

Middleware overhead should remain sub-millisecond. Profile with pyinstrument to isolate blocking calls:

pyinstrument -r html -o profile.html -m uvicorn main:app --host 0.0.0.0 --port 8000

If P99 latency spikes, verify that no synchronous I/O (e.g., requests, sqlite3, logging.FileHandler) executes inside dispatch().

3. Background Task & WebSocket Caveats

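Background Tasks: do not rely on contextvars reaching BackgroundTasks. They run after the response has been sent, and whether the middleware's value is still visible at that point depends on the Starlette version and the middleware type. Capture the ID explicitly: bg_task_id = trace_id_ctx.get(); background_tasks.add_task(worker, bg_task_id)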
WebSockets: BaseHTTPMiddleware only intercepts HTTP requests. For WebSocket tracing, implement a raw ASGI middleware that inspects scope["type"] == "websocket" and attaches metadata to scope["state"].
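A raw ASGI middleware along the lines described for WebSockets might look like the following. This is a sketch under stated assumptions, not a drop-in replacement: the class name RawTracingMiddleware, the stand-in demo_app, and the fake receive/send callables are hypothetical and exist only so the example runs without a server.

```python
import asyncio
import uuid
from contextvars import ContextVar

trace_id_ctx: ContextVar[str] = ContextVar("trace_id", default="")


class RawTracingMiddleware:
    """Pure ASGI middleware that tags both HTTP and WebSocket scopes."""

    def __init__(self, app, header_name: str = "X-Trace-Id") -> None:
        self.app = app
        self.header = header_name.lower().encode("latin-1")

    async def __call__(self, scope, receive, send) -> None:
        if scope["type"] not in ("http", "websocket"):
            await self.app(scope, receive, send)  # e.g. lifespan events
            return

        # ASGI headers are (name, value) byte pairs with lowercased names
        incoming = dict(scope.get("headers", [])).get(self.header)
        trace_id = incoming.decode("latin-1") if incoming else str(uuid.uuid4())
        scope.setdefault("state", {})["trace_id"] = trace_id

        async def send_with_header(message) -> None:
            # Only the HTTP response start message is given the header here
            if message["type"] == "http.response.start":
                headers = list(message.get("headers", []))
                headers.append((self.header, trace_id.encode("latin-1")))
                message = {**message, "headers": headers}
            await send(message)

        token = trace_id_ctx.set(trace_id)
        try:
            await self.app(scope, receive, send_with_header)
        finally:
            trace_id_ctx.reset(token)


seen = []  # what the stand-in application observed
sent = []  # messages that reached the fake server


async def demo_app(scope, receive, send) -> None:
    seen.append((scope["type"], scope["state"]["trace_id"], trace_id_ctx.get()))
    if scope["type"] == "http":
        await send({"type": "http.response.start", "status": 200, "headers": []})
        await send({"type": "http.response.body", "body": b"ok"})


async def fake_receive():
    return {"type": "http.request", "body": b"", "more_body": False}


async def fake_send(message) -> None:
    sent.append(message)


async def main() -> None:
    app = RawTracingMiddleware(demo_app)
    ws_scope = {"type": "websocket", "headers": [(b"x-trace-id", b"ws-123")]}
    await app(ws_scope, fake_receive, fake_send)
    await app({"type": "http", "headers": []}, fake_receive, fake_send)


asyncio.run(main())
```

With a real application, registration would be app.add_middleware(RawTracingMiddleware), and a WebSocket handler could read the ID from trace_id_ctx or from the connection's state.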

Common Production Pitfalls

Anti-Pattern	Impact	Production Fix
Using module-level globals for trace state	Race conditions corrupt logs across concurrent requests	Always use `contextvars.ContextVar` for async-safe isolation
Omitting `trace_id_ctx.reset(token)` in `finally`	Stale IDs can survive into later work whenever the surrounding context is reused	Wrap `call_next` in `try/finally` to guarantee context cleanup
Executing blocking I/O in `dispatch()`	Event loop starvation, P99 latency spikes >500ms	Use `await` for all network/DB calls; offload sync work to `run_in_executor`

Frequently Asked Questions

Does custom request tracing middleware impact API latency?

Minimal overhead (<1ms) when using contextvars and avoiding blocking I/O. The primary cost is header parsing and UUID generation, both highly optimized in CPython.

How do I ensure trace IDs propagate to background tasks?

Capture the trace ID before dispatching the task using trace_id_ctx.get() and explicitly pass it as an argument. Inside the task, set it in a new context scope if downstream dependencies rely on contextvars.
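Setting the ID "in a new context scope" can be done with contextvars.copy_context(). The wrapper below is a minimal sketch for synchronous workers; run_with_trace_id and send_receipt are hypothetical names, and an async worker would need an equivalent coroutine wrapper.

```python
import contextvars
from contextvars import ContextVar
from typing import Any, Callable

trace_id_ctx: ContextVar[str] = ContextVar("trace_id", default="")


def run_with_trace_id(trace_id: str, func: Callable[..., Any], *args: Any) -> Any:
    """Run func in a fresh context copy where trace_id_ctx holds trace_id."""

    def bound() -> Any:
        trace_id_ctx.set(trace_id)  # confined to the copied context
        return func(*args)

    return contextvars.copy_context().run(bound)


def send_receipt(order_id: int) -> str:
    # Hypothetical worker that reads the trace ID through contextvars
    return f"order={order_id} trace={trace_id_ctx.get()}"


result = run_with_trace_id("trace-42", send_receipt, 7)
```

In a route handler this would be scheduled as background_tasks.add_task(run_with_trace_id, trace_id_ctx.get(), send_receipt, order_id), so the ID is captured while the request context is still active.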

Can this middleware coexist with CORS and authentication middleware?

Yes. Starlette wraps middleware in reverse order of registration: the middleware added last becomes the outermost layer and sees each request first. Register the tracing middleware last so that it wraps CORS and authentication middleware, which ensures that preflight responses and authentication failures also carry a trace ID.