ADR-045 Phase 3 — MCP-path state inventory (stateless-deployment audit)

Date: 2026-07-03 · Scope: every piece of state on the MCP tool path (consult_council, council_health_check, verify, audit) that outlives a single request, audited for multi-instance (load-balanced) deployment. Validated by the two-instance smoke suite (tests/test_stateless_smoke.py).

Deployment model

The MCP server uses the stdio transport: one process per client session, so MCP tool calls never span instances today. Multi-instance concerns therefore apply to (a) the HTTP server behind a load balancer and (b) durable stores shared by concurrent processes. Tasks-backed deliberation (ADR-045 P1) is the piece that will make MCP results cross-instance, which is why TaskStore is the load-bearing item below.

Inventory

| # | State | Location | Class | Multi-instance impact | Verdict |
|---|-------|----------|-------|-----------------------|---------|
| 1 | TaskStore (.council/tasks/) | mcp_tasks.py | durable | Cross-instance by design: atomic writes (tmp+os.replace), created_at TTL, terminal states first-writer-wins | OK: smoke-tested (test_stateless_smoke.py: lifecycle split across instances, two-process concurrency, torn-read guard) |
| 2 | Layer-event accumulator | layer_contracts.py _layer_events | in-memory | Was unbounded: a memory leak in any long-lived process (only tests cleared it) | FIXED: bounded ring buffer (MAX_LAYER_EVENTS=1000). Durable observability goes through metrics adapters (ADR-030); this buffer is only a debugging window. |
| 3 | Circuit-breaker registry | gateway/circuit_breaker_registry.py _circuit_breakers | in-memory | Instances learn failures independently; one instance's open breaker isn't visible to others | Per-instance by design. Each instance protecting itself with its own failure window is standard practice; divergence affects optimality (a few extra probes), never response correctness. Distributed breaker state is deliberate non-scope. |
| 4 | Audition tracker cache | audition/tracker.py | in-memory over JSONL | Quarantine/promotion decisions read through a per-process cache; instances converge on the JSONL store with a staleness window | Eventually consistent by design. Audition state gates voting weight ramp (advisory), not answer correctness. |
| 5 | Metrics adapter subscriptions | observability/metrics_adapter.py | in-memory | Each instance emits its own metrics | Per-instance by design: that is how StatsD/Prometheus exporters work; aggregation happens in the metrics backend. |
| 6 | Telemetry singleton | telemetry.py _telemetry | in-memory | Set at process startup | OK: startup-scoped configuration, identical across instances started from the same config. |
| 7 | Model registry / metadata cache | metadata/registry.py | in-memory TTL over static/dynamic providers | Staleness window between instances for model discovery | Eventually consistent by design: TTL-bounded; affects candidate selection freshness, not correctness. Offline mode (ADR-026) is the degenerate always-static case. |
| 8 | Module-level config constants (COUNCIL_MODELS, CONFIDENCE_CONFIGS, TIER_MODEL_POOLS, unified config) | mcp_server.py, unified_config.py | import-time immutable | Identical across instances given identical config/env | OK: frozen at import; per ADR-024 config priority is deterministic. |
| 9 | Request API key | ContextVar in the HTTP path | request-scoped | None (async-local) | OK: the reference pattern for request scope. |
| 10 | Durable JSONL stores (bias persistence, performance tracker, transcripts under .council/logs) | bias_persistence.py, performance/ | durable append-only | Two instances appending is safe (line-append semantics); aggregation reads whole files | OK: append-only; cross-session analytics tolerate interleaving. |
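
The atomic-write pattern named in row 1 (tmp+os.replace) can be sketched as follows. This is a minimal illustration, not the code in mcp_tasks.py: the function name `write_json_atomic` and the one-JSON-file-per-task layout are assumptions made for the example, and the TTL and first-writer-wins rules are left out.

```python
import json
import os
import tempfile
from pathlib import Path


def write_json_atomic(path: Path, record: dict) -> None:
    """Hypothetical helper: readers see the old file or the new one, never a mix."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives in the destination directory because os.replace
    # is only atomic when both paths are on the same filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(record, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before the rename
        os.replace(tmp_name, path)  # atomic swap into place
    except BaseException:
        # Remove the temp file if anything failed before the swap.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

Because a concurrent reader opens either the previous complete file or the new complete file, this is the property a torn-read guard would check for.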

Conclusions

  • No correctness violations remain. The one true defect found (unbounded _layer_events) is fixed with a bounded ring buffer.
  • Per-instance circuit breakers, metrics, and caches are deliberate: they affect efficiency, not answers, and distributing them would add a shared dependency (the thing statelessness avoids) for no correctness gain.
  • The cross-instance contract that matters — durable task results — is the ADR-045 P1 TaskStore, and is pinned by the two-instance smoke suite.
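
The bounded ring buffer behind the _layer_events fix can be sketched with the standard library's `collections.deque`. Only the names _layer_events and MAX_LAYER_EVENTS come from the inventory; the two helper functions are hypothetical, and the actual change in layer_contracts.py may be structured differently.

```python
from collections import deque

MAX_LAYER_EVENTS = 1000  # cap named in the inventory

# With maxlen set, appending to a full deque silently drops the oldest
# entry, so memory use stays bounded in a long-lived process.
_layer_events: deque = deque(maxlen=MAX_LAYER_EVENTS)


def record_layer_event(event: dict) -> None:
    """Hypothetical helper: append one event, evicting the oldest if full."""
    _layer_events.append(event)


def recent_layer_events() -> list:
    """Hypothetical helper: snapshot of the retained events, oldest first."""
    return list(_layer_events)
```

After 1500 calls to `record_layer_event`, only the most recent 1000 events remain, which matches the "debugging window" role described in the inventory.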

Re-check triggers

  • If the MCP server gains a streamable-HTTP transport (post SDK v2, #425), re-run this audit for MCP-session-scoped state — stdio's process-per-session assumption no longer holds there.
  • If audition/quarantine ever becomes a hard gate (voting EXCLUDED enforced at selection), revisit #4's staleness window.
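
To make the staleness window in #4 concrete, here is a small sketch of a per-process cache over a JSONL file. The audit does not say how audition/tracker.py refreshes its cache, so the TTL-based reload, the class name `TtlJsonlCache`, and the injectable clock are all assumptions for illustration.

```python
import json
import time
from pathlib import Path
from typing import Callable, Optional


class TtlJsonlCache:
    """Hypothetical per-process cache: re-reads the JSONL file at most once per TTL."""

    def __init__(self, path: Path, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._path = path
        self._ttl = ttl_seconds
        self._clock = clock
        self._loaded_at: Optional[float] = None
        self._records: list = []

    def records(self) -> list:
        now = self._clock()
        # Reload on first use or once the cached copy is older than the TTL.
        if self._loaded_at is None or now - self._loaded_at >= self._ttl:
            self._records = self._load()
            self._loaded_at = now
        return self._records

    def _load(self) -> list:
        if not self._path.exists():
            return []
        with self._path.open(encoding="utf-8") as fh:
            return [json.loads(line) for line in fh if line.strip()]
```

Under this assumption, a record appended by another instance stays invisible to this process until its TTL expires. That delay is tolerable while audition state is advisory and would need revisiting if it became a hard gate.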