Bench Results

This page is regenerated by running llm-council bench report --publish docs/bench-results.md; do not edit it by hand (ADR-048 P3). No run has been published yet. This placeholder will be replaced by the first published harness output, which stamps the dataset version, run date, spend, and per-item results.

See the benchmark guide for how runs are produced and the caveats that accompany published numbers.