Bench Results
Regenerated by llm-council bench report --publish docs/bench-results.md; do not edit by hand (ADR-048 P3). No run has been published yet. This placeholder will be replaced by the first published harness output, which stamps the dataset version, run date, spend, and per-item results.
See the benchmark guide for how runs are produced and the caveats that accompany published numbers.