Editorial · Last reviewed May 2026
Benchmark methodology · how we score AI tools
How we score AI tools in our monthly benchmark dataset: task selection, scoring rubric, run protocol, and version history.
The benchmark methodology specification will be locked before the first run in Month 11.
Once the specification is locked, this page will document task selection, the scoring rubric, the run protocol, and version history. Until then, see the benchmarks index for the run schedule.