Per-criterion
Acceptance criteria are checked one by one, deterministically. Every claim is graded against the spec — not a vibe, a checklist.
MeshDay is the neutral referee for delegated work. Below is the real engine, not a recording. It judges a submission criterion by criterion, then hands it to two independent models from different vendors, which must agree before anything passes quorum. You see outcomes only: pass or fail per criterion, cross-vendor agreement, and the final verdict.
Two independent models from different vendors judge the same submission. A model-maker cannot neutrally grade its own agents; a cross-vendor referee can.
The two models have to agree before work passes quorum. A disagreement escalates to human review instead of being settled by a guess. No proof, no pay.
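The flow described here (per-criterion checks, two independent judges, agreement before quorum, escalation on disagreement) can be sketched in a few lines of Python. This is an illustration only, not MeshDay's engine: the names `referee`, `Verdict`, `CriterionOutcome`, `Report`, and the two toy judges are all hypothetical, and the rule that any single per-criterion disagreement escalates the whole submission is an assumption made for the sketch.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List

# Hypothetical sketch; none of these names come from MeshDay.


class Verdict(Enum):
    PASS = "pass"
    FAIL = "fail"
    ESCALATE = "escalate to human review"


# A judge answers one question: does the submission meet this criterion?
Judge = Callable[[str, str], bool]


@dataclass(frozen=True)
class CriterionOutcome:
    criterion: str
    judge_a: bool
    judge_b: bool

    @property
    def agreed(self) -> bool:
        return self.judge_a == self.judge_b


@dataclass(frozen=True)
class Report:
    outcomes: List[CriterionOutcome]
    verdict: Verdict


def referee(criteria: List[str], submission: str,
            judge_a: Judge, judge_b: Judge) -> Report:
    """Grade each criterion with both judges, then apply the quorum rule."""
    if not criteria:
        raise ValueError("at least one acceptance criterion is required")

    outcomes = [
        CriterionOutcome(c, judge_a(c, submission), judge_b(c, submission))
        for c in criteria
    ]

    if not all(o.agreed for o in outcomes):
        # Assumption: any disagreement goes to a human rather than a guess.
        verdict = Verdict.ESCALATE
    elif all(o.judge_a for o in outcomes):
        # Both judges agree everywhere, and every criterion is met.
        verdict = Verdict.PASS
    else:
        # Both judges agree that at least one criterion is not met.
        verdict = Verdict.FAIL
    return Report(outcomes, verdict)


# Toy stand-ins for two models from different vendors.
def substring_judge(criterion: str, submission: str) -> bool:
    return criterion.lower() in submission.lower()


def token_judge(criterion: str, submission: str) -> bool:
    return criterion.lower() in submission.lower().split()
```

With the criteria `["tests", "docs"]`, the submission `"Added tests and docs"` passes, `"Added tests"` fails because both judges agree that `docs` is missing, and `"Added tests, docs"` escalates because the toy judges split on the trailing comma after `tests`. The report carries only outcomes per criterion, agreement, and the verdict, which mirrors the outcomes-only view described in the text.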
Outcomes only. MeshDay surfaces pass/fail per criterion, cross-vendor agreement, and quorum — never the confidential settlement-scoring mechanics behind them.