arcade-eval/categories/_TEMPLATE/criteria-section.md

Category N — (weight W)

Verbatim criteria / gates / questions from the criteria Google Doc. Fill in Score / Evidence / Findings / Answers locally; a human then pastes them into the Google Doc. Scores use a 1–5 scale, with anchors at 1/3/5.

Scores

| # | Criterion (verbatim) | Score (1–5) | Evidence / note |
|---|----------------------|-------------|-----------------|
| 1 |                      |             |                 |

Average: ___ Category score: ___
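
For filling in the Average line locally, a minimal sketch follows. It assumes the average is the plain arithmetic mean of the criterion scores filled in so far, on the 1–5 scale. The function name `category_average` and the example scores are hypothetical, and how the average maps to the category score or interacts with weight W is left to the criteria Google Doc rather than assumed here.

```python
def category_average(scores):
    """Mean of the filled-in criterion scores, rounded to two decimals.

    Assumption: the Average line is a plain arithmetic mean.
    Unscored criteria are passed as None and skipped.
    """
    filled = [s for s in scores if s is not None]
    if not filled:
        raise ValueError("no criterion has been scored yet")
    for s in filled:
        if not 1 <= s <= 5:
            raise ValueError(f"score {s} is outside the 1-5 scale")
    return round(sum(filled) / len(filled), 2)


# Made-up example: three criteria scored, one still blank.
example_scores = [4, 3, None, 5]
print(category_average(example_scores))
```

Skipping blank rows keeps a partially filled sheet from dragging the average down. If every criterion must be scored before averaging, reject `None` instead.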

Score anchors

  • 1
  • 3
  • 5

Benchmark questions / tests

| # | Question / test (verbatim) | Answer / result | Evidence |
|---|----------------------------|-----------------|----------|
| 1 |                            |                 |          |

Suggested pass/fail gates

| Gate | Pass condition (verbatim) | Result | Evidence |
|------|---------------------------|--------|----------|

Findings