# Ground Rules (binding)
These apply to every lane and every session. Read before doing anything.
## Credentials
- Credentials live **only** in the git-ignored `.env`. Never print, commit, or persist keys
elsewhere (not in docs, not in `config/targets.yaml`, not in commit messages).
- Load with `set -a && . ./.env && set +a`.
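A minimal sketch of a guarded loader, assuming you want the load step to fail loudly when `.env` is not actually ignored. `load_env` is a hypothetical helper name, not something this repo is known to ship; `git check-ignore` is the standard Git command for testing ignore rules.

```shell
# Hypothetical helper: source ./.env only when git confirms it is ignored.
load_env() {
  # check-ignore exits 0 only if .env matches an ignore rule and is untracked.
  if ! git check-ignore -q .env; then
    echo "refusing to load .env: not git-ignored in $(pwd)" >&2
    return 1
  fi
  set -a          # export every variable assigned while sourcing
  . ./.env
  set +a
}
```

Run it from the repo root. The function prints nothing from `.env` itself, so it stays within the "never print" rule.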
## The criteria Google Doc
- **Never write the criteria Google Doc from a session.** Concurrent writes spliced tables
mid-word in the prior eval. Compose `criteria-section-N.md` locally; **the human pastes.**
- Criterion / gate / benchmark-question wording is **verbatim** from the criteria doc —
never paraphrase. Re-read the doc if unsure.
## Live-state check (REQUIRED before any conclusion)
The deployment is actively changing; status docs age within a day. Before drawing any
conclusion from the live instance:
```
git -C ~/repos/k8s-backstage-v2 log --oneline -8 origin/master -- apps/arcade
```
plus a dashboard/gateway health probe (e.g. `curl -sS -o /dev/null -w '%{http_code}\n' https://dashboard.arcade.st.dev`).
Any not-yet-reverted in-flight "TEMPORARY"/teardown commit means the bench is NOT in a
validated steady state — don't draw conclusions from it.
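The two checks above could be wrapped as follows. This is a sketch under stated assumptions: `flag_inflight`, `is_healthy`, and `live_state_check` are hypothetical names; it assumes in-flight commits mention "TEMPORARY" or "teardown" in their subject line, and that any 2xx status counts as healthy. It only surfaces candidates; deciding whether a flagged commit has since been reverted remains a human judgment.

```shell
# Reads `git log --oneline` output on stdin and prints subjects that look
# like in-flight work. Assumption: they mention TEMPORARY or teardown.
# A later revert commit usually quotes the same subject and is listed too.
flag_inflight() {
  grep -iE 'temporary|teardown'
}

# Succeeds only for a 2xx status code (assumption: 2xx means healthy).
is_healthy() {
  case "$1" in
    2[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Runs both checks; a non-zero exit means "do not draw conclusions yet".
live_state_check() {
  log=$(git -C ~/repos/k8s-backstage-v2 log --oneline -8 origin/master -- apps/arcade) || return 1
  if printf '%s\n' "$log" | flag_inflight; then
    echo "possible in-flight commits listed above: confirm each was reverted" >&2
    return 1
  fi
  code=$(curl -sS -o /dev/null -w '%{http_code}' https://dashboard.arcade.st.dev)
  if ! is_healthy "$code"; then
    echo "health probe returned $code" >&2
    return 1
  fi
}
```

Call `live_state_check` at the start of a session and again before writing up any finding from the live instance.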
## File ownership (parallel-session safety)
| You may write | You may NOT write |
|---|---|
| your `categories/catN-*/` subtree (criteria-section-N.md, tests/, NOTES.md) | another lane's `categories/` subtree |
| your own section of `STATUS.md` | another lane's STATUS section |
| `config/targets.yaml`, `lib/`, top-level docs — **append-mostly**, coordinate | — |
| `results/` (git-ignored) | the criteria Google Doc (see above) |
Run `git pull --rebase` before starting and again before pushing; if the push is rejected, run `git pull --rebase` again and retry the push.
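As a rough pre-push aid for the ownership table, the sketch below lists changed paths that fall under another lane's `categories/` subtree. `foreign_paths` is a hypothetical helper, and it covers only the `categories/catN-*/` row; it cannot tell whose section of `STATUS.md` was edited.

```shell
# Hypothetical helper: reads file paths on stdin and prints those under
# categories/catM-*/ where M is not the lane number given as $1.
# Exits 0 when at least one foreign path was found, non-zero otherwise.
foreign_paths() {
  lane="$1"
  grep -E '^categories/cat[0-9]+-' | grep -vE "^categories/cat${lane}-"
}

# Example (lane 3, comparing local commits against the upstream branch):
#   git diff --name-only '@{u}'..HEAD | foreign_paths 3
```

Any output means the pending push touches another lane's subtree and should be reworked before pushing.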
## Deployment changes
- `~/repos/k8s-backstage-v2/apps/arcade/**` is read freely but changed **only deliberately,
with the operator** (infra owns this cluster/POC). Expected case: the cat-5 collector+exporter
remediation — propose first, execute together, document before/after.
## Scoring
- Single candidate (Arcade only): 1-5 scale, anchors at 1/3/5. Scores drafted locally;
nothing lands in the Google Doc/spreadsheet without the human pasting.