feat: in-repo arcade-gateway-eval bootstrap skill

2026-06-18 10:07:47 -04:00
parent 60b33575d0
commit 29c5b2c8be
@@ -0,0 +1,50 @@
---
name: arcade-gateway-eval
description: Use when starting or resuming any lane of the Arcade.dev MCP-gateway evaluation (categories 1-10), especially a parallel session. Establishes repo location, the read-first order, the live-state check, ground rules, and per-lane file ownership.
---
# Arcade gateway eval — lane bootstrap
You're picking up a lane of the Arcade.dev MCP-gateway benchmark. This repo is the
tool-agnostic source of truth; this skill just orients you. Do these in order.
## 1. Sync
- Repo: `~/repos/arcade-eval`. Run `git pull` first; if git refuses it (for example because the branches have diverged), use `git pull --rebase`.
- `cp config/.env.example .env` and fill creds if you haven't (creds live ONLY in `.env`).
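A plain `cp` overwrites an existing `.env`, which would wipe credentials on a resumed lane. One way to guard against that is sketched below. The function name `ensure_env` and its messages are illustrative assumptions, not something the repo ships.

```shell
# ensure_env REPO_DIR (hypothetical helper)
# Create REPO_DIR/.env from REPO_DIR/config/.env.example unless .env already exists.
ensure_env() {
  repo="$1"
  if [ -e "$repo/.env" ]; then
    # Never clobber a file that may already hold real credentials.
    echo "kept existing .env"
  else
    cp "$repo/config/.env.example" "$repo/.env" && echo "created .env from example"
  fi
}
```

Usage would be something like `ensure_env ~/repos/arcade-eval`, followed by filling in the credentials by hand.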
## 2. Read first (in order)
1. `STATUS.md` — who owns what, where each lane is.
2. `LIVE-POC.md` — frozen deployment facts (endpoints, IdP=Entra, the OTEL/metrics evidence).
3. `GROUND-RULES.md` — binding rules.
## 3. Live-state check (REQUIRED before any conclusion)
The deployment changes under you; docs age within a day.
```
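# Last 8 commits touching apps/arcade on origin/master (as of your last fetch)
git -C ~/repos/k8s-backstage-v2 log --oneline -8 origin/master -- apps/arcade
# HTTP status code returned by the dashboard
curl -sS -o /dev/null -w '%{http_code}\n' https://dashboard.arcade.st.dev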
```
If `apps/arcade` carries an in-flight "TEMPORARY" or teardown commit that has not been reverted,
the deployment is not in a validated steady state; don't draw conclusions from it.
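Scanning the log for such commits can be partly automated. The sketch below assumes the markers appear in the commit subject as the words "TEMPORARY" or "teardown". The function name `flag_inflight` is hypothetical, and a match is only a prompt for a manual check, since the filter cannot tell whether the commit was later reverted.

```shell
# flag_inflight (hypothetical helper)
# Reads `git log --oneline` output on stdin and prints every line whose subject
# mentions "temporary" or "teardown", case-insensitively.
# Exit status is 0 when at least one line matched, non-zero otherwise.
flag_inflight() {
  grep -i -E 'temporary|teardown'
}
```

A possible invocation: `git -C ~/repos/k8s-backstage-v2 log --oneline -8 origin/master -- apps/arcade | flag_inflight`. A revert commit usually quotes the original subject, so it matches as well; read the surrounding history before deciding either way.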
## 4. File ownership (parallel-session safety)
- Write only inside your `categories/catN-*/` subtree + your own `STATUS.md` section.
- Shared files (`config/targets.yaml`, `lib/`, top-level docs) are append-mostly; run
`git pull --rebase` before you push. The full table is in `GROUND-RULES.md`.
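Before pushing, the ownership rule can be checked mechanically. This is a minimal sketch: the function name `outside_lane` and the lane directory in the usage note are hypothetical, and it treats every path outside the lane subtree (other than `STATUS.md`) as worth a second look, including shared files where an append may be legitimate.

```shell
# outside_lane LANE_DIR (hypothetical helper)
# Reads changed paths on stdin, one per line, and prints those that are neither
# under LANE_DIR/ nor STATUS.md. Empty output means the change set stays in-lane.
outside_lane() {
  lane="${1%/}"   # tolerate a trailing slash on the argument
  while IFS= read -r path; do
    case "$path" in
      "$lane"/*|STATUS.md) ;;          # inside your subtree, or the status file
      *) printf '%s\n' "$path" ;;      # everything else needs a deliberate decision
    esac
  done
}
```

For example, `git diff --name-only '@{upstream}...HEAD' | outside_lane categories/cat5-example` would list the files in your unpushed commits that fall outside a lane directory named `cat5-example`.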
## 5. Ground rules you will trip on if you forget
- **Never write to the criteria Google Doc from a session.** Compose `criteria-section-N.md`
locally; the human pastes it. Criterion wording is **verbatim** from the criteria doc.
- Credentials only in `.env`. Single candidate ⇒ 1-5 scoring, anchors at 1/3/5.
## 6. Starting your lane
- Your `categories/catN-*/criteria-section-N.md` is pre-seeded with verbatim criteria.
- Copy `categories/_TEMPLATE/`'s `NOTES.md` + `tests/` into your dir; record progress in `NOTES.md`.
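The template copy can be scripted so that resuming a lane does not overwrite notes already taken. The sketch below assumes the layout named above. The function name `seed_lane` is hypothetical, and skipping files that already exist is a design choice of this example, not a stated repo rule.

```shell
# seed_lane REPO_DIR LANE_DIR (hypothetical helper)
# Copies NOTES.md and tests/ from categories/_TEMPLATE into the lane directory,
# skipping anything that already exists there. LANE_DIR is relative to REPO_DIR.
seed_lane() {
  tmpl="$1/categories/_TEMPLATE"
  dest="$1/$2"
  [ -d "$dest" ] || return 1                                   # lane dir must exist
  [ -e "$dest/NOTES.md" ] || cp "$tmpl/NOTES.md" "$dest/NOTES.md"
  [ -e "$dest/tests" ] || cp -R "$tmpl/tests" "$dest/tests"
}
```

Called as `seed_lane ~/repos/arcade-eval categories/cat5-example` (lane name illustrative), it is safe to run again on a resumed session.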
## 7. Category-specific pointers
- **cat 5 (auditability):** metrics go to **Grafana/Mimir**, NOT ELK. Engine OTLP is currently
dropped (no collector resolves at `arcade-otel-collector:4318`). See LIVE-POC "Metrics pipeline".
- **cat 1 / 2 (identity):** per-user behavior is testable via `user_id` headers (one API key,
many users); a real Entra **SSO login → identity-mapping** test is cat-2 and wants a second
real account (a teammate is the natural User B).
- **cat 9 (dev experience):** the shared `lib/mcp_server` (echo/whoami/add) is the fixture;
the DX *timing* of building a server from scratch is the separate measurement.