feat: in-repo arcade-gateway-eval bootstrap skill
@@ -0,0 +1,50 @@
---
name: arcade-gateway-eval
description: Use when starting or resuming any lane of the Arcade.dev MCP-gateway evaluation (categories 1-10), especially a parallel session. Establishes repo location, the read-first order, the live-state check, ground rules, and per-lane file ownership.
---

# Arcade gateway eval — lane bootstrap

You're picking up a lane of the Arcade.dev MCP-gateway benchmark. This repo is the
tool-agnostic source of truth; this skill just orients you. Do these in order.
## 1. Sync
- Repo: `~/repos/arcade-eval`. Run `git pull` first (if it is rejected, `git pull --rebase`).
- `cp config/.env.example .env` and fill in the creds if you haven't already (creds live ONLY in `.env`).
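
The two sync steps can be sketched as a pair of shell functions. This is a minimal sketch, not something shipped in the repo: the names `sync_repo` and `ensure_env` are hypothetical, and it assumes `.env` lives at the repo root, as the `cp` command above implies. The copy is skipped when `.env` already exists so that filled-in credentials are never overwritten.

```shell
#!/usr/bin/env bash
# Sketch of step 1. Function names are illustrative, not part of the repo.

# Pull, falling back to a rebase pull if the plain pull is rejected.
sync_repo() {
  local repo="${1:-$HOME/repos/arcade-eval}"
  git -C "$repo" pull || git -C "$repo" pull --rebase
}

# Create .env from the example only if it is missing, so credentials
# that were already filled in are left untouched.
ensure_env() {
  local repo="${1:-$HOME/repos/arcade-eval}"
  if [ -e "$repo/.env" ]; then
    echo "$repo/.env already exists; leaving it alone"
  else
    cp "$repo/config/.env.example" "$repo/.env"
    echo "created $repo/.env; fill in credentials before running anything"
  fi
}
```

Typical use would be `sync_repo && ensure_env`, run once at the start of a session.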
## 2. Read first (in order)
1. `STATUS.md` — who owns what, where each lane is.
2. `LIVE-POC.md` — frozen deployment facts (endpoints, IdP=Entra, the OTEL/metrics evidence).
3. `GROUND-RULES.md` — binding rules.
## 3. Live-state check (REQUIRED before any conclusion)
The deployment changes under you; docs age within a day.
```
git -C ~/repos/k8s-backstage-v2 log --oneline -8 origin/master -- apps/arcade
curl -sS -o /dev/null -w '%{http_code}\n' https://dashboard.arcade.st.dev
```
Any not-yet-reverted in-flight "TEMPORARY"/teardown commit on `apps/arcade` ⇒ not a validated
steady state; don't draw conclusions from it.
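
One way to make the "TEMPORARY"/teardown scan less error-prone is a small filter over the log output. The sketch below is a heuristic under stated assumptions: `flag_inflight` is a hypothetical name, and matching the words "temporary" and "teardown" in commit subjects is a guess at how such commits are labelled. It cannot tell whether a flagged commit has since been reverted, so a human still has to read the result. A revert commit that quotes the original subject is printed too, which helps with that. Keep in mind that `origin/master` is only as fresh as the last fetch.

```shell
#!/usr/bin/env bash
# Heuristic sketch. flag_inflight is an illustrative name, not a repo helper.

# Reads `git log --oneline` output on stdin and prints every line whose
# subject mentions "temporary" or "teardown" (case-insensitive).
# Prints nothing when no candidate is found.
flag_inflight() {
  grep -iE 'temporary|teardown' || true
}

# Example use (not run here):
#   git -C ~/repos/k8s-backstage-v2 fetch origin
#   git -C ~/repos/k8s-backstage-v2 log --oneline -8 origin/master -- apps/arcade | flag_inflight
```

Empty output means none of the listed commits look in-flight; any output means the current state should not be treated as steady until each listed commit is accounted for.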
## 4. File ownership (parallel-session safety)
- Write only inside your `categories/catN-*/` subtree + your own `STATUS.md` section.
- Shared files (`config/targets.yaml`, `lib/`, top-level docs) are append-mostly;
  `git pull --rebase` before push. Full table in `GROUND-RULES.md`.
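
A pre-push self-check for the ownership rule could look like the following. It is a sketch under assumptions: `outside_lane` is a hypothetical helper, and the lane directory used in the example is made up. It lists changed paths that fall outside your lane subtree and `STATUS.md`, so you can confirm that each one is a deliberate, append-style edit to a shared file.

```shell
#!/usr/bin/env bash
# Sketch only. outside_lane is an illustrative name, not a repo helper.

# Reads changed paths on stdin (one per line) and prints those that are
# neither under the given lane directory nor STATUS.md.
outside_lane() {
  local lane_dir="${1%/}" path
  while IFS= read -r path; do
    case "$path" in
      "$lane_dir"/*|STATUS.md) ;;      # owned by this lane: stay quiet
      *) printf '%s\n' "$path" ;;      # shared or someone else's: report
    esac
  done
}

# Example use (lane directory name is hypothetical, not run here):
#   git diff --name-only '@{upstream}...HEAD' | outside_lane categories/cat5-auditability
```

The check only looks at paths. It does not verify that an edit to `STATUS.md` stays inside your own section.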
## 5. Ground rules you will trip on if you forget
- **Never write to the criteria Google Doc from a session** — compose `criteria-section-N.md`
  locally; the human pastes it in. Criterion wording is **verbatim** from the criteria doc.
- Credentials only in `.env`. Single candidate ⇒ 1–5 scoring, anchors at 1/3/5.
## 6. Starting your lane
- Your `categories/catN-*/criteria-section-N.md` is pre-seeded with verbatim criteria.
- Copy `categories/_TEMPLATE/`'s `NOTES.md` + `tests/` into your dir; record progress in `NOTES.md`.
## 7. Category-specific pointers
- **cat 5 (auditability):** metrics go to **Grafana/Mimir**, NOT ELK. Engine OTLP is currently
  dropped (no collector resolves at `arcade-otel-collector:4318`). See "Metrics pipeline" in `LIVE-POC.md`.
- **cat 1 / 2 (identity):** per-user behavior is testable via `user_id` headers (one API key,
  many users); a real Entra **SSO login → identity-mapping** test is cat-2 and needs a second
  real account (a teammate is the natural User B).
- **cat 9 (dev experience):** the shared `lib/mcp_server` (echo/whoami/add) is the fixture;
  the DX *timing* of building a server from scratch is the separate measurement.