# Deploy arcade-eval reference MCP server to backstage k8s

**Date:** 2026-06-22

**Status:** DONE (deployed and verified end-to-end).

## Goal

Replace the ephemeral cloudflared **quick tunnel** (used to register the `arcade-eval-ref` server with the self-hosted Arcade engine) with a permanent deployment on `backstage-wus2-v4`, so the engine reaches the server over a stable URL instead of a `trycloudflare.com` URL that dies on restart. Relevant eval categories: cat-4 (custom server dev), cat-8 (deployment), cat-9 (DX).

## Key finding that shaped the final design

The first attempt registered the in-cluster **Service DNS** (`http://arcade-eval-ref.arcade-eval-ref.svc.cluster.local:8000`) as a dashboard worker. Health went green but **0 tools loaded**. Engine logs showed:

```
Failed to get worker tools: Get ".../worker/tools": dial tcp 10.0.192.27:8000: publicOnlyTransport: blocked connection to internal address
```

**The Arcade engine has an SSRF guard (`publicOnlyTransport`) that blocks dashboard-registered worker URIs resolving to internal/private (RFC1918) addresses.** Only workers declared in the **engine config file** (e.g. the bundled `arcade-worker-main` at `http://arcade-worker-main:8001`) may use internal URIs. Health checks aren't guarded (hence green), but the authenticated `/worker/tools` discovery is. The cloudflared tunnel worked only because it was a *public* URL.

⇒ A dashboard-registered in-cluster worker **must be exposed on a public URL**. (The worker secret was a red herring: the connection is refused before auth.)

## Architecture / data flow (final)

```
Claude Code ──▶ gateway zeb-gateway-test ──▶ Arcade engine
  ──HTTPS /worker/*──▶ https://arcade-eval-ref.st.dev
      (Cloudflare CNAME → k8s-backstage.st.dev → nginx ingress)
        └─▶ Service → Deployment: python:3.12 running mcp_server.server over HTTP :8000
            (echo / add / whoami). /mcp also served; /worker/* auth = ARCADE_WORKER_SECRET.
```

### Runtime facts (verified by introspecting `arcade-mcp-server` 1.17)

- `app.run()` honors env overrides via `_get_configuration_overrides()`: `ARCADE_SERVER_TRANSPORT=http`, `ARCADE_SERVER_HOST=0.0.0.0`, `ARCADE_SERVER_PORT=8000`. The hardcoded `127.0.0.1` in `server.py` is therefore overridden at runtime (no code change).
- `ARCADE_WORKER_SECRET` enables worker routes at `/worker/*`; the engine authenticates with an HS256 JWT (`aud=worker`, `ver=1`) signed with that secret. MCP is served at `/mcp`.

## Components (three repos)

### 1. `arcade-eval`: image

- `lib/mcp_server/Dockerfile`: `python:3.12-slim`, `pip install .`, HTTP transport via env, non-root, port 8000.
- `.github/workflows/build-push-acr.yml`: pushes `servicetitandev.azurecr.io/arcade-eval-ref:1.0.` (secrets `ACR_DEV_USERNAME`/`ACR_DEV_PASSWORD`). Adapted from `servicetitan/mem0`.

### 2. `k8s-backstage-v2`: `apps/mcp/arcade-eval-ref/`

- `namespace.yaml`: ns `arcade-eval-ref`.
- `server.yaml`: **st-app HelmRelease** (chart 2.0.72): `image` pinned to `1.0.1`, `service.internalPort: 8000`, **`ingress.enabled` host `arcade-eval-ref.st.dev` class `nginx`, `oAuth.enabled: false`** (no SSO wall over `/worker/*` or `/mcp`), worker secret via `envFrom` from the SealedSecret, probes off. TLS = ingress default `*.st.dev` wildcard cert.
- `sealedsecret.yaml`: `arcade-eval-ref-worker-secret` (key `ARCADE_WORKER_SECRET`), strict scope, sealed with the backstage-wus2-v4 sealed-secrets cert.

### 3. `iac-terraform-workspaces`: DNS

- CNAME `arcade-eval-ref.st.dev` → `k8s-backstage.st.dev` (st.dev zone), mirroring the `anvil`/`alerts` pattern.

## Registration (dashboard)

Add/repoint the worker: URI `https://arcade-eval-ref.st.dev`, Secret = the worker-secret plaintext (git-ignored at `results/arcade-eval-ref-worker-secret.txt`). The engine then fetches `/worker/tools` over the public URL → tools load → add to `zeb-gateway-test`.
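
For manual checks against `/worker/tools`, a token matching the worker authentication noted under Runtime facts (HS256, `aud=worker`, `ver=1`) can be minted with the standard library alone. This is a minimal sketch, not the engine's own code: the function names are hypothetical, the `exp` claim and its 60-second lifetime are assumptions, and these notes do not say whether `ver` is a number or a string (an integer is assumed here).

```python
import base64
import hashlib
import hmac
import json
import time


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as the JWT compact form requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def mint_worker_jwt(secret: str, ttl_seconds: int = 60) -> str:
    """Build an HS256 JWT carrying the aud/ver claims recorded in these notes."""
    header = {"alg": "HS256", "typ": "JWT"}
    # "exp" is an assumption; only aud=worker and ver=1 come from the notes.
    claims = {"aud": "worker", "ver": 1, "exp": int(time.time()) + ttl_seconds}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(
        secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return f"{signing_input}.{b64url(signature)}"


def signature_is_valid(token: str, secret: str) -> bool:
    """Recompute the HMAC over header.payload and compare in constant time."""
    signing_input, _, signature = token.rpartition(".")
    expected = hmac.new(
        secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return hmac.compare_digest(b64url(expected), signature)
```

The secret passed in would be the plaintext of `ARCADE_WORKER_SECRET`. How the token is presented to the server (presumably as a bearer token) is not recorded in these notes, so that step is left out.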
## Verified

- `https://arcade-eval-ref.st.dev/worker/health` → 200 (valid `*.st.dev` LE cert); `/worker/tools` with a correct worker JWT → 200, tools `Echo/Add/Whoami`.
- Through the gateway: `ArcadeEvalRef_Whoami()` → the caller's Entra `sub` (`GvgRofe5…`), proving per-user execution across the full client → gateway → engine → public URL → in-cluster pod chain.

## Alternative considered (not taken)

Declare the server as a static worker in the **engine config** (`tools.directors[].workers`, like `arcade-worker-main`). That path allows internal URIs and avoids public exposure, but edits the vendor Helm release (`apps/arcade`) and loses the dashboard per-project workflow. Public ingress was chosen as the lower-touch option.
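
The internal-versus-public distinction behind both options can be illustrated with Python's `ipaddress` module. This is only a sketch of the kind of check an SSRF guard performs, under the assumption that it classifies the resolved address. It is not the engine's `publicOnlyTransport` implementation, and `is_internal_address` is a hypothetical name.

```python
import ipaddress


def is_internal_address(ip: str) -> bool:
    """Return True for addresses a public-only transport would plausibly reject."""
    addr = ipaddress.ip_address(ip)
    # Private ranges (including RFC 1918), loopback, and link-local are
    # all treated as internal in this sketch.
    return addr.is_private or addr.is_loopback or addr.is_link_local
```

Under this check the ClusterIP from the engine log, `10.0.192.27`, counts as internal. A dashboard-registered worker would need a hostname resolving to an address for which the function returns `False`.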