Category 7 — Performance and Availability (weight 8)

Because every gateway-mediated tool call routes through the Arcade Engine, even when the custom server is self-hosted, Engine latency and availability bound what the entire agent stack can achieve. Criteria and gates below are verbatim from the criteria Google Doc. Fill in Score and Evidence locally; the human pastes them. Scores use a 1–5 scale with anchors at 1, 3, and 5.

Scores

#	Criterion (verbatim)	Score (1–5)	Evidence / note
1	Engine-added latency per tool call is within acceptable bounds for interactive agent use.
2	Engine SLA — defined uptime guarantees with incident response process.
3	Failure behavior when Engine is unavailable: fail-closed with a clean, catchable error.
4	Self-hosted server HA — multi-replica, pod failure handling, no dropped calls on restart.
5	Multi-region failover design — documented and validated.
6	Engine geographic placement and round-trip latency from ServiceTitan's primary region.

Average: ___ Category score: ___

Score anchors

1 — Engine SLA undocumented; failure behavior is a hang or silent failure; no HA guidance
3 — SLA documented; HA works with manual configuration; failure behavior is known but requires client-side handling
5 — SLA with incident response in writing; HA is the documented default; failure behavior is clean and observable

Benchmark tests

#	Test (verbatim)	Result	Evidence
1	Make 100 tool calls through the Engine to a self-hosted server. Measure P50, P95, P99 round-trip latency. Compare against a direct server call (bypassing the Engine) to isolate Engine-added overhead.
2	Simulate Engine unavailability (block the Engine endpoint). Confirm tool calls fail with a clean, catchable error — not a hang or silent failure.
3	Deploy the custom server with multiple replicas. Kill one pod. Confirm tool calls continue without dropped requests.
4	Confirm Engine SLA documentation: uptime percentage, response time commitment, and P0 escalation path.
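
Benchmark test 1 could be scripted roughly as follows. This is a minimal sketch, not an Arcade client: `percentile`, `measure_ms`, `summarize`, and `engine_overhead` are hypothetical helper names, and the two callables passed to `measure_ms` stand in for whatever code issues one tool call through the Engine and one call directly to the self-hosted server. Subtracting percentiles of two separate runs only approximates per-call overhead, since the two runs are not paired.

```python
import math
import time
from typing import Callable, Dict, List


def percentile(samples: List[float], p: float) -> float:
    """Nearest-rank percentile of a non-empty sample list, with 0 < p <= 100."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def measure_ms(call: Callable[[], object], n: int = 100) -> List[float]:
    """Invoke `call` n times and return each round-trip time in milliseconds."""
    latencies = []
    for _ in range(n):
        start = time.perf_counter()
        call()  # hypothetical: one tool call (via Engine or direct)
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def summarize(samples: List[float]) -> Dict[str, float]:
    """P50, P95, and P99 of one latency run."""
    return {f"p{p}": percentile(samples, p) for p in (50, 95, 99)}


def engine_overhead(via_engine: List[float], direct: List[float]) -> Dict[str, float]:
    """Approximate Engine-added latency as the difference of percentiles."""
    engine, baseline = summarize(via_engine), summarize(direct)
    return {key: engine[key] - baseline[key] for key in engine}
```

With real callables, `engine_overhead(measure_ms(call_via_engine), measure_ms(call_direct))["p95"]` is the figure to compare against the 500ms Engine overhead gate; `call_via_engine` and `call_direct` are placeholders to be supplied by the evaluator.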

Suggested pass/fail gates

Gate	Pass condition (verbatim)	Result	Evidence
Engine overhead	P95 Engine-added latency is under 500ms for standard (non-streaming) tool calls
SLA documented	Engine uptime SLA and incident response process confirmed in writing
HA	Self-hosted server survives pod failure; no tool calls dropped during pod restart
Fail behavior	Engine outage produces a clean, catchable error to the agent — no hangs
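
The HA and Fail behavior gates both come down to classifying the outcome of each call while a fault is injected. The sketch below is one possible harness under stated assumptions: `probe` and `run_probes` are hypothetical names, `call` is a placeholder for the code that issues one tool call, and `timeout_s` is an evaluator-chosen threshold after which a call counts as a hang. It does not detect silent failures (a call that returns normally with a wrong or empty payload); that needs a payload check specific to the tool.

```python
import concurrent.futures
import time
from collections import Counter
from typing import Callable


def probe(call: Callable[[], object], timeout_s: float) -> str:
    """Classify one call as 'ok', 'clean_error', or 'hang'."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(call)
    try:
        done, _ = concurrent.futures.wait([future], timeout=timeout_s)
        if not done:
            return "hang"  # no result and no exception within the timeout
        # The call finished: an exception here is a catchable error.
        return "clean_error" if future.exception() is not None else "ok"
    finally:
        # Do not block on a hung worker thread.
        executor.shutdown(wait=False)


def run_probes(call: Callable[[], object], n: int, timeout_s: float,
               pause_s: float = 0.0) -> Counter:
    """Run n probes in sequence and tally the outcomes."""
    outcomes: Counter = Counter()
    for _ in range(n):
        outcomes[probe(call, timeout_s)] += 1
        if pause_s:
            time.sleep(pause_s)
    return outcomes
```

Read against the gates: with the Engine endpoint blocked, every outcome should be `clean_error` and none `hang`; while a pod is killed and restarted, every outcome should be `ok`.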

Findings
