servicetitan/arcade-eval

ztaylor 593e1e63b6 docs: _TEMPLATE + all-10 criteria-section stubs (verbatim criteria)

2026-06-18 10:10:17 -04:00

Category 6 — Security and Compliance (weight 10)

Criteria and gates are copied verbatim from the criteria Google Doc. Fill in Score and Evidence locally; a human then pastes the results. Scores use a 1–5 scale, with anchors at 1, 3, and 5.

Scores

| # | Criterion (verbatim) | Score (1–5) | Evidence / note |
|---|---|---|---|
| 1 | PII masking or redaction at the gateway layer — without changes to tool code. | | |
| 2 | Input blocking — Contextual Access policy can block tool calls based on content. | | |
| 3 | MCPs can be scaled to less than human access. | | |
| 4 | Output redaction — sensitive fields removed from responses before reaching the agent. | | |
| 5 | Data processing agreement (DPA) and sub-processor disclosure in place. | | |
| 6 | SOC 2 / ISO 27001 certification (or equivalent) confirmed. | | |
| 7 | Data boundary acceptable to InfoSec — tool call payloads route through Arcade's Engine; execution stays in ServiceTitan's infrastructure. | | |
| 8 | Raw OAuth tokens are never exposed to the LLM, agent code, or logs. | | |
| 9 | Secrets management integration (Azure Key Vault or equivalent) for API key storage. | | |
| 10 | Potential for log forwarding for telemetry, alerting | | |
| 11 | Potential integration for DLP tooling if possible | | |
| 12 | Data boundary guardrails (able to block querying all records from a table) | | |

Average: ___ Category score: ___
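
The Average line can be computed locally once the Score column is filled in. The following is a minimal sketch, assuming that criteria not yet scored are left blank and skipped. The `scores` dict and the `average_score` name are hypothetical, and how the average maps to the category score (weight 10) is not specified in this section, so that step is left out.

```python
from typing import Optional

def average_score(scores: dict[int, Optional[int]]) -> Optional[float]:
    """Average the filled-in 1-5 scores; blank (None) criteria are skipped."""
    filled = [s for s in scores.values() if s is not None]
    for s in filled:
        if not 1 <= s <= 5:
            raise ValueError(f"score {s} is outside the 1-5 scale")
    # Return None rather than 0 when nothing has been scored yet.
    return round(sum(filled) / len(filled), 2) if filled else None

# Hypothetical example: criteria 1-3 scored, criterion 4 still blank.
scores = {1: 4, 2: 3, 3: 5, 4: None}
print(average_score(scores))  # 4.0
```

Skipping blanks keeps a partially filled sheet from being dragged toward zero; whether unscored criteria should instead count against the category is a decision for the evaluators.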

Score anchors

1 — No policy enforcement; payloads flow unmodified; DPA and certifications unconfirmed
3 — Some policy controls exist; DPA in progress; compliance posture requires follow-up
5 — Full policy enforcement, DPA executed, compliant data boundary, tokens never exposed

Benchmark tests

| # | Test (verbatim) | Result | Evidence |
|---|---|---|---|
| 1 | Send a tool input containing a mock SSN. Verify it is redacted before reaching the tool function via a Contextual Access rule. | | |
| 2 | Send a tool output containing a mock API key string. Verify it is redacted before reaching the agent. | | |
| 3 | Attempt a tool call with an expired or revoked credential. Verify rejection with a clean error — no fallback to a shared credential. | | |
| 4 | Attempt to call a tool that has been restricted by the MCP gateway that the person usually can perform | | |
| 5 | Attempt to pull all records from an MCP integration, instead of focused data | | |
| 6 | Review the DPA and sub-processor list against ServiceTitan's data governance requirements. | | |
| 7 | Confirm in the Engine architecture that raw tokens never appear in logs, traces, or agent responses. | | |
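
Tests 1, 2, and 7 all come down to planting a known mock value and checking that it does not appear in whatever was captured downstream: the tool function's input, the agent-visible response, or exported logs. The following is a minimal sketch of that check, assuming the captured text is available as a string. The mock values, the `find_leaks` helper, the patterns, and the sample payloads are all hypothetical, and nothing here calls an Arcade API.

```python
import re

# Mock values to plant in test inputs and outputs. Both are fabricated.
MOCK_SSN = "123-45-6789"
MOCK_API_KEY = "sk_test_MOCK0000000000000000"

# Generic patterns catch sensitive-looking strings beyond the planted literals.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "bearer_token": re.compile(r"Bearer\s+[A-Za-z0-9._~+/=-]{16,}"),
}

def find_leaks(captured: str) -> list[str]:
    """Return a label for every planted value or pattern found in the text."""
    leaks = []
    if MOCK_SSN in captured:
        leaks.append("mock_ssn")
    if MOCK_API_KEY in captured:
        leaks.append("mock_api_key")
    for label, pattern in PATTERNS.items():
        if pattern.search(captured):
            leaks.append(label)
    return leaks

# Hypothetical captures of what the tool function received.
unredacted = f'{{"note": "customer SSN is {MOCK_SSN}"}}'
redacted = '{"note": "customer SSN is [REDACTED]"}'

print(find_leaks(unredacted))  # ['mock_ssn', 'ssn']
print(find_leaks(redacted))    # []
```

An empty list is the pass condition for a given capture. The list of labels from a failing capture can be pasted into the Evidence column as-is.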

Suggested pass/fail gates

| Gate | Pass condition (verbatim) | Result | Evidence |
|---|---|---|---|
| Data boundary | Tool call payloads through Arcade Engine + execution in ServiceTitan infrastructure — acceptable to InfoSec | | |
| No token exposure | Raw OAuth tokens are never visible in logs, traces, or agent responses | | |
| DPA | Data processing agreement is executed before the pilot ends | | |
| PII policy | At least one PII redaction rule works end-to-end | | |
| Compliance | SOC 2 or equivalent certification confirmed | | |
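
Because these are pass/fail gates, one reasonable reading is that a single failed gate blocks the category regardless of the averaged score, and that a gate with no recorded result leaves the outcome open. The following is a minimal sketch under that assumption. The roll-up rule and the `gate_summary` name are illustrative and do not come from the criteria doc.

```python
from typing import Optional

# Gate names as listed in the table above.
GATES = ["Data boundary", "No token exposure", "DPA", "PII policy", "Compliance"]

def gate_summary(results: dict[str, Optional[bool]]) -> str:
    """Summarise gates: True = pass, False = fail, None or missing = not recorded."""
    failed = [g for g in GATES if results.get(g) is False]
    pending = [g for g in GATES if results.get(g) is None]
    if failed:
        return "FAIL: " + ", ".join(failed)
    if pending:
        return "PENDING: " + ", ".join(pending)
    return "PASS"

# Hypothetical mid-pilot state: three gates passed, two not yet recorded.
results = {"Data boundary": True, "No token exposure": True, "PII policy": True}
print(gate_summary(results))  # PENDING: DPA, Compliance
```

Reporting failures ahead of pending gates means a known blocker is never hidden behind unfinished paperwork.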

Findings