docs(cat3): AI gateway + MCP gateway (Arcade) integration architecture
Where the AI Gateway (LLM proxy) and MCP Gateway (Arcade) fit into the Agent Platform -> Tool Hub -> Automation Hub stack without major work on either app: AI Gateway = config repoint; Arcade = Tool Hub's existing mcp_proxy adapter seam, with per-user OAuth living in Arcade. Includes GitHub-renderable mermaid topology + sequence diagrams and a change-surface table, grounded in tool-hub + automation-hub source reads.
@@ -0,0 +1,200 @@
# Where the AI Gateway and MCP Gateway fit: target architecture

> Cat-3 (Tool-Level Access Control & Policy) deliverable: the V4 seam map, extended into a
> concrete integration design. **Goal:** place an **AI Gateway** (LLM/model proxy) and an
> **MCP Gateway** (Arcade) into the existing `Agent Platform → Tool Hub → Automation Hub`
> stack **without major work on the Tool Hub or Automation Hub applications.**
>
> Grounded in: `servicetitan/tool-hub` @ master, `servicetitan/automation-hub` @ master,
> arcade-eval LIVE-POC (all read 2026-06-22).

## The thesis in one paragraph

Both Tool Hub and Automation Hub were built with the exact seams this design needs, and neither
provides the one capability Arcade exists for: per-user third-party OAuth. **Tool Hub** already
has a data-driven `IExecutionAdapter` registry with a **`mcp_proxy` SourceType named in the
contract**, so adding Arcade is the *intended* extension, not surgery. **Automation Hub**
explicitly scopes per-user OAuth / connector infrastructure as a **non-goal** and names per-user
OAuth brokering as the gap an external platform fills. So the minimal-work design is:
**(1) AI Gateway = pure configuration** (repoint the model/embedding base URLs every component
already calls); **(2) MCP Gateway (Arcade) = one adapter pair behind Tool Hub's existing
`mcp_proxy` seam**, with all per-user third-party OAuth living *inside Arcade* (so Tool Hub
needs no credential vault and no new OBO authority). Automation Hub is untouched. Tool Hub
remains the single authority/policy/audit plane over **both** execution backends.

## Design constraints: what "no major work" means here

| App | Allowed | Explicitly avoided |
|---|---|---|
| **Tool Hub** | Implement one `ICatalogSource` + one `IExecutionAdapter` (`type='arcade'`/`mcp_proxy`), the designed extension point. Config: model base URLs → AI Gateway. | No change to discovery hot path, permission model, idempotency, audit, or the OBO core. Per-user SaaS OAuth is **not** added to Tool Hub. |
| **Automation Hub** | Nothing. | No new executor, no connector framework, no OAuth store. AH stays one of Tool Hub's catalog sources. |
| **Agent Platform** | Config: inference endpoint → AI Gateway; identity = per-user Entra SSO. | No re-architecture. |

## 1. Target topology

```mermaid
flowchart TB
    subgraph IDP["Identity"]
        Entra["Entra ID SSO<br/>per-user login / IUM"]
    end

    subgraph AGENT["Agent plane"]
        Agent["LLM Agent<br/>(AgentOS / sidecar)"]
    end

    subgraph GW["Gateways (inserted, no app surgery)"]
        AIGW["AI Gateway<br/>LiteLLM-class LLM/model proxy<br/>keys · routing · rate-limit · cost · audit"]
        MCPGW["MCP Gateway: Arcade<br/>MCP transport + per-user OAuth broker"]
    end

    subgraph TH["Tool Hub: authority / data plane (core UNCHANGED)"]
        MCPHost["MCP surface<br/>search_tools · get_tool_details · execute_tool"]
        Policy["Stage0-6: permission re-check ·<br/>idempotency · rate-limit · audit/outbox"]
        Reg["IExecutionAdapter registry<br/>(catalog_source.type → adapter)"]
        AHAdapter["automation_hub adapter<br/>(exists)"]
        ArcAdapter["arcade adapter<br/>(NEW, mcp_proxy seam)"]
    end

    subgraph AH["Automation Hub: UNCHANGED"]
        AHCat["Catalog API<br/>GET /api/catalog/actions (ETag, cursor)"]
        AHExec["POST /actions/{id}/execute<br/>st.automation_hub.execute"]
        AHDown["ST Core API v2 / Internal API<br/>IUM bot-user impersonation"]
    end

    subgraph EXT["Third-party + custom capability"]
        SaaS["GitHub · Slack · Google · ..."]
        Custom["Custom / partner MCP servers"]
    end

    subgraph MODELS["Model providers"]
        LLMs["Anthropic · Voyage · OpenAI · internal"]
    end

    Entra -. "per-user token" .-> Agent
    Agent -- "inference" --> AIGW
    Agent -- "MCP meta-tools (carries user identity)" --> MCPHost
    MCPHost --> Policy --> Reg
    Reg --> AHAdapter
    Reg --> ArcAdapter
    AHAdapter -- "catalog sync" --> AHCat
    AHAdapter -- "IUM OBO execute" --> AHExec
    AHExec --> AHDown
    ArcAdapter -- "MCP tools/call + user identity" --> MCPGW
    MCPGW -- "resolve per-user OAuth token" --> SaaS
    MCPGW --> Custom
    AIGW --> LLMs
    TH -. "enrichment · query rewrite · embeddings · rerank" .-> AIGW

    classDef new fill:#ffe8cc,stroke:#e8860c,stroke-width:2px,color:#000;
    class AIGW,MCPGW,ArcAdapter new;
```

The orange-highlighted nodes are the only new pieces: the **AI Gateway**, the **MCP Gateway
(Arcade)**, and the thin **arcade adapter** that slots into Tool Hub's existing registry.
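
To make the `catalog_source.type → adapter` dispatch concrete, here is a minimal Python sketch of a data-driven adapter registry. It is an illustration only: Tool Hub's real contracts are C# interfaces (`IExecutionAdapter`, `ICatalogSource`) whose signatures are not reproduced in this document, so every class, method, and field name below is a hypothetical stand-in, and the adapters just echo their input instead of calling a backend.

```python
from dataclasses import dataclass


@dataclass
class AutomationHubAdapter:
    """Hypothetical stand-in for the existing automation_hub adapter."""

    requires_obo: bool = True  # internal path executes with IUM OBO

    def execute(self, tool_name, payload, user_id):
        # A real adapter would call Automation Hub; this sketch only echoes.
        return {"backend": "automation_hub", "tool": tool_name, "user": user_id}


@dataclass
class ArcadeAdapter:
    """Hypothetical stand-in for the new adapter in the mcp_proxy slot."""

    requires_obo: bool = False  # Arcade brokers the per-user OAuth itself

    def execute(self, tool_name, payload, user_id):
        # A real adapter would issue an MCP tools/call to Arcade.
        return {"backend": "arcade", "tool": tool_name, "user": user_id}


class AdapterRegistry:
    """Maps a catalog source type to the adapter that executes its tools."""

    def __init__(self):
        self._adapters = {}

    def register(self, source_type, adapter):
        self._adapters[source_type] = adapter

    def resolve(self, source_type):
        try:
            return self._adapters[source_type]
        except KeyError:
            raise LookupError(f"no adapter for source type {source_type!r}") from None


registry = AdapterRegistry()
registry.register("automation_hub", AutomationHubAdapter())
registry.register("mcp_proxy", ArcadeAdapter())  # the only new registration

result = registry.resolve("mcp_proxy").execute("github.create_issue", {}, "alice")
```

In this sketch, adding a backend means adding one registration; the lookup site does not change, which is the property the design relies on.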

## 2. Two execution paths through one authority plane

Tool Hub stays the single point of policy, idempotency, and audit. The *only* differences
between an internal action and a third-party action are which adapter the registry resolves and
the per-user OAuth step on the Arcade path, which neither Tool Hub nor AH can perform today.

```mermaid
sequenceDiagram
    autonumber
    participant U as User / Agent
    participant TH as Tool Hub
    participant AR as Arcade (MCP GW)
    participant SaaS as Third-party SaaS
    participant AH as Automation Hub
    participant ST as ServiceTitan APIs

    Note over U,ST: A. Internal ServiceTitan action (existing path, unchanged)
    U->>TH: execute_tool(automation_hub://crm.create_job, input)
    TH->>TH: permission re-check · idempotency · rate-limit · audit
    TH->>AH: POST /actions/{id}/execute (IUM OBO, bot-user)
    AH->>ST: call Core / Internal API
    ST-->>AH: result
    AH-->>TH: ActionExecutionResult
    TH-->>U: CallToolResult

    Note over U,SaaS: B. Third-party action (NEW path via Arcade)
    U->>TH: execute_tool(arcade://github.create_issue, input)
    TH->>TH: SAME permission re-check · idempotency · rate-limit · audit
    TH->>AR: MCP tools/call + user identity (Entra SSO)
    AR->>AR: resolve this user's stored GitHub OAuth token
    AR->>SaaS: call GitHub API AS THE USER
    SaaS-->>AR: result
    AR-->>TH: MCP CallToolResult
    TH-->>U: CallToolResult
```

The critical property: **the per-user OAuth complexity lives entirely in Arcade.** Tool Hub only
authenticates the *user* to Arcade and passes identity, so it needs no third-party token vault
and no change to its Entra/IUM OBO core (the arcade adapter sets `RequiresObo=false` for the
third-party-OAuth case; Arcade does the brokering). That is what keeps this out of "major work."
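
The two paths can be sketched as one pipeline in which the policy steps are identical and only the OBO acquisition is conditional on the adapter. This is a simplified Python model, not Tool Hub's implementation: the class names, the `scheme://tool` routing key, the in-memory permission set, and the fake token string are all assumptions made for illustration, and rate-limiting is omitted for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Adapter:
    """Hypothetical adapter; requires_obo mirrors the RequiresObo flag."""

    name: str
    requires_obo: bool

    def execute(self, tool, payload, user_id, obo_token=None):
        # Echo what was received instead of calling a real backend.
        return {"backend": self.name, "tool": tool, "user": user_id,
                "obo": obo_token is not None}


@dataclass
class Pipeline:
    adapters: dict                      # source type -> Adapter
    allowed: set                        # (user_id, tool_uri) pairs
    audit_log: list = field(default_factory=list)
    results_by_key: dict = field(default_factory=dict)

    def execute_tool(self, tool_uri, payload, user_id, idempotency_key):
        source_type, _, tool = tool_uri.partition("://")
        # Permission re-check: same rule for both backends.
        if (user_id, tool_uri) not in self.allowed:
            self.audit_log.append(("denied", user_id, tool_uri))
            raise PermissionError(f"{user_id} may not call {tool_uri}")
        # Idempotency: replay the stored result for a repeated key.
        if idempotency_key in self.results_by_key:
            return self.results_by_key[idempotency_key]
        adapter = self.adapters[source_type]
        # OBO acquisition runs only when the adapter asks for it.
        obo_token = f"obo-for-{user_id}" if adapter.requires_obo else None
        result = adapter.execute(tool, payload, user_id, obo_token)
        self.results_by_key[idempotency_key] = result
        self.audit_log.append(("executed", user_id, tool_uri))
        return result


pipeline = Pipeline(
    adapters={"automation_hub": Adapter("automation_hub", True),
              "arcade": Adapter("arcade", False)},
    allowed={("alice", "automation_hub://crm.create_job"),
             ("alice", "arcade://github.create_issue")},
)
internal = pipeline.execute_tool("automation_hub://crm.create_job", {}, "alice", "k1")
external = pipeline.execute_tool("arcade://github.create_issue", {}, "alice", "k2")
```

Here `internal["obo"]` is true and `external["obo"]` is false, while both calls pass through the same permission, idempotency, and audit steps.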

## 3. The AI Gateway is a configuration change, not a build

Every model/embedding call in the stack already goes through a pinned SDK with a configurable
endpoint. Point those endpoints at one AI Gateway and you get unified keys, routing, rate-limit,
cost control, and audit across all AI traffic, with zero application code change.

```mermaid
flowchart LR
    A["Agent inference"] --> AIGW
    B["Tool Hub: enrichment (Claude)"] --> AIGW
    C["Tool Hub: query rewrite (Claude Haiku)"] --> AIGW
    D["Tool Hub: embeddings + rerank (Voyage)"] --> AIGW
    E["Arcade engine: LLM / embeddings"] --> AIGW
    AIGW["AI Gateway (LiteLLM-class)<br/>keys · routing · rate-limit · cost · audit"] --> P["Anthropic · Voyage · OpenAI · internal"]
    classDef new fill:#ffe8cc,stroke:#e8860c,stroke-width:2px,color:#000;
    class AIGW new;
```

The Arcade POC already routes its engine LLM + embeddings through in-cluster LiteLLM
(LIVE-POC), so this consolidates an existing pattern rather than inventing one.
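
As a rough illustration of what "repoint the base URLs" could look like, the sketch below resolves each model endpoint from configuration and lets a single gateway setting override all of them. The setting names (`AI_GATEWAY_BASE_URL`, `ENRICHMENT_BASE_URL`, and so on) and the gateway address are hypothetical; the actual configuration keys used by Tool Hub, the Agent Platform, and Arcade are not given in this document, and the provider URLs are illustrative defaults.

```python
import os

# Hypothetical per-component settings with illustrative direct-to-provider defaults.
DIRECT_DEFAULTS = {
    "ENRICHMENT_BASE_URL": "https://api.anthropic.com",
    "QUERY_REWRITE_BASE_URL": "https://api.anthropic.com",
    "EMBEDDING_BASE_URL": "https://api.voyageai.com",
    "RERANK_BASE_URL": "https://api.voyageai.com",
}


def resolve_base_urls(env):
    """Return the base URL each component should call.

    If the (hypothetical) AI_GATEWAY_BASE_URL setting is present, every
    component is pointed at the gateway; otherwise each component keeps its
    own configured or default endpoint.
    """
    gateway = env.get("AI_GATEWAY_BASE_URL")
    if gateway:
        return {name: gateway.rstrip("/") for name in DIRECT_DEFAULTS}
    return {name: env.get(name, default) for name, default in DIRECT_DEFAULTS.items()}


# Reading from the process environment; no application code changes between modes.
current = resolve_base_urls(os.environ)

# Explicit example with a made-up in-cluster gateway address.
via_gateway = resolve_base_urls({"AI_GATEWAY_BASE_URL": "http://ai-gateway.internal:4000/"})
```

The point of the sketch is that the callers never branch on "gateway or not"; they read one base URL, and deployment configuration decides where it leads.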

## 4. Change surface, component by component

| Component | Role in target | Change required | Evidence it's minimal |
|---|---|---|---|
| **AI Gateway** (LiteLLM-class) | Single egress for all LLM/embedding traffic | **Config only**: repoint base URLs | Tool Hub model providers are DI seams with configurable endpoints (`IEmbeddingProvider`, `IEnrichmentProvider`, `IQueryRewriter`, `IReranker`); Arcade already uses in-cluster LiteLLM |
| **MCP Gateway (Arcade)** | MCP transport + **per-user OAuth broker** for SaaS / custom MCP | **Deploy + register** as Tool Hub catalog source | Arcade is a running self-hosted POC (`api.arcade.st.dev`) |
| **Tool Hub** | Authority: discovery, policy, idempotency, audit over both backends | **One adapter pair** in the `mcp_proxy` slot + endpoint config | `ICatalogSource` docstring already names `"mcp_proxy"`; adapter selection is `catalog_source.type → registry`, dispatch site unchanged |
| **Automation Hub** | One of Tool Hub's catalog sources (internal ST actions) | **None** | AH's catalog + `/actions/{id}/execute` contract already matches Tool Hub 1:1 (same 4 execution modes, JSON-Schema I/O, `namespace:name@semver`) |
| **Agent Platform** | Caller | **Config**: inference → AI Gateway; identity → per-user Entra SSO | - |
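
The `namespace:name@semver` identifier shape named in the Automation Hub row can be illustrated with a small parser. This is a sketch under assumptions: the allowed character set for namespace and name, the restriction to plain `MAJOR.MINOR.PATCH` versions (no pre-release or build metadata), and the example identifier are all guesses for illustration, not the contract either application enforces.

```python
import re
from typing import NamedTuple


class ActionId(NamedTuple):
    namespace: str
    name: str
    version: tuple  # (major, minor, patch)


# Assumed grammar: lowercase identifier parts, then a three-part numeric version.
_ACTION_ID = re.compile(
    r"^(?P<namespace>[a-z0-9_.-]+):(?P<name>[a-z0-9_.-]+)"
    r"@(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)$"
)


def parse_action_id(text):
    """Split 'namespace:name@semver' into its parts or raise ValueError."""
    match = _ACTION_ID.match(text)
    if match is None:
        raise ValueError(f"not a namespace:name@semver identifier: {text!r}")
    version = (int(match["major"]), int(match["minor"]), int(match["patch"]))
    return ActionId(match["namespace"], match["name"], version)


parsed = parse_action_id("crm:create_job@1.2.0")  # hypothetical identifier
```

Because both sides share one identifier shape, a catalog entry synced from Automation Hub can be addressed in Tool Hub without a translation table, which is part of why the table lists the required change as none.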

## 5. Why this is the right seam (and the one open decision)

- **It fills a real, documented gap.** Per-user third-party OAuth is explicitly absent from
  *both* apps: AH lists "OAuth token management / connector marketplace" as a **V1 non-goal** and
  its own platform research names per-user OAuth brokering as what an external platform must add;
  Tool Hub's downstream auth is Entra/IUM-only. Arcade is precisely that missing layer.
- **It uses the designed extension point.** Tool Hub's `mcp_proxy` SourceType and data-driven
  adapter registry exist *for this*. No core path changes.
- **It preserves the authority model (cat-3 criterion 5).** Tool Hub remains the single Engine
  for permission re-check, idempotency, rate-limit, and audit over *both* AH and Arcade calls,
  so the policy/enforcement story is unchanged and now covers third-party tools too.
- **One decision to confirm with Platform (chump/tahmad):** Tool Hub's ADR-009 currently intends
  partner/MCP capabilities to arrive *through AH as actions*. Routing Arcade **direct into Tool
  Hub** as a peer catalog source is a conscious deviation (ADR-009 even lists "BYO MCP outside
  AH's onboarding flow" as a trigger to reconsider). The recommendation here is the direct path,
  because AH has no plugin model and explicitly defers third-party connectivity, so going
  through AH would push *more* net-new work into AH, violating the "no major work" constraint.

## Evidence index

- **Tool Hub:** `src/ToolHub.Contracts/Catalog/ICatalogSource.cs` (`mcp_proxy` named);
  `src/ToolHub.Contracts/Execution/IExecutionAdapter.cs` (`RequiresObo`, `GetOboAuthority`);
  `src/ToolHub.Execution/Dispatch/ExecutionAdapterRegistry.cs` (data-driven dispatch);
  `Stage3_OboAcquisitionStage.cs` (Entra/IUM-only OBO); ADR-009, ADR-007.
  Full seam map: `architecture/toolhub-arcade-integration.md` (outer repo).
- **Automation Hub:** `src/server/Host.Api/Controllers/ActionExecutionController.cs`
  (`POST /actions/{id}/execute`); `Host.CatalogApi/Controllers/CatalogActionsController.cs`
  (catalog sync contract); `Domain/Catalog/Actions/DownstreamApiAuthType.cs`
  (`{ApiAccessToken, TokenServer, None}`; no per-user OAuth);
  `crap/blueprint/system/context/v1-roadmap.md` (external integration = non-goal);
  `docs/research/platform-selection/paragon.md` (per-user OAuth named as the external gap).
- **Arcade POC:** arcade-eval `LIVE-POC.md` (self-hosted, Entra IdP, in-cluster LiteLLM);
  `criteria-section-3.md` (enforcement-at-Engine + bypass findings).