feat: LLM can start MCP servers from chat context #3866
Thanks @bistack for taking the time to contribute. This repository is observing a maintainer-managed PR intake gate in dry-run mode, so this pull request is staying open. This note helps maintainers prepare the allowlist before any enforcement is considered. Please read |
Thanks @bistack — I inspected this while sweeping community PRs for possible v0.8.67 inclusion. I’m not harvesting this into the current constitution-first release branch as-is because it crosses several trust-boundary surfaces at once: model-facing tool catalog, local process spawning, remote MCP endpoints, approval UI, and runtime MCP pool mutation. A few concrete blockers I saw from the diff:
This is still interesting directionally, especially for making MCP setup less manual, but I’d want it split into a dedicated MCP runtime/trust PR with explicit user approval semantics, feature gating, catalog-refresh behavior, and conflict tests before release inclusion. Leaving it open rather than closing it; it deserves a focused pass, just not a quick harvest into v0.8.67.
86af2a9 to 0c2cbeb
@Hmbown Thank you for the thorough review. All four points are clear blockers. The work will be split into focused PRs by trust-boundary surface:
Each PR is independently reviewable. Will update this PR to track progress.
@bistack thanks for the quick turnaround and the split plan — that decomposition matches how I'd slice it. I re-read the updated head:
For PR 3, the acceptance bar I'd use:
Looking forward to the split; this lands as a capability the manual-config flow can't match once the trust story is right.
@Hmbown Thank you for the detailed follow-up. PR 2 additions (feature gating):
PR 3 acceptance criteria noted:
On point 4 (catalog timing): MCP tools are deferred in the catalog by
This matches the existing MCP tool workflow: the model doesn't expect deferred tools to be callable without a

On ApprovalRequirement::Auto for start_mcp_server: the trust boundary is at tool execution, not at server connection. Connecting to a server and discovering its tool list is a read-only operation with no side effects, analogous to

The attached execution trace confirms this in practice:

On McpPool integration for dynamic tools: the interface changes required (lifetime issues with
0c2cbeb to 4225a96
Add in-memory dynamic server support to McpPool for runtime-started MCP servers from conversation context.

- New `dynamic_servers` field (parking_lot::RwLock) on McpPool
- `add_runtime_server_config()` rejects duplicate names (static or dynamic)
- `get_or_connect()` checks dynamic servers before static config
- `server_names()` includes both static and dynamic servers
- Conflict tests for static/dynamic name overlap

Prerequisite for Hmbown#3866 (start_mcp_server tool). Extracted as a focused infrastructure PR to isolate the trust-boundary surface (runtime MCP pool mutation) from the tool implementation.
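The duplicate-rejection and lookup order described in this commit message can be sketched roughly as follows. This is a simplified illustration and not the PR's code: `PoolSketch`, `ServerConfig`, and `lookup` are hypothetical stand-ins (the real `McpPool` and `get_or_connect()` are not shown in this conversation), and `std::sync::RwLock` replaces `parking_lot::RwLock` only to keep the sketch dependency-free.

```rust
use std::collections::BTreeMap;
use std::sync::RwLock;

/// Hypothetical, simplified server config; the real type is not shown in the PR.
#[derive(Clone, Debug, PartialEq)]
pub struct ServerConfig {
    pub command: String,
}

/// Simplified stand-in for McpPool: static servers come from configuration,
/// dynamic servers are registered at runtime.
pub struct PoolSketch {
    static_servers: BTreeMap<String, ServerConfig>,
    // Assumption: std RwLock used here instead of parking_lot::RwLock.
    dynamic_servers: RwLock<BTreeMap<String, ServerConfig>>,
}

impl PoolSketch {
    pub fn new(static_servers: BTreeMap<String, ServerConfig>) -> Self {
        Self {
            static_servers,
            dynamic_servers: RwLock::new(BTreeMap::new()),
        }
    }

    /// Rejects a name that is already taken by a static or a dynamic server.
    pub fn add_runtime_server_config(
        &self,
        name: &str,
        config: ServerConfig,
    ) -> Result<(), String> {
        if self.static_servers.contains_key(name) {
            return Err(format!("server '{name}' already exists in static config"));
        }
        let mut dynamic = self.dynamic_servers.write().unwrap();
        if dynamic.contains_key(name) {
            return Err(format!("server '{name}' was already added at runtime"));
        }
        dynamic.insert(name.to_string(), config);
        Ok(())
    }

    /// Checks dynamic servers before static config (the order the commit
    /// message gives for get_or_connect()).
    pub fn lookup(&self, name: &str) -> Option<ServerConfig> {
        let dynamic = self.dynamic_servers.read().unwrap();
        if let Some(config) = dynamic.get(name) {
            return Some(config.clone());
        }
        self.static_servers.get(name).cloned()
    }

    /// Lists static and dynamic server names together, sorted.
    pub fn server_names(&self) -> Vec<String> {
        let mut names: Vec<String> = self.static_servers.keys().cloned().collect();
        names.extend(self.dynamic_servers.read().unwrap().keys().cloned());
        names.sort();
        names
    }
}
```

Because duplicates are rejected at insertion time, a runtime-started server can never shadow a statically configured one, so the dynamic-first lookup order is not a way to override trusted config in this sketch.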
Security review: please gate the spawn before merging

Really like the direction here, and the design has clearly improved. There's one blocking issue before this can land, though: the spawn is auto-approved, so the model can start an arbitrary local process from chat with no user confirmation.

Evidence
That's a remote-code-execution path from model/chat context, the one thing the MCP subsystem currently assumes can't happen (it was built assuming server config comes from a trusted human).

Preconditions to make this safe to land
Items 1–2 are the hard blockers; 3–4 are important hardening. Happy to pair on the approval-gating wiring if useful; this is a genuinely nice capability once the spawn is behind a prompt. (Reviewed against the current diff; findings are defensible from the code cited above. Not a merge/approve, just flagging for @Hmbown's decision.)
4225a96 to 9c756b6
Allow LLM to start MCP servers from conversation context when a user
provides a command (e.g. npx ...) or URL. The tool connects to the
server and returns fully qualified tool names (mcp_{server}_{tool})
that the model discovers via tool_search.
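A minimal sketch of the `mcp_{server}_{tool}` naming scheme, combined with the underscore-to-hyphen normalization listed under Security in this commit message. The function names are hypothetical and the PR's actual implementation is not shown here; the sketch only illustrates why an underscore in the server segment would be ambiguous.

```rust
/// Hypothetical helper: replaces underscores with hyphens so the server
/// segment never contains the `_` separator used in the qualified name.
pub fn normalize_server_name(name: &str) -> String {
    name.replace('_', "-")
}

/// Hypothetical helper: builds the fully qualified tool name in the
/// `mcp_{server}_{tool}` shape described in the commit message.
pub fn qualified_tool_name(server: &str, tool: &str) -> String {
    format!("mcp_{}_{}", normalize_server_name(server), tool)
}
```

Without normalization, server `git_hub` with tool `search` and server `git` with tool `hub_search` would both map to `mcp_git_hub_search`. With it, the first becomes `mcp_git-hub_search`, so the two stay distinct.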
McpPool infrastructure:
- Add dynamic_servers (parking_lot::RwLock) for runtime server configs
- add_runtime_server_config() rejects duplicate names (static or dynamic)
- get_or_connect() checks dynamic servers before static config
- server_names() includes both static and dynamic servers
Engine integration:
- Lazy pool initialization via ensure_mcp_pool() with network_policy wiring
- No eager McpPool construction in Engine::new
- start_mcp_server gated behind Feature::Mcp via tool_setup registration
- Dynamic always_load injection ensures the tool is eager (not deferred)
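One way the lazy initialization could look, as a rough sketch under stated assumptions: `EngineSketch`, `LazyPool`, `NetworkPolicy`, and its `allow_remote` field are all hypothetical, and `std::sync::OnceLock` is just one possible mechanism. The PR does not show how `ensure_mcp_pool()` is actually implemented.

```rust
use std::sync::OnceLock;

/// Hypothetical network policy; the real type and its fields are not shown in the PR.
#[derive(Debug, Clone, Copy)]
pub struct NetworkPolicy {
    pub allow_remote: bool,
}

/// Hypothetical pool that records the policy it was built with.
#[derive(Debug)]
pub struct LazyPool {
    pub allow_remote: bool,
}

pub struct EngineSketch {
    network_policy: NetworkPolicy,
    // Empty until first use: nothing is constructed in `new`.
    mcp_pool: OnceLock<LazyPool>,
}

impl EngineSketch {
    pub fn new(network_policy: NetworkPolicy) -> Self {
        Self {
            network_policy,
            mcp_pool: OnceLock::new(),
        }
    }

    /// Builds the pool on first call, wiring in the network policy,
    /// and returns the same instance on every later call.
    pub fn ensure_mcp_pool(&self) -> &LazyPool {
        self.mcp_pool.get_or_init(|| LazyPool {
            allow_remote: self.network_policy.allow_remote,
        })
    }

    /// Reports whether the pool has been constructed yet.
    pub fn pool_is_initialized(&self) -> bool {
        self.mcp_pool.get().is_some()
    }
}
```

The point of the pattern is that sessions which never touch MCP pay no construction cost, while the first caller still gets a pool configured with the engine's policy.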
Security:
- ApprovalRequirement::Required — requires user approval before spawning
- Non-bypassable approval — YOLO mode cannot skip (registered_tool_requires_non_bypassable_approval)
- Reject shell wrapper commands (bash, sh, zsh, cmd, powershell)
- Reject shell metacharacters in args (redirects, pipes, chaining, $, backticks)
- Allowlist of permitted runtimes (npx, node, python, uvx, deno, ruby, cargo, etc.)
- Underscores in server names auto-converted to hyphens to prevent tool name collision
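The command checks listed above could be sketched like this. It is an illustration only: `validate_command` is a hypothetical name, the allowlist contains just the runtimes the commit message names explicitly (its "etc." is left unfilled), and the metacharacter set is an assumption derived from the categories listed (redirects, pipes, chaining, `$`, backticks).

```rust
/// Shell wrappers named in the commit message.
const SHELL_WRAPPERS: &[&str] = &["bash", "sh", "zsh", "cmd", "powershell"];

/// Only the runtimes named explicitly in the commit message; the real
/// allowlist is longer ("etc.") and is not reproduced here.
const ALLOWED_RUNTIMES: &[&str] = &["npx", "node", "python", "uvx", "deno", "ruby", "cargo"];

/// Assumed metacharacter set covering redirects, pipes, chaining, `$`, and backticks.
const SHELL_METACHARACTERS: &[char] = &['<', '>', '|', '&', ';', '$', '`'];

/// Hypothetical validator: returns Ok(()) only if the program is an allowed
/// runtime and no argument contains a shell metacharacter.
pub fn validate_command(program: &str, args: &[&str]) -> Result<(), String> {
    if SHELL_WRAPPERS.contains(&program) {
        return Err(format!("shell wrapper '{program}' is not allowed"));
    }
    if !ALLOWED_RUNTIMES.contains(&program) {
        return Err(format!("'{program}' is not on the runtime allowlist"));
    }
    if let Some(arg) = args.iter().find(|arg| arg.contains(SHELL_METACHARACTERS)) {
        return Err(format!("argument '{arg}' contains a shell metacharacter"));
    }
    Ok(())
}
```

For example, `validate_command("npx", &["-y", "@scope/server"])` passes, while `validate_command("bash", &["-c", "x"])` and `validate_command("node", &["server.js", "|", "tee"])` are rejected. Checks like these narrow what can be spawned but do not replace the user approval step, since an allowed runtime can still run arbitrary code.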
Tool implementation:
- Shell-words parsing for quoted arguments
- Name inference for npm/pnpm/node/python/uvx and Windows cmd /c
- McpAction approval classification
- with_runtime_mcp_tool() builder avoids McpToolAdapter duplication
9c756b6 to 8c746d5




Summary
Add `start_mcp_server` tool allowing the LLM to dynamically start MCP servers from conversation context. Supports both stdio (local command) and HTTP (remote URL) transports.

Changes
- `runtime_mcp.rs` with `StartRuntimeMcpServer` tool implementation
- `McpPool` gains `dynamic_servers` (parking_lot::RwLock) for runtime configs
- `add_runtime_server_config` returns conflict warnings for static/dynamic name collisions
- `shell-words` crate for proper quoted argument parsing
- `start_mcp_server` classified as `McpAction` in approval system

Testing
- `cargo fmt --all -- --check`
- `cargo test -p codewhale-tui --locked`: 16 runtime_mcp tests + 3 conflict tests pass

Checklist