Skip to content

fix(runtime): compact historical write_file tool-call args#3137

Open
kibabsquirrel wants to merge 4 commits into
bytedance:mainfrom
kibabsquirrel:fix/bug-003-compact
Open

fix(runtime): compact historical write_file tool-call args#3137
kibabsquirrel wants to merge 4 commits into
bytedance:mainfrom
kibabsquirrel:fix/bug-003-compact

Conversation

@kibabsquirrel
Copy link
Copy Markdown
Contributor

@kibabsquirrel kibabsquirrel commented May 21, 2026

Related to #3114

Summary

This PR fixes the model-bound history inflation surface of write_file.

After a large write_file call completes, its original content can remain in historical AIMessage.tool_calls.args and be serialized again in later model requests. This patch adds request-local compaction for model-bound message views, so completed write_file tool-call/result pairs no longer re-enter later model context verbatim.

Key Decisions

  • compact only real completed write_file tool-call/result pairs
  • keep the fix request-local by rewriting model-bound message views instead of mutating persisted history/state
  • use a marker-only replacement for oversized args.content
  • preserve raw provider tool-call metadata while updating serialized args
  • apply the same compaction to lead-agent requests, summary-bound message views, and subagent runtime requests

Tests

Validated:

  • completed large write_file calls are compacted
  • unpaired/active tool calls are not compacted
  • repeated tool_call_id histories compact only the occurrence paired with a real tool result
  • mixed tool-call messages compact only completed write_file entries
  • compacted messages do not mutate the original message objects
  • summary-bound message views are compacted before summary formatting without mutating state
  • subagent runtime middleware includes the same tool-args compaction path
  • raw provider metadata is preserved when tool-call args are rewritten, including args, function.arguments, and function.args
  • provider-side serialized payloads no longer contain the original large write_file.content

@kibabsquirrel
Copy link
Copy Markdown
Contributor Author

kibabsquirrel commented May 21, 2026

Minimal case:

  1. model emits a large write_file(content=...) call
  2. tool returns a normal ToolMessage
  3. later model requests include the historical tool call again
  4. provider-bound payload still carries the original large args unless the request view is compacted

A few design choices behind this patch:

  • I keep compaction request-local (wrap_model_call / awrap_model_call) rather than mutating persisted state, so replay/debug history stays faithful while still shrinking model-bound payloads.
  • I compact only real completed write_file tool-call/result pairs, not in-flight calls, to avoid rewriting args before a real tool result is present.
  • I use a marker-only replacement for oversized content rather than a preview, to keep the cap deterministic and avoid leaking large content back into context.
  • Raw provider tool-call metadata is updated alongside tool_calls, so the payload reduction also reaches provider serialization paths like MindIE/Codex.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR mitigates model-context inflation caused by large write_file tool-call arguments lingering in historical AIMessage.tool_calls.args and being re-serialized in subsequent model requests. It introduces a request-scoped middleware that compacts only completed write_file tool-call/result pairs in the model-bound request view (without mutating persisted history), while keeping raw provider tool-call metadata consistent with the rewritten args.

Changes:

  • Add ToolArgsCompactionMiddleware to replace oversized completed write_file.args.content with a marker string in outgoing model requests.
  • Extend tool-call metadata cloning utilities to optionally sync/replace raw provider serialized args when tool_calls are rewritten.
  • Add targeted unit/regression tests across middleware behavior and provider serialization paths (Codex, MindIE), and assert middleware ordering in the lead agent.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated no comments.

Show a summary per file
File Description
backend/packages/harness/deerflow/agents/middlewares/tool_args_compaction_middleware.py New middleware that compacts completed large write_file args in the request-visible message list.
backend/packages/harness/deerflow/agents/middlewares/tool_call_metadata.py Adds raw tool-call cloning with updated args + new API to keep provider metadata in sync when tool_calls are rewritten.
backend/packages/harness/deerflow/agents/middlewares/dangling_tool_call_middleware.py Marks synthetic placeholder ToolMessages via additional_kwargs so compaction doesn’t treat them as completed tool results.
backend/packages/harness/deerflow/agents/lead_agent/agent.py Inserts compaction middleware into the lead-agent middleware chain (after summarization).
backend/tests/test_tool_args_compaction_middleware.py New unit tests covering compaction conditions, non-mutation, dangling-tool-result handling, and raw metadata syncing.
backend/tests/test_mindie_provider.py Regression test ensuring compacted write_file args don’t bloat MindIE XML serialization.
backend/tests/test_codex_provider.py Regression test ensuring compacted write_file args don’t bloat Codex function-call serialization.
backend/tests/test_lead_agent_model_resolution.py Verifies middleware ordering places compaction after summarization and before title/memory.

@kibabsquirrel
Copy link
Copy Markdown
Contributor Author

kibabsquirrel commented May 22, 2026

For broader direction, I’d like to check whether this matches how maintainers want to evolve this area.

I see the original write_file inflation/stability issue as having a few separate surfaces:

  • fix(runtime): bound write_file execution-failure observations #3133 handled bounded execution-failure observations from write_file
  • this PR handles model-bound history replay, where completed write_file args remain in historical AIMessage.tool_calls and can be serialized again in later model requests
  • still out of scope here are persisted history/checkpoint compaction, journal/trace/client payload hygiene, and broader large-arg tools beyond write_file

The trade-off in this PR is intentionally conservative: reduce provider-bound payloads without mutating persisted state, at the cost of recomputing compaction per request and hiding historical write_file.content from later model context.

I’m not trying to bundle all remaining surfaces into this PR, but I’d appreciate guidance on whether follow-ups should stay narrowly focused on write_file first, or move toward a more general context/payload budget policy.

@ShenAC-SAC ShenAC-SAC added the reviewing The PR is in reviewing status label May 23, 2026
@kibabsquirrel kibabsquirrel force-pushed the fix/bug-003-compact branch from e371326 to 5a30b5e Compare May 23, 2026 05:23
@kibabsquirrel
Copy link
Copy Markdown
Contributor Author

Updated the PR with occurrence-aware completed-pair detection for repeated tool_call_id histories and broader raw metadata sync coverage. Rebased onto latest main; targeted tests and ruff checks pass.

@ShenAC-SAC
Copy link
Copy Markdown
Collaborator

@kibabsquirrel Thanks for working on this. I took another look and I think the request-local compaction approach makes sense, but two model-bound paths may still be uncovered:

  1. Summarization runs in before_model and invokes its own model before ToolArgsCompactionMiddleware.wrap_model_call, so completed large write_file.content can still enter the summary prompt via get_buffer_string(messages_to_summarize).

    Suggested fix: reuse the compaction logic on the summary-bound message view before formatting messages_to_summarize, without mutating persisted state.

  2. The middleware is currently wired only into the lead agent. Subagents also have model-bound histories, and the bash subagent can use write_file, so large completed write_file calls there can still be resent verbatim.

    Suggested fix: wire the compaction middleware into the shared runtime/subagent middleware path as well.

Would you mind checking these paths and adding focused regression tests for summary-model input and subagent request serialization if you agree?

@kibabsquirrel
Copy link
Copy Markdown
Contributor Author

@ShenAC-SAC Thanks, agreed these are additional model-bound paths worth covering. I updated the patch to reuse the same request-local compaction for the summary-bound message view before summary formatting, and wired ToolArgsCompactionMiddleware into the subagent runtime chain as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

reviewing The PR is in reviewing status

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants