Skip to content

fix(middleware): handle repeated tool call ids#3143

Merged
WillemJiang merged 3 commits into
bytedance:mainfrom
ggnnggez:fix-3142
May 22, 2026
Merged

fix(middleware): handle repeated tool call ids#3143
WillemJiang merged 3 commits into
bytedance:mainfrom
ggnnggez:fix-3142

Conversation

@ggnnggez
Copy link
Copy Markdown
Collaborator

Summary

  • keep all matching ToolMessages when tool_call_id strings repeat across assistant turns
  • consume matching ToolMessages in occurrence order during dangling tool-call normalization
  • cover the summarization-adjacent repeated-id transcript shape

Context

Fixes #3142.

While validating PR #2883, a valid compressed-history transcript could contain a preserved tool-call turn and a later assistant turn that reuse the same tool_call_id string. The provider accepts that shape when each assistant turn is followed by its own ToolMessages, but DanglingToolCallMiddleware used a single-value map keyed by tool_call_id and could drop the later matching ToolMessage during normalization.

Test Plan

  • cd backend && UV_CACHE_DIR=/tmp/uv-cache uv run pytest tests/test_dangling_tool_call_middleware.py tests/test_summarization_middleware.py
  • pre-commit backend lint on commit

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes DanglingToolCallMiddleware so it no longer assumes tool_call_id strings are globally unique across the entire message history. This prevents the middleware from corrupting otherwise-valid transcripts (notably after summarization/context compression) when providers reuse tool call ID strings in later assistant turns.

Changes:

  • Preserve all ToolMessages for a given tool_call_id (queue per ID) instead of keeping only one.
  • During normalization, consume matching ToolMessages in occurrence order for each tool-call occurrence.
  • Add a regression test covering a summarization-adjacent transcript with repeated tool-call IDs across separate AI turns.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
backend/packages/harness/deerflow/agents/middlewares/dangling_tool_call_middleware.py Switch tool-message lookup to per-ID queues and consume in order to support repeated tool_call_ids across turns.
backend/tests/test_dangling_tool_call_middleware.py Add coverage ensuring valid transcripts with repeated IDs remain unchanged.

Comment on lines +143 to +146
tool_msg_queue = tool_messages_by_id.get(tc_id)
while tool_msg_queue and id(tool_msg_queue[0]) in consumed_tool_msg_objects:
tool_msg_queue.popleft()

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. The deque already guarantees FIFO consumption once a ToolMessage is removed
with popleft(), so the extra object-id tracking is unnecessary here. I removed
consumed_tool_msg_objects and the cleanup loop, and kept the matching logic based on queue
occurrence order only.

@WillemJiang
Copy link
Copy Markdown
Collaborator

@ggnnggez
Thanks for digging into the code. Missing test coverage I'd like to see:

  • A test where the second occurrence of a repeated ID is dangling (missing its ToolMessage) — the current test only covers the "already valid" case. The most important edge case is: first turn has tool_call_id=X with a result, second turn has tool_call_id=X without a result. Does the middleware correctly inject a synthetic ToolMessage only for the second occurrence?
  • A test where the first occurrence is dangling and the second has a result — does it correctly consume the result for the first and inject a synthetic for the second? (This would be a misuse by the provider, but defensive coding should handle it.)

@WillemJiang WillemJiang added the reviewing The PR is in reviewing status label May 22, 2026
@ggnnggez
Copy link
Copy Markdown
Collaborator Author

ggnnggez commented May 22, 2026

@WillemJiang
Thanks, I added coverage for both repeated-ID dangling cases:

  • first occurrence has a ToolMessage, second occurrence is dangling: the middleware keeps
    the first result and injects a synthetic ToolMessage only for the second occurrence;
  • first occurrence is dangling, second occurrence has the only matching result: the
    middleware consumes that result for the first occurrence under the current occurrence-order
    matching rule, then injects a synthetic ToolMessage for the second occurrence.

This keeps the repeated-ID behavior explicit and documents the FIFO matching semantics for
defensive normalization.

@WillemJiang WillemJiang merged commit f0bae28 into bytedance:main May 22, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

reviewing The PR is in reviewing status

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[runtime] DanglingToolCallMiddleware treat tool call IDs as globally unique

3 participants