fix(middleware): handle repeated tool call ids by ggnnggez · Pull Request #3143 · bytedance/deer-flow

ggnnggez · 2026-05-21T23:18:44Z

Summary

keep all matching ToolMessages when tool_call_id strings repeat across assistant turns
consume matching ToolMessages in occurrence order during dangling tool-call normalization
cover the summarization-adjacent repeated-id transcript shape

Context

While validating PR #2883, a valid compressed-history transcript could contain a preserved tool-call turn and a later assistant turn that reuse the same tool_call_id string. The provider accepts that shape when each assistant turn is followed by its own ToolMessages, but DanglingToolCallMiddleware used a single-value map keyed by tool_call_id and could drop the later matching ToolMessage during normalization.

Test Plan

cd backend && UV_CACHE_DIR=/tmp/uv-cache uv run pytest tests/test_dangling_tool_call_middleware.py tests/test_summarization_middleware.py
pre-commit backend lint on commit

Copilot

Pull request overview

This PR fixes DanglingToolCallMiddleware so it no longer assumes tool_call_id strings are globally unique across the entire message history. This prevents the middleware from corrupting otherwise-valid transcripts (notably after summarization/context compression) when providers reuse tool call ID strings in later assistant turns.

Changes:

Preserve all ToolMessages for a given tool_call_id (queue per ID) instead of keeping only one.
During normalization, consume matching ToolMessages in occurrence order for each tool-call occurrence.
Add a regression test covering a summarization-adjacent transcript with repeated tool-call IDs across separate AI turns.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File	Description
backend/packages/harness/deerflow/agents/middlewares/dangling_tool_call_middleware.py	Switch tool-message lookup to per-ID queues and consume in order to support repeated `tool_call_id`s across turns.
backend/tests/test_dangling_tool_call_middleware.py	Add coverage ensuring valid transcripts with repeated IDs remain unchanged.

ggnnggez · 2026-05-22T10:04:48Z

+                tool_msg_queue = tool_messages_by_id.get(tc_id)
+                while tool_msg_queue and id(tool_msg_queue[0]) in consumed_tool_msg_objects:
+                    tool_msg_queue.popleft()
+


Good catch. The deque already guarantees FIFO consumption once a ToolMessage is removed
with popleft(), so the extra object-id tracking is unnecessary here. I removed
consumed_tool_msg_objects and the cleanup loop, and kept the matching logic based on queue
occurrence order only.

WillemJiang · 2026-05-22T01:18:27Z

@ggnnggez
Thanks for digging into the code. Missing test coverage I'd like to see:

A test where the second occurrence of a repeated ID is dangling (missing its ToolMessage) — the current test only covers the "already valid" case. The most important edge case is: first turn has tool_call_id=X with a result, second turn has tool_call_id=X without a result. Does the middleware correctly inject a synthetic ToolMessage only for the second occurrence?
A test where the first occurrence is dangling and the second has a result — does it correctly consume the result for the first and inject a synthetic for the second? (This would be a misuse by the provider, but defensive coding should handle it.)

ggnnggez · 2026-05-22T10:05:36Z

@WillemJiang
Thanks, I added coverage for both repeated-ID dangling cases:

first occurrence has a ToolMessage, second occurrence is dangling: the middleware keeps
the first result and injects a synthetic ToolMessage only for the second occurrence;
first occurrence is dangling, second occurrence has the only matching result: the
middleware consumes that result for the first occurrence under the current occurrence-order
matching rule, then injects a synthetic ToolMessage for the second occurrence.

This keeps the repeated-ID behavior explicit and documents the FIFO matching semantics for
defensive normalization.

fix(middleware): handle repeated tool call ids

298ee92

WillemJiang requested a review from Copilot May 22, 2026 00:38

Copilot started reviewing on behalf of WillemJiang May 22, 2026 00:39 View session

Copilot AI reviewed May 22, 2026

View reviewed changes

WillemJiang added the reviewing The PR is in reviewing status label May 22, 2026

ggnnggez added 2 commits May 22, 2026 11:59

add tests

ca8a68f

refactor(middleware): rely on tool result queues

40f16f6

WillemJiang approved these changes May 22, 2026

View reviewed changes

WillemJiang merged commit f0bae28 into bytedance:main May 22, 2026
5 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(middleware): handle repeated tool call ids#3143

fix(middleware): handle repeated tool call ids#3143
WillemJiang merged 3 commits into
bytedance:mainfrom
ggnnggez:fix-3142

ggnnggez commented May 21, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

ggnnggez May 22, 2026

Uh oh!

WillemJiang commented May 22, 2026

Uh oh!

ggnnggez commented May 22, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

ggnnggez commented May 21, 2026

Summary

Context

Test Plan

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

ggnnggez May 22, 2026

Choose a reason for hiding this comment

Uh oh!

WillemJiang commented May 22, 2026

Uh oh!

ggnnggez commented May 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

ggnnggez commented May 22, 2026 •

edited

Loading