fix(middleware): handle repeated tool call ids #3143
Pull request overview
This PR fixes DanglingToolCallMiddleware so it no longer assumes tool_call_id strings are globally unique across the entire message history. This prevents the middleware from corrupting otherwise-valid transcripts (notably after summarization/context compression) when providers reuse tool call ID strings in later assistant turns.
Changes:
- Preserve all ToolMessages for a given tool_call_id (queue per ID) instead of keeping only one.
- During normalization, consume matching ToolMessages in occurrence order for each tool-call occurrence.
- Add a regression test covering a summarization-adjacent transcript with repeated tool-call IDs across separate AI turns.
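
The queue-per-ID matching described in the changes above can be sketched roughly as follows. This is a minimal illustration, not the middleware's actual code: `AIMsg`, `ToolMsg`, and `pair_tool_results` are hypothetical stand-ins chosen so the example runs without any message library.

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field


@dataclass
class AIMsg:  # hypothetical stand-in for an assistant message with tool calls
    tool_call_ids: list[str] = field(default_factory=list)


@dataclass
class ToolMsg:  # hypothetical stand-in for a ToolMessage
    tool_call_id: str
    content: str


def pair_tool_results(messages):
    """Pair each tool-call occurrence with the next unconsumed ToolMsg of the same ID."""
    # First pass: keep every ToolMsg, queued per ID in transcript order.
    queues = defaultdict(deque)
    for msg in messages:
        if isinstance(msg, ToolMsg):
            queues[msg.tool_call_id].append(msg)

    # Second pass: each tool-call occurrence consumes the head of its queue.
    pairs = []
    for msg in messages:
        if isinstance(msg, AIMsg):
            for tc_id in msg.tool_call_ids:
                queue = queues.get(tc_id)
                result = queue.popleft() if queue else None  # None marks a dangling call
                pairs.append((tc_id, result.content if result else None))
    return pairs


history = [
    AIMsg(["call_1"]),
    ToolMsg("call_1", "first"),
    AIMsg(["call_1"]),  # a later turn reusing the same ID string
    ToolMsg("call_1", "second"),
    AIMsg(["call_2"]),  # no ToolMsg follows, so this call is dangling
]
pairs = pair_tool_results(history)
```

Because `popleft()` removes the message it returns, a given ToolMsg can be matched at most once, and the second `call_1` occurrence receives the second result rather than the first.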
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| backend/packages/harness/deerflow/agents/middlewares/dangling_tool_call_middleware.py | Switch tool-message lookup to per-ID queues and consume in order to support repeated tool_call_ids across turns. |
| backend/tests/test_dangling_tool_call_middleware.py | Add coverage ensuring valid transcripts with repeated IDs remain unchanged. |

    tool_msg_queue = tool_messages_by_id.get(tc_id)
    while tool_msg_queue and id(tool_msg_queue[0]) in consumed_tool_msg_objects:
        tool_msg_queue.popleft()

Good catch. The deque already guarantees FIFO consumption once a ToolMessage is removed
with popleft(), so the extra object-id tracking is unnecessary here. I removed
consumed_tool_msg_objects and the cleanup loop, and kept the matching logic based on queue
occurrence order only.
@ggnnggez
@WillemJiang
This keeps the repeated-ID behavior explicit and documents the FIFO matching semantics for
Summary
Context
Fixes #3142.
While validating PR #2883, a valid compressed-history transcript could contain a preserved tool-call turn and a later assistant turn that reuse the same tool_call_id string. The provider accepts that shape when each assistant turn is followed by its own ToolMessages, but DanglingToolCallMiddleware used a single-value map keyed by tool_call_id and could drop the later matching ToolMessage during normalization.
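
To illustrate why a single-value map loses information here, the sketch below indexes two tool results that share an ID in two ways. The function names and the tuple representation are assumptions made for the example. Which of the two results survives in the single-value case depends on how the map is built; the point is only that one of them is lost.

```python
def index_single_value(tool_results):
    """Single-value map: a repeated ID overwrites the entry already stored."""
    by_id = {}
    for tool_call_id, content in tool_results:
        by_id[tool_call_id] = content
    return by_id


def index_per_id(tool_results):
    """Per-ID list: every result is kept, in transcript order."""
    by_id = {}
    for tool_call_id, content in tool_results:
        by_id.setdefault(tool_call_id, []).append(content)
    return by_id


# Two results for the same ID string, one per assistant turn.
results = [
    ("call_1", "result for the preserved turn"),
    ("call_1", "result for the later turn"),
]
single = index_single_value(results)
multi = index_per_id(results)
```

Here `single` holds one entry for `call_1` while `multi` holds both, so only the second structure can serve two tool-call occurrences that share an ID.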
Test Plan
cd backend && UV_CACHE_DIR=/tmp/uv-cache uv run pytest tests/test_dangling_tool_call_middleware.py tests/test_summarization_middleware.py