fix(chat): persist thread transcripts outside checkpoints #2385
LittleChenLiya wants to merge 4 commits into
…pr2385
# Conflicts:
# backend/app/gateway/routers/threads.py
# backend/app/gateway/services.py
# frontend/src/components/workspace/messages/message-list.tsx
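I re-checked the conflicts between this PR and the latest main, and I am moving it back to draft for now. The main reason is not an ordinary textual conflict: main has now added … The more reasonable direction to converge on is not to add another parallel transcript API, but to keep fixing edge cases on top of main's run event store message-history approach (for example pagination boundaries, SSE recovery, and restoring the display after a summary). This PR is therefore not suitable for a straight rebase while staying in the ready state.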
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Introduces a canonical, durable transcript for threads (separate from checkpoint/model context) and wires it into the UI so message history remains stable across summarization/context compression.
Changes:
- Added backend transcript storage utilities with filtering + deduplication, plus tests covering key behaviors.
- Added a threads API endpoint to fetch the canonical transcript and integrated transcript lifecycle with run execution and thread deletion.
- Updated the frontend message list to fetch the canonical transcript and merge it with live streaming messages.
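As a rough illustration of the "filtering + deduplication" behavior described in the first bullet, here is a minimal sketch. It is not the code in `backend/app/gateway/transcripts.py`: the function names, the plain-dict message shape, and the `hidden` key inside `additional_kwargs` are all assumptions made for the example.

```python
from typing import Any

Message = dict[str, Any]


def is_visible(message: Message) -> bool:
    """Hypothetical filter: skip messages tagged as hidden (e.g. summaries)."""
    return not message.get("additional_kwargs", {}).get("hidden", False)


def append_to_transcript(transcript: list[Message], new_messages: list[Message]) -> list[Message]:
    """Return a new transcript with visible, not-yet-seen messages appended."""
    seen_ids = {m["id"] for m in transcript if m.get("id")}
    merged = list(transcript)
    for message in new_messages:
        if not is_visible(message):
            continue  # hidden messages never reach the UI transcript
        message_id = message.get("id")
        if message_id and message_id in seen_ids:
            continue  # already persisted; keep the transcript append-only
        if message_id:
            seen_ids.add(message_id)
        merged.append(message)
    return merged


transcript = append_to_transcript([], [{"id": "1", "content": "hi"}])
transcript = append_to_transcript(
    transcript,
    [
        {"id": "1", "content": "hi"},  # duplicate of an existing entry
        {"id": "2", "content": "recap", "additional_kwargs": {"hidden": True}},
        {"id": "3", "content": "hello"},
    ],
)
print([m["id"] for m in transcript])
```

Deduplicating by message id is one plausible choice here; the PR may key on something else.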
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| frontend/src/components/workspace/messages/message-list.tsx | Fetches canonical thread transcript and merges it with live messages for rendering. |
| backend/app/gateway/transcripts.py | Implements transcript normalization/filtering/dedup + persistence in the Store. |
| backend/app/gateway/services.py | Appends submitted messages and syncs final checkpoint messages into the transcript. |
| backend/app/gateway/routers/threads.py | Adds /threads/{thread_id}/messages endpoint and deletes transcript on thread deletion. |
| backend/packages/harness/deerflow/agents/middlewares/summarization_middleware.py | Tags summary messages as hidden/summary via additional_kwargs. |
| backend/tests/test_transcripts.py | Adds test coverage for transcript append/filter/dedup behavior. |
| backend/tests/test_summarization_middleware.py | Asserts summary messages are tagged as hidden + summary marker. |
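
The merge of the canonical transcript with live streaming messages (the `message-list.tsx` row) could follow logic along these lines. The real component is TypeScript; this Python sketch only illustrates the idea, and both the `merge_messages` name and the rule that the live copy wins on an id collision are assumptions rather than the PR's behavior.

```python
from typing import Any

Message = dict[str, Any]


def merge_messages(canonical: list[Message], live: list[Message]) -> list[Message]:
    """Overlay live messages on the canonical transcript, keeping transcript order."""
    merged = list(canonical)
    index_by_id = {m["id"]: i for i, m in enumerate(merged) if m.get("id")}
    for message in live:
        message_id = message.get("id")
        if message_id and message_id in index_by_id:
            # Assumption: the streaming copy is newer, so it replaces the stored one.
            merged[index_by_id[message_id]] = message
        else:
            if message_id:
                index_by_id[message_id] = len(merged)
            merged.append(message)
    return merged


canonical = [{"id": "a", "content": "question"}, {"id": "b", "content": "partial"}]
live = [{"id": "b", "content": "partial answer, still streaming"}, {"id": "c", "content": "follow-up"}]
merged = merge_messages(canonical, live)
print([(m["id"], m["content"]) for m in merged])
```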
Comments suppressed due to low confidence (1)
backend/packages/harness/deerflow/agents/middlewares/summarization_middleware.py:190
- There are two _build_new_messages definitions; the later one overrides the earlier in Python, making the first implementation dead code and potentially confusing for future changes. Remove the shadowed method or consolidate the behavior into a single override to keep the tagging logic in one place.
def _build_new_messages(self, summary: str) -> list[HumanMessage]:
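To make the shadowing concrete, here is a small standalone sketch. The class and the dict-based messages are hypothetical stand-ins, not the middleware's code. When a class body defines the same method name twice, Python rebinds the name and only the later definition is reachable.

```python
class SummarizerSketch:
    """Hypothetical stand-in for a middleware class with a duplicated method."""

    def _build_new_messages(self, summary: str) -> list[dict]:
        # First definition: tags the summary as hidden.
        return [{"content": summary, "additional_kwargs": {"hidden": True}}]

    def _build_new_messages(self, summary: str) -> list[dict]:
        # Second definition: rebinds the name, so the first body is dead code.
        return [{"content": summary, "additional_kwargs": {}}]


messages = SummarizerSketch()._build_new_messages("recap")
print(messages[0]["additional_kwargs"])
```

In this sketch the tag from the first definition is silently lost, which is why keeping the tagging logic in a single override is the safer layout.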
Problem Cause
UI chat history depended on LangGraph checkpoint history. After a long conversation triggered summarization, checkpoints were used to compress model context, so restoring chat history from checkpoint history could lose earlier visible messages in the frontend.
Checkpoints are better suited for agent/model state and should not be responsible for the complete UI transcript.
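The failure mode can be reproduced with a toy model. The sketch below does not use LangGraph; `summarize`, `keep_last`, and the dict messages are invented for illustration. It only shows why state that gets compressed for the model cannot double as the UI history, while an append-only transcript can.

```python
def summarize(state: list[dict], keep_last: int) -> list[dict]:
    """Toy compression: replace all but the last `keep_last` (> 0) messages with a summary."""
    older, recent = state[:-keep_last], state[-keep_last:]
    summary = {"id": "summary", "content": f"Summary of {len(older)} earlier messages"}
    return [summary, *recent]


turns = [{"id": f"m{i}", "content": f"message {i}"} for i in range(6)]

checkpoint_state: list[dict] = []  # what the model sees; may be rewritten
transcript: list[dict] = []  # what the UI should show; append-only

for turn in turns:
    checkpoint_state.append(turn)
    transcript.append(turn)

# A long conversation triggers summarization of the model context only.
checkpoint_state = summarize(checkpoint_state, keep_last=2)

print([m["id"] for m in checkpoint_state])
print([m["id"] for m in transcript])
```

Restoring the UI from `checkpoint_state` would lose the four earliest messages, whereas `transcript` still holds all six.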
Changes
Related Issue
Closes #2012
Supersedes #2424