fix(agents): offload UploadsMiddleware uploads scan off the event loop #3311
UploadsMiddleware defines only the sync `before_agent` hook. LangChain wires a sync-only hook as `RunnableCallable(before_agent, None)`, and LangGraph's `ainvoke` runs it directly on the event loop when `afunc is None`. As a result, the per-message uploads-directory scan (`exists`/`iterdir`/`stat` plus reading sibling `.md` outlines) blocks the asyncio event loop on every message that has an uploads directory.

Add `abefore_agent`, which offloads the scan to a worker thread via `run_in_executor`; the call copies the current context, preserving the `user_id` contextvar read by `get_effective_user_id()`. Add a runtime anchor under `tests/blocking_io/` that drives the real `create_agent` graph via `ainvoke` under the strict Blockbuster gate, so a regression back onto the event loop fails CI. Update the blocking-IO docs.
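To make the failure mode concrete, here is a minimal sketch of the dispatch rule described above. `HookRunner` and `compare` are hypothetical names used for illustration; `HookRunner` is not LangGraph's `RunnableCallable` and only mimics the one behavior at issue, namely that a missing async variant causes the sync function to run inline on the event-loop thread.

```python
import asyncio
import threading


class HookRunner:
    """Hypothetical stand-in for a sync/async hook pair; not LangGraph's class."""

    def __init__(self, func, afunc=None):
        self.func = func
        self.afunc = afunc

    async def ainvoke(self, state):
        if self.afunc is None:
            # No async variant: the sync hook runs inline on the event-loop thread.
            return self.func(state)
        return await self.afunc(state)


def before_agent(state):
    # Placeholder for the uploads scan; it only reports the thread it ran on.
    return {"thread": threading.get_ident()}


async def abefore_agent(state):
    # Async variant: hand the sync hook to the default thread pool.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, before_agent, state)


async def compare():
    loop_thread = threading.get_ident()
    sync_only = await HookRunner(before_agent).ainvoke({})
    with_async = await HookRunner(before_agent, abefore_agent).ainvoke({})
    return (sync_only["thread"] == loop_thread,
            with_async["thread"] == loop_thread)


if __name__ == "__main__":
    print(asyncio.run(compare()))
```

In this sketch the first element of the result says whether the sync-only hook ran on the loop thread, and the second says the same for the hook that has an async variant.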
Pull request overview
Adds an async `abefore_agent` hook to `UploadsMiddleware` that offloads its blocking uploads-directory scan to a worker thread via `run_in_executor`, preventing event-loop stalls under async graph execution. A blocking-IO runtime anchor and docs updates are included.
Changes:
- Implement `abefore_agent` in `UploadsMiddleware`, delegating to `before_agent` via `run_in_executor` (preserves context).
- Add `tests/blocking_io/test_uploads_middleware.py`, driving the real `create_agent` graph via `ainvoke` under the Blockbuster gate.
- Update `BLOCKING_IO_DETECTION.md` and `CLAUDE.md` to document the new anchor.
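
The `abefore_agent` delegation listed above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the code in the PR: `UploadsMiddlewareSketch`, `user_id_var`, and `demo` are hypothetical names, and the scan is replaced by a placeholder. The PR states that its `run_in_executor` call copies the context; the sketch does the copy by hand because the stdlib `loop.run_in_executor` does not propagate contextvars by itself.

```python
import asyncio
import contextvars
import functools
import threading

# Hypothetical stand-in for the contextvar behind get_effective_user_id().
user_id_var = contextvars.ContextVar("user_id", default=None)


class UploadsMiddlewareSketch:
    """Illustrative only; not the DeerFlow UploadsMiddleware."""

    def before_agent(self, state):
        # Placeholder for the blocking uploads-directory scan.
        return {"user_id": user_id_var.get(), "thread": threading.get_ident()}

    async def abefore_agent(self, state):
        loop = asyncio.get_running_loop()
        # Copy the caller's context so contextvars stay visible in the worker thread.
        ctx = contextvars.copy_context()
        call = functools.partial(ctx.run, self.before_agent, state)
        return await loop.run_in_executor(None, call)


async def demo():
    user_id_var.set("user-123")
    result = await UploadsMiddlewareSketch().abefore_agent({})
    return result, threading.get_ident()


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Keeping `before_agent` as the single implementation and having the async hook delegate to it means the sync and async paths cannot drift apart.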
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| backend/packages/harness/deerflow/agents/middlewares/uploads_middleware.py | Adds async hook offloading sync before_agent via run_in_executor. |
| backend/tests/blocking_io/test_uploads_middleware.py | New runtime anchor under Blockbuster gate using real create_agent graph. |
| backend/docs/BLOCKING_IO_DETECTION.md | Documents UploadsMiddleware runtime coverage. |
| backend/CLAUDE.md | Lists the new blocking-IO anchor alongside existing ones. |

@ShenAC-SAC Please fix the conflict with the main branch.
Resolve conflicts in backend/CLAUDE.md and backend/docs/BLOCKING_IO_DETECTION.md by keeping both runtime anchors: the JsonlRunEventStore async-IO anchor (bytedance#3084) from main and the UploadsMiddleware uploads-scan anchor from this PR.

@WillemJiang Thanks for the heads-up, conflict resolved. I merged the latest
Problem
`UploadsMiddleware` defines only the synchronous `before_agent` hook. LangChain wires a sync-only hook as `RunnableCallable(before_agent, None)`, and LangGraph's `ainvoke` runs it directly on the event loop when `afunc is None`. So `before_agent`'s uploads-directory scan (`exists`/`iterdir`/`stat` + reading sibling `.md` outlines) blocks the asyncio event loop on every message that has an uploads directory.

Fix
Add `abefore_agent` that offloads the scan to a worker thread via `run_in_executor`. `run_in_executor` copies the current context, so the `user_id` contextvar read by `get_effective_user_id()` is preserved. The sync `before_agent` path (used when the graph runs synchronously) is unchanged.

Test
Add a runtime anchor under `backend/tests/blocking_io/` that drives the real `create_agent` graph via `ainvoke` under the strict Blockbuster gate:
- `BlockingError: Blocking call to os.stat` inside `before_agent` (verified).
- `make detect-blocking-io` no longer reports the `SYNC_AGENT_MIDDLEWARE_HOOK` finding for this path.

Docs (`backend/CLAUDE.md`, `backend/docs/BLOCKING_IO_DETECTION.md`) updated to list the new anchor.

Closes #3310
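
The idea behind the runtime anchor can be approximated without Blockbuster. The sketch below is a stdlib-only illustration, not the test added in this PR and not Blockbuster's API: `BlockingCallError`, `guarded_stat`, `scan`, and `run_guarded` are hypothetical names. It patches `os.stat` so that a call made on a thread with a running event loop raises, while a call made from a worker thread goes through.

```python
import asyncio
import os
from unittest import mock


class BlockingCallError(RuntimeError):
    """Raised by this sketch when os.stat runs on the event-loop thread."""


_real_stat = os.stat


def guarded_stat(*args, **kwargs):
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No running loop in this thread: a worker-thread or sync call, allowed.
        return _real_stat(*args, **kwargs)
    raise BlockingCallError("Blocking call to os.stat")


def scan(path):
    # Placeholder for the uploads scan: a single stat call.
    return os.stat(path).st_size >= 0


async def scan_on_loop(path):
    # Regression case: the sync scan runs inline on the event loop.
    return scan(path)


async def scan_offloaded(path):
    # Fixed case: the sync scan runs in the default thread pool.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, scan, path)


def run_guarded(coro_fn, path):
    # Install the guard only for the duration of one event-loop run.
    with mock.patch("os.stat", guarded_stat):
        return asyncio.run(coro_fn(path))
```

Under this guard, `run_guarded(scan_offloaded, ".")` completes, whereas `run_guarded(scan_on_loop, ".")` raises `BlockingCallError`, which mirrors how a regression back onto the event loop would fail the anchor.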