feat: real-time context window usage indicator with per-category breakdown (#3125) #3183
Open
NewAmorend wants to merge 14 commits into
Adds a `context_usage` block to `GET /api/threads/{id}/token-usage`
(token count from the live checkpoint, the thread model's
`context_window`, and a percentage), introduces a new
`ModelConfig.context_window` distinct from the per-call `max_tokens`
output cap, and surfaces the percentage in the chat header — inside
`TokenUsageIndicator` when token-usage tracking is on, or as a
standalone badge when it's off so context capacity stays visible
independent of cost tracking.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
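As a rough illustration of the single-number payload this commit describes, the percentage could be derived as below. This is a minimal sketch, not the PR's code: `ModelConfigSketch`, `context_usage_sketch`, and the `percentage` key are assumed names, and rounding to one decimal is an assumption. Only the idea that `context_window` (rather than `max_tokens`) is the denominator comes from the description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelConfigSketch:
    """Hypothetical stand-in for ModelConfig; only the two relevant fields."""

    max_tokens: Optional[int] = None      # per-call output cap
    context_window: Optional[int] = None  # total context capacity


def context_usage_sketch(used_tokens: int, model: ModelConfigSketch) -> dict:
    """Build a single-number usage payload (key names are assumptions)."""
    window = model.context_window
    # Without a configured context window there is no denominator,
    # so the percentage is reported as unknown (None).
    if not window or window <= 0:
        percentage = None
    else:
        percentage = round(used_tokens / window * 100, 1)
    return {
        "used_tokens": used_tokens,
        "context_window": window,
        "percentage": percentage,
    }
```

Note that `max_tokens` plays no part in the calculation, which is the reason for keeping the two fields separate.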
Replace the single-number context_usage payload with a Claude-Code-style breakdown — messages, system prompt, skills, system/MCP tools (active + deferred), custom agents, memory injection, autocompact buffer, and free space — and surface it in the chat UI with a segmented progress bar and per-row table. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Add `context_window` to every example model in config.example.yaml so the new chat-UI "% context used" indicator works out of the box for whichever example a user adopts. Each value is the published default at the time of writing; users are pointed at the official model spec to verify. Bumps config_version to 11 so `make config-upgrade` flags outdated user configs. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The idea comes from Claude Code's context window UI.
Pull request overview
This PR adds a real-time “context window usage” indicator to the chat UI, backed by a new backend context_usage payload that computes approximate token usage per prompt category and uses a new per-model context_window config field as the percentage denominator.
Changes:
- Backend: introduce `ModelConfig.context_window` and extend `GET /api/threads/{id}/token-usage` with a `context_usage` breakdown computed from the live checkpoint + prompt/tool/memory sources.
- Frontend: render a header badge/percentage and a shared breakdown UI (bar + table), using TanStack Query `placeholderData` to avoid flicker during refetch.
- Docs/config/tests: bump `config_version`, document `context_window`, and add unit tests for both backend payload math and frontend formatting/selection.
Reviewed changes
Copilot reviewed 22 out of 22 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| frontend/tests/unit/core/threads/token-usage.test.ts | Adds unit tests for selectContextUsage mapping from backend shape to UI shape. |
| frontend/tests/unit/components/workspace/context-usage-format.test.ts | Tests formatting rules for the displayed context percentage. |
| frontend/tests/unit/components/workspace/context-usage-bar.test.ts | Tests segment conversion logic for the context usage bar. |
| frontend/src/core/threads/types.ts | Adds TypeScript types for the new context_usage payload and breakdown keys. |
| frontend/src/core/threads/token-usage.ts | Adds selectContextUsage selector and UI-facing ContextUsage types. |
| frontend/src/core/threads/hooks.ts | Keeps previous token-usage query data visible during refetch via placeholderData. |
| frontend/src/core/i18n/locales/en-US.ts | Adds English strings for context usage UI and category labels. |
| frontend/src/core/i18n/locales/zh-CN.ts | Adds Chinese strings for context usage UI and category labels. |
| frontend/src/core/i18n/locales/types.ts | Extends i18n typings with the new contextUsage translation block. |
| frontend/src/components/workspace/token-usage-indicator.tsx | Displays context percentage alongside token usage (when enabled) and embeds the breakdown section. |
| frontend/src/components/workspace/context-usage-format.ts | Implements percentage formatting helper (null for unknown, integer vs 1-decimal rendering). |
| frontend/src/components/workspace/context-usage-breakdown.tsx | Shared breakdown UI: summary, multi-segment bar, and per-category table. |
| frontend/src/components/workspace/context-usage-bar.tsx | Renders the multi-segment progress bar and exposes breakdownToSegments. |
| frontend/src/components/workspace/context-usage-badge.tsx | Adds standalone header badge when token_usage.enabled is off. |
| frontend/src/app/workspace/chats/[thread_id]/page.tsx | Fetches token usage unconditionally (non-mock) and switches between TokenUsageIndicator vs ContextUsageBadge. |
| frontend/src/app/workspace/agents/[agent_name]/chats/[thread_id]/page.tsx | Same conditional rendering/fetch behavior for agent chat page. |
| config.example.yaml | Documents context_window vs max_tokens and bumps config_version to 11. |
| backend/tests/test_thread_token_usage.py | Adds endpoint shape checks and unit tests for context-usage payload assembly helpers. |
| backend/packages/harness/deerflow/config/model_config.py | Adds context_window field to ModelConfig. |
| backend/CLAUDE.md | Updates API table entry to mention context_usage in /token-usage response. |
| backend/app/gateway/routers/thread_runs.py | Extends response schema with ThreadContextUsage and populates it via build_context_usage. |
| backend/app/gateway/context_usage.py | New module implementing per-category context usage computation + payload assembly. |
No behavior change — collapses two multi-line expressions that fit on one line under the project's 240-char limit. Picked up by `make format`. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- token-usage-indicator: switch `{contextPercentage && (...)}` to an
explicit `!= null` check. (The string `"0"` is actually truthy in JS so
the original code wasn't buggy, but the explicit check is clearer.)
- context-usage-breakdown: drop the `useMemo` around segments/totals — the
computation is O(n) over a handful of rows and the previous memo deps
omitted `t.contextUsage.categories`, so the bar's tooltips/aria-labels
could stay in the old language after a locale switch.
- context_usage._split_tools: snapshot MCP names from
`get_cached_mcp_tools()` directly instead of re-reading
`extensions_config.json` after `get_available_tools()` already loaded
it. Removes redundant file I/O on every `/token-usage` poll.
(`get_available_tools()` still emits its own INFO logs — silencing
those is out of scope here.)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
CI's `pnpm format` (prettier --check) caught two lines previously formatted by hand. Collapses one comma to fit on one line; no behavior change. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@WillemJiang Hi, would you mind taking a look at this PR when you have time?
Closes #3125.
Summary

- Adds a `ModelConfig.context_window` field, distinct from the per-call output cap `max_tokens`, and uses it as the denominator for the percentage.
- Extends `GET /api/threads/{id}/token-usage` with a `context_usage` block that decomposes the live checkpoint into messages, system prompt, skills, system/MCP tools (active + deferred), custom agents, memory injection, autocompact buffer, and free space.
- Works independently of `token_usage.enabled` (acceptance criterion): when token-cost tracking is off, a standalone badge still renders the context indicator.
- Uses `placeholderData` so the percentage doesn't flicker between turns while a refetch is in flight.

Acceptance criteria (from #3125)
- `placeholderData` keeps the last value while refetching.
- Independent of `token_usage.enabled` (separate `<ContextUsageBadge>` for the disabled case; shared breakdown component reused inside `<TokenUsageIndicator>` when enabled).
- Distinguishes `max_tokens` from context window size (new `context_window` field; documented in `config.example.yaml`).

How it computes each row
| Row | Source |
|---|---|
| Messages | `count_tokens_approximately(checkpoint.channel_values["messages"])` |
| System prompt | `apply_prompt_template(...)` rendered output minus skills / deferred-tools / subagent sections (which get their own rows) |
| Skills | `get_skills_prompt_section(app_config=...)` |
| System/MCP tools (active) | `convert_to_openai_tool()` per active tool, JSON-serialized, char-based estimate |
| System/MCP tools (deferred) | `get_deferred_registry()` (when `tool_search.enabled`) |
| Custom agents | `_build_subagent_section(...)` when `subagents.enabled` |
| Memory injection | `_get_memory_context(app_config=...)` |
| Autocompact buffer | `context_window - summarization.trigger.tokens` (only when both are set and the diff is positive) |
| Free space | `context_window - sum(other rows)`, clamped to ≥ 0 |

Every category is wrapped in its own `try/except`: any single failure degrades to a zero row instead of taking down the endpoint. Zero-token rows are filtered out of the breakdown server-side.

Files
Backend

- `backend/packages/harness/deerflow/config/model_config.py`: new `context_window` field on `ModelConfig`.
- `backend/app/gateway/context_usage.py`: new module with all per-category counting + payload assembly.
- `backend/app/gateway/routers/thread_runs.py`: `ThreadContextUsage` schema rewritten with `used_tokens`, `breakdown[]`; endpoint calls `build_context_usage(...)`.
- `backend/tests/test_thread_token_usage.py`: 14 tests covering payload shape, ordering, autocompact / free-space math, deferred-vs-active semantics, summarization-trigger parsing.

Frontend
- `frontend/src/core/threads/types.ts` + `token-usage.ts`: types and selector for the new shape.
- `frontend/src/core/threads/hooks.ts`: `placeholderData` on `useThreadTokenUsage`.
- `frontend/src/components/workspace/context-usage-bar.tsx`: new; multi-segment progress bar (active vs reserved coloring) + `breakdownToSegments()` helper.
- `frontend/src/components/workspace/context-usage-breakdown.tsx`: new; dropdown content (header summary, bar, table) shared by both surfaces.
- `frontend/src/components/workspace/context-usage-badge.tsx`: standalone header pill when `token_usage.enabled = false`; opens the same breakdown.
- `frontend/src/components/workspace/token-usage-indicator.tsx`: embeds the breakdown inside the Token Usage dropdown when enabled.
- `frontend/src/app/workspace/chats/[thread_id]/page.tsx` + agent-chats page: fetch usage unconditionally; render the right surface based on `tokenUsageEnabled`.
- `frontend/src/core/i18n/locales/{en-US,zh-CN,types}.ts`: labels for every breakdown category in both languages.

Config / docs
- `config.example.yaml`: `context_window` added to every example model with comments explaining the field; `config_version` bumped to 11 so `make config-upgrade` flags outdated user configs.
- `backend/CLAUDE.md`: note on the new `context_usage` block in the API table.

Test plan
- `cd backend && PYTHONPATH=. uv run pytest tests/test_thread_token_usage.py -v`: 14/14 pass.
- `cd backend && PYTHONPATH=. uv run pytest tests/ --ignore=tests/test_runtime_lifecycle_e2e.py`: 3636 pass, 19 skipped. (`test_runtime_lifecycle_e2e.py` has a pre-existing timing-dependent flake on `main`; unrelated to this change.)
- `cd backend && uv run ruff check app/gateway/context_usage.py app/gateway/routers/thread_runs.py packages/harness/deerflow/config/model_config.py tests/test_thread_token_usage.py`: clean.
- `cd frontend && pnpm test --run`: 95/95 pass.
- `cd frontend && pnpm check` (ESLint + `tsc --noEmit`): clean.
- Set `context_window` on a model in `config.yaml`, open a chat, watch the indicator update across turns; toggle `token_usage.enabled` and confirm the standalone badge still renders the breakdown.

🤖 Generated with Claude Code
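For readers following the row computations described under "How it computes each row", the autocompact-buffer and free-space arithmetic, the per-category failure isolation, and the zero-row filter might look roughly like this. It is a sketch under stated assumptions, not the contents of `context_usage.py`: `safe_count`, `build_breakdown_sketch`, the row keys, and the category identifiers are all hypothetical, and the sketch assumes the autocompact buffer counts as one of the "other rows" when computing free space.

```python
from typing import Callable, Optional


def safe_count(counter: Callable[[], int]) -> int:
    """Run one category counter; any failure degrades to a zero row."""
    try:
        return max(0, int(counter()))
    except Exception:
        return 0


def build_breakdown_sketch(
    counters: dict[str, Callable[[], int]],
    context_window: Optional[int],
    trigger_tokens: Optional[int],
) -> list[dict]:
    """Assemble breakdown rows (names and shape are assumptions)."""
    rows = [
        {"category": name, "tokens": safe_count(fn)}
        for name, fn in counters.items()
    ]

    if context_window:
        # Autocompact buffer: only when both values are set
        # and the difference is positive.
        if trigger_tokens is not None:
            buffer = context_window - trigger_tokens
            if buffer > 0:
                rows.append({"category": "autocompact_buffer", "tokens": buffer})

        # Free space: whatever remains after every other row, clamped to >= 0.
        used = sum(row["tokens"] for row in rows)
        rows.append(
            {"category": "free_space", "tokens": max(0, context_window - used)}
        )

    # Zero-token rows are dropped before the payload is returned.
    return [row for row in rows if row["tokens"] > 0]
```

With counters that return 1000 for messages, raise for skills, and return 0 for memory, plus a 10,000-token window and an 8,000-token summarization trigger, this yields three rows: messages (1000), autocompact buffer (2000), and free space (7000).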