
feat: real-time context window usage indicator with per-category breakdown (#3125) #3183

Open
NewAmorend wants to merge 14 commits into bytedance:main from NewAmorend:feat/context-window-usage-indicator

Conversation

@NewAmorend
Contributor

Closes #3125.

Summary

  • Adds a Claude-Code-style context window indicator to the chat UI: a percentage badge in the header that opens a multi-segment progress bar and per-category token breakdown.
  • Introduces a new ModelConfig.context_window field — distinct from the per-call output cap max_tokens — and uses it as the denominator for the percentage.
  • Extends GET /api/threads/{id}/token-usage with a context_usage block that decomposes the live checkpoint into messages, system prompt, skills, system/MCP tools (active + deferred), custom agents, memory injection, autocompact buffer, and free space.
  • Works independently of token_usage.enabled (acceptance criterion) — when token-cost tracking is off, a standalone badge still renders the context indicator.
  • Uses TanStack Query placeholderData so the percentage doesn't flicker between turns while a refetch is in flight.
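
To make the difference between the two limits concrete, here is a minimal sketch of how the percentage might be derived. The `ModelLimits` dataclass and the `context_percentage` helper are hypothetical stand-ins for illustration, not the PR's `ModelConfig` or its actual code; only the field names `context_window` and `max_tokens` come from the summary above, and the clamp to 100 is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelLimits:
    """Hypothetical stand-in holding the two limits discussed above."""

    max_tokens: Optional[int] = None      # per-call output cap
    context_window: Optional[int] = None  # total capacity for prompt + history


def context_percentage(used_tokens: int, limits: ModelLimits) -> Optional[float]:
    """Return used_tokens as a percentage of context_window, or None if unknown."""
    if limits.context_window is None or limits.context_window <= 0:
        return None  # no denominator configured, so there is nothing to show
    # max_tokens is deliberately ignored: it caps output length, not context size.
    # Clamping to 100 is an assumption made for this sketch.
    return min(100.0, 100.0 * used_tokens / limits.context_window)
```

With `ModelLimits(max_tokens=8192, context_window=200_000)` and 50,000 used tokens, the helper returns `25.0`; changing `max_tokens` alone leaves the result unchanged, which is the behavior the fourth acceptance criterion asks for.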

Acceptance criteria (from #3125)

  • Context percentage is always visible during a conversation.
  • Does not disappear when sending a message (placeholderData keeps the last value while refetching).
  • Works regardless of token_usage.enabled (separate <ContextUsageBadge> for the disabled case; shared breakdown component reused inside <TokenUsageIndicator> when enabled).
  • Correctly distinguishes output max_tokens from context window size (new context_window field; documented in config.example.yaml).

How it computes each row

| Row | Source |
| --- | --- |
| Messages | count_tokens_approximately(checkpoint.channel_values["messages"]) |
| System prompt | apply_prompt_template(...) rendered output minus skills / deferred-tools / subagent sections (which get their own rows) |
| Skills | get_skills_prompt_section(app_config=...) |
| System tools / MCP tools | convert_to_openai_tool() per active tool, JSON-serialized, char-based estimate |
| MCP tools (deferred) / System tools (deferred) | Same, but only for tools in get_deferred_registry() (when tool_search.enabled) |
| Custom agents | _build_subagent_section(...) when subagents.enabled |
| Memory files | _get_memory_context(app_config=...) |
| Autocompact buffer | context_window - summarization.trigger.tokens (only when both are set and the diff is positive) |
| Free space | context_window - sum(other rows), clamped to ≥ 0 |
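
The last two rows are plain arithmetic, which can be sketched as follows. The function names and the dictionary of row totals are hypothetical, chosen for illustration; the sketch also assumes that "other rows" includes the autocompact buffer, which is one reading of the table rather than something the PR states explicitly.

```python
from typing import Optional


def autocompact_buffer(context_window: Optional[int], trigger_tokens: Optional[int]) -> int:
    """Tokens reserved between the summarization trigger and the window limit."""
    if context_window is None or trigger_tokens is None:
        return 0  # the row only exists when both values are configured
    return max(0, context_window - trigger_tokens)  # a non-positive diff yields no row


def free_space(context_window: int, other_rows: dict[str, int]) -> int:
    """Whatever remains after every other row, clamped so it never goes negative."""
    return max(0, context_window - sum(other_rows.values()))


rows = {
    "messages": 12_000,
    "system_prompt": 3_000,
    "autocompact_buffer": autocompact_buffer(200_000, 160_000),
}
remaining = free_space(200_000, rows)
```

Here the buffer is 40,000 tokens and `remaining` is 145,000. If the rows ever summed to more than the window, `free_space` would return 0 rather than a negative number.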

Every category is wrapped in its own try/except — any single failure degrades to a zero row instead of taking down the endpoint. Zero-token rows are filtered out of the breakdown server-side.
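
A minimal sketch of that isolation pattern, assuming each category is exposed as a zero-argument callable. The names `safe_count` and `build_breakdown` and the `key` / `tokens` row fields are hypothetical and are not taken from the PR's `context_usage.py`.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)


def safe_count(name: str, counter: Callable[[], int]) -> int:
    """Run one category counter; degrade to 0 instead of propagating the error."""
    try:
        return max(0, int(counter()))
    except Exception:
        logger.warning("context usage: counting %r failed", name, exc_info=True)
        return 0


def build_breakdown(counters: dict[str, Callable[[], int]]) -> list[dict]:
    """Count every category independently, then drop zero-token rows."""
    rows = [{"key": name, "tokens": safe_count(name, fn)} for name, fn in counters.items()]
    return [row for row in rows if row["tokens"] > 0]
```

A counter that raises and a counter that returns 0 are indistinguishable in the resulting payload: both rows are simply absent. That keeps the endpoint available, at the cost of hiding failures from the UI, so logging the exception (as above) is one way to keep them visible to operators.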

Files

Backend

  • backend/packages/harness/deerflow/config/model_config.py: new context_window field on ModelConfig.
  • backend/app/gateway/context_usage.py: new module containing all per-category counting and payload assembly.
  • backend/app/gateway/routers/thread_runs.py: ThreadContextUsage schema rewritten with used_tokens and breakdown[]; the endpoint calls build_context_usage(...).
  • backend/tests/test_thread_token_usage.py: 14 tests covering payload shape, ordering, autocompact / free-space math, deferred-vs-active semantics, and summarization-trigger parsing.
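
The deferred-vs-active split and the char-based tool estimate can be illustrated roughly as below. This is a sketch under stated assumptions, not the module's code: it takes tool schemas that are already plain dictionaries (standing in for the output of `convert_to_openai_tool()`), the 4-characters-per-token ratio is assumed because the PR does not state its constant, and both function names are hypothetical.

```python
import json
import math

CHARS_PER_TOKEN = 4  # assumed ratio for illustration; the PR's actual constant is not shown


def estimate_tool_tokens(tool_schema: dict) -> int:
    """Approximate tokens for one tool from the length of its serialized schema."""
    serialized = json.dumps(tool_schema, ensure_ascii=False)
    return math.ceil(len(serialized) / CHARS_PER_TOKEN)


def split_tool_tokens(schemas: dict[str, dict], deferred_names: set[str]) -> tuple[int, int]:
    """Return (active_tokens, deferred_tokens) for schemas keyed by tool name."""
    active = deferred = 0
    for name, schema in schemas.items():
        tokens = estimate_tool_tokens(schema)
        if name in deferred_names:
            deferred += tokens  # tool is listed in the deferred registry
        else:
            active += tokens    # tool schema is sent with every request
    return active, deferred
```

Passing the deferred names in as a set keeps the split a pure function, which makes the deferred-vs-active semantics straightforward to unit test without loading real tools.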

Frontend

  • frontend/src/core/threads/types.ts + token-usage.ts: types and selector for the new shape.
  • frontend/src/core/threads/hooks.ts: placeholderData on useThreadTokenUsage.
  • frontend/src/components/workspace/context-usage-bar.tsx: new; multi-segment progress bar (active vs reserved coloring) plus the breakdownToSegments() helper.
  • frontend/src/components/workspace/context-usage-breakdown.tsx: new; dropdown content (header summary, bar, table) shared by both surfaces.
  • frontend/src/components/workspace/context-usage-badge.tsx: standalone header pill when token_usage.enabled = false; opens the same breakdown.
  • frontend/src/components/workspace/token-usage-indicator.tsx: embeds the breakdown inside the Token Usage dropdown when enabled.
  • frontend/src/app/workspace/chats/[thread_id]/page.tsx + agent-chats page: fetch usage unconditionally; render the right surface based on tokenUsageEnabled.
  • frontend/src/core/i18n/locales/{en-US,zh-CN,types}.ts: labels for every breakdown category in both languages.

Config / docs

  • config.example.yaml: context_window added to every example model with comments explaining the field; config_version bumped to 11 so make config-upgrade flags outdated user configs.
  • backend/CLAUDE.md: note on the new context_usage block in the API table.

Test plan

  • cd backend && PYTHONPATH=. uv run pytest tests/test_thread_token_usage.py -v — 14/14 pass.
  • cd backend && PYTHONPATH=. uv run pytest tests/ --ignore=tests/test_runtime_lifecycle_e2e.py — 3636 pass, 19 skipped. (test_runtime_lifecycle_e2e.py has a pre-existing timing-dependent flake on main; unrelated to this change.)
  • cd backend && uv run ruff check app/gateway/context_usage.py app/gateway/routers/thread_runs.py packages/harness/deerflow/config/model_config.py tests/test_thread_token_usage.py — clean.
  • cd frontend && pnpm test --run — 95/95 pass.
  • cd frontend && pnpm check (ESLint + tsc --noEmit) — clean.
  • Manual smoke (reviewer): set context_window on a model in config.yaml, open a chat, watch the indicator update across turns; toggle token_usage.enabled and confirm the standalone badge still renders the breakdown.

🤖 Generated with Claude Code

NewAmorend and others added 4 commits May 23, 2026 14:06
Adds a `context_usage` block to `GET /api/threads/{id}/token-usage`
(token count from the live checkpoint, the thread model's
`context_window`, and a percentage), introduces a new
`ModelConfig.context_window` distinct from the per-call `max_tokens`
output cap, and surfaces the percentage in the chat header — inside
`TokenUsageIndicator` when token-usage tracking is on, or as a
standalone badge when it's off so context capacity stays visible
independent of cost tracking.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Replace the single-number context_usage payload with a Claude-Code-style
breakdown — messages, system prompt, skills, system/MCP tools (active +
deferred), custom agents, memory injection, autocompact buffer, and free
space — and surface it in the chat UI with a segmented progress bar and
per-row table.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Add `context_window` to every example model in config.example.yaml so the
new chat-UI "% context used" indicator works out of the box for whichever
example a user adopts. Each value is the published default at the time of
writing; users are pointed at the official model spec to verify. Bumps
config_version to 11 so `make config-upgrade` flags outdated user configs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@NewAmorend
Contributor Author

[Image: UI preview]

@NewAmorend
Contributor Author

The inspiration comes from Claude Code's context window UI.

Contributor

Copilot AI left a comment


Pull request overview

This PR adds a real-time “context window usage” indicator to the chat UI, backed by a new backend context_usage payload that computes approximate token usage per prompt category and uses a new per-model context_window config field as the percentage denominator.

Changes:

  • Backend: introduce ModelConfig.context_window and extend GET /api/threads/{id}/token-usage with a context_usage breakdown computed from the live checkpoint + prompt/tool/memory sources.
  • Frontend: render a header badge/percentage and a shared breakdown UI (bar + table), using TanStack Query placeholderData to avoid flicker during refetch.
  • Docs/config/tests: bump config_version, document context_window, and add unit tests for both backend payload math and frontend formatting/selection.

Reviewed changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| frontend/tests/unit/core/threads/token-usage.test.ts | Adds unit tests for selectContextUsage mapping from backend shape to UI shape. |
| frontend/tests/unit/components/workspace/context-usage-format.test.ts | Tests formatting rules for the displayed context percentage. |
| frontend/tests/unit/components/workspace/context-usage-bar.test.ts | Tests segment conversion logic for the context usage bar. |
| frontend/src/core/threads/types.ts | Adds TypeScript types for the new context_usage payload and breakdown keys. |
| frontend/src/core/threads/token-usage.ts | Adds selectContextUsage selector and UI-facing ContextUsage types. |
| frontend/src/core/threads/hooks.ts | Keeps previous token-usage query data visible during refetch via placeholderData. |
| frontend/src/core/i18n/locales/en-US.ts | Adds English strings for context usage UI and category labels. |
| frontend/src/core/i18n/locales/zh-CN.ts | Adds Chinese strings for context usage UI and category labels. |
| frontend/src/core/i18n/locales/types.ts | Extends i18n typings with the new contextUsage translation block. |
| frontend/src/components/workspace/token-usage-indicator.tsx | Displays context percentage alongside token usage (when enabled) and embeds the breakdown section. |
| frontend/src/components/workspace/context-usage-format.ts | Implements percentage formatting helper (null for unknown, integer vs 1-decimal rendering). |
| frontend/src/components/workspace/context-usage-breakdown.tsx | Shared breakdown UI: summary, multi-segment bar, and per-category table. |
| frontend/src/components/workspace/context-usage-bar.tsx | Renders the multi-segment progress bar and exposes breakdownToSegments. |
| frontend/src/components/workspace/context-usage-badge.tsx | Adds standalone header badge when token_usage.enabled is off. |
| frontend/src/app/workspace/chats/[thread_id]/page.tsx | Fetches token usage unconditionally (non-mock) and switches between TokenUsageIndicator vs ContextUsageBadge. |
| frontend/src/app/workspace/agents/[agent_name]/chats/[thread_id]/page.tsx | Same conditional rendering/fetch behavior for agent chat page. |
| config.example.yaml | Documents context_window vs max_tokens and bumps config_version to 11. |
| backend/tests/test_thread_token_usage.py | Adds endpoint shape checks and unit tests for context-usage payload assembly helpers. |
| backend/packages/harness/deerflow/config/model_config.py | Adds context_window field to ModelConfig. |
| backend/CLAUDE.md | Updates API table entry to mention context_usage in /token-usage response. |
| backend/app/gateway/routers/thread_runs.py | Extends response schema with ThreadContextUsage and populates it via build_context_usage. |
| backend/app/gateway/context_usage.py | New module implementing per-category context usage computation + payload assembly. |

Comment thread frontend/src/components/workspace/token-usage-indicator.tsx Outdated
Comment thread frontend/src/components/workspace/context-usage-breakdown.tsx Outdated
Comment thread backend/app/gateway/context_usage.py
NewAmorend and others added 3 commits May 23, 2026 23:17
No behavior change — collapses two multi-line expressions that fit on
one line under the project's 240-char limit. Picked up by `make format`.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

- token-usage-indicator: switch `{contextPercentage && (...)}` to an
  explicit `!= null` check. (The string `"0"` is actually truthy in JS so
  the original code wasn't buggy, but the explicit check is clearer.)
- context-usage-breakdown: drop the `useMemo` around segments/totals — the
  computation is O(n) over a handful of rows and the previous memo deps
  omitted `t.contextUsage.categories`, so the bar's tooltips/aria-labels
  could stay in the old language after a locale switch.
- context_usage._split_tools: snapshot MCP names from
  `get_cached_mcp_tools()` directly instead of re-reading
  `extensions_config.json` after `get_available_tools()` already loaded
  it. Removes redundant file I/O on every `/token-usage` poll.
  (`get_available_tools()` still emits its own INFO logs — silencing
  those is out of scope here.)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@WillemJiang WillemJiang added this to the 2.0.0 milestone May 24, 2026
@NewAmorend
Copy link
Copy Markdown
Contributor Author

@WillemJiang Hi, would you mind taking a look at this PR when you have time?


Development

Successfully merging this pull request may close these issues.

feat: display real-time context window usage percentage in chat UI

3 participants