feat: Add context window % visualization to web dashboard (#56)
Conversation
Codecov Report
❌ Patch coverage is

```
@@           Coverage Diff            @@
##             main      #56   +/-   ##
========================================
  Coverage        ?   85.73%
========================================
  Files           ?       45
  Lines           ?     6261
  Branches        ?        0
========================================
  Hits            ?     5368
  Misses          ?      893
  Partials        ?        0
```
Show real-time context window utilization per agent, helping developers catch overflow risks before agents fail (RISK-008).

Backend:
- Extend ModelPricing with a context_window field (unified model registry, no separate lookup table)
- Add context_window to all models in DEFAULT_PRICING, plus new models (Claude 4.6, GPT-4.1/5.x, O-series, Gemini)
- Fix prefix-matching sort in get_pricing(): try the longest key first, so 'o1' no longer matches before 'o1-mini' (pre-existing bug)
- Emit context_window_used (input_tokens) and context_window_max on agent_started, agent_completed, and parallel_agent_completed events

Frontend:
- 2px progress bar on AgentNode (green <70%, amber 70-89%, red 90%+)
- Thresholds extracted to named constants (CONTEXT_WARN_PCT, CONTEXT_DANGER_PCT)
- Context row in the MetadataGrid detail panel with a division-by-zero guard
- Pulse animation at 100% usage
- Bar hidden when the model's context window size is unknown
- Add lib/utils.ts and lib/constants.ts (required by existing imports)

Tests:
- 21 Python tests: model lookups via get_pricing().context_window (exact, prefix, suffix, edge cases) and event emission via a real WorkflowEngine
- All tests call production code, not copies
- Existing pricing tests (20) pass with zero regressions

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
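The longest-key-first fix described in the commit message can be sketched as below. This is a minimal illustration only: the `ModelPricing` shape, the `DEFAULT_PRICING` entries, and the `get_pricing` signature are assumptions based on the commit message, not the project's actual code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelPricing:
    # Hypothetical shape; the real class also carries pricing fields.
    context_window: Optional[int] = None


# Illustrative registry: note that 'o1' is a prefix of 'o1-mini'.
DEFAULT_PRICING = {
    "o1": ModelPricing(context_window=200_000),
    "o1-mini": ModelPricing(context_window=128_000),
    "gpt-4o": ModelPricing(context_window=128_000),
}


def get_pricing(model: str) -> Optional[ModelPricing]:
    """Exact match first, then prefix match, longest key first.

    Sorting keys by length (descending) ensures 'o1-mini' is tried
    before 'o1', avoiding the shadowing bug the commit fixes.
    """
    if model in DEFAULT_PRICING:
        return DEFAULT_PRICING[model]
    for key in sorted(DEFAULT_PRICING, key=len, reverse=True):
        if model.startswith(key):
            return DEFAULT_PRICING[key]
    return None
```

With this ordering, a dated variant such as `o1-mini-2024` resolves to the `o1-mini` entry rather than falling through to the shorter `o1` prefix.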
@jrob5756 this is ready for review again. Let me know your thoughts and whether this can go in. Thanks!
Review
Sorry for the delay @aviraldua93! I had this sitting as a pending review that I never sent. Just some small clean-up stuff, but happy to merge after :)
Missing test coverage for parallel agent context window events
The parallel_agent_completed event handler in workflow-store.ts (lines 691-703) also computes context_pct from context_window_used/context_window_max, and workflow.py emits these fields for parallel agents (line 2068). However, there are no tests covering this code path. Consider adding a test for parallel agent context window event emission.
```ts
agent_name: string;
turn?: number;
context_window_used?: number;
context_window_max?: number;
```
Bug: context_window_used and context_window_max are added to AgentTurnStartData, but the Python backend never emits these fields in agent_turn_start events. The providers emit agent_turn_start with only {"turn": N} or {"turn": "awaiting_model"}, and the workflow engine doesn't inject context window fields for this event type. The workflow-store handler for agent_turn_start also never reads these fields.
These are dead types that will mislead consumers into expecting data that won't arrive. They should be removed from AgentTurnStartData.
…ow tests

- Remove dead context_window_used and context_window_max fields from AgentTurnStartData in events.ts (backend never emits these on agent_turn_start events, frontend never reads them)
- Add 2 tests for parallel agent context window event emission:
  - Known model (gpt-4o) emits context_window_max=128000
  - Unknown model emits context_window_max=None

Addresses review comments from @jrob5756

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
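The two parallel-agent tests added here would assert the shape of the emitted payload. A simplified sketch of that assertion, with a stand-in emitter instead of the real `WorkflowEngine` (the event name and field names come from the commits; the harness is hypothetical):

```python
def emit_parallel_agent_completed(sink, model, input_tokens, context_window):
    # Hypothetical stand-in: the real fields are attached by the
    # workflow engine when a parallel agent finishes.
    sink.append({
        "event": "parallel_agent_completed",
        "model": model,
        "context_window_used": input_tokens,
        "context_window_max": context_window,  # None when model unknown
    })


def test_known_model_emits_max():
    events = []
    emit_parallel_agent_completed(events, "gpt-4o", 12_000, 128_000)
    assert events[0]["context_window_max"] == 128_000


def test_unknown_model_emits_none():
    events = []
    emit_parallel_agent_completed(events, "mystery-model", 12_000, None)
    assert events[0]["context_window_max"] is None
```

The real tests exercise production code through the engine, per the PR description; this sketch only shows the assertion shape.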
- 9 screenshots: green (14%), amber (74%), red (93%), unknown, dark mode, plus detail panel views for each threshold
- 24 assertions: node rendering, API data correctness, detail panels
- Covers all color thresholds and the no-bar case for unknown models

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Apply ruff format to test_context_window_screenshots.py
- Remove binary screenshot PNGs from repo (should not be committed)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Not a real test — requires manual server + browser. Was used for local QA verification only. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Problem
When running multi-agent workflows, developers have no visibility into how much of each agent's context window is consumed. Context overflow is flagged as RISK-008 in the conductor design docs, but the dashboard doesn't show utilization.
Solution
Add real-time context window percentage visualization to the web dashboard.
Agent Node — Progress Bar
Detail Panel — Context Row
Context row in MetadataGrid: 17,862 / 200,000 (8.9%)
Changes
Backend (Python)
- providers/base.py: get_context_window_size(model) abstract method
- providers/claude.py: Model lookup table (12 Claude models → 200K)
- providers/copilot.py: Model lookup table (17 models, GPT + Claude via Copilot)
- engine/workflow.py: Emit context_window_used / context_window_max on agent lifecycle events

Frontend (React/TypeScript)
- types/events.ts: New context window fields on event data types
- stores/workflow-store.ts: Compute context_pct on agent_completed
- components/graph/AgentNode.tsx: Progress bar with color thresholds
- components/detail/MetadataGrid.tsx: Context row in detail panel
- globals.css: Pulse animation keyframes
- lib/utils.ts + lib/constants.ts: Fixes pre-existing broken npm run build
- vitest.config.ts + smoke test: Adds frontend test infrastructure

Testing
Verification
Addresses
Closes #57