feat: Add context window % visualization to web dashboard#56

Merged
jrob5756 merged 6 commits into microsoft:main from aviraldua93:feature/dashboard-enhancements
Mar 31, 2026

Contributor

@aviraldua93 aviraldua93 commented Mar 25, 2026

Problem

When running multi-agent workflows, developers have no visibility into how much of each agent's context window is consumed. Context overflow is flagged as RISK-008 in the conductor design docs, but the dashboard doesn't show utilization.

Solution

Add real-time context window percentage visualization to the web dashboard.

Agent Node — Progress Bar

  • 2px bar at the bottom of each agent node
  • Green (<70%) → Amber (70-89%) → Red (90%+)
  • Pulse animation at 100% usage
  • Hidden when model context window size is unknown
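The threshold rules above can be sketched as a small classifier. This is an illustrative Python sketch only: the actual bar is rendered in AgentNode.tsx, and the function name `context_bar_state`, the constant names, and the returned strings are assumptions made for this example.

```python
from typing import Optional

# Threshold values come from the list above; the names are hypothetical.
WARN_PCT = 70.0
DANGER_PCT = 90.0


def context_bar_state(used: Optional[int], max_tokens: Optional[int]) -> Optional[str]:
    """Return the bar state, or None when the bar should be hidden."""
    # Hide the bar when usage or the model's window size is unknown (None or 0).
    if used is None or not max_tokens:
        return None
    pct = used / max_tokens * 100
    if pct >= 100:
        return "red-pulse"  # red bar with the pulse animation
    if pct >= DANGER_PCT:
        return "red"
    if pct >= WARN_PCT:
        return "amber"
    return "green"
```

Comparing against the lower bound of each band keeps fractional values such as 89.5% in the amber band rather than leaving a gap between 89% and 90%.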

Detail Panel — Context Row

  • New Context row in MetadataGrid: 17,862 / 200,000 (8.9%)
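As a rough illustration of how that row text could be produced, here is a Python sketch. The real row lives in MetadataGrid.tsx; `format_context_row` is a hypothetical helper, and returning None for an unknown window size is an assumption about how the row is hidden.

```python
from typing import Optional


def format_context_row(used: int, max_tokens: Optional[int]) -> Optional[str]:
    """Format 'used / max (pct%)'; return None when the window size is unknown."""
    # Guard against division by zero and unknown window sizes.
    if not max_tokens or max_tokens <= 0:
        return None
    pct = used / max_tokens * 100
    # Thousands separators and one decimal place, matching the example row.
    return f"{used:,} / {max_tokens:,} ({pct:.1f}%)"


print(format_context_row(17_862, 200_000))
```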

Changes

Backend (Python)

  • providers/base.py: get_context_window_size(model) abstract method
  • providers/claude.py: Model lookup table (12 Claude models → 200K)
  • providers/copilot.py: Model lookup table (17 models, GPT + Claude via Copilot)
  • engine/workflow.py: Emit context_window_used/context_window_max on agent lifecycle events

Frontend (React/TypeScript)

  • types/events.ts: New context window fields on event data types
  • stores/workflow-store.ts: Compute context_pct on agent_completed
  • components/graph/AgentNode.tsx: Progress bar with color thresholds
  • components/detail/MetadataGrid.tsx: Context row in detail panel
  • globals.css: Pulse animation keyframes
  • lib/utils.ts + lib/constants.ts: Fixes pre-existing broken npm run build
  • vitest.config.ts + smoke test: Adds frontend test infrastructure

Testing

  • 14 Python unit tests (provider lookups + event emission)
  • 2 frontend vitest smoke tests

Verification

# Python tests
uv run pytest tests/test_providers/test_context_window.py tests/test_engine/test_context_window_events.py -v

# Frontend tests
cd src/conductor/web/frontend && npm test

# Lint
uv run ruff check src/conductor/providers/ src/conductor/engine/workflow.py

# Visual test
uv run conductor run examples/simple-qa.yaml --web --input question="What is Python?"

Screenshots: 01-dashboard-overview, 02-green-14pct, 03-amber-74pct, 04-red-93pct, 05-unknown-no-bar

Addresses

  • RISK-008: Context token overflow

Closes #57

@aviraldua93 aviraldua93 force-pushed the feature/dashboard-enhancements branch from ce59b1b to bafa917 Compare March 25, 2026 06:09
@aviraldua93
Contributor Author

@microsoft-github-policy-service agree company="Microsoft"

@aviraldua93 aviraldua93 marked this pull request as ready for review March 25, 2026 16:35
@aviraldua93 aviraldua93 force-pushed the feature/dashboard-enhancements branch 2 times, most recently from b35a208 to 06eb665 Compare March 25, 2026 17:41
@codecov-commenter

codecov-commenter commented Mar 25, 2026

Codecov Report

❌ Patch coverage is 92.30769% with 1 line in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@27fbd37). Learn more about missing BASE report.

Files with missing lines Patch % Lines
src/conductor/engine/pricing.py 83.33% 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main      #56   +/-   ##
=======================================
  Coverage        ?   85.73%           
=======================================
  Files           ?       45           
  Lines           ?     6261           
  Branches        ?        0           
=======================================
  Hits            ?     5368           
  Misses          ?      893           
  Partials        ?        0           

@aviraldua93 aviraldua93 force-pushed the feature/dashboard-enhancements branch from 06eb665 to 93cd59b Compare March 25, 2026 18:22
Comment thread src/conductor/engine/workflow.py Outdated
Comment thread src/conductor/engine/workflow.py Outdated
Comment thread tests/test_providers/test_context_window.py Outdated
Comment thread src/conductor/providers/claude.py Outdated
Comment thread src/conductor/providers/copilot.py Outdated
@aviraldua93 aviraldua93 force-pushed the feature/dashboard-enhancements branch 8 times, most recently from 0f1a239 to 1e74ab2 Compare March 26, 2026 17:57
@aviraldua93 aviraldua93 requested a review from jrob5756 March 26, 2026 18:00
Show real-time context window utilization per agent, helping developers
catch overflow risks before agents fail (RISK-008).

Backend:
- Extend ModelPricing with context_window field (unified model registry,
  no separate lookup table)
- Add context_window to all models in DEFAULT_PRICING, plus new models
  (Claude 4.6, GPT-4.1/5.x, O-series, Gemini)
- Fix prefix-matching sort in get_pricing() — longest key first to avoid
  'o1' matching before 'o1-mini' (pre-existing bug)
- Emit context_window_used (input_tokens) and context_window_max on
  agent_started, agent_completed, and parallel_agent_completed events
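
The prefix-matching fix described above can be illustrated with a minimal sketch. This is not the project's get_pricing() implementation: the registry below is a hypothetical stand-in for DEFAULT_PRICING, the window sizes for 'o1' and 'o1-mini' are illustrative values, and `lookup_context_window` is a name invented for this example.

```python
from typing import Dict, Optional

# Hypothetical registry; values are illustrative, not copied from DEFAULT_PRICING.
CONTEXT_WINDOWS: Dict[str, int] = {
    "o1": 200_000,
    "o1-mini": 128_000,
    "gpt-4o": 128_000,
}


def lookup_context_window(
    model: str, registry: Dict[str, int] = CONTEXT_WINDOWS
) -> Optional[int]:
    """Exact match first, then the longest registered prefix; None if unknown."""
    if model in registry:
        return registry[model]
    # Longest key first, so 'o1-mini' is tried before the shorter 'o1'.
    for key in sorted(registry, key=len, reverse=True):
        if model.startswith(key):
            return registry[key]
    return None
```

Without the length-based sort, a dated variant of 'o1-mini' could match 'o1' first, depending on iteration order, and pick up the wrong entry.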

Frontend:
- 2px progress bar on AgentNode (green <70%, amber 70-89%, red 90%+)
- Thresholds extracted to named constants (CONTEXT_WARN_PCT, CONTEXT_DANGER_PCT)
- Context row in MetadataGrid detail panel with division-by-zero guard
- Pulse animation at 100% usage
- Hidden when model context window size is unknown
- lib/utils.ts and lib/constants.ts (required by existing imports)

Tests:
- 21 Python tests: model lookups via get_pricing().context_window
  (exact, prefix, suffix, edge cases) and event emission via real
  WorkflowEngine
- All tests call production code, not copies
- Existing pricing tests (20) pass with zero regressions

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@aviraldua93 aviraldua93 force-pushed the feature/dashboard-enhancements branch from 1e74ab2 to 18e4b11 Compare March 26, 2026 18:06
@aviraldua93
Contributor Author

@jrob5756 this is ready for review again. Let me know your thoughts and whether this can go in. Thanks!

Collaborator

@jrob5756 jrob5756 left a comment

Review

Sorry for the delay @aviraldua93! I had this sitting as a pending review that I never sent. Just some small cleanup stuff, but happy to merge after :)

Missing test coverage for parallel agent context window events

The parallel_agent_completed event handler in workflow-store.ts (lines 691-703) also computes context_pct from context_window_used/context_window_max, and workflow.py emits these fields for parallel agents (line 2068). However, there are no tests covering this code path. Consider adding a test for parallel agent context window event emission.
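
For illustration, the cases such a test would need to cover can be sketched in Python. This mirrors the computation described above rather than reproducing workflow-store.ts or the engine; `compute_context_pct` and the sample payloads are hypothetical, and a real test would drive WorkflowEngine instead.

```python
from typing import Any, Dict, Optional


def compute_context_pct(event_data: Dict[str, Any]) -> Optional[float]:
    """Derive a percentage from the two context window fields, if both are usable."""
    used = event_data.get("context_window_used")
    max_tokens = event_data.get("context_window_max")
    # An unknown model yields context_window_max=None; skip the percentage then.
    if used is None or not max_tokens:
        return None
    return used / max_tokens * 100


# Sample payloads shaped like a parallel_agent_completed event (assumed shape).
known = {"agent_name": "a", "context_window_used": 32_000, "context_window_max": 128_000}
unknown = {"agent_name": "b", "context_window_used": 32_000, "context_window_max": None}

assert compute_context_pct(known) == 25.0
assert compute_context_pct(unknown) is None
```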

agent_name: string;
turn?: number;
context_window_used?: number;
context_window_max?: number;
Collaborator

Bug: context_window_used and context_window_max are added to AgentTurnStartData, but the Python backend never emits these fields in agent_turn_start events. The providers emit agent_turn_start with only {"turn": N} or {"turn": "awaiting_model"}, and the workflow engine doesn't inject context window fields for this event type. The workflow-store handler for agent_turn_start also never reads these fields.

These are dead types that will mislead consumers into expecting data that won't arrive. They should be removed from AgentTurnStartData.

Aviral Dua and others added 5 commits March 30, 2026 20:50
…ow tests

- Remove dead context_window_used and context_window_max fields from
  AgentTurnStartData in events.ts (backend never emits these on
  agent_turn_start events, frontend never reads them)
- Add 2 tests for parallel agent context window event emission:
  - Known model (gpt-4o) emits context_window_max=128000
  - Unknown model emits context_window_max=None

Addresses review comments from @jrob5756

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- 9 screenshots: green (14%), amber (74%), red (93%), unknown, dark mode
  plus detail panel views for each threshold
- 24 assertions: node rendering, API data correctness, detail panels
- Covers all color thresholds and the no-bar case for unknown models

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Apply ruff format to test_context_window_screenshots.py
- Remove binary screenshot PNGs from repo (should not be committed)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Not a real test — requires manual server + browser. Was used for
local QA verification only.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@jrob5756 jrob5756 merged commit 3859725 into microsoft:main Mar 31, 2026
7 checks passed

Development

Successfully merging this pull request may close these issues.

feat: Context window % visualization on web dashboard

3 participants