feat(vis): add session tracing visualization system #1391
Conversation
- Added WireViewer component to display wire events with filtering, searching, and navigation features.
- Introduced hooks for theme management and caching API responses.
- Created utility functions for class name merging.
- Established a new API layer for fetching session and wire event data.
- Implemented error boundary for better error handling in the application.
- Configured Vite for development with React and Tailwind CSS support.
- Added global styles and theming support using CSS variables.
- Set up TypeScript configuration for improved type safety and module resolution.
Pull request overview
This PR introduces a new "vis" (Agent Tracing Visualizer) feature to the kimi-cli project. It adds a full-stack visualization tool with a React/Vite/Tailwind frontend and a FastAPI backend that allows developers to inspect and debug agent sessions, including wire events, context messages, session state, and aggregate statistics.
Changes:
- Added a complete React frontend (`vis/`) with session explorer, wire event viewer, context message viewer, state viewer, dual view, statistics dashboard, and various analysis tools (decision path, turn efficiency, integrity checks, tool stats).
- Added a Python FastAPI backend (`src/kimi_cli/vis/`) with API endpoints for listing sessions, reading wire events, context messages, session state, and computing aggregate statistics.
- Integrated the new `vis` command into the CLI, added build scripts (`scripts/build_vis.py`), and updated the Makefile with build/dev targets.
Reviewed changes
Copilot reviewed 50 out of 53 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| `vis/package.json` | Frontend dependencies and build scripts |
| `vis/vite.config.ts` | Vite config with API proxy for development |
| `vis/tsconfig*.json` | TypeScript configuration files |
| `vis/index.html` | HTML entry point |
| `vis/src/main.tsx` | React entry point with error boundary |
| `vis/src/App.tsx` | Main app component with routing, tabs, keyboard shortcuts |
| `vis/src/index.css` | Global styles, theming, Tailwind CSS setup |
| `vis/src/lib/api.ts` | Frontend API layer with data fetching |
| `vis/src/lib/cache.ts` | Client-side API response caching |
| `vis/src/lib/utils.ts` | Tailwind class merge utility |
| `vis/src/hooks/use-theme.ts` | Theme toggle hook |
| `vis/src/components/markdown.tsx` | Markdown renderer using streamdown |
| `vis/src/components/ui/select.tsx` | Radix UI select component |
| `vis/src/features/wire-viewer/*.tsx` | Wire event viewer with filters, charts, tree nav, integrity checks |
| `vis/src/features/context-viewer/*.tsx` | Context message viewer with space map |
| `vis/src/features/state-viewer/state-viewer.tsx` | Session state JSON viewer |
| `vis/src/features/dual-view/dual-view.tsx` | Side-by-side wire/context view |
| `vis/src/features/sessions-explorer/*.tsx` | Sessions explorer with search, sort, grouping |
| `vis/src/features/session-picker/session-picker.tsx` | Session picker dropdown |
| `vis/src/features/statistics/statistics-view.tsx` | Aggregate statistics dashboard |
| `src/kimi_cli/vis/app.py` | FastAPI app creation and server runner |
| `src/kimi_cli/vis/api/sessions.py` | Session data API endpoints |
| `src/kimi_cli/vis/api/statistics.py` | Statistics API endpoint |
| `src/kimi_cli/vis/api/__init__.py` | API router exports |
| `src/kimi_cli/vis/__init__.py` | Package init |
| `src/kimi_cli/cli/vis.py` | CLI vis command |
| `src/kimi_cli/cli/__init__.py` | Register vis CLI command |
| `src/kimi_cli/utils/pyinstaller.py` | Include vis static assets in PyInstaller bundle |
| `scripts/build_vis.py` | Build script for vis frontend |
| `scripts/build_web.py` | Updated to handle incomplete dev dependencies |
| `Makefile` | Added vis build/dev targets |
| `.gitignore` | Ignore vis build artifacts |
| `vis/components.json` | shadcn/ui configuration |
`src/kimi_cli/vis/api/statistics.py` (Outdated)
```python
from kimi_cli.vis.api.sessions import get_work_dir_for_hash
from kimi_cli.wire.file import WireFileMetadata, parse_wire_file_line

router = APIRouter(prefix="/api/vis", tags=["vis"])


def _collect_events(
    msg_type: str,
    payload: dict[str, Any],
    out: list[tuple[str, dict[str, Any]]],
) -> None:
    """Recursively unwrap SubagentEvent and collect (type, payload) pairs."""
    if msg_type == "SubagentEvent":
        inner: dict[str, Any] | None = payload.get("event")
        if isinstance(inner, dict):
            inner_type: str = inner.get("type", "")
            inner_payload: dict[str, Any] = inner.get("payload", {})
            if inner_type:
                _collect_events(inner_type, inner_payload, out)
    else:
        out.append((msg_type, payload))


# Simple in-memory cache: (result, timestamp)
```
Maintainability: The `_collect_events` helper function is duplicated identically across `sessions.py` (lines 23-37) and `statistics.py` (lines 19-33). Consider extracting it into a shared utility module (e.g., `src/kimi_cli/vis/api/utils.py`) to avoid code duplication and ensure future changes are applied consistently.
Suggested change:

```diff
-from kimi_cli.vis.api.sessions import get_work_dir_for_hash
+from kimi_cli.vis.api.sessions import get_work_dir_for_hash, _collect_events
 from kimi_cli.wire.file import WireFileMetadata, parse_wire_file_line

 router = APIRouter(prefix="/api/vis", tags=["vis"])

-def _collect_events(
-    msg_type: str,
-    payload: dict[str, Any],
-    out: list[tuple[str, dict[str, Any]]],
-) -> None:
-    """Recursively unwrap SubagentEvent and collect (type, payload) pairs."""
-    if msg_type == "SubagentEvent":
-        inner: dict[str, Any] | None = payload.get("event")
-        if isinstance(inner, dict):
-            inner_type: str = inner.get("type", "")
-            inner_payload: dict[str, Any] = inner.get("payload", {})
-            if inner_type:
-                _collect_events(inner_type, inner_payload, out)
-    else:
-        out.append((msg_type, payload))
-
 # Simple in-memory cache: (result, timestamp)
 _cache: dict[str, tuple[dict[str, Any], float]] = {}
 _CACHE_TTL = 60  # seconds

 @router.get("/statistics")
 async def get_statistics() -> dict[str, Any]:
     """Aggregate statistics across all sessions."""
     now = time.time()
     cached = _cache.get("statistics")
     if cached and (now - cached[1]) < _CACHE_TTL:
         return cached[0]

     sessions_root = get_share_dir() / "sessions"
     if not sessions_root.exists():
         empty: dict[str, Any] = {
             "total_sessions": 0,
             "total_turns": 0,
             "total_tokens": {"input": 0, "output": 0},
             "total_duration_sec": 0,
             "tool_usage": [],
             "daily_usage": [],
             "per_project": [],
         }
         _cache["statistics"] = (empty, now)
         return empty

     total_sessions = 0
     total_turns = 0
     total_input_tokens = 0
     total_output_tokens = 0
     total_duration_sec = 0.0
     # tool_name -> { count, error_count }
     tool_stats: dict[str, dict[str, int]] = defaultdict(lambda: {"count": 0, "error_count": 0})
     # date_str -> { sessions, turns }
```
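As an aside, the helper's unwrapping behavior is easy to verify in isolation. The sketch below reproduces `_collect_events` exactly as it appears in the diff and walks a doubly nested `SubagentEvent`; the sample event data is made up for illustration:

```python
from typing import Any


def _collect_events(
    msg_type: str,
    payload: dict[str, Any],
    out: list[tuple[str, dict[str, Any]]],
) -> None:
    """Recursively unwrap SubagentEvent and collect (type, payload) pairs."""
    if msg_type == "SubagentEvent":
        inner: dict[str, Any] | None = payload.get("event")
        if isinstance(inner, dict):
            inner_type: str = inner.get("type", "")
            inner_payload: dict[str, Any] = inner.get("payload", {})
            if inner_type:
                _collect_events(inner_type, inner_payload, out)
    else:
        out.append((msg_type, payload))


# A doubly nested subagent event unwraps to the innermost (type, payload) pair.
nested = {
    "event": {
        "type": "SubagentEvent",
        "payload": {"event": {"type": "ToolCall", "payload": {"id": "t1"}}},
    }
}
collected: list[tuple[str, dict[str, Any]]] = []
_collect_events("SubagentEvent", nested, collected)
```

Wherever the helper ends up living, keeping a small test like this next to the shared module would guard the recursion against regressions.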
```ts
// Duration
let durationSec = 0;
if (resultEvent) {
  durationSec = Math.max(0, (resultEvent.timestamp - ev.timestamp) / 1000);
```
Bug: The `durationSec` calculation divides by 1000, but based on `wire-event-card.tsx` lines 94-96 (`formatTimestamp` multiplies `ts * 1000` to get milliseconds for `Date`), the timestamps are already in seconds. Both `tool-call-detail.tsx` line 77 and `turn-efficiency.tsx` line 83 compute duration as a simple difference without dividing by 1000 (`toolResult.timestamp - toolCall.timestamp`). This `/1000` will produce values 1000x too small, showing durations like "0.0s" instead of the correct value.
```ts
private evictIfNeeded(): void {
  if (this.store.size <= MAX_ENTRIES) return;
  // Evict oldest entries (by insertion order — Map preserves order)
  const toDelete = this.store.size - MAX_ENTRIES;
  let deleted = 0;
  for (const [key, entry] of this.store) {
    // Don't evict in-flight requests
    if (entry.promise) continue;
    this.store.delete(key);
    deleted++;
    if (deleted >= toDelete) break;
  }
}
```
Bug: The `evictIfNeeded()` check uses `<=`, which means eviction only runs when `size > MAX_ENTRIES`. However, eviction is called after `this.store.set(key, ...)` in the success handler (lines 51-52), so the store can temporarily hold `MAX_ENTRIES + 1` items. More importantly, if all excess entries have in-flight promises (the `if (entry.promise) continue;` guard on line 20), no entries will be deleted and the store will grow unbounded. Consider handling this edge case, or at minimum evicting entries that have completed (no promise) regardless of their age.
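One possible shape for the suggested policy, sketched in Python over an insertion-ordered dict (the original cache is TypeScript; `Entry` and the `MAX_ENTRIES` value here are illustrative stand-ins, not the PR's code): compute the excess first, then delete only completed entries, oldest first, so in-flight requests are preserved but completed entries beyond the limit are always reclaimed.

```python
from dataclasses import dataclass
from typing import Any, Optional

MAX_ENTRIES = 3  # small limit for illustration; the real cache's limit differs


@dataclass
class Entry:
    value: Any = None
    promise: Optional[object] = None  # non-None marks an in-flight request


def evict_if_needed(store: dict[str, Entry]) -> None:
    """Evict completed entries, oldest first, until the store fits MAX_ENTRIES.

    Only in-flight entries can keep the store temporarily oversized; every
    completed entry beyond the limit is removed, so the store cannot grow
    unbounded just because some requests are still pending.
    """
    excess = len(store) - MAX_ENTRIES
    if excess <= 0:
        return
    # dicts preserve insertion order, so iteration visits oldest entries first
    for key in [k for k, e in store.items() if e.promise is None][:excess]:
        del store[key]


# Demonstration: five entries, one of which is still in flight.
store = {f"k{i}": Entry(value=i) for i in range(5)}
store["k1"].promise = object()  # pretend k1 has a pending request
evict_if_needed(store)
```

The in-flight entry survives even though it is among the oldest, while the two oldest completed entries are evicted to bring the store back to the limit.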
```python
application.add_middleware(
    cast(Any, CORSMiddleware),
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)
```
Security: The CORS middleware is configured with `allow_origins=["*"]`. While this is a local development/debugging tool, if this server is ever exposed on a network (e.g., `host=0.0.0.0`), this allows any website to make authenticated requests to it. Consider restricting origins to localhost only (e.g., `["http://127.0.0.1:5495", "http://localhost:5495"]`) or at least documenting the security implication.
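A restricted configuration along those lines might look like the following sketch (the port 5495 is taken from the example in the comment above; `origin_allowed` is a hypothetical helper, shown only to make the exact-match semantics of a fixed origin list concrete):

```python
# Hypothetical sketch: restrict CORS to local origins instead of "*".
VIS_PORT = 5495  # port value taken from the review comment's example

ALLOWED_ORIGINS = [
    f"http://127.0.0.1:{VIS_PORT}",
    f"http://localhost:{VIS_PORT}",
]


def origin_allowed(origin: str) -> bool:
    """Exact-match check, mirroring how a fixed allow_origins list behaves."""
    return origin in ALLOWED_ORIGINS
```

The `ALLOWED_ORIGINS` list would then be passed as `allow_origins=ALLOWED_ORIGINS` in place of `["*"]` in the middleware setup above.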
```python
if wire_path.exists():
    try:
        with wire_path.open(encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue
                try:
                    parsed = parse_wire_file_line(line)
                except Exception:
                    logger.debug("Skipped malformed line in %s", wire_path)
                    continue
                if isinstance(parsed, WireFileMetadata):
                    continue
                if parsed.message.type == "TurnBegin":
                    turn_count += 1
                    if turn_count == 1:
                        user_input = parsed.message.payload.get("user_input", "")
                        if isinstance(user_input, str):
                            title = user_input[:100]
                        elif isinstance(user_input, list) and user_input:
                            first = user_input[0]
                            if isinstance(first, dict):
                                title = str(first.get("text", ""))[:100]
    except Exception:
        pass
```
Performance: The `list_sessions` endpoint reads and parses every line of every `wire.jsonl` file across all sessions on each request (with only a 30s client-side cache). For large session stores this could be very slow, and it blocks the event loop because the file reads in this function are synchronous (`wire_path.open()` without `aiofiles`). Consider using `aiofiles` here too (as is done in other endpoints), or caching the results server-side similar to how `get_statistics` does it.
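A minimal server-side TTL cache in the spirit of `get_statistics` could be factored out like this (a sketch under stated assumptions: `cached` and the injectable clock are illustrative, not code from the PR):

```python
import time
from typing import Any, Callable

_CACHE_TTL = 30.0  # seconds; illustrative value
_cache: dict[str, tuple[Any, float]] = {}


def cached(
    key: str,
    compute: Callable[[], Any],
    now: Callable[[], float] = time.monotonic,
) -> Any:
    """Serve a cached result while the entry is younger than _CACHE_TTL."""
    entry = _cache.get(key)
    t = now()
    if entry is not None and (t - entry[1]) < _CACHE_TTL:
        return entry[0]
    value = compute()
    _cache[key] = (value, t)
    return value


# Demonstration with a fake clock so the TTL behaviour is deterministic.
calls = 0


def expensive_scan() -> dict[str, int]:
    global calls
    calls += 1
    return {"total_sessions": 42}


clock = [0.0]
first = cached("sessions", expensive_scan, now=lambda: clock[0])
clock[0] = 10.0
second = cached("sessions", expensive_scan, now=lambda: clock[0])  # cache hit
clock[0] = 45.0
third = cached("sessions", expensive_scan, now=lambda: clock[0])  # TTL expired
```

With the fake clock, the expensive scan runs on the first call, is skipped at 10s, and runs again once the 30s TTL has elapsed.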
```python
    with wire_path.open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                parsed = parse_wire_file_line(line)
            except Exception:
                continue
            if isinstance(parsed, WireFileMetadata):
                continue

            ts = parsed.timestamp
            msg_type = parsed.message.type
            payload = parsed.message.payload

            if first_ts == 0:
                first_ts = ts
                # Determine date from first timestamp
                try:
                    dt = datetime.fromtimestamp(ts, tz=UTC)
                    session_date = dt.strftime("%Y-%m-%d")
                except Exception:
                    pass
            last_ts = ts

            # Collect (type, payload) pairs, unwrapping SubagentEvent recursively
            events_to_process: list[tuple[str, dict[str, Any]]] = []
            _collect_events(msg_type, payload, events_to_process)

            for ev_type, ev_payload in events_to_process:
                if ev_type == "TurnBegin":
                    session_turns += 1
                elif ev_type == "ToolCall":
                    fn: dict[str, Any] | None = ev_payload.get("function")
                    tool_id: str = ev_payload.get("id", "")
                    if isinstance(fn, dict):
                        name: str = fn.get("name", "unknown")
                        tool_stats[name]["count"] += 1
                        if tool_id:
                            pending_tools[tool_id] = name
                elif ev_type == "ToolResult":
                    tool_call_id: str = ev_payload.get("tool_call_id", "")
                    rv: dict[str, Any] | None = ev_payload.get("return_value")
                    if isinstance(rv, dict) and rv.get("is_error"):
                        tool_name = pending_tools.get(tool_call_id)
                        if tool_name:
                            tool_stats[tool_name]["error_count"] += 1
                    pending_tools.pop(tool_call_id, None)
                elif ev_type == "StatusUpdate":
                    tu: dict[str, Any] | None = ev_payload.get("token_usage")
                    if isinstance(tu, dict):
                        session_input_tokens += (
                            int(tu.get("input_other", 0))
                            + int(tu.get("input_cache_read", 0))
                            + int(tu.get("input_cache_creation", 0))
                        )
                        session_output_tokens += int(tu.get("output", 0))
    except Exception:
        continue
```
Performance: The `get_statistics` endpoint reads all wire files synchronously (using `wire_path.open()` instead of `aiofiles`) inside an `async` function. This will block the event loop for the duration of all the file I/O, potentially making the server unresponsive during the computation. The other endpoints in `sessions.py` use `aiofiles` for async I/O; this endpoint should do the same for consistency and to avoid blocking.
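If rewriting the parsing loop around `aiofiles` is too invasive, another option is to keep the synchronous scan and push it onto a worker thread with `asyncio.to_thread`, as sketched below (illustrative names, not the PR's code; the demo file is a throwaway stand-in for `wire.jsonl`):

```python
import asyncio
import tempfile
from pathlib import Path


def _read_lines(path: Path) -> list[str]:
    """Synchronous scan, a stand-in for the wire-file parsing loop."""
    with path.open(encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


async def read_wire_lines(path: Path) -> list[str]:
    # to_thread moves the blocking I/O off the event loop, keeping the
    # server responsive without rewriting the parser itself.
    return await asyncio.to_thread(_read_lines, path)


# Tiny demonstration with a throwaway file (blank lines are skipped).
with tempfile.NamedTemporaryFile("w", suffix=".jsonl", delete=False) as tf:
    tf.write('{"a": 1}\n\n{"b": 2}\n')
    demo_path = Path(tf.name)

lines = asyncio.run(read_wire_lines(demo_path))
demo_path.unlink()
```

The trade-off is that each request still pays the full scan cost, just off the loop; combining this with a server-side cache like the one in `get_statistics` addresses both problems.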
```ts
const labelCount = Math.min(5, daily.length);
const labelIndices: number[] = [];
for (let i = 0; i < labelCount; i++) {
  labelIndices.push(
    Math.round((i / (labelCount - 1)) * (daily.length - 1)),
  );
}
```
Bug: The `DailyUsageChart` computes label indices with `Math.round((i / (labelCount - 1)) * (daily.length - 1))`. When `labelCount` is 1 (which happens when `daily.length` is 1), this results in a division by zero (`0/0 = NaN`), producing `NaN` indices. Although the `if (daily.length === 0) return null;` guard exists, a single-day dataset would still hit this bug. Add a guard for `labelCount <= 1` or handle it in the loop.
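The guarded calculation, ported to Python for illustration (the original is TypeScript; `label_indices` is a hypothetical helper, not the PR's code):

```python
def label_indices(n_points: int, max_labels: int = 5) -> list[int]:
    """Pick up to max_labels evenly spaced indices into n_points entries.

    Guards the count <= 1 cases so a single-day dataset yields [0]
    rather than the 0/0 division the review comment describes.
    """
    count = min(max_labels, n_points)
    if count <= 0:
        return []
    if count == 1:
        return [0]
    step = (n_points - 1) / (count - 1)
    return [round(i * step) for i in range(count)]
```

The first and last labels always land on the first and last data points, and the degenerate zero- and one-point inputs never reach the division.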
…improve label index calculation in DailyUsageChart
Signed-off-by: Kai <me@kaiyi.cool>
```ts
  }
} else if (ev.type === "ApprovalResponse") {
  totalPairable++;
  const id = ev.payload.id as string | undefined;
```
🔴 `IntegrityCheck` uses the wrong field (`id` instead of `request_id`) to match `ApprovalResponse` to `ApprovalRequest`.
In `integrity-check.tsx`, `ApprovalResponse` events are matched to `ApprovalRequest` events using `ev.payload.id` (line 105). However, in `timeline-view.tsx:306`, the same matching logic correctly uses `ev.payload.request_id`. Since `ApprovalResponse` payloads carry a `request_id` field that references the original request's `id`, reading `payload.id` on the response will either be `undefined` or be the response's own ID — never matching the request's ID in `approvalMap`. This causes all approval pairs to be falsely reported as orphaned events, inflating the orphan count and deflating the integrity score.
Suggested change:

```diff
-      const id = ev.payload.id as string | undefined;
+      const id = ev.payload.request_id as string | undefined;
```
Summary
- `vis` subsystem with a FastAPI backend and React frontend for visualizing session tracing data
- `SubagentEvent` unwrapping in metrics aggregation so subagent tool calls, errors, and token usage are correctly counted
- `kimi vis` CLI command and build scripts

Test plan
- Run `kimi vis` and verify the web UI loads
- `make check` passes (ruff, pyright, format)

Checklist
- Run `make gen-changelog` to update the changelog.
- Run `make gen-docs` to update the user documentation.