feat(vis): add session tracing visualization system #1391

Merged
RealKai42 merged 4 commits into main from kaiyi/vis-tracing-system on Mar 10, 2026

RealKai42 (Collaborator) commented on Mar 10, 2026

Summary

  • Add a new vis subsystem with a FastAPI backend and React frontend for visualizing session tracing data
  • Backend APIs serve wire events, context messages, session state, per-session summaries, and aggregate statistics
  • Frontend includes wire event viewer, context viewer, state viewer, decision path, timeline, usage charts, and a statistics dashboard
  • Recursively unwrap SubagentEvent in metrics aggregation so subagent tool calls, errors, and token usage are correctly counted
  • Add kimi vis CLI command and build scripts

Test plan

  • Run kimi vis and verify the web UI loads
  • Open a session with subagent activity and confirm tool usage / token stats include subagent events
  • Verify make check passes (ruff, pyright, format)

Checklist

  • I have read the CONTRIBUTING document.
  • I have linked the related issue, if any.
  • I have added tests that prove my fix is effective or that my feature works.
  • I have run make gen-changelog to update the changelog.
  • I have run make gen-docs to update the user documentation.

- Added WireViewer component to display wire events with filtering, searching, and navigation features.
- Introduced hooks for theme management and caching API responses.
- Created utility functions for class name merging.
- Established a new API layer for fetching session and wire event data.
- Implemented error boundary for better error handling in the application.
- Configured Vite for development with React and Tailwind CSS support.
- Added global styles and theming support using CSS variables.
- Set up TypeScript configuration for improved type safety and module resolution.
Copilot AI review requested due to automatic review settings March 10, 2026 13:33
github-actions bot changed the title from "feat: add vis" to "feat: add vis || feat: add vis" on Mar 10, 2026
devin-ai-integration[bot]

This comment was marked as resolved.

Contributor

Copilot AI left a comment

Pull request overview

This PR introduces a new "vis" (Agent Tracing Visualizer) feature to the kimi-cli project. It adds a full-stack visualization tool with a React/Vite/Tailwind frontend and a FastAPI backend that allows developers to inspect and debug agent sessions, including wire events, context messages, session state, and aggregate statistics.

Changes:

  • Added a complete React frontend (vis/) with session explorer, wire event viewer, context message viewer, state viewer, dual view, statistics dashboard, and various analysis tools (decision path, turn efficiency, integrity checks, tool stats).
  • Added a Python FastAPI backend (src/kimi_cli/vis/) with API endpoints for listing sessions, reading wire events, context messages, session state, and computing aggregate statistics.
  • Integrated the new vis command into the CLI, added build scripts (scripts/build_vis.py), and updated the Makefile with build/dev targets.

Reviewed changes

Copilot reviewed 50 out of 53 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| vis/package.json | Frontend dependencies and build scripts |
| vis/vite.config.ts | Vite config with API proxy for development |
| vis/tsconfig*.json | TypeScript configuration files |
| vis/index.html | HTML entry point |
| vis/src/main.tsx | React entry point with error boundary |
| vis/src/App.tsx | Main app component with routing, tabs, keyboard shortcuts |
| vis/src/index.css | Global styles, theming, Tailwind CSS setup |
| vis/src/lib/api.ts | Frontend API layer with data fetching |
| vis/src/lib/cache.ts | Client-side API response caching |
| vis/src/lib/utils.ts | Tailwind class merge utility |
| vis/src/hooks/use-theme.ts | Theme toggle hook |
| vis/src/components/markdown.tsx | Markdown renderer using streamdown |
| vis/src/components/ui/select.tsx | Radix UI select component |
| vis/src/features/wire-viewer/*.tsx | Wire event viewer with filters, charts, tree nav, integrity checks |
| vis/src/features/context-viewer/*.tsx | Context message viewer with space map |
| vis/src/features/state-viewer/state-viewer.tsx | Session state JSON viewer |
| vis/src/features/dual-view/dual-view.tsx | Side-by-side wire/context view |
| vis/src/features/sessions-explorer/*.tsx | Sessions explorer with search, sort, grouping |
| vis/src/features/session-picker/session-picker.tsx | Session picker dropdown |
| vis/src/features/statistics/statistics-view.tsx | Aggregate statistics dashboard |
| src/kimi_cli/vis/app.py | FastAPI app creation and server runner |
| src/kimi_cli/vis/api/sessions.py | Session data API endpoints |
| src/kimi_cli/vis/api/statistics.py | Statistics API endpoint |
| src/kimi_cli/vis/api/__init__.py | API router exports |
| src/kimi_cli/vis/__init__.py | Package init |
| src/kimi_cli/cli/vis.py | CLI vis command |
| src/kimi_cli/cli/__init__.py | Register vis CLI command |
| src/kimi_cli/utils/pyinstaller.py | Include vis static assets in PyInstaller bundle |
| scripts/build_vis.py | Build script for vis frontend |
| scripts/build_web.py | Updated to handle incomplete dev dependencies |
| Makefile | Added vis build/dev targets |
| .gitignore | Ignore vis build artifacts |
| vis/components.json | shadcn/ui configuration |

Comment on lines +13 to +36
from kimi_cli.vis.api.sessions import get_work_dir_for_hash
from kimi_cli.wire.file import WireFileMetadata, parse_wire_file_line

router = APIRouter(prefix="/api/vis", tags=["vis"])


def _collect_events(
    msg_type: str,
    payload: dict[str, Any],
    out: list[tuple[str, dict[str, Any]]],
) -> None:
    """Recursively unwrap SubagentEvent and collect (type, payload) pairs."""
    if msg_type == "SubagentEvent":
        inner: dict[str, Any] | None = payload.get("event")
        if isinstance(inner, dict):
            inner_type: str = inner.get("type", "")
            inner_payload: dict[str, Any] = inner.get("payload", {})
            if inner_type:
                _collect_events(inner_type, inner_payload, out)
    else:
        out.append((msg_type, payload))


# Simple in-memory cache: (result, timestamp)

Copilot AI Mar 10, 2026

Maintainability: The _collect_events helper function is duplicated identically across sessions.py (lines 23-37) and statistics.py (lines 19-33). Consider extracting this into a shared utility module (e.g., src/kimi_cli/vis/api/utils.py) to avoid code duplication and ensure future changes are applied consistently.

Suggested change
- from kimi_cli.vis.api.sessions import get_work_dir_for_hash
- from kimi_cli.wire.file import WireFileMetadata, parse_wire_file_line
- router = APIRouter(prefix="/api/vis", tags=["vis"])
- def _collect_events(
-     msg_type: str,
-     payload: dict[str, Any],
-     out: list[tuple[str, dict[str, Any]]],
- ) -> None:
-     """Recursively unwrap SubagentEvent and collect (type, payload) pairs."""
-     if msg_type == "SubagentEvent":
-         inner: dict[str, Any] | None = payload.get("event")
-         if isinstance(inner, dict):
-             inner_type: str = inner.get("type", "")
-             inner_payload: dict[str, Any] = inner.get("payload", {})
-             if inner_type:
-                 _collect_events(inner_type, inner_payload, out)
-     else:
-         out.append((msg_type, payload))
- # Simple in-memory cache: (result, timestamp)
+ from kimi_cli.vis.api.sessions import get_work_dir_for_hash, _collect_events
+ from kimi_cli.wire.file import WireFileMetadata, parse_wire_file_line
+ router = APIRouter(prefix="/api/vis", tags=["vis"])
+ # Simple in-memory cache: (result, timestamp)
+ _cache: dict[str, tuple[dict[str, Any], float]] = {}
+ _CACHE_TTL = 60  # seconds
+ @router.get("/statistics")
+ async def get_statistics() -> dict[str, Any]:
+     """Aggregate statistics across all sessions."""
+     now = time.time()
+     cached = _cache.get("statistics")
+     if cached and (now - cached[1]) < _CACHE_TTL:
+         return cached[0]
+     sessions_root = get_share_dir() / "sessions"
+     if not sessions_root.exists():
+         empty: dict[str, Any] = {
+             "total_sessions": 0,
+             "total_turns": 0,
+             "total_tokens": {"input": 0, "output": 0},
+             "total_duration_sec": 0,
+             "tool_usage": [],
+             "daily_usage": [],
+             "per_project": [],
+         }
+         _cache["statistics"] = (empty, now)
+         return empty
+     total_sessions = 0
+     total_turns = 0
+     total_input_tokens = 0
+     total_output_tokens = 0
+     total_duration_sec = 0.0
+     # tool_name -> { count, error_count }
+     tool_stats: dict[str, dict[str, int]] = defaultdict(lambda: {"count": 0, "error_count": 0})
+     # date_str -> { sessions, turns }

Copilot uses AI. Check for mistakes.
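
To make the behaviour of the shared helper concrete, here is a small standalone sketch of the recursive unwrapping applied to a nested event. The function name `collect_events` and the sample payloads are made up for illustration; only the `SubagentEvent` / `event` / `type` / `payload` shape is taken from the hunk quoted above, and the real wire format may carry more fields.

```python
from typing import Any


def collect_events(
    msg_type: str,
    payload: dict[str, Any],
    out: list[tuple[str, dict[str, Any]]],
) -> None:
    """Recursively unwrap SubagentEvent and collect (type, payload) pairs."""
    if msg_type == "SubagentEvent":
        inner = payload.get("event")
        if isinstance(inner, dict):
            inner_type = inner.get("type", "")
            inner_payload = inner.get("payload", {})
            if inner_type:
                collect_events(inner_type, inner_payload, out)
    else:
        out.append((msg_type, payload))


# Hypothetical sample: a ToolCall wrapped in two levels of SubagentEvent.
nested = {
    "event": {
        "type": "SubagentEvent",
        "payload": {
            "event": {
                "type": "ToolCall",
                "payload": {"id": "t1", "function": {"name": "Shell"}},
            }
        },
    }
}

collected: list[tuple[str, dict[str, Any]]] = []
collect_events("SubagentEvent", nested, collected)
collect_events("TurnBegin", {"user_input": "hi"}, collected)
print([event_type for event_type, _ in collected])
```

Both call sites (sessions.py and statistics.py) could import one such function from a shared module, so the wrapper events themselves never reach the counters and only the innermost events are tallied.
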
// Duration
let durationSec = 0;
if (resultEvent) {
  durationSec = Math.max(0, (resultEvent.timestamp - ev.timestamp) / 1000);

Copilot AI Mar 10, 2026

Bug: The durationSec calculation divides by 1000, but the timestamps are already in seconds: formatTimestamp in wire-event-card.tsx (lines 94-96) multiplies ts * 1000 to get milliseconds for Date. Both tool-call-detail.tsx (line 77) and turn-efficiency.tsx (line 83) compute duration as a plain difference (toolResult.timestamp - toolCall.timestamp) without dividing by 1000. The /1000 here will therefore produce values 1000x too small, showing durations like "0.0s" instead of the correct value.

Comment on lines +13 to +25
private evictIfNeeded(): void {
  if (this.store.size <= MAX_ENTRIES) return;
  // Evict oldest entries (by insertion order — Map preserves order)
  const toDelete = this.store.size - MAX_ENTRIES;
  let deleted = 0;
  for (const [key, entry] of this.store) {
    // Don't evict in-flight requests
    if (entry.promise) continue;
    this.store.delete(key);
    deleted++;
    if (deleted >= toDelete) break;
  }
}

Copilot AI Mar 10, 2026

Bug: The evictIfNeeded() check uses <= which means eviction only runs when size > MAX_ENTRIES. However, eviction is called after this.store.set(key, ...) in the success handler (line 51-52), meaning the store can temporarily hold MAX_ENTRIES + 1 items. More importantly, if all excess entries have in-flight promises (the if (entry.promise) continue; guard on line 20), no entries will be deleted and the store will grow unbounded. Consider handling this edge case or at minimum evicting entries that have completed (no promise) regardless of their age.

Comment on lines +38 to +44
application.add_middleware(
    cast(Any, CORSMiddleware),
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

Copilot AI Mar 10, 2026

Security: The CORS middleware is configured with allow_origins=["*"]. While this is a local development/debugging tool, if this server is ever exposed on a network (e.g., host=0.0.0.0), this allows any website to make authenticated requests to it. Consider restricting origins to localhost only (e.g., ["http://127.0.0.1:5495", "http://localhost:5495"]) or at least documenting the security implication.
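
One way to act on this, sketched under assumptions: build the middleware arguments from the port the server actually listens on, so only loopback origins are allowed. The helper name `local_cors_options` is hypothetical, and the port 5495 is simply the one used in the example origins above. The keyword names match the ones already passed to `CORSMiddleware` in the quoted hunk.

```python
def local_cors_options(port: int) -> dict[str, object]:
    """Build CORSMiddleware keyword arguments limited to loopback origins."""
    origins = [f"http://{host}:{port}" for host in ("127.0.0.1", "localhost")]
    return {
        "allow_origins": origins,   # replaces the wildcard "*"
        "allow_credentials": True,  # unchanged from the quoted hunk
        "allow_methods": ["*"],     # unchanged from the quoted hunk
        "allow_headers": ["*"],     # unchanged from the quoted hunk
    }


options = local_cors_options(5495)
print(options["allow_origins"])

# Intended use inside the app factory (not executed here; needs FastAPI):
# application.add_middleware(CORSMiddleware, **local_cors_options(port))
```

If the Vite dev server proxies API calls from a different port, that origin would have to be added to the list as well.
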

Comment on lines +100 to +125
if wire_path.exists():
    try:
        with wire_path.open(encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue
                try:
                    parsed = parse_wire_file_line(line)
                except Exception:
                    logger.debug("Skipped malformed line in %s", wire_path)
                    continue
                if isinstance(parsed, WireFileMetadata):
                    continue
                if parsed.message.type == "TurnBegin":
                    turn_count += 1
                    if turn_count == 1:
                        user_input = parsed.message.payload.get("user_input", "")
                        if isinstance(user_input, str):
                            title = user_input[:100]
                        elif isinstance(user_input, list) and user_input:
                            first = user_input[0]
                            if isinstance(first, dict):
                                title = str(first.get("text", ""))[:100]
    except Exception:
        pass

Copilot AI Mar 10, 2026

Performance: The list_sessions endpoint reads and parses every line of every wire.jsonl file across all sessions on each request (with only a 30s client-side cache). For large session stores, this could be very slow and block the event loop since the file reads in this function are synchronous (wire_path.open() without aiofiles). Consider using aiofiles here too (as is done in other endpoints), or caching the results server-side similar to how get_statistics does it.

Comment on lines +103 to +162
    with wire_path.open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                parsed = parse_wire_file_line(line)
            except Exception:
                continue
            if isinstance(parsed, WireFileMetadata):
                continue

            ts = parsed.timestamp
            msg_type = parsed.message.type
            payload = parsed.message.payload

            if first_ts == 0:
                first_ts = ts
                # Determine date from first timestamp
                try:
                    dt = datetime.fromtimestamp(ts, tz=UTC)
                    session_date = dt.strftime("%Y-%m-%d")
                except Exception:
                    pass
            last_ts = ts

            # Collect (type, payload) pairs, unwrapping SubagentEvent recursively
            events_to_process: list[tuple[str, dict[str, Any]]] = []
            _collect_events(msg_type, payload, events_to_process)

            for ev_type, ev_payload in events_to_process:
                if ev_type == "TurnBegin":
                    session_turns += 1
                elif ev_type == "ToolCall":
                    fn: dict[str, Any] | None = ev_payload.get("function")
                    tool_id: str = ev_payload.get("id", "")
                    if isinstance(fn, dict):
                        name: str = fn.get("name", "unknown")
                        tool_stats[name]["count"] += 1
                        if tool_id:
                            pending_tools[tool_id] = name
                elif ev_type == "ToolResult":
                    tool_call_id: str = ev_payload.get("tool_call_id", "")
                    rv: dict[str, Any] | None = ev_payload.get("return_value")
                    if isinstance(rv, dict) and rv.get("is_error"):
                        tool_name = pending_tools.get(tool_call_id)
                        if tool_name:
                            tool_stats[tool_name]["error_count"] += 1
                    pending_tools.pop(tool_call_id, None)
                elif ev_type == "StatusUpdate":
                    tu: dict[str, Any] | None = ev_payload.get("token_usage")
                    if isinstance(tu, dict):
                        session_input_tokens += (
                            int(tu.get("input_other", 0))
                            + int(tu.get("input_cache_read", 0))
                            + int(tu.get("input_cache_creation", 0))
                        )
                        session_output_tokens += int(tu.get("output", 0))
except Exception:
    continue

Copilot AI Mar 10, 2026

Performance: The get_statistics endpoint reads all wire files synchronously (using wire_path.open() instead of aiofiles) inside an async function. This will block the event loop for the duration of all file I/O, potentially making the server unresponsive during the computation. The other endpoints in sessions.py use aiofiles for async I/O; this endpoint should do the same for consistency and to avoid blocking.
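
As an alternative to the aiofiles route named in the comment, the blocking scan can be kept as a plain function and moved off the event loop with the standard library's `asyncio.to_thread` (Python 3.9+). The sketch below is illustrative only: `count_turns` and the flat `{"type": ...}` record shape are assumptions, not the project's real wire format or parser.

```python
import asyncio
import json
import tempfile
from pathlib import Path


def count_turns(wire_path: Path) -> int:
    """Blocking scan: count TurnBegin records in a JSONL file."""
    turns = 0
    with wire_path.open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip malformed lines, as the endpoint does
            # Hypothetical flat record shape, for illustration only.
            if isinstance(record, dict) and record.get("type") == "TurnBegin":
                turns += 1
    return turns


async def count_turns_async(wire_path: Path) -> int:
    """Run the blocking scan in a worker thread so the event loop stays free."""
    return await asyncio.to_thread(count_turns, wire_path)


def demo() -> int:
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "wire.jsonl"
        path.write_text(
            '{"type": "TurnBegin"}\nnot json\n{"type": "ToolCall"}\n{"type": "TurnBegin"}\n',
            encoding="utf-8",
        )
        return asyncio.run(count_turns_async(path))


print(demo())
```

The appeal of this approach is that the parsing loop stays synchronous and unchanged; only the call site inside the async endpoint needs to await the threaded wrapper.
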

Comment on lines +80 to +86
const labelCount = Math.min(5, daily.length);
const labelIndices: number[] = [];
for (let i = 0; i < labelCount; i++) {
  labelIndices.push(
    Math.round((i / (labelCount - 1)) * (daily.length - 1)),
  );
}

Copilot AI Mar 10, 2026

Bug: The DailyUsageChart computes label indices with Math.round((i / (labelCount - 1)) * (daily.length - 1)). When labelCount is 1 (which happens when daily.length is 1), this results in a division by zero (0/0 = NaN), producing NaN indices. Although the if (daily.length === 0) return null; guard exists, a single-day dataset would still hit this bug. Add a guard for labelCount <= 1 or handle it in the loop.

RealKai42 changed the title from "feat: add vis || feat: add vis" to "feat(vis): add session tracing visualization system" on Mar 10, 2026
devin-ai-integration[bot]

This comment was marked as resolved.

…improve label index calculation in DailyUsageChart
Signed-off-by: Kai <me@kaiyi.cool>
RealKai42 merged commit a17d55f into main on Mar 10, 2026
20 checks passed
RealKai42 deleted the kaiyi/vis-tracing-system branch on March 10, 2026 15:01
Contributor

@devin-ai-integration devin-ai-integration bot left a comment

Devin Review found 1 new potential issue.

View 11 additional findings in Devin Review.

  }
} else if (ev.type === "ApprovalResponse") {
  totalPairable++;
  const id = ev.payload.id as string | undefined;

🔴 IntegrityCheck uses wrong field (id instead of request_id) to match ApprovalResponse to ApprovalRequest

In integrity-check.tsx, ApprovalResponse events are matched to ApprovalRequest events using ev.payload.id (line 105). However, in timeline-view.tsx:306, the same matching logic correctly uses ev.payload.request_id. Since ApprovalResponse payloads carry a request_id field that references the original request's id, reading payload.id on the response will either be undefined or be the response's own ID — never matching the request's ID in approvalMap. This causes all approval pairs to be falsely reported as orphaned events, inflating the orphan count and deflating the integrity score.

Suggested change
- const id = ev.payload.id as string | undefined;
+ const id = ev.payload.request_id as string | undefined;