fix: support newer Gemini CLI and Copilot session formats #155
Handle Gemini CLI jsonl chat logs, warning messages, and nested chat directories while tolerating empty Copilot sessions from newer VS Code releases. This restores parsing for newly recorded sessions and adds regression coverage for issue Piebald-AI#154. Signed-off-by: jimyag <git@jimyag.com>
No actionable comments were generated in the recent review. 🎉
ℹ️ Recent review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID:
📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 Walkthrough
Adds JSONL support and per-message-ID deduplication to the Gemini CLI analyzer, expands chat-file discovery to include
Changes
Sequence Diagram(s)
sequenceDiagram
participant Reader as File Reader
participant Parser as JSON/JSONL Parser
participant Deduper as ID Deduplicator
participant Converter as messages_from_session
participant Storage as Message Sink
Reader->>Parser: Read file (detect .json / .jsonl)
alt .jsonl
Parser->>Parser: Stream parse lines -> events
Parser->>Deduper: Emit message events (id, payload)
Deduper->>Deduper: Keep latest per id (by timestamp/$set)
Deduper->>Converter: Emit ordered latest messages
else .json
Parser->>Converter: Parse full session object
end
Converter->>Storage: Convert to ConversationMessage list (compute conversation_hash)
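The deduplication step in the diagram can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `MessageEvent` and `dedupe_latest` are hypothetical names, the event is assumed to be already parsed from one JSONL line, and only the timestamp rule is modeled (the `$set` metadata updates mentioned in the diagram are left out).

```rust
use std::collections::HashMap;

/// Hypothetical event shape: one parsed JSONL line that carries a message id.
#[derive(Debug, Clone, PartialEq)]
struct MessageEvent {
    id: String,
    timestamp: u64,
    payload: String,
}

/// Keep only the latest event per id while preserving the order in which
/// ids were first seen. A later line replaces an earlier one with the same
/// id when its timestamp is not older.
fn dedupe_latest(events: Vec<MessageEvent>) -> Vec<MessageEvent> {
    // Maps a message id to its slot in `latest`.
    let mut index_by_id: HashMap<String, usize> = HashMap::new();
    let mut latest: Vec<MessageEvent> = Vec::new();

    for event in events {
        // Copy the index out so no borrow of the map is held below.
        let existing = index_by_id.get(&event.id).copied();
        match existing {
            Some(i) => {
                if event.timestamp >= latest[i].timestamp {
                    latest[i] = event;
                }
            }
            None => {
                index_by_id.insert(event.id.clone(), latest.len());
                latest.push(event);
            }
        }
    }
    latest
}
```

Keeping a `Vec` alongside the index map means the output order follows the first appearance of each id, which is one plausible reading of "ordered latest messages"; the actual analyzer may order differently.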
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/analyzers/gemini_cli.rs (1)
416-422: ⚠️ Potential issue | 🟠 Major

Update `get_data_glob_patterns` to include JSONL files and nested chat paths.

Line 421 only emits `.../chats/*.json`, omitting `.jsonl` files and nested sessions under `chats/**` directories. While `discover_data_sources` uses `WalkDir` with `is_gemini_cli_chat_path` validation (which correctly handles both formats and nested paths), the glob patterns should match the actual data locations supported by the analyzer. Other similar analyzers (`copilot_cli`, `codex_cli`, `claude_code`) all include nested paths with `**` or explicit `*/*` patterns and multiple file formats.

Suggested patch

     fn get_data_glob_patterns(&self) -> Vec<String> {
         let mut patterns = Vec::new();
         if let Some(home_dir) = dirs::home_dir() {
             let home_str = home_dir.to_string_lossy();
    +        patterns.push(format!("{home_str}/.gemini/tmp/*/chats/*.json"));
    +        patterns.push(format!("{home_str}/.gemini/tmp/*/chats/*.jsonl"));
    +        patterns.push(format!("{home_str}/.gemini/tmp/*/chats/**/*.json"));
    +        patterns.push(format!("{home_str}/.gemini/tmp/*/chats/**/*.jsonl"));
         }
         patterns
     }

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/analyzers/gemini_cli.rs` around lines 416 - 422, The current get_data_glob_patterns function only returns patterns matching chats/*.json; update it to also match .jsonl files and nested chat directories by adding glob entries such as "{home_str}/.gemini/tmp/**/chats/**/*.json", "{home_str}/.gemini/tmp/**/chats/**/*.jsonl" (and mirror any non-** variants you use elsewhere like "{home_str}/.gemini/tmp/*/chats/**/*.json" if you prefer single-level wildcards) so the returned Vec<String> covers both .json and .jsonl and nested sessions; modify the patterns pushed inside get_data_glob_patterns accordingly.
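As a rough illustration of what this comment asks for, the sketch below builds the four patterns from the suggested patch and pairs them with a simple path check. It is not the analyzer's code: `chat_glob_patterns` and `looks_like_chat_file` are hypothetical names, the home directory is passed in as a string so the example needs no external crate, and the path check is only a guess at the kind of validation `is_gemini_cli_chat_path` might perform.

```rust
use std::path::Path;

/// Build glob patterns covering flat and nested chat logs in both formats.
/// Mirrors the patterns in the suggested patch; `home_str` is an input here
/// so the sketch does not depend on the `dirs` crate.
fn chat_glob_patterns(home_str: &str) -> Vec<String> {
    let mut patterns = Vec::new();
    for ext in ["json", "jsonl"] {
        // Files directly under chats/.
        patterns.push(format!("{home_str}/.gemini/tmp/*/chats/*.{ext}"));
        // Files in nested directories under chats/.
        patterns.push(format!("{home_str}/.gemini/tmp/*/chats/**/*.{ext}"));
    }
    patterns
}

/// Hypothetical path check (an assumption, not the real validator):
/// accept `.json` or `.jsonl` files that sit somewhere below a `chats`
/// directory.
fn looks_like_chat_file(path: &Path) -> bool {
    let has_chat_ext = matches!(
        path.extension().and_then(|e| e.to_str()),
        Some("json") | Some("jsonl")
    );
    let under_chats = path
        .parent()
        .map(|dir| dir.components().any(|c| c.as_os_str() == "chats"))
        .unwrap_or(false);
    has_chat_ext && under_chats
}
```

Generating the patterns from a list of extensions keeps the flat and nested variants in sync if another format is added later.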
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/analyzers/tests/copilot.rs`:
- Around line 203-232: Update the test
test_parse_empty_copilot_session_without_requester_username to exercise the
parser tolerance by removing all formerly-required fields instead of only
responderUsername: construct the JSON fixture passed to
parse_copilot_session_file with responderUsername, initialLocation, and the
other previously-required field (e.g., inputState) omitted (or add separate
tests that omit each required field combination) so the test asserts the parser
accepts sessions missing all those optional fields.
---
Outside diff comments:
In `@src/analyzers/gemini_cli.rs`:
- Around line 416-422: The current get_data_glob_patterns function only returns
patterns matching chats/*.json; update it to also match .jsonl files and nested
chat directories by adding glob entries such as
"{home_str}/.gemini/tmp/**/chats/**/*.json",
"{home_str}/.gemini/tmp/**/chats/**/*.jsonl" (and mirror any non-** variants you
use elsewhere like "{home_str}/.gemini/tmp/*/chats/**/*.json" if you prefer
single-level wildcards) so the returned Vec<String> covers both .json and .jsonl
and nested sessions; modify the patterns pushed inside get_data_glob_patterns
accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 672706b6-d339-4247-8d2a-16cb81c728d8
📒 Files selected for processing (4)
src/analyzers/copilot.rs
src/analyzers/gemini_cli.rs
src/analyzers/tests/copilot.rs
src/analyzers/tests/gemini_cli.rs
Cover Copilot sessions that omit multiple optional metadata fields so the parser regression test matches the intended tolerance. Signed-off-by: jimyag <git@jimyag.com>
@coderabbitai review it

✅ Actions performed
Review triggered.

@coderabbitai Review

✅ Actions performed
Review triggered.



Summary
- `.jsonl` session logs and nested `chats/` subdirectories
- `warning` messages and `$set` metadata updates while keeping the latest message version per id

Root Cause
Recent Gemini CLI releases changed chat persistence from a single `.json` object to incremental `.jsonl` logs and introduced nested subagent session paths. Newer Copilot session files can also omit fields that the parser previously required even when the session has no requests.

Validation
Fixes #154
Summary by CodeRabbit
New Features
Bug Fixes
Tests