Store subagent transcripts as separate prompt records #562
Open
I'm not a huge fan of this Claude... May have mentioned it elsewhere, but can we add them as normal prompts into the authorship log but just include a |
Claude Code stores subagent (Task tool) transcripts in separate JSONL files at <session-uuid>/subagents/agent-<id>.jsonl, but the transcript parser only read the main session file. This meant all subagent conversation content was silently dropped from git-ai authorship records.
Extract the JSONL line parsing into a reusable parse_claude_jsonl_content helper, then after parsing the main transcript, discover and parse any subagent JSONL files from the sibling subagents directory. Subagent messages are appended to the main transcript in sorted filename order for deterministic results.
Fixes #371
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
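The discovery step in this commit can be sketched as follows. This is a minimal illustration, not the code from the PR: the function name discover_subagent_files is hypothetical, and the caller is assumed to already know the path of the subagents directory. It shows the two properties the commit message calls out: only agent-<id>.jsonl files are picked up, and the result is sorted so the order does not depend on the filesystem.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Collect `agent-*.jsonl` files from a subagents directory, sorted by name.
/// Hypothetical helper: a missing directory is treated as "no subagents",
/// not as an error.
fn discover_subagent_files(subagents_dir: &Path) -> io::Result<Vec<PathBuf>> {
    if !subagents_dir.is_dir() {
        return Ok(Vec::new());
    }
    let mut files = Vec::new();
    for entry in fs::read_dir(subagents_dir)? {
        let path = entry?.path();
        let is_agent_file = path
            .file_name()
            .and_then(|name| name.to_str())
            .map_or(false, |name| {
                name.starts_with("agent-") && name.ends_with(".jsonl")
            });
        if path.is_file() && is_agent_file {
            files.push(path);
        }
    }
    // read_dir order is platform-dependent, so sort for deterministic results.
    files.sort();
    Ok(files)
}
```

Each returned path would then be read and passed to the JSONL line parser, one file at a time.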
Adds an optional parent_id field to both PromptRecord (git notes) and PromptDbRecord (SQLite). This links subagent prompt records back to their parent prompt, enabling hierarchical transcript storage.
Includes DB migration 3→4 (ALTER TABLE prompts ADD COLUMN parent_id) and updates all construction sites with parent_id: None.
Refs: #371
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
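To illustrate what an optional parent_id enables, here is a small sketch that rebuilds the parent/child grouping from a flat list of records. The PromptRecord below is a simplified stand-in with only two fields, and children_by_parent is a hypothetical helper, not something taken from the PR.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for a prompt record; the real struct has more fields.
#[derive(Debug, Clone, PartialEq)]
struct PromptRecord {
    id: String,
    /// `None` for top-level prompts, `Some(parent id)` for subagent prompts.
    parent_id: Option<String>,
}

/// Group the ids of child records under the id of their parent prompt.
/// Records without a parent do not appear as children anywhere.
fn children_by_parent(records: &[PromptRecord]) -> BTreeMap<String, Vec<String>> {
    let mut tree: BTreeMap<String, Vec<String>> = BTreeMap::new();
    for record in records {
        if let Some(parent) = &record.parent_id {
            tree.entry(parent.clone())
                .or_default()
                .push(record.id.clone());
        }
    }
    tree
}
```

Because the link is stored on the child, existing top-level records need no change beyond parent_id: None.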
Instead of merging Claude Code subagent messages into the parent transcript, each subagent now produces a separate PromptRecord with parent_id linking it to the parent prompt.
- Add SubagentInfo struct; parser returns subagents separately
- Propagate subagents through PromptUpdateResult pipeline
- Serialize subagent info into checkpoint agent_metadata
- Expand into separate PromptDbRecords at post-commit DB upsert
- Expand into separate PromptRecords in VirtualAttributions
Fixes: #371
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
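The expansion step listed above might look roughly like this. Everything here is assumed for illustration: the field names on SubagentInfo, the reduced PromptRecord, and the id scheme (hashing the parent id together with the agent id). The PR only states that each expanded record gets a unique hash ID and a parent_id.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical shape of the per-subagent data returned by the parser.
struct SubagentInfo {
    agent_id: String,
    messages: Vec<String>,
}

/// Simplified record; the real one carries more fields.
#[derive(Debug, PartialEq)]
struct PromptRecord {
    id: String,
    parent_id: Option<String>,
    messages: Vec<String>,
}

/// Derive a record id from the parent id and the agent id (assumed scheme).
/// DefaultHasher is not guaranteed stable across Rust releases, so a real
/// implementation would want a fixed hash function for persisted ids.
fn subagent_record_id(parent_id: &str, agent_id: &str) -> String {
    let mut hasher = DefaultHasher::new();
    parent_id.hash(&mut hasher);
    agent_id.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}

/// Turn each subagent into its own record that points back at the parent.
fn expand_subagents(parent_id: &str, subagents: Vec<SubagentInfo>) -> Vec<PromptRecord> {
    subagents
        .into_iter()
        .map(|sub| PromptRecord {
            id: subagent_record_id(parent_id, &sub.agent_id),
            parent_id: Some(parent_id.to_string()),
            messages: sub.messages,
        })
        .collect()
}
```

Keeping the subagent messages out of the parent record means the parent transcript contains only main-session messages.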
Update all test files for the new 3-tuple return type from transcript_and_model_from_claude_code_jsonl and the parent_id field on PromptRecord/PromptDbRecord. The subagent test now verifies that subagents are returned separately (not merged) and that the main transcript contains only main-session messages.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Run cargo fmt on all changed files. Fix test_claude_transcript_parsing_malformed_json, which incorrectly expected Err: the parser skips unparseable lines by design, returning Ok with an empty transcript.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
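The skip-on-failure behavior that this test fix relies on can be shown with a dependency-free sketch. The names parse_lines_lenient and parse_int_line are hypothetical, and the per-line parser here accepts integers purely as a stand-in for JSON deserialization. The point is only that a rejected line is dropped instead of turning the whole parse into an error.

```rust
/// Parse each non-empty line with `parse_line`, silently skipping lines
/// that it rejects. Malformed input yields fewer items, never an error.
fn parse_lines_lenient<T>(content: &str, parse_line: impl Fn(&str) -> Option<T>) -> Vec<T> {
    content
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty())
        .filter_map(|line| parse_line(line))
        .collect()
}

/// Stand-in line parser: accepts only lines that hold a single integer.
/// A real transcript parser would deserialize a JSON object here instead.
fn parse_int_line(line: &str) -> Option<i64> {
    line.parse().ok()
}
```

With this shape, input made up entirely of malformed lines produces an empty result, which matches the corrected expectation of Ok with an empty transcript.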
Update test_initialize_schema to expect version "4" instead of "3" since SCHEMA_VERSION was bumped when adding the parent_id column.
Fix test_initialize_schema_handles_preexisting_cas_cache_table to include the prompts and cas_sync_queue tables in its setup, since the test simulates being at schema version 2 (meaning migrations 0->1 and 1->2 have already run). Without these tables, migration 3->4 (ALTER TABLE prompts ADD COLUMN parent_id) fails.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Sasha Varlamov <sasha@sashavarlamov.com>
Summary
- Parse subagent transcripts from <session>/subagents/agent-<id>.jsonl into separate SubagentInfo structs instead of merging them into the parent transcript
- Add a parent_id field to PromptRecord and PromptDbRecord to link subagent prompt records back to their parent
- apply_migration() now catches "duplicate column name" errors from ALTER TABLE ADD COLUMN, so a crash between the DDL and the version bump doesn't brick the database on restart
Closes #371
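The idempotent migration in the last bullet can be sketched without a database by passing the statement executor in as a closure. This is an assumption-laden illustration: apply_add_column_migration and is_duplicate_column_error are hypothetical names, errors are modeled as plain strings, and the real apply_migration() in the PR may be structured differently. The matched text corresponds to SQLite's "duplicate column name: <column>" error message.

```rust
/// True when a SQLite error message says the column already exists.
fn is_duplicate_column_error(message: &str) -> bool {
    message.to_lowercase().contains("duplicate column name")
}

/// Run one ADD COLUMN statement through `execute`, treating
/// "column already exists" as success so a re-run after a crash
/// between the DDL and the version bump does not fail.
fn apply_add_column_migration<E>(sql: &str, mut execute: E) -> Result<(), String>
where
    E: FnMut(&str) -> Result<(), String>,
{
    match execute(sql) {
        Ok(()) => Ok(()),
        // The column was added by an earlier, partially completed run.
        Err(msg) if is_duplicate_column_error(&msg) => Ok(()),
        // Any other failure is still reported to the caller.
        Err(msg) => Err(msg),
    }
}
```

The trade-off is that the check depends on the wording of the error message, so only that one specific error is swallowed and everything else still propagates.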
Design
Subagent transcripts stored in Claude Code's subagents/ directory are now collected as SubagentInfo structs during JSONL parsing rather than being flattened into the parent transcript. These are propagated through the checkpoint pipeline via PromptUpdateResult::Updated and stored as JSON metadata (__subagents key) on the parent checkpoint.
At post-commit time (and in virtual attributions), the subagent metadata is expanded into separate PromptDbRecord entries, each with a unique hash ID and a parent_id linking back to the parent prompt. This preserves the natural thread structure of agentic coding sessions.
The DB schema is migrated from version 3 to 4 to add the parent_id TEXT column to the prompts table. The migration is idempotent: if the column already exists (e.g. from a previous partial migration), the resulting error is caught and the step is skipped.
Updates since last revision
- Merged main into the branch and resolved merge conflicts in tests/claude_code.rs (added new subagent test names to the reuse_tests_in_worktree! macro)
- Made apply_migration() idempotent by catching "duplicate column name" SQLite errors instead of crashing
- Added test_initialize_schema_handles_preexisting_parent_id_column to verify the idempotency path
- Fixed a missing parent_id: None in the tests/worktrees.rs PromptRecord initializer (compilation error)
- [...] tests/claude_code.rs)
Test plan
- cargo clippy: no warnings
- cargo test --test claude_code: all 24 tests pass
- cargo test --test agent_presets_comprehensive: all 58 tests pass
- test_parse_claude_code_jsonl_with_subagents verifies subagents are returned separately
- test_parse_claude_code_jsonl_without_subagents_dir verifies empty vec for no subagents
- test_initialize_schema_handles_preexisting_parent_id_column verifies idempotent migration
- Test fixtures: claude-code-with-subagents.jsonl and subagents/agent-test-sub-1.jsonl
Review & Testing Checklist for Human
There are 4 items to verify given the scope of DB schema + query changes:
- Verify that catching "duplicate column name" via string matching is acceptable for SQLite migrations (this is a common pattern but relies on error message text)
- Verify that all SELECT, INSERT, and UPDATE queries in internal_db.rs include the new parent_id column in the correct position (column order matters for positional binding)
- Confirm that all PromptRecord initializers in test files include parent_id: None (compilation would fail if any were missed, but worth double-checking)
- Check test_parse_claude_code_jsonl_with_subagents to ensure subagent messages are correctly separated from the parent transcript and not duplicated
Notes
- The parent_id field is Option<String> with #[serde(default)], so existing serialized data without this field will deserialize correctly
🤖 Generated with Devin
Requested by: @svarlamov