Description
Implement a GitHub Context Subagent that indexes GitHub data (issues, PRs, discussions) and makes it semantically searchable. This extends dev-agent's context provision capabilities from code files to GitHub metadata, enabling AI assistants to have complete project context.
Core Mission Alignment
Goal: Provide relevant context to AI tools, reducing hallucinations
Current: We index code files and documentation
Missing: GitHub issues, PRs, discussions, comments
Solution: Index GitHub data like we index code
Acceptance Criteria
GitHub Data Indexing:
- Fetch issues via GitHub CLI (gh)
- Fetch PRs with metadata (commits, reviews, comments)
- Index issue/PR descriptions as vector embeddings
- Store GitHub metadata (labels, status, timestamps, links)
- Support incremental updates (only fetch changed data)
Semantic Search:
- Search issues: dev gh search "authentication bug"
- Search PRs: dev gh search --type pr "performance"
- Find related issues: dev gh related --issue 42
- Combined search (code + GitHub): dev search "oauth" --include-github
Context Provision:
- Get issue context: dev gh context --issue 42
- Returns: issue data, related PRs, linked code files, discussions
- Integrate with Planner (better issue understanding)
- Integrate with Explorer (find related issues to code patterns)
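The related items and linked files returned by the context command have to be derived from issue and PR text. A minimal sketch of that extraction, assuming references appear as #N and that file paths end in one of a small set of extensions; extractReferences and the extension list are hypothetical names chosen for illustration, not existing dev-agent code.

```typescript
// Hypothetical helper: pulls "#N" references and file-like paths out of free text.
interface ExtractedReferences {
  numbers: number[];     // referenced issue/PR numbers, deduplicated, in order of appearance
  linkedFiles: string[]; // path-like tokens ending in a known extension
}

function extractReferences(text: string): ExtractedReferences {
  // Patterns are created per call so the global-flag lastIndex state is never shared.
  const numberPattern = /(?:^|[^\w])#(\d+)\b/g;
  // Assumption: only these extensions count as linked code or docs files.
  const filePattern = /(?:[\w.-]+\/)*[\w.-]+\.(?:tsx|ts|js|json|md)\b/g;

  const numbers: number[] = [];
  const linkedFiles: string[] = [];
  let match: RegExpExecArray | null;

  while ((match = numberPattern.exec(text)) !== null) {
    const value = Number(match[1]);
    if (numbers.indexOf(value) === -1) {
      numbers.push(value);
    }
  }

  while ((match = filePattern.exec(text)) !== null) {
    if (linkedFiles.indexOf(match[0]) === -1) {
      linkedFiles.push(match[0]);
    }
  }

  return { numbers, linkedFiles };
}
```

Issues and PRs share one number sequence per repository, so the extracted numbers would still need to be looked up in the index to split them into relatedIssues and relatedPRs.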
General:
- Integrates with subagent coordinator
- Uses existing vector storage (LanceDB)
- Respects .gitignore and privacy settings
- Handles rate limiting gracefully
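One way to handle rate limiting gracefully is to retry failed fetches with exponential backoff. The sketch below is an assumption about how that could look, not a description of existing code: withRetry, backoffDelay, and RetryOptions are hypothetical names, and for brevity it retries on any error rather than only on rate-limit responses.

```typescript
// Hypothetical retry options; none of these names come from dev-agent.
interface RetryOptions {
  maxAttempts: number; // total tries, including the first one
  baseDelayMs: number; // wait before the second attempt
  sleep?: (ms: number) => Promise<void>; // injectable so tests do not wait for real time
}

// attempt is 1-based: 1 -> base, 2 -> 2x base, 3 -> 4x base, and so on.
function backoffDelay(attempt: number, baseDelayMs: number): number {
  return baseDelayMs * Math.pow(2, attempt - 1);
}

// Runs the operation, waiting longer after each failure, and rethrows the last error
// once maxAttempts is exhausted.
async function withRetry<T>(operation: () => Promise<T>, options: RetryOptions): Promise<T> {
  const sleep =
    options.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
  let lastError: unknown = new Error("maxAttempts must be at least 1");

  for (let attempt = 1; attempt <= options.maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt < options.maxAttempts) {
        await sleep(backoffDelay(attempt, options.baseDelayMs));
      }
    }
  }
  throw lastError;
}
```

A fetcher could wrap each gh invocation in withRetry; a real implementation would likely inspect the error first and only retry when it indicates a rate limit.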
Architecture
┌─────────────────────────────────────────────────────────┐
│  dev-agent                                              │
├─────────────────────────────────────────────────────────┤
│  Code Scanner   → Indexer → Vector Store (LanceDB)      │
│  GitHub Fetcher → Indexer → Vector Store (LanceDB)      │
│                                       ↓                 │
│                                Semantic Search          │
│                                (code + GitHub)          │
└─────────────────────────────────────────────────────────┘
Use Cases
1. Issue Context for AI
dev gh context --issue 42
# Returns full context:
# - Issue description
# - Related issues (#35, #28)
# - Related PRs (#40)
# - Affected code files (via links/mentions)
# - Discussion threads
2. Enhanced Planning
dev plan 42
# Planner now has access to:
# - Full issue description (not just gh CLI output)
# - Related issues for context
# - Previous similar work (from closed issues/PRs)
3. Pattern Discovery
dev explore pattern "error handling"
# Explorer finds:
# - Code patterns
# - Related issues discussing error handling
# - PRs that improved error handling
4. Knowledge Base
dev gh search "how do we handle rate limiting"
# Searches:
# - Issue discussions
# - PR descriptions
# - Code comments
# - Documentation
CLI Commands
# Indexing
dev gh index # Index all GitHub data
dev gh index --since 2024-01-01 # Incremental update
dev gh update # Refresh changed items
# Searching
dev gh search "query" # Search all GitHub data
dev gh search "query" --type issue
dev gh search "query" --type pr
# Context
dev gh context --issue 42 # Get full context for issue
dev gh related --issue 42 # Find related issues/PRs
# Stats
dev gh stats # Show indexed data stats
Technical Implementation
Phase 1: GitHub Fetcher (Day 1)
- Use gh CLI to fetch issues and PRs
- Parse JSON output into structured types
- Handle pagination and rate limits
- Store raw data with metadata
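A rough sketch of the fetch-and-parse step for issues, assuming the fetcher shells out to gh issue list with --json and normalizes the result. RawGhIssue reflects the JSON shape gh typically prints for these fields and should be verified against real output; IssueRecord, ghIssueListArgs, and parseGhIssues are hypothetical names.

```typescript
// Fields requested via `gh issue list --json <fields>`.
const ISSUE_FIELDS = [
  "number", "title", "body", "state", "labels",
  "author", "createdAt", "updatedAt", "comments",
];

// Assumed shape of one element of the gh JSON output for the fields above.
interface RawGhIssue {
  number: number;
  title: string;
  body: string;
  state: string; // gh reports upper-case values such as "OPEN" and "CLOSED"
  labels: { name: string }[];
  author: { login: string } | null;
  createdAt: string;
  updatedAt: string;
  comments: unknown[];
}

// Hypothetical normalized record covering part of the data model.
interface IssueRecord {
  type: "issue";
  number: number;
  title: string;
  body: string;
  state: "open" | "closed";
  labels: string[];
  author: string;
  createdAt: string;
  updatedAt: string;
  comments: number;
}

// Arguments to pass to the gh executable, e.g. via child_process.execFile("gh", args).
function ghIssueListArgs(limit: number): string[] {
  return [
    "issue", "list",
    "--state", "all",
    "--limit", String(limit),
    "--json", ISSUE_FIELDS.join(","),
  ];
}

// Converts the JSON text printed by gh into normalized records.
function parseGhIssues(jsonText: string): IssueRecord[] {
  const raw = JSON.parse(jsonText) as RawGhIssue[];
  return raw.map((issue): IssueRecord => ({
    type: "issue",
    number: issue.number,
    title: issue.title,
    body: issue.body,
    state: issue.state.toUpperCase() === "OPEN" ? "open" : "closed",
    labels: issue.labels.map((label) => label.name),
    author: issue.author ? issue.author.login : "unknown",
    createdAt: issue.createdAt,
    updatedAt: issue.updatedAt,
    comments: issue.comments.length,
  }));
}
```

Keeping the parser separate from the process call means it can be tested against saved JSON fixtures without gh being installed.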
Phase 2: Indexing (Day 1-2)
- Extract text from issues/PRs for embedding
- Generate vectors using Transformers.js (same as code)
- Store in LanceDB with GitHub-specific metadata
- Implement incremental updates
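The text extraction and incremental update steps could look roughly like this. It is a sketch under stated assumptions: IndexableItem, toEmbeddingText, and selectChanged are hypothetical names, the 2000-character budget is an arbitrary placeholder, and the embedding call itself (Transformers.js) is left out.

```typescript
// Hypothetical subset of the data model needed for indexing.
interface IndexableItem {
  number: number;
  title: string;
  body: string;
  labels: string[];
  updatedAt: string; // ISO 8601 timestamp
}

// Builds the text handed to the embedding model: title, then labels, then body.
// maxChars is an assumed budget, not a limit taken from dev-agent.
function toEmbeddingText(item: IndexableItem, maxChars: number = 2000): string {
  const parts = [item.title];
  if (item.labels.length > 0) {
    parts.push("Labels: " + item.labels.join(", "));
  }
  const body = item.body.trim();
  if (body.length > 0) {
    parts.push(body);
  }
  return parts.join("\n\n").slice(0, maxChars);
}

// Returns items that are new, or whose updatedAt is later than the timestamp
// recorded for them at the previous indexing run (keyed by issue/PR number).
function selectChanged(
  items: IndexableItem[],
  lastIndexed: Record<number, string>
): IndexableItem[] {
  return items.filter((item) => {
    const previous = lastIndexed[item.number];
    return previous === undefined || Date.parse(item.updatedAt) > Date.parse(previous);
  });
}
```

Only the items returned by selectChanged would need to be re-embedded and rewritten in the vector store.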
Phase 3: Search & Context (Day 2)
- Semantic search over GitHub data
- Context assembly (issue + related items)
- Integration with existing search
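To make the ranking step concrete, here is a small in-memory sketch of cosine-similarity search over embedded documents. In practice the vector store (LanceDB) would perform this lookup; the code only illustrates the scoring, and EmbeddedDocument, SearchHit, and the id scheme are hypothetical.

```typescript
// Hypothetical in-memory stand-ins for stored vectors and search results.
interface EmbeddedDocument {
  id: string; // e.g. "issue-42" (assumed id scheme)
  vector: number[];
}

interface SearchHit {
  id: string;
  score: number; // cosine similarity in [-1, 1]
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat it as unrelated to everything.
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / Math.sqrt(normA * normB);
}

// Scores every document against the query vector and keeps the topK best matches.
function rankBySimilarity(
  query: number[],
  docs: EmbeddedDocument[],
  topK: number
): SearchHit[] {
  return docs
    .map((doc) => ({ id: doc.id, score: cosineSimilarity(query, doc.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

Because code and GitHub documents share the same embedding model, a combined search could rank both kinds of document in a single pass like this.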
Phase 4: Agent Integration (Day 3)
- Expose via Subagent Coordinator
- Integrate with Planner agent
- Integrate with Explorer agent
- Message-based communication
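For the message-based integration, one possible shape is a pair of discriminated unions and a single handler. Everything here is hypothetical: the actual protocol belongs to the Subagent Coordinator (#7), and GitHubAgentRequest, GitHubAgentResponse, GitHubIndex, and handleRequest are illustrative names only.

```typescript
// Hypothetical request messages the coordinator might send to the GitHub subagent.
type GitHubAgentRequest =
  | { kind: "search"; query: string }
  | { kind: "context"; issue: number };

// Hypothetical responses sent back to the coordinator.
type GitHubAgentResponse =
  | { kind: "search-result"; numbers: number[] }
  | { kind: "context-result"; issue: number; related: number[] }
  | { kind: "error"; message: string };

// Minimal index interface so the handler can be exercised with a stub.
interface GitHubIndex {
  search(query: string): number[];
  related(issue: number): number[] | undefined; // undefined when the issue is not indexed
}

// Dispatches on the message kind and always answers with a response message.
function handleRequest(
  request: GitHubAgentRequest,
  index: GitHubIndex
): GitHubAgentResponse {
  switch (request.kind) {
    case "search":
      return { kind: "search-result", numbers: index.search(request.query) };
    case "context": {
      const related = index.related(request.issue);
      if (related === undefined) {
        return { kind: "error", message: "Issue #" + request.issue + " is not indexed" };
      }
      return { kind: "context-result", issue: request.issue, related };
    }
  }
}
```

Planner and Explorer would then depend only on the message types, not on how the index is implemented.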
Data Model
interface GitHubDocument {
type: 'issue' | 'pr' | 'discussion';
number: number;
title: string;
body: string;
state: 'open' | 'closed';
labels: string[];
author: string;
createdAt: string;
updatedAt: string;
comments: number;
relatedIssues: number[]; // Extracted from links
relatedPRs: number[]; // Extracted from links
linkedFiles: string[]; // Mentioned in issue/PR
}
Dependencies
- Requires: gh CLI installed and authenticated
- Uses: Existing indexer and vector storage
- Integrates with: Planner (Implement Planner Subagent #8), Explorer (Implement Explorer Subagent #9), Coordinator (Implement Subagent Coordinator #7)
Success Metrics
- AI assistants can find relevant issues/PRs for any query
- Planner generates better task breakdowns with GitHub context
- Explorer discovers cross-cutting concerns (code + issues)
- Developers spend less time searching GitHub manually
Future Enhancements
- Index PR review comments
- Index discussion threads
- Track issue relationships (blocks/blocked-by)
- Temporal analysis (issue trends over time)
- Integration with GitHub Projects
Branch: feat/github-context-subagent
Priority: High (enables better AI assistance)
Estimate: 3 days
Parent Epic: #1 (Core Context Provider)