feat(core): Incremental indexing via rebuild strategy #122

@prosdev

Summary

Fix the existing update() method so that incremental indexing actually works. Most of the infrastructure already exists.

What Already Works

  • ✅ State management with file hashes (indexer-state.json)
  • ✅ detectChangedFiles() using SHA-256 comparison
  • ✅ update() method orchestration
  • ✅ Doc IDs contain file path (e.g., src/auth.ts:func:10)
  • ✅ CLI dev update command
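
For reference, the hash comparison that the existing pieces rely on can be sketched roughly as below. The `IndexerState` shape and the `hashContent` and `hasChanged` helpers are assumptions made for illustration; the real layout of indexer-state.json and the real detectChangedFiles() may differ.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape of indexer-state.json: file path -> hex SHA-256 of contents.
interface IndexerState {
  files: Record<string, string>;
}

// Hash file contents so a later run can tell whether the file changed.
function hashContent(content: string): string {
  return createHash("sha256").update(content, "utf8").digest("hex");
}

// True when the stored hash differs from the current one.
// A path with no stored hash also returns true, since undefined never equals a hash.
function hasChanged(state: IndexerState, filePath: string, content: string): boolean {
  return state.files[filePath] !== hashContent(content);
}
```

Note that a comparison like this only helps for paths that are actually checked, which is why iterating only state.files misses new files.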

What's Broken

  1. delete() throws an error: it was never implemented
  2. New files are not detected: detection only iterates state.files, so files added since the last index are missed
  3. Deleted files leave orphans: their docs are never removed from the vector store

Implementation

1. Implement delete by file path prefix (~1 hour)

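// In store.ts
// Deletes every doc whose ID starts with `<filePrefix>:`.
async deleteByFilePrefix(filePrefix: string): Promise<void> {
  // Escape single quotes so a path cannot break out of the SQL string literal.
  // Note: `%` and `_` in a path still act as LIKE wildcards here.
  const escaped = filePrefix.replace(/'/g, "''");
  await this.table.delete(`id LIKE '${escaped}:%'`);
}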
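
One caveat with a LIKE predicate is that `_` and `%` are wildcards, so a path such as src/my_file.ts would also match IDs under src/my-file.ts. A possible way to build a stricter predicate is sketched below. `buildFilePrefixPredicate` is a hypothetical helper, and the sketch assumes the store's filter dialect accepts the standard SQL `ESCAPE` clause and does not treat backslash specially inside string literals; both should be checked against the actual store.

```typescript
// Hypothetical helper: builds the delete predicate for all docs of one file.
// Assumes standard SQL LIKE ... ESCAPE semantics in the store's filter dialect.
function buildFilePrefixPredicate(filePrefix: string): string {
  const escaped = filePrefix
    .replace(/\\/g, "\\\\")            // escape the escape character itself first
    .replace(/[%_]/g, (m) => "\\" + m) // neutralise the LIKE wildcards
    .replace(/'/g, "''");              // double single quotes for the SQL literal
  return `id LIKE '${escaped}:%' ESCAPE '\\'`;
}
```

With this, deleteByFilePrefix() could pass the returned string to this.table.delete() in place of building the pattern inline.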

2. Add new file detection (~30 min)

// In detectChangedFiles()
// Also scan for files NOT in state.files
const allFiles = await scanForFiles(repoPath);
const newFiles = allFiles.filter(f => !state.files[f]);

3. Handle deleted files (~30 min)

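// Files in state but no longer on disk.
// Resolve against repoPath in case state keys are repo-relative (assumption; needs `import path from "node:path"`).
const deletedFiles = Object.keys(state.files)
  .filter(f => !existsSync(path.resolve(repoPath, f)));
// Delete their docs from the vector store (`store` is a placeholder for the store instance)
for (const f of deletedFiles) {
  await store.deleteByFilePrefix(f);
}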
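
Steps 2 and 3 can also be viewed as one comparison between the saved state and a fresh scan. The sketch below is a pure function under that assumption; `classifyFiles` and `FileChanges` are hypothetical names, and both inputs are assumed to map file path to content hash.

```typescript
interface FileChanges {
  changed: string[];
  added: string[];
  deleted: string[];
}

// Own-property check, so paths like "constructor" are not confused with prototype members.
const has = (obj: Record<string, string>, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(obj, key);

// previous: path -> hash from the saved state; current: path -> hash from a fresh scan.
function classifyFiles(
  previous: Record<string, string>,
  current: Record<string, string>,
): FileChanges {
  const result: FileChanges = { changed: [], added: [], deleted: [] };
  for (const p of Object.keys(current)) {
    if (!has(previous, p)) result.added.push(p);                  // on disk, not in state
    else if (previous[p] !== current[p]) result.changed.push(p);  // hash differs
  }
  for (const p of Object.keys(previous)) {
    if (!has(current, p)) result.deleted.push(p);                 // in state, not on disk
  }
  return result;
}
```

Under this framing, changed and deleted files both need their docs removed from the store, and changed and added files both need to be indexed.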

Success Criteria

  • dev update detects changed files
  • dev update detects new files
  • dev update removes docs for deleted files
  • Orphaned symbols cleaned up when removed from file
  • --force still triggers full reindex
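
The orphaned-symbol criterion follows from rebuilding a file's docs wholesale: delete everything under the file's ID prefix, then insert the fresh parse. A minimal in-memory sketch of that idea follows; `InMemoryStore`, `Doc`, and `reindexFile` are hypothetical stand-ins for illustration, not the project's actual store API.

```typescript
interface Doc {
  id: string;   // e.g. src/auth.ts:func:10
  text: string;
}

// Hypothetical in-memory stand-in for the vector store.
class InMemoryStore {
  private docs: Doc[] = [];

  add(docs: Doc[]): void {
    this.docs.push(...docs);
  }

  // Remove every doc whose ID begins with `<filePrefix>:`.
  deleteByFilePrefix(filePrefix: string): void {
    const prefix = filePrefix + ":";
    this.docs = this.docs.filter((d) => d.id.slice(0, prefix.length) !== prefix);
  }

  ids(): string[] {
    return this.docs.map((d) => d.id).sort();
  }
}

// Rebuild one file: drop all of its docs, then insert the freshly parsed ones.
// Symbols that no longer exist in the file disappear because they are not re-added.
function reindexFile(store: InMemoryStore, filePath: string, freshDocs: Doc[]): void {
  store.deleteByFilePrefix(filePath);
  store.add(freshDocs);
}
```

The trade-off is that unchanged symbols in a changed file are re-embedded too, which keeps the logic simple at the cost of some redundant work.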

Out of Scope

  • Git commit SHA tracking (optimization, not required)
  • Watch mode

Parent Epic

Part of #104 - Performance & Reliability Critical Path
