Summary
BeginBulkWrite() switches the SQLite journal mode to MEMORY and synchronous to OFF for the duration of indexing. If the CMM process dies (SIGKILL, OOM kill, user closing the terminal) before EndBulkWrite() restores WAL mode, partially written B-tree pages are left in the main DB file, and the in-memory rollback journal that could have undone them is gone. The database cannot be recovered; the only remedy is to delete it and re-index.
Affected code
internal/store/store.go:
// line 223-225
func (s *Store) BeginBulkWrite(ctx context.Context) {
	_, _ = s.db.ExecContext(ctx, "PRAGMA journal_mode = MEMORY")
	_, _ = s.db.ExecContext(ctx, "PRAGMA synchronous = OFF")
}
// line 230-233
func (s *Store) EndBulkWrite(ctx context.Context) {
	_, _ = s.db.ExecContext(ctx, "PRAGMA synchronous = NORMAL")
	_, _ = s.db.ExecContext(ctx, "PRAGMA journal_mode = WAL")
}
Reproduction
- Start indexing a large repository (>50k files) with CMM
- Kill the process mid-index (kill -9 <pid>, OOM, or just close the terminal)
- Next session: any search_graph call returns "search: database disk image is malformed"
What we observed
- DB at ~/.cache/codebase-memory-mcp/<project>.db was 125MB, 93933 edges
- PRAGMA integrity_check returned dozens of "Tree XXXX page YYYY cell N: 2nd reference to page ZZZZ" errors, the classic signature of an interrupted B-tree write
- Specifically, idx_edges_url_path (a large B-tree index on edges.url_path) was corrupted
- No -wal or -shm companion files existed: WAL mode had already been abandoned when the crash occurred in MEMORY journal mode
- SELECT count(*) FROM nodes and SELECT count(*) FROM edges both fail, so the core tables are unreadable
- Only fix: delete_project + index_repository to rebuild from scratch
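The integrity_check result above can also be collected programmatically, for example to report a corrupted database at startup with a clear message instead of letting it surface as a search error. The following is a minimal sketch, assuming a *sql.DB opened with whichever SQLite driver the store already uses. integrityProblems and filterProblems are hypothetical names, not existing CMM functions.

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
)

// integrityProblems runs PRAGMA integrity_check and returns every problem
// SQLite reports. An empty result means the database passed the check.
// Hypothetical helper: db is assumed to be opened by the caller.
func integrityProblems(ctx context.Context, db *sql.DB) ([]string, error) {
	rows, err := db.QueryContext(ctx, "PRAGMA integrity_check")
	if err != nil {
		return nil, fmt.Errorf("integrity_check: %w", err)
	}
	defer rows.Close()

	var lines []string
	for rows.Next() {
		var line string
		if err := rows.Scan(&line); err != nil {
			return nil, fmt.Errorf("integrity_check: %w", err)
		}
		lines = append(lines, line)
	}
	if err := rows.Err(); err != nil {
		return nil, fmt.Errorf("integrity_check: %w", err)
	}
	return filterProblems(lines), nil
}

// filterProblems drops the single "ok" row that SQLite returns for a
// healthy database and passes every other row through unchanged.
func filterProblems(lines []string) []string {
	if len(lines) == 1 && lines[0] == "ok" {
		return nil
	}
	return lines
}

func main() {
	// No SQLite driver is imported in this sketch, so main only exercises
	// the pure helper with the two shapes of output described above.
	healthy := filterProblems([]string{"ok"})
	broken := filterProblems([]string{
		"Tree XXXX page YYYY cell N: 2nd reference to page ZZZZ",
	})
	fmt.Println(len(healthy), len(broken))
}
```

On a badly damaged file the PRAGMA itself may fail rather than return rows, so a caller would treat a non-nil error the same way as a non-empty problem list.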
Why it matters
In MEMORY journal mode SQLite keeps the rollback journal in RAM and writes modified pages straight to the main DB file during the transaction. If the process exits abnormally, the journal disappears with it and nothing is left to roll the half-written pages back. For large codebases that take minutes to index, the probability of an abnormal exit (OOM kill, user interrupt, power loss) during that window is not negligible.
WAL mode (the mode this store normally runs in, and the one EndBulkWrite() restores; SQLite's own default is DELETE) is crash-safe: an interrupted write leaves uncommitted frames in the -wal file, which SQLite simply ignores on next open. Switching to MEMORY mode removes this safety.
Suggested fix
Remove the journal_mode = MEMORY switch in BeginBulkWrite(). WAL mode with a larger cache_size and synchronous = NORMAL should provide most of the bulk-write throughput while remaining crash-safe:
func (s *Store) BeginBulkWrite(ctx context.Context) {
	// Keep WAL mode: switching to MEMORY disables crash recovery.
	// NORMAL rather than OFF, so the database also stays consistent across power loss.
	_, _ = s.db.ExecContext(ctx, "PRAGMA synchronous = NORMAL")
	_, _ = s.db.ExecContext(ctx, "PRAGMA cache_size = -65536") // 64 MiB page cache (negative value = size in KiB)
}
If the MEMORY journal speedup is significant enough to keep, an alternative is the atomic swap pattern: index into a temp .db file, rename over the old file only on successful completion. This preserves the old DB if indexing is interrupted.
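The atomic swap pattern can be sketched with the standard library alone. This is an illustration under stated assumptions, not CMM's implementation: rebuildAtomically, the build callback and the ".indexing" suffix are hypothetical, the demo uses plain files in place of real SQLite databases, and every connection to both files is assumed to be closed before the rename so that no -wal or -shm companion files are left behind.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// rebuildAtomically runs build against a temporary file in the same
// directory as finalPath, then renames the result over finalPath only if
// build succeeds. build stands in for the indexing run and must close its
// database handle before returning.
func rebuildAtomically(finalPath string, build func(tmpPath string) error) error {
	tmpPath := finalPath + ".indexing" // hypothetical suffix

	// Discard leftovers from an earlier run that was killed mid-build.
	if err := os.Remove(tmpPath); err != nil && !errors.Is(err, os.ErrNotExist) {
		return fmt.Errorf("remove stale temp file: %w", err)
	}
	if err := build(tmpPath); err != nil {
		_ = os.Remove(tmpPath) // best effort; the old database is untouched
		return fmt.Errorf("build: %w", err)
	}
	// A rename within one directory is atomic on POSIX filesystems.
	if err := os.Rename(tmpPath, finalPath); err != nil {
		return fmt.Errorf("swap: %w", err)
	}
	return nil
}

// runDemo exercises both outcomes in a scratch directory.
func runDemo() error {
	dir, err := os.MkdirTemp("", "swap-demo")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)

	final := filepath.Join(dir, "project.db")
	if err := os.WriteFile(final, []byte("old"), 0o600); err != nil {
		return err
	}

	// A failed build must leave the old file in place.
	failed := rebuildAtomically(final, func(tmp string) error {
		if err := os.WriteFile(tmp, []byte("partial"), 0o600); err != nil {
			return err
		}
		return errors.New("interrupted")
	})
	if failed == nil {
		return errors.New("expected the failed build to report an error")
	}
	if got, err := os.ReadFile(final); err != nil || string(got) != "old" {
		return fmt.Errorf("old file not preserved: %q, %v", got, err)
	}

	// A successful build replaces the old file.
	if err := rebuildAtomically(final, func(tmp string) error {
		return os.WriteFile(tmp, []byte("new"), 0o600)
	}); err != nil {
		return err
	}
	if got, err := os.ReadFile(final); err != nil || string(got) != "new" {
		return fmt.Errorf("new file not swapped in: %q, %v", got, err)
	}
	return nil
}

func main() {
	if err := runDemo(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("swap demo passed")
}
```

Because the temporary file is only ever written by the build step, it would not matter which journal mode that step uses: a kill mid-build leaves a stale temp file that the next run deletes, and the previous database stays readable throughout.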
Environment
- macOS 15.x (Darwin 25.2.0)
- CMM version: local build from ../codebase-memory-mcp
- Project: large Perl monorepo (~93k edges)