importPaths(): Seek over paths that are already in the store#236
Conversation
This speeds up re-importing a 15 GiB closure from 5.2s to 0.05s.
WalkthroughThe Changes
Sequence Diagram(s)sequenceDiagram
participant Importer
participant Store
participant Source
rect rgb(240, 248, 255)
Note over Importer,Source: Old Behavior (Always Import)
Importer->>Store: addToStore(info, source, NoRepair, checkSigs)
Store->>Source: Read NAR content
Source-->>Store: NAR data
Store-->>Importer: Path imported
end
rect rgb(240, 255, 240)
Note over Importer,Store: New Behavior (Conditional Import)
Importer->>Store: isValidPath(info.path)?
alt Path is Invalid
Store-->>Importer: false
Importer->>Importer: Log "importing path '...'"
Importer->>Store: addToStore(info, source, NoRepair, checkSigs)
Store->>Source: Read NAR content
Source-->>Store: NAR data
Store-->>Importer: Path imported
else Path is Valid
Store-->>Importer: true
Importer->>Source: skip(info.narSize)
Source-->>Importer: NAR skipped
end
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes The change introduces conditional import logic affecting a critical store operation, depends on new Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/libstore/export-import.cc (1)
147-147: Consider logging when skipping paths for better observability.Currently, the Activity log is only emitted when actually importing. When paths are skipped, the operation is silent. For debugging and monitoring, consider adding a debug-level log message when skipping.
Apply this diff to add optional logging:
store.addToStore(info, source, NoRepair, checkSigs); - } else + } else { + debug("skipping already-valid path '%s'", store.printStorePath(info.path)); source.skip(info.narSize); + }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/libstore/export-import.cc(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/libstore/export-import.cc (3)
src/nix/nario.cc (6)
store(62-69)store(62-62)store(96-100)store(96-96)info(268-288)info(269-269)src/libstore/include/nix/store/worker-protocol.hh (2)
store(123-123)store(124-124)src/libstore/include/nix/store/store-api.hh (3)
info(530-534)info(757-757)source(545-545)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build_aarch64-darwin / build
🔇 Additional comments (2)
src/libstore/export-import.cc (2)
141-145: No actionable issues found; code handles race condition safely through idempotent addToStore implementation.The TOCTOU window identified is real: another process could add the path between the
isValidPath()check and theaddToStore()call. However, verification confirms this is not a defect. LocalStore::addToStore uses a double-check-lock pattern—it re-checksisValidPath()after acquiring a lock, and with theNoRepairflag, if the path now exists, the operation gracefully skips import work rather than erroring. Remote stores delegate to their daemons with the same semantics. The operation is idempotent and handles duplicates safely.
141-147: Document the intentional verification bypass and its trust assumptions.The optimization skips NAR re-import when a path already exists, which intentionally bypasses signature verification (even when
checkSigsis true) and metadata re-validation (references, deriver, narHash). While Nix's content-addressed model ensures the path name matches the content hash, this means:
- Signature verification is skipped if the path already exists
- Metadata differences between import stream and stored path are silently ignored
- Silent behavior makes debugging harder (no activity log when skipping)
This is an acceptable tradeoff for performance if you trust the existing store state is correct. However, consider either:
- Adding a log entry when skipping to improve observability:
logger->log(lvlInfo, fmt("path '%s' already in store, skipping import", ...))- Documenting the trust assumption in a comment explaining why skipping verification is safe in your use case
- Optional: Conditionally verify narHash matches if
checkSigsis true, or add an optional strict-verify mode
Motivation
This speeds up re-importing a 15 GiB closure from 5.2s to 0.05s.
Context
Summary by CodeRabbit