importPaths(): Seek over paths that are already in the store #236

Merged: edolstra merged 1 commit into main from nario-import-optimization on Oct 21, 2025
edolstra (Collaborator) commented Oct 21, 2025

Motivation

This speeds up re-importing a 15 GiB closure from 5.2s to 0.05s.

Context

Summary by CodeRabbit

  • Bug Fixes
    • Optimized import-export operations to skip unnecessary re-imports of already-valid paths in the store, improving performance for repeated import operations.

coderabbitai Bot commented Oct 21, 2025

Walkthrough

The importPaths function in export-import version 2 now conditionally imports paths. Instead of always importing each path's NAR, it first checks path validity: if the path is already valid in the store, the NAR content is skipped; otherwise, the path is imported and logged.
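
The control flow described above can be sketched in isolation. This is a minimal illustration, not the code from the PR: `ToyStore`, `ToySource`, `PathInfo`, and `importOne` are hypothetical stand-ins, and only the method names `isValidPath`, `addToStore`, and `skip` and the field `narSize` come from the discussion.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical stand-ins for the real Nix types; only the control flow
// mirrors what the walkthrough describes.
struct PathInfo {
    std::string path;    // store path name
    std::size_t narSize; // size of the serialized NAR in bytes
};

// A toy byte stream that records how many bytes were read vs. skipped.
struct ToySource {
    std::string data;
    std::size_t pos = 0;
    std::size_t bytesRead = 0;
    std::size_t bytesSkipped = 0;

    std::string read(std::size_t n) {
        assert(pos + n <= data.size());
        std::string out = data.substr(pos, n);
        pos += n;
        bytesRead += n;
        return out;
    }

    void skip(std::size_t n) {
        assert(pos + n <= data.size());
        pos += n;
        bytesSkipped += n;
    }
};

struct ToyStore {
    std::set<std::string> valid;

    bool isValidPath(const std::string & path) const {
        return valid.count(path) > 0;
    }

    // Consumes the NAR from the source and registers the path as valid.
    void addToStore(const PathInfo & info, ToySource & source) {
        source.read(info.narSize);
        valid.insert(info.path);
    }
};

// Import one path: read the NAR only if the store does not already have it.
// Returns true if the path was actually imported.
bool importOne(ToyStore & store, ToySource & source, const PathInfo & info) {
    if (!store.isValidPath(info.path)) {
        store.addToStore(info, source);
        return true;
    }
    // Already valid: step over the NAR so the stream stays aligned
    // with the start of the next entry.
    source.skip(info.narSize);
    return false;
}
```

The skip in the else branch is not optional: the stream holds one NAR after another, so an entry that is not read still has to be stepped over before the next one can be parsed.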

Changes

| Cohort / File(s) | Summary |
|---|---|
| Export-Import Conditional Logic: src/libstore/export-import.cc | Modified importPaths to check path validity before importing: skips NAR reading for already-valid paths via source.skip(), imports and logs only for invalid paths. |

Sequence Diagram(s)

sequenceDiagram
    participant Importer
    participant Store
    participant Source

    rect rgb(240, 248, 255)
    Note over Importer,Source: Old Behavior (Always Import)
    Importer->>Store: addToStore(info, source, NoRepair, checkSigs)
    Store->>Source: Read NAR content
    Source-->>Store: NAR data
    Store-->>Importer: Path imported
    end

    rect rgb(240, 255, 240)
    Note over Importer,Store: New Behavior (Conditional Import)
    Importer->>Store: isValidPath(info.path)?
    alt Path is Invalid
        Store-->>Importer: false
        Importer->>Importer: Log "importing path '...'"
        Importer->>Store: addToStore(info, source, NoRepair, checkSigs)
        Store->>Source: Read NAR content
        Source-->>Store: NAR data
        Store-->>Importer: Path imported
    else Path is Valid
        Store-->>Importer: true
        Importer->>Source: skip(info.narSize)
        Source-->>Importer: NAR skipped
    end
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

The change introduces conditional import logic affecting a critical store operation, depends on new source.skip() functionality from a related PR, and requires verification of the path validity check and NAR skipping behavior.

Possibly related PRs

  • nix nario list improvements #232 — Introduces Source::skip() and FdSource::skip() functionality that this PR depends on for skipping NAR content of already-valid paths.
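
To illustrate why a dedicated skip operation matters for performance, here is a hedged sketch of the general idea, assuming a generic source falls back to reading and discarding bytes while a seekable source can simply move its position. The classes `ByteSource`, `CountingSource`, and `SeekableSource` are hypothetical and are not the `Source` or `FdSource` implementations from the referenced PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical base class: the default skip() reads and discards.
struct ByteSource {
    virtual ~ByteSource() = default;

    // Reads up to n bytes into buf; returns the number read (0 at end).
    virtual std::size_t read(char * buf, std::size_t n) = 0;

    virtual void skip(std::size_t n) {
        std::vector<char> scratch(4096);
        while (n > 0) {
            std::size_t got = read(scratch.data(), std::min(n, scratch.size()));
            if (got == 0)
                throw std::runtime_error("unexpected end of input while skipping");
            n -= got;
        }
    }
};

// In-memory source that counts the bytes copied out by read().
struct CountingSource : ByteSource {
    std::string data;
    std::size_t pos = 0;
    std::size_t bytesCopied = 0;

    explicit CountingSource(std::string d) : data(std::move(d)) {}

    std::size_t read(char * buf, std::size_t n) override {
        std::size_t got = std::min(n, data.size() - pos);
        std::memcpy(buf, data.data() + pos, got);
        pos += got;
        bytesCopied += got;
        return got;
    }
};

// Seekable variant: skip() only moves the position, similar in spirit
// to seeking within a regular file instead of reading it.
struct SeekableSource : CountingSource {
    using CountingSource::CountingSource;

    void skip(std::size_t n) override {
        assert(pos + n <= data.size());
        pos += n;
    }
};
```

With the fallback, skipping costs as much I/O as reading; with the seekable variant, no payload bytes are touched at all, which is consistent with the large speed-up reported in the PR description.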

Suggested reviewers

  • cole-h

Poem

🐰 A rabbit hops through the import-export dance,
"Why read what's already here?" it chants,
Skip the NAR if the path is sound,
Import fresh ones, keep store unbound!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The pull request title "importPaths(): Seek over paths that are already in the store" directly and accurately summarizes the main change in the changeset. The modification to the importPaths function conditionally skips (seeks over) paths that are already valid in the store rather than unconditionally importing them all, which is precisely what the title conveys. The title is concise, specific, and uses clear technical language that would allow a teammate reviewing the commit history to quickly understand the nature and benefit of the change without needing additional context. |
coderabbitai Bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/libstore/export-import.cc (1)

147-147: Consider logging when skipping paths for better observability.

Currently, the Activity log is only emitted when actually importing. When paths are skipped, the operation is silent. For debugging and monitoring, consider adding a debug-level log message when skipping.

Apply this diff to add optional logging:

             store.addToStore(info, source, NoRepair, checkSigs);
-        } else
+        } else {
+            debug("skipping already-valid path '%s'", store.printStorePath(info.path));
             source.skip(info.narSize);
+        }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4f0b708 and 64f751f.

📒 Files selected for processing (1)
  • src/libstore/export-import.cc (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/libstore/export-import.cc (3)
src/nix/nario.cc (6)
  • store (62-69)
  • store (62-62)
  • store (96-100)
  • store (96-96)
  • info (268-288)
  • info (269-269)
src/libstore/include/nix/store/worker-protocol.hh (2)
  • store (123-123)
  • store (124-124)
src/libstore/include/nix/store/store-api.hh (3)
  • info (530-534)
  • info (757-757)
  • source (545-545)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build_aarch64-darwin / build
🔇 Additional comments (2)
src/libstore/export-import.cc (2)

141-145: No actionable issues found; code handles race condition safely through idempotent addToStore implementation.

The TOCTOU window identified is real: another process could add the path between the isValidPath() check and the addToStore() call. However, verification confirms this is not a defect. LocalStore::addToStore uses a double-check-lock pattern—it re-checks isValidPath() after acquiring a lock, and with the NoRepair flag, if the path now exists, the operation gracefully skips import work rather than erroring. Remote stores delegate to their daemons with the same semantics. The operation is idempotent and handles duplicates safely.
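
The check-then-recheck pattern this comment refers to can be sketched generically as follows. This is an illustration under stated assumptions: `LockedStore` is a hypothetical class that uses a single in-process mutex, and it does not claim to reproduce how LocalStore actually locks paths.

```cpp
#include <cassert>
#include <mutex>
#include <set>
#include <string>

// Hypothetical store guarded by one mutex; not LocalStore's implementation.
class LockedStore {
    std::mutex mutex_;
    std::set<std::string> valid_;

public:
    int writes = 0; // how many times import work was actually performed

    bool isValidPath(const std::string & path) {
        std::lock_guard<std::mutex> lock(mutex_);
        return valid_.count(path) > 0;
    }

    // Idempotent add: validity is re-checked after the lock is taken, so a
    // caller whose earlier isValidPath() result has gone stale does no harm.
    void addToStore(const std::string & path) {
        assert(!path.empty());
        std::lock_guard<std::mutex> lock(mutex_);
        if (valid_.count(path) > 0)
            return; // added by someone else between the check and this call
        ++writes;
        valid_.insert(path);
    }
};
```

Calling `addToStore` twice for the same path performs the import work once, which is the property that makes the unlocked `isValidPath()` check in the caller safe to use as a fast path.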


141-147: Document the intentional verification bypass and its trust assumptions.

The optimization skips NAR re-import when a path already exists, which intentionally bypasses signature verification (even when checkSigs is true) and metadata re-validation (references, deriver, narHash). While Nix's content-addressed model ensures the path name matches the content hash, this means:

  • Signature verification is skipped if the path already exists
  • Metadata differences between import stream and stored path are silently ignored
  • Silent behavior makes debugging harder (no activity log when skipping)

This is an acceptable tradeoff for performance if you trust the existing store state is correct. However, consider either:

  1. Adding a log entry when skipping to improve observability: logger->log(lvlInfo, fmt("path '%s' already in store, skipping import", ...))
  2. Documenting the trust assumption in a comment explaining why skipping verification is safe in your use case
  3. Optional: Conditionally verify narHash matches if checkSigs is true, or add an optional strict-verify mode
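
Option 3 could look roughly like the sketch below. Everything here is hypothetical (`Meta`, `SkipDecision`, `decide`, and the `strict` flag are invented for illustration, and hashes are plain strings); it only shows the shape of a check that refuses to skip silently when the incoming NAR hash disagrees with the stored one.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical metadata record; the real path info carries more fields.
struct Meta {
    std::string narHash;
};

enum class SkipDecision { Import, Skip };

// Decide whether to import or skip an incoming path.
// In strict mode, a hash mismatch on an already-valid path is an error
// instead of being silently ignored.
SkipDecision decide(const std::map<std::string, Meta> & stored,
                    const std::string & path,
                    const Meta & incoming,
                    bool strict)
{
    assert(!path.empty());
    auto it = stored.find(path);
    if (it == stored.end())
        return SkipDecision::Import; // not in the store yet
    if (strict && it->second.narHash != incoming.narHash)
        throw std::runtime_error(
            "NAR hash mismatch for already-valid path '" + path + "'");
    return SkipDecision::Skip;
}
```

Comparing metadata this way keeps the fast path cheap, since it needs only the stored record and the header of the incoming entry, not the NAR contents.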

edolstra enabled auto-merge October 21, 2025 15:43
github-actions Bot temporarily deployed to pull request October 21, 2025 15:47 (Inactive)
edolstra added this pull request to the merge queue Oct 21, 2025
Merged via the queue into main with commit 2eea2eb Oct 21, 2025
35 checks passed
edolstra deleted the nario-import-optimization branch October 21, 2025 16:20