perf: reduce peak memory during cosine IVF-PQ index training #6016

Merged
wkalt merged 2 commits into lance-format:main from wkalt:wkalt/perf-ivf-training-memory
Feb 26, 2026

@wkalt wkalt commented Feb 25, 2026

Two optimizations that together eliminate the transient 2x memory peak on the training sample during cosine-distance index builds:

1. Add `normalize_fsl_owned` that L2-normalizes a `FixedSizeListArray` in-place via `Buffer::into_mutable()` when the buffer is uniquely owned, avoiding a full copy. Falls back to the existing copy path when the buffer is shared.

2. Skip `arrow::compute::filter` when all vectors are already finite, avoiding another full copy of the training data.
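
The two ideas above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the code in this PR: `Arc<Vec<f32>>` stands in for an Arrow buffer, `Arc::try_unwrap` plays the role that `Buffer::into_mutable()` plays in the description, and the names `normalize_rows_in_place`, `normalize_owned`, and `keep_finite_rows` are hypothetical.

```rust
use std::borrow::Cow;
use std::sync::Arc;

/// L2-normalize each `dim`-wide row of `values` in place.
/// Assumes `dim > 0`; zero-norm rows are left untouched (a choice of this sketch).
fn normalize_rows_in_place(values: &mut [f32], dim: usize) {
    for row in values.chunks_exact_mut(dim) {
        let norm = row.iter().map(|v| v * v).sum::<f32>().sqrt();
        if norm > 0.0 {
            for v in row.iter_mut() {
                *v /= norm;
            }
        }
    }
}

/// Hypothetical stand-in for the owned normalization path.
/// Returns the normalized data and whether it was normalized without copying.
fn normalize_owned(data: Arc<Vec<f32>>, dim: usize) -> (Arc<Vec<f32>>, bool) {
    match Arc::try_unwrap(data) {
        // Uniquely owned: mutate the existing allocation, no copy of the values.
        Ok(mut owned) => {
            normalize_rows_in_place(&mut owned, dim);
            (Arc::new(owned), true)
        }
        // Shared: fall back to copying, so other holders are not affected.
        Err(shared) => {
            let mut copy = (*shared).clone();
            normalize_rows_in_place(&mut copy, dim);
            (Arc::new(copy), false)
        }
    }
}

/// Drop rows containing NaN or infinity, but skip the copy entirely
/// when every value is already finite.
fn keep_finite_rows(values: &[f32], dim: usize) -> Cow<'_, [f32]> {
    if values.iter().all(|v| v.is_finite()) {
        // Fast path: nothing to filter, so borrow the input as-is.
        return Cow::Borrowed(values);
    }
    let filtered: Vec<f32> = values
        .chunks_exact(dim)
        .filter(|row| row.iter().all(|v| v.is_finite()))
        .flatten()
        .copied()
        .collect();
    Cow::Owned(filtered)
}
```

In this sketch, passing the only handle to the data takes the in-place branch, while holding a second `Arc` clone anywhere forces the copy path. That is why the optimization depends on the caller giving up ownership of the training sample.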

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@wkalt wkalt force-pushed the wkalt/perf-ivf-training-memory branch from bc57006 to 2a505cd February 25, 2026 23:20

wkalt commented Feb 26, 2026

[Image: progress] This is an index build on 100M 384-dimensional vectors before and after the change. The change targets the IVF training portion at the start of the build.


codecov Bot commented Feb 26, 2026

Codecov Report

❌ Patch coverage is 94.20290% with 8 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| rust/lance-linalg/src/kernels.rs | 94.54% | 5 Missing and 1 partial ⚠️ |
| rust/lance/src/index/vector/pq.rs | 0.00% | 0 Missing and 1 partial ⚠️ |
| rust/lance/src/index/vector/utils.rs | 96.00% | 0 Missing and 1 partial ⚠️ |

@BubbleCal BubbleCal left a comment

LGTM!

@wkalt wkalt merged commit 3cad9cb into lance-format:main Feb 26, 2026
29 checks passed
wjones127 pushed a commit to wjones127/lance that referenced this pull request Mar 4, 2026
…ormat#6016)
