test: add regression test for ivf/pq search by westonpace · Pull Request #5476 · lance-format/lance

westonpace · 2025-12-15T14:26:14Z

The test covers 5 parameters (cache, k, nprobes, refine factor, dataset size) for a total of 64 tests. It takes ~3-4 minutes to run on my system.

There are other parameters (distance type, number of dimensions, PQ-v-SQ, etc.) but these should only really affect compute time and/or recall and are probably better tested in rust benchmarks. Recall benchmarks will be done separately as they require real data.
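As a rough illustration of how five parameters expand into 64 cases, here is a minimal sketch of the grid. The parameter values and the names `PARAM_GRID` and `expand_grid` are assumptions made for illustration. The description names the parameters and the total but not the values, and this 2 x 2 x 2 x 2 x 4 split is only one of several that yields 64.

```python
import itertools

# Hypothetical values: the PR names the five parameters and the total of
# 64 cases, but not the values themselves.
PARAM_GRID = {
    "use_cache": [True, False],
    "k": [10, 100],
    "nprobes": [10, 50],
    "refine_factor": [None, 2],
    "num_rows": [10_000, 100_000, 1_000_000, 10_000_000],
}


def expand_grid(grid):
    """Return one dict per combination of parameter values."""
    names = list(grid)
    return [dict(zip(names, combo)) for combo in itertools.product(*grid.values())]


CASES = expand_grid(PARAM_GRID)
print(len(CASES))  # 2 * 2 * 2 * 2 * 4 = 64
```

Each entry in `CASES` could then be passed to a parametrized test function. Because index training dominates the run time, cases that share a dataset size would ideally reuse the same trained index.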

chatgpt-codex-connector · 2025-12-15T14:26:20Z

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

wjones127 · 2025-12-15T16:29:52Z

Two dimensions not tested here that seem worthwhile are:

- num_bits in PQ (4-bit vs 8-bit)
- data_type in the vector (f16 vs f32)

These settings will trigger different code paths, which I think are important to cover in regression testing.

There are some settings here, like k, which I don't think really hit different code paths so much as change a variable in the same code path.

I think rust-level benchmarks would sufficiently (and probably with more fidelity) capture those parameters since they primarily affect the compute driven PQ search time. These parameters would also require training completely new indexes which is what dominates the benchmark time.

We have pq_dist_table and 4bit_pq_dist_table, which cover num_bits and distance_type. However, these only look at f32. I'll add an issue to add data_type to these benchmarks (I want to review them anyway, since the lance-index benches don't pass reliably or in a reasonable time for me).
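To make the num_bits point concrete, here is a minimal sketch of why 4-bit and 8-bit PQ tend to take different code paths. This is not Lance's implementation. The helper names and the low-nibble-first packing order are assumptions. With 8-bit codes each sub-vector code occupies one byte and indexes a 256-entry distance table. With 4-bit codes two codes share a byte and each indexes a 16-entry table, so the lookup loop has to unpack nibbles first.

```python
def pack_4bit(codes):
    """Pack pairs of 4-bit codes into bytes (low nibble first; an assumption)."""
    assert len(codes) % 2 == 0 and all(0 <= c < 16 for c in codes)
    return bytes(lo | (hi << 4) for lo, hi in zip(codes[0::2], codes[1::2]))


def unpack_4bit(packed):
    """Inverse of pack_4bit: recover one 4-bit code per sub-vector."""
    codes = []
    for byte in packed:
        codes.append(byte & 0x0F)
        codes.append(byte >> 4)
    return codes


def pq_distance(dist_table, codes):
    """Approximate distance: sum dist_table[m][code] over sub-vectors m."""
    return sum(dist_table[m][c] for m, c in enumerate(codes))


# Toy distance table: 4 sub-vectors, 16 centroids each (num_bits = 4).
dist_table = [[float(m * 16 + c) for c in range(16)] for m in range(4)]
packed = pack_4bit([3, 15, 0, 7])
print(pq_distance(dist_table, unpack_4bit(packed)))  # 3 + 31 + 32 + 55 = 121.0
```

The 8-bit path would skip `unpack_4bit` entirely and index a 256-entry table per sub-vector, which is why a regression in one path need not show up in the other.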

westonpace · 2025-12-19T22:59:13Z

@bench-bot benchmark

westonpace · 2025-12-19T23:04:58Z

Benchmark Results for PR #5476

Commit: e026964
Baseline: Up to 20 most recent historical results per benchmark

Summary

Total benchmarks: 123
🚀 Improvements: 0
⚠️ Regressions: 0
✅ Stable: 123
❓ Insufficient data: 0

✅ All Benchmarks Within Normal Range

No benchmarks had |z-score| > 2.0
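For readers unfamiliar with the z-score check, the following is a minimal sketch of how a result might be classified against its historical baseline. It is not bench-bot's actual code. The function names, the use of the sample standard deviation, and the minimum sample count are assumptions. Only the 2.0 threshold and the four status categories come from the report.

```python
import statistics

Z_THRESHOLD = 2.0  # the "|z-score| > 2.0" cutoff quoted in the report


def z_score(pr_result, baseline):
    """Number of standard deviations between the PR result and the baseline mean."""
    mean = statistics.mean(baseline)
    std = statistics.stdev(baseline)  # sample std dev (an assumption)
    if std == 0:
        return 0.0
    return (pr_result - mean) / std


def classify(pr_result, baseline, min_samples=2):
    """Label a timing result; larger times are worse, so positive z is a regression."""
    if len(baseline) < min_samples:
        return "insufficient data"
    z = z_score(pr_result, baseline)
    if z > Z_THRESHOLD:
        return "regression"
    if z < -Z_THRESHOLD:
        return "improvement"
    return "stable"


history_ms = [10.0, 12.0, 11.0, 9.0, 13.0]  # made-up baseline timings
print(classify(11.5, history_ms))  # stable
print(classify(20.0, history_ms))  # regression
```

Under this scheme a noisy baseline (large standard deviation) keeps the z-score small even when the PR result differs noticeably from the mean, which is consistent with the results below all being reported as stable.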

All Results

View all 123 benchmark results

Benchmark	PR Result	Baseline Mean	Baseline N	Z-Score	Status
Cosine(f32, auto-vectorized)	102.97 ms	153.39 ms	20	-1.19	✅
Cosine(f32, scalar)	681.69 ms	820.47 ms	20	-0.79	✅
Cosine(f64, auto-vectorized)	290.17 ms	404.36 ms	20	-1.02	✅
Cosine(f64, scalar)	691.42 ms	836.72 ms	20	-0.79	✅
Cosine(half::bfloat::bf16, auto-vectorized)	268.18 ms	362.38 ms	20	-0.86	✅
Cosine(half::bfloat::bf16, scalar)	7.25 s	9.15 s	20	-0.86	✅
Cosine(half::binary16::f16, auto-vectorized)	127.35 ms	154.34 ms	20	-0.80	✅
Cosine(half::binary16::f16, scalar)	6.28 s	7.79 s	20	-0.80	✅
Cosine(simd,f32x8) rng seed	2.81 ms	3.73 ms	20	-0.93	✅
Dot(bf16, auto-vectorization)	409.46 ms	546.63 ms	20	-0.83	✅
Dot(f16, SIMD)	70.48 ms	91.43 ms	20	-0.95	✅
Dot(f32, SIMD)	101.22 ms	144.37 ms	20	-1.14	✅
Dot(f32, arrow_artiy)	637.93 ms	792.48 ms	20	-0.91	✅
Dot(f32, auto-vectorization)	96.77 ms	150.78 ms	20	-1.45	✅
Dot(f64, arrow_artiy)	637.73 ms	797.02 ms	20	-0.90	✅
Dot(f64, auto-vectorization)	188.91 ms	287.17 ms	20	-1.22	✅
Dot(half::binary16::f16, arrow_artiy)	728.39 ms	870.45 ms	20	-0.90	✅
Dot(half::binary16::f16, auto-vectorization)	70.01 ms	92.24 ms	20	-1.09	✅
L2(f32, auto-vectorization)	104.57 ms	149.21 ms	20	-1.03	✅
L2(f32, scalar)	659.71 ms	804.28 ms	20	-0.84	✅
L2(f32, simd)	104.29 ms	146.38 ms	20	-1.00	✅
L2(f64, auto-vectorization)	198.41 ms	297.23 ms	20	-1.21	✅
L2(f64, scalar)	656.48 ms	815.34 ms	20	-0.89	✅
L2(half::binary16::f16, auto-vectorization)	75.95 ms	94.50 ms	20	-0.89	✅
L2(half::binary16::f16, scalar)	2.82 s	3.56 s	20	-0.86	✅
L2(simd,f32x8)	4.60 ms	5.62 ms	20	-0.84	✅
L2(uint8, auto-vectorization)	54.67 ms	66.65 ms	20	-0.95	✅
L2(uint8, scalar)	50.17 ms	61.07 ms	20	-1.10	✅
NormL2(f32, SIMD)	97.66 ms	142.88 ms	20	-1.10	✅
NormL2(f32, auto-vectorization)	136.33 ms	142.08 ms	20	-0.14	✅
NormL2(f32, scalar)	659.87 ms	787.33 ms	20	-0.73	✅
NormL2(f64, auto-vectorization)	264.31 ms	296.55 ms	20	-0.34	✅
NormL2(f64, scalar)	670.35 ms	807.86 ms	20	-0.75	✅
NormL2(half::bfloat::bf16, auto-vectorization)	261.19 ms	339.64 ms	20	-0.80	✅
NormL2(half::bfloat::bf16, scalar)	2.79 s	3.32 s	20	-0.77	✅
NormL2(half::binary16::f16, SIMD)	71.74 ms	84.97 ms	20	-0.57	✅
NormL2(half::binary16::f16, auto-vectorization)	72.01 ms	84.56 ms	20	-0.54	✅
NormL2(half::binary16::f16, scalar)	679.61 ms	810.05 ms	20	-0.73	✅
argmin(arrow)	4.83 ms	4.64 ms	20	+1.02	✅
decode_fsl/float32_128_v2.0_nullfalse	32.17 ms	48.35 ms	20	-0.83	✅
decode_fsl/float32_128_v2.1_nullfalse	10.73 ms	15.65 ms	20	-1.96	✅
decode_fsl/float32_128_v2.1_nulltrue	22.12 ms	32.22 ms	20	-1.61	✅
decode_fsl/float32_16_v2.0_nullfalse	30.40 ms	48.20 ms	20	-0.91	✅
decode_fsl/float32_16_v2.1_nullfalse	17.12 ms	24.29 ms	20	-1.22	✅
decode_fsl/float32_16_v2.1_nulltrue	28.48 ms	37.70 ms	20	-1.00	✅
decode_fsl/float32_32_v2.0_nullfalse	30.45 ms	48.07 ms	20	-0.90	✅
decode_fsl/float32_32_v2.1_nullfalse	18.69 ms	24.28 ms	20	-0.96	✅
decode_fsl/float32_32_v2.1_nulltrue	24.01 ms	34.03 ms	20	-1.24	✅
decode_fsl/float32_4_v2.0_nullfalse	30.61 ms	48.43 ms	20	-0.91	✅
decode_fsl/float32_4_v2.1_nullfalse	17.46 ms	25.27 ms	20	-1.43	✅
decode_fsl/float32_4_v2.1_nulltrue	47.58 ms	63.80 ms	20	-1.08	✅
decode_fsl/float32_64_v2.0_nullfalse	30.62 ms	48.08 ms	20	-0.90	✅
decode_fsl/float32_64_v2.1_nullfalse	11.69 ms	16.29 ms	20	-1.23	✅
decode_fsl/float32_64_v2.1_nulltrue	25.29 ms	33.65 ms	20	-1.32	✅
decode_fsl/int8_128_v2.0_nullfalse	30.41 ms	48.36 ms	20	-0.92	✅
decode_fsl/int8_128_v2.1_nullfalse	17.43 ms	25.29 ms	20	-1.49	✅
decode_fsl/int8_128_v2.1_nulltrue	24.19 ms	33.89 ms	20	-1.18	✅
decode_fsl/int8_16_v2.0_nullfalse	30.25 ms	48.20 ms	20	-0.92	✅
decode_fsl/int8_16_v2.1_nullfalse	17.42 ms	25.00 ms	20	-1.39	✅
decode_fsl/int8_16_v2.1_nulltrue	47.81 ms	63.93 ms	20	-1.13	✅
decode_fsl/int8_32_v2.0_nullfalse	30.14 ms	48.15 ms	20	-0.93	✅
decode_fsl/int8_32_v2.1_nullfalse	17.01 ms	24.92 ms	20	-1.46	✅
decode_fsl/int8_32_v2.1_nulltrue	34.96 ms	47.09 ms	20	-1.10	✅
decode_fsl/int8_4_v2.0_nullfalse	30.06 ms	48.17 ms	20	-0.94	✅
decode_fsl/int8_4_v2.1_nullfalse	19.46 ms	26.52 ms	20	-1.23	✅
decode_fsl/int8_4_v2.1_nulltrue	131.40 ms	174.89 ms	20	-1.09	✅
decode_fsl/int8_64_v2.0_nullfalse	30.03 ms	48.54 ms	20	-0.94	✅
decode_fsl/int8_64_v2.1_nullfalse	17.60 ms	24.32 ms	20	-1.17	✅
decode_fsl/int8_64_v2.1_nulltrue	27.24 ms	37.85 ms	20	-1.19	✅
decode_primitive/date32	11.51 ms	15.89 ms	20	-1.55	✅
decode_primitive/date64	11.98 ms	15.97 ms	20	-1.38	✅
decode_primitive/decimal128(10, 10)	14.50 ms	16.43 ms	20	-0.72	✅
decode_primitive/decimal256(10, 10)	29.05 ms	34.60 ms	20	-1.18	✅
decode_primitive/duration(second)	11.60 ms	16.97 ms	20	-1.29	✅
decode_primitive/fixed-utf8	65.79 µs	95.95 µs	20	-1.20	✅
decode_primitive/float16	13.44 ms	16.00 ms	20	-0.95	✅
decode_primitive/float32	13.69 ms	16.34 ms	20	-0.89	✅
decode_primitive/float64	14.20 ms	16.49 ms	20	-0.64	✅
decode_primitive/int16	11.91 ms	15.82 ms	20	-1.33	✅
decode_primitive/int32	13.20 ms	15.91 ms	20	-0.92	✅
decode_primitive/int64	12.02 ms	16.24 ms	20	-1.37	✅
decode_primitive/int8	12.06 ms	15.99 ms	20	-1.35	✅
decode_primitive/struct	136.30 µs	186.10 µs	20	-1.39	✅
decode_primitive/time32(second)	11.99 ms	16.39 ms	20	-1.68	✅
decode_primitive/time64(nanosecond)	12.19 ms	16.37 ms	20	-1.53	✅
decode_primitive/timestamp(nanosecond, none)	12.27 ms	16.25 ms	20	-1.33	✅
decode_primitive/uint16	13.33 ms	16.22 ms	20	-0.76	✅
decode_primitive/uint32	12.86 ms	16.31 ms	20	-1.13	✅
decode_primitive/uint64	13.78 ms	16.30 ms	20	-0.71	✅
decode_primitive/uint8	12.64 ms	15.65 ms	20	-1.00	✅
decode_primitive/utf8	752.31 µs	954.59 µs	20	-0.97	✅
from_elem/full_read,parallel=1,read_size=1048576	36.39 ms	165.77 ms	20	-0.88	✅
from_elem/full_read,parallel=1,read_size=16384	262.56 ms	326.30 ms	20	-0.66	✅
from_elem/full_read,parallel=1,read_size=4096	938.26 ms	1.15 s	20	-0.56	✅
from_elem/full_read,parallel=16,read_size=1048576	35.74 ms	165.92 ms	20	-0.88	✅
from_elem/full_read,parallel=16,read_size=16384	261.63 ms	318.14 ms	20	-0.64	✅
from_elem/full_read,parallel=16,read_size=4096	924.90 ms	1.14 s	20	-0.59	✅
from_elem/full_read,parallel=32,read_size=1048576	35.81 ms	165.97 ms	20	-0.88	✅
from_elem/full_read,parallel=32,read_size=16384	258.51 ms	318.46 ms	20	-0.71	✅
from_elem/full_read,parallel=32,read_size=4096	929.44 ms	1.14 s	20	-0.57	✅
from_elem/full_read,parallel=64,read_size=1048576	35.82 ms	166.02 ms	20	-0.88	✅
from_elem/full_read,parallel=64,read_size=16384	258.62 ms	317.58 ms	20	-0.69	✅
from_elem/full_read,parallel=64,read_size=4096	919.57 ms	1.15 s	20	-0.63	✅
from_elem/random_read,parallel=1,item_size=1024	4.13 ms	4.16 ms	20	-0.04	✅
from_elem/random_read,parallel=1,item_size=4096	4.66 ms	6.95 ms	20	-0.83	✅
from_elem/random_read,parallel=1,item_size=8	1.81 ms	1.85 ms	20	-0.12	✅
from_elem/random_read,parallel=16,item_size=1024	4.15 ms	4.11 ms	20	+0.05	✅
from_elem/random_read,parallel=16,item_size=4096	4.64 ms	6.94 ms	20	-0.82	✅
from_elem/random_read,parallel=16,item_size=8	1.82 ms	1.84 ms	20	-0.05	✅
from_elem/random_read,parallel=32,item_size=1024	4.11 ms	4.10 ms	20	+0.02	✅
from_elem/random_read,parallel=32,item_size=4096	4.91 ms	6.93 ms	20	-0.72	✅
from_elem/random_read,parallel=32,item_size=8	1.83 ms	1.85 ms	20	-0.05	✅
from_elem/random_read,parallel=64,item_size=1024	4.11 ms	4.10 ms	20	+0.02	✅
from_elem/random_read,parallel=64,item_size=4096	4.73 ms	6.91 ms	20	-0.77	✅
from_elem/random_read,parallel=64,item_size=8	1.83 ms	1.84 ms	20	-0.02	✅
hamming,auto_vec	74.65 ms	96.16 ms	20	-0.90	✅
hamming,scalar	118.43 ms	160.09 ms	20	-0.87	✅
zip_1024Ki/2_2_2_zip_into_6	6.20 ms	7.98 ms	20	-0.82	✅
zip_1024Ki/2_4_zip_into_6	4.61 ms	5.65 ms	20	-0.85	✅
zip_32Ki/2_2_2_zip_into_6	247.84 µs	254.99 µs	20	-0.10	✅
zip_32Ki/2_4_zip_into_6	154.40 µs	178.00 µs	20	-0.65	✅
zip_8Ki/2_2_2_zip_into_6	42.65 µs	67.75 µs	20	-1.42	✅
zip_8Ki/2_4_zip_into_6	34.56 µs	45.52 µs	20	-1.08	✅

Generated by bench-bot 🤖

add regression test for ivf/pq search

e026964

github-actions Bot added python chore labels Dec 15, 2025

wjones127 reviewed Dec 15, 2025

wjones127 approved these changes Dec 15, 2025

westonpace merged commit c641403 into lance-format:main Jan 5, 2026
14 checks passed

andrea-reale mentioned this pull request Mar 30, 2026

emilk/fix write starvation rerun-io/lance#12

Closed

westonpace merged 1 commit into lance-format:main from westonpace:test/ivf-pq-regression
