test: add btree bitmap benchmark #5389

Merged
westonpace merged 5 commits into lance-format:main from westonpace:test/add-btree-bitmap-benchmark
Dec 4, 2025

@westonpace
Member

The btree and bitmap indices are critical features, but they lack a good micro-benchmark for search (we do have some macro-benchmarks). This PR adds micro-benchmarks for btree and bitmap search, covering a variety of scenarios:

  • Floats (fixed-width data) vs. Strings (variable-width data)
  • Equality vs. Range
  • Small result set vs. Large result set
  • High cardinality data vs. Low cardinality data
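
The scenario list above amounts to a cross product of query shape, data type, and cardinality. The following is a minimal sketch of that matrix, not the PR's benchmark code: `scenario_ids` is a hypothetical helper, and the label strings are assumed to follow the benchmark IDs reported later in this conversation.

```rust
/// Builds one benchmark ID per (query shape, data type, cardinality) combination.
/// Hypothetical helper for illustration; not taken from the PR.
fn scenario_ids() -> Vec<String> {
    // Query shapes: equality plus ranges with growing result sets (assumed labels).
    let queries = ["equality", "range_few", "range_many", "range_most"];
    // Fixed-width vs. variable-width data.
    let data_types = ["float", "string"];
    // High cardinality ("unique") vs. low cardinality.
    let cardinalities = ["unique", "low_card"];

    let mut ids = Vec::new();
    for q in queries.iter() {
        for t in data_types.iter() {
            for c in cardinalities.iter() {
                ids.push(format!("btree_{}/{}_{}", q, t, c));
            }
        }
    }
    ids
}
```

Four query shapes, two data types, and two cardinalities give sixteen IDs, from `btree_equality/float_unique` through `btree_range_most/string_low_card`.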

@github-actions github-actions Bot added the chore label Dec 2, 2025
@westonpace
Member Author

Results from btree benchmarks:

btree_equality/float_unique
                        time:   [240.07 µs 242.02 µs 243.85 µs]
btree_equality/float_low_card
                        time:   [14.640 ms 14.678 ms 14.721 ms]
btree_equality/string_unique
                        time:   [602.19 µs 603.53 µs 605.52 µs]
btree_equality/string_low_card
                        time:   [21.163 ms 21.223 ms 21.265 ms]
btree_range_few/float_unique
                        time:   [1.5155 ms 1.5190 ms 1.5244 ms]
btree_range_few/float_low_card
                        time:   [15.081 ms 15.175 ms 15.269 ms]
btree_range_few/string_unique
                        time:   [2.2354 ms 2.2408 ms 2.2503 ms]
btree_range_few/string_low_card
                        time:   [24.453 ms 24.499 ms 24.538 ms]
btree_range_many/float_unique
                        time:   [123.50 ms 123.60 ms 123.72 ms]
btree_range_many/float_low_card
                        time:   [139.99 ms 140.60 ms 141.16 ms]
btree_range_many/string_unique
                        time:   [160.89 ms 161.17 ms 161.43 ms]
btree_range_many/string_low_card
                        time:   [195.28 ms 195.70 ms 195.99 ms]
btree_range_most/float_unique
                        time:   [1.1131 s 1.1158 s 1.1186 s]
btree_range_most/float_low_card
                        time:   [1.1321 s 1.1392 s 1.1473 s]
btree_range_most/string_unique
                        time:   [1.4461 s 1.4491 s 1.4521 s]
btree_range_most/string_low_card
                        time:   [1.5557 s 1.5590 s 1.5621 s]
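
The listing above appears to be Criterion output, where the three bracketed values are the lower bound, point estimate, and upper bound of the measured time, and the unit is chosen per benchmark. Because the rows mix µs, ms, and s, a small conversion helps when comparing them. This is a minimal sketch; `to_micros` is a hypothetical helper, not part of the PR or of Criterion.

```rust
/// Converts a reported time to microseconds.
/// Returns None for a unit this sketch does not recognise.
/// Hypothetical helper for illustration only.
fn to_micros(value: f64, unit: &str) -> Option<f64> {
    let factor = match unit {
        "ns" => 1e-3,
        "µs" | "us" => 1.0,
        "ms" => 1e3,
        "s" => 1e6,
        _ => return None,
    };
    Some(value * factor)
}
```

With this helper, the point estimate for `btree_equality/float_low_card` (14.678 ms) is about 14,678 µs, and the one for `btree_range_most/float_unique` (1.1158 s) is about 1,115,800 µs.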

@codecov

codecov Bot commented Dec 2, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.

@westonpace
Member Author

I've expanded on this quite a bit. Final times for the Rust btree benchmarks (all times are in microseconds):

btree_equality/int_unique/no_cache           90
btree_equality/int_low_card/no_cache        368
btree_equality/string_unique/no_cache       157
btree_equality/string_low_card/no_cache     510
btree_equality/int_unique/cached              2
btree_equality/int_low_card/cached          220
btree_equality/string_unique/cached          12
btree_equality/string_low_card/cached       261
btree_in_10/int_unique/no_cache             116
btree_in_10/int_low_card/no_cache          3132
btree_in_10/string_unique/no_cache          189
btree_in_10/string_low_card/no_cache       4071
btree_in_10/int_unique/cached                25
btree_in_10/int_low_card/cached            2682
btree_in_10/string_unique/cached             43
btree_in_10/string_low_card/cached         3302
btree_in_20/int_unique/no_cache             122
btree_in_20/int_low_card/no_cache          6284
btree_in_20/string_unique/no_cache          195
btree_in_20/string_low_card/no_cache       8124
btree_in_20/int_unique/cached                29
btree_in_20/int_low_card/cached            5377
btree_in_20/string_unique/cached             52
btree_in_20/string_low_card/cached         6902
btree_in_30/int_unique/no_cache             126
btree_in_30/int_low_card/no_cache          9582
btree_in_30/string_unique/no_cache          207
btree_in_30/string_low_card/no_cache      12232
btree_in_30/int_unique/cached                32
btree_in_30/int_low_card/cached            8119
btree_in_30/string_unique/cached             57
btree_in_30/string_low_card/cached        10360
btree_range_few/int_unique/no_cache         126
btree_range_few/int_low_card/no_cache       367
btree_range_few/string_unique/no_cache      229
btree_range_few/string_low_card/no_cache    582
btree_range_few/int_unique/cached            24
btree_range_few/int_low_card/cached         221
btree_range_few/string_unique/cached         73
btree_range_few/string_low_card/cached      332
btree_range_many/int_unique/no_cache       2588
btree_range_many/int_low_card/no_cache     2836
btree_range_many/string_unique/no_cache    3339
btree_range_many/string_low_card/no_cache  3801
btree_range_many/int_unique/cached         2083
btree_range_many/int_low_card/cached       2299
btree_range_many/string_unique/cached      2714
btree_range_many/string_low_card/cached    2952
btree_range_most/int_unique/no_cache      22170
btree_range_most/int_low_card/no_cache    22345
btree_range_most/string_unique/no_cache   28376
btree_range_most/string_low_card/no_cache 29201
btree_range_most/int_unique/cached        19087
btree_range_most/int_low_card/cached      19358
btree_range_most/string_unique/cached     24425
btree_range_most/string_low_card/cached   24766
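
One way to read the `no_cache` and `cached` rows above is as a ratio per scenario. This is a minimal sketch; `cache_speedup` is a hypothetical helper for reading the table, not code from the PR.

```rust
/// Ratio of uncached to cached time; values above 1.0 mean the cache helped.
/// Both arguments are expected in the same unit (microseconds in the table).
/// Hypothetical helper for illustration only.
fn cache_speedup(no_cache_us: f64, cached_us: f64) -> f64 {
    no_cache_us / cached_us
}
```

For `btree_equality/int_unique`, `cache_speedup(90.0, 2.0)` is 45.0, while for `btree_range_most/int_unique`, `cache_speedup(22170.0, 19087.0)` is roughly 1.16. This suggests the cache matters most when the result set is small.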

Final times for the Rust bitmap benchmarks:

bitmap_equality/int_unique/no_cache       49
bitmap_equality/int_low_card/no_cache     51
bitmap_equality/string_unique/no_cache   418
bitmap_equality/string_low_card/no_cache 397
bitmap_equality/int_unique/cached          1
bitmap_equality/int_low_card/cached        1
bitmap_equality/string_unique/cached       1
bitmap_equality/string_low_card/cached     2
bitmap_in_1/int_unique/no_cache        41548
bitmap_in_1/int_low_card/no_cache      62055
bitmap_in_1/string_unique/no_cache    143860
bitmap_in_1/string_low_card/no_cache      58
bitmap_in_1/int_unique/cached          39337
bitmap_in_1/int_low_card/cached            3
bitmap_in_1/string_unique/cached      142120
bitmap_in_1/string_low_card/cached         7
bitmap_in_3/int_unique/no_cache       118530
bitmap_in_3/int_low_card/no_cache         69
bitmap_in_3/string_unique/no_cache    410210
bitmap_in_3/string_low_card/no_cache      82
bitmap_in_3/int_unique/cached         124100
bitmap_in_3/int_low_card/cached            9
bitmap_in_3/string_unique/cached      431030
bitmap_in_3/string_low_card/cached        22
bitmap_in_5/int_unique/no_cache       204610
bitmap_in_5/int_low_card/no_cache         94
bitmap_in_5/string_unique/no_cache    700740
bitmap_in_5/string_low_card/no_cache     119
bitmap_in_5/int_unique/cached         203540
bitmap_in_5/int_low_card/cached           16
bitmap_in_5/string_unique/cached      708540
bitmap_in_5/string_low_card/cached        38
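
To see how the `bitmap_in_N` times on unique data scale with N, the totals can be divided by the numeric suffix. This is a minimal sketch: `time_per_unit` is a hypothetical helper, and what N counts is left open here, since the conversation does not say.

```rust
/// Divides a reported time by the numeric suffix N of a `bitmap_in_N` benchmark.
/// What N counts is an assumption left open; the conversation does not state it.
/// Hypothetical helper for illustration only.
fn time_per_unit(total: f64, n: u32) -> f64 {
    total / f64::from(n)
}
```

For the `int_unique/no_cache` rows this gives 41548 for `in_1`, 39510 for `in_3`, and 40922 for `in_5`. The roughly constant value is consistent with time growing linearly in N for unique data.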

@github-actions github-actions Bot added the python label Dec 4, 2025
@westonpace westonpace merged commit e437d2c into lance-format:main Dec 4, 2025
22 of 27 checks passed
jackye1995 pushed a commit to jackye1995/lance that referenced this pull request Dec 5, 2025
jackye1995 pushed a commit that referenced this pull request Dec 5, 2025
jackye1995 pushed a commit to jackye1995/lance that referenced this pull request Jan 21, 2026