[Minor] Use per-predicate projection masks in arrow_reader_clickbench benchmark #9413

Merged
Dandandan merged 2 commits into apache:main from Dandandan:clickbench-optimizations
Feb 15, 2026
Contributor

@Dandandan Dandandan commented Feb 14, 2026

Which issue does this PR close?

  • Closes #NNN.

Rationale for this change

As suggested by Claude: the benchmark currently gives every predicate a projection mask covering all filter columns, which significantly slows down queries that have multiple predicates.
Using one mask per predicate brings the benchmark more in line with the consumer side (e.g. DataFusion), so we can benchmark improvements more accurately.

The performance difference shows up in a number of multi-filter queries:

group                                             clickbench-optimizations               main
arrow_reader_clickbench/async_object_store/Q22    1.00    151.8±6.46ms        ? ?/sec    1.52    230.5±1.68ms        ? ?/sec
arrow_reader_clickbench/async_object_store/Q36    1.00     26.3±0.24ms        ? ?/sec    4.30    113.1±0.67ms        ? ?/sec
arrow_reader_clickbench/async_object_store/Q37    1.00      9.3±0.06ms        ? ?/sec    9.64     89.7±1.20ms        ? ?/sec
arrow_reader_clickbench/async_object_store/Q38    1.00     22.4±0.26ms        ? ?/sec    1.44     32.3±0.29ms        ? ?/sec
arrow_reader_clickbench/async_object_store/Q39    1.00     38.1±0.66ms        ? ?/sec    1.09     41.5±0.35ms        ? ?/sec
arrow_reader_clickbench/async_object_store/Q40    1.00     13.0±0.15ms        ? ?/sec    2.96     38.6±0.45ms        ? ?/sec
arrow_reader_clickbench/async_object_store/Q41    1.00     10.1±0.11ms        ? ?/sec    2.83     28.5±0.73ms        ? ?/sec
arrow_reader_clickbench/async_object_store/Q42    1.00      5.6±0.05ms        ? ?/sec    1.87     10.5±0.12ms        ? ?/sec
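
For readers unfamiliar with this table layout: the number in front of each timing appears to be the mean time divided by the fastest mean in that row, so 1.00 marks the faster side. A minimal sketch of that arithmetic follows, using a hypothetical helper `relative_to_fastest` and mean times copied from the rows above. Ratios recomputed from the rounded means can differ from the printed ones by 0.01 (for example Q40), presumably because the tool divides unrounded means.

```rust
/// Ratio of a mean time to the fastest mean time in the same row,
/// rounded to two decimals like the leading number in each table cell.
/// (Hypothetical helper, not part of any benchmarking tool.)
fn relative_to_fastest(time_ms: f64, fastest_ms: f64) -> f64 {
    ((time_ms / fastest_ms) * 100.0).round() / 100.0
}

fn main() {
    // (query, branch mean in ms, main mean in ms) copied from the table.
    let rows = [("Q22", 151.8, 230.5), ("Q36", 26.3, 113.1), ("Q38", 22.4, 32.3)];
    for (query, branch_ms, main_ms) in rows {
        let fastest = f64::min(branch_ms, main_ms);
        println!(
            "{}: branch {:.2}, main {:.2}",
            query,
            relative_to_fastest(branch_ms, fastest),
            relative_to_fastest(main_ms, fastest)
        );
    }
}
```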

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@github-actions (bot) added the parquet (Changes to the parquet crate) label on Feb 14, 2026
@Dandandan
Contributor Author

run benchmark arrow_reader_clickbench

@alamb-ghbot

🤖 ./gh_compare_arrow.sh gh_compare_arrow.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing clickbench-optimizations (27f9aba) to 39a2b71 diff
BENCH_NAME=arrow_reader_clickbench
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental,object_store --bench arrow_reader_clickbench
BENCH_FILTER=
BENCH_BRANCH_NAME=clickbench-optimizations
Results will be posted here when complete

Previously, every predicate in the RowFilter received the same
ProjectionMask containing ALL filter columns. This caused unnecessary
decoding of expensive string columns when evaluating cheap integer
predicates. Now each predicate receives a mask with only the single
column it needs.

Key sync improvements (vs baseline):
- Q37: 63.7ms -> 7.3ms  (-88.6%, Title LIKE with CounterID=62 filter)
- Q36: 117ms -> 24ms    (-79.5%, URL <> '' with CounterID=62 filter)
- Q40: 17.9ms -> 5.1ms  (-71.5%, multi-pred with RefererHash eq)
- Q41: 17.3ms -> 5.5ms  (-68.1%, multi-pred with URLHash eq)
- Q22: 303ms -> 127ms   (-58.2%, 3 string predicates)
- Q42: 7.6ms -> 3.9ms   (-48.5%, int-only multi-predicate)
- Q21: 159ms -> 98ms    (-38.5%, URL LIKE + SearchPhrase)
- Q38: 19.1ms -> 12.4ms (-34.9%, 5 int predicates)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
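
The effect described in the commit message can be illustrated with a toy cost model. This is a sketch under stated assumptions, not the parquet crate's API: `Predicate`, `run_filter`, and the sample data are hypothetical, and the "cost" is simply one unit per selected row per column in a predicate's mask. It assumes predicates run in order and later predicates only see rows that passed earlier ones.

```rust
/// Toy stand-in for one predicate of a row filter: the column indices it asks
/// the reader to decode, plus a function deciding whether a row passes.
/// (Hypothetical type for illustration, not the parquet crate's API.)
struct Predicate {
    mask: Vec<usize>,
    keep: fn(&[i64]) -> bool,
}

/// Evaluates the predicates in order and returns
/// (rows that pass every predicate, column values counted as decoded).
fn run_filter(rows: &[Vec<i64>], predicates: &[Predicate]) -> (usize, usize) {
    let mut selected: Vec<usize> = (0..rows.len()).collect();
    let mut decoded = 0;
    for p in predicates {
        // Cost model: one unit per still-selected row per column in the mask.
        decoded += selected.len() * p.mask.len();
        selected.retain(|&i| (p.keep)(&rows[i]));
    }
    (selected.len(), decoded)
}

// Column 0 plays the role of a cheap integer column,
// column 1 the role of a column that is expensive to decode.
fn counter_is_62(row: &[i64]) -> bool { row[0] == 62 }
fn payload_is_even(row: &[i64]) -> bool { row[1] % 2 == 0 }

/// 1000 rows; 10 of them have column 0 equal to 62.
fn sample_rows() -> Vec<Vec<i64>> {
    (0..1000).map(|i| vec![i % 100, i / 100]).collect()
}

/// Before: every predicate carries a mask with all filter columns.
fn shared_mask_filter() -> Vec<Predicate> {
    vec![
        Predicate { mask: vec![0, 1], keep: counter_is_62 },
        Predicate { mask: vec![0, 1], keep: payload_is_even },
    ]
}

/// After: each predicate carries only the single column it reads.
fn per_predicate_filter() -> Vec<Predicate> {
    vec![
        Predicate { mask: vec![0], keep: counter_is_62 },
        Predicate { mask: vec![1], keep: payload_is_even },
    ]
}

fn main() {
    let rows = sample_rows();
    let (kept_shared, cost_shared) = run_filter(&rows, &shared_mask_filter());
    let (kept_split, cost_split) = run_filter(&rows, &per_predicate_filter());
    // Both variants keep the same rows; only the decode count differs.
    println!("shared mask:   kept {}, decoded {}", kept_shared, cost_shared);
    println!("per-predicate: kept {}, decoded {}", kept_split, cost_split);
}
```

In this model the selective first predicate no longer pays for the expensive column on all 1000 rows, and the second predicate decodes it only for the 10 rows that remain. Real savings depend on column encodings and predicate selectivity, which the model ignores.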
@Dandandan Dandandan force-pushed the clickbench-optimizations branch from 27f9aba to 41e7bcb on February 14, 2026 16:05
@alamb-ghbot

🤖: Benchmark completed

Details

group                                             clickbench-optimizations               main
-----                                             ------------------------               ----
arrow_reader_clickbench/async/Q1                  1.01      2.3±0.04ms        ? ?/sec    1.00      2.3±0.01ms        ? ?/sec
arrow_reader_clickbench/async/Q10                 1.00     11.3±0.43ms        ? ?/sec    1.01     11.4±0.38ms        ? ?/sec
arrow_reader_clickbench/async/Q11                 1.00     12.8±0.29ms        ? ?/sec    1.03     13.2±0.47ms        ? ?/sec
arrow_reader_clickbench/async/Q12                 1.00     22.5±0.22ms        ? ?/sec    1.02     23.0±0.25ms        ? ?/sec
arrow_reader_clickbench/async/Q13                 1.00     27.8±0.29ms        ? ?/sec    1.02     28.5±0.34ms        ? ?/sec
arrow_reader_clickbench/async/Q14                 1.00     25.2±0.24ms        ? ?/sec    1.02     25.6±0.58ms        ? ?/sec
arrow_reader_clickbench/async/Q19                 1.00      5.4±0.07ms        ? ?/sec    1.05      5.7±0.14ms        ? ?/sec
arrow_reader_clickbench/async/Q20                 1.00    119.3±0.72ms        ? ?/sec    1.00    119.0±9.14ms        ? ?/sec
arrow_reader_clickbench/async/Q21                 1.00    144.6±7.10ms        ? ?/sec    1.15    166.1±1.22ms        ? ?/sec
arrow_reader_clickbench/async/Q22                 1.00    206.8±4.80ms        ? ?/sec    1.15    238.7±2.08ms        ? ?/sec
arrow_reader_clickbench/async/Q23                 1.00    399.4±3.10ms        ? ?/sec    1.01    404.4±2.58ms        ? ?/sec
arrow_reader_clickbench/async/Q24                 1.00     30.8±0.34ms        ? ?/sec    1.03     31.8±0.30ms        ? ?/sec
arrow_reader_clickbench/async/Q27                 1.00     96.4±0.57ms        ? ?/sec    1.03     99.3±1.08ms        ? ?/sec
arrow_reader_clickbench/async/Q28                 1.00     94.3±0.70ms        ? ?/sec    1.03     96.7±0.87ms        ? ?/sec
arrow_reader_clickbench/async/Q30                 1.00     27.6±0.31ms        ? ?/sec    1.02     28.1±0.36ms        ? ?/sec
arrow_reader_clickbench/async/Q36                 1.00     29.3±0.32ms        ? ?/sec    4.00    117.1±0.60ms        ? ?/sec
arrow_reader_clickbench/async/Q37                 1.00      9.6±0.14ms        ? ?/sec    9.66     92.6±0.85ms        ? ?/sec
arrow_reader_clickbench/async/Q38                 1.00     25.5±0.38ms        ? ?/sec    1.37     35.0±0.20ms        ? ?/sec
arrow_reader_clickbench/async/Q39                 1.00     43.1±0.49ms        ? ?/sec    1.06     45.5±0.49ms        ? ?/sec
arrow_reader_clickbench/async/Q40                 1.00     13.7±0.41ms        ? ?/sec    2.95     40.6±0.73ms        ? ?/sec
arrow_reader_clickbench/async/Q41                 1.00     10.6±0.33ms        ? ?/sec    2.79     29.7±0.82ms        ? ?/sec
arrow_reader_clickbench/async/Q42                 1.00      5.9±0.08ms        ? ?/sec    1.87     11.1±0.31ms        ? ?/sec
arrow_reader_clickbench/async_object_store/Q1     1.01      2.3±0.04ms        ? ?/sec    1.00      2.3±0.02ms        ? ?/sec
arrow_reader_clickbench/async_object_store/Q10    1.00     10.6±0.28ms        ? ?/sec    1.03     10.9±0.35ms        ? ?/sec
arrow_reader_clickbench/async_object_store/Q11    1.00     12.1±0.25ms        ? ?/sec    1.02     12.3±0.29ms        ? ?/sec
arrow_reader_clickbench/async_object_store/Q12    1.00     22.1±0.31ms        ? ?/sec    1.02     22.6±0.36ms        ? ?/sec
arrow_reader_clickbench/async_object_store/Q13    1.00     27.0±0.52ms        ? ?/sec    1.02     27.5±0.32ms        ? ?/sec
arrow_reader_clickbench/async_object_store/Q14    1.00     24.6±0.25ms        ? ?/sec    1.01     24.9±0.29ms        ? ?/sec
arrow_reader_clickbench/async_object_store/Q19    1.00      5.1±0.11ms        ? ?/sec    1.02      5.2±0.19ms        ? ?/sec
arrow_reader_clickbench/async_object_store/Q20    1.00    107.0±0.94ms        ? ?/sec    1.03    110.5±1.00ms        ? ?/sec
arrow_reader_clickbench/async_object_store/Q21    1.00    119.1±0.53ms        ? ?/sec    1.05    125.5±0.61ms        ? ?/sec
arrow_reader_clickbench/async_object_store/Q22    1.00    151.8±6.46ms        ? ?/sec    1.52    230.5±1.68ms        ? ?/sec
arrow_reader_clickbench/async_object_store/Q23    1.09   366.5±14.02ms        ? ?/sec    1.00    337.7±1.97ms        ? ?/sec
arrow_reader_clickbench/async_object_store/Q24    1.00     29.6±0.43ms        ? ?/sec    1.03     30.6±0.72ms        ? ?/sec
arrow_reader_clickbench/async_object_store/Q27    1.00     90.8±0.44ms        ? ?/sec    1.05     95.1±0.46ms        ? ?/sec
arrow_reader_clickbench/async_object_store/Q28    1.00     89.2±0.39ms        ? ?/sec    1.04     93.0±0.60ms        ? ?/sec
arrow_reader_clickbench/async_object_store/Q30    1.00     26.2±0.31ms        ? ?/sec    1.03     26.9±0.65ms        ? ?/sec
arrow_reader_clickbench/async_object_store/Q36    1.00     26.3±0.24ms        ? ?/sec    4.30    113.1±0.67ms        ? ?/sec
arrow_reader_clickbench/async_object_store/Q37    1.00      9.3±0.06ms        ? ?/sec    9.64     89.7±1.20ms        ? ?/sec
arrow_reader_clickbench/async_object_store/Q38    1.00     22.4±0.26ms        ? ?/sec    1.44     32.3±0.29ms        ? ?/sec
arrow_reader_clickbench/async_object_store/Q39    1.00     38.1±0.66ms        ? ?/sec    1.09     41.5±0.35ms        ? ?/sec
arrow_reader_clickbench/async_object_store/Q40    1.00     13.0±0.15ms        ? ?/sec    2.96     38.6±0.45ms        ? ?/sec
arrow_reader_clickbench/async_object_store/Q41    1.00     10.1±0.11ms        ? ?/sec    2.83     28.5±0.73ms        ? ?/sec
arrow_reader_clickbench/async_object_store/Q42    1.00      5.6±0.05ms        ? ?/sec    1.87     10.5±0.12ms        ? ?/sec
arrow_reader_clickbench/sync/Q1                   1.00   1996.0±5.46µs        ? ?/sec    1.00  1998.9±22.44µs        ? ?/sec
arrow_reader_clickbench/sync/Q10                  1.00      7.6±0.06ms        ? ?/sec    1.00      7.5±0.06ms        ? ?/sec
arrow_reader_clickbench/sync/Q11                  1.00      9.0±0.08ms        ? ?/sec    1.00      9.0±0.06ms        ? ?/sec
arrow_reader_clickbench/sync/Q12                  1.00     28.7±0.21ms        ? ?/sec    1.03     29.6±1.32ms        ? ?/sec
arrow_reader_clickbench/sync/Q13                  1.29     43.7±0.49ms        ? ?/sec    1.00     34.0±0.31ms        ? ?/sec
arrow_reader_clickbench/sync/Q14                  1.00     30.9±0.30ms        ? ?/sec    1.26     39.0±1.03ms        ? ?/sec
arrow_reader_clickbench/sync/Q19                  1.00      4.2±0.03ms        ? ?/sec    1.01      4.2±0.04ms        ? ?/sec
arrow_reader_clickbench/sync/Q20                  1.00    172.1±1.05ms        ? ?/sec    1.03    178.1±1.00ms        ? ?/sec
arrow_reader_clickbench/sync/Q21                  1.00    129.3±0.84ms        ? ?/sec    1.78    230.1±4.62ms        ? ?/sec
arrow_reader_clickbench/sync/Q22                  1.00    204.5±1.57ms        ? ?/sec    2.35    480.1±3.47ms        ? ?/sec
arrow_reader_clickbench/sync/Q23                  1.00   433.0±14.12ms        ? ?/sec    1.02   441.7±16.05ms        ? ?/sec
arrow_reader_clickbench/sync/Q24                  1.02     40.5±0.57ms        ? ?/sec    1.00     39.8±0.50ms        ? ?/sec
arrow_reader_clickbench/sync/Q27                  1.00    149.0±1.38ms        ? ?/sec    1.03    154.1±0.79ms        ? ?/sec
arrow_reader_clickbench/sync/Q28                  1.00    143.0±1.08ms        ? ?/sec    1.03    147.9±0.87ms        ? ?/sec
arrow_reader_clickbench/sync/Q30                  1.00     27.6±0.32ms        ? ?/sec    1.01     27.9±0.22ms        ? ?/sec
arrow_reader_clickbench/sync/Q36                  1.00     32.8±0.38ms        ? ?/sec    4.64    152.5±0.89ms        ? ?/sec
arrow_reader_clickbench/sync/Q37                  1.00     10.5±0.22ms        ? ?/sec    8.16     85.5±0.47ms        ? ?/sec
arrow_reader_clickbench/sync/Q38                  1.00     18.5±0.23ms        ? ?/sec    1.53     28.2±0.14ms        ? ?/sec
arrow_reader_clickbench/sync/Q39                  1.00     30.9±0.50ms        ? ?/sec    1.09     33.5±0.43ms        ? ?/sec
arrow_reader_clickbench/sync/Q40                  1.00      9.0±0.13ms        ? ?/sec    2.90     26.2±0.39ms        ? ?/sec
arrow_reader_clickbench/sync/Q41                  1.00      9.2±0.07ms        ? ?/sec    3.01     27.6±0.30ms        ? ?/sec
arrow_reader_clickbench/sync/Q42                  1.00      6.7±0.05ms        ? ?/sec    1.77     11.9±0.10ms        ? ?/sec

@Dandandan Dandandan marked this pull request as ready for review February 14, 2026 17:05
@Dandandan Dandandan changed the title from "per-predicate projection masks" to "[Minor] Use per-predicate projection masks" on Feb 14, 2026
@Dandandan Dandandan requested a review from alamb February 14, 2026 17:14
@alamb alamb changed the title from "[Minor] Use per-predicate projection masks" to "[Minor] Use per-predicate projection masks in arrow_reader_clickbench benchmark" on Feb 15, 2026
Contributor

@alamb alamb left a comment

Makes sense to me -- thanks @Dandandan

///
/// Note that since `RowFilter` does not implement Clone, we need to create
/// the filter for each row
/// Each predicate gets a ProjectionMask containing only the single column
Contributor

👍

@Dandandan Dandandan merged commit df63590 into apache:main Feb 15, 2026
24 checks passed
@Dandandan
Contributor Author

Thanks for the review @alamb
