
perf: Optimize starts_with and ends_with for scalar arguments #19516

Merged
Dandandan merged 5 commits into apache:main from andygrove:faster-starts-ends-with on Dec 28, 2025

@andygrove andygrove commented Dec 27, 2025

Which issue does this PR close?

  • Closes #.

Rationale for this change

  • Handling scalar arguments directly delivers a 3.6x-8x speedup for the common case of starts_with(column, 'literal') or ends_with(column, 'literal')
  • StringViewArray benefits more (6.4x-8.0x) than StringArray (3.6x-3.8x)
  • The optimization uses Arrow's Scalar wrapper to avoid broadcasting the scalar value to a full array
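To illustrate why skipping the broadcast helps, here is a minimal standard-library sketch. It does not use Arrow or DataFusion; the function names `starts_with_broadcast` and `starts_with_scalar` are hypothetical and are not taken from this PR. The first function materializes one copy of the literal per row before comparing, which is roughly what broadcasting a scalar to an array does. The second compares every row against the single literal.

```rust
/// Broadcast approach (hypothetical): copy the literal once per row,
/// then compare the two equal-length sequences pairwise.
fn starts_with_broadcast(values: &[&str], prefix: &str) -> Vec<bool> {
    // One allocation per row just to repeat the same literal.
    let broadcast: Vec<String> = vec![prefix.to_string(); values.len()];
    values
        .iter()
        .zip(broadcast.iter())
        .map(|(v, p)| v.starts_with(p.as_str()))
        .collect()
}

/// Scalar approach (hypothetical): compare every row against the
/// single literal, with no per-row copy.
fn starts_with_scalar(values: &[&str], prefix: &str) -> Vec<bool> {
    values.iter().map(|v| v.starts_with(prefix)).collect()
}
```

Both functions return the same result; the scalar version simply avoids building the intermediate array. The actual speedups reported in this PR come from the Arrow kernels and are not reproduced by this sketch.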

starts_with

| Benchmark                | Before   | After   | Speedup |
|--------------------------|----------|---------|---------|
| StringArray + scalar     | 32.38 µs | 8.49 µs | 3.8x    |
| StringViewArray + scalar | 78.15 µs | 9.82 µs | 8.0x    |

ends_with

| Benchmark                | Before   | After    | Speedup |
|--------------------------|----------|----------|---------|
| StringArray + scalar     | 32.76 µs | 9.06 µs  | 3.6x    |
| StringViewArray + scalar | 76.44 µs | 12.04 µs | 6.4x    |

What changes are included in this PR?

Handle all combinations of array and scalar arguments without converting scalars to arrays.
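The dispatch over argument combinations might look roughly like the following standard-library sketch. The `Arg` and `Out` enums and the `starts_with_dispatch` function are hypothetical stand-ins invented for illustration; they are not DataFusion's types or the code in this PR. The sketch also assumes that a null on either side yields a null result, that two array arguments have equal length, and that a scalar/scalar call returns a scalar.

```rust
/// Hypothetical stand-in for a function argument: a column or one literal.
enum Arg {
    Array(Vec<Option<String>>),
    Scalar(Option<String>),
}

/// Hypothetical result type: scalar-only input yields a scalar (an assumption).
#[derive(Debug, PartialEq)]
enum Out {
    Array(Vec<Option<bool>>),
    Scalar(Option<bool>),
}

/// Null-propagating prefix check: a null on either side yields null.
fn check(haystack: Option<&str>, needle: Option<&str>) -> Option<bool> {
    Some(haystack?.starts_with(needle?))
}

/// Handle each argument combination directly; the scalar is never
/// expanded into a full array.
fn starts_with_dispatch(left: &Arg, right: &Arg) -> Out {
    match (left, right) {
        // Assumes both arrays have the same length.
        (Arg::Array(l), Arg::Array(r)) => Out::Array(
            l.iter()
                .zip(r)
                .map(|(a, b)| check(a.as_deref(), b.as_deref()))
                .collect(),
        ),
        (Arg::Array(l), Arg::Scalar(s)) => Out::Array(
            l.iter().map(|a| check(a.as_deref(), s.as_deref())).collect(),
        ),
        (Arg::Scalar(s), Arg::Array(r)) => Out::Array(
            r.iter().map(|b| check(s.as_deref(), b.as_deref())).collect(),
        ),
        (Arg::Scalar(a), Arg::Scalar(b)) => {
            Out::Scalar(check(a.as_deref(), b.as_deref()))
        }
    }
}
```

The array/scalar arm corresponds to the common starts_with(column, 'literal') case that the benchmarks above measure.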

Are these changes tested?

Yes, new unit tests added in this PR.

Are there any user-facing changes?

No, just faster performance.

@github-actions bot added the "functions" (Changes to functions implementation) label Dec 27, 2025
@andygrove marked this pull request as ready for review December 27, 2025 19:57
@andygrove requested a review from Dandandan December 27, 2025 20:24
@Dandandan added this pull request to the merge queue Dec 28, 2025
Merged via the queue into apache:main with commit bb4e0ec Dec 28, 2025
31 checks passed