fix: percentile_cont interpolation causes NaN for f16 input by kumarUjjawal · Pull Request #20208 · apache/datafusion

kumarUjjawal · 2026-02-07T12:21:31Z

Which issue does this PR close?

Closes #18945 (percentile_cont interpolation causes NaN for f16 input)

Rationale for this change

percentile_cont interpolation for Float16 could overflow f16 intermediates (e.g. when scaling the fractional component), producing inf/NaN and incorrect results. This PR makes interpolation numerically safe for f16.
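The failure mode can be reproduced without an f16 type. The sketch below is illustrative only: `narrow_to_f16` is a hypothetical stand-in that models just the finite range of f16 (largest finite value 65504) and ignores mantissa rounding, and the precision value of 1_000_000 is an assumption, since this thread does not state it. It applies the integer-style formula `lower + ((upper - lower) * (fraction * PRECISION)) / PRECISION` with every intermediate forced through that range.

```rust
// Assumed value of the precision constant; the thread does not state it.
const PRECISION: f64 = 1_000_000.0;

/// Hypothetical stand-in for narrowing to f16: NaN stays NaN, magnitudes at
/// or above 65520 become infinite (f16's largest finite value is 65504), and
/// everything else passes through without mantissa rounding.
fn narrow_to_f16(x: f64) -> f64 {
    if x.is_nan() || x.abs() < 65520.0 {
        x
    } else {
        f64::INFINITY.copysign(x)
    }
}

/// Integer-style formula with every intermediate forced through the f16 range.
fn interpolate_with_f16_intermediates(lower: f64, upper: f64, fraction: f64) -> f64 {
    // The scaled fraction and the precision itself both exceed the f16 range.
    let scaled = narrow_to_f16((fraction * PRECISION).trunc());
    let precision = narrow_to_f16(PRECISION);
    let diff = narrow_to_f16(upper - lower);
    let product = narrow_to_f16(diff * scaled);
    // With both operands infinite, this division yields NaN.
    let quotient = narrow_to_f16(product / precision);
    narrow_to_f16(lower + quotient)
}
```

Under these assumptions, interpolating halfway between 1.0 and 3.0 yields NaN rather than 2.0, because both the scaled fraction and the precision constant become infinite once narrowed.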

What changes are included in this PR?

• Perform percentile interpolation in f64 and cast back to the input float type (f16/f32/f64) to avoid f16 overflow.
• Add a regression unit test covering Float16 interpolation near the maximum finite value.
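A minimal sketch of the widen, interpolate, narrow pattern described in the first bullet. The `ViaF64` trait and the `interpolate` function are hypothetical names introduced for illustration and are not taken from the DataFusion source. Stable Rust has no built-in f16, so only f32 and f64 are shown, and the quantization of the fraction discussed in the review is omitted.

```rust
/// Hypothetical helper trait: widen a float to f64 and narrow it back.
trait ViaF64: Copy {
    fn to_f64(self) -> f64;
    fn from_f64(v: f64) -> Self;
}

impl ViaF64 for f32 {
    fn to_f64(self) -> f64 {
        f64::from(self)
    }
    fn from_f64(v: f64) -> Self {
        v as f32
    }
}

impl ViaF64 for f64 {
    fn to_f64(self) -> f64 {
        self
    }
    fn from_f64(v: f64) -> Self {
        v
    }
}

/// Linear interpolation with all intermediates held in f64; only the final
/// result is cast back to the input type.
fn interpolate<T: ViaF64>(lower: T, upper: T, fraction: f64) -> T {
    let (lo, hi) = (lower.to_f64(), upper.to_f64());
    T::from_f64(lo + (hi - lo) * fraction)
}
```

For example, `interpolate(f32::MAX / 2.0, f32::MAX, 0.5)` stays finite because no intermediate is held in the narrower type. An f16 implementation of the trait would follow the same shape.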

Are these changes tested?

Yes

Are there any user-facing changes?

Yes. percentile_cont on Float16 inputs no longer returns NaN due to interpolation overflow and now produces correct finite results for valid finite f16 data.

Jefffrey

• Remove the unused interpolation precision constant and associated wrapping-math code paths/comments.

Could we please ensure the PR body stays consistent with the latest changes?

Jefffrey · 2026-02-12T02:20:41Z

-/// is computed as: `lower + ((upper - lower) * (fraction * PRECISION)) / PRECISION`
-/// to avoid floating-point operations on integer types while maintaining precision.
+/// Interpolation is performed in f64 and then cast back to the native type to
+/// avoid overflowing Float16 intermediates.


I'd still prefer we keep the old documentation as it was and just add on any amendments; I don't like how it reads about a "previous implementation" considering there is no information about such implementation (without looking at the git history)

Jefffrey · 2026-02-12T02:21:47Z

-                .div_wrapping(T::Native::usize_as(INTERPOLATION_PRECISION)),
-            );
-            Some(interpolated)
+            let scaled = (fraction * INTERPOLATION_PRECISION as f64) as usize;


If we always cast INTERPOLATION_PRECISION to f64 at all its usages, we should just define it as f64.
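A sketch of what this suggestion could look like, assuming a precision value of 1_000_000 (not stated in this thread). The `quantize_fraction` helper is hypothetical and only shows that an f64 constant removes the `as f64` cast at each use.

```rust
// Declared as f64 up front so no cast is needed where it is used.
// The value is an assumption made for this illustration.
const INTERPOLATION_PRECISION: f64 = 1_000_000.0;

/// Hypothetical helper: truncate the fraction to the configured precision.
fn quantize_fraction(fraction: f64) -> f64 {
    (fraction * INTERPOLATION_PRECISION).trunc() / INTERPOLATION_PRECISION
}
```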

Jefffrey · 2026-02-12T02:23:53Z

+            // Linear interpolation.
+            //
+            // We quantize the fractional component (via `INTERPOLATION_PRECISION`) to
+            // minimize output changes for Float32/Float64 compared to the previous


What previous implementation are we referring to here?

alamb · 2026-02-13T17:06:32Z

Thanks @kumarUjjawal and @Jefffrey

fix: percentile_cont interpolation causes NaN for f16 input

f114360

github-actions Bot added the functions Changes to functions implementation label Feb 7, 2026

fix slt tests

4529e2f

github-actions Bot added the sqllogictest SQL Logic Tests (.slt) label Feb 7, 2026

Jefffrey reviewed Feb 8, 2026

Comment thread datafusion/functions-aggregate/src/percentile_cont.rs Outdated

Comment thread datafusion/sqllogictest/test_files/aggregate.slt Outdated

use num-trait for as primitive

242875f

Jefffrey reviewed Feb 10, 2026

Comment thread datafusion/functions-aggregate/src/percentile_cont.rs Outdated

Comment thread datafusion/functions-aggregate/src/percentile_cont.rs

Comment thread datafusion/functions-aggregate/src/percentile_cont.rs Outdated

revert original comments

157b79e

Jefffrey reviewed Feb 12, 2026

update comments and change type of interpolation precision

d8a4ebb

Jefffrey approved these changes Feb 12, 2026

alamb added this pull request to the merge queue Feb 13, 2026

Merged via the queue into apache:main with commit f5a2ac3 Feb 13, 2026
31 checks passed

alamb merged 5 commits into apache:main from kumarUjjawal:fix/percentile_cont_f16

3 participants