perf: Optimize scalar performance for cot#19888

Merged
Jefffrey merged 3 commits into apache:main from kumarUjjawal:perf/cot_scalar_path
Jan 21, 2026

@kumarUjjawal
Contributor

Which issue does this PR close?

Rationale for this change

The cot function currently converts scalar inputs to arrays before processing, even for single scalar values. This adds unnecessary overhead from array allocation and conversion. Adding a scalar fast path avoids this overhead.
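A minimal sketch of the fast-path idea, assuming a simplified stand-in for `ColumnarValue`. The `Value` enum and both function names are hypothetical, and `compute_cot` assumes cot(x) = 1 / tan(x); this is not the code from the PR:

```rust
/// Hypothetical stand-in for DataFusion's `ColumnarValue`; not the real type.
/// `None` models a SQL NULL.
#[derive(Debug, PartialEq)]
enum Value {
    Scalar(Option<f64>),
    Array(Vec<Option<f64>>),
}

/// Cotangent as the reciprocal of the tangent (an assumed definition).
fn compute_cot(x: f64) -> f64 {
    1.0 / x.tan()
}

/// Slow path shape: a scalar is first expanded into a one-element array.
fn cot_via_array(input: Value) -> Value {
    let values = match input {
        Value::Scalar(v) => vec![v], // extra allocation for a single value
        Value::Array(values) => values,
    };
    Value::Array(values.into_iter().map(|v| v.map(compute_cot)).collect())
}

/// Fast path shape: a scalar is computed directly and stays a scalar.
fn cot_with_fast_path(input: Value) -> Value {
    match input {
        Value::Scalar(v) => Value::Scalar(v.map(compute_cot)),
        Value::Array(values) => {
            Value::Array(values.into_iter().map(|v| v.map(compute_cot)).collect())
        }
    }
}
```

In this sketch the scalar branch of `cot_with_fast_path` allocates nothing, which is the kind of overhead the rationale above describes.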

What changes are included in this PR?

  • Added scalar fast path
  • Added benchmark
  • Updated tests

| Type | Before | After | Speedup |
|------|--------|-------|---------|
| cot_f64_scalar | 229 ns | 67 ns | 3.4x |
| cot_f32_scalar | 247 ns | 59 ns | 4.2x |

Are these changes tested?

Are there any user-facing changes?

github-actions bot added the functions (Changes to functions implementation) label on Jan 19, 2026
ColumnarValue::Scalar(_) => {
panic!("Expected an array value")
}
}
Member

There are no tests for the Scalar input/output (the fast path).
Also, it would be good to add tests for inputs like NULL, 0.0 and f64::consts::PI.
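For illustration, here is roughly what those three inputs produce if cot is computed as 1 / tan(x). That definition is an assumption here, and `cot_nullable` is a hypothetical helper, not the function under review:

```rust
use std::f64::consts::PI;

/// Assumed definition: cot(x) = 1 / tan(x). `None` models SQL NULL.
fn cot_nullable(x: Option<f64>) -> Option<f64> {
    x.map(|v| 1.0 / v.tan())
}

/// Classifies the three edge cases mentioned in the review comment.
fn describe_edge_cases() -> Vec<(&'static str, Option<f64>)> {
    vec![
        ("NULL", cot_nullable(None)),    // NULL in, NULL out
        ("0.0", cot_nullable(Some(0.0))), // tan(0.0) is 0.0, so this is +infinity
        ("PI", cot_nullable(Some(PI))),   // f64 PI is not exactly pi: large finite negative value
    ]
}
```

Under that definition NULL stays NULL, 0.0 maps to positive infinity, and PI maps to a very large negative but finite value, so a test on PI is better written as a sign and magnitude check than as an exact comparison.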

Contributor Author

The existing sqllogictests should already cover the functionality. Aren't the changes just an optimization?

Member

There is no .slt test for the non-Spark cot function:

❯ rg cot datafusion/sqllogictest/
datafusion/sqllogictest/test_files/spark/math/cot.slt
24:## Original Query: SELECT cot(1);
27:#SELECT cot(1::int);

datafusion/sqllogictest/test_files/aggregates_topk.slt
203:('y', 'apricot'),

datafusion/sqllogictest/test_files/imdb.slt
850:    (24, 'Ridley Scott', NULL, NULL, 'm', NULL, NULL, NULL, NULL),

Or maybe datafusion/sqllogictest/test_files/spark/math/cot.slt is not really for Spark because I see no cot in https://github.com/apache/datafusion/blob/main/datafusion/spark/src/function/math/mod.rs

Anyway, https://github.com/apache/datafusion/blob/main/datafusion/sqllogictest/test_files/spark/math/cot.slt contains only commented out code, so there are no SLT tests for cot.

Contributor Author

Added unit tests for these. Thanks for the feedback.

Comment thread datafusion/functions/src/math/cot.rs Outdated
.unary::<_, Float32Type>(|x: f32| compute_cot32(x)),
) as ArrayRef),
other => exec_err!("Unsupported data type {other:?} for function cot"),
let return_type = args.return_type().clone();
Member

This variable is used just once. It could be moved inside the if scalar.is_null() { block to avoid the clone when it is not used.
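A small sketch of that suggestion, assuming the return type is only needed to build a typed NULL. `ReturnType`, `ScalarOut` and `cot_scalar` are hypothetical names for illustration, not the PR's code:

```rust
/// Hypothetical stand-in for a return type descriptor that costs something to clone.
#[derive(Clone, Debug, PartialEq)]
struct ReturnType(String);

/// Hypothetical scalar result: either a typed NULL or a computed value.
#[derive(Debug, PartialEq)]
enum ScalarOut {
    Null(ReturnType),
    Value(f64),
}

/// Borrows the return type and clones it only on the NULL branch.
fn cot_scalar(input: Option<f64>, return_type: &ReturnType) -> ScalarOut {
    match input {
        // NULL input: the typed NULL needs an owned copy, so clone here.
        None => ScalarOut::Null(return_type.clone()),
        // Non-NULL input: no clone happens at all.
        Some(x) => ScalarOut::Value(1.0 / x.tan()),
    }
}
```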

.unary::<_, Float32Type>(compute_cot32),
))),
other => {
internal_err!("Unexpected data type {other:?} for function cot")
Member

Is it intentional to use internal_err!() instead of exec_err!() (old line 116)?

Contributor Author

If we reach the other => branch, the type coercion/signature code has a bug. This should never happen in normal execution, hence internal_err.
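The distinction can be sketched as follows, with a hypothetical `SketchError` enum standing in for the two error kinds and a toy `DataType`. Neither is the real DataFusion type, and `parse_angle` exists only to show the contrasting user-facing case:

```rust
/// Hypothetical stand-in for two error kinds; not the real `DataFusionError`.
#[derive(Debug, PartialEq)]
enum SketchError {
    /// A failure that ordinary user input can trigger.
    Execution(String),
    /// A broken invariant inside the engine, i.e. a bug.
    Internal(String),
}

/// Toy data type enum for illustration only.
#[derive(Debug, Clone, Copy)]
enum DataType {
    Float32,
    Float64,
    Utf8,
}

/// After type coercion only float types should arrive here, so any other
/// type signals an engine bug rather than bad user input.
fn check_cot_input(dt: DataType) -> Result<(), SketchError> {
    match dt {
        DataType::Float32 | DataType::Float64 => Ok(()),
        other => Err(SketchError::Internal(format!(
            "Unexpected data type {:?} for function cot",
            other
        ))),
    }
}

/// Contrast: malformed user input is an execution error, not an internal one.
fn parse_angle(s: &str) -> Result<f64, SketchError> {
    s.trim()
        .parse::<f64>()
        .map_err(|e| SketchError::Execution(format!("invalid angle {:?}: {}", s, e)))
}
```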

Comment thread datafusion/functions/benches/cot.rs Outdated
.invoke_with_args(ScalarFunctionArgs {
args: scalar_f32_args.clone(),
arg_fields: scalar_f32_arg_fields.clone(),
number_rows: 1,
Member

Suggested change
- number_rows: 1,
+ number_rows: size,

Currently the input is always the same for all values of size. Maybe number_rows could be used to vary it a bit?

Contributor Author

The benchmark loop already varies size for array benchmarks. For scalar, the point is to measure single-value performance regardless of batch size context.

Member

In that case there is no need for the scalar bench to be inside for size in [1024, 4096, 8192] {. Currently it executes the very same logic with the very same config three times (once for each size).
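A rough sketch of the resulting layout, using a plain list of names instead of a real benchmark harness. The scalar names come from the table in the PR description; the array names and the `planned_benchmarks` function are hypothetical:

```rust
/// Builds the list of benchmark names, as a stand-in for registering them
/// with a real harness (none is used here).
fn planned_benchmarks() -> Vec<String> {
    let mut names = Vec::new();

    // Scalar benchmarks do not depend on `size`, so they are registered once,
    // outside the loop.
    names.push("cot_f64_scalar".to_string());
    names.push("cot_f32_scalar".to_string());

    // Array benchmarks vary with the batch size (names are hypothetical).
    for size in [1024, 4096, 8192] {
        names.push(format!("cot_f64_array: {}", size));
        names.push(format!("cot_f32_array: {}", size));
    }

    names
}
```

With the scalar cases hoisted out of the loop, this layout yields two scalar entries and six array entries rather than six of each.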

Contributor Author

Yeah right!

Comment thread datafusion/functions/benches/cot.rs Outdated
.invoke_with_args(ScalarFunctionArgs {
args: scalar_f64_args.clone(),
arg_fields: scalar_f64_arg_fields.clone(),
number_rows: 1,
Member

Suggested change
- number_rows: 1,
+ number_rows: size,

@Jefffrey Jefffrey added this pull request to the merge queue Jan 21, 2026
Merged via the queue into apache:main with commit 4d8d48c Jan 21, 2026
28 checks passed
@Jefffrey
Contributor

Thanks @kumarUjjawal & @martin-g

de-bgunter pushed a commit to de-bgunter/datafusion that referenced this pull request Mar 24, 2026
## Which issue does this PR close?

- Part of apache/datafusion-comet#2986.

## Rationale for this change

The cot function currently converts scalar inputs to arrays before
processing, even for single scalar values. This adds unnecessary
overhead from array allocation and conversion. Adding a scalar fast path
avoids this overhead.

## What changes are included in this PR?

- Added scalar fast path
- Added benchmark
- Updated tests


| Type | Before | After | Speedup |
|------|--------|-------|---------|
| **cot_f64_scalar** | 229 ns | 67 ns | **3.4x** |
| **cot_f32_scalar** | 247 ns | 59 ns | **4.2x** |

## Are these changes tested?



## Are there any user-facing changes?

Labels

functions Changes to functions implementation

3 participants