Skip to content

Conversation

@alamb
Copy link
Contributor

@alamb alamb commented Apr 2, 2024

Which issue does this PR close?

Related to #9764. Follow on to #9824

Rationale for this change

On #9824 @comphead saw that using Arc<String> made a significant difference in planning benchmarks compared to Arc<str> - #9824 (comment)

I believe this is due to the fact that the original data often comes as a String so making an Arc<str>from a &str results in a second copy.

It turns out that the Rust compiler is smart enough to avoid copying when using impl Into<Arc<str>> -- I thought it was, even though the docs seem to suggest that is not the case.

What changes are included in this PR?

Change from

    pub fn bare(table: &str) -> TableReference {

To

    pub fn bare(table: impl Into<Arc<str>>) -> TableReference {

Are these changes tested?

yes, by existing CI

Are there any user-facing changes?

Significantly faster performance (5x in at least one case it seems, on select *)

Benchmark Results

++ critcmp main perf_experiment
group                                         main                                    perf_experiment
-----                                         ----                                    ---------------
logical_aggregate_with_join                   1.07  1287.6±101.41µs        ? ?/sec    1.00  1201.6±70.47µs        ? ?/sec
logical_plan_tpch_all                         1.02     17.3±0.16ms        ? ?/sec     1.00     16.9±0.17ms        ? ?/sec
logical_select_all_from_1000                  4.93     95.3±0.41ms        ? ?/sec     1.00     19.3±0.10ms        ? ?/sec
logical_select_one_from_700                   1.00    741.9±9.77µs        ? ?/sec     1.07   795.3±13.53µs        ? ?/sec
logical_trivial_join_high_numbered_columns    1.07   791.2±11.13µs        ? ?/sec     1.00    739.9±9.52µs        ? ?/sec
logical_trivial_join_low_numbered_columns     1.04    754.9±8.61µs        ? ?/sec     1.00   729.0±21.65µs        ? ?/sec
physical_plan_tpch_all                        1.11    136.9±0.88ms        ? ?/sec     1.00    123.8±1.07ms        ? ?/sec
physical_plan_tpch_q1                         1.05      7.8±0.07ms        ? ?/sec     1.00      7.4±0.07ms        ? ?/sec
physical_plan_tpch_q10                        1.14      6.5±0.05ms        ? ?/sec     1.00      5.7±0.06ms        ? ?/sec
physical_plan_tpch_q11                        1.02      5.1±0.04ms        ? ?/sec     1.00      5.0±0.07ms        ? ?/sec
physical_plan_tpch_q12                        1.05      4.2±0.04ms        ? ?/sec     1.00      4.0±0.03ms        ? ?/sec
physical_plan_tpch_q13                        1.03      2.7±0.03ms        ? ?/sec     1.00      2.6±0.02ms        ? ?/sec
physical_plan_tpch_q14                        1.02      3.5±0.03ms        ? ?/sec     1.00      3.4±0.04ms        ? ?/sec
physical_plan_tpch_q16                        1.08      5.4±0.05ms        ? ?/sec     1.00      5.0±0.04ms        ? ?/sec
physical_plan_tpch_q17                        1.04      5.0±0.17ms        ? ?/sec     1.00      4.8±0.03ms        ? ?/sec
physical_plan_tpch_q18                        1.09      5.6±0.06ms        ? ?/sec     1.00      5.1±0.05ms        ? ?/sec
physical_plan_tpch_q19                        1.06     10.3±0.07ms        ? ?/sec     1.00      9.8±0.06ms        ? ?/sec
physical_plan_tpch_q2                         1.15     12.5±0.07ms        ? ?/sec     1.00     10.8±0.16ms        ? ?/sec
physical_plan_tpch_q20                        1.05      6.6±0.07ms        ? ?/sec     1.00      6.3±0.08ms        ? ?/sec
physical_plan_tpch_q21                        1.14      9.7±0.07ms        ? ?/sec     1.00      8.5±0.06ms        ? ?/sec
physical_plan_tpch_q22                        1.05      4.7±0.05ms        ? ?/sec     1.00      4.5±0.04ms        ? ?/sec
physical_plan_tpch_q3                         1.07      4.2±0.04ms        ? ?/sec     1.00      4.0±0.03ms        ? ?/sec
physical_plan_tpch_q4                         1.15      3.4±0.03ms        ? ?/sec     1.00      2.9±0.03ms        ? ?/sec
physical_plan_tpch_q5                         1.08      6.2±0.05ms        ? ?/sec     1.00      5.7±0.06ms        ? ?/sec
physical_plan_tpch_q6                         1.02      2.1±0.03ms        ? ?/sec     1.00      2.0±0.02ms        ? ?/sec
physical_plan_tpch_q7                         1.14      8.8±0.07ms        ? ?/sec     1.00      7.8±0.06ms        ? ?/sec
physical_plan_tpch_q8                         1.25     12.4±0.09ms        ? ?/sec     1.00      9.9±0.08ms        ? ?/sec
physical_plan_tpch_q9                         1.26      9.4±0.07ms        ? ?/sec     1.00      7.5±0.06ms        ? ?/sec
physical_select_all_from_1000                 5.46    693.9±1.52ms        ? ?/sec     1.00    127.1±1.17ms        ? ?/sec
physical_select_one_from_700                  1.04      4.2±0.02ms        ? ?/sec     1.00      4.1±0.03ms        ? ?/sec

@github-actions github-actions bot added sql SQL Planner logical-expr Logical plan and expressions core Core DataFusion crate substrait Changes to the substrait crate labels Apr 2, 2024
/// all, so "Foo.Bar" stays as a reference to the table named
/// "Foo.Bar" (rather than "foo"."bar")
pub fn bare(table: &str) -> TableReference {
pub fn bare(table: impl Into<Arc<str>>) -> TableReference {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR tests the theory that

    pub fn bare(table: impl Into<Arc<str>>) -> TableReference {

will be faster than

    pub fn bare(table: &str) -> TableReference {

@alamb

This comment was marked as outdated.

@alamb alamb force-pushed the alamb/perf_experiment branch from d21617c to c734791 Compare April 2, 2024 20:16
@github-actions github-actions bot removed logical-expr Logical plan and expressions core Core DataFusion crate substrait Changes to the substrait crate labels Apr 2, 2024
@alamb alamb changed the title [WIP] Test using impl Into<Arc<str>> Improve planning speed using impl Into<Arc<str>> to create Arc<str> rather than &str Apr 2, 2024
@alamb
Copy link
Contributor Author

alamb commented Apr 2, 2024

This PR actually seems quite a bit more effective at squeezing even more performance out of @comphead 's #9824

1 => {
let table = taker.take(enable_normalization);
Ok(OwnedTableReference::bare(&table))
Ok(OwnedTableReference::bare(table))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

somehow it seems the compiler can avoid copying / allocating by getting this as an owned item

@alamb alamb marked this pull request as ready for review April 2, 2024 21:16
Copy link
Contributor

@comphead comphead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm thanks @alamb 🚀

@comphead
Copy link
Contributor

comphead commented Apr 2, 2024

I'm planning to deprecate Owned* types, so will wait this PR to be merged to avoid conflicts

@alamb alamb merged commit d2ba901 into apache:main Apr 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

sql SQL Planner

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants