Skip to content

Improve InListExpr plan display#17884

Merged
alamb merged 4 commits intoapache:mainfrom
pepijnve:in_set_explain
Oct 4, 2025
Merged

Improve InListExpr plan display#17884
alamb merged 4 commits intoapache:mainfrom
pepijnve:in_set_explain

Conversation

@pepijnve
Copy link
Copy Markdown
Contributor

@pepijnve pepijnve commented Oct 2, 2025

Which issue does this PR close?

Rationale for this change

Aligns the explain output for IN (SET) and NOT IN (SET). The presence of Use is a bit annoying.

What changes are included in this PR?

  • Removes Use from the formatted output
  • Adapt tests where necessary

Are these changes tested?

Covered by existing tests

Are there any user-facing changes?

The explain output strings change slightly

@github-actions github-actions Bot added the physical-expr Changes to the physical-expr crates label Oct 2, 2025
@github-actions github-actions Bot added the sqllogictest SQL Logic Tests (.slt) label Oct 2, 2025
@pepijnve
Copy link
Copy Markdown
Contributor Author

pepijnve commented Oct 2, 2025

@Jefffrey I'm fixing the failing test. While I'm at it I'm considering adding another change in this PR that uses Display for the set elements rather than Debug. The output now is way too verbose imo.

Example from one of the test cases:

rather than

a@0 IN (SET) ([Literal { value: Utf8("a"), field: Field { name: "lit", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} } }, Literal { value: Utf8("b"), field: Field { name: "lit", data_type: Utf8, nullable: false, d ...

we would get

a@0 IN (SET) ([a, b, NULL])

is that ok to include or should I make a separate PR?

@Jefffrey
Copy link
Copy Markdown
Contributor

Jefffrey commented Oct 2, 2025

@Jefffrey I'm fixing the failing test. While I'm at it I'm considering adding another change in this PR that uses Display for the set elements rather than Debug. The output now is way too verbose imo.

Example from one of the test cases:

rather than

a@0 IN (SET) ([Literal { value: Utf8("a"), field: Field { name: "lit", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} } }, Literal { value: Utf8("b"), field: Field { name: "lit", data_type: Utf8, nullable: false, d ...

we would get

a@0 IN (SET) ([a, b, NULL])

is that ok to include or should I make a separate PR?

I think that's a great idea to include in this PR; I recall a recent PR where we saw this debug display too which was a bit ugly: #17732 (comment)

let display_string = expr.to_string();
assert_eq!(sql_string, "a NOT IN (a, b, NULL)");
assert_eq!(display_string, "a@0 NOT IN (SET) ([Literal { value: Utf8(\"a\"), field: Field { name: \"lit\", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} } }, Literal { value: Utf8(\"b\"), field: Field { name: \"lit\", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} } }, Literal { value: Utf8(NULL), field: Field { name: \"lit\", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} } }])");
assert_eq!(display_string, "a@0 NOT IN (SET) ([a, b, NULL])");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

200_d

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been wondering about the difference in formatting of literals between logical and physical plans. Logical will show Utf8("a") while physical shows just a. Out of scope for what I'm doing here, but would you want the type information in physical as well? For string literals in particular I think quotes would be a useful addition to avoid any possible confusion with column or table names.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll make a separate PR for that since it will probably affect quite a lot of the sql logic test results.

@pepijnve pepijnve changed the title Remove spurious Use in InListExpr display formatted output Improve InListExpr plan display Oct 2, 2025
Copy link
Copy Markdown
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @pepijnve

let display_string = expr.to_string();
assert_eq!(sql_string, "a NOT IN (a, b, NULL)");
assert_eq!(display_string, "a@0 NOT IN (SET) ([Literal { value: Utf8(\"a\"), field: Field { name: \"lit\", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} } }, Literal { value: Utf8(\"b\"), field: Field { name: \"lit\", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} } }, Literal { value: Utf8(NULL), field: Field { name: \"lit\", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} } }])");
assert_eq!(display_string, "a@0 NOT IN (SET) ([a, b, NULL])");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree

@alamb alamb added this pull request to the merge queue Oct 4, 2025
@alamb
Copy link
Copy Markdown
Contributor

alamb commented Oct 4, 2025

Thanks again @pepijnve and @Jefffrey

Merged via the queue into apache:main with commit d273ffb Oct 4, 2025
29 checks passed
@pepijnve pepijnve deleted the in_set_explain branch November 3, 2025 16:22
LiaCastaneda pushed a commit to DataDog/datafusion that referenced this pull request Dec 17, 2025
* Remove spurious `Use` in InListExpr display formatted output

* Adapt tpch.slt expected results

* Reduce verbosity of Display for InListExpr output

* Silence clippy warning

(cherry picked from commit d273ffb)
LiaCastaneda pushed a commit to DataDog/datafusion that referenced this pull request Dec 17, 2025
* Remove spurious `Use` in InListExpr display formatted output

* Adapt tpch.slt expected results

* Reduce verbosity of Display for InListExpr output

* Silence clippy warning

(cherry picked from commit d273ffb)
(cherry picked from commit 050a110)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

physical-expr Changes to the physical-expr crates sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Spurious 'Use' in InList display output

3 participants