-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Add --substrait-round-trip option in sqllogictests
#16183
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add --substrait-round-trip option in sqllogictests
#16183
Conversation
dd41d5b to
2cc9244
Compare
2cc9244 to
e481a15
Compare
|
Code looks fine to me but I'm not familiar with the sqllogictests code so will leave it to Andrew to review properly. I really like this additional testing for Substrait, thanks! |
2010YOUY01
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I suggest adding a CI test that exercises this option with a passing .slt file (or create a minimal one). Otherwise this LGTM.
Something like this 1a05ba2? |
alamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great @gabotechs -- thank you
I also tried it locally with tests I know fail and got an error as expected:
cargo test --test sqllogictests -- --substrait-round-trip agg
...
10. query failed: DataFusion error: This feature is not implemented: Producing a row from empty relation is unsupported
[SQL] with data as (
select 1 as f, 4 as b
union all
select null as f, 99 as b
union all
select 2 as f, 5 as b
union all
select 98 as f, null as b
union all
select 3 as f, 6 as b
union all
select null as f, null as b
)
select corr(f, b)
from data
at /Users/andrewlamb/Software/datafusion2/datafusion/sqllogictest/test_files/aggregate.slt:539| POSTGRES_PORT: ${{ job.services.postgres.ports[5432] }} | ||
|
|
||
| sqllogictest-substrait: | ||
| name: "Run sqllogictest in Substrait round-trip mode" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
|
Thank you for the review @2010YOUY01 |
* Add substrait roundtrip option in sqllogictests * Fix doc link and missing license header * Add README.md entry for the Substrait round-trip mode * Link tracking issue in README.md * Use clap's `conflicts_with` instead of manually checking flag compatibility * Add sqllogictest-substrait job to the CI * Revert committed formatting changes to README.md
| (_, _, true) => { | ||
| run_test_file_substrait_round_trip( | ||
| test_file, | ||
| validator, | ||
| m_clone, | ||
| m_style_clone, | ||
| filters.as_ref(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't ignore postgres_runner and complete arguments.
Unsupported arg combinations should be an error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think clap should be taking care of that
datafusion/datafusion/sqllogictest/bin/sqllogictests.rs
Lines 636 to 642 in 76ff87b
| #[clap( | |
| long, | |
| conflicts_with = "complete", | |
| conflicts_with = "postgres_runner", | |
| help = "Before executing each query, convert its logical plan to Substrait and from Substrait back to its logical plan" | |
| )] | |
| substrait_round_trip: bool, |
As suggested in #16183 (comment).
If you try to run incompatible commands you would get something like this:
error: the argument '--substrait-round-trip' cannot be used with '--complete'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point and i haven't checked that
in the past i've observed some param combinations being ignored...
it's best if the clap does that (prints best error messages), but IMO the code further down should not rely on clap validation (think: args struct could come from some other source) and thus should re-validate.
anyway, that's a nit, since we have the clap validation
Which issue does this PR close?
It probably does not fully closes it but it partially addresses:
Rationale for this change
Being able to execute the full sqllogictest suite adding a Substrait roundtrip. This means:
What changes are included in this PR?
Adds a new
DataFusionSubstraitRoundTripsqllogictest runner implementation that will do a Substrait roundtrip where it makes sense, and a new--substrait-round-tripoption to the sqllogictest CLI.Example:
Runs the full aggregate.slt file doing a Substrait roundtrip on each individual statement
cargo test --test sqllogictests -- --substrait-round-trip aggregateRuns an individual failing test reproducing one specific issue with the current Substrait conversions
cargo test --test sqllogictests -- --substrait-round-trip aggregate:1522This is still not enforced in the CI, and I would say the expectations are that it's just another tool to catch existing bugs in the current Substrait conversion code and better communicate them so that contributors have a clear path forward for contributing fixes.
Are these changes tested?
N/A
Are there any user-facing changes?
No