-
Notifications
You must be signed in to change notification settings - Fork 29.1k
[SPARK-29975][SQL] introduce --CONFIG_DIM directive #26612
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| @@ -1,8 +1,3 @@ | |||
| -- List of configuration the test suite is run against: | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to test the optimizer, don't need to run it with different join operators
| @@ -1,8 +1,3 @@ | |||
| -- List of configuration the test suite is run against: | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to test the analyzer/optimizer. Natural join will be rewritten to other normal joins, no need to un it with different join operators
|
|
||
| -- Set the cross join enabled flag for the LEFT JOIN test since there's no join condition. | ||
| -- Ultimately the join should be optimized away. | ||
| set spark.sql.crossJoin.enabled = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is true by default now.
| @@ -1,8 +1,3 @@ | |||
| -- List of configuration the test suite is run against: | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we are testing UDFs, and the join operator doesn't matter.
|
Test build #114175 has finished for PR 26612 at commit
|
| @@ -104,9 +101,6 @@ class ThriftServerQueryTestSuite extends SQLQueryTestSuite { | |||
| "subquery/in-subquery/simple-in.sql", | |||
| "subquery/in-subquery/in-order-by.sql", | |||
| "subquery/in-subquery/in-set-operations.sql", | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, @cloud-fan .
This causes linter failure. Please remove the ending ,.
$ dev/lint-scala
Scalastyle checks failed at following occurrences:
[error] /Users/dongjoon/PRS/PR-26612/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/ThriftServerQueryTestSuite.scala: illegal start of simple expression: Token(RPAREN,),3495,))
[error] Total time: 20 s, completed Nov 20, 2019 11:24:55 AM
|
Test build #114198 has finished for PR 26612 at commit
|
|
retest this please |
|
Test build #114216 has finished for PR 26612 at commit
|
|
Test build #114218 has finished for PR 26612 at commit
|
|
After this PR:
Before this PR: (use the result of #26214)
|
|
Yes. Much faster than before. ThriftServerQueryTestSuite: |
|
Merged to master. |
| -- Test aggregate operator with codegen on and off. | ||
| --CONFIG_DIM1 spark.sql.codegen.wholeStage=true | ||
| --CONFIG_DIM1 spark.sql.codegen.wholeStage=false,spark.sql.codegen.factoryMode=CODEGEN_ONLY | ||
| --CONFIG_DIM1 spark.sql.codegen.wholeStage=false,spark.sql.codegen.factoryMode=NO_CODEGEN |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Sorry to be late) In the one dimension case, CONFIG_DIM1 is the same with SET?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's different. We still run this test 3 times as there are 3 config sets in this dimension. It's only the same with SET if there is only one dimension and one config set.
| // - config: (String, String)) | ||
| // We need to do cartesian product for all the config dimensions, to get a list of | ||
| // config sets, and run the query once for each config set. | ||
| val configDimLines = comments.filter(_.startsWith("--CONFIG_DIM")).map(_.substring(12)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better to update how-to use CONFIG_DIM in
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah we should document this there ...
### What changes were proposed in this pull request? add document to address #26612 (comment) ### Why are the changes needed? help people understand how to use --CONFIG_DIM ### Does this PR introduce any user-facing change? no ### How was this patch tested? N/A Closes #26661 from cloud-fan/test. Authored-by: Wenchen Fan <wenchen@databricks.com> Signed-off-by: Takeshi Yamamuro <yamamuro@apache.org>
What changes were proposed in this pull request?
allow the sql test files to specify different dimensions of config sets during testing. For example,
This example defines 2 config dimensions, and each dimension defines 2 config sets. We will run the queries 4 times:
Why are the changes needed?
Currently
SQLQueryTestSuitetakes a long time. This is because we run each test at least 3 times, to check with different codegen modes. This is not necessary for most of the tests, e.g. DESC TABLE. We should only check these codegen modes for certain tests.With the --CONFIG_DIM directive, we can do things like: test different join operator(broadcast or shuffle join) X different codegen modes.
After reducing testing time, we should be able to run thrifter server SQL tests with config settings.
Does this PR introduce any user-facing change?
no
How was this patch tested?
test only