-
Notifications
You must be signed in to change notification settings - Fork 304
Feat: Support Spark 4.0.0 part1 #1830
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
ac42642
f2b76f4
ada6a24
9db8fda
e94f7b9
6011512
d7eff03
4365733
ba43e24
600d415
ef058e4
695b193
a2c1f3a
6fa18cd
ea77900
9ac9f1b
1285408
2c5c755
b537642
0d73622
ee9fe2c
a97680d
d3cf777
b1bbbc7
b03b1d9
072e439
5860482
d431fe6
ffa68a9
e8a2f40
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -57,9 +57,11 @@ jobs: | |
| java_version: "17" | ||
| maven_opts: "-Pspark-3.5 -Pscala-2.13" | ||
|
|
||
| - name: "Spark 4.0, JDK 17, Scala 2.13" | ||
| java_version: "17" | ||
| maven_opts: "-Pspark-4.0 -Pscala-2.13" | ||
| # TODO fails with OOM | ||
| # https://github.com/apache/datafusion-comet/issues/1949 | ||
| # - name: "Spark 4.0, JDK 17, Scala 2.13" | ||
| # java_version: "17" | ||
| # maven_opts: "-Pspark-4.0 -Pscala-2.13" | ||
|
Comment on lines
+60
to
+64
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The tests fail on macOS specifically, but still run on Linux |
||
|
|
||
| suite: | ||
| - name: "fuzz" | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -74,6 +74,8 @@ | |
| import static org.junit.Assert.*; | ||
| import static org.junit.Assert.assertEquals; | ||
|
|
||
| import static org.apache.comet.parquet.TypeUtil.isSpark40Plus; | ||
|
|
||
| @SuppressWarnings("deprecation") | ||
| public class TestFileReader { | ||
| private static final MessageType SCHEMA = | ||
|
|
@@ -609,7 +611,9 @@ public void testColumnIndexReadWrite() throws Exception { | |
| assertEquals(1, offsetIndex.getFirstRowIndex(1)); | ||
| assertEquals(3, offsetIndex.getFirstRowIndex(2)); | ||
|
|
||
| assertNull(indexReader.readColumnIndex(footer.getBlocks().get(2).getColumns().get(0))); | ||
| if (!isSpark40Plus()) { // TODO: https://github.com/apache/datafusion-comet/issues/1948 | ||
| assertNull(indexReader.readColumnIndex(footer.getBlocks().get(2).getColumns().get(0))); | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is this because of ANSI mode?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In L539, it has This line is trying to read the column index metadata for the first column of the third row group, and verify it's null. I am not sure why this failed for 4.0. My guess is that in the new parquet version, the column index implementation gets changed, but I didn't find the corresponding change for this.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do you mind tracking this in a ticket please?
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Did we log an issue specifically for this? The ColumnIndex implementation is part of Comet code so if a test is failing we need to fix it in Comet.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The code here has a TODO linking to #1948. I added a comment in this issue referring to this file. |
||
| } | ||
| } | ||
| } | ||
|
|
||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This memory change did not help, but also did no harm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't help, do you mean the test fails on OOM?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Comet test suites fail with OOM when running on macOS. I tried this change to specify more memory, but it did not make any difference. The macOS workflow is commented out in this PR, and I filed a follow up issue #1949