SPARK-8064, build against Hive 1.2.1 #7191
Test build #36407 has finished for PR 7191 at commit
Jenkins, retest this please.
Test build #36412 has finished for PR 7191 at commit
jenkins test this please
Test build #37253 has finished for PR 7191 at commit
9c0336c to b87eebf
Test build #37360 has finished for PR 7191 at commit
Test build #37386 has finished for PR 7191 at commit
8667845 to 4f1e210
Test build #37538 has finished for PR 7191 at commit
4f1e210 to 778e97e
Test build #37649 timed out for PR 7191 at commit
778e97e to 7d0fcf1
Test build #38107 has finished for PR 7191 at commit
Test build #38105 timed out for PR 7191 at commit
Test build #38117 timed out for PR 7191 at commit
ced5c09 to c7d8d82
Hi @steveloughran, I actually spent some time recently updating the Hive code to work against 1.1 (not 1.2, but a lot of it should apply). I can't post it as a PR because I did not touch the thrift server code, but since you're working on that part as well, I thought you'd benefit from the code I worked on. Looking at your patch, I see that a lot of the things I had to change are missing. I've posted my patch here: Feel free to use it any way you see fit. All the
Test build #38253 timed out for PR 7191 at commit
vanzin: thanks for that work. I'd held off going near sql/hive as you'd said you were working on it, but I was just starting to stare at the tokenization code and wondering what I was going to do there... I'm going to push up my next iteration, which adds more resilience to the tests (i.e. cleanup) but doesn't address any of the root causes. Then I'll see how to merge in your code.
I was working on the metastore support for newer Hive versions, which went in some time ago. Sorry if that wasn't clear and you've been holding back on your work for that reason - the patch I posted above is unrelated to the metastore work.
Test build #38254 timed out for PR 7191 at commit
Test build #38301 has finished for PR 7191 at commit
Test build #38396 timed out for PR 7191 at commit
Test build 38301 timed out, presumably because the thrift server tests took 6+ minutes to fail when the server didn't come up. Two changes should address this: pom changes to get jersey server back on the classpath on hadoop < 2.6 (i.e. reinstating hive-exec's inclusion of the yarn-rm-server module), and improved fail-fast test setup. As a result of the timeout, failures in the
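To illustrate the fail-fast idea: a minimal Python sketch, not the code in this PR (the actual suites are Scala). It scans a spawned server's log lines and stops as soon as either a startup marker or a known failure pattern appears, rather than waiting out the whole timeout. The names `wait_for_startup` and `StartupFailed` and the failure patterns are assumptions made up for illustration; the startup marker and the 60s figure are borrowed from the commit log of this PR.

```python
import re
import time
from typing import Callable, Iterable, List, Sequence


class StartupFailed(Exception):
    """Raised when the log shows a failure, or ends, before startup is reported."""


def wait_for_startup(
    log_lines: Iterable[str],
    success_marker: str,
    failure_patterns: Sequence[str],
    timeout_s: float = 60.0,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Return the log line containing success_marker, failing fast otherwise.

    Note: the deadline is only checked when a new line arrives, so an
    iterator that blocks forever is not interrupted by this sketch.
    """
    compiled = [re.compile(p) for p in failure_patterns]
    deadline = clock() + timeout_s
    seen: List[str] = []
    for line in log_lines:
        seen.append(line)
        if success_marker in line:
            return line
        for pattern in compiled:
            if pattern.search(line):
                # Include everything logged so far, so the report is diagnosable.
                so_far = "\n".join(seen)
                raise StartupFailed(
                    f"matched {pattern.pattern!r}; log so far:\n{so_far}"
                )
        if clock() > deadline:
            so_far = "\n".join(seen)
            raise TimeoutError(f"no startup within {timeout_s}s; log so far:\n{so_far}")
    so_far = "\n".join(seen)
    raise StartupFailed(f"log ended before startup was reported:\n{so_far}")


if __name__ == "__main__":
    demo_log = ["booting", "Starting ThriftBinaryCLIService on port 10000"]
    print(wait_for_startup(demo_log, "Starting ThriftBinaryCLIService on port",
                           [r"NoClassDefFoundError", r"Exception in thread"]))
```

The point of carrying the accumulated log text in the exception is the same as in the CLI suite change: whatever the process printed ends up in the test report instead of being lost behind a bare timeout.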
Test build #38508 has finished for PR 7191 at commit
181e494 to d7553b0
Test build #38621 has finished for PR 7191 at commit
Test build #38647 has finished for PR 7191 at commit
Hey @steveloughran and @vanzin. If I understand correctly, I think we need to modify the dependency on hive-exec to depend on the http://search.maven.org/#artifactdetails%7Corg.apache.hive%7Chive-exec%7C1.2.1%7Cjar
Test build #39368 has finished for PR 7191 at commit
I've cleared the maven/ivy cache on all the jenkins machines, so I'm going to restart the build again. Jenkins, test this please.
Test build #39371 has finished for PR 7191 at commit
Test build #39373 has finished for PR 7191 at commit
I'm looking into the failures caused by the Parquet dependency. Hive shades Parquet classes into its own private namespace. I think we are missing a shading somewhere in the updated POM.
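For readers unfamiliar with shading: it rewrites class names from their original package prefix to a private one, and a "missed" shading shows up as a reference that still uses the original prefix. The Python sketch below only models that renaming on plain strings; it is not how Maven does it. The prefix `org.example.shaded.parquet.` and both function names are hypothetical, and the real relocation rules used by Hive are not stated in this thread.

```python
from typing import Dict, Iterable, List


def relocate(class_name: str, rules: Dict[str, str]) -> str:
    """Rewrite class_name using the longest matching prefix rule, if any."""
    for prefix in sorted(rules, key=len, reverse=True):
        if class_name.startswith(prefix):
            return rules[prefix] + class_name[len(prefix):]
    return class_name


def unshaded_references(referenced: Iterable[str], rules: Dict[str, str]) -> List[str]:
    """Return references that still use an original prefix a rule would relocate."""
    return [name for name in referenced if relocate(name, rules) != name]


if __name__ == "__main__":
    # Hypothetical rule: everything under "parquet." moves to a private namespace.
    rules = {"parquet.": "org.example.shaded.parquet."}
    refs = [
        "org.example.shaded.parquet.hadoop.ParquetInputFormat",  # already relocated
        "parquet.hadoop.ParquetOutputFormat",                    # missed by shading
        "org.apache.spark.sql.Row",                              # unrelated
    ]
    print(unshaded_references(refs, rules))
```

A check of this shape, run over the class references in a built jar, is one way to find which artifact still points at the unshaded package.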
…eature/SPARK-8064-hive-1.2-002
Latest patch includes the commits from liancheng for HiveSubmit and the missing spark-hive parquet dependencies on the SBT test runs.
Test build #39435 has finished for PR 7191 at commit
Test build #39562 has finished for PR 7191 at commit
@steveloughran any issues preventing us from merging?
None that I know of: I've tested the thriftserver code on
Awesome, I'm going to merge to master and 1.5 then.
Cherry picked the parts of the initial SPARK-8064 WiP branch needed to get sql/hive to compile against hive 1.2.1. That's the ASF release packaged under org.apache.hive, not any fork.

Tests not run yet: that's what the machines are for

Author: Steve Loughran <stevel@hortonworks.com>
Author: Cheng Lian <lian@databricks.com>
Author: Michael Armbrust <michael@databricks.com>
Author: Patrick Wendell <patrick@databricks.com>

Closes #7191 from steveloughran/stevel/feature/SPARK-8064-hive-1.2-002 and squashes the following commits:

7556d85 [Cheng Lian] Updates .q files and corresponding golden files
ef4af62 [Steve Loughran] Merge commit '6a92bb09f46a04d6cd8c41bdba3ecb727ebb9030' into stevel/feature/SPARK-8064-hive-1.2-002
6a92bb0 [Cheng Lian] Overrides HiveConf time vars
dcbb391 [Cheng Lian] Adds com.twitter:parquet-hadoop-bundle:1.6.0 for Hive Parquet SerDe
0bbe475 [Steve Loughran] SPARK-8064 scalastyle rejects the standard Hadoop ASF license header...
fdf759b [Steve Loughran] SPARK-8064 classpath dependency suite to be in sync with shading in final (?) hive-exec spark
7a6c727 [Steve Loughran] SPARK-8064 switch to second staging repo of the spark-hive artifacts. This one has the protobuf-shaded hive-exec jar
376c003 [Steve Loughran] SPARK-8064 purge duplicate protobuf declaration
2c74697 [Steve Loughran] SPARK-8064 switch to the protobuf shaded hive-exec jar with tests to chase it down
cc44020 [Steve Loughran] SPARK-8064 remove hadoop.version from runtest.py, as profile will fix that automatically.
6901fa9 [Steve Loughran] SPARK-8064 explicit protobuf import
da310dc [Michael Armbrust] Fixes for Hive tests.
a775a75 [Steve Loughran] SPARK-8064 cherry-pick-incomplete
7404f34 [Patrick Wendell] Add spark-hive staging repo
832c164 [Steve Loughran] SPARK-8064 try to supress compiler warnings on Complex.java pasted-thrift-code
312c0d4 [Steve Loughran] SPARK-8064 maven/ivy dependency purge; calcite declaration needed
fa5ae7b [Steve Loughran] HIVE-8064 fix up hive-thriftserver dependencies and cut back on evicted references in the hive- packages; this keeps mvn and ivy resolution compatible, as the reconciliation policy is "by hand"
c188048 [Steve Loughran] SPARK-8064 manage the Hive depencencies to that -things that aren't needed are excluded -sql/hive built with ivy is in sync with the maven reconciliation policy, rather than latest-first
4c8be8d [Cheng Lian] WIP: Partial fix for Thrift server and CLI tests
314eb3c [Steve Loughran] SPARK-8064 deprecation warning noise in one of the tests
17b0341 [Steve Loughran] SPARK-8064 IDE-hinted cleanups of Complex.java to reduce compiler warnings. It's all autogenerated code, so still ugly.
d029b92 [Steve Loughran] SPARK-8064 rely on unescaping to have already taken place, so go straight to map of serde options
23eca7e [Steve Loughran] HIVE-8064 handle raw and escaped property tokens
54d9b06 [Steve Loughran] SPARK-8064 fix compilation regression surfacing from rebase
0b12d5f [Steve Loughran] HIVE-8064 use subset of hive complex type whose types deserialize
fce73b6 [Steve Loughran] SPARK-8064 poms rely implicitly on the version of kryo chill provides
fd3aa5d [Steve Loughran] SPARK-8064 version of hive to d/l from ivy is 1.2.1
dc73ece [Steve Loughran] SPARK-8064 revert to master's determinstic pushdown strategy
d3c1e4a [Steve Loughran] SPARK-8064 purge UnionType
051cc21 [Steve Loughran] SPARK-8064 switch to an unshaded version of hive-exec-core, which must have been built with Kryo 2.21. This currently looks for a (locally built) version 1.2.1.spark
6684c60 [Steve Loughran] SPARK-8064 ignore RTE raised in blocking process.exitValue() call
e6121e5 [Steve Loughran] SPARK-8064 address review comments
aa43dc6 [Steve Loughran] SPARK-8064 more robust teardown on JavaMetastoreDatasourcesSuite
f2bff01 [Steve Loughran] SPARK-8064 better takeup of asynchronously caught error text
8b1ef38 [Steve Loughran] SPARK-8064: on failures executing spark-submit in HiveSparkSubmitSuite, print command line and all logged output.
5a9ce6b [Steve Loughran] SPARK-8064 add explicit reason for kv split failure, rather than array OOB. *does not address the issue*
642b63a [Steve Loughran] SPARK-8064 reinstate something cut briefly during rebasing
97194dc [Steve Loughran] SPARK-8064 add extra logging to the YarnClusterSuite classpath test. There should be no reason why this is failing on jenkins, but as it is (and presumably its CP-related), improve the logging including any exception raised.
335357f [Steve Loughran] SPARK-8064 fail fast on thrive process spawning tests on exit codes and/or error string patterns seen in log.
3ed872f [Steve Loughran] SPARK-8064 rename field double to dbl
bca55e5 [Steve Loughran] SPARK-8064 missed one of the `date` escapes
41d6479 [Steve Loughran] SPARK-8064 wrap tests with withTable() calls to avoid table-exists exceptions
2bc29a4 [Steve Loughran] SPARK-8064 ParquetSuites to escape `date` field name
1ab9bc4 [Steve Loughran] SPARK-8064 TestHive to use sered2.thrift.test.Complex
bf3a249 [Steve Loughran] SPARK-8064: more resubmit than fix; tighten startup timeout to 60s. Still no obvious reason why jersey server code in spark-assembly isn't being picked up -it hasn't been shaded
c829b8f [Steve Loughran] SPARK-8064: reinstate yarn-rm-server dependencies to hive-exec to ensure that jersey server is on classpath on hadoop versions < 2.6
0b0f738 [Steve Loughran] SPARK-8064: thrift server startup to fail fast on any exception in the main thread
13abaf1 [Steve Loughran] SPARK-8064 Hive compatibilty tests sin sync with explain/show output from Hive 1.2.1
d14d5ea [Steve Loughran] SPARK-8064: DATE is now a predicate; you can't use it as a field in select ops
26eef1c [Steve Loughran] SPARK-8064: HIVE-9039 renamed TOK_UNION => TOK_UNIONALL while adding TOK_UNIONDISTINCT
3d64523 [Steve Loughran] SPARK-8064 improve diagns on uknown token; fix scalastyle failure
d0360f6 [Steve Loughran] SPARK-8064: delicate merge in of the branch vanzin/hive-1.1
1126e5a [Steve Loughran] SPARK-8064: name of unrecognized file format wasn't appearing in error text
8cb09c4 [Steve Loughran] SPARK-8064: test resilience/assertion improvements. Independent of the rest of the work; can be backported to earlier versions
dec12cb [Steve Loughran] SPARK-8064: when a CLI suite test fails include the full output text in the raised exception; this ensures that the stdout/stderr is included in jenkins reports, so it becomes possible to diagnose the cause.
463a670 [Steve Loughran] SPARK-8064 run-tests.py adds a hadoop-2.6 profile, and changes info messages to say "w/Hive 1.2.1" in console output
2531099 [Steve Loughran] SPARK-8064 successful attempt to get rid of pentaho as a transitive dependency of hive-exec
1d59100 [Steve Loughran] SPARK-8064 (unsuccessful) attempt to get rid of pentaho as a transitive dependency of hive-exec
75733fc [Steve Loughran] SPARK-8064 change thrift binary startup message to "Starting ThriftBinaryCLIService on port"
3ebc279 [Steve Loughran] SPARK-8064 move strings used to check for http/bin thrift services up into constants
c80979d [Steve Loughran] SPARK-8064: SparkSQLCLIDriver drops remote mode support. CLISuite Tests pass instead of timing out: undetected regression?
27e8370 [Steve Loughran] SPARK-8064 fix some style & IDE warnings
00e50d6 [Steve Loughran] SPARK-8064 stop excluding hive shims from dependency (commented out , for now)
cb4f142 [Steve Loughran] SPARK-8054 cut pentaho dependency from calcite
f7aa9cb [Steve Loughran] SPARK-8064 everything compiles with some commenting and moving of classes into a hive package
6c310b4 [Steve Loughran] SPARK-8064 subclass Hive ServerOptionsProcessor to make it public again
f61a675 [Steve Loughran] SPARK-8064 thrift server switched to Hive 1.2.1, though it doesn't compile everywhere
4890b9d [Steve Loughran] SPARK-8064, build against Hive 1.2.1

(cherry picked from commit a2409d1)
Signed-off-by: Michael Armbrust <michael@databricks.com>
…accidentally reverted

This PR removes the dependency reduced POM hack brought back by #7191

Author: tedyu <yuzhihong@gmail.com>

Closes #7919 from tedyu/master and squashes the following commits:

1bfbd7b [tedyu] [BUILD] Remove dependency reduced POM hack

(cherry picked from commit b211cbc)
Signed-off-by: Sean Owen <sowen@cloudera.com>
Cherry picked the parts of the initial SPARK-8064 WiP branch needed to get sql/hive to compile against hive 1.2.1. That's the ASF release packaged under org.apache.hive, not any fork.
Tests not run yet: that's what the machines are for