
Conversation

Member

@srowen srowen commented Sep 21, 2017

What changes were proposed in this pull request?

Enable Scala 2.12 REPL. Fix most remaining issues with 2.12 compilation and warnings, including:

  • Selecting Kafka 0.10.1+ for Scala 2.12 and patching over a minor API difference
  • Fixing lots of "eta expansion of zero arg method deprecated" warnings
  • Resolving the SparkContext.sequenceFile implicits compile problem
  • Fixing an odd but valid jetty-server missing dependency in hive-thriftserver

How was this patch tested?

Existing tests

@@ -2826,33 +2826,33 @@ object WritableConverter {
// them automatically. However, we still keep the old functions in SparkContext for backward
// compatibility and forward to the following functions directly.

implicit def intWritableConverter(): WritableConverter[Int] =
simpleWritableConverter[Int, IntWritable](_.get)
implicit val intWritableConverter: () => WritableConverter[Int] =
Member Author

These changes were necessary to make the implicits work in 2.12, now that eta-expansion of zero-arg methods is deprecated and apparently doesn't work for implicit resolution. It passes MiMa, but we could be conservative: retain the existing methods and add function vals alongside them instead.
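
To illustrate the mechanism, here is a minimal, hypothetical sketch — `Converter` and `needsConverter` are stand-ins, not Spark code:

```scala
object EtaExpansionSketch {
  class Converter[T](val convert: String => T)

  // Mirrors the sequenceFile situation: the implicit parameter has
  // function type () => Converter[T].
  def needsConverter[T](s: String)(implicit cf: () => Converter[T]): T =
    cf().convert(s)

  // 2.11 satisfied `cf` by eta-expanding a zero-arg implicit def like:
  //   implicit def intConverter(): Converter[Int] = new Converter(_.toInt)
  // 2.12 deprecates that expansion and no longer applies it during
  // implicit search, so supply a value of the function type directly:
  implicit val intConverter: () => Converter[Int] =
    () => new Converter(_.toInt)

  def demo: Int = needsConverter[Int]("42") // resolves intConverter
}
```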


implicit def writableWritableConverter[T <: Writable](): WritableConverter[T] =
new WritableConverter[T](_.runtimeClass.asInstanceOf[Class[T]], _.asInstanceOf[T])
implicit def writableWritableConverter[T <: Writable : ClassTag]: () => WritableConverter[T] =
Member Author

ClassTag was required here, for reasons I don't fully understand.
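
For readers unfamiliar with the syntax, `[T <: Writable : ClassTag]` is a context bound that adds an implicit ClassTag parameter. A general-purpose sketch of the desugaring (hypothetical names, not this diff's code):

```scala
import scala.reflect.ClassTag

object ContextBoundSketch {
  // The context bound [T : ClassTag]...
  def runtimeClassOf[T: ClassTag]: Class[_] =
    implicitly[ClassTag[T]].runtimeClass

  // ...desugars to an extra implicit parameter list:
  def runtimeClassOfExplicit[T](implicit ct: ClassTag[T]): Class[_] =
    ct.runtimeClass
}
```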

@@ -218,11 +218,13 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock)
if (!conf.contains("spark.testing")) {
// A task that periodically checks for event log updates on disk.
logDebug(s"Scheduling update thread every $UPDATE_INTERVAL_S seconds")
pool.scheduleWithFixedDelay(getRunner(checkForLogs), 0, UPDATE_INTERVAL_S, TimeUnit.SECONDS)
pool.scheduleWithFixedDelay(
getRunner(() => checkForLogs()), 0, UPDATE_INTERVAL_S, TimeUnit.SECONDS)
Member Author

These changes avoid warnings about eta-expansion of zero-arg methods. The explicit form works fine in 2.11 as well; it just doesn't rely on the syntactic sugar.
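
The shape of the change, as a self-contained sketch — the pool and interval here are hypothetical scaffolding around a `getRunner` like FsHistoryProvider's:

```scala
import java.util.concurrent.{Executors, TimeUnit}

object SchedulerSketch {
  private val pool = Executors.newSingleThreadScheduledExecutor()

  def checkForLogs(): Unit = println("checking for event log updates")

  def getRunner(operateFun: () => Unit): Runnable =
    new Runnable { override def run(): Unit = operateFun() }

  // getRunner(checkForLogs) would eta-expand a zero-arg method and
  // warn on 2.12; the explicit lambda is clean on 2.11 and 2.12.
  pool.scheduleWithFixedDelay(
    getRunner(() => checkForLogs()), 0, 10, TimeUnit.SECONDS)
}
```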

case e: Throwable =>
logError("App dir cleanup failed: " + e.getMessage, e)
}(cleanupThreadExecutor)
cleanupFuture.failed.foreach(e =>
Member Author

onFailure and onSuccess are deprecated in 2.12. This should be equivalent in both 2.11 and 2.12.
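
A sketch of the substitution, with a hypothetical future and logging, assuming only the standard scala.concurrent API:

```scala
import scala.concurrent.{ExecutionContext, Future}

object CleanupSketch {
  implicit val ec: ExecutionContext = ExecutionContext.global

  val cleanupFuture: Future[Unit] = Future { /* clean up app dirs */ }

  // Deprecated in 2.12:
  //   cleanupFuture.onFailure { case e => ... }
  // Equivalent on 2.11 and 2.12: `failed` is a Future[Throwable] that
  // succeeds exactly when the original future fails.
  cleanupFuture.failed.foreach(e =>
    println("App dir cleanup failed: " + e.getMessage))
}
```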

@@ -396,12 +396,12 @@ class DAGScheduler(

/** Find ancestor shuffle dependencies that are not registered in shuffleToMapStage yet */
private def getMissingAncestorShuffleDependencies(
rdd: RDD[_]): Stack[ShuffleDependency[_, _, _]] = {
val ancestors = new Stack[ShuffleDependency[_, _, _]]
rdd: RDD[_]): ArrayStack[ShuffleDependency[_, _, _]] = {
Member Author

Stack is deprecated in 2.12 because of poor performance; ArrayStack should behave the same and be faster, in 2.11 too.
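
For reference, the replacement in isolation (trivial sketch; the push/pop semantics match mutable.Stack for this usage):

```scala
import scala.collection.mutable.ArrayStack

object StackSketch {
  // Array-backed LIFO stack: same push/pop/nonEmpty usage as the
  // deprecated list-backed mutable.Stack, with better constants.
  val ancestors = new ArrayStack[Int]
  ancestors.push(1)
  ancestors.push(2)
  while (ancestors.nonEmpty) {
    println(ancestors.pop()) // 2, then 1
  }
}
```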

<profile>
<id>scala-2.12</id>
<properties>
<kafka.version>0.10.1.1</kafka.version>
Member Author

Only Kafka 0.10.1+ supports Scala 2.12. By the time a 2.12 build is actually supported, we may have moved on to 0.10.2; not sure. This at least makes it work.

* sees any files, so that the Spark context is visible in those files. This is a bit of a
* hack, but there isn't another hook available to us at this point.
*/
override def createInterpreter(): Unit = {
Member Author

This is the only meaningful difference from the 2.11 REPL, as it has to hook in at a different point. None of the other REPL-related code is specific to 2.11/2.12, so it was moved out into the common src directory in the repl module.
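
A sketch of the hook's shape against the 2.12 ILoop API — illustrative, not the PR's exact code; it assumes `ILoop.createInterpreter` and `IMain.beQuietDuring`, which exist in 2.12:

```scala
import scala.tools.nsc.interpreter.ILoop

class SparkILoopSketch extends ILoop {
  def initializeSpark(): Unit = intp.beQuietDuring {
    // Make the session visible before any user files are interpreted.
    command("@transient val spark = org.apache.spark.repl.Main.createSparkSession()")
  }

  // In 2.12, when createInterpreter() returns, the interpreter exists
  // but no user files have been processed yet -- so hook in here.
  override def createInterpreter(): Unit = {
    super.createInterpreter()
    initializeSpark()
  }
}
```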

@@ -230,19 +230,16 @@ class DataFrameStatSuite extends QueryTest with SharedSQLContext {

val resNaN1 = dfNaN.stat.approxQuantile("input1", Array(q1, q2), epsilon)
assert(resNaN1.count(_.isNaN) === 0)
assert(resNaN1.count(_ == null) === 0)
Member Author

resNaN1 is an Array[Double], so it can never contain null. The assertion was always true and generated a warning.
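
A tiny sketch of why the assertion was vacuous (hypothetical values):

```scala
object NullCheckSketch {
  val resNaN1: Array[Double] = Array(1.0, Double.NaN)

  // Double is an unboxed primitive element type, so `_ == null` can
  // never be true; 2.12 warns the comparison always yields false.
  // resNaN1.count(_ == null)  // always 0, with a warning

  assert(resNaN1.count(_.isNaN) == 1)
}
```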

@@ -20,6 +20,7 @@ package org.apache.spark.sql.streaming
import java.util.UUID

import org.apache.spark.rdd.RDD
import org.apache.spark.sql.DataFrame
Member Author

The fix for this whole file is already going into master separately as a hotfix.

@@ -63,6 +63,16 @@
<groupId>${hive.group}</groupId>
<artifactId>hive-beeline</artifactId>
</dependency>
<dependency>
Member Author

For some reason, hive-thriftserver wouldn't compile, failing to find jetty.server classes. The compiler has a point: these dependencies should be explicit, even if the build somehow worked before.

@SparkQA

SparkQA commented Sep 21, 2017

Test build #82033 has finished for PR 19307 at commit 99f082a.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member

retest this please

@SparkQA

SparkQA commented Sep 21, 2017

Test build #82036 has finished for PR 19307 at commit 99f082a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member Author

srowen commented Sep 21, 2017

Good to see this passes, as I expected, because it means we can also commit this chunk of progress towards 2.12 without any downside for 2.11.

Right now this means that the 2.12 build does compile, but I know it doesn't quite pass all tests yet (a weird paranamer problem, to start with).

The only part I was slightly uncertain about was the change to sequenceFile implicits, but MiMa hasn't complained.

Worth a look at the changes, but I think this is worth committing before too long.

@SparkQA

SparkQA commented Sep 22, 2017

Test build #82076 has finished for PR 19307 at commit 05e4891.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

@HyukjinKwon HyukjinKwon left a comment

To the best of my knowledge, it looks good to me in general. What I did was double-check the comments left here and other instances of the same patterns.

@srowen
Member Author

srowen commented Sep 24, 2017

Merged to master
