Conversation
QA tests have started for PR 1063. This patch merges cleanly.

QA results for PR 1063:
Conflicts:
- sql/core/src/main/scala/org/apache/spark/sql/api/java/JavaSQLContext.scala
- sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala
- sql/core/src/test/scala/org/apache/spark/sql/QueryTest.scala
- sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala
Add a comment to say that a Hive UDF can be overridden?
Conflicts:
- sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala
- sql/hive/src/main/scala/org/apache/spark/sql/hive/TestHive.scala
Thanks for looking this over! I've merged to master and 1.1.
This patch adds the ability to register lambda functions written in Python, Java or Scala as UDFs for use in SQL or HiveQL.
Scala:
```scala
registerFunction("strLenScala", (_: String).length)
sql("SELECT strLenScala('test')")
```
Python:
```python
sqlCtx.registerFunction("strLenPython", lambda x: len(x), IntegerType())
sqlCtx.sql("SELECT strLenPython('test')")
```
Java:
```java
sqlContext.registerFunction("stringLengthJava", new UDF1<String, Integer>() {
  @Override
  public Integer call(String str) throws Exception {
    return str.length();
  }
}, DataType.IntegerType);
sqlContext.sql("SELECT stringLengthJava('test')");
```
Author: Michael Armbrust <michael@databricks.com>
Closes #1063 from marmbrus/udfs and squashes the following commits:
9eda0fe [Michael Armbrust] newline
747c05e [Michael Armbrust] Add some scala UDF tests.
d92727d [Michael Armbrust] Merge remote-tracking branch 'apache/master' into udfs
005d684 [Michael Armbrust] Fix naming and formatting.
d14dac8 [Michael Armbrust] Fix last line of autogened java files.
8135c48 [Michael Armbrust] Move UDF unit tests to pyspark.
40b0ffd [Michael Armbrust] Merge remote-tracking branch 'apache/master' into udfs
6a36890 [Michael Armbrust] Switch logging so that SQLContext can be serializable.
7a83101 [Michael Armbrust] Drop toString
795fd15 [Michael Armbrust] Try to avoid capturing SQLContext.
e54fb45 [Michael Armbrust] Docs and tests.
437cbe3 [Michael Armbrust] Update use of dataTypes, fix some python tests, address review comments.
01517d6 [Michael Armbrust] Merge remote-tracking branch 'origin/master' into udfs
8e6c932 [Michael Armbrust] WIP
3f96a52 [Michael Armbrust] Merge remote-tracking branch 'origin/master' into udfs
6237c8d [Michael Armbrust] WIP
2766f0b [Michael Armbrust] Move udfs support to SQL from hive. Add support for Java UDFs.
0f7d50c [Michael Armbrust] Draft of native Spark SQL UDFs for Scala and Python.
(cherry picked from commit 158ad0b)
Signed-off-by: Michael Armbrust <michael@databricks.com>
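Conceptually, `registerFunction` maintains a name-to-function mapping that the query analyzer consults when it resolves a function call. A minimal sketch of that registry pattern in Python (toy code with hypothetical names, not Spark's actual implementation):

```python
# Toy sketch of a name -> function registry, illustrating the pattern behind
# registerFunction. Class and method names here are hypothetical.

class FunctionRegistry:
    def __init__(self):
        self._funcs = {}

    def register(self, name, func):
        # Later registrations replace earlier ones under the same name, which
        # is how a built-in function could be shadowed by a user-defined one.
        self._funcs[name.lower()] = func

    def lookup(self, name):
        try:
            return self._funcs[name.lower()]
        except KeyError:
            raise ValueError(f"undefined function: {name}")

registry = FunctionRegistry()
registry.register("strLenPython", lambda x: len(x))

# Lookup is case-insensitive in this sketch, mirroring SQL identifiers.
result = registry.lookup("strlenpython")("test")
```

The real analyzer resolves the function name during query planning and wires the callable into the physical plan; the sketch only shows the registration and lookup step.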
Excuse my naive question, but it seems that this does not use the regular Hive UDF API, right? Also, the Hive API allows adding description strings to a UDF (which obviously only makes sense if you can list them).
|
The biggest reason for the divergence is that this API is much lighter weight: you can define functions in a single line, inline with the rest of your program. We can certainly consider adding more support for function-listing metadata in the future, but you are the first to ask for this.
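The trade-off being discussed can be illustrated with a toy registry that optionally keeps a description string per function (hypothetical code, not Spark's or Hive's API): the lightweight style registers an inline lambda in one line, while Hive-style metadata adds a description that a listing command could surface.

```python
# Toy illustration of lightweight vs metadata-carrying UDF registration.
# All names here are hypothetical.

registry = {}

def register(name, func, description=None):
    # Store the callable alongside optional human-readable metadata.
    registry[name] = {"func": func, "doc": description}

def describe(name):
    # What a DESCRIBE-style listing could return for a registered function.
    entry = registry[name]
    return entry["doc"] or f"{name}: no description"

# One-line, inline registration -- the lightweight style this PR favors:
register("strLen", lambda s: len(s))

# The same function registered with Hive-style descriptive metadata:
register("strLenDoc", lambda s: len(s),
         description="strLenDoc(str) - returns the length of str")
```

Both entries are callable the same way; only the second carries metadata for a hypothetical listing feature.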
|
Okay, thanks for the clarification. Initially, I had naively assumed that the functionality you added was just a layer above the Hive API, hence it was a bit confusing.
* [SPARK-31168][BUILD] Upgrade Scala to 2.12.14

  **What changes were proposed in this pull request?** This PR is the 4th try to upgrade Scala 2.12.x in order to see the feasibility.
  - #27929 (Upgrade Scala to 2.12.11, wangyum)
  - #30940 (Upgrade Scala to 2.12.12, viirya)
  - #31223 (Upgrade Scala to 2.12.13, dongjoon-hyun)

  Note that Scala 2.12.14 has the following fix for the Apache Spark community:
  - Fix cyclic error in runtime reflection (protobuf), a regression that prevented Spark from upgrading to 2.12.13

  REQUIREMENTS:
  - [x] `silencer` library is released via ghik/silencer#66
  - [x] `genjavadoc` library is released via lightbend/genjavadoc#282

  **Why are the changes needed?** Apache Spark was stuck on 2.12.10 due to the regressions in Scala 2.12.11/2.12.12/2.12.13. This upgrade brings all the bug fixes:
  - https://github.com/scala/scala/releases/tag/v2.12.14
  - https://github.com/scala/scala/releases/tag/v2.12.13
  - https://github.com/scala/scala/releases/tag/v2.12.12
  - https://github.com/scala/scala/releases/tag/v2.12.11

  **Does this PR introduce any user-facing change?** Yes, but this is a bug-fixed version.

  **How was this patch tested?** Pass the CIs.

  Closes #32697 from dongjoon-hyun/SPARK-31168. Authored-by: Dongjoon Hyun <dhyun@apple.com>. Signed-off-by: Dongjoon Hyun <dhyun@apple.com>. (cherry picked from commit 6c4b60f)

* [SPARK-36759][BUILD] Upgrade Scala to 2.12.15

  This PR aims to upgrade Scala to 2.12.15 to support Java 17/18 better. Scala 2.12.15 improves compatibility with JDK 17 and 18 (https://github.com/scala/scala/releases/tag/v2.12.15):
  - Avoids IllegalArgumentException in JDK 17+ for lambda deserialization
  - Upgrades to ASM 9.2, for JDK 18 support in the optimizer

  This is a user-facing Scala version change. Tested by passing the CIs.

  Closes #33999 from dongjoon-hyun/SPARK-36759. Authored-by: Dongjoon Hyun <dongjoon@apache.org>. Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>. (cherry picked from commit 16f1f71)

* [SPARK-36759][BUILD][FOLLOWUP] Update version in scala-2.12 profile and doc

  A follow-up to fix the leftovers from switching the Scala version; the profile and the doc should be consistent. No user-facing change. Not covered by unit tests; verified manually that no `2.12.14` version reference remains (the remaining `git grep` matches are unrelated data):

  ```
  $ git grep 2.12.14
  R/pkg/tests/fulltests/test_sparkSQL.R: c(as.Date("2012-12-14"), as.Date("2013-12-15"), as.Date("2014-12-16")))
  data/mllib/ridge-data/lpsa.data:3.5307626,0.987291634724086 -0.36279314978779 -0.922212414640967 0.232904453212813 -0.522940888712441 1.79270085261407 0.342627053981254 1.26288870310799
  sql/hive/src/test/resources/data/files/over10k:-3|454|65705|4294967468|62.12|14.32|true|mike white|2013-03-01 09:11:58.703087|40.18|joggying
  ```

  Closes #34020 from dongjoon-hyun/SPARK-36759-2. Authored-by: Dongjoon Hyun <dongjoon@apache.org>. Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>. (cherry picked from commit adbea25)

* [SPARK-39414][BUILD] Upgrade Scala to 2.12.16

  This version brings some bug fixes and starts to support Java 19 (scala/scala@v2.12.15...v2.12.16):
  - Upgrade to asm 9.3, for JDK19 support (scala/scala#10000)
  - Fix codegen for MH.invoke etc under JDK 17 -release (scala/scala#9930)
  - Deprecation related SecurityManager on JDK 17 (scala/scala#9775)

  This is a user-facing Scala version change. Tested by passing GitHub Actions.

  Closes #36807 from LuciferYang/SPARK-39414. Authored-by: yangjie01 <yangjie01@baidu.com>. Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>. (cherry picked from commit ed875a8)

* Follow-up fixes for compiler warnings surfaced by the upgrade, e.g.:

  ```
  [warn] /home/jenkins/workspace/spark-sql-catalyst-3.0/core/src/main/scala/org/apache/spark/scheduler/SpillableTaskResultGetter.scala:36: non-variable type argument org.apache.spark.scheduler.DirectTaskResult[_] in type pattern scala.runtime.NonLocalReturnControl[org.apache.spark.scheduler.DirectTaskResult[_]] is unchecked since it is eliminated by erasure
  [warn] private[spark] class SpillableTaskResultGetter(sparkEnv: SparkEnv, scheduler: TaskSchedulerImpl)

  [warn] /home/jenkins/workspace/spark-sql-catalyst-3.0/mllib/src/main/scala/org/apache/spark/ml/feature/RFormulaParser.scala:287: match may not be exhaustive.
  It would fail on the following input: ~(~(_, (x: String forSome x not in "^")), _)
  [warn] private val pow: Parser[Term] = term ~ "^" ~ "^[1-9]\\d*".r ^^ {

  [warn] /home/jenkins/workspace/spark-sql-catalyst-3.0/mllib/src/main/scala/org/apache/spark/ml/feature/RFormulaParser.scala:301: match may not be exhaustive.
  It would fail on the following input: ~(~(_, (x: String forSome x not in "~")), _)
  [warn] (label ~ "~" ~ expr) ^^ { case r ~ "~" ~ t => ParsedRFormula(r, t.asTerms.terms) }
  ```

  Co-authored-by: Dongjoon Hyun <dhyun@apple.com>
  Co-authored-by: Dongjoon Hyun <dongjoon@apache.org>
  Co-authored-by: yangjie01 <yangjie01@baidu.com>