[SPARK-11891] Model export/import for RFormula and RFormulaModel #9884
I will update
Test build #46476 has finished for PR 9884 at commit
Test build #46479 has finished for PR 9884 at commit
Test build #46605 has finished for PR 9884 at commit
Just for clarification, will this approach allow for saving out a model object and then using it in a different SparkContext than the one in which it was created? I see that all the stages are written out and preserved, but I've seen SparkR throw a fit about trying to use things across contexts before.
@cafreeman I believe the saved pipeline can be loaded in another SparkContext.
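To make the question concrete, here is a minimal sketch of the round trip being discussed: fit and save under one context, stop it, then load and use the model under a fresh one. It assumes the `write`/`save` and `load` methods this PR adds to `RFormulaModel` follow the usual `MLWritable`/`MLReadable` pattern, and it uses the `SparkSession` entry point from later Spark releases. The object name, path, column names, and sample rows are hypothetical.

```scala
import org.apache.spark.ml.feature.{RFormula, RFormulaModel}
import org.apache.spark.sql.{DataFrame, SparkSession}

object RFormulaRoundTrip {
  // Hypothetical helper: start a local session (and with it a SparkContext).
  private def newSession(name: String): SparkSession =
    SparkSession.builder().master("local[2]").appName(name).getOrCreate()

  // Hypothetical sample data; rebuilt per session because a DataFrame
  // cannot outlive the context that created it.
  private def sampleData(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq((1.0, 2.0, "a"), (0.0, 3.0, "b"), (1.0, 5.0, "a")).toDF("y", "x", "group")
  }

  // Fit and save in one context, then load and transform in another.
  // Returns the column names of the transformed DataFrame.
  def roundTrip(path: String): Seq[String] = {
    val first = newSession("fit-and-save")
    val model = new RFormula().setFormula("y ~ x + group").fit(sampleData(first))
    model.write.overwrite().save(path)
    first.stop() // the original SparkContext is gone from here on

    val second = newSession("load-and-transform")
    try {
      val loaded = RFormulaModel.load(path)
      loaded.transform(sampleData(second)).columns.toSeq
    } finally {
      second.stop()
    }
  }

  def main(args: Array[String]): Unit =
    println(roundTrip("/tmp/rformula-model").mkString(", "))
}
```

Nothing from the first context is reused except the files on disk, which is the property the question is about. Whether SparkR behaves the same way is not covered by this sketch.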
Test build #48939 has finished for PR 9884 at commit
ok to test
cc @jkbradley I updated the cross validator. Could you make a pass of it? Thx
Test build #48998 has finished for PR 9884 at commit
Add save/load for feature.py. Meanwhile, add save/load for `ElementwiseProduct` on the Scala side and fix a bug of missing `setDefault` in `VectorSlicer` and `StopWordsRemover`. In this PR I ignore `RFormula` and `RFormulaModel` because their Scala implementation is pending in #9884. I'll add them in this PR if #9884 gets merged first, or add a follow-up JIRA for `RFormula`. Author: Xusen Yin <yinxusen@gmail.com> Closes #11203 from yinxusen/SPARK-13036.
I'll take a look now
Test build #2644 has finished for PR 9884 at commit
    resolvedFormula: ResolvedRFormula,
    pipelineModel: PipelineModel)
  extends Model[RFormulaModel] with RFormulaBase {
    val resolvedFormula: ResolvedRFormula,
These vals should be private[feature]
I only saw a few issues. Thanks!
    pipelineModel: PipelineModel)
  extends Model[RFormulaModel] with RFormulaBase {
    private[ml] val resolvedFormula: ResolvedRFormula,
    private[ml] val pipelineModel: PipelineModel)
I'd prefer to keep the scope as private[ml] rather than private[feature], because CrossValidator accesses pipelineModel.
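The difference between the two qualifiers can be sketched in plain Scala without Spark. The packages and classes below are hypothetical stand-ins: `org.example.ml.feature` plays the role of the feature package, and `org.example.ml.tuning` plays the role of wherever `CrossValidator` lives, which is assumed here to be a sibling package under `ml`.

```scala
package org.example.ml {

  package feature {
    // Hypothetical stand-in for RFormulaModel, with simplified field types.
    class FormulaModel(
        private[ml] val pipelineModel: String,        // visible anywhere under org.example.ml
        private[feature] val resolvedFormula: String) // visible only under org.example.ml.feature
  }

  package tuning {
    // Hypothetical stand-in for a caller in a sibling package.
    object Validator {
      def describe(m: feature.FormulaModel): String =
        s"pipeline=${m.pipelineModel}" // compiles: tuning is inside ml
      // m.resolvedFormula would not compile here: tuning is outside feature
    }
  }
}

package org.example {
  object ScopeDemo {
    def main(args: Array[String]): Unit = {
      val m = new ml.feature.FormulaModel("stages", "y ~ x")
      println(ml.tuning.Validator.describe(m))
    }
  }
}
```

With `private[feature]` on both fields, the call in `Validator.describe` would be rejected by the compiler, which is the constraint raised in the comment above.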
Test build #53390 has finished for PR 9884 at commit
LGTM
https://issues.apache.org/jira/browse/SPARK-11891 Author: Xusen Yin <yinxusen@gmail.com> Closes apache#9884 from yinxusen/SPARK-11891.