DOC california housing example #308
To quote:

The goal of this exercise is to go through a semi-realistic data science and machine learning task and develop a practical solution for it. We will learn about the following topics:

- Perform *exploratory data analysis*
- Do some non-trivial *feature engineering*
- Explain how the feature engineering informs the *choice of machine learning model* and vice versa
- Show how to make use of a couple of *advanced scikit-learn* features and explain why we use them
- Create a *model card* that provides useful information about the model
- Share the model by uploading it to the *Hugging Face Hub*
|
@skops-dev/maintainers ready for review. It was unfortunately quite some work to convert this from ipynb to this percent-py format. Hopefully, I didn't miss anything. |
adrinjalali
left a comment
Happy to ship this, it's pretty nice!
  importances,
  X_test.columns,
- plot_file=Path(local_repo) / "importance.png",
+ plot_file=str(Path(local_repo) / "importance.png"),
add_permutation_importances expects plot_file to be str.
shouldn't we be fixing that instead?
BenjaminBossan
left a comment
I replied to your comments, please review again.
  importances,
  X_test.columns,
- plot_file=Path(local_repo) / "importance.png",
+ plot_file=str(Path(local_repo) / "importance.png"),
add_permutation_importances expects plot_file to be str.
Previously, it was annotated as only taking str.
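One way to make the `str(...)` cast at the call site unnecessary is for the receiving function to accept both `str` and `Path` and normalize internally. The sketch below is a minimal illustration of that idea; `normalize_plot_file` is a hypothetical helper, not part of skops, and the temporary directory only stands in for the example's `local_repo`.

```python
from __future__ import annotations

import os
import tempfile
from pathlib import Path


def normalize_plot_file(plot_file: str | os.PathLike) -> str:
    """Hypothetical helper: accept a str or a Path and return a plain str path."""
    # os.fspath returns str unchanged and converts Path objects to str.
    return os.fspath(plot_file)


# Stand-in for the example's local_repo (assumption for this sketch only).
local_repo = tempfile.mkdtemp()

as_path = Path(local_repo) / "importance.png"
as_str = str(as_path)

# Both spellings of the argument end up as the same string.
same = normalize_plot_file(as_path) == normalize_plot_file(as_str)
```

With a normalization step like this on the receiving side, callers could pass either form, and the type annotation could be widened accordingly.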
|
RTD failing? |
|
For posterity, a quick summary of what caused the bug: When sphinx-gallery runs the examples, it seems to use a single process for all. Because of that, sklearnex's
When running the new example in isolation, the bug does not occur because patching is not happening. |
adrinjalali
left a comment
Thanks for the debugging, @BenjaminBossan, this was nasty to deal with.
cc @ahuber21, @napetrov, your example is causing very hard-to-debug issues. We're removing the relevant pieces from the example.
# in faster inference times, since loading a persisted model will always load
# the objects exactly as they were saved.
This comment section should also be removed. I think we should remove all mentions of patch_sklearn(), since it should never have been used.
|
As an intermediate check, you could also call |
|
@BenjaminBossan - thanks for reporting - fixing this here - uxlfoundation/scikit-learn-intelex#1224 |