ENH: Quantile Forest Support#384
ENH: Quantile Forest Support#384adrinjalali merged 12 commits intoskops-dev:mainfrom reidjohnson:quantile-forest-support
Conversation
BenjaminBossan
left a comment
There was a problem hiding this comment.
Thanks so much for providing this fix. I haven't done a proper review yet. Would you mind adding a test for quantile forest estimators? It could be similar to the tests here. If you need help with that, let us know.
-Adds base unit tests -Patches constructor visualization error
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
| "here: https://github.com/skops-dev/skops/issues" | ||
| ) | ||
|
|
||
| if node_name == "constructor": |
There was a problem hiding this comment.
I would include that in a separate PR.
|
@BenjaminBossan: Sure, I took a pass at adding a test for the quantile forest estimators. This also uncovered a bug in the visualization of sklearn DecisionTreeRegressors reported as #385, with a suggested resolution included in these changes. |
adrinjalali
left a comment
There was a problem hiding this comment.
Thanks for the nice PR. Other than the small nits, I think this looks good.
Might be time for us to start working on a better way to handle C extension types.
| self.trusted = self._get_trusted( | ||
| trusted, | ||
| [get_module(QuantileForest) + ".QuantileForest"], | ||
| ) |
There was a problem hiding this comment.
I'd be happier if we don't trust them by default.
| "here: https://github.com/skops-dev/skops/issues" | ||
| ) | ||
|
|
||
| if node_name == "constructor": |
There was a problem hiding this comment.
I would include that in a separate PR.
| tree_methods = [ | ||
| quantile_forest.RandomForestQuantileRegressor, | ||
| quantile_forest.ExtraTreesQuantileRegressor, | ||
| ] |
There was a problem hiding this comment.
move to @pytest.mark.parametrize?
In favor of separate PR.
Uses pytest.mark.parametrize
|
@reidjohnson tests failing |
|
Failing due to #385. |
| from __future__ import annotations | ||
|
|
||
| from typing import Any, Sequence | ||
|
|
||
| from ._protocol import PROTOCOL | ||
| from ._sklearn import ReduceNode, reduce_get_state | ||
| from ._utils import LoadContext, SaveContext | ||
|
|
||
|
|
||
| def _lazy_import(): | ||
| from quantile_forest._quantile_forest_fast import QuantileForest | ||
|
|
||
| def quantile_forest_get_state( | ||
| obj: Any, | ||
| save_context: SaveContext, | ||
| ) -> dict[str, Any]: | ||
| state = reduce_get_state(obj, save_context) | ||
| state["__loader__"] = "QuantileForestNode" | ||
| return state | ||
|
|
||
| class QuantileForestNode(ReduceNode): | ||
| def __init__( | ||
| self, | ||
| state: dict[str, Any], | ||
| load_context: LoadContext, | ||
| trusted: bool | Sequence[str] = False, | ||
| ) -> None: | ||
| super().__init__( | ||
| state, | ||
| load_context, | ||
| constructor=QuantileForest, | ||
| trusted=trusted, | ||
| ) | ||
| self.trusted = self._get_trusted(trusted, []) | ||
|
|
||
| return QuantileForest, QuantileForestNode, quantile_forest_get_state | ||
|
|
||
|
|
||
| try: | ||
| QuantileForest, QuantileForestNode, quantile_forest_get_state = _lazy_import() | ||
|
|
||
| # tuples of type and function that gets the state of that type | ||
| GET_STATE_DISPATCH_FUNCTIONS = [(QuantileForest, quantile_forest_get_state)] | ||
|
|
||
| # tuples of type and function that creates the instance of that type | ||
| NODE_TYPE_MAPPING = {("QuantileForestNode", PROTOCOL): QuantileForestNode} | ||
| except ModuleNotFoundError: | ||
| GET_STATE_DISPATCH_FUNCTIONS = [] | ||
| NODE_TYPE_MAPPING = {} |
There was a problem hiding this comment.
What do you think of this instead? (slightly different)
from __future__ import annotations
from typing import Any, Sequence
from ._protocol import PROTOCOL
from ._sklearn import ReduceNode, reduce_get_state
from ._utils import LoadContext, SaveContext
try:
from quantile_forest._quantile_forest_fast import QuantileForest
except ImportError:
QuantileForest = None
def quantile_forest_get_state(
obj: Any,
save_context: SaveContext,
) -> dict[str, Any]:
state = reduce_get_state(obj, save_context)
state["__loader__"] = "QuantileForestNode"
return state
class QuantileForestNode(ReduceNode):
def __init__(
self,
state: dict[str, Any],
load_context: LoadContext,
trusted: bool | Sequence[str] = False,
) -> None:
if QuantileForest is None:
raise ImportError(
"`quantile_forest` is missing and needs to be installed in order to"
" load this model."
)
super().__init__(
state,
load_context,
constructor=QuantileForest,
trusted=trusted,
)
self.trusted = self._get_trusted(trusted, [])
# tuples of type and function that gets the state of that type
if QuantileForest is not None:
GET_STATE_DISPATCH_FUNCTIONS = [(QuantileForest, quantile_forest_get_state)]
# tuples of type and function that creates the instance of that type
NODE_TYPE_MAPPING = {("QuantileForestNode", PROTOCOL): QuantileForestNode}|
|
||
|
|
||
| class TestQuantileForest: | ||
| """Tests for RandomForestQuantileRegressor and ExtraTreesQuantileRegressor""" |
There was a problem hiding this comment.
Is this not missing QuantileForest?
There was a problem hiding this comment.
These are the user-facing classes which are explicitly tested, QuantileForest is the underlying Cython object shared by the user-facing classes and isn't explicitly tested here (but is serialized by the loading/dumping).
BenjaminBossan
left a comment
There was a problem hiding this comment.
LGTM, thanks a lot for adding the support. I have an extremely minor comment, but this is good to be merged from my point of view.
|
I just noticed this is missing a changelog entry. |
|
Updated, thanks for flagging. |
|
awesome work everyone, y'all are amazing! I really appreciate it! |
|
I finally got back to this effort and it seems like, even with the changes, I'm still seeing the following error when attempting to load a quantile forest model: |
|
more specifically: |
|
@brentonmallen1 Just checking, but are you using the development build? I don't believe there's yet been a public release since these changes were merged. |
|
I was using the main release for 0.8.0 because the changelog had mentioned QuantileForest support, but I could have misread. I can try using the dev release and give it another go and report back. thank you |
|
well, I don't see the quantile changes in my site-packages so it does seem like I haven't installed what I was expecting. I also don't see a dev branch/build, how would I go about installing that directly? |
|
@brentonmallen1 I'll defer to the maintainers, but a viable route is to follow the dev setup described in the contributing doc. |
|
We just released 0.9.0, which includes this change. |
|
y'all are wonderful! thank you! |
Fixes #383.
The quantile-forest package extends sklearn random forest regressors to implement quantile regression forests. Without these changes, the Cython-based model object is unable to be loaded by skops due to its use of the
__reduce__method.I understand adding support for another package may be outside the current scope of this project. As the primary maintainer of the quantile-forest package, I'd also be interested in better understanding if there's a recommended alternative to the use of
__reduce__that would enable the model to be compatible with this project's existing secure persistence feature.