MNT Sklearn1.6 compatibility by TamaraAtanasoska · Pull Request #447 · skops-dev/skops

TamaraAtanasoska · 2024-11-04T14:46:02Z

Reference Issues/PRs

Fixes #443. Not fully working yet.

What does this implement/fix? Explain your changes.

A few of the compatibility errors, including the one explained in #443, are now fixed. There are some external failures (errors happening outside of skops) that I am not sure how to fix. @adrinjalali any tips? Below are the remaining failures:

FAILED skops/io/tests/test_persist.py::test_can_persist_fitted[GraphicalLassoCV(cv=3,max_iter=5)] - FloatingPointError: Non SPD result: the system is too ill-conditioned for this solver. The system is too ill-conditioned for this ...
FAILED skops/io/tests/test_persist.py::test_can_persist_fitted[PassiveAggressiveClassifier(max_iter=5)] - AssertionError
FAILED skops/io/tests/test_persist.py::test_can_persist_fitted[SGDClassifier(max_iter=5)] - AssertionError
FAILED skops/io/tests/test_persist.py::test_can_persist_fitted[SGDOneClassSVM(max_iter=5)] - AssertionError

Any other comments?

First PR to the project, so the fixes might be off.
I understood #443 to mean that I can remove all SGD stuff from the _sklearn.py file when working with sklearn 1.6+. Tests pass that way, but I am not sure whether the change is supposed to be so severe.

TamaraAtanasoska · 2024-11-11T12:48:48Z

Opened an issue in quantile_forest: zillow/quantile-forest#103

adrinjalali · 2024-11-13T16:14:24Z

Can you get the output with pytest -l to see all the local variables on the stack trace?

TamaraAtanasoska · 2024-11-18T11:08:41Z

> Can you get the output with pytest -l to see all the local variables on the stack trace?

The issue is about versioning again. Separating the SGD stuff by sklearn version would most likely solve these last errors. I will look into it today. It seems a bit more complex in this case than the rest, at least at first glance. I will ping with questions.

adrinjalali

This would pass tests once scikit-learn/scikit-learn#30356 is merged in sklearn

adrinjalali · 2024-11-27T17:37:24Z

        X, y = get_input(estimator)
-        tags = _safe_tags(estimator)
-        if tags.get("requires_fit", True):
+        if get_tags(estimator).requires_fit:


scikit-learn now uses dataclasses for tags, which are emulated here for old sklearn as well
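
To illustrate the idea of emulating dataclass tags on top of the older dict-based tags, here is a minimal sketch. It is not the code from this PR: the `Tags` class is a trimmed-down hypothetical stand-in with a single field, the toy estimators are invented for the example, and the fallback assumes old-style estimators expose a `_more_tags()` method returning a dict.

```python
from dataclasses import dataclass


@dataclass
class Tags:
    """Hypothetical, trimmed-down stand-in for a dataclass-based tags object."""

    requires_fit: bool = True


def get_tags(estimator):
    """Return tags as a dataclass, falling back to the older dict-based API.

    Assumption: old-style estimators expose a ``_more_tags()`` method that
    returns a dict of tag overrides.
    """
    more_tags = getattr(estimator, "_more_tags", None)
    raw = more_tags() if callable(more_tags) else {}
    return Tags(requires_fit=raw.get("requires_fit", True))


class StatelessTransformer:
    """Toy estimator that declares it does not need fitting."""

    def _more_tags(self):
        return {"requires_fit": False}


class PlainEstimator:
    """Toy estimator with no tag overrides."""
```

With a shim like this, calling code can use attribute access (`get_tags(estimator).requires_fit`) regardless of which tag style the estimator provides.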

adrinjalali · 2024-11-27T17:38:20Z

-    Huber,
-    Log,
-    LossFunction,


scikit-learn has moved loss functions / classes to a more central place, so the ones shared between estimators have been removed from the SGD-specific file.

adrinjalali · 2024-11-27T17:38:44Z

+    from sklearn._loss._loss import (
+        CyAbsoluteError,
+        CyExponentialLoss,
+        CyHalfBinomialLoss,
+        CyHalfGammaLoss,
+        CyHalfMultinomialLoss,
+        CyHalfPoissonLoss,
+        CyHalfSquaredError,
+        CyHalfTweedieLoss,
+        CyHalfTweedieLossIdentity,
+        CyHuberLoss,
+        CyPinballLoss,
+    )
+


these are the new loss classes
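
One way to keep both old and new scikit-learn working is to choose the import location based on the installed version. The sketch below only shows the version check, using the standard library. The helper names are hypothetical, and the old module path `sklearn.linear_model._sgd_fast` is an assumption about where the removed classes used to live; only `sklearn._loss._loss` appears in the diff above.

```python
import re


def parse_release(version):
    """Extract the leading numeric components, e.g. '1.6.dev0' -> (1, 6)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def loss_module_for(sklearn_version):
    """Pick the module to import loss classes from (illustrative only)."""
    if parse_release(sklearn_version) >= (1, 6):
        return "sklearn._loss._loss"
    # Assumed pre-1.6 location of the SGD loss classes.
    return "sklearn.linear_model._sgd_fast"
```

Note that this simple parser treats a nightly build such as `1.6.dev0` as 1.6, which is the intent here since the nightly already contains the relocated classes.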

adrinjalali · 2024-11-27T17:39:38Z



-def _assert_generic_objects_equal(val1, val2):
+def _assert_generic_objects_equal(val1, val2, path=""):


I've added the path argument to give devs a more precise idea of where things fail, which makes debugging easier.
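
As a rough illustration of how a `path` argument helps, the sketch below threads the current location through a recursive comparison so the assertion message names the failing element. The function `assert_equal_with_path` is a hypothetical simplification and not the body of `_assert_generic_objects_equal`.

```python
def assert_equal_with_path(val1, val2, path=""):
    """Recursively compare two objects, naming the failing location."""
    where = path or "<root>"
    if isinstance(val1, dict) and isinstance(val2, dict):
        assert val1.keys() == val2.keys(), f"Key mismatch at {where}"
        for key in val1:
            # Extend the path with the dict key before descending.
            assert_equal_with_path(val1[key], val2[key], f"{path}[{key!r}]")
    elif isinstance(val1, (list, tuple)) and isinstance(val2, (list, tuple)):
        assert len(val1) == len(val2), f"Length mismatch at {where}"
        for idx, (item1, item2) in enumerate(zip(val1, val2)):
            # Extend the path with the sequence index before descending.
            assert_equal_with_path(item1, item2, f"{path}[{idx}]")
    else:
        assert val1 == val2, f"Mismatch at {where}: {val1!r} != {val2!r}"
```

Comparing `{"coef": [1, 2]}` with `{"coef": [1, 3]}` then fails with a message that points at `['coef'][1]` rather than a bare `AssertionError`.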

adrinjalali · 2024-11-27T17:40:12Z


 # Default settings for X
-N_SAMPLES = 50
+N_SAMPLES = 100


An SGD estimator was ill-defined with 50 samples, so this increases the count to 100.

TamaraAtanasoska · 2024-11-28T09:17:58Z

@adrinjalali is there any legwork left that you'd like to share to make this faster, or is it good to go as it is once the changes are merged in sklearn?

adrinjalali · 2024-11-28T09:21:09Z

@TamaraAtanasoska what's left is to take all estimators generated by _construct_instances instead of only the first one. We'd need to restructure the test to allow for a for loop over the values yielded by the _construct_instances function.
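
The restructuring described above could look roughly like this sketch. `construct_instances` here is a hypothetical stand-in generator (the real `_construct_instances` lives in scikit-learn and is not reproduced), and the toy estimator classes and their `test_params` attribute are invented for the example.

```python
def construct_instances(estimator_cls):
    """Hypothetical stand-in: yield one instance per known parameter set."""
    for params in getattr(estimator_cls, "test_params", [{}]):
        yield estimator_cls(**params)


def all_test_instances(estimator_classes):
    """Flatten every yielded instance instead of keeping only the first."""
    for estimator_cls in estimator_classes:
        yield from construct_instances(estimator_cls)


class ToyEstimator:
    # Two parameter sets, so two instances should be produced.
    test_params = [{"alpha": 0.1}, {"alpha": 1.0}]

    def __init__(self, alpha=0.5):
        self.alpha = alpha


class OtherEstimator:
    # No test_params attribute, so a single default instance is produced.
    def __init__(self):
        pass


instances = list(all_test_instances([ToyEstimator, OtherEstimator]))
```

The flattened list can then be handed to `pytest.mark.parametrize`, so each instance becomes its own test case.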

TamaraAtanasoska · 2024-11-28T09:36:55Z

> @TamaraAtanasoska what's left is to take all estimators generated by _construct_instances instead of only the first one. We'd need to restructure the test to allow for a for loop over the values yielded by the _construct_instances function.

Do you mean in _tested_estimators and _unsupported_estimators? Aren't both tests already looping through each estimator in all_estimators and then creating only one instance of it? Or is that exactly the issue, and it would be better to create instances of all estimators at once and then loop through them for sklearn 1.6?

Edit: I see this is not about all_estimators, but about all the estimators produced by _construct_instances from the one estimator retrieved from all_estimators and passed to it in the test 😄 It is confusing.

Should I take it on, or are you already working on it?

adrinjalali · 2024-11-28T10:02:57Z

> Should I take it on, or are you already working on it?

No I'm not working on that, you can take it on :)

adrinjalali · 2024-11-28T10:09:08Z

BTW, we're working with @glemaitre on this: https://github.com/glemaitre/sklearn-compat

TamaraAtanasoska · 2024-11-28T10:15:43Z

> BTW, we're working with @glemaitre on this: https://github.com/glemaitre/sklearn-compat

that is a cool effort! This can get stressful for a lot of folks otherwise

TamaraAtanasoska · 2024-11-28T16:01:02Z

@adrinjalali there is a TypeError related to the losses as a result of this new commit 712cb13. I suppose that will be fixed with everything together?

FAILED skops/io/tests/test_persist.py::test_can_persist_fitted[TweedieRegressor(max_iter=5)] - TypeError: __init__() takes exactly 1 positional argument (3 given)

>   ???
E   TypeError: __init__() takes exactly 1 positional argument (3 given)

sklearn/_loss/_loss.pyx:1615: TypeError

adrinjalali · 2024-11-29T12:08:50Z

Yes, that's exactly the error fixed with the now merged PR. The CI here should be green tomorrow once the nightly build is uploaded / updated.

TamaraAtanasoska · 2024-11-29T13:49:37Z

@adrinjalali refactored in c3da1b9 to account for the new changes in scikit-learn/scikit-learn#30372, although I guess that will be merged soon. If it gets merged before we merge this, I'd like to change the code back to roughly how I had it before, as it was nicer :)

adrinjalali

There's a whole bunch of fixes here. I think I'd rather merge this and do other fixes in separate PRs.

Also, it turns out we were NOT testing against old sklearn versions. That is fixed here, which makes CI fail on those old versions, but at least they're tested now. Also, the nightly build should be fixed (at least I tested locally with the latest version).

TamaraAtanasoska added 6 commits November 4, 2024 13:31

Fix _sgd imports

6901126

Fix _safe_tags import issue

19b71c5

Change _construct_instance import

e718fce

Change get_tags syntax

b3f6401

Ignore FutureWarning in sklearn

dadf4e3

Merge branch 'main' into sklearn1.6-compatibility

0146a74

TamaraAtanasoska changed the title ~~Sklearn1.6 compatibility~~ MNT Sklearn1.6 compatibility Nov 4, 2024

adrinjalali reviewed Nov 11, 2024

Comment thread pyproject.toml Outdated

Comment thread skops/io/_sklearn.py Outdated

Comment thread skops/io/_sklearn.py Outdated

Comment thread skops/io/_sklearn.py Outdated

Comment thread skops/io/tests/test_persist.py Outdated

TamaraAtanasoska and others added 5 commits November 11, 2024 12:20

Update skops/io/_sklearn.py

45af8a0

Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>

Update skops/io/_sklearn.py

7abf51e

Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>

fix typo

6332470

Fix variable name inconsitency

d8da963

Add clearer message about warning supression

d9a163b

WIP

cb0b215

TamaraAtanasoska added 4 commits November 14, 2024 12:57

Add explicit typing

c367109

Merge branch 'main' into sklearn1.6-compatibility

c96be0b

Remove stray WIP with prints

051eead

Fix tags issues

a1f4344

adrinjalali reviewed Nov 15, 2024

Comment thread skops/io/_sklearn.py Outdated

Update skops/io/_sklearn.py

e6b4df3

Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>

Make the use of SGD models conditional on sklearn version

ed77ced

TamaraAtanasoska marked this pull request as draft November 18, 2024 14:26

TamaraAtanasoska added 4 commits November 18, 2024 16:09

Add relative paths to fix import errors

0983b80

Merge branch 'main' into sklearn1.6-compatibility

fdb4b20

Add construct_instances for both versions

0388d0b

Move imports for construct_instances

926f972

TamaraAtanasoska and others added 5 commits November 25, 2024 14:50

Add error for SGD class and incompatible sklearn version

d5696ff

Copy code for scikit-learn for est tags

cfeef0a

Fix loss issues

e1751fc

minor fix

b916346

reduce diff

e1d0132

adrinjalali reviewed Nov 27, 2024

annotations import

960dff9

TamaraAtanasoska mentioned this pull request Nov 28, 2024

MNT Make sure fairlearn is scikit-learn 1.6 compatible fairlearn/fairlearn#1446

Closed

work with all instances from _construct_instances

712cb13

TamaraAtanasoska commented Nov 28, 2024

Comment thread skops/io/tests/test_persist.py Outdated

Refactor get_input()

c3da1b9

adrinjalali added 8 commits December 2, 2024 11:06

trigger CI

a8dad87

debug CI

69325c0

...

d2ecc45

...

87065a3

...

d9eaaff

...

bcc78d4

...

24783a4

...

e2f0c82

adrinjalali approved these changes Dec 2, 2024

adrinjalali merged commit 00f5f07 into skops-dev:main Dec 2, 2024


