ENH add metric frame #298
Conversation
BenjaminBossan left a comment:

I didn't do a thorough review yet, just high level.
> I feel I should write a test for this, but I'm unsure where it should go. I was thinking of `skops/card/tests/test_card.py` inside the `TestTableSection` class. Thoughts? Is there somewhere else it would be better?
I think it would be best to add a completely new `TestAddMetricFrame` class. You can take inspiration from the other tests, though.
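Such a test class could, for example, exercise the table-building logic against a mock. A minimal sketch, where every name (`FakeMetricFrame`, `metric_frame_to_table`) is hypothetical and not the actual skops API:

```python
class FakeMetricFrame:
    """Mock standing in for fairlearn.metrics.MetricFrame (hypothetical)."""

    def difference(self):
        return {"selection_rate": 0.4}

    def group_max(self):
        return {"selection_rate": 0.8}

    def group_min(self):
        return {"selection_rate": 0.4}

    def ratio(self):
        return {"selection_rate": 0.5}


def metric_frame_to_table(mf):
    """Roughly the table a card method might render (an assumption)."""
    return {
        "difference": mf.difference(),
        "group_max": mf.group_max(),
        "group_min": mf.group_min(),
        "ratio": mf.ratio(),
    }


class TestAddMetricFrame:
    def test_table_contains_all_summaries(self):
        table = metric_frame_to_table(FakeMetricFrame())
        assert set(table) == {"difference", "group_max", "group_min", "ratio"}
```

Using a mock keeps the test independent of fairlearn; a real test in skops would assert on the rendered card section instead.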
```python
from typing import Any, Iterator, Literal, Protocol, Sequence, Union

import joblib
from fairlearn.metrics import MetricFrame
```
We should not import fairlearn at the top level, since it would mean that it's a required dependency. It should be imported inside the corresponding method, similar to how you import pandas there.
However, you should not import them directly, but use the helper function we introduced, like here:
`skops/skops/card/_model_card.py`, line 1130 (commit `b324b8d`)
```python
    self, metrics: dict, y_true, y_pred, sensitive_features, pivot=True
) -> Card:
    """
    Add a metric frame table to the model card.
```
About the name `add_metric_frame`: I'm just wondering out loud if we can find a better name. For users who don't know fairlearn, the name could be very confusing, and the docstring "Add a metric frame table to the model card" doesn't add much.
@BenjaminBossan Thanks for the tip! Sorry, I missed that in the Contribution Guide. I'm currently working on the tests. I'll ping you when it's ready!

@BenjaminBossan This is ready for you to take a look at!
BenjaminBossan left a comment:
Thanks a lot for this. I have a couple of code comments, but before considering those, I would like to discuss a bigger design decision.
Right now, the user more or less passes the arguments for `MetricFrame` to the `add_fairlearn_metric_frame` method, which takes care of creating the `MetricFrame` instance. I would propose to change this so that the user has to create the `MetricFrame` instance themselves, then pass it to `add_fairlearn_metric_frame` ("inversion of control"), which then does not need to construct it but just takes care of creating and adding the table.
This approach is similar to `add_permutation_importances`, which takes the computed permutation importances as input instead of computing them inside the method.
Why do I think this could be better? Here are a few reasons:

- With the current implementation, the user loses control over the instantiation of `MetricFrame`. If they want to use something like `control_features` or `sample_params`, it's not possible. Of course, we could add more parameters to the signature of `add_fairlearn_metric_frame`, but it only gets bigger and bigger that way, and we have to keep it up to date when fairlearn changes.
- Another advantage of having the user pass the instance is that we don't need to import fairlearn inside the method. If a user creates that object, they have already imported fairlearn.
- One more advantage is that if a user has a custom `MetricFrame` class, they can pass it to the method, whereas right now that's impossible. For testing, we could even create a mock object instead of using a real `MetricFrame`, and then skops would have no dependency on fairlearn at all! But I think adding a test dependency is fine here.
A disadvantage of my proposal is that users have to do a little bit of extra work by instantiating the object themselves.
Overall, I think this price is worth paying. What do you think? If we decide to make this change and you refactor the code accordingly, then many of my comments are obsolete. I think you will see which ones.
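The inversion-of-control idea can be sketched without fairlearn at all. Here `FakeMetricFrame` and `SimpleCard` are hypothetical stand-ins, not the skops or fairlearn API; the point is only that the method consumes an already-built object:

```python
class FakeMetricFrame:
    """Hypothetical stand-in for fairlearn.metrics.MetricFrame.

    The user constructs it themselves, with whatever options fairlearn
    supports (control_features, sample_params, ...).
    """

    def __init__(self, by_group):
        self.by_group = by_group

    def difference(self):
        values = self.by_group.values()
        return max(values) - min(values)


class SimpleCard:
    """Hypothetical card: it never builds the frame, it only renders it."""

    def __init__(self):
        self.sections = {}

    def add_fairlearn_metric_frame(self, metric_frame, table_name="MetricFrame"):
        # Inversion of control: consume the already-built object, so there is
        # no fairlearn import and no parameter explosion in this signature.
        self.sections[table_name] = {
            "difference": metric_frame.difference(),
            "by_group": dict(metric_frame.by_group),
        }
        return self


mf = FakeMetricFrame({"groupA": 0.8, "groupB": 0.4})
card = SimpleCard().add_fairlearn_metric_frame(mf)
print(card.sections["MetricFrame"]["difference"])  # the max-min gap across groups
```

The extra work for the user is one constructor call; in exchange, the card method's signature never has to track fairlearn's.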
```python
    y_pred,
    sensitive_features,
    table_name: str,
    pivot=True,
```
Do we really need the `pivot` option? Also, isn't this more of a "transpose" than a "pivot"?
Hmmm, I guess we don't really need `pivot` if we don't want it.
So if we don't transpose, the table looks like:
|  | difference | group_max | group_min | ratio |
|---|---|---|---|---|
| selection_rate | 0.4 | 0.8 | 0.4 | 0.5 |
But when transposing it looks like:
|  | selection_rate |
|---|---|
| difference | 0.4 |
| group_max | 0.8 |
| group_min | 0.4 |
| ratio | 0.5 |
Personally, I find the transposed version more useful, so we could just generate that one. What do you think?
I'm fine with changing the parameter name to `transpose` if we decide to keep it.
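The two layouts above can be sketched with pandas; the column set mirrors the example tables, while the exact rendering in skops may differ. In practice the values would come from `metric_frame.difference()`, `.group_max()`, `.group_min()`, and `.ratio()`:

```python
import pandas as pd

# Summary values as in the example tables above (made-up numbers).
summary = pd.DataFrame(
    {
        "difference": {"selection_rate": 0.4},
        "group_max": {"selection_rate": 0.8},
        "group_min": {"selection_rate": 0.4},
        "ratio": {"selection_rate": 0.5},
    }
)

print(summary)    # wide: one row per metric, one column per summary statistic
print(summary.T)  # transposed: one row per summary statistic
```

`DataFrame.T` is all the "pivot" amounts to here, which is one argument for calling the parameter `transpose`.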
| """ | ||
| Add a Fairlearn MetricFrame table to the model card. |
There was a problem hiding this comment.
| """ | |
| Add a Fairlearn MetricFrame table to the model card. | |
| """Add a Fairlearn MetricFrame table to the model card. |
Could you also add a few words of description here, plus a link to fairlearn? If you want to link to the class inside the fairlearn docs, you would need to add an entry to the intersphinx mappings (line 145 in commit `3e1f138`).
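A hedged sketch of what that intersphinx entry might look like in the Sphinx `conf.py`; the exact fairlearn URL and version should be checked against the docs being linked to (v0.8 is taken from the link used later in this thread):

```python
# Hypothetical snippet for docs/conf.py (Sphinx configuration).
# Each entry maps a prefix to (docs base URL, local inventory file or None).
intersphinx_mapping = {
    "sklearn": ("https://scikit-learn.org/stable/", None),
    "fairlearn": ("https://fairlearn.org/v0.8/", None),
}
```

With this in place, a docstring reference like `:class:~fairlearn.metrics.MetricFrame`` resolves to the fairlearn documentation.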
```python
    Parameters
    ----------
    metrics: dict
```
According to the fairlearn docs, `metrics` can also be a callable. I think the outputs of `metric_frame.difference()` etc. will then be scalars instead of lists, which is a bit annoying. So when the table is created, we might need to add `np.atleast_1d(metric_frame.difference())` or something like this to support callables.
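The scalar-vs-array wrinkle can be illustrated with plain numbers standing in for the `MetricFrame` outputs (the values here are made up):

```python
import numpy as np

# With a dict of metrics, metric_frame.difference() yields an array-like;
# with a single callable it yields a bare scalar.
scalar_result = 0.4          # e.g. difference() for one callable metric
array_result = [0.4, 0.1]    # e.g. difference() for a dict of two metrics

# np.atleast_1d normalizes both shapes before the table is built.
print(np.atleast_1d(scalar_result))  # [0.4]
print(np.atleast_1d(array_result))   # [0.4 0.1]
```

Both calls return a 1-D array, so the table-building code only has to handle one case.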
@BenjaminBossan Thanks for the review! I haven't looked at your code comments yet, since the decision of whether to pass in the parameters or a `MetricFrame` instance affects most of them.

I agree! I was thinking of that while writing the method but was unsure if it was something we wanted to do. I do wonder about requiring users to install and import fairlearn themselves. Beyond that, I don't see many disadvantages that would outweigh the advantages. What are your thoughts?
We would do that anyway, fairlearn would not be installed by default when installing skops. And even if it were, users don't just discover and try out random packages that were installed :)
This + documenting it well are definitely the way to go for users to discover and use the feature.
Great, I think we agree then.

@BenjaminBossan Thanks for explaining this! I think I've addressed/responded to most of your comments. I still need to handle the scalar outputs from the `MetricFrame` methods.

Great, thanks, please ping Adrin and me once it's ready for review.

@BenjaminBossan and @adrinjalali This is ready to review again!
adrinjalali left a comment:
This looks pretty good to me, and we should probably incorporate it in our existing examples, but that could be a separate PR.
```python
def add_fairlearn_metric_frame(
    self,
    metric_frame,
    table_name: str = "Fairlearn MetricFrame Table",
    transpose=True,
) -> Card:
```
should we be adding a description to our add_* methods? From the user's perspective, it seems odd that it's tricky to add a description here for sections. I think it'd make sense for all of them to allow a description as well as a title. WDYT? also cc @skops-dev/maintainers
Yes, I agree it would be good to have, some existing methods also don't have it, e.g. add_table. It's probably easier to have a separate PR where this argument is added.
@adrinjalali For clarification, should I add the description argument to this method in this PR? Or do it in another one?
We can keep it for another PR, as @BenjaminBossan suggested.
```python
    -------
    self: Card
        The model card with the metric frame added.
    """
```

Suggested change:

```diff
+
+    Notes
+    -----
+    You can check `fairlearn's documentation
+    <https://fairlearn.org/v0.8/user_guide/assessment/index.html>`__ on how to
+    work with `MetricFrame`s.
     """
```
BenjaminBossan left a comment:
Not much to add from my side. I think the transpose feature could be implemented without pandas, but it's not a big deal, most users will probably have pandas installed anyway.
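A pandas-free transpose is indeed short; a sketch with a plain dict-of-dicts table (the `transpose_table` helper is hypothetical, not skops code):

```python
def transpose_table(table):
    """Transpose a dict-of-dicts table without pandas.

    `table` maps column name -> {row name -> value}; the result maps
    row name -> {column name -> value}.
    """
    transposed = {}
    for col, rows in table.items():
        for row, value in rows.items():
            transposed.setdefault(row, {})[col] = value
    return transposed


wide = {
    "difference": {"selection_rate": 0.4},
    "ratio": {"selection_rate": 0.5},
}
print(transpose_table(wide))
# {'selection_rate': {'difference': 0.4, 'ratio': 0.5}}
```

As noted, keeping pandas is fine too, since most skops users will have it installed anyway.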
I didn't even think about that, then forget what I said :)

@adrinjalali and @BenjaminBossan Thanks for reviewing this! I've addressed all the comments except for one I needed clarification on.
BenjaminBossan left a comment:
Fantastic, just this minor comment, then we can merge.
```python
def add_fairlearn_metric_frame(
    self,
    metric_frame,
    table_name: str = "Fairlearn MetricFrame Table",
    transpose=True,
) -> Card:
```
One more thing: Could you please change the return type annotation to `Self`? This is a change we recently made to all the other methods as well.
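The benefit of `Self` (PEP 673, in `typing` since Python 3.11, in `typing_extensions` before that) is that chained calls on a subclass stay typed as the subclass. A minimal sketch with hypothetical `Card`/`FairCard` classes, not the skops implementation:

```python
from __future__ import annotations  # lazy annotations, so this runs pre-3.11 too

try:
    from typing import Self  # Python 3.11+ (PEP 673)
except ImportError:
    pass  # older versions: type checkers can use typing_extensions.Self


class Card:
    def __init__(self) -> None:
        self.sections: list[str] = []

    def add_table(self, name: str) -> Self:
        # Returning Self instead of Card keeps chained calls on a subclass
        # typed as the subclass, not the base class.
        self.sections.append(name)
        return self


class FairCard(Card):
    def add_fairlearn_metric_frame(self, metric_frame) -> Self:
        self.sections.append("Fairlearn MetricFrame Table")
        return self


card = FairCard().add_table("Metrics").add_fairlearn_metric_frame(object())
print(card.sections)  # ['Metrics', 'Fairlearn MetricFrame Table']
```

With `-> Card`, the chained `add_fairlearn_metric_frame` call would not type-check, since `add_table` would appear to return the base class.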
@BenjaminBossan I've addressed your comment! Sorry, it took me a couple of days to get to this.
BenjaminBossan left a comment:
Great. Thanks a lot for your contributions, really appreciated.
We merged #298 and #310 shortly after each other, but they contained an incompatibility that broke the fairlearn tests (the code itself was fine). This PR fixes this incompatibility. To be clear, the only change needed to fix the tests is the following:

```diff
- actual_table = card.select("Metric Frame Table").content.format()
+ actual_table = card.select("Metric Frame Table").format()
```

On top, I added the `description` argument to `add_fairlearn_metric_frame`, to be consistent with all the other methods (also changed in #310), and also added a test for it. Since we now have 2 tests, I moved the `metric_frame` variable to a fixture. Finally, 2 small fixes:

- Added a type annotation to the `transpose` argument
- Changed the order of arguments in the docstring to match the order in the signature
Closes #278