Skip to content

ENH correctly restore default_factory of a defaultdict#433

Merged
adrinjalali merged 7 commits intoskops-dev:mainfrom
adrinjalali:default_dict
Aug 8, 2024
Merged

ENH correctly restore default_factory of a defaultdict#433
adrinjalali merged 7 commits intoskops-dev:mainfrom
adrinjalali:default_dict

Conversation

@adrinjalali
Copy link
Copy Markdown
Member

Fixes #432

During the inspection of #432 we also realized default_factory is not restored.

This also adds defaultdict, and a few other primitive builtin types as trusted.

cc @BenjaminBossan

Copy link
Copy Markdown
Collaborator

@BenjaminBossan BenjaminBossan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this bug. I added a few comments, please take a look.

Comment thread skops/__init__.py
Comment thread skops/io/_trusted_types.py Outdated

PRIMITIVE_TYPE_NAMES = ["builtins." + t.__name__ for t in PRIMITIVES_TYPES]

BUILTIN_TYPES = [list, set, map, tuple]
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder about the name. Above we have "primitive types" but they are also "builtin". Should this be "complex types", "container types", "composite types" or something along these lines?

Then we could have BUILTIN_TYPE_NAMES = PRIMITIVE_TYPE_NAMES + COMPLEX_TYPE_NAMES (or whatever name is chosen) and save a bit of code duplication.

obj["foo"] = "bar"
obj_loaded = loads(dumps(obj))
assert obj_loaded == obj
assert obj_loaded.default_factory == obj.default_factory
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should a test for OrderedDict be added too?

@adrinjalali
Copy link
Copy Markdown
Member Author

@BenjaminBossan thanks for the review, I think I addressed your comments

def test_dictionary(cls):
obj = cls({1: 5, 6: 3, 2: 4})
loaded_obj = loads(dumps(obj))
assert obj == loaded_obj
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the type also be checked? E.g. this passes:

assert {2: 20, 1: 10} == OrderedDict([(2, 20), (1, 10)])
assert {1: 10, 2: 20} == OrderedDict([(2, 20), (1, 10)])

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops, sorry. Added now @BenjaminBossan

Copy link
Copy Markdown
Collaborator

@BenjaminBossan BenjaminBossan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this, LGTM.

Failing CI seems to be unrelated, some tests fail because of a change in sklearn and some because MacOS is no longer supported by codecov (??). So feel free to merge despite the red CI.

@adrinjalali adrinjalali merged commit 4467309 into skops-dev:main Aug 8, 2024
@adrinjalali adrinjalali deleted the default_dict branch August 8, 2024 14:03
@juhoinkinen
Copy link
Copy Markdown

Do you have an ETA for a release which would include this fix?

@adrinjalali
Copy link
Copy Markdown
Member Author

Working on some sklearn 1.6 compatibility issues (due to come out soon) and then we'd do a release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

UntrustedTypesFoundException raised for standard library usage in lightgbm's Booster

3 participants