Orjson serialize set to list #5267
Thanks for the contribution. Per #5265 (comment), this is a performance-critical change and will need more work. Since none of the existing Orquesta and other tests caught this issue, it looks like this functionality is not that widely used, so we should understand how much overhead it adds (if any) and then decide how to proceed. If it turns out it adds a non-trivial amount of overhead, we should consider some other approach instead of using Another alternative would be to modify this DB field type class to only support sets where they are valid (for Orquesta context). But again, we need to first understand the impact. Maybe that won't be needed, or similar. And to understand how much overhead it adds, if any, we need to add some micro-benchmarks at the very least. There are already some examples you can use as a starting point. New micro-benchmarks need to cover multiple scenarios so we can understand how it affects performance (small dict size, medium dict size, large dict with and having a set item in various places in the dict, e.g. deeply nested, top-level attribute value, etc.).
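A rough sketch of what such a scenario matrix could look like, for illustration only. It uses the standard-library `json` module and `timeit` as stand-ins so it runs without extra dependencies; the actual st2 micro-benchmarks and the orjson-based code path are not shown here. The names `default`, `make_dict` and `bench` are hypothetical helpers, not names from the PR.

```python
import json
import timeit


def default(obj):
    # Fallback hook: the encoder calls this only for objects it cannot
    # serialize natively. Here it converts sets to lists.
    if isinstance(obj, set):
        return list(obj)
    raise TypeError(f"Type {type(obj).__name__} is not JSON serializable")


def make_dict(size, set_location=None):
    """Build a dict with `size` string keys and optionally add a set.

    set_location: None (no set), "top" (top-level value) or "nested"
    (several levels deep).
    """
    data = {f"key_{i}": f"value_{i}" for i in range(size)}
    if set_location == "top":
        data["tags"] = {1, 2, 3}
    elif set_location == "nested":
        data["a"] = {"b": {"c": {"tags": {1, 2, 3}}}}
    return data


def bench(data, use_default, number=50):
    # Time `number` serializations, with or without the fallback hook.
    kwargs = {"default": default} if use_default else {}
    return timeit.timeit(lambda: json.dumps(data, **kwargs), number=number)


results = {}
for size in (10, 1_000, 10_000):  # small, medium, large dict
    results[(size, "no set, no hook")] = bench(make_dict(size), use_default=False)
    results[(size, "no set, hook")] = bench(make_dict(size), use_default=True)
    for location in ("top", "nested"):
        data = make_dict(size, location)
        results[(size, f"{location} set, hook")] = bench(data, use_default=True)

for (size, label), seconds in results.items():
    print(f"{size:>6} keys  {label:<18} {seconds:.4f}s")
```

Comparing the "no set, no hook" and "no set, hook" rows gives a feel for the cost of merely registering the hook, while the "top" and "nested" rows show the cost when a set is actually encountered.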
class JSONDictFieldTestCase(unittest2.TestCase):
    def test_set_to_mongo(self):
        field = JSONDictField(use_header=False)
        result = field.to_mongo({"test": {1, 2}})
Round trip test would also be good, i.e. to ensure that when we deserialize the value, we get a list back.
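A minimal sketch of the round-trip idea, assuming a set-to-list fallback hook. It uses the standard-library `json` module rather than `JSONDictField` or orjson, so it only illustrates the expected behaviour (a set goes in, a list comes back); `set_to_list` is a hypothetical name.

```python
import json


def set_to_list(obj):
    # Hypothetical fallback hook: convert sets to lists, reject anything else.
    if isinstance(obj, set):
        return list(obj)
    raise TypeError(f"Type {type(obj).__name__} is not JSON serializable")


original = {"test": {1, 2}}

# Serialize: the set is written out as a JSON array.
serialized = json.dumps(original, default=set_to_list)

# Deserialize: JSON has no set type, so the value comes back as a list.
restored = json.loads(serialized)

assert isinstance(restored["test"], list)
# Set iteration order is not guaranteed, so compare sorted values.
assert sorted(restored["test"]) == [1, 2]
```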
Pretty sure it only runs this code if it bumps into a set. Can you point me
to the existing benchmark code?
The major issue is fixed with your zstd compression.
But alas I agree benchmarks will prove it.
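The claim that the fallback only runs when a set is encountered can be checked by recording calls to the hook. This is a sketch using the standard-library `json` module, whose `default` parameter is invoked only for objects the encoder cannot handle natively; orjson documents a `default` callable for unsupported types as well, but orjson itself is not exercised here. The `calls` list and `default` function are illustrative names.

```python
import json

calls = []  # records the type name of every object passed to the hook


def default(obj):
    calls.append(type(obj).__name__)
    if isinstance(obj, set):
        return list(obj)
    raise TypeError(f"Type {type(obj).__name__} is not JSON serializable")


# Only natively supported types: the hook is never invoked.
json.dumps({"a": 1, "b": [1, 2], "c": {"d": "x"}}, default=default)
assert calls == []

# Two sets, one top-level and one nested: the hook runs once per set.
json.dumps({"a": {1}, "b": {"c": {2}}}, default=default)
assert calls == ["set", "set"]
```

This shows the number of hook calls, not their cost, so it complements rather than replaces a benchmark.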
You can add it here: https://github.com/StackStorm/st2/blob/master/st2common/benchmarks/micro/test_json_serialization_and_deserialization.py#L32 Also, not sure what you mean by zstandard; that code is not actually used in prod. It was one of the proposed approaches, but in the end we decided to go with a simpler approach which doesn't include field-level compression (since the MongoDB server already handles compression on the server, aka storage size).
The blob storage for running workflows. That fixed most of my problems
function for orjson.dumps.
I've added a micro benchmark (fb2dce4) and the results show that the overhead is indeed very small / negligible. Having said that, since a set is only valid in an Orquesta workflow, it still makes sense to only support it for Orquesta DB models.
I added a round-trip test case and will go ahead and merge it into master as-is. I added a comment to the code and if it turns out it indeed adds more overhead in some other scenarios the micro benchmarks don't cover, we can change the code then to only use
Merged, thanks again for catching this.
orjson not serializing set from Yaql .toSet() function when publishing a variable. Fixes issue #5625