[RLlib] introduce utils to serialize gym spaces and view requirements #25007
sven1977 merged 3 commits into ray-project:master
rllib/utils/serialization.py
type is a Python built-in (not strictly a reserved keyword), better not to shadow it
good catch!
I am changing the serialization key to "space" and the variable name to "space_type".
rllib/utils/serialization.py
maybe cleaner to have a dict
type_mapper = {"box": _box, ...}
and then simply
func = type_mapper.get(d["type"])
return func(d)
checking for invalid type can happen before. wdyt?
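A minimal sketch of the dict-based dispatch suggested above, with the invalid-type check done before the lookup. It uses the renamed "space" key; the builder functions `_box` and `_discrete`, the `SPACE_BUILDERS` name, and the tuple return values are hypothetical stand-ins, not the code in this PR.

```python
# Hypothetical builders: each turns a serialized dict into a space-like value.
# Real code would construct gym spaces here; tuples keep the sketch dependency-free.
def _box(d):
    return ("box", d["low"], d["high"])


def _discrete(d):
    return ("discrete", d["n"])


# Maps the value stored under the "space" key to its builder.
SPACE_BUILDERS = {"box": _box, "discrete": _discrete}


def space_from_dict(d):
    space_type = d.get("space")
    # Validate first, so an unknown type fails with a clear message.
    if space_type not in SPACE_BUILDERS:
        raise ValueError(f"Unknown space type: {space_type!r}")
    return SPACE_BUILDERS[space_type](d)
```

Adding a new space type then only needs one new builder and one new dict entry.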
This works with margin=None, which you wisely put as default, so I think you can simplify this to just
for a in zip(a1, a2):
    eq(a[0], a[1], margin)
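To illustrate why a single loop is enough when margin defaults to None, here is a rough sketch. The `eq` helper and the `check_all` wrapper are assumptions made up for this example, not the test utility touched in this PR.

```python
import math


def eq(x, y, margin=None):
    # Hypothetical comparison helper: exact match when no margin is given,
    # otherwise an absolute-tolerance comparison.
    if margin is None:
        assert x == y, f"{x} != {y}"
    else:
        assert math.isclose(x, y, abs_tol=margin), f"{x} !~ {y} (margin={margin})"


def check_all(a1, a2, margin=None):
    # One loop covers both cases because eq() handles margin=None itself.
    assert len(a1) == len(a2), "sequences differ in length"
    for x, y in zip(a1, a2):
        eq(x, y, margin)
```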
rllib/utils/serialization.py
can we call this array instead of d? ok in tests, but let's not be too cryptic here!
Agreed, we should never use single-char variable names.
habit from working with other languages.
I agree function signatures shouldn't use these super-short names.
what are your opinions on using these short names only for local variables inside a function?
think this is fine. especially for complex and math-heavy functionality, imo it can even be an advantage to have concise notation. depends on context, of course.
rllib/utils/serialization.py
maybe a little docstring here as well?
done. settled on b64_string
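For context, a sketch of what round-tripping numeric data through a base64 string can look like, using the `array` and `b64_string` naming agreed on above. The function names are hypothetical, and the standard-library `array` module stands in for a NumPy array here; this is not the PR's implementation.

```python
import array
import base64
import json


def to_b64_string(values):
    """Pack a list of floats into a base64 string that JSON can carry."""
    raw = array.array("d", values).tobytes()  # native-endian float64 bytes
    return base64.b64encode(raw).decode("ascii")


def from_b64_string(b64_string):
    """Inverse of to_b64_string: decode the base64 string back to floats."""
    decoded = array.array("d")
    decoded.frombytes(base64.b64decode(b64_string))
    return list(decoded)


# The encoded string is plain ASCII, so it can sit inside a JSON document.
payload = json.dumps({"low": to_b64_string([-1.0, -1.0])})
```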
rllib/utils/serialization.py
maxpumperla left a comment
left a couple of non-blocking comments
Looks great! Thanks for trying to solve this problem. I'm guessing this is mostly for the serializability of our Policies, which all have a view-requirements dict (with spaces inside) as well as observation and action gym spaces. Can you write one sentence about the main purpose of this PR in the comment?
sure
gjoliver left a comment
done. all comments addressed. thanks for the awesome review.
maxpumperla left a comment
thanks for addressing everything! 👍
… into JSON-serializable dicts.
This is for safe checkpointing of an RLlib policy's obs and action spaces, and its view requirements (each of which currently contains a space attribute).
Why are these changes needed?
This is so we can checkpoint everything a policy needs to run independently.
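As a rough illustration of the goal, the sketch below round-trips a view requirement that holds a space through JSON. `BoxSpec`, `ViewRequirementSpec`, and the `*_to_dict` / `*_from_dict` helpers are hypothetical stand-ins that do not depend on gym or RLlib; the actual utilities live in rllib/utils/serialization.py.

```python
import json
from dataclasses import dataclass


@dataclass
class BoxSpec:
    # Hypothetical stand-in for a gym Box space.
    low: float
    high: float
    shape: tuple


@dataclass
class ViewRequirementSpec:
    # Hypothetical stand-in for a view requirement with a space attribute.
    data_col: str
    shift: int
    space: BoxSpec


def box_to_dict(box):
    # Tuples become lists in JSON, so store the shape as a list explicitly.
    return {"space": "box", "low": box.low, "high": box.high, "shape": list(box.shape)}


def box_from_dict(d):
    return BoxSpec(low=d["low"], high=d["high"], shape=tuple(d["shape"]))


def view_requirement_to_dict(vr):
    return {"data_col": vr.data_col, "shift": vr.shift, "space": box_to_dict(vr.space)}


def view_requirement_from_dict(d):
    return ViewRequirementSpec(d["data_col"], d["shift"], box_from_dict(d["space"]))


vr = ViewRequirementSpec("obs", -1, BoxSpec(-1.0, 1.0, (4,)))
checkpoint = json.dumps(view_requirement_to_dict(vr))  # safe to write to disk
restored = view_requirement_from_dict(json.loads(checkpoint))
```

Because the checkpoint is plain JSON, it can be read back without unpickling arbitrary objects.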
Related issue number
Checks
scripts/format.sh to lint the changes in this PR.