[Glm4.5V] fix vLLM support#40696
Conversation
|
[For maintainers] Suggested jobs to run (before merge) run-slow: glm4v |
|
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
qubvel
left a comment
There was a problem hiding this comment.
Thanks for fixing! I'm trusting you, I'm a bit out of context 😄
|
|
||
| @dataclass | ||
| class VideoMetadata: | ||
| class VideoMetadata(Mapping): |
There was a problem hiding this comment.
What is the motivation for making it iterable?
There was a problem hiding this comment.
so that we can use VideoMetadata in same way as dict. We allow users to pass both formats currently
| patch_size = images_kwargs.get("patch_size", self.patch_size) | ||
| merge_size = images_kwargs.get("merge_size", self.merge_size) | ||
| size = images_kwargs.get("size", self.size) | ||
| size = images_kwargs.get("size", {"shortest_edge": 112 * 112, "longest_edge": 28 * 28 * 15000}) |
There was a problem hiding this comment.
should we align self.size instead?
There was a problem hiding this comment.
the defaults are already using these values. It is the model checkpoint that has another set of values saved, and for some reason we never use it. Didn't want to break BC but using config values is actually correct
There was a problem hiding this comment.
hmmm, can we update config values instead? or Hub's PR will not be merged?
There was a problem hiding this comment.
we can try, lemme ask one of the authors who is on slack
There was a problem hiding this comment.
asked the authors and the config values are the correct ones. So will update processor in the next PR to use config defaults and mark it as slightly breaking
What does this PR do?
GLM-4.5V inference does not work in the latest release and this PR fixes it. Specifically, users can now pass in a
VideoMetadataobject to processor's call which is needed to backwards compatibility (adding a test for this, one sec)And a tiny fix in image processing to make helper utility aligned with
self._preprocess