SAM2 Video support fp16 #43268
Conversation
@yonigozlan bump. There are a few CI/CD pipelines that are failing; something to do with code quality and consistency between SAM 2 and SAM 3 (but I'm not entirely sure).
Thanks! The checks are failing because you are modifying a
Thank you for raising this issue @Guppy16! Indeed, I could reproduce the problem. I simplified the fix a bit to convert to the correct dtype as late as possible, and added tests.
[For maintainers] Suggested jobs to run (before merge) run-slow: edgetam_video, sam2_video, sam3_tracker_video, sam3_video |
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
* fix: cast memory attention inputs to inference session dtype
* chore: fix formatting
* add fix and tests

---------

Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
What does this PR do?
Fix the SAM2 Video inference processor so that it supports float16 (currently it works only for fp32 and bfloat16).
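A minimal sketch of the late-cast pattern the PR describes (casting inputs to the module's dtype as late as possible). The helper name `cast_to_module_dtype` is illustrative, not the actual transformers API:

```python
import torch

# Hypothetical helper (not the transformers implementation): look up the
# module's parameter dtype and cast incoming tensors to it just before use.
def cast_to_module_dtype(module: torch.nn.Module, *tensors):
    target = next(module.parameters()).dtype
    return tuple(t.to(target) for t in tensors)

# An fp16 module standing in for the memory-attention layers of the session.
layer = torch.nn.Linear(8, 8).to(torch.float16)
feats = torch.randn(2, 8)  # float32 features arriving from upstream

# Cast as late as possible, right at the call boundary.
(feats_fp16,) = cast_to_module_dtype(layer, feats)
```

Casting at the boundary keeps the rest of the pipeline dtype-agnostic, so the same code path works for fp32, bfloat16, and float16 sessions.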
How to reproduce
Demo source from here
The demo runs with `dtype = torch.bfloat16` and `dtype = torch.float32`, but fails with `dtype = torch.float16`.
Who can review?
@yonigozlan @molbap