Add VidEoMT by NielsRogge · Pull Request #44285 · huggingface/transformers

NielsRogge · 2026-02-25T19:24:39Z

What does this PR do?

This PR adds the VidEoMT model, as described in the paper "VidEoMT: Your ViT is Secretly Also a Video Segmentation Model".

Gradio demo (running on ZeroGPU): https://huggingface.co/spaces/nielsr/videomt-transformers-demo

Original Github thread: tue-mps/videomt#1

vasqu · 2026-03-17T16:27:32Z

What gpu @NielsRogge? Might be a diff between CI (A10) and your local one

vasqu · 2026-03-17T19:12:36Z

run-slow: videomt

github-actions · 2026-03-17T19:14:24Z

Workflow Run ⚙️

This comment contains run-slow, running the specified jobs:

models: ["models/videomt"]
quantizations: []

vasqu · 2026-03-17T19:25:38Z

I've added expectations for our CI devices so it works now, but multi-GPU still seems to fail. Would be nice if you could check, @NielsRogge

(general ci still shaky 😢)
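For readers unfamiliar with the idea of per-device expectations mentioned above, here is a minimal, standalone sketch of how a test might pick reference values by hardware and compare them with an absolute tolerance. It is an illustration only: `EXPECTED_LOGITS`, `pick_expectation`, and `assert_close` are hypothetical names, the numbers are placeholders rather than values from this PR, and this is not the helper used in the transformers test suite.

```python
import math

# Hypothetical reference values keyed by (device_type, compute_capability_major).
# All numbers are placeholders, not values taken from this PR.
EXPECTED_LOGITS = {
    ("cuda", 8): [0.1234, -0.5678, 0.9012],     # e.g. an A10-class GPU (capability 8.x)
    ("cuda", None): [0.1230, -0.5680, 0.9010],  # any other CUDA device
    (None, None): [0.1229, -0.5681, 0.9009],    # generic fallback (CPU, etc.)
}


def pick_expectation(table, device_type, major=None):
    """Return the most specific entry: exact match, then device-only, then default."""
    for key in ((device_type, major), (device_type, None), (None, None)):
        if key in table:
            return table[key]
    raise KeyError(f"no expectation for {device_type!r}, {major!r}")


def assert_close(actual, expected, atol=1e-4):
    """Element-wise comparison using only an absolute tolerance."""
    assert len(actual) == len(expected)
    for a, e in zip(actual, expected):
        assert math.isclose(a, e, rel_tol=0.0, abs_tol=atol), (a, e)
```

In a sketch like this, a test running on the CI device would call `pick_expectation(EXPECTED_LOGITS, "cuda", 8)`, while a local run on different hardware would fall back to a less specific entry, which is one way a CI-versus-local mismatch can be absorbed.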

github-actions · 2026-03-17T19:29:27Z

CI Results

Workflow Run ⚙️

Commit Info

Context	Commit	Description
RUN	5bba765b	workflow commit (merge commit)
PR	20278faf	branch commit (from PR)
main	acc89e74	base commit (on `main`)

✅ No failing test specific to this PR 🎉 👏 !

NielsRogge · 2026-03-23T09:37:27Z

@vasqu friendly ping

vasqu · 2026-03-23T09:56:53Z

Please check #44285 (comment)

It hasn't been addressed afaik

vasqu · 2026-03-25T15:53:10Z

run-slow: videomt

github-actions · 2026-03-25T15:53:44Z

[For maintainers] Suggested jobs to run (before merge)

run-slow: auto, videomt

github-actions · 2026-03-25T15:54:34Z

Workflow Run ⚙️

This comment contains run-slow, running the specified jobs:

models: ["models/videomt"]
quantizations: []

github-actions · 2026-03-25T16:06:48Z

CI Results

Workflow Run ⚙️

Commit Info

Context	Commit	Description
RUN	1a440406	workflow commit (merge commit)
PR	b192fc74	branch commit (from PR)
main	2f624917	base commit (on `main`)

✅ No failing test specific to this PR 🎉 👏 !

vasqu

Fixed the last things, good to merge now 🤗

github-actions · 2026-03-25T16:33:45Z

View the CircleCI Test Summary for this PR:

https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=44285&sha=09b99a

vasqu · 2026-03-25T17:05:47Z

Force merged, because the failure is known and I don't want to further delay this model

* First draft
* [Videomt] Extend query-stage parity checks to 3-frame inputs
* [Videomt] Add full-model parity check against EoMT reference
* [Videomt] Compare conversion against official GitHub reference
* [Videomt] Simplify conversion to checkpoint-based HF mapping
* [Videomt] Add --verify mode against upstream GitHub implementation
* [Videomt] Improve --verify diagnostics with key remapping and layer checks
* [Videomt] Improve verify backbone candidate fallback and remapping
* [Videomt] Add DINOv3 verify compatibility patch and progress logging
* [Videomt] Extend verify diagnostics with MLP/head parity checks
* [Videomt] Make --verify succeed for converted weight mapping scope
* [videomt] Improve verify adapters and candidate traceback diagnostics
* [videomt] Adapt verify _pos_embed output for DINOv3 candidates
* [videomt] Enable DINOv3 verify candidate by adapting EVA head_dim
* [videomt] Add pre-query layer diagnostics to verify flow
* [videomt] Add deterministic verify probes and deeper pre-query diffs
* [videomt] Penalize skipped keys in verify candidate scoring
* [videomt] Add no-rope A/B diagnostics to verify pre-query layers
* [videomt] Add branch-level pre-query diagnostics to verify
* [videomt] Add fine-grained MLP diagnostics to verify
* [videomt] Verify layer-scale mapping parity in --verify
* [videomt] Validate MLP diagnostic decomposition in verify
* [videomt] Add token-group diagnostics for layer-4 MLP divergence
* [VidEoMT] Add temporal query updater path and re-verify yt_2019_vit_small
* [VidEoMT] Refine 5D execution order and re-check small checkpoint parity
* Simplify conversion script and convert all dinov2 checkpoints
* Add id2label mappings
* Fix all tests
* Add to auto mapping
* Simplify verify_conversion_against_github_reference
* Update absolute tolerance
* Update date
* Revert AGENTS.md
* Address comments
* Add circleci skill, fix circleci
* Fix CI
* Remove skills from git
* Address comments
* Address more comments
* Address comment
* Add docstrings
* Restore AGENTS.md
* Address comment
* fix this one
* Address comments
* [fix] mistral 4 docs (huggingface#44776) fix
* Address comment
* add expectations
* Update date
* Make fix-repo
* fix multi gpu
* fix with changes on main
* fix date

---------

Co-authored-by: vasqu <antonprogamer@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

NielsRogge and others added 30 commits February 23, 2026 15:37

First draft

f96e9f9

[Videomt] Extend query-stage parity checks to 3-frame inputs

e170bad

[Videomt] Add full-model parity check against EoMT reference

23afffa

[Videomt] Compare conversion against official GitHub reference

327db52

[Videomt] Simplify conversion to checkpoint-based HF mapping

3d234fb

[Videomt] Add --verify mode against upstream GitHub implementation

6063212

[Videomt] Improve --verify diagnostics with key remapping and layer checks

36106f1

[Videomt] Improve verify backbone candidate fallback and remapping

22e9cac

[Videomt] Add DINOv3 verify compatibility patch and progress logging

9a93db8

[Videomt] Extend verify diagnostics with MLP/head parity checks

7b09227

[Videomt] Make --verify succeed for converted weight mapping scope

ea31bbd

Merge pull request #66 from NielsRogge/codex/contribute-videomt-model-to-transformers
[Videomt] Extend query-stage 5D/4D parity validation to 3-frame videos

9762a95

[videomt] Improve verify adapters and candidate traceback diagnostics

c6d4bf5

[videomt] Adapt verify _pos_embed output for DINOv3 candidates

1e2846b

[videomt] Enable DINOv3 verify candidate by adapting EVA head_dim

521c8a0

[videomt] Add pre-query layer diagnostics to verify flow

c0a2937

[videomt] Add deterministic verify probes and deeper pre-query diffs

f691a52

[videomt] Penalize skipped keys in verify candidate scoring

47e1a8b

[videomt] Add no-rope A/B diagnostics to verify pre-query layers

548402a

[videomt] Add branch-level pre-query diagnostics to verify

55bbfa2

[videomt] Add fine-grained MLP diagnostics to verify

55110f9

[videomt] Verify layer-scale mapping parity in --verify

a56c4d3

[videomt] Validate MLP diagnostic decomposition in verify

7061d6f

[videomt] Add token-group diagnostics for layer-4 MLP divergence

d8c4459

Merge pull request #67 from NielsRogge/codex/complete-videomt-model-conversion
[videomt] Improve verify adapters and DINOv3 failure diagnostics

bbd75fb

[VidEoMT] Add temporal query updater path and re-verify yt_2019_vit_small

0337c7d

[VidEoMT] Refine 5D execution order and re-check small checkpoint parity

484896e

Merge pull request #68 from NielsRogge/codex/fix-backbone-to-dinov2-in-videomt
[VidEoMT] Add temporal query-updater support, fix DINOv2 conversion mappings and re-verify yt_2019_vit_small

f3c5381

Simplify conversion script and convert all dinov2 checkpoints

534f4d9

Add id2label mappings

e173147

vasqu and others added 2 commits March 17, 2026 20:12

add expectations

9d05337

Merge branch 'main' into add_videomt

20278fa

NielsRogge and others added 4 commits March 18, 2026 09:10

Merge branch 'main' into add_videomt

a2135cc

Update date

025b9dc

Fix merge

bd19f9a

Make fix-repo

36166bf

vasqu and others added 2 commits March 25, 2026 16:52

fix multi gpu

9b9ec82

Merge branch 'main' into add_videomt

b192fc7

fix with changes on main

b13c2be

vasqu enabled auto-merge March 25, 2026 16:08

vasqu approved these changes Mar 25, 2026

fix date

09b99ac

vasqu disabled auto-merge March 25, 2026 17:05

vasqu merged commit 2a54236 into huggingface:main Mar 25, 2026
27 of 29 checks passed

NielsRogge mentioned this pull request Mar 30, 2026

Add SAM 3.1 #45110

Open

2 tasks

7 participants
