[SAM3-LiteText] Fix modular converter KeyError and torchvision soft dependency #72
JavierYepez wants to merge 49 commits into NielsRogge:add_sam_3_lite_text from
Conversation
* Fix tie_word_embedding issues with `Qwen2VL` Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> * remove colqwen hack Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> --------- Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* unify calls * give me make! * added tomli to quality extra * keep the exact same behavior * remove circleCI from agent file * unified UI * better UX * fix typing target * dev-ci * dev-ci * hardcoded :dev * removed dev blocks and python version marker * make fix-repo
…ce#44970) * fix 16 bytes alignment issue * add issue for reference * test fix for non-aligned inputs as well * avoid dims non divisible by 8 for grouped_mm testing * test * style * final fix that works for cpu builds as well * move comment
* squash commit * several forks mixed up, revert * oops * glms * commit lost when rebasing, revert * typing hints * more failures * fix repo * comments and revert unrelated * fix style * fix repo
…the source size (huggingface#44899) * fix: Correct interpolation target size * test: Add fast test coverage
* BC * update * revert * update * style
…r Python 3.13 compat (huggingface#44986) fix: remove `# Copied from` comments between @torch.jit.script and def for Python 3.13 compat On Python 3.13, placing a comment between @torch.jit.script and the function definition causes an IndentationError when torch.jit.script calls inspect.getsource() followed by ast.parse(). The stricter parser in Python 3.13 fails to associate the function body with the def when a comment intervenes. Remove the `# Copied from` comments from the three affected functions (c2p_dynamic_expand, p2c_dynamic_expand, pos_dynamic_expand) in both modeling_deberta_v2.py and modeling_sew_d.py, as suggested by the maintainer in issue huggingface#44855. Fixes huggingface#44855
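A toy illustration of the constraint described in this commit (the function here is made up, not the actual DeBERTa helper):

```python
import torch

# Hypothetical function: with Python 3.13, torch.jit.script fails if a
# comment sits between the decorator and the def, so the decorator and
# the def must stay adjacent (no "# Copied from" line in between).
@torch.jit.script
def scale(x: torch.Tensor, factor: float) -> torch.Tensor:
    return x * factor
```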
* First draft * [Videomt] Extend query-stage parity checks to 3-frame inputs * [Videomt] Add full-model parity check against EoMT reference * [Videomt] Compare conversion against official GitHub reference * [Videomt] Simplify conversion to checkpoint-based HF mapping * [Videomt] Add --verify mode against upstream GitHub implementation * [Videomt] Improve --verify diagnostics with key remapping and layer checks * [Videomt] Improve verify backbone candidate fallback and remapping * [Videomt] Add DINOv3 verify compatibility patch and progress logging * [Videomt] Extend verify diagnostics with MLP/head parity checks * [Videomt] Make --verify succeed for converted weight mapping scope * [videomt] Improve verify adapters and candidate traceback diagnostics * [videomt] Adapt verify _pos_embed output for DINOv3 candidates * [videomt] Enable DINOv3 verify candidate by adapting EVA head_dim * [videomt] Add pre-query layer diagnostics to verify flow * [videomt] Add deterministic verify probes and deeper pre-query diffs * [videomt] Penalize skipped keys in verify candidate scoring * [videomt] Add no-rope A/B diagnostics to verify pre-query layers * [videomt] Add branch-level pre-query diagnostics to verify * [videomt] Add fine-grained MLP diagnostics to verify * [videomt] Verify layer-scale mapping parity in --verify * [videomt] Validate MLP diagnostic decomposition in verify * [videomt] Add token-group diagnostics for layer-4 MLP divergence * [VidEoMT] Add temporal query updater path and re-verify yt_2019_vit_small * [VidEoMT] Refine 5D execution order and re-check small checkpoint parity * Simplify conversion script and convert all dinov2 checkpoints * Add id2label mappings * Fix all tests * Add to auto mapping * Simplify verify_conversion_against_github_reference * Update absolute tolerance * Update date * Revert AGENTS.md * Address comments * Add circleci skill, fix circleci * Fix CI * Remove skills from git * Address comments * Address more comments * Address comment * Add docstrigns * Restore AGENTS.md * Address comment * fix this one * Address comments * [fix] mistral 4 docs (huggingface#44776) fix * Address comment * add expectations * Update date * Make fix-repo * fix multi gpu * fix with changes on main * fix date --------- Co-authored-by: vasqu <antonprogamer@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
`torch.is_autocast_enabled("meta")` raises a RuntimeError because
torch does not support autocast for the meta device. This breaks any
code that runs a forward pass on meta tensors (e.g. nnsight's `.scan()`
for tracing without materializing weights).
Since autocast is meaningless on meta tensors, return `nullcontext()`
early when `device_type == "meta"`.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
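A minimal sketch of the early return described above (the surrounding function is illustrative, not the exact Transformers code):

```python
from contextlib import nullcontext

import torch


def autocast_or_noop(device_type: str, dtype=None):
    # Autocast is not supported for the meta device, so return a no-op
    # context manager without even querying autocast state.
    if device_type == "meta":
        return nullcontext()
    if torch.is_autocast_enabled(device_type):
        return torch.autocast(device_type=device_type, dtype=dtype)
    return nullcontext()
```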
…ngface#44710) * Fix AutoProcessor.from_pretrained silently dropping hub kwargs The previous code used inspect.signature(cached_file).parameters to filter kwargs before passing them to cached_file(). However, since cached_file() is defined with **kwargs in its signature, only 'path_or_repo_id', 'filename', and 'kwargs' were visible as parameter names. This meant user-supplied hub kwargs like force_download, cache_dir, token, revision, etc. were silently dropped and never forwarded. Replace the inspect.signature approach with an explicit tuple of known hub parameter names that cached_file actually accepts (via cached_files). This matches how other auto classes like AutoTokenizer handle the same situation. Fixes huggingface#44704 Signed-off-by: Yufeng He <40085740+he-yufeng@users.noreply.github.com> * narrow it a bit --------- Signed-off-by: Yufeng He <40085740+he-yufeng@users.noreply.github.com> Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
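A hedged sketch of the allow-list approach this commit describes (the exact set of kwarg names forwarded in the PR may differ):

```python
from transformers.utils import cached_file

# Explicit hub kwargs to forward, instead of inspecting cached_file's
# signature (which only exposes **kwargs and therefore filtered them out).
HUB_KWARG_NAMES = (
    "cache_dir", "force_download", "proxies", "token",
    "revision", "local_files_only", "subfolder",
)


def resolve_processor_file(repo_id: str, filename: str, **kwargs):
    hub_kwargs = {k: v for k, v in kwargs.items() if k in HUB_KWARG_NAMES}
    return cached_file(repo_id, filename, **hub_kwargs)
```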
…ngface#45002) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* split out from timm PR * all other VLMs * timm backbone is not here * oops, extra key is breaking eveerything * . * this test * maybe * fix missing keys when loading from hub * now fix fast tests * merge gone wrong * fix repo * refine the regex again! * close the bracket * Apply suggestions from code review Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com> * revert unrelated * ! * revert more * add submodule prefix when recursing * i'll need to fix maskformer later * dont duplicate the same pattern twice * fix modular * detr * colpali isn't working still! * oke, so this can be fine for now * ! * revert * dot lost in regex and comments * timm wrapper is weird * skip these, timm wrapper * bye bye timm * make repo check happy * Revert "bye bye timm" This reverts commit ca68663. * love timm! * Apply repo consistency fixes * oke, the bot can't fix it so here we go --------- Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* thanks claude * typo * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * cross link and add docs about patching * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_output_tracing.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_output_tracing.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* add anti-slop action * remove maintainers names * pin the action to 0.2.1 * tweaks @Rocketknigth1 reviews
* Rebase: Add base_model_tp_plan to OlmoeConfig (dataclass style) Rebased onto main after configs were migrated to dataclasses. Adds base_model_tp_plan as a class attribute and TensorParallelTesterMixin to the OLMoE test suite. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Apply repo consistency fixes * review: update src/transformers/models/olmoe/configuration_olmoe.py Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co> * review: use correct pattern for OlmoeModelTester class --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
…gingface#44146) * rebase * merge conflict * merge conflict1 * merge conflict trainer * blank space qulity run * lint error * modify test to address our change * rebase * rebase * rebase * rebase * test updated with delay check * checkpoint tests updated * test updated in utils * correct test condition * style format
…uggingface#45031) Use the correct _tied_weights_keys for CamembertForCausalLM
…#45032) * multi runners * multi runners --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Stacked commits cb-persistent * Rebase fixes * style * ty compliance * Fix * nit
…ingface#45035) The conversion operations table was missing PermuteForRope. Added it with its reverse (itself), consistent with how other operations are documented. PermuteForRope is self-inverse: applying it twice returns the original tensor layout.
* cohere-asr model * repo udpates * tmp weight mapping * add fast tests * fix compile * add integration tests * update integration tests * fixes * clearer API * test update * fix * cosmetics * fix on parakeet encoder * modular update * Update src/transformers/models/cohere_asr/configuration_cohere_asr.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * make check-repo * doc _reassemble_chunk_texts * nit * fix * updates * test update * make style * doc updates * ensure bc with the hub checkpoints * quick fixes * remove rope - not used * skip this one * fix * last fixes - needed revision + wrong main input name (less modular but we have to) * style * output_mask should be int! --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: vasqu <antonprogamer@gmail.com>
* first part of the fix
* fix torch imports
* revert
* fix: make from transformers import * work without torch
- `is_torchvision_available`, `is_timm_available`, `is_torchaudio_available`,
`is_torchao_available`, `is_accelerate_available` now return False when
torch is not installed, since all these packages require torch
- Add `@requires(backends=("torch",))` to `PI0Processor` (was missing,
causing the lazy module to crash on import without torch)
- Fix wrong availability guards: `is_vision_available` → `is_torchvision_available`
in pixtral processor, `is_torch_available` in smolvlm processor
- Wrap bare `import torch` / torchvision imports in `processing_sam3_video.py`
- Quote `torch.Tensor` in return type annotation of `tokenization_mistral_common.py`
- Wrap 66 `image_processing_pil_*.py` imports from torch-dependent counterparts
in try/except with ImagesKwargs fallbacks; quote `torch.Tensor` annotations
- Restore explicit `from transformers import *` check in CircleCI
`check_repository_consistency` job
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
* up
* style of this
* revert: remove src/models changes, keep only core import fixes
The PIL image processor changes are too fragile (break on make fix-repo).
Keep only the core fixes:
- is_torchvision/timm/torchaudio/torchao/accelerate_available() check torch
- CircleCI explicit import check
- tokenization_mistral_common.py torch.Tensor annotation
- processing_sam3_video.py conditional torch imports
- processing_pixtral.py/processing_smolvlm.py availability guard fixes
- PI0Processor @requires decorator
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
* nit
* the mega quidproquo
* use requires(backend
* more pil fixes
* fixes
* temp update
* up?
* is this it?
* style?
* revert a bunch of ai shit
* pi0 requires this
* revert some stuffs
* upd
* the fix
* yups
* ah
* up
* up
* fix
* yes?
* update
* up
* nits
* up
* up
* order
---------
Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
- Guard import torch in processing_cohere_asr.py with is_torch_available() - Add @requires(backends=("torch",)) to CohereAsrProcessor - Fix is_vision_available() to use actual import test instead of find_spec - Fix is_torchvision_available() and is_timm_available() to require vision - Fix is_pytesseract_available() to require vision - Fix is_mistral_common_available() to require vision - Add setdefault for image_processing_backends in __init__.py - Guard import torchvision in sam3/modeling_sam3.py and processing_sam3_video.py - Add @requires(backends=("torch", "torchvision")) to Sam3PreTrainedModel - Add @requires(backends=("torch", "torchvision")) to Sam3VideoProcessor - Add CI checks for torch-only and PIL-only import scenarios Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
* speed up docstring checker * add doctsrings and improve test readability * fmt * refactor cache so it's owned by a single function and the flow is clearer * adopt test_fetcher style for cache
…uggingface#44985) * fix: preserve rotary_pct across save/load cycle in GPTNeoX configs Use setdefault instead of unconditional assignment for partial_rotary_factor in GPTNeoXConfig and GPTNeoXJapaneseConfig, so the value saved in rope_parameters is not overwritten with the default on reload. * refactor: simplify partial_rotary_factor to use setdefault per review Replace the 4-line if/else block with a single setdefault call, matching the pattern already used for rope_theta on the line above. As suggested by @zucchini-nlp in PR review.
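A minimal illustration of the `setdefault` behavior the fix relies on (the values are made up):

```python
# A config reloaded from disk already carries the saved factor;
# setdefault keeps it instead of clobbering it with the default.
rope_parameters = {"rope_theta": 10000.0, "partial_rotary_factor": 0.25}
rope_parameters.setdefault("partial_rotary_factor", 1.0)
assert rope_parameters["partial_rotary_factor"] == 0.25
```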
…gingface#45019) * Fix GraniteConfig type hints to accept int for multiplier fields Fixes huggingface#44877 * Also update granitemoe and granitemoeshared multiplier type hints
* squash * fix copies * skip, we dont need to load base model for it * oops, one more regex since now we have no prefix
chore: Fix mlinter cache location
* fix Image.open failure in case "tests/models/prompt_depth_anything/test_modeling_prompt_depth_anything.py::PromptDepthAnythingModelIntegrationTest::test_inference" Signed-off-by: Wang, Yi <yi.a.wang@intel.com> * updated Signed-off-by: Wang, Yi <yi.a.wang@intel.com> --------- Signed-off-by: Wang, Yi <yi.a.wang@intel.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Update update_metdata.yml: `tf` and `flax` are long gone
huggingface#44644) * fix tests/quantization/fp_quant_integration/test_fp_quant.py::FPQuantMXFP4PseudoquantTest::test_quantized_model fail in xpu Signed-off-by: Wang, Yi <yi.a.wang@intel.com> * updated Signed-off-by: Wang, Yi <yi.a.wang@intel.com> --------- Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
Fix failing SmolLM3IntegrationTest
* check float before using normal op Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix llama4 weight Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * add bnb quant skip module for llama4 Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * revert bnb integration Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * revert initialization.py Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * total revert init Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix _keep_in_fp32_modules Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * add _modules_to_not_quantize Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix modules_to_not_convert Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * update bnb quantize condition Signed-off-by: jiqing-feng <jiqing.feng@intel.com> --------- Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
…trained` (huggingface#45058) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
…ain CI (huggingface#45004) * fix: Guard sdpa flash test and fix phi3/pi0 tests * fix: Narrow scope by adding it to the skip list * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
…gingface#45061) * fix * style * move to init as well
fix(security): remediate workflow vulnerability in .github/workflows/update_metdata.yml Co-authored-by: hf-security-analysis[bot] <265538906+hf-security-analysis[bot]@users.noreply.github.com>
* fix * convert only if non-empty key
…uggingface#44983) fix: add identity reverse_op to dequantize operations for save_pretrained Dequantize operations (Mxfp4Dequantize, Fp8Dequantize, MetalDequantize) raise NotImplementedError on reverse_op, causing save_pretrained to fail for models loaded with dequantize=True. Add _IdentityOp as the reverse_op so dequantized weights are saved as-is.
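A conceptual sketch of the identity reverse-op idea (class and method names here are illustrative, not the actual conversion-op API):

```python
class _IdentityOpSketch:
    # Reverse of a dequantize: nothing to undo, the tensor is saved as-is.
    def convert(self, tensor):
        return tensor


class Fp8DequantizeSketch:
    @property
    def reverse_op(self):
        # Previously this path raised NotImplementedError, breaking
        # save_pretrained for models loaded with dequantize=True.
        return _IdentityOpSketch()
```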
Remove extra Tensorflow env flag
* Cache model modules in check_repo * add test coverage
…age processors (huggingface#45045) * [Bugfix] Remove incorrect torchvision requirement from PIL backend image processors PR huggingface#45029 added @requires(backends=("vision", "torch", "torchvision")) to 67 PIL backend image_processing_pil_*.py files. This causes PIL backend classes to become dummy objects when torchvision is not installed, making AutoImageProcessor unable to find any working processor. Fix: set @requires to ("vision",) for files that only need PIL, and ("vision", "torch") for files that also use torch directly. Also fix 5 modular source files so make fix-repo preserves the correct backends. Fixes huggingface#45042 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * [Bugfix] Remove redundant @requires(backends=("vision",)) from PIL backends Per reviewer feedback: the vision-only @requires decorator is redundant for PIL backend classes since PilBackend base class already handles this. - Remove @requires(backends=("vision",)) from 43 PIL backend files - Remove unused `requires` import from 38 files (Category A) - Keep @requires(backends=("vision", "torch")) on method-level decorators (Category B: 5 files) * update * remove torch when its not necessary * remove if typechecking * fix import shinanigans * marvellous that's how we protect torch :) * beit is torchvisionbackend * more import cleanup * fiixup * fix-repo * update * style * fixes * up * more * fix repo * up * update * fix imports * style * fix check copies * arf * converter up * fix? * fix copies * fix for func * style * ignore * type --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: Arthur <arthur.zucker@gmail.com>
Hello @NielsRogge, I can't add you as a reviewer. Please let me know if there is something else to do/fix.
Hi, any reason this PR was opened? I've already opened a PR for SAM3-LiteText here: huggingface#44320, and it looks like the git diff got messed up.
I opened this PR because running `python utils/modular_model_converter.py sam3_lite_text` didn't work and the quality tests failed. I don't know why the git diff got messed up. I forked the transformers repo, created a branch, merged yours and changed a few lines of code... Any idea on how to fix the git diff?
Force-pushed from e5a5063 to 8f35675
I see this is no longer needed. Thank you @NielsRogge and @yonigozlan for adding Efficient SAM3.
What does this PR fix?
This PR resolves two bugs in the SAM3-LiteText integration introduced in huggingface#44320.
KeyError: '__init__' in modular model converter
`Sam3LiteTextVisionConfig` overrode `__init__` directly, which caused utils/modular_model_converter.py to crash with a `KeyError: '__init__'` when it tried to look up the method in `original_modeling_methods`. Since configurations now use `@strict` dataclasses from huggingface_hub, initialization logic must go in `__post_init__` instead. The fix replaces the `__init__` override with a proper `__post_init__` method that initializes `backbone_config` and then delegates to `super().__post_init__()`.
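A minimal sketch of the pattern, using plain dataclasses rather than the actual huggingface_hub `@strict` decorator; field and class names are illustrative:

```python
from dataclasses import dataclass, field


@dataclass
class ParentConfigSketch:
    def __post_init__(self):
        # parent-side validation/derived defaults live here
        pass


@dataclass
class VisionConfigSketch(ParentConfigSketch):
    backbone_config: dict = field(default_factory=dict)

    def __post_init__(self):
        # Logic that previously lived in an __init__ override moves here...
        if not self.backbone_config:
            self.backbone_config = {"model_type": "vit"}
        # ...and then delegates to the parent's __post_init__, as the fix does.
        super().__post_init__()
```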
Hard torchvision import in modeling file
torchvision was unconditionally imported at the top of modeling_sam3_lite_text.py, causing an `ImportError` for users without it installed. The fix guards the import behind `is_torchvision_available()` and adds a `@requires(backends=("torch", "torchvision"))` decorator to `Sam3LiteTextPreTrainedModel` to surface a clear error message.
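A sketch of the guard pattern, assuming the `requires` decorator lives in `transformers.utils.import_utils` and eliding the real class body:

```python
from transformers.utils import is_torchvision_available
from transformers.utils.import_utils import requires

# Import torchvision only when it is present...
if is_torchvision_available():
    import torchvision  # noqa: F401


# ...and let the backend check raise a clear error otherwise.
@requires(backends=("torch", "torchvision"))
class Sam3LiteTextPreTrainedModel:  # sketch only; the real class is a PreTrainedModel subclass
    pass
```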
Configuration class cleanup
`Sam3LiteTextViTConfig` was rewritten to use the `@auto_docstring` and `@strict` decorators, removing the large hand-written docstring block in favor of the auto-generated one. Redundant `__init__` arguments that were just pass-through wrappers around the parent SAM3 config were removed.
_checkpoint_conversion_mapping → base_model_prefix
The `_checkpoint_conversion_mapping` regex pattern in `Sam3LiteTextModel` (used to strip/add the `detector_model.` prefix) was replaced with the simpler `base_model_prefix = "detector_model"`, which is the idiomatic approach in Transformers.
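Roughly, the change looks like this (the regex shown for the old mapping is illustrative):

```python
class Sam3LiteTextModel:  # sketch only; the real class is a PreTrainedModel subclass
    # Before: a regex-based mapping stripped/added the prefix at load/save time
    # _checkpoint_conversion_mapping = {r"^detector_model\.": ""}

    # After: the idiomatic Transformers attribute handles the prefix
    base_model_prefix = "detector_model"
```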
python -c "from transformers import Sam3LiteTextConfig; c = Sam3LiteTextConfig(); print(c)"python utils/modular_model_converter.py --files src/transformers/models/sam3_lite_text/modular_sam3_lite_text.pymake stylemake check-repoNotes
Notes
AI assistance was used to draft parts of the description; all changes have been reviewed line by line.
This is a companion fix to huggingface#44320, not a duplicate — it addresses bugs found during review/integration.
Related CLI fix for add_new_model_like.py is tracked separately in huggingface#44334.