Refactor GGUF weight conversion #44794
Draft
ArthurZucker wants to merge 281 commits into main from
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
[For maintainers] Suggested jobs to run (before merge): run-slow: ggml
* Fix unexpected `position_ids` keys when loading OwlViT models

  Older OwlViT checkpoints saved `position_ids` as buffers in the text and vision embedding modules. These tensors are simple integer ranges and are now recomputed dynamically during initialization. This results in `UNEXPECTED` key warnings when loading models such as `google/owlvit-base-patch32`. Add the corresponding patterns to `_keys_to_ignore_on_load_unexpected` to suppress these warnings.
* Fix OwlViT copy consistency for owlv2
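As a rough illustration of how such ignore patterns suppress warnings at load time, here is a minimal sketch. The pattern strings mirror the commit text, but the `filter_unexpected_keys` helper is an assumption for illustration, not the actual transformers internals.

```python
import re

# Regex patterns mirroring the commit text; unexpected keys that match
# any of them are silently dropped instead of being reported.
IGNORE_PATTERNS = [
    r"text_model\.embeddings\.position_ids",
    r"vision_model\.embeddings\.position_ids",
]

def filter_unexpected_keys(unexpected_keys):
    # Keep only keys that no ignore pattern matches.
    return [
        key
        for key in unexpected_keys
        if not any(re.search(pattern, key) for pattern in IGNORE_PATTERNS)
    ]
```

With this in place, a buffer key like `owlvit.text_model.embeddings.position_ids` is filtered out before the warning is emitted, while genuinely unexpected keys still surface.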
* Add GreedyLR adaptive learning rate scheduler

  Add GreedyLR, a metric-based scheduler that increases LR on improvement and decreases on plateau, based on arxiv.org/abs/2512.14527.

  - Add GreedyLR class and get_greedy_schedule() to optimization.py
  - Add StreamingAverage helper for metric smoothing
  - Integrate with Trainer via ReduceLROnPlateau-style metric stepping
  - Add GREEDY to SchedulerType enum and TrainingArguments validation
  - Add comprehensive tests in tests/optimization/test_greedy_lr.py
  - Add example script and documentation
* Address review comments: rename examples/greedy-lr to examples/scheduler, delete .gitignore, add trainer integration tests
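The core GreedyLR idea (raise the LR while the smoothed metric improves, back off on plateau) can be sketched in a few lines. The multiplicative factors and smoothing window below are illustrative assumptions, not the PR's actual defaults, and the class is a stand-in, not the real `GreedyLR` implementation.

```python
# Minimal sketch of a greedy, metric-driven LR rule (assumed factors).
class GreedySchedulerSketch:
    def __init__(self, lr=1e-3, up=1.1, down=0.9, window=3):
        self.lr, self.up, self.down = lr, up, down
        self.window = window
        self.history = []          # recent metric values for smoothing
        self.best = float("inf")   # best smoothed metric seen so far

    def step(self, metric):
        # Streaming average over the last `window` metric values.
        self.history = (self.history + [metric])[-self.window:]
        smoothed = sum(self.history) / len(self.history)
        if smoothed < self.best:
            self.best = smoothed
            self.lr *= self.up     # improvement: greedily raise LR
        else:
            self.lr *= self.down   # plateau/regression: back off
        return self.lr
```

Like `ReduceLROnPlateau`, the caller feeds an eval metric into `step()` after each evaluation rather than stepping per batch.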
* feat(ci): added a network debug report
* xdist-aware for parallel runs
* fix fmt
* moved the hooks to tests/utils/test_network_logging.py
* forgot to add the new file
* use plugin approach
* rename env variables
* narrow public API
* fix the env name in circleci
* Added Model Documentation.
* Added conversion_mapping weight renamings
* Added Auto Mappings.
* init
* Modular jina_embeddings_v3
* modular -> modeling + config
* __init__.py
* Created folder for tests
* Added documentation for the jina-embeddings-v3 Model
* Tests
* Update Tests
* Update Tests
* Update modular
* Fix failing test
* scope
* Update modular, Add docstring for adapter_mask
* Testing
* Fix failing test
* Added IntegrationTests
* Updated model doc date
* post_init()
* make style.
* adapter_mask gone
* Better Modular
* Add conversion_mapping
* Modular -> Modeling + Config
* Update model doc
* Update tests
* Small fix
* make fix-repo
* fix _tied_weights_keys
* self.is_causal=False
* Add tie_word_embeddings in configuration class
* small fix in configuration doc-string
* config update
* fix check_docstrings.py
* ruff: Reformat
* Remove extra args from config
* update tests + model doc
* Better, modern modular
* make fix-repo
* Update conversion mapping
* fix dropout
* Better modular
* Update conversion mapping
* Update tests
* Update docs
* Better modular
* Fix license
* Fix date
* Better modular, Configuration
* make fix-repo
* Fix config
* Use autodocstring
* lets use auto
* hmm is it this
* make hf version
* my bad...
* retry whats up with ci
* ci pls

---------

Co-authored-by: vasqu <antonprogamer@gmail.com>
Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
#44808)

* init
* fix
* add image processor test
* add mobile_rec
* fix
* fix
* fix code style
* add mobile_rec
* fix
* fix toctree
* update
* cleanup inits and docs etc
* dang
* make separate auto model for text recognition

---------

Co-authored-by: vasqu <antonprogamer@gmail.com>
Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
…enizer class on the hub (#44801)

* deepseek and modernbert
* deepseek v2
Fix formatting of code block in weightconverter.md
* Stacked commits
* New config
* good logged and silent deletion
* nits
* better config
* review
* Update example script
* Test for new flag
* Better benchmarking of CB
* Rebase fixes
* Avoid deleting a non-existing arg
* No referring to a non-existing var
* Remove useless wrapper
* Fix a bug where padding was not handled
* Update tests
* Trying to solve tests
* Fix tests
* style
* Claude review
* Explanations in paged cache
* grammar
* style2
* Review compliance
* New doc
* Fix docs
* Fix repo consistency
Fix annotations reader for python 3.14
#44782) fix: pass device to torch.arange in XLNet relative_positional_encoding
When loading tokenizers like vesteinn/ScandiBERT whose tokenizer_config specifies XLMRobertaTokenizer (model=Unigram) but whose tokenizer.json contains a dict-type vocab, the expression vocab[0] raises KeyError because dict keys are strings, not integers. Add an isinstance(vocab, list) guard so the list-to-tuple conversion is only attempted on list vocabs.
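The guard described above can be sketched as follows; `normalize_unigram_vocab` is a hypothetical helper name for illustration, not the actual function in the tokenizer loading path.

```python
# Sketch of the isinstance guard: Unigram vocabs serialized as lists of
# [piece, score] pairs get converted to tuples, while dict-shaped vocabs
# (token -> id) are left untouched, since indexing a dict with integers
# like vocab[0] would raise KeyError (dict keys are strings).
def normalize_unigram_vocab(vocab):
    if isinstance(vocab, list):
        return [tuple(entry) for entry in vocab]
    return vocab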
* v5-style AFMoE impl
* don't unnecessarily return router logits
* inherit MoE code and refactor for stylistic consistency
* remove pointless type alias
* remove legacy cache reference
* type and lint

---------

Co-authored-by: Wing Lian <wing@axolotl.ai>
* remove from generation
* update tests
* more tests
* fix
* doc
* last changes
* aqlm slipped through
* add bc for remote code models
* anton's review
* add warning
* init refactor
* Fix llava
* changes after review
* update first batch of image processors
* refactor part 2
* improve base image processor class, move backends to separate file
* refactor to have backends in separate files, with backends now inheriting from BaseImageProcessor
* fix docstrings
* update some image processors to new refactored standards
* refactor more image processors
* refactor more image processors
* refactor more fast image processors
* refactor more image processors
* refactor more image processors
* improve compatibility with video processors
* refactor more image processors
* add more image processors, improve compatibility with video processors
* support for modular
* refactor modular image processors
* refactor more modular image processors
* adjustments before merge
* finish image processors refactor
* update docs
* add fallback to Pil backend for backward compat
* fix repo
* Fix all processors and image processors tests
* fix modular and style
* fix docs
* fix remote code backward compatibility + super in lists
* Update docs and add new model like cli
* fix processor tests
* relax test tvp (used to be skipped)
* fix 4 channels oneformer
* Changes after review
* Fixes after review
* Fix tests
* Change imports in modeling tests to minimize integration tests changes
* fix wrong import
* fix import and missing doc
* fix typo PI0
* Fix all integration tests
* Fix after review, enforce protected torch/torchvision imports in pil image processors (directly in modular model converter)
* Fix style
* Fix test modeling depth pro
* Fix processing_idefics
* Fixes after merge
* _rescale_and_normalize -> rescale_and_normalize
* fix-repo
* enable tp for benchmark

  Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
* refine code

  Signed-off-by: Wang, Yi <yi.a.wang@intel.com>

---------

Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
Co-authored-by: Rémi Ouazan <83456801+remi-or@users.noreply.github.com>
* update eos and q-lora-ranl
* oops, wrong name for class
* Propagate the model loading from transformers serve to chat
* Docs and tests
* Apply suggestions from code review

  Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* logging update
* Adjust docs re Marc's comment
* Remove model name if too long for current console size
* Refactor dual model loading w/ locks

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
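The lock-guarded dual model loading mentioned above can be sketched like this; the cache dict and `get_model` helper are illustrative assumptions, not the actual serve/chat implementation.

```python
import threading

# Sketch: concurrent chat requests for the same model name should trigger
# exactly one load, so the load path is guarded by a lock.
_models = {}
_lock = threading.Lock()

def get_model(name, loader):
    with _lock:
        if name not in _models:
            _models[name] = loader(name)  # expensive checkpoint load
    return _models[name]
```

Without the lock, two requests racing on a cold cache could each load the checkpoint, doubling memory use.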
* init
* fix doc
* update (long run of repeated "update" commits)
* refactor image_processor_fast
* update (further run of repeated "update" commits)
* small fixes
* more explicit skip msg
* some quick fixes
* fix
* quick cleanups
* update (more repeated "update" commits)
* fixup after new refactor
* fix
* update
* update
* last fixups
* update
* remove my todos I left there

---------

Co-authored-by: vasqu <antonprogamer@gmail.com>
Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
* added cache
* added make typing
* use explicit call
nemotron-config
* align to other mambas
* oops
* fix
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
…44873)

* Fix Qwen3.5 rope_deltas persistence causing crash in online RL training
* Extend
* Extend
Init with zeros instead of empty in _move_missing_keys_from_meta_to_device
* Fix MoE routers returning probabilities instead of logits
* Propagate modular fix to modeling files via make fix-repo

---------

Co-authored-by: Arthur <arthur.zucker@gmail.com>
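Why returning probabilities instead of logits is a real bug: downstream code typically applies softmax to the router output, so softmaxing an already-softmaxed distribution flattens the routing weights. A pure-Python illustration (the router itself is not shown; the numbers are arbitrary example logits):

```python
import math

def softmax(xs):
    # Numerically stable softmax over a plain list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 0.0, -2.0]
once = softmax(logits)            # correct routing weights
twice = softmax(softmax(logits))  # bug: distribution is much flatter
```

`once` puts most of the mass on the first expert, while `twice` is noticeably more uniform, which would misweight expert outputs.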
* Fix unintended Hub metadata calls from _patch_mistral_regex
* ruff fixes
* pass local files only
* Cache and fail-closed model_info call, add regression tests

  - Wrap is_base_mistral with lru_cache so repeated loads of the same repo id (notebooks, rollout loops, DDP workers) don't each hit the Hub.
  - Swallow any Hub error in model_info: a 5xx/ratelimit/network hiccup must not block tokenizer init for non-Mistral models.
  - Add regression tests: (a) local_files_only=True never calls model_info, (b) a Hub failure does not break _patch_mistral_regex.

---------

Co-authored-by: vasqu <antonprogamer@gmail.com>
Co-authored-by: Arthur <arthur.zucker@gmail.com>
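The cache-plus-fail-closed pattern described above looks roughly like this. `fetch_model_info` is a stand-in for the real Hub call (`huggingface_hub.model_info`), and the `is_base_mistral` body is a sketch, not the actual transformers code.

```python
from functools import lru_cache

def fetch_model_info(repo_id):
    # Stand-in for the Hub metadata call; here it always fails, to
    # simulate a network error, rate limit, or 5xx response.
    raise ConnectionError("hub unreachable")

@lru_cache(maxsize=None)
def is_base_mistral(repo_id: str) -> bool:
    try:
        info = fetch_model_info(repo_id)
    except Exception:
        # Fail closed: a Hub hiccup must not break tokenizer init
        # for non-Mistral models.
        return False
    return "mistral" in (info.get("model_type") or "")
```

The `lru_cache` means repeated loads of the same repo id (notebooks, rollout loops, DDP workers) hit the Hub at most once per process.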
* suppress warning if int
* remove override, not needed anymore
* also youtu

---------

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
…rmatting) (#45370)

docs: fix docstring errors in Gemma3nTextConfig

Fix five documentation errors in Gemma3nTextConfig docstring:

- Typo: "emebeddings" → "embeddings"
- Incomplete sentence for altup_active_idx (truncated at "or correct")
- Grammar: "should be make" → "should make" in altup_num_inputs
- Grammar: "number of layer" → "number of layers" in num_kv_shared_layers
- Formatting: add missing backticks around type annotations for laurel_rank and activation_sparsity_pattern to match HF docstring conventions

Both modular_gemma3n.py (source of truth) and the generated configuration_gemma3n.py are updated in sync.

Built by Rudrendu Paul, developed with Claude Code

Co-authored-by: Rudrendu <RudrenduPaul@users.noreply.github.com>
…44949)

* Fix NotebookProgressCallback to allow evaluate() before and after train
* Add unit test for NotebookProgressCallback evaluating before and after training
* Skip NotebookProgressCallback tests when IPython is not installed
* Display eval metrics when training tracker is None on NotebookProgressCallback
* Add is_ipython_available and require_ipython test decorator
* Filter model_preparation_time metric and add code comments in on_eval
…ock.forward (#45352)

* fix(qwen3_moe): correct return type annotation on Qwen3MoeSparseMoeBlock.forward
* fix: propagate Qwen3MoeSparseMoeBlock forward return type fix to generated vl_moe and omni_moe files

  Built by Rudrendu Paul, developed with Claude Code

---------

Co-authored-by: Rudrendu <RudrenduPaul@users.noreply.github.com>
…training (#45329)

* changes (long run of repeated "changes" commits)
* Apply repo consistency fixes
* changes (further repeated "changes" commits)
* chore: empty commit

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* Fix #45305 + add regression test GAS
* Refine test model_accepts_loss_kwargs
* fix style
* Fix properly setup model_accepts_loss_kwargs+True
* Update tests/trainer/test_trainer.py: remove unnecessary parameters

  Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* fix: simplify error messages, back to a simpler test
* feat: add new test with actual training

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
… generation (#45368)

ProcessorMixin subclasses (e.g. Qwen3VLProcessor) expose the fast tokenizer at .tokenizer, not ._tokenizer. Use getattr() to handle both ProcessorMixin and PreTrainedTokenizerFast when extracting the rust tokenizer backend for DirectStreamer and CBStreamer.

Fixes #45362

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
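The getattr() lookup described above can be sketched as follows; the fake classes stand in for ProcessorMixin (which exposes `.tokenizer`) and PreTrainedTokenizerFast (which keeps the rust backend on `._tokenizer`), and `get_rust_tokenizer` is a hypothetical helper name.

```python
def get_rust_tokenizer(tokenizer_or_processor):
    # Processors wrap a tokenizer at .tokenizer; plain fast tokenizers
    # are used directly. Then pull the rust backend off ._tokenizer.
    tokenizer = getattr(tokenizer_or_processor, "tokenizer", tokenizer_or_processor)
    return getattr(tokenizer, "_tokenizer", None)

class FakeFastTokenizer:
    _tokenizer = "rust-backend"   # stand-in for the tokenizers.Tokenizer

class FakeProcessor:
    tokenizer = FakeFastTokenizer()
```

Either a processor or a bare fast tokenizer yields the same backend, which is what the streamers need.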
…Error (#45359)

Fixes #45356

Remove `kimi_k25` from `MODELS_WITH_INCORRECT_HUB_TOKENIZER_CLASS`: its remote `TikTokenTokenizer` is the only correct backend (no `tokenizer.json`, non-sequential added-token IDs that `TokenizersBackend` cannot reproduce).

Also fix `_patch_mistral_regex`: the method receives the raw `tokenizers.Tokenizer` object, which has `.pre_tokenizer` directly, not `.backend_tokenizer.pre_tokenizer`.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Rebased onto current main to resolve being 274 commits behind
- Added pytest-benchmark to _deps list and dependency_versions_table
- Fixed ruff linting issues (quoted type annotations, formatting)
- Resolved rebase conflicts in core_model_loading.py and modeling_utils.py

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
What does this PR do?
Adds support for a more generic path, aligned with the rest of the loading logic!