
Cumulative defect fixes from recent Transformers PRs #41

Open
evalstate wants to merge 607 commits into main from all-defects

Conversation

@evalstate
Owner

@evalstate evalstate commented Apr 28, 2026

Cumulative defect fixes from recent Transformers PRs

This PR is generated by the all-defects mergeability flow. It accumulates defect-fix PRs from huggingface/transformers that could be applied cleanly to the current base.

  • Source branch: all-defects
  • Base: evalstate/transformers:main
  • Head: ae9e74bc2a
  • PRs classified: 500
  • PRs with terminal state: 500
  • Applied/merged/already-present defect fixes: 204
  • Aborted defect fixes: 38
  • Validation failures reverted: 31
  • Non-defect skipped: 227

Status counts

  • aborted: 38
  • already_present: 53
  • applied: 5
  • merged: 146
  • skipped: 227
  • validation_failed: 31

Category counts

  • defect: 273
  • documentation: 39
  • feature: 121
  • other: 67

Validation

Each applied defect fix was followed by the configured lightweight validation profile:

  1. compileall -q src/transformers
  2. utils/checkers.py ruff_check,ruff_format,init_isort,sort_auto_mappings
  3. utils/tests_fetcher.py ... && pytest ... when impacted pytest targets are selected

Note: this is intentionally not an end-to-end or slow-test validation pass.
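The three-step profile above can be sketched in Python. This is a simplified stand-in, not the flow's actual implementation: `light_validate` is an illustrative name, and only step 1 (`compileall`) is modeled self-containedly, since `utils/checkers.py` and `utils/tests_fetcher.py` are repo-specific helpers.

```python
# Sketch of the lightweight per-fix validation profile (step 1 only;
# steps 2-3 shell out to repo-specific helpers, shown as comments).
import compileall
import pathlib
import tempfile


def light_validate(src_dir: str) -> bool:
    # Step 1: byte-compile every file under the package to catch syntax errors.
    # quiet=2 suppresses both progress output and error listings.
    if not compileall.compile_dir(src_dir, quiet=2):
        return False
    # Steps 2-3 would invoke the repository's own tooling, e.g.:
    #   python utils/checkers.py ruff_check,ruff_format,init_isort,sort_auto_mappings
    #   python utils/tests_fetcher.py ... && pytest <impacted targets>
    return True


# Tiny demo on a throwaway package with one valid module.
tmp = pathlib.Path(tempfile.mkdtemp())
(tmp / "ok.py").write_text("x = 1\n")
print(light_validate(str(tmp)))  # → True
```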

Details

A detailed status table is posted as a PR comment and is also available locally in:

  • .mergeability/defect-merge-state.jsonl
  • .mergeability/pr-classifications.jsonl
  • all-defects-report.md
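The JSONL state files can be tallied with a few lines of Python. A minimal sketch, assuming one JSON object per line; the `pr`/`status`/`category` field names here are illustrative guesses, not the flow's documented schema:

```python
# Sketch: tallying statuses from a JSONL state file such as
# .mergeability/defect-merge-state.jsonl (field names are assumptions).
import json
from collections import Counter
from io import StringIO

# Stand-in for open(".mergeability/defect-merge-state.jsonl").
sample = StringIO(
    '{"pr": 45682, "status": "merged", "category": "defect"}\n'
    '{"pr": 45680, "status": "already_present", "category": "defect"}\n'
    '{"pr": 45501, "status": "applied", "category": "defect"}\n'
)

# One record per line; count terminal statuses.
counts = Counter(json.loads(line)["status"] for line in sample)
print(dict(counts))  # → {'merged': 1, 'already_present': 1, 'applied': 1}
```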

SunMarc and others added 30 commits April 22, 2026 22:00
`flash_attention_forward` unconditionally called `s_aux.to(query.dtype)`,
which crashed with `AttributeError: 'NoneType' object has no attribute 'to'`
for models that don't use attention sinks (e.g. Gemma). Mirrors the parallel
guard added in huggingface#40434 for `flash_paged.py`.
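The guard this commit describes can be sketched dependency-free. `FakeTensor` and `apply_sink_guard` are illustrative stand-ins for a real `torch.Tensor` and the actual `flash_attention_forward` code path:

```python
# Minimal sketch of the s_aux None guard: only cast the attention-sink
# tensor when it is present (models like Gemma pass s_aux=None).
class FakeTensor:
    """Stand-in for torch.Tensor with just a dtype and .to()."""

    def __init__(self, dtype):
        self.dtype = dtype

    def to(self, dtype):
        return FakeTensor(dtype)


def apply_sink_guard(s_aux, query):
    # Pre-fix code called s_aux.to(query.dtype) unconditionally and crashed
    # with AttributeError: 'NoneType' object has no attribute 'to'.
    if s_aux is not None:
        s_aux = s_aux.to(query.dtype)
    return s_aux


q = FakeTensor("bfloat16")
print(apply_sink_guard(None, q))                         # → None (no crash)
print(apply_sink_guard(FakeTensor("float32"), q).dtype)  # → bfloat16
```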

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
_init_weights() on `NemotronHPreTrainedModel` unconditionally overwrites
`dt_bias` (random `inv_softplus(dt)`) and `out_proj.weight` (kaiming_uniform
scaled by 1/sqrt(n_layer)) every time it is invoked on a mamba block.
It sets `module.dt_bias._no_reinit = True` after the copy, but the flag is
never checked by either code path (only the Linear-bias branch reads it).

On transformers>=5.0, `_init_weights` is triggered a second time after
`from_pretrained()` has loaded the checkpoint (the post-load safety pass
that initializes tensors staying on `meta`). For `NemotronHForCausalLM`
that silently overwrites the checkpoint values for `dt_bias` and
`out_proj.weight` with fresh random draws. The model then outputs
repetitive stop-word streams like ` and and and and ,` for any input.

Minimal repro with any Nemotron-H checkpoint:

    from transformers import AutoConfig, AutoModelForCausalLM
    from safetensors.torch import load_file
    import json, pathlib

    path = ".../NVIDIA-Nemotron-Cascade-2-30B-A3B-BF16"  # or Nano
    cfg = AutoConfig.from_pretrained(path); cfg._attn_implementation='eager'
    m = AutoModelForCausalLM.from_pretrained(path, config=cfg, torch_dtype='bfloat16')
    idx = json.loads((pathlib.Path(path) / 'model.safetensors.index.json').read_text())['weight_map']
    k = 'backbone.layers.0.mixer.dt_bias'
    on_disk = load_file(f'{path}/{idx[k]}')[k]
    in_mem  = m.backbone.layers[0].mixer.dt_bias
    print((on_disk.float() - in_mem.float().cpu()).abs().max())   # ~26.8

This patch makes `_init_weights` honour `_no_reinit` on both `dt_bias` and
`out_proj.weight` (the only two params that re-init unconditionally), and
sets `_no_reinit = True` on `out_proj.weight` after the initial kaiming
scale so a second pass is a no-op. Ordinary fresh-init training is
unaffected; only the second invocation becomes idempotent.
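The idempotent-init pattern described above can be sketched with placeholder objects. `Param`, `Block`, and `init_weights` are illustrative names; the real `NemotronHPreTrainedModel._init_weights` covers more parameters and uses torch initializers:

```python
# Sketch of honouring _no_reinit so a second _init_weights pass
# (e.g. the post-from_pretrained safety pass) is a no-op.
class Param:
    """Stand-in for a torch parameter tensor."""

    def __init__(self):
        self.value = None


class Block:
    """Stand-in for a NemotronH mamba block."""

    def __init__(self):
        self.dt_bias = Param()
        self.out_proj_weight = Param()


def init_weights(module):
    # Skip any parameter flagged _no_reinit; set the flag after first init.
    if not getattr(module.dt_bias, "_no_reinit", False):
        module.dt_bias.value = "fresh_random_draw"
        module.dt_bias._no_reinit = True
    if not getattr(module.out_proj_weight, "_no_reinit", False):
        module.out_proj_weight.value = "kaiming_scaled"
        module.out_proj_weight._no_reinit = True


m = Block()
init_weights(m)                        # ordinary fresh init
m.dt_bias.value = "checkpoint_value"   # simulate from_pretrained loading
m.dt_bias._no_reinit = True
init_weights(m)                        # second pass leaves weights alone
print(m.dt_bias.value)  # → checkpoint_value
```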

Signed-off-by: Min Zhou <minzhou@virtueai.com>
Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
# Conflicts:
#	src/transformers/loss/loss_utils.py
Direct merge conflicted after Trainer refactors; applied the minimal config-saving change from 57cb2b9.
@evalstate
Owner Author

All-defects flow status

Processed terminal records: 500

Merged / applied defect records (204)

PR Status Method Validation Original PR summary / goal Merge note
huggingface#45682 merged merge passed FIX Restore LoRA hotswapping functionality restored PEFT LoRA hotswap adapter loading; validation selected no pytest targets
huggingface#45681 merged merge passed Restore TokenizersBackend override for DeepSeek V3/R1 tokenizer dispatch honored TokenizersBackend override for incorrect hub tokenizer class; validation selected no pytest targets
huggingface#45680 already_present none change got reverted same DeepSeek/TokenizersBackend dispatch fix already merged via PR 45681; current code preserves spaces and existing/merged tests cover behavior
huggingface#45678 merged merge passed Fix shared config mutation issue in flash_attn_from_config deep-copied config in flash attention common test helper to avoid mutating shared config; validation selected no pytest targets
huggingface#45675 merged merge passed Fix UnboundLocalError in shard_and_distribute_module for replicated parameters guarded tensor-parallel attribute update when no shard plan exists; validation selected no pytest targets
huggingface#45671 merged merge passed Update latest revision for Phi-4-multimodal test updated Phi-4 multimodal test revisions to checkpoint state compatible with current code; validation selected no pytest targets
huggingface#45670 merged merge passed [nit] glmasr should be in AutoModelForMultimodalLM added missing glmasr/dia auto-model mappings; validation selected no pytest targets
huggingface#45665 already_present none Fix pageable H2D copies in Gated DeltaNet PyTorch fallback fix already present in current cumulative branch as upstream squashed commit ca72aa0; cherry-pick of PR fix commit was empty
huggingface#45662 applied cherry-pick passed Fix EP + FSDP2: experts silently overwritten by rank-0 broadcast applied EP/FSDP2 DTensor wrapping and FSDP ignored-module fixes; validation selected no pytest targets
huggingface#45658 merged merge passed Fix triggered by fixed PeftConfigLike import/annotation so subclass initialization no longer raises NameError; validation selected no pytest targets
huggingface#45655 merged merge passed Fix the order of cls.config resolution fixed cls.config_class lookup order for loading VLM checkpoints into LLM classes; validation selected no pytest targets
huggingface#45651 merged merge passed [Trainer] Optimize LengthGroupedSampler computation with select_columns and tqdm optimized LengthGroupedSampler length extraction to avoid large-dataset iteration bottleneck; validation selected no pytest targets
huggingface#45650 merged merge passed Fix KeyError for flash_attn in import_utils.py on Python 3.13 fixed flash_attn availability lookup to avoid KeyError on Python 3.13; validation selected no pytest targets
huggingface#45649 merged merge passed Fix OOM regression for FSDP2 + cpu_ram_efficient_loading on large models avoided full zero tensor allocation for FSDP2 cpu_ram_efficient_loading missing keys to prevent OOM; validation selected no pytest targets
huggingface#45642 merged merge passed Fix trust_remote_code local cache collisions for local models (huggingface#45632) fixed trust_remote_code local cache key collisions for local models; validation selected no pytest targets
huggingface#45641 merged merge passed Fix NameError in serving CLI due to conditional import asymmetry fixed serving CLI conditional import NameError paths; validation selected no pytest targets
huggingface#45639 merged merge passed Make patched testing debug logs xdist-safe made patched testing debug logs xdist-safe; validation selected no pytest targets
huggingface#45628 merged merge passed [MistralCommonBackend] Soften validation mode and apply_chat_template arguments check merged MistralCommonBackend validation softening; compile/lint passed; tests_fetcher selected no pytest targets
huggingface#45627 merged merge passed Processing Utils: honor pre-built sub-processor kwargs in from_pretrained merged AutoProcessor sub-processor kwarg fix; compile/lint passed; tests_fetcher selected no pytest targets
huggingface#45625 merged merge passed Add supports_gradient_checkpointing to NemotronHPreTrainedModel merged NemotronH gradient-checkpointing support flag; compile/lint passed; tests_fetcher selected no pytest targets
huggingface#45624 merged merge passed Skip failing offloading tests merged Gemma4 offloading test skips; compile/lint passed; tests_fetcher selected no pytest targets
huggingface#45622 merged merge passed Fix peft constructors merged PEFT constructor fix; compile/lint passed; tests_fetcher selected no pytest targets
huggingface#45620 already_present none not_run Fix TypeError in video_processor_class_from_name when torchvision is not installed current branch already has qwen3_omni mapping and video_processor_class_from_name compares directly against possibly-None values, so the TypeError fixed by the PR is no longer pre…
huggingface#45619 merged merge passed Remove unnecessary generate warnings merged generation config warning cleanup; compile/lint passed; tests_fetcher selected no pytest targets
huggingface#45615 merged merge passed fix(qianfan_ocr): add XPU expectations merged XPU expectations for Qianfan OCR integration tests; compile/lint passed; tests_fetcher selected no pytest targets
huggingface#45611 merged merge passed Raise clear error for problem_type="single_label_classification" with num_labels=1 merged config validation for degenerate single-label classification; compile/lint passed; tests_fetcher selected no pytest targets
huggingface#45610 already_present none not_run Fix configuration reading and error handling for kernels current branch already contains the FP8 kernel error-handling and eager config-read fixes; direct PR-head merge conflicted only because the older branch lacks the newer laguna con…
huggingface#45606 merged merge passed [gemma4] infer from config instead of hardcoding merged Gemma4 audio relative-position config inference; compile/lint passed; tests_fetcher selected no pytest targets
huggingface#45605 merged merge passed Processing Utils: continue when content is a string merged processor string-content guard; compile/lint passed; tests_fetcher selected no pytest targets
huggingface#45603 merged merge passed [Auto] Pass kwargs to fixed_cross_entropy (cluster-43240-3): merged 2 of 2 PRs merged fixed_cross_entropy kwargs forwarding fix; compile/lint passed; tests_fetcher selected no pytest targets
huggingface#45602 merged merge passed [AMD CI] Fix expectations for Gemma3n merged Gemma3n AMD CI expectation fix; compile/lint passed; tests_fetcher selected no pytest targets
huggingface#45601 merged merge passed fix: compute auxiliary losses when denoising is disabled in D-FINE D-FINE auxiliary losses now compute when denoising is disabled; validation selected no pytest targets
huggingface#45598 merged merge passed Align latest model attention function dispatch aligned model attention dispatch methods; validation selected no pytest targets
huggingface#45596 merged merge passed fix 2 failed test cases for blt model on XPU updated BLT XPU test expectations; validation selected no pytest targets
huggingface#45594 already_present none passed fix(utils): Resolve backbone utils test regressions backbone utility test regression fix already present in cumulative branch
huggingface#45592 merged merge passed fix padding side issue for fast_vlm tests merged FastVLM padding-side test expectation fix; validation selected no pytest targets
huggingface#45591 merged merge passed [nemotron_h] respect _no_reinit flag on dt_bias and out_proj.weight merged Nemotron-H idempotent initialization guard; validation selected no pytest targets
huggingface#45590 already_present none passed fix huggingface#45588: guard s_aux against None in flash_attention_forward s_aux None guard is already present in current flash_attention.py; PR-head merge only conflicted on equivalent formatting
huggingface#45589 merged merge passed Fix AttributeError on s_aux=None in flash_attention_forward merged traceable flash_attention s_aux None guard; validation selected no pytest targets
huggingface#45582 merged merge passed generate: drop stale num_return_sequences warning on continuous batching path merged continuous batching num_return_sequences warning fix; validation selected no pytest targets
huggingface#45578 merged merge passed Remove attribute_map from GptOssConfig removed GptOssConfig attribute_map so checkpoint num_local_experts is not clobbered; validation selected no pytest targets
huggingface#45575 already_present none fix(generation): remove stale warning for num_return_sequences in paged generate same stale num_return_sequences warning removal is already present via PR 45582; direct merge conflicted only on equivalent num_beams warning formatting in src/transformers/genera…
huggingface#45573 merged merge passed fix transformers + torchao nvfp4 serialization merged torchao NVFP4 serialization fix; validation selected no pytest targets
huggingface#45570 merged merge passed Fix whisper long-form generation when eos_token_id is a list merged Whisper eos_token_id list handling fix; validation selected no pytest targets
huggingface#45568 merged merge passed Gemma4: fix failed test cases merged Gemma4 failing-test fixes; validation selected no pytest targets
huggingface#45566 merged merge passed fix: raise clear error when tokenizer config uses v5 list format on older versions merged clearer tokenizer config version-mismatch error; validation selected no pytest targets
huggingface#45565 already_present none fix: remove stale num_return_sequences warning in paged generate same stale num_return_sequences warning removal was already integrated via earlier PR 45582/45575 handling; no code change needed
huggingface#45564 merged merge passed Gemma3n and Gemma4 cannot use rotary kernel merged Gemma3n/Gemma4 rotary-kernel disablement; validation selected no pytest targets
huggingface#45562 merged merge passed Updated the image cache for Paddle models according to the latest API merged Paddle model cache/test update; validation selected no pytest targets
huggingface#45559 already_present none Drop noisy generate warnings when do_sample=False (or num_beams=1) intended generation warning provenance fix is already present in src/transformers/generation/configuration_utils.py; direct PR head also contains broad unrelated churn, so no merg…
huggingface#45552 merged merge passed Remove warnings for modernbert merged ModernBERT auto-docstring warning suppression; validation selected no pytest targets
huggingface#45549 merged merge passed fix: apply channel averaging correctly in audio feature extractors merged channel-averaging correction in audio feature extractors; validation selected no pytest targets
huggingface#45548 merged merge passed Fix EP + DeepSpeed ZeRO-3 loading via accelerate launch merged Expert Parallelism plus DeepSpeed ZeRO-3 loading fix; validation selected no pytest targets
huggingface#45547 already_present none passed Add disable_mmap kwarg to from_pretrained with hf-mount auto-detection disable_mmap safetensors FUSE-loading fix already present in cumulative branch; baseline validation selected no pytest targets
huggingface#45544 already_present none passed fix table update versions Python 3.10 dependency-table update fix already present in cumulative branch; baseline validation selected no pytest targets
huggingface#45541 merged merge passed Fix local_files_only tokenizer fallback when tokenizer files are missing (Issue 45538) merged local_files_only tokenizer fallback fix; validation selected no pytest targets
huggingface#45540 merged merge passed Fix cross-attention cache layer type for T5Gemma2 long inputs merged T5Gemma2 long-input cross-attention cache fix; validation selected no pytest targets
huggingface#45539 merged merge passed [modular] Fix modular logic broken in huggingface#45045 merged modular converter inheritance regression fix; validation selected no pytest targets
huggingface#45533 already_present none passed Fix AMD CI: rebuild torchvision with libjpeg + refresh expectations AMD CI torchvision/libjpeg fix and refreshed expectations already present in cumulative branch; baseline validation passed lint-only
huggingface#45530 merged merge passed [CB] Changes for long generation merged continuous-batching long-generation memory/cache fixes; validation selected no pytest targets
huggingface#45528 already_present none passed qa: re-run modular converter when the script itself is modified modular-converter checker rerun fix already present in cumulative branch; baseline validation passed lint-only
huggingface#45526 already_present none passed xpu output align with cuda in test case InternVL XPU expectation fix already present in cumulative branch; baseline validation passed lint-only
huggingface#45525 already_present none passed Fix CSM TextToAudioPipeline missing <bos> token CSM text-to-audio add_special_tokens pipeline fix already present in cumulative branch; baseline validation passed lint-only
huggingface#45523 merged merge passed Fix Seq2SeqLM ExecuTorch export: add encoder_attention_mask to decoder and use static encoder shapes merged seq2seq ExecuTorch encoder_attention_mask/static-shape export fix; validation selected no pytest targets
huggingface#45514 already_present none passed Fix GraniteMoeHybrid _update_mamba_mask crash on attention-only models GraniteMoeHybrid attention-only mamba-mask fix was already present in the cumulative branch; attempted merge produced no tree changes, and compileall/checkers passed before tests_…
huggingface#45513 already_present none passed [Qwen3.5] Fix GDN linear attention multi-token cached forward GDN linear-attention multi-token cached-forward fix was already present in cumulative branch; attempted merge produced no tree changes
huggingface#45510 merged merge passed cache_utils: fix QuantizedLayer to correctly propagate reorder_cache, crop, and batch ops to quantized buffers merged QuantizedLayer cache mutation propagation fix; validation selected no pytest targets
huggingface#45501 applied patch passed .4021378118068288:e3506c3c5a98ec3a50332c6102362804_69e2e58842c82e665fe03f65.69e2e58c42c82e665fe03f69.69e2e58b8b8bd167e2faa23e:Trae CN.T(2026/4/18 09:59:40) applied text-classification prediction label output fix; validation selected no runnable pytest targets
huggingface#45499 applied patch passed .4021378118068288:94a295563cf6b5aa7d67bd0f2c0cd7a7_69e2df7342c82e665fe03edd.69e2df7542c82e665fe03ee1.69e2df758b8bd167e2faa23c:Trae CN.T(2026/4/18 09:33:41) applied GLUE id2label mapping fix; validation selected no runnable pytest targets
huggingface#45498 already_present none passed .4021378118068288:98296403e0cd6dedb7b420b80d0fe80b_69e2ce7c42c82e665fe03e0c.69e2ce9842c82e665fe03e10.69e2ce988b8bd167e2faa239:Trae CN.T(2026/4/18 08:21:44) text-classification id2label prediction-output fix was already present from PR 45501 patch application
huggingface#45486 already_present none passed fix: return empty tuple from import_protobuf_decode_error when protobuf is unavailable protobuf decode error masking fix was already present in the cumulative branch; attempted merge produced no tree changes
huggingface#45483 already_present none passed [Conversion Mapping] Small fixups conversion mapping fixups were already present in the cumulative branch; direct merge conflicted only with newer adjacent mapping entries
huggingface#45466 already_present none passed fix: return empty tuple when protobuf not available protobuf decode error handling fix was already present; merge conflicted only on adjacent explanatory comments
huggingface#45444 merged merge unavailable [fix] Always early return for non-Mistral models in _patch_mistral_regex merge clean; compileall and style passed; light validation unavailable (baseline light validation failed and post-merge run exceeded tool timeout)
huggingface#45443 merged merge unavailable Raise 400 on model mismatch when transformers serve is pinned merge clean; compileall and style passed; light validation unavailable (baseline light validation failed and post-merge light run timed out)
huggingface#45441 merged merge unavailable fix(DSV3): parity between native DeepseekV3MoE and remote official implementation merge clean; compileall and style passed; light validation unavailable (baseline light validation failed and retry timed out)
huggingface#45437 merged merge unavailable Fix spurious position_ids warnings for at least 40 architectures merge clean; compileall and style passed; light validation unavailable (baseline light validation failed and post-merge light run timed out)
huggingface#45435 merged merge unavailable do not index past decoded chars with special tokens merge clean; compileall and style passed; light validation unavailable due to the same generated-stdin SyntaxError seen in baseline after selecting tests
huggingface#45428 merged merge unavailable [fix] PEFT integration fixes preventing save/load & integration merge clean; compileall and checkers passed; light validation unavailable due to a baseline generated-stdin SyntaxError and a post-merge light run that exceeded the tool timeout
huggingface#45427 merged merge unavailable fix(testing_utils): guard get_device_capability() with torch.cuda.is_available() merge clean; compileall and checkers passed; light validation unavailable due to a baseline generated-stdin SyntaxError and a post-merge light run that timed out
huggingface#45423 merged merge unavailable Fix void segmentation map label reduction merge clean; compileall and checkers passed; light validation unavailable (baseline generated stdin SyntaxError; post-merge tests_fetcher checkout blocked by transient untracked g…
huggingface#45422 merged merge unavailable Drop content=None from messages in apply_chat_template merge clean; compileall and checkers passed; light validation unavailable (baseline light validation timed out and post-merge light validation timed out)
huggingface#45421 merged merge unavailable Improve nested base_model_prefix handling in weight conversion and loading merge clean; compileall and checkers passed; light validation unavailable (baseline and post-merge light validation timed out)
huggingface#45420 merged merge unavailable 🚨 [Kernels] Fix kernel function registration merge clean; compileall and checkers passed; light validation unavailable (baseline and post-merge light validation timed out)
huggingface#45418 already_present none unavailable [serve] Forward tool_calls/tool_call_id in processor inputs fix already present in cumulative branch: BaseHandler forwards tool_calls and tool_call_id and has matching LLM/VLM tests; direct PR merge only conflicted with newer raw-content/a…
huggingface#45414 merged merge unavailable Fix IndexError with DeepSpeed ZeRO-3 when kernels rotary is active merge completed with straightforward conflict resolution preserving newer hidden-kernel registration plus PR DeepSpeed ZeRO-3 guard; compileall and checkers passed; light validati…
huggingface#45413 merged merge unavailable Fix EtaLogitsWarper on fully masked logits merge clean; compileall and checkers passed; light validation unavailable due environment (baseline timed out; post-merge selected targets but failed importing tests/cli/conftest.…
huggingface#45411 merged merge unavailable Fix the response schema for the gemma4 converter merge clean; compileall and checkers passed; light validation unavailable (baseline and post-merge light validation timed out)
huggingface#45410 merged merge unavailable fix(altclip): fix failing tests merge clean; compileall and checkers passed; light validation unavailable (baseline and post-merge light validation timed out)
huggingface#45407 merged merge unavailable avoid wrap 4bit-quantized model into DP merge clean; compileall and checkers passed; light validation unavailable after baseline timeout
huggingface#45404 merged merge unavailable [Tokenizers] Move gpt sw3 tokenizer out merge clean; compileall and checkers passed; light validation unavailable after baseline timeout
huggingface#45403 merged merge unavailable fix(clipseg): fix 2 failing tests merge clean; compileall and checkers passed; light validation unavailable after baseline timeout (initial post-merge run also hit transient untracked files from tests_fetcher chec…
huggingface#45402 merged merge unavailable Fix ZeRO-3 from_pretrained: load registered buffers in _load_state_dict_into_zero3_model merge clean; compileall and checkers passed; light validation unavailable after baseline timeout
huggingface#45400 merged merge unavailable Fix Qwen2.5VL temporal grid positions merge completed with simple test expectation conflict resolution; compileall and checkers passed; light validation unavailable after baseline timeout
huggingface#45395 already_present none not_run Fix IndexError with DeepSpeed ZeRO-3 when kernels rotary is active fix already present from later PR huggingface#45414: use_kernelized_func skips registration under DeepSpeed ZeRO-3; current branch also preserves newer hidden-kernel registration
huggingface#45394 merged merge unavailable fix(x_clip): fix 8 failed test cases merge clean; compileall and checkers passed; light validation unavailable after baseline timeout
huggingface#45389 merged merge unavailable Require input_ids for repetition penalty merge clean; compileall and checkers passed; light validation unavailable after baseline timeout
huggingface#45388 merged merge unavailable Make Gemma4ClippableLinear inherit from nn.Linear for PEFT/LoRA compatibility merge clean; compileall and checkers passed; light validation unavailable after baseline timeout
huggingface#45386 already_present none not_run [GGUF] Reduce peak RAM usage by casting dequantized tensors early during load PR merged as an empty tree change because the GGUF early dtype-casting changes are already present in the cumulative branch
huggingface#45385 merged merge unavailable Ignore CLIP position_ids in unexpected key loading report merge clean; compileall and checkers passed; light validation unavailable after baseline timeout
huggingface#45383 already_present none not_run fix(processing): guard message content access in apply_chat_template current processing_utils already uses message.get('content') or [] in tokenize=True and additionally handles string content, so the missing-content KeyError fix is present despite…
huggingface#45380 merged merge unavailable fix Qwen3_5MoeVisionConfig deepstack_visual_indexes silently dropped by @strict (Issue: huggingface#45375) merged Qwen3.5 deepstack_visual_indexes config fix; compile and ruff passed, light validation timed out as in baseline
huggingface#45379 already_present none not_run fix(config): add deepstack_visual_indexes to Qwen3_5MoeVisionConfig same issue huggingface#45375 fixed by preceding PR 45380; current Qwen3.5/Qwen3.5-MoE configs already declare deepstack_visual_indexes, and direct merge only conflicted on duplicate generate…
huggingface#45371 already_present none not_run fix: check CUDA availability before calling get_device_capability current get_device_properties already guards CUDA/ROCm capability lookup with torch.cuda.is_available(), so the no-GPU CUDA crash fix is present despite textual merge conflict
huggingface#45369 merged merge unavailable fix(generation): handle CUDA multinomial limit in beam search sampling merged CUDA multinomial category-limit guard; compile and ruff passed, light validation timed out as in baseline
huggingface#45368 already_present none not_run fix(serving): resolve rust tokenizer from ProcessorMixin in streaming generation current serving utils already resolves the Rust tokenizer through getattr(processor, "tokenizer", processor)._tokenizer while preserving newer tool-call streaming and has_talker h…
huggingface#45359 merged merge unavailable Fix Kimi-K2.5 tokenizer regression and _patch_mistral_regex AttributeError merge clean; compileall and checkers passed; light validation timed out as in baseline
huggingface#45358 already_present none not_run Fix vlm weight mappings current conversion_mapping already contains the VLM mapping fixes for llava/llava_next/fuyu/mllama/emu3/qwen2_vl-style models; direct merge conflicted only with newer clip model a…
huggingface#45354 merged merge unavailable fix gemma4 gradient accumulation loss and last token incorrect labels merge clean after trivial conflict resolution preserving newer Gemma3n embedding accessors; compileall and checkers passed; light validation timed out as in baseline
huggingface#45352 merged merge unavailable fix(qwen3_moe): correct return type annotation on Qwen3MoeSparseMoeBlock.forward merge clean; compileall and checkers passed; light validation timed out as in baseline
huggingface#45351 merged merge unavailable fix(testing_utils): guard get_device_capability with torch.cuda.is_available() merge clean; compileall and checkers passed; light validation timed out as in baseline
huggingface#45348 already_present none not_run Fix apply_chat_template crash on tool_call messages without content current branch already handles missing or null message content in processing_utils and serving conversion and includes broader tool-call content coverage; direct merge conflicted …
huggingface#45347 merged merge unavailable [gemma4] Fix device map auto merge completed with practical conflict resolution for newer Gemma4 class/test changes; compileall and checkers passed after removing duplicate generated class fields; light valid…
huggingface#45346 merged merge unavailable Fix Double Application of Softmax for Router Logits in MoE models merge clean; PR head only added regression tests because implementation fix was already present; compileall and checkers passed; light validation timed out as in baseline
huggingface#45345 merged merge unavailable Fix ByteLevel-BPE tokenizers silently breaking in merge clean; compileall and checkers passed; light validation timed out as in baseline
huggingface#45340 already_present none not_run Fix conversion mappings for vlms current branch already contains upstream merge commit for huggingface#45340; direct PR-head merge conflicted with newer conversion mapping and WeightTransform refactors
huggingface#45336 already_present none not_run [gemma4] Remove all shared weights, and silently skip them during loading current branch already contains upstream merge commit for huggingface#45336
huggingface#45330 already_present none not_run Fix Qwen2.5-VL temporal RoPE scaling applied to still images current branch already contains upstream merge commit for huggingface#45330
huggingface#45328 already_present none not_run Update Gemma4 weight conversion script current branch already contains upstream merge commit for huggingface#45328
huggingface#45324 already_present none not_run Gemma4 resizing per layer inputs current branch already contains upstream merge commit for huggingface#45324
huggingface#45323 already_present none not_run [CB] Fix capture of max_seqlen current branch already contains upstream merge commit for huggingface#45323
huggingface#45320 merged merge unavailable Fix AttributeError in AssistantToTargetTranslator.unmap_input_ids with cross-vocab models merge clean; validation unavailable because light validation tests_fetcher cannot checkout commits with pre-existing untracked files in worktree
huggingface#45318 merged merge unavailable fix: leak in tokenizer registry for test_processors merge clean; validation unavailable because light validation tests_fetcher cannot checkout commits with pre-existing untracked files in worktree
huggingface#45317 merged merge unavailable Fix AttributeError in _patch_mistral_regex when fix_mistral_regex=True merge clean; validation unavailable because light validation tests_fetcher cannot checkout commits with pre-existing untracked files in worktree
huggingface#45316 merged merge unavailable Logger has [transformers] prefix in non-verbose mode merge clean; validation unavailable because light validation tests_fetcher cannot checkout commits with pre-existing untracked files in worktree
huggingface#45311 merged merge unavailable resize_token_embeddings does not effect to output_embeddings merge clean; validation unavailable because light validation tests_fetcher cannot checkout commits with pre-existing untracked files in worktree
huggingface#45302 merged merge unavailable fix(security): prevent untrusted users from triggering TRL CI dispatch merge clean; compile and repository checkers passed, but light validation timed out as in baseline under tool limit
huggingface#45300 merged merge unavailable Fix Nemotron-H: add mlp layer type support merge completed after resolving mechanical conflicts from current layer_types API/NemotronHMLP base; compile and repository checkers passed, but light validation unavailable due t…
huggingface#45297 merged merge unavailable Fix mutable default arguments in quantization config classes merge clean; compile and repository checkers passed, but light validation timed out as in baseline under tool limit
huggingface#45293 merged merge unavailable Fix "AttributeError: NewTokenizer has no attribute special_attribute_present" (Remove REGISTERED_FAST_ALIASES) merge clean; compile and repository checkers passed, but light validation timed out as in baseline under tool limit
huggingface#45289 merged merge unavailable Less unnecessary RoPE warnings merge clean; compile and repository checkers passed, but light validation timed out as in baseline under tool limit
huggingface#45286 merged merge unavailable fix(nomic_bert): auto-fix failing tests merge completed with practical conflict resolution; compile and repository checkers passed, but light validation was unavailable because tests_fetcher checkout was blocked by tran…
huggingface#45284 merged merge unavailable [AMD CI] Fix Qwen2 expectations merge clean; compile and repository checkers passed, but light validation timed out as in baseline under tool limit
huggingface#45282 merged merge unavailable [AMD CI] Fix torch.compile/export failures on AMD CI due to untraceable set.contains merge clean; compile and repository checkers passed, but light validation timed out as in baseline under tool limit
huggingface#45281 merged merge unavailable Fix resize failure caused by zero-sized masks in PP-DocLayoutV3 merge clean; compile and repository checkers passed, but light validation timed out as in baseline under tool limit
huggingface#45277 merged merge unavailable Fix AttributeError in Gemma3ForConditionalGeneration and Gemma3ForSequenceClassification when config.return_dict=False merge clean; compile and repository checkers passed, but light validation timed out as in baseline under tool limit
huggingface#45275 applied patch unavailable fix(ernie4_5_vl_moe): resolve three config loading failures for ERNIE-4.5-VL MoE models ported ERNIE-4.5-VL MoE legacy model_type mapping and moe_num_experts list typing; compile and checkers passed, light validation unavailable due to a missing requests dependency
huggingface#45273 merged merge unavailable fix: liger unnecessarily materializes logits in VRAM during eval, causing OOM merge clean; compile and repository checkers passed, but light validation timed out as in baseline under tool limit
huggingface#45272 merged merge unavailable Fix redundant logic in video processing SmolVLM merge clean; compile and repository checkers passed, but light validation unavailable due to a missing requests dependency
huggingface#45691 merged merge unavailable [serve] cb error merge clean; compile and repository checkers passed, but light validation unavailable because baseline pytest collection fails without requests dependency
huggingface#45263 merged merge unavailable Add hasattr(torch.backends.cudnn, "conv") to conftest.py merge clean; compile and repository checkers passed, but light validation unavailable because baseline pytest collection fails without requests dependency
huggingface#45257 merged merge unavailable [Gemma4] Fix chat template and stop tokens for OpenAI tool calling compatibility merge clean; compile and repository checkers passed, but light validation unavailable because baseline pytest collection fails without requests dependency
huggingface#45253 merged merge unavailable Fix Gemma4 producing bad logits merge clean; compile and repository checkers passed, but light validation timed out as it did in baseline validation
huggingface#45252 merged merge unavailable Fix unexpected TF32 being enabled in testing merge clean; compile and repository checkers passed, but light validation timed out as it did in baseline validation
huggingface#45247 merged merge unavailable Fix UnboundLocalError in invert_attention_mask by adding proper shape… merge clean; compile and repository checkers passed, but light validation timed out as it did in baseline validation
huggingface#45240 merged merge unavailable fix: restore mypy type checking for PreTrainedConfig subclasses (huggingface#45071) merge clean; compile and repository checks passed, light validation timed out as in baseline
huggingface#45238 merged merge unavailable Update get_test_info.py (related to tiny model creation) merge clean; compile and repository checks passed, light validation timed out as in baseline
huggingface#45236 merged merge unavailable resize_token_embeddings does not resize lm_head merge clean; compile and repository checks passed, light validation timed out as in baseline
huggingface#45226 already_present none not_run fix: handle trailing replacement character in Whisper word timestamp decoding Whisper trailing replacement-character bounds check is already present on the cumulative branch
huggingface#45225 merged merge unavailable fix: hf-doc-builder installation was failing merge clean; compile and repository checks passed, light validation timed out as in baseline
huggingface#45224 merged merge unavailable remove unnecessary entries in some auto model mappings merge clean after trivial mapping conflict resolution preserving current dia entry; compile and repository checks passed, light validation timed out as in baseline
huggingface#45223 merged merge unavailable Fix: ObjectDetectionPipeline batch inference only returns first image results merge clean; compile and repository checks passed, light validation timed out as in baseline
huggingface#45222 already_present none not_run fix(gemma3, gemma4): default token_type_ids to zeros for text-only training current Gemma3/Gemma4 mask code no longer raises when token_type_ids/mm_token_type_ids are absent during training, so the text-only training fix is already subsumed
huggingface#45221 merged merge unavailable user friendly error when loading audio from video merge clean after preserving current timeout plumbing while adding the video-file error message; compile and repository checks passed, light validation timed out as in baseline
huggingface#45214 merged merge unavailable cohere_asr: fix bug for model_parallel_beam_search test case merge clean; compile and repository checks passed, light validation timed out as in baseline
huggingface#45211 merged merge unavailable [Qwen3MoE] Fix wrong return type annotation in Qwen3MoeSparseMoeBlock.forward merge clean; compileall and repository checkers passed; light validation timed out as in baseline
huggingface#45210 already_present none not_run Fix pypi release release workflow twine check/verbose publish and hf-doc-builder dependency-table changes are already present; current Python-version guard supersedes the older conflicting guard
huggingface#45204 merged merge unavailable fix bug for videomt model device mismatch merge clean; compileall and repository checkers passed; removed transient untracked docs that blocked tests_fetcher checkout; light validation then timed out as in baseline
huggingface#45202 merged merge unavailable Fix gemma4 has flash-attention incompatible head-dim=512 merge clean; compileall and repository checkers passed; light validation selected 458 targets but failed before tests because validation environment lacks requests
huggingface#45199 merged merge unavailable fix(models): Resolve regressions in Wav2Vec2PhonemeCTCTokenizer (wav2vec2-lv-60-espeak-cv-ft) merge clean; compileall and repository checkers passed; light validation timed out as in baseline before reporting selected targets
huggingface#45193 merged merge unavailable Config can apply pyndatic validation without torch-dependence merge clean; compileall and checker commands passed; light validation unavailable because selected CLI tests require missing dependency requests in validation environment
huggingface#45188 merged merge unavailable fix test_register_result_handler merge clean; compileall and checker commands passed; light validation unavailable because tests_fetcher could not checkout parent with pre-existing untracked files in worktree
huggingface#45185 merged merge unavailable Generalize gemma vision mask to videos merge clean; compileall and checker commands passed; light validation unavailable because tests_fetcher checkout was blocked by pre-existing untracked files
huggingface#45173 merged merge unavailable [misc] fix qwen35 tests: correct the text model type and skip reverse_mapping merge clean; compileall and checker commands passed; light validation unavailable because tests_fetcher checkout was blocked by pre-existing untracked files
huggingface#45171 already_present none not_run Fix Sam3Processor missing input_boxes_labels for padded None entries Sam3Processor default input_boxes_labels generation and regression tests for padded None entries are already present on the cumulative branch
huggingface#45169 merged merge unavailable Fix explicit local code resolution for tokenizers and image processors merge clean; compileall and checker commands passed, but light validation unavailable because tests_fetcher cannot checkout commits while pre-existing untracked source files would…
huggingface#45166 already_present none not_run Re-add regex substitutions to the response parsing spec response parsing support for x-regex-substitutions and corresponding tests are already present on the cumulative branch; direct PR-head merge only conflicted with later added Gemm…
huggingface#45165 already_present none not_run Fix missing image processors backends image processor backend fixes from PR huggingface#45165 are already present on the cumulative branch
huggingface#45164 already_present none not_run Fix TypeError: 'NoneType' object is not iterable in GenerationMixin.generate GenerationMixin now guards None layer_types with an empty iterable; PR huggingface#45164 is already present on the cumulative branch
huggingface#45163 already_present none not_run tweak checkers output on errors checker traceback/error-output changes and tests from PR huggingface#45163 are already present on the cumulative branch
huggingface#45156 merged merge unavailable Fix save_pretrained writing incorrect tie_word_embeddings=True config after PEFT merge merge clean; compileall/checkers passed, but light validation unavailable because tests_fetcher checkout is blocked by pre-existing untracked files also present at baseline
huggingface#45147 merged merge unavailable Fix broken HQQ support merge clean; compileall/checkers passed, but light validation unavailable because tests_fetcher checkout is blocked by pre-existing untracked files also present at baseline
huggingface#45140 merged merge unavailable Fix stupid test fetcher merge clean; compileall/checkers passed, but light validation unavailable because tests_fetcher checkout is blocked by pre-existing untracked files also present at baseline
huggingface#45138 merged merge unavailable [CI] Small T5 expectations updated merge clean; compileall/checkers passed, but light validation unavailable because tests_fetcher checkout is blocked by pre-existing untracked files also present at baseline
huggingface#45136 already_present none unavailable Fix huggingface#45127: Auto-fix diverged tie_word_embeddings config on save to prevent silent weight corruption the cumulative branch already contains the save_pretrained diverged tied-embedding config fix and regression test; direct PR merge conflicts only with the already-present equivale…
huggingface#45131 merged merge unavailable Fix MoE routers returning probabilities instead of logits merge clean; compileall/checkers passed, but light validation unavailable because tests_fetcher checkout is blocked by pre-existing untracked files also present at baseline
huggingface#45129 already_present none unavailable fix(config): annotate PreTrainedConfig.dtype as Any to fix pydantic schema generation (huggingface#45070) current cumulative branch already annotates PreTrainedConfig.dtype as Any via dtype_validator, covering the pydantic forward-reference fix; PR conflicts only against this evolved …
huggingface#45124 merged merge unavailable [Qwen3.5 MoE] Add _tp_plan to ForConditionalGeneration merge clean; compileall/checkers passed, but light validation unavailable because tests_fetcher checkout is blocked by pre-existing untracked files also present at baseline
huggingface#45123 merged merge unavailable Fix PP test_ocr_queries merge clean; compileall/checkers passed, but light validation unavailable because tests_fetcher checkout is blocked by pre-existing untracked files also present at baseline
huggingface#45119 already_present none unavailable Fix: Preserve PreTrainedConfig init signatures for type checkers (fixes huggingface#45071) fix code already present in cumulative branch via preserved signature in wrap_init_to_accept_kwargs; PR merge only conflicted on tests adjacent to existing config-subclass cov…
huggingface#45117 merged merge unavailable Copy the template resolution logic from the base apply_chat_template to Voxtral merge clean; compileall/checkers passed, but light validation unavailable because tests_fetcher checkout is blocked by pre-existing untracked files also present at baseline
huggingface#45111 already_present none unavailable Fix double softmax in MoE router load-balancing loss PR head is already an ancestor of the cumulative branch; no code change needed
huggingface#45108 merged merge unavailable Fix type to allow merge clean; validation unavailable: run-light-validation cannot checkout comparison commits because existing untracked files would be overwritten
huggingface#45107 merged merge unavailable Fix text-to-speech pipeline crash when generation config contains None values merge clean; validation unavailable: run-light-validation cannot checkout comparison commits because existing untracked files would be overwritten
huggingface#45098 merged merge unavailable fix: incomplete string literal causes syntax error in config docstring checker merge clean; validation unavailable: run-light-validation cannot checkout comparison commits because existing untracked files would be overwritten
huggingface#45091 merged merge unavailable Fix _get_feat_extract_output_lengths in qwen3_omni_moe merge clean; compileall and repository checks passed; light validation unavailable because tests_fetcher checkout is blocked by pre-existing untracked files
huggingface#45090 merged merge unavailable Fix TypeError when chat_template is None in VoxtralProcessor merge clean; compileall and repository checks passed; light validation unavailable because tests_fetcher checkout is blocked by pre-existing untracked files
huggingface#45089 already_present none unavailable fix: use sys.modules.get() to avoid KeyError in modeling_utils fix already present: current modeling_utils already uses sys.modules.get(cls.module) and checks class_module is None in both attention and experts implementation probes; attem…
huggingface#45088 merged merge unavailable fix audio encoder output length formula in qwen3_omni_moe merge clean; changes overlapped with already-merged Qwen3 Omni MoE length fix; compileall and repository checks passed; light validation unavailable because tests_fetcher checkout…
huggingface#45086 merged merge unavailable fix AttributeError in _patch_mistral_regex merge clean; compileall and repository checks passed; light validation unavailable because tests_fetcher checkout is blocked by pre-existing untracked files
huggingface#45085 merged merge unavailable Fix dtype mismatches in SwitchTransformers and TimmWrapperModel for bfloat16 merge clean; compileall and repository checks passed; light validation unavailable because tests_fetcher checkout is blocked by pre-existing untracked files
huggingface#45080 merged merge unavailable Fix PreTrainedConfig as Pydantic field type after dataclass conversion merge clean; compileall and repository checks passed; light validation unavailable because tests_fetcher checkout is blocked by pre-existing untracked files
huggingface#45079 already_present none unavailable Fix resized LM head weights being overwritten by post_init intended LM head initialization marker is already present in modeling_utils.py; PR head conflicts only on direct assignment versus setattr spelling
huggingface#45078 already_present none unavailable throw error when conversion required current tokenizer auto dispatch already contains PR behavior: fallback to TokenizersBackend then raises when conversion backend is unavailable; direct PR head also includes unrela…
huggingface#45077 already_present none unavailable fix: pin 50 unpinned actions to commit SHA, extract 1 secret to env var workflow hardening is already present on the cumulative branch: actions are pinned to SHAs; direct merge conflicts are formatting/comment spacing across updated workflow files
huggingface#45074 merged merge unavailable fix(models): Fix dtype mismatch in SwitchTransformers and TimmWrapperModel merge clean; compileall and repository checks passed; light validation unavailable because tests_fetcher checkout is blocked by pre-existing untracked files
huggingface#45069 merged merge unavailable Fix TypeError in rope validation when ignore_keys is a list merge clean; compileall and repository checks passed; light validation unavailable because tests_fetcher checkout is blocked by pre-existing untracked files
huggingface#45062 merged merge unavailable Add regression test for ByteLevel added-token Unicode decode corruption merge clean; compileall and repository checks passed; light validation unavailable because tests_fetcher checkout is blocked by pre-existing untracked files
huggingface#45061 merged merge unavailable [FA] Fix BC support for a few versions + add deprecation cycle merge clean; compileall and repository checks passed; light validation unavailable because tests_fetcher checkout is blocked by pre-existing untracked files
huggingface#45060 merged merge unavailable Fix PIL backend fallback when torchvision is unavailable merge clean; compileall and repository checks passed; light validation unavailable because tests_fetcher checkout is blocked by pre-existing untracked files
huggingface#45057 already_present none unavailable [serving] Fix continuous batching JSON response serialization fix already present after serving code was split into src/transformers/cli/serving: non-streaming JSONResponse uses model_dump and regression test exists; direct merge conflicted …
huggingface#45056 merged merge unavailable [auto_docstring] needs to be only run on doc merge clean; compileall and repository checks passed; light validation unavailable because tests_fetcher checkout is blocked by pre-existing untracked files
huggingface#45055 applied patch unavailable Save model config in Trainer checkpoints for non-PreTrainedModel models direct merge conflicted in src/transformers/trainer.py after Trainer refactors; applied minimal config save in current _save path; compileall and repository checks passed; light v…
huggingface#45053 merged merge unavailable Fix failing XCLIPModelIntegrationTest merge clean; compileall and repository checks passed; light validation unavailable because tests_fetcher checkout is blocked by pre-existing untracked files from earlier cumulativ…
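Several of the accepted records above (huggingface#45346, huggingface#45131, huggingface#45111) address the same defect class: MoE router code applying softmax to values that are already probabilities. The sketch below is a minimal, framework-free illustration of why that matters — the `softmax` helper and the sample logits are illustrative, not taken from the Transformers sources:

```python
import math


def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


router_logits = [2.0, 1.0, 0.1]

probs_once = softmax(router_logits)
# The bug pattern: a load-balancing loss (or a second code path)
# re-applies softmax to values that are already probabilities.
probs_twice = softmax(probs_once)

# Softmax-of-softmax flattens the routing distribution toward uniform,
# so the top expert's weight is damped and the aux loss is skewed.
print(probs_once[0] > probs_twice[0])  # True
```

Because the double application silently changes magnitudes rather than raising an error, fixes like huggingface#45346 ship regression tests pinning the single-softmax behavior.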
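Record huggingface#45297 above fixes another recurring Python pitfall: mutable default arguments in config classes, which are created once at function-definition time and then shared by every instance. A hedged sketch of the bug and the usual remedy (`BuggyQuantConfig`/`FixedQuantConfig` are hypothetical stand-ins, not the actual Transformers classes):

```python
class BuggyQuantConfig:
    # Bug pattern: the default list is built once when the class is
    # defined, so every instance using the default shares the same list.
    def __init__(self, skip_modules=[]):
        self.skip_modules = skip_modules


class FixedQuantConfig:
    # Usual fix: default to None and build a fresh list per instance.
    def __init__(self, skip_modules=None):
        self.skip_modules = list(skip_modules) if skip_modules is not None else []


a = BuggyQuantConfig()
a.skip_modules.append("lm_head")
print(BuggyQuantConfig().skip_modules)  # ['lm_head'] -- leaked from `a`

b = FixedQuantConfig()
b.skip_modules.append("lm_head")
print(FixedQuantConfig().skip_modules)  # [] -- isolated per instance
```

For config objects that get serialized and compared across processes, this kind of cross-instance leakage is especially hard to diagnose, which is why it is classified as a defect fix rather than a style cleanup.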

Rejected / not included records (296)

PR | Category | Status | Original PR summary / goal | Rejection reason
huggingface#45679 other skipped TST Run fast PEFT tests in normal CI not a defect fix
huggingface#45677 other skipped No serving in quality docker image not a defect fix
huggingface#45673 feature skipped Laguna XS.2 implementation not a defect fix
huggingface#45669 documentation skipped zero_shot_object_detection ValueError fix for python 3.13 not a defect fix
huggingface#45668 feature skipped [GGUF] Add support for Qwen3.5 MoE (qwen35moe arch) not a defect fix
huggingface#45667 other skipped chore(typing): add ty type checking for 3 pipeline files not a defect fix
huggingface#45666 feature skipped Extended n-to-1 kernel fusion via KernelConfig not a defect fix
huggingface#45664 documentation skipped Doc translate to Persian(farsi) not a defect fix
huggingface#45661 feature skipped [Weight Converter] More fine-grained mappings on classes, scoping for every transforms (including weight converter) not a defect fix
huggingface#45660 documentation skipped [docs] cpu offloading not a defect fix
huggingface#45659 documentation skipped [docs] dtype not a defect fix
huggingface#45654 feature skipped [CB] Refactor any model-related code in a separate class not a defect fix
huggingface#45653 feature skipped [CB] Better overall script and decode bucketting not a defect fix
huggingface#45652 other skipped Fix colmodernvbert tests not a defect fix
huggingface#45648 other skipped Fix SDPA inference tolerances for MPS backend not a defect fix
huggingface#45645 other skipped Fix xdist collisions for captured_info artifacts and preserve CI debug logs not a defect fix
huggingface#45643 feature skipped Add DeepSeek V4 not a defect fix
huggingface#45640 feature skipped 🚨🚨🚨 [Trainer] Default to FSDP2, simplify API around fsdp + fsdp_config not a defect fix
huggingface#45638 feature skipped Add Multi-Token Prediction (MTP) support for Qwen3.5 not a defect fix
huggingface#45637 feature skipped Add Multi-Token Prediction (MTP) support for Qwen3.5 not a defect fix
huggingface#45635 other skipped qa: speed up dtype regex weight load + reduce dtype tests to 3 random not a defect fix
huggingface#45634 other skipped DeepGEMM BF16, isolation, refactor not a defect fix
huggingface#45633 other skipped CircleCI with torch 2.11 not a defect fix
huggingface#45631 other skipped chore: bump doc-builder SHA for main doc build workflow not a defect fix
huggingface#45630 feature skipped Add new model: Kimi2-6 not a defect fix
huggingface#45629 other skipped Allow more artifacts to be download in CI not a defect fix
huggingface#45626 feature skipped [Model] Add PP-FormulaNet Model Support not a defect fix
huggingface#45623 other skipped Glm5 change not a defect fix
huggingface#45621 feature skipped Better Grouped GEMM + EP not a defect fix
huggingface#45618 feature skipped Add MTP speculative decoding via MTPCandidateGenerator not a defect fix
huggingface#45617 feature skipped Add Multi-Token Prediction (MTP) inference support not a defect fix
huggingface#45616 feature skipped Add DeepSeek V4 not a defect fix
huggingface#45614 defect validation_failed Add missing requests dependency to transformers[serving] cherry-pick was clean but validation failed ruff_format: setup.py would be reformatted after adding requests to serving extras; reverted the top cherry-pick
huggingface#45613 feature skipped [New Model] Add MiniCPM3 support not a defect fix
huggingface#45612 documentation skipped [docs] update model cards not a defect fix
huggingface#45609 feature skipped make it possible to ser/deser HF MoE models with torchao not a defect fix
huggingface#45608 documentation skipped Python code in model docs not a defect fix
huggingface#45607 other skipped Add regression test for Gemma4 audio relative positional range not a defect fix
huggingface#45604 feature skipped Agent first cli with skill not a defect fix
huggingface#45683 defect validation_failed Exclude audio modules from conversion process ruff_check failed: blank line contains whitespace in src/transformers/quantizers/base.py
huggingface#45599 other skipped qa: more lazy loading not a defect fix
huggingface#45597 feature skipped Add Granite 4.1 Vision (granite4_vision) not a defect fix
huggingface#45595 feature skipped Add unified Cache-layer management for GLM-5 DSA Indexer keys not a defect fix
huggingface#45587 documentation skipped [docs] cb memory management not a defect fix
huggingface#45586 feature skipped Add Audio-Visual Flamingo model not a defect fix
huggingface#45585 other skipped qa: bumped mlinter and allow local override not a defect fix
huggingface#45583 other skipped Update dev version not a defect fix
huggingface#45581 other skipped Add automated reviewer assignment script not a defect fix
huggingface#45580 feature skipped [Privacy Filter] Add model not a defect fix
huggingface#45579 other skipped Update assign_reviewers.py not a defect fix
huggingface#45577 feature skipped Allow for registered experts from kernels hub not a defect fix
huggingface#45576 documentation skipped docs(pipeline): fix num_workers docstring default from 8 to 0 not a defect fix
huggingface#45574 documentation skipped Fix typos not a defect fix
huggingface#45572 other skipped refactor(Dots1): drop Dots1MoE override to pass (inherits from DSV3 MoE) not a defect fix
huggingface#45569 feature skipped Proper nemotron H and 3 and 2 not a defect fix
huggingface#45567 other skipped Move some conversion mappings to PrefixChange not a defect fix
huggingface#45560 documentation skipped Update torchao usage for XPU and CPU not a defect fix
huggingface#45558 feature skipped feat(trainer): log individual losses from loss_dict not a defect fix
huggingface#45556 documentation skipped Add image processors refactor to v5 migration guide not a defect fix
huggingface#45555 feature skipped perf: avoid recomputing rotary_emb for each layer in some Google and ModernBERT models not a defect fix
huggingface#45554 documentation skipped [docs] multi-turn tool calling not a defect fix
huggingface#45553 documentation skipped [docs] per-request sampling params not a defect fix
huggingface#45551 feature skipped Add ForSequenceClassification heads for the OLMo family not a defect fix
huggingface#45550 other skipped Add runner selection for mi325 GPU type not a defect fix
huggingface#45546 feature skipped feat: Add GGUF loading support for Llama 4 (text) not a defect fix
huggingface#45543 other skipped ci: OTEL support not a defect fix
huggingface#45537 feature skipped NVFP4 quantization: streaming loader, fused MoE experts (Qwen + Llama… not a defect fix
huggingface#45535 other skipped [Sam3LiteText] Remove unnecessary modules/configs not a defect fix
huggingface#45534 feature skipped 🚨 [ALM] Add base model without head not a defect fix
huggingface#45532 feature skipped [Model] Add SLANet Model Support not a defect fix
huggingface#45531 defect validation_failed Revert "Fix: modular image processors (huggingface#45492)" merge was clean and compileall/checkers passed, but configured tests_fetcher/pytest validation did not complete within the command timeout; merge reset
huggingface#45527 other skipped Reapply modular to examples not a defect fix
huggingface#45524 defect aborted utils: handle flash_attn missing from importlib packages_distributions without crashing codebase moved on: src/transformers/utils/import_utils.py already uses PACKAGE_DISTRIBUTION_MAPPING.get(..., []) for the flash_attn availability checks, while current code differs from the PR on the flash_attn_3 package…
huggingface#45686 defect aborted Fix custom-module copies inheriting read-only permissions merge conflicted in tests/utils/test_dynamic_module_utils.py where current branch already added local-cache-key tests and PR adds a custom_object_save read-only-permission regression test; runtime code change in src/tra…
huggingface#45519 feature skipped [Trainer] Add ddp_static_graph option not a defect fix
huggingface#45516 defect validation_failed T5Gemma2: fix prepare_decoder_input_ids_from_labels compileall and repo checkers passed, but tests_fetcher validation did not complete within the command timeout after the merge
huggingface#45515 defect validation_failed Fix CUDA availability check in get_device_properties() ruff_check failed with F823 because PR removed the local torch import while get_device_properties still has later local torch imports
huggingface#45512 feature skipped [OutputRecorder] re.search on layer_name not a defect fix
huggingface#45511 defect validation_failed Fix NaN in Gemma3/EmbeddingGemma when batching mixed-length sequences… compileall and repo checkers passed, but tests_fetcher validation did not complete within the command timeout after the merge
huggingface#45509 defect validation_failed Fix get_device_properties crash when CUDA is installed but no GPU compileall and repo checkers passed, but tests_fetcher validation did not complete within the command timeout after the merge
huggingface#45508 documentation skipped [Doc] Fix 'tokenized' -> 'tokenizer' typo in streamer docstrings not a defect fix
huggingface#45506 feature skipped Add full GGUF loading support for GPT‑OSS (fixes huggingface#43366, supersedes huggingface#43757) latest not a defect fix
huggingface#45504 feature skipped .4021378118068288:1e40fd96a800b4038c914120b0aa85c2_69e2faf342c82e665fe04170.69e2faf642c82e665fe04174.69e2faf6d18af355a624eac3:Trae CN.T(2026/4/18 11:31:02) not a defect fix
huggingface#45503 feature skipped .4021378118068288:aafb9167aaa6b321205f754209b0cbcb_69e2f25642c82e665fe0407f.69e2f43542c82e665fe04083.69e2f435d18af355a624eac1:Trae CN.T(2026/4/18 11:02:13) not a defect fix
huggingface#45502 feature skipped .4021378118068288:4da7ed27ccaa5f974fe4a552e2b67bb6_69e2eea842c82e665fe04002.69e2eec142c82e665fe04006.69e2eec18b8bd167e2faa241:Trae CN.T(2026/4/18 10:38:57) not a defect fix
huggingface#45500 feature skipped Add full GGUF loading support for GPT‑OSS (fixes huggingface#43366, supersedes huggingface#43757) latest not a defect fix
huggingface#45497 feature skipped Add V-JEPA 2.1 inference support not a defect fix
huggingface#45495 other skipped revert sha commit pointing to main for transformers_amd_ci_ workflows not a defect fix
huggingface#45494 defect validation_failed Fix: propagate quantization_config to text sub-config for composite models in AutoModelForCausalLM compileall and repo checkers passed, but tests_fetcher validation timed out after the merge
huggingface#45493 feature skipped Modularize ProcessorMixin into smaller components not a defect fix
huggingface#45492 defect validation_failed Fix: modular image processors compileall and repo checkers passed, but tests_fetcher validation timed out after the merge
huggingface#45490 feature skipped Add ctsm model not a defect fix
huggingface#45489 other skipped Align gemma3n cache sharing to gemma4 not a defect fix
huggingface#45487 defect validation_failed Fix model parallel issue for altclip model and ChineseClip model compileall and repo checkers passed, but tests_fetcher validation timed out after the merge
huggingface#45485 feature skipped [serve] Update tool call to switch to parse_response not a defect fix
huggingface#45484 other skipped Minor update not a defect fix
huggingface#45481 other skipped Add check-auto in repo-consistency and fix sorting not a defect fix
huggingface#45480 defect validation_failed Update quants tests compileall and repo checkers passed, but tests_fetcher/pytest validation timed out after the merge
huggingface#45477 feature skipped Blockwise mask fn as opt arg in all masking functions not a defect fix
huggingface#45476 other skipped [Don't merge] Call CI workflow not a defect fix
huggingface#45475 other skipped chore(qa): split out mlinter not a defect fix
huggingface#45474 other skipped chore: bump doc-builder SHA for main doc build workflow not a defect fix
huggingface#45473 defect validation_failed Fix EP: RouterParallel shape, tp_plan property, grouped_mm sentinels merge conflict in tensor_parallel.py was resolved by retaining the newer post_shard_wrap; compileall and checkers passed, but tests_fetcher/pytest validation timed out
huggingface#45472 defect validation_failed fix(testing_utils): guard get_device_capability with torch.cuda.is_available() compileall and repo checkers passed, but tests_fetcher/pytest validation timed out after the merge
huggingface#45471 feature skipped Add EXAONE 4.5 implementations not a defect fix
huggingface#45470 defect validation_failed sam3_lite_text: skip flash_attn_2_can_dispatch_composite_models tests repo checkers failed: PR added a duplicate test_flash_attn_2_can_dispatch_composite_models definition in sam3_lite_text tests
huggingface#45469 defect validation_failed Fix: propagate interpolate_pos_encoding through Pixio model hierarchy repo checkers failed: ruff format would reformat pixio modeling and modular files
huggingface#45467 defect validation_failed Fix MPS SDPA output shape when value head dim differs from query head dim compileall and repo checkers passed, but tests_fetcher/pytest validation timed out after the merge
huggingface#45465 documentation skipped [docs] contributing not a defect fix
huggingface#45463 defect aborted Fix response api support codebase moved on: response API support overlaps heavily with newer serve/tool-call parsing changes; direct merge conflicts across docs, response/chat handlers, serving utils, and serve tests, making a speculative manua…
huggingface#45462 other skipped chore(sec): added a handful of security checks not a defect fix
huggingface#45461 other skipped Remove redundant condition checks in get_image_size method not a defect fix
huggingface#45460 defect validation_failed fix(tokenization): re-raise ImportError to allow RuntimeError/OSError fallback (huggingface#45459) compileall and repo checkers passed, but tests_fetcher/pytest validation timed out after the merge
huggingface#45457 feature skipped Allow loading Qwen Thinker 'base' models without generative head not a defect fix
huggingface#45456 other skipped refactor(qa): extend extras so ty can run on server modules not a defect fix
huggingface#45455 defect validation_failed [fix] Make Qwen2_5OmniProcessor warning a lot less noisy via warning_once compileall and repo checkers passed, but impacted-test validation timed out after the merge
huggingface#45454 defect aborted Gemma4 training with text-only samples codebase moved on: PR head includes unrelated main merge history, and cherry-picking the relevant Gemma token-type commits conflicts with newer Gemma3/Gemma4 causal mask call signatures and test skips; resolving would r…
huggingface#45453 other skipped Draft commit not a defect fix
huggingface#45452 other skipped refactor: replace wildcard imports with explicit imports in model init.py files not a defect fix
huggingface#45451 documentation skipped Fix 'seperate' typo in qwen3/glm video-model docstrings not a defect fix
huggingface#45450 other skipped chore: bump doc-builder SHA for PR upload workflow not a defect fix
huggingface#45449 defect validation_failed Add step3_vl to MODELS_WITH_INCORRECT_HUB_TOKENIZER_CLASS compileall and repo checkers passed, but impacted-test validation timed out after the merge
huggingface#45448 defect aborted [loading] Clean way to add/remove full parts in checkpoint names codebase moved on: loading/conversion mapping changes conflict with newer cumulative conversion mappings for qwen3_5_moe_text, hy_v3/laguna, and cohere_asr; resolving the combined checkpoint-conversion table safely woul…
huggingface#45445 defect validation_failed Update Torch version check for flex attention compileall and repo checkers passed, but impacted-test validation timed out after the merge
huggingface#45442 other skipped Update workflow references to new commit hash not a defect fix
huggingface#45439 feature skipped feat: bump min safetensors version to 0.8.0-rc.0 not a defect fix
huggingface#45438 feature skipped Add Gemma4ForSequenceClassification not a defect fix
huggingface#45436 feature skipped Add expert parallelism (EP) config support for Qwen3 MoE not a defect fix
huggingface#45434 other skipped better grad acc tests not a defect fix
huggingface#45433 feature skipped SonicMoe not a defect fix
huggingface#45432 other skipped chore(qa): split pipeline and add type checking not a defect fix
huggingface#45430 documentation skipped [Doc] Correct checkpoint path in Dinov2 model_docs not a defect fix
huggingface#45429 other skipped Improve workflow file not a defect fix
huggingface#45426 feature skipped Feature/add axk1 not a defect fix
huggingface#45425 other skipped chore(typing): added modeling_utils to ty not a defect fix
huggingface#45424 feature skipped Add IndexCache support for GLM5 DSA not a defect fix
huggingface#45415 other skipped Adds type checking to src/transformers/*py not a defect fix
huggingface#45409 feature skipped from_pretrained orchestration + distributed save/load not a defect fix
huggingface#45408 feature skipped MoE expert parallelism + sequence parallelism not a defect fix
huggingface#45401 feature skipped Add support for Voxtral-4B-TTS-2603 to transformers not a defect fix
huggingface#45398 documentation skipped Add example for iterative chatting with MLLMs not a defect fix
huggingface#45396 feature skipped Extract dynamic vision/audio tensors into standalone pure functions not a defect fix
huggingface#45687 defect validation_failed fix: Made histc_input robust for broader hardware merge clean but validation failed: ruff format would reformat src/transformers/integrations/moe.py
huggingface#45392 other skipped remove cache file from tree not a defect fix
huggingface#45391 feature skipped audio tester class not a defect fix
huggingface#45387 defect aborted Fix flash_attention_3 detection and import for hopper wheel installs codebase moved on: direct merge conflicts in is_flash_attn_3_available; current branch already has newer flash-attn package-distribution guard changes while the PR replaces detection with import probing, so choosing eit…
huggingface#45384 feature skipped generation/stopping_criteria: short-circuit StoppingCriteriaList when all sequences are done not a defect fix
huggingface#45688 documentation skipped docs(README_zh-hans): clarify conditions for not using Transformers not a defect fix
huggingface#45382 feature skipped Add AudioGen (AudioCraft) to MusicGen conversion scripts not a defect fix
huggingface#45378 defect validation_failed fix(mistral): guard ReasoningEffort import for older mistral_common versions merge clean but validation failed: ruff format would reformat src/transformers/tokenization_mistral_common.py
huggingface#45374 feature skipped Adding hierarchical classification example not a defect fix
huggingface#45370 documentation skipped docs: fix 5 docstring errors in Gemma3nTextConfig (typos, grammar, formatting) not a defect fix
huggingface#45367 feature skipped Add dtype config options for Four Over Six not a defect fix
huggingface#45366 defect validation_failed Fix OLMoE routing and Mistral4 RoPE dimensions merge clean but validation failed: import ordering errors in tests/models/mistral4/test_modeling_mistral4.py and tests/models/olmoe/test_modeling_olmoe.py
huggingface#45365 other skipped Refactor GPT-J output tracing to use standardized decorators not a defect fix
huggingface#45364 feature skipped Add PolarQuant backend to QuantizedCache (Hadamard-rotated Lloyd-Max) not a defect fix
huggingface#45363 feature skipped n-to-1 kernel fusion via KernelConfig not a defect fix
huggingface#45361 feature skipped Add CLIP-like models in conversion to VLMs not a defect fix
huggingface#45360 other skipped Replace deprecated huggingface-cli references with hf not a defect fix
huggingface#45355 feature skipped Add universal phone recognition model - PhoneticXeus not a defect fix
huggingface#45353 feature skipped add kwargs to all methods in the CallbackHandler class not a defect fix
huggingface#45350 feature skipped WIP: Add support for Granite4VisionForConditionalGeneration not a defect fix
huggingface#45349 defect validation_failed Fix huggingface#45305 + add regression test GAS ruff failed after merge: duplicate test_gradient_accumulation_steps_not_leaked_to_accelerator and undefined set_seed in tests/trainer/test_trainer.py; merge reset
huggingface#45344 other skipped refactor: display test duration not a defect fix
huggingface#45342 defect validation_failed Use recursively from children ruff format failed after merge: src/transformers/modeling_utils.py would be reformatted; merge reset
huggingface#45339 other skipped chore: added circleci python script to ruff and ty checkers not a defect fix
huggingface#45338 documentation skipped docs: document known limitations of _can_set_attn/experts_implementation source inspection not a defect fix
huggingface#45337 other skipped chore: remove test_hub for now not a defect fix
huggingface#45334 feature skipped Feature/add axk1 not a defect fix
huggingface#45333 feature skipped Add heterogeneous config support (per-layer configuration) not a defect fix
huggingface#45689 feature skipped fixing more typos not a defect fix
huggingface#45332 feature skipped Add heterogeneous model support (per-layer config and modeling) not a defect fix
huggingface#45329 feature skipped Update trackio integration to use Buckets and "freeze" Space after training not a defect fix
huggingface#45327 documentation skipped [docs] modular transformers not a defect fix
huggingface#45326 feature skipped feat[vLLM × v5]: Add vLLM compatibility for audio models not a defect fix
huggingface#45321 defect aborted Remove references to torchao's AffineQuantizedTensor codebase moved on: direct merge conflicted in tests/quantization/torchao_integration/test_torchao.py where current branch imports newer NVFP4DynamicActivationNVFP4WeightConfig while the PR removes deprecated AffineQuant…
huggingface#45319 defect aborted fix: dont download artifacts from the test hub codebase moved on: direct merge conflicted in utils/fetch_hub_objects_for_ci.py at URLS_FOR_TESTING_DATA where cumulative branch added PP-OCR/Paddle demo URLs while the PR rewrites hub artifact fetching to avoid staging…
huggingface#45315 defect aborted Fix softmaxing router logits codebase moved on: direct merge conflicted across many MoE router implementations where current branch already has router_probs/router_scores naming while PR changes routing_weights/router_logits return semantics to avo…
huggingface#45314 defect aborted Conversion for LLM class loading with VLM ckpt codebase moved on: direct merge conflicted in conversion mapping and Gemma3n modular code where current branch has newer PrefixChange/model-prefix conversion logic, qwen3_5_text mappings, and Gemma3n unexpected-key hand…
huggingface#45312 defect validation_failed [gemma4] Dissociate kv states sharing from the Cache compileall failed after clean merge: Gemma4 modeling and modular files contain duplicate shared_kv_states arguments
huggingface#45309 defect aborted Fix KeyError in apply_chat_template when message has no content (huggingface#45290) codebase moved on: direct merge conflicted in chat-template/processing/serving paths where current branch already uses message.get/content handling and PR introduces a get_message_content helper for missing content; res…
huggingface#45303 defect validation_failed Fix FA2 inference equivalence failures for Whisper (closes huggingface#29942) patch applied from gh diff because PR ref was unavailable, but ruff_format would reformat tests/test_modeling_common.py
huggingface#45301 documentation skipped docs maintenance for transformers repository 979e8 not a defect fix
huggingface#45299 other skipped [Please ignore] CI Test PR not a defect fix
huggingface#45298 feature skipped Add new qwen2 5 vl not a defect fix
huggingface#45296 feature skipped Add GGUF support to Gemma4 (31B & 26B-A4B) text not a defect fix
huggingface#45294 feature skipped feat: add Gemma4ForSequenceClassification not a defect fix
huggingface#45291 feature skipped First pull request not a defect fix
huggingface#45288 defect aborted fix(cohere_asr): auto-fix failing tests codebase moved on: Cohere ASR expectations now use the shared Expectations helper with xpu/cuda entries, overlapping older torch_device conditional expectations; accepting the PR head would regress current expectation s…
huggingface#45287 defect aborted fix(videomt): auto-fix failing tests codebase moved on: VideoMT expected-mask tensors already differ from the closed draft auto-fix in multiple positions; choosing either side would be speculative without rerunning the model-specific slow tests
huggingface#45285 defect aborted Fix export for gemma4 and add Integration tests codebase moved on: Gemma4 tests and ExecuTorch export paths have diverged substantially; current branch already has newer export/cache handling and expanded Gemma4 expectations, while the PR head would remove later Quan…
huggingface#45283 feature skipped Add Qwen3.5 GGUF loading support not a defect fix
huggingface#45280 feature skipped add Qianfan-OCR model definition not a defect fix
huggingface#45279 feature skipped add expert parallelism for gemma-4-26B-A4B-it not a defect fix
huggingface#45274 defect aborted Fix CB Accuracy Regression under FA2 codebase moved on: continuous batching CUDA graph handling now has split varlen/decode controls and a different graph-key API; the PR edits older use_cuda_graph and graph signature paths and conflicts across runtime plu…
huggingface#45271 documentation skipped [docs] vlm addition not a defect fix
huggingface#45270 feature skipped [Trainer] Support multi-loss component logging not a defect fix
huggingface#45269 other skipped Fix typos in src/transformers/utils/output_capturing.py not a defect fix
huggingface#45690 feature skipped [serve] Support for reasoning not a defect fix
huggingface#45268 defect aborted Fix Qwen2IntegrationTest codebase moved on: Qwen2 integration expectations now have per-backend cuda/rocm/xpu entries while PR changes the same expectations to a generic default; resolving would require choosing expected values across backends …
huggingface#45267 documentation skipped Add docstring to FFN.forward in DistilBERT not a defect fix
huggingface#45266 documentation skipped Add docstrings to AlbertMLMHead and AlbertSOPHead forward methods not a defect fix
huggingface#45262 documentation skipped doc: fix TokenizersBackend.convert_to_native_format docstring not a defect fix
huggingface#45261 other skipped empty not a defect fix
huggingface#45258 defect aborted Fix SmolVLM video processor resize using wrong interpolation after backend refactor codebase moved on: current SmolVLMVideoProcessor already has the resample signature and calls super().resize with resample, while PR adds manual PIL-to-torch interpolation mapping in the same block; resolving would requ…
huggingface#45256 defect aborted fix: skip qwen3_5_text checkpoint remap for nested VL language_model codebase moved on: current conversion_mapping now passes named module prefixes into extract_weight_conversions_for_model, while the PR changes older model.modules traversal and nested language_model detection; resolving…
huggingface#45254 other skipped Fix more integration tests for important models not a defect fix
huggingface#45251 defect aborted fix(generation): beam sample when num_beams * vocab_size exceeds multinomial limit codebase moved on: current generation utils already contains a multinomial-dimension workaround for beam sampling using top-k prefiltering, while the PR applies a different Gumbel-top-k fallback in the same _get_top_k_c…
huggingface#45248 defect aborted Fix tf32 issue: set explicitly. codebase moved on: conftest.py already contains the TF32 disablement block from this PR with an additional guard for torch.backends.cudnn.conv, causing a conflict on the same hasattr line; resolving would amount to choo…
huggingface#45244 other skipped Let's CI go great not a defect fix
huggingface#45243 other skipped Nvidia CI with torch 2.11 not a defect fix
huggingface#45241 other skipped Update tiny model creation script not a defect fix
huggingface#45235 feature skipped feat/rfc/poc: Agnostic GPU not a defect fix
huggingface#45233 feature skipped feat: make timesfm2_5 onnx export compatible not a defect fix
huggingface#45232 documentation skipped [docs] static model rules not a defect fix
huggingface#45228 defect aborted More fix for tiny model creation codebase moved on: conflict in generated auto image processor mapping; PR adds vivit mapping that is already present on cumulative branch while surrounding mapping entries differ, and remaining tiny-model/config changes…
huggingface#45227 defect validation_failed fix: remove nonexistent PILImageResampling import from video_processing_utils patch removing the PILImageResampling runtime import failed ruff with F821 (undefined name in a quoted annotation); the light validation environment was also missing the requests package after targeted test selection
huggingface#45220 feature skipped Multimodal serve support not a defect fix
huggingface#45219 feature skipped Add MoE to Gemma4 TP plan not a defect fix
huggingface#45218 feature skipped Proposal: Agent-first CLI not a defect fix
huggingface#45215 other skipped [Qwen3_5]Remove unnecessary masked_fill_ in torch_chunk_gated_delta_rule attention computation: "attn = (q_i @ k_i.transpose(-1, -2) * decay_mask[:, :, i]).masked_fill_(mask, 0)" not a defect fix
huggingface#45213 other skipped DO NOT MERGE - model creation skill not a defect fix
huggingface#45212 feature skipped musicflamingo: add test support for Intel XPU device not a defect fix
huggingface#45209 feature skipped nomic_bert: make the test suitable for general device. not a defect fix
huggingface#45207 documentation skipped [Gemma4] Add docstrings for Per-Layer Embeddings (PLE) pipeline not a defect fix
huggingface#45197 documentation skipped fix(docs): correct gemma4 docs and examples not a defect fix
huggingface#45196 documentation skipped [docs] formatting not a defect fix
huggingface#45195 feature skipped Use torchvision decode_image to load images in the torchvision backend not a defect fix
huggingface#45194 defect aborted Configuration insoncistencies codebase moved on: PR adds NougatConfig and config mapping cleanups, but current branch already has NougatConfig with newer auto_docstring checkpoint metadata; add/add conflict in generated/model config area, so not res…
huggingface#45192 feature skipped casually dropping the most capable open weights on the planet not a defect fix
huggingface#45191 other skipped Add edge case tests for out-of-range token id decoding in Qwen2 tokenizer not a defect fix
huggingface#45190 defect aborted Fix ty for transformers cli codebase moved on: PR updates CLI typing and ty include paths, but current cumulative branch has substantially changed serving CLI request/response/transcription utilities and check_types wiring; multiple content confli…
huggingface#45189 other skipped Add doc test CI workflow reusing existing model job infrastructure not a defect fix
huggingface#45187 defect aborted Close file handler codebase moved on: PR fixes base64 image temp-file handle handling in old BaseHandler parsing, but current branch rewrote multimodal/message parsing and tool-call support in the same block; applying the close-file chang…
huggingface#45186 feature skipped Add new model: Isaac not a defect fix
huggingface#45184 feature skipped [CB] [Major] Add CPU request offloading not a defect fix
huggingface#45181 feature skipped Make the cli a top-level package not a defect fix
huggingface#45180 other skipped 🔒 Pin GitHub Actions to commit SHAs not a defect fix
huggingface#45179 defect aborted [CB] Tweaks to update and minor fixes codebase moved on: PR changes continuous batching cache/memory handler indexing and paged tests, but current branch already has newer activation-peak memory accounting, optional empty read-index handling, CPU offload te…
huggingface#45176 feature skipped added efficietvitsam model to HF not a defect fix
huggingface#45174 documentation skipped [docs] transformers serve not a defect fix
huggingface#45172 feature skipped Add TopNSigmaLogitsWarper and top_n_sigma generation config support not a defect fix
huggingface#45170 defect aborted layrnorm -> layernorm codebase moved on: CLIP-like vision modeling files have been refactored since the PR; the typo-only rename conflicts with moved/duplicated CLIPVisionModel/forward blocks and would require regenerating copied/modular mod…
huggingface#45168 feature skipped Update min_lr and max_lr default values to better defaults not a defect fix
huggingface#45167 feature skipped Add anthropic style of function schema not a defect fix
huggingface#45159 documentation skipped Add Turkish documentation: Get Started section not a defect fix
huggingface#45158 documentation skipped Add Turkish (tr) translation for Get Started section not a defect fix
huggingface#45157 feature skipped [WIP] PrismML Bonsai model support not a defect fix
huggingface#45155 feature skipped Load adapter with TP not a defect fix
huggingface#45154 defect aborted Pretrained-config bug(45072/huggingfacebug) codebase moved on: PR rewrites PreTrainedConfig/auto_docstring type-validation code against an older layout, while current branch already uses dataclass/strict config fields, dtype_validator, dataclass_transform, and up…
huggingface#45153 feature skipped [FA] Native torch integration not a defect fix
huggingface#45152 documentation skipped [docs] model testing not a defect fix
huggingface#45150 documentation skipped Fix incorrect TrainingArguments example in training.md not a defect fix
huggingface#45149 feature skipped DO NOT MERGE adding SAML3-LiteText with a skill, first pass not a defect fix
huggingface#45148 feature skipped Allow for all layers in Qwen3.5 architecture to be Gated Deltanet. not a defect fix
huggingface#45144 feature skipped Add Xiaomi MiMo-V2 not a defect fix
huggingface#45143 feature skipped Add parse_response to Processor, make it a bit more official not a defect fix
huggingface#45142 other skipped refactor(gpt-oss): rename eager_attention_forward to eager_attention_forward_with_sink not a defect fix
huggingface#45139 defect aborted Fix vllm cis codebase moved on: PR applies a broad generated RoPE/cis update across modeling files and conflicts in CLIPSeg attention shape handling; resolving safely would require regenerating/checking the copied model updates rath…
huggingface#45135 defect aborted Fix model saving corruption for dynamically untied embeddings codebase moved on: PR overlaps the already-present tied-embedding save_pretrained fix in modeling_utils with a different implementation/restoration behavior; merging conflicts in the same block and safe resolution would…
huggingface#45134 feature skipped Optimize Parakeet feature extraction on CUDA not a defect fix
huggingface#45133 feature skipped Add sarvam model not a defect fix
huggingface#45132 defect validation_failed Fix: Remove double softmax in MoE router load-balancing loss (Mixtral, Qwen2MoE, Qwen3VLMoE) merge clean but validation failed: ruff format would reformat three added/modified MoE test files; light validation also unavailable as in baseline
huggingface#45130 documentation skipped [docs] @auto_docstring decorator not a defect fix
huggingface#45128 defect validation_failed Fix: handle future annotations in _process_kwargs_parameters merge clean but validation failed: ruff import ordering error in src/transformers/utils/auto_docstring.py; light validation unavailable as in baseline
huggingface#45126 defect aborted http retries on audio file downloads codebase moved on: PR adds generic retry/backoff and tests but conflicts with current audio video-container handling in audio_utils and newer generic utility tests/imports; resolving is small but would require manually …
huggingface#45122 defect aborted 🚨 [LightGlue] Remove remote code execution merge blocked by pre-existing untracked utils/mlinter/trf014.py in the cumulative worktree, which the PR also adds; not moving/removing existing workflow artifacts during this turn, and safely reconciling the linter sta…
huggingface#45121 defect aborted fix: remove unsafe exec() in serve.py codebase moved on: PR patches the old monolithic FastAPI /load_model implementation in src/transformers/cli/serve.py, but current branch has refactored serving into build_server/ModelManager handlers; safe security hand…
#45118 feature skipped Add full GGUF loading support for GPT‑OSS (fixes huggingface#43366, supersedes huggingface#43757) not a defect fix
#45116 feature skipped Add full GGUF loading support for GPT‑OSS (fixes huggingface#43366) not a defect fix
#45115 other skipped Refactor/nemotron h inherit granitemoehybrid not a defect fix
#45114 documentation skipped fix: lets fix all doctests not a defect fix
#45113 feature skipped Add GDS support for safetensors loading not a defect fix
#45112 feature skipped [CB] Add warmup feature not a defect fix
#45110 feature skipped Add SAM 3.1 not a defect fix
#45109 defect aborted Fix T5Attention shape mismatch under Tensor Parallelism codebase moved on: T5-family attention shape code now uses input_shape/hidden_shape generalization that overlaps the PR batch_size/seq_length view fix across copied/model files; resolving six generated/copy-related conf…
#45105 defect aborted Fix @auto_docstring crash with from future import annotations in _process_kwargs_parameters codebase moved on: auto_docstring kwargs annotation handling has overlapping guard changes in _process_kwargs_parameters; PR adds get_type_hints/string-resolution but conflicts with current args/name guard, so a…
#45104 defect aborted Fix auto_docstring crash with from future import annotations codebase moved on: same auto_docstring _process_kwargs_parameters area already has overlapping annotation guard changes; this closed PR conflicts with the newer guard and duplicate string-annotation fix area
#45101 feature skipped Adding support for Nandi Models not a defect fix
#45100 documentation skipped Update accelerator_selection.md not a defect fix
#45097 feature skipped Add old InternVL2-1B/2B support to the InternVL conversion script #45092 not a defect fix
#45096 defect aborted Fix: Skip meta device initialization for remote code models codebase moved on: modeling_utils initialization context now uses init.meta_device_safe_creation_ops to make meta-device construction safer, directly overlapping the PR change that conditionally skips torch.device("meta…
#45094 defect aborted fix: prefer registered config over remote code in AutoConfig.from_pretrained codebase moved on: AutoImageProcessor now resolves mapped processors through _load_class_with_fallback for backend-specific mappings and AutoTokenizer has newer explicit-local-code guards for TOKENIZER_MAPPING and missi…
#45087 defect validation_failed Fix PretrainedConfig type checking with mypy ruff validation failed after merge: src/transformers/configuration_utils.py references TYPE_CHECKING without importing it
#45082 feature skipped [VidEoMT] Update conversion script not a defect fix
#45076 feature skipped Osman-Level Innovations: Hardware-Aware Advisor & Selective Weight Surgery CLI not a defect fix
#45075 feature skipped Add Deepseek-OCR-2 model not a defect fix
#45073 other skipped Refactor OwlViT to modular Transformers not a defect fix
#45067 feature skipped feat: trainer resume_from_checkpoint support hub downloads (#43375) not a defect fix
#45066 feature skipped [PR] Unique Enhancement: Transformers Model Advisor & Legacy Cleanup not a defect fix
#45065 other skipped Remove unused TensorFlow env var not a defect fix
#45064 other skipped refactor: shard checkers not a defect fix
#45063 feature skipped CB improvements for serving not a defect fix
#45058 feature skipped Allow advanced users to override model_type in AutoConfig.from_pretrained not a defect fix
#45054 other skipped chore: update update_metdata.yml not a defect fix
#45052 defect aborted chore: Fix mlinter cache location codebase moved on: PR replaces old utils/.check_modeling_structure_cache.json ignore with utils/mlinter/.mlinter_cache.json, but current branch already passed through later checker/mlinter refactors and .gitignore now i…
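The per-PR statuses above are mirrored in `.mergeability/defect-merge-state.jsonl`. A minimal sketch for re-deriving the "Status counts" and "Category counts" summaries from such a file, assuming (this schema is not shown in the report and is a guess) that each JSONL record carries `pr`, `category`, and `status` fields:

```python
import json
from collections import Counter

# Hypothetical sample records; the real file's field names may differ.
sample_jsonl = """\
{"pr": 45494, "category": "defect", "status": "validation_failed"}
{"pr": 45493, "category": "feature", "status": "skipped"}
{"pr": 45463, "category": "defect", "status": "aborted"}
"""

def tally(jsonl_text):
    """Count status and category values across JSONL records."""
    status, category = Counter(), Counter()
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines between records
        rec = json.loads(line)
        status[rec["status"]] += 1
        category[rec["category"]] += 1
    return status, category

status_counts, category_counts = tally(sample_jsonl)
print(dict(status_counts))    # e.g. {'validation_failed': 1, 'skipped': 1, 'aborted': 1}
print(dict(category_counts))  # e.g. {'defect': 2, 'feature': 1}
```

On the real state file this would be `tally(open(".mergeability/defect-merge-state.jsonl").read())`, producing the same kind of breakdown as the summary in the PR description.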
