[Auto] Fix xdist captured_info collisions (cluster-45561-3): merged 1 of 2 PRs #38

Open
evalstate wants to merge 23 commits into main from merge-cluster-cluster-45561-3-20260427115403

Conversation

@evalstate
Owner

Cluster: cluster-45561-3
Base: origin/main
Branch: merge-cluster-cluster-45561-3-20260427115403

Merged PRs:

Skipped PRs:

Failed PRs:

  • None.

Notes:

Next steps:

remi-or and others added 23 commits April 23, 2026 11:34
* Fix KV dedup for decode batches

* Fix memory estimation

* Change default

* Added write-only fast path

* Take both peaks into account

* Revert unused config field

* Review 1

* Fix p1s

* Fix p2s and p3s that needed it

* Added a TODO

* Fix test, lower max cached graph, add TODO

* Fix fragmentation with big warmup

* Add more space for logits processors

* Fix

* Allow for registered experts from kernels hub

* remove deepgemm as that is also dynamic

* Apply repo consistency fixes

* Update src/transformers/modeling_utils.py

* Update src/transformers/modeling_utils.py

* Apply repo consistency fixes

* Apply suggestion from @IlyasMoutawwakil

* Apply repo consistency fixes

* get rid of triton dependency

* keep eager first

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
Co-authored-by: IlyasMoutawwakil <moutawwakil.ilyas.tsi@gmail.com>

Summary:

1. fix torchao NVFP4 serialization with transformers
2. add a test to cover the fix

While I'm here, I also bundled the following into this PR:
3. make the torchao serialization tests use human-readable names (easier
   to debug)
4. fix the float8 test (update the expected output)

After this PR, the test command for all torchao configs passes on an
NVIDIA B200.

Test Plan:

```
RUN_SLOW=1 pytest tests/quantization/torchao_integration/test_torchao.py -k "Serialization" -s
```
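Point 3 above (human-readable test names) can be sketched roughly as follows. This is an illustration only: the config class names are hypothetical stand-ins, and in a real test the generated ids would be passed to `@pytest.mark.parametrize(..., ids=...)`:

```python
# Sketch only: building human-readable pytest ids from quantization configs,
# so failures read like "test_serialization[NVFP4Config]" instead of
# "test_serialization[config7]". Class names are illustrative stand-ins.
class Float8WeightOnlyConfig:
    pass

class NVFP4Config:
    pass

def readable_id(config) -> str:
    """Use the config's class name as the test id."""
    return type(config).__name__

configs = [Float8WeightOnlyConfig(), NVFP4Config()]
ids = [readable_id(c) for c in configs]
# These strings would become the parametrized test ids.
```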

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* added sonic moe

* use lazy_load_kernel

* style

* use concatenated revision

* final touches

* fix

* merge conflict

* simpler naming

* style

* add sonicmoe test

* skip fp32 on sonic

* add transposed support

* fix

---------

Co-authored-by: vasqu <antonprogamer@gmail.com>

* qa: bumped mlinter and allow local override

* bump version

* Update utils/check_modeling_rules_doc.py

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

* license header

* license header

---------

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

…#45610)

* Fix missing conversion of experts

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Fix eager config attribute reading

Co-authored-by: Copilot <copilot@github.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add proper error when kernels isn't installed

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* remove unnecessary mapping

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* review comments

Co-authored-by: Copilot <copilot@github.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* remove double newline

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

---------

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Copilot <copilot@github.com>

…uggingface#45601)

* fix: compute auxiliary losses when denoising is disabled in D-FINE

* style: fix formatting

* test: add regression test for auxiliary losses when denoising is disabled

* test: fix num_labels config in auxiliary loss regression test

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* remove warnings

* fix

* revert

* revert useless

* move function outside

…ing path (huggingface#45582)

* generate: drop stale num_return_sequences warning on continuous batching path

The continuous-batching branch warned that num_return_sequences was
unsupported alongside num_beams, but generate_batch() already honors
generation_config.num_return_sequences when expanding requests.  The
warning fires for any run that explicitly sets num_return_sequences
even though the feature works, cluttering logs and misleading users.

Drop the num_return_sequences half of the warning; keep the num_beams
guard since beam search is still unsupported on the CB path.

Fixes huggingface#45563
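A minimal sketch of why the warning was stale: continuous batching can honor num_return_sequences simply by expanding each request N times before scheduling. The names below are illustrative, not the actual generate_batch() internals:

```python
from dataclasses import dataclass, replace

@dataclass
class Request:  # illustrative stand-in for a continuous-batching request
    request_id: str
    prompt: str

def expand_requests(requests, num_return_sequences):
    """Duplicate each request so N independent sequences are generated per
    prompt -- a simplified view of the expansion the CB path performs."""
    return [
        replace(req, request_id=f"{req.request_id}-{i}")
        for req in requests
        for i in range(num_return_sequences)
    ]

expanded = expand_requests([Request("r0", "hello")], 3)
```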

* Apply repo consistency fixes

---------

Co-authored-by: Joaquin Hui Gomez <joaquinhuigomez@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Rémi Ouazan <83456801+remi-or@users.noreply.github.com>

* chore(qa): split pipeline and add type checking

* added serving to quality

* fmt

* allow

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* circleci with torch 2.11

* circleci with torch 2.11

* circleci with torch 2.11

* circleci with torch 2.11

* circleci with torch 2.11

* circleci with torch 2.11

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

…th `num_labels=1` (huggingface#45611)

* Raise clear error for problem_type="single_label_classification" with num_labels=1

This combination is mathematically degenerate: applying cross-entropy loss to a
single logit always yields zero loss, so training silently accomplishes nothing.
Validate the combination in PreTrainedConfig.__post_init__ so users get a clear
error at config construction with a pointer to the correct setup (num_labels=2
for binary classification, or problem_type="regression" for a single-output
regression head).

Closes huggingface#45479
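The degeneracy described above is easy to verify numerically: with a single class, softmax always yields probability 1, so cross-entropy is exactly zero regardless of the logit value. A self-contained check (not the transformers code path):

```python
import math

def cross_entropy_single_logit(logit: float) -> float:
    """Cross-entropy over a 1-class output: softmax of a single logit is
    always [1.0], so the loss is -log(1.0) == 0 for every logit value."""
    prob = math.exp(logit) / math.exp(logit)  # one-element softmax == 1.0
    return -math.log(prob)

# Zero loss no matter what the model predicts -- training learns nothing.
losses = [cross_entropy_single_logit(x) for x in (-5.0, 0.0, 3.2)]
```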

* Update src/transformers/configuration_utils.py

* Update tests/utils/test_configuration_utils.py

* Update src/transformers/configuration_utils.py

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

…uggingface#45625)

Add supports_gradient_checkpointing to NemotronHPreTrainedModel

* Add output language to chunks

* Add output language to chunks

* Fix formatting

* Return full language instead of iso code

* revert changes (except test)

* correct fix

* fix

* values for runner

---------

Co-authored-by: Eustache Le Bihan <eulebihan@gmail.com>
Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>