chore(typing): extend typing to src/transformers/cli #44566

tarekziade wants to merge 3 commits into main.

Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

[For maintainers] Suggested jobs to run (before merge): run-slow: fbgemm_fp8, finegrained_fp8, gptq, higgs, hqq, metal, mxfp4, sinq
force-pushed from 524e015 to 5f7993f
vasqu left a comment:
Some initial comments from my side: I think we need a better workaround, or else we will encounter these issues across the whole codebase again. Just my intuition/impression; it may well not be that bad.
```diff
 elif pt_hpu_available and hasattr(torch, "hpu"):
     info["Using HPU in script?"] = "<fill in>"
     info["HPU type"] = torch.hpu.get_device_name()
-elif pt_npu_available:
+elif pt_npu_available and hasattr(torch, "npu"):
```
Would we not need this for all devices? Do CUDA and XPU not need it?
Like for safetensors, it depends on how torch has declared its types and also on whether the API exists in all supported versions. On our side, the safest bet is to assume it's not there and always check for it.
Here, this was an automated change driven by failures, but we should do this for every `torch.something` access.
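A minimal sketch of the defensive pattern being discussed: never assume an optional torch backend namespace (`torch.hpu`, `torch.npu`, ...) exists, and guard every access with `hasattr`. The `fake_torch` stand-in object below is purely illustrative so the sketch runs without torch installed; it is not the real torch API.

```python
from types import SimpleNamespace

# Stand-in for the torch module: only a `cuda` namespace is declared,
# mimicking a build where optional backends (hpu, npu, ...) are absent.
fake_torch = SimpleNamespace(cuda=SimpleNamespace(is_available=lambda: False))

info = {}
if hasattr(fake_torch, "hpu"):
    # Only reached when the backend namespace actually exists.
    info["HPU type"] = fake_torch.hpu.get_device_name()
else:
    # The guard falls through instead of raising AttributeError.
    info["HPU type"] = "not available"

print(info["HPU type"])  # -> not available
```

The same guard would apply uniformly to every optional `torch.something` namespace, which is the "do this for all devices" point raised above.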
```python
cb_manager = self.running_continuous_batching_manager
if cb_manager is None:
    raise RuntimeError("Continuous batching manager failed to initialize")
```
Suggested change:

```diff
-cb_manager = self.running_continuous_batching_manager
-if cb_manager is None:
-    raise RuntimeError("Continuous batching manager failed to initialize")
+if self.running_continuous_batching_manager is None:
+    raise RuntimeError("Continuous batching manager failed to initialize")
```
I guess this is needed, but just double-checking: do we need the local-variable conversion?
The root issue is that this function contains three sub-functions: the variable is narrowed to `ContinuousBatchingManager | None`, and nested closures can't benefit from the narrowing guard.
The real problem is how large and complex that function is; the real fix is to refactor it, but I was trying to minimize the diff. Happy to do it, though, if you think that's in scope for this patch.
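A small self-contained sketch of the narrowing issue described above (class and function names are illustrative, not the actual transformers code): a `None` guard narrows the type in the outer scope, but type checkers do not carry that narrowing into nested closures, since the variable could be reassigned before the closure runs. Binding to a fresh local captures the narrowed type.

```python
from typing import Optional


class Manager:
    def step(self) -> str:
        return "ok"


def serve(manager: Optional[Manager]) -> str:
    if manager is None:
        raise RuntimeError("manager failed to initialize")
    # A fresh local binding: its inferred type is `Manager`, and closures
    # can rely on it.
    cb_manager = manager

    def handler() -> str:
        # Referencing `manager` directly here would still be
        # `Manager | None` for the type checker; `cb_manager` is not.
        return cb_manager.step()

    return handler()


print(serve(Manager()))  # -> ok
```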
Gotcha, no worries, I think we can keep it as is for now. But cc'ing @remi-or for CB viz.
force-pushed from 7bbc1ae to b1104ee
vasqu left a comment:
Looks already better to me 🤗, just a few smaller comments. IMO, the biggest point remains how we handle the batch encoding; this is very core, and I think we should take our time here.
```python
inputs = processor.apply_chat_template(
    req["messages"], return_tensors="pt", add_generation_prompt=True, return_dict=True
).to(model.device)["input_ids"][0]
chat_inputs = require_batch_encoding(
```
Opening this re #44566 (comment) because I'm lazy and want everything collected in the same review :D
How much of the 3.10 lifecycle is left 🤔 I think it might be worth it. The usage in other places of the codebase should be similar, so I feel we'll hit this sooner or later.
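For context, a hedged sketch of what a `require_batch_encoding`-style helper could look like: a small runtime check that raises on unexpected types and, as a side effect, narrows the return type for the type checker. The `BatchEncoding` class below is a stand-in dict subclass; the real one lives in `transformers`, and this is not necessarily how the PR implements it.

```python
from typing import Any


class BatchEncoding(dict):
    """Stand-in for transformers' BatchEncoding, for illustration only."""


def require_batch_encoding(value: Any) -> BatchEncoding:
    # Runtime guard that doubles as a typing aid: after this call the
    # checker knows the result is a BatchEncoding, not a union type.
    if not isinstance(value, BatchEncoding):
        raise TypeError(f"expected BatchEncoding, got {type(value).__name__}")
    return value


enc = require_batch_encoding(BatchEncoding(input_ids=[[1, 2, 3]]))
print(enc["input_ids"])  # -> [[1, 2, 3]]
```

The design trade-off is the one debated in the thread: a helper like this adds a call at every use site, but it keeps the checked invariant explicit instead of sprinkling casts.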
```python
    "Using `fp_quant` with real quantization requires a **Blackwell GPU** and qutlass: `git clone https://github.com/IST-DASLab/qutlass.git && cd qutlass && pip install --no-build-isolation .`. You can use `FPQuantConfig(pseudoquantization=True, ...)` to use Triton-based pseudo-quantization. It doesn't provide any speedups but emulates the quantization behavior of the real quantization."
)


if (
```
ewww, thanks, looks like a bad rebase
…e CB context manager

Refactors `src/transformers/cli/serve.py` to reduce nesting depth, eliminate code duplication, and improve maintainability. No behavioral changes; the public API is unchanged. This change is motivated by discussion in #44566, where type checking was made more complex by the current code architecture.
Argh, it changed a lot on main 😭 Do we want to close/draft this one for now, @tarekziade?
Add type declarations for mixin host-class attributes on `GenerationMixin`, class-level annotations for dynamically set attributes on `GenerationConfig`, and fix minor typing issues in `candidate_generator`, `watermarking`, and `stopping_criteria`. Create a `_typing.py` Protocol for documentation/reuse.
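A minimal sketch of the kind of `Protocol` a `_typing.py` module could hold to document the attributes a mixin expects from its host class. All names here (`GenerationHost`, `main_input_name`, `DummyModel`) are illustrative assumptions, not the actual transformers definitions.

```python
from typing import Protocol


class GenerationHost(Protocol):
    """Attributes and methods a generation mixin assumes its host provides."""

    main_input_name: str

    def can_generate(self) -> bool: ...


class DummyModel:
    # Structurally satisfies GenerationHost without inheriting from it.
    main_input_name = "input_ids"

    def can_generate(self) -> bool:
        return True


def describe(host: GenerationHost) -> str:
    # Any structurally-conforming class type-checks here.
    return f"{host.main_input_name}:{host.can_generate()}"


print(describe(DummyModel()))  # -> input_ids:True
```

A Protocol like this documents the mixin/host contract in one place, so a type checker can flag a host class that forgets a dynamically set attribute.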
force-pushed from 0ea408b to 0adcb07
View the CircleCI Test Summary for this PR: https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=44566&sha=7606ab

Refactoring too complex; will cherry-pick in a new PR.
This patch extends type checking to `src/transformers/cli`. Based on #44412.