fix(testing): Fix Kyutai Speech-To-Text and LongCatFlash test failures on main CI #44695

Merged
Rocketknight1 merged 3 commits into huggingface:main from harshaljanjani:fix/kyutai-llava-longcat-test-failures
Apr 9, 2026

Conversation

@harshaljanjani
Contributor

@harshaljanjani harshaljanjani commented Mar 14, 2026

What does this PR do?

The following failing tests were identified and fixed in this PR:

  • Kyutai Speech-To-Text: The PR "[processors] Unbloating simple processors" refactored ProcessorMixin.__call__ to use explicit keyword parameters instead of accepting generic positional arguments, but the KyutaiSTT integration tests were still calling processor(samples) positionally, so the audio samples were bound to the images param.
  • LLaVA-OneVision (Removed from the scope of this PR; requires more information from maintainers 🟡): The PR "Load a tiny video to make CI faster" introduced local video file path mappings, but LlavaOnevision's setUpClass was still building paths to Big_Buck_Bunny_720_10s_10MB.mp4 and sample_demo_1.mp4 in the repo root.
  • LongCatFlash: The PR "[V5] Return a BatchEncoding dict from apply_chat_template by default again" changed apply_chat_template to return a BatchEncoding dict instead of a tensor. The test was passing this dict directly to model.generate and then tried to access .shape on it; this PR fixes that :)
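
To make the Kyutai Speech-To-Text item concrete, here is a minimal sketch of why a positional call can end up in the wrong slot. ToyProcessor is a hypothetical stand-in, not the real ProcessorMixin, and the parameter order shown (images first) is an assumption based on the description above rather than a copy of the library signature.

```python
class ToyProcessor:
    """Hypothetical stand-in for a processor whose __call__ takes named modalities."""

    def __call__(self, images=None, text=None, videos=None, audio=None):
        # Report which modality slots actually received data.
        received = {"images": images, "text": text, "videos": videos, "audio": audio}
        return {name: value for name, value in received.items() if value is not None}


processor = ToyProcessor()
samples = [0.0, 0.1, -0.2]  # placeholder audio samples (made-up values)

positional = processor(samples)     # first positional slot is `images` (assumed order)
keyword = processor(audio=samples)  # an explicit keyword reaches `audio`

print(sorted(positional))  # ['images']
print(sorted(keyword))     # ['audio']
```

Under these assumptions, passing the samples by keyword is the kind of change the test fix would need; the exact diff in this PR is not shown here.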

Note: The test still fails with an AssertionError. I'm not sure why, and it could be flaky, but the crash should be resolved :)
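
The LongCatFlash crash described above can be sketched without the real model. In this rough illustration, a plain dict stands in for the BatchEncoding returned by apply_chat_template, nested lists stand in for tensors, and toy_generate is a hypothetical placeholder for model.generate; none of this is the actual test code from the PR.

```python
def toy_generate(input_ids, attention_mask=None, max_new_tokens=2):
    """Hypothetical stand-in for model.generate: echoes the prompt plus dummy tokens."""
    return [row + [0] * max_new_tokens for row in input_ids]


# Stand-in for the dict-like object that the chat template call now returns.
inputs = {"input_ids": [[101, 7, 8, 102]], "attention_mask": [[1, 1, 1, 1]]}

# Old pattern: treating `inputs` as a tensor fails, since a dict has no .shape.
try:
    inputs.shape
    old_pattern_works = True
except AttributeError:
    old_pattern_works = False

# Adjusted pattern: unpack the dict and read the prompt length from input_ids.
outputs = toy_generate(**inputs, max_new_tokens=2)
prompt_len = len(inputs["input_ids"][0])
new_tokens = outputs[0][prompt_len:]

print(old_pattern_works)  # False
print(new_tokens)         # [0, 0]
```

Unpacking the dict and taking the prompt length from its input_ids entry avoids the crash in this sketch; it does not address the remaining AssertionError mentioned in the note.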

cc: @Rocketknight1 @zucchini-nlp

CI Failures

Before the fix (feel free to cross-check; these errors are reproducible):

image

After the fix (feel free to cross-check):

image

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you fix any necessary existing tests?

@harshaljanjani harshaljanjani marked this pull request as ready for review March 14, 2026 09:12
@github-actions github-actions Bot requested a review from ydshieh March 14, 2026 09:12
@harshaljanjani
Contributor Author

cc: @Rocketknight1 @zucchini-nlp Just a gentle ping :)

Member

@zucchini-nlp zucchini-nlp left a comment

Left one comment; otherwise LGTM.

Comment on lines -62 to -65
local_videos = [
    os.path.join(repo_root, "Big_Buck_Bunny_720_10s_10MB.mp4"),
    os.path.join(repo_root, "sample_demo_1.mp4"),
]
Member

We also load local images above from the same root path. IIRC we made sure these artifacts are cached when loading from the Hub, so I don't know why they are being created here.

cc @ydshieh for this

Contributor Author

++, I didn't notice this previously but now that I've read into it a bit more, I guess even this image creation block isn't needed either for the same reason (ref: 05c0e1d); just the local_tiny_video logic staying intact should suffice but I'd love to know if I'm missing something.

Member

Would love to hear from YihDar, since I'm not sure whether we are supposed to do it like this instead of letting the files be fetched from the Hub cache.

@harshaljanjani
Contributor Author

Good day @ydshieh,
I'm checking in to see if I might ask for some clarification regarding this thread at your convenience; thank you!

@harshaljanjani harshaljanjani changed the title from "fix(testing): Fix Kyutai Speech-To-Text, LLaVA-OneVision, and LongCatFlash test failures on main CI" to "fix(testing): Fix Kyutai Speech-To-Text and LongCatFlash test failures on main CI" Apr 2, 2026
@github-actions
Contributor

github-actions Bot commented Apr 2, 2026

[For maintainers] Suggested jobs to run (before merge)

run-slow: kyutai_speech_to_text, longcat_flash

@harshaljanjani
Contributor Author

Good day @zucchini-nlp!
Was wondering if we could remove the LLaVA-OneVision fix from the scope of this PR to unblock it. Once we receive updates on that PR, I'll raise another one for LLaVA-OneVision with the new information in mind. Please let me know if that works :)

Member

@zucchini-nlp zucchini-nlp left a comment

OK, let's merge without LLaVA.

@zucchini-nlp zucchini-nlp enabled auto-merge April 2, 2026 14:42
@zucchini-nlp zucchini-nlp added this pull request to the merge queue Apr 2, 2026
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to no response for status checks Apr 2, 2026
@zucchini-nlp zucchini-nlp added this pull request to the merge queue Apr 2, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to no response for status checks Apr 2, 2026
@harshaljanjani
Contributor Author

@zucchini-nlp Just bumping this up since it was dequeued a couple of times :)

@Rocketknight1 Rocketknight1 added this pull request to the merge queue Apr 9, 2026
Merged via the queue into huggingface:main with commit 655707b Apr 9, 2026
22 checks passed
@harshaljanjani harshaljanjani deleted the fix/kyutai-llava-longcat-test-failures branch April 9, 2026 15:41
sirzechs66 pushed a commit to sirzechs66/transformers that referenced this pull request Apr 18, 2026
…s on main CI (huggingface#44695)

* fix: Fix KyutaiSTT, LlavaOnevision, and LongcatFlash test failures on main

* revert: Remove LLaVA-OneVision change out of scope