do not index past decoded chars with special tokens #45435

Merged
itazap merged 3 commits into main from whisper_token_fix on Apr 22, 2026

Conversation

@itazap
Collaborator

@itazap itazap commented Apr 14, 2026

Fixes #44869.

Adds a check so that we do not index past the end of the decoded chars when special tokens are present.
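
For readers unfamiliar with this failure mode, here is a minimal, self-contained sketch of the kind of bounds check described above. It is not the diff from this PR: `decode` and `split_on_unicode` are hypothetical stand-ins that operate on raw byte chunks instead of a real tokenizer, and the way the full decode ends up shorter than the running offset (a special token missing from it) is an assumption made purely for illustration.

```python
REPLACEMENT_CHAR = "\ufffd"


def decode(chunks):
    """Toy stand-in for a tokenizer decode (hypothetical): join byte chunks
    and replace invalid or incomplete UTF-8 with U+FFFD."""
    return b"".join(chunks).decode("utf-8", errors="replace")


def split_on_unicode(chunks, decoded_full=None):
    """Group chunks into pieces that decode to complete characters.

    A U+FFFD in a partial decode can mean "incomplete multi-byte sequence"
    or "the text really contains U+FFFD". The two cases are told apart by
    looking at the same position in the full decode, which is only safe
    when that position actually exists.
    """
    if decoded_full is None:
        decoded_full = decode(chunks)

    words, current, offset = [], [], 0
    for chunk in chunks:
        current.append(chunk)
        decoded = decode(current)

        if REPLACEMENT_CHAR in decoded:
            pos = offset + decoded.index(REPLACEMENT_CHAR)
            # The guard: check the bound before indexing. Without
            # `pos < len(decoded_full)` this lookup can raise IndexError.
            genuine = pos < len(decoded_full) and decoded_full[pos] == REPLACEMENT_CHAR
            if not genuine:
                # Sketch-only choice: keep buffering until more bytes arrive.
                continue

        words.append(decoded)
        offset += len(decoded)
        current = []
    return words


# A character split across two chunks is reassembled.
print(split_on_unicode([b"hi", b" \xe4\xb8", b"\xad"]))

# Assumed mismatch: the full decode omits the leading special token, so the
# running offset overshoots it; the guard avoids indexing out of range.
chunks = [b"<|s|>", b"ok", b"\xe4"]
print(split_on_unicode(chunks, decoded_full=decode(chunks[1:])))
```

In this sketch an out-of-range position is simply treated as "not confirmed", so a trailing incomplete chunk stays buffered and is not emitted. Whether the merged change handles that case the same way is not stated in this conversation.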

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@itazap itazap requested review from ArthurZucker and eustlb April 15, 2026 13:09
Collaborator

@ArthurZucker ArthurZucker left a comment

sure!

Contributor

@eustlb eustlb left a comment

lgtm thanks @itazap

@eustlb eustlb force-pushed the whisper_token_fix branch 3 times, most recently from 6eeaadb to e493168 Compare April 22, 2026 09:38
itazap and others added 2 commits April 22, 2026 11:42
Co-authored-by: Krishnachaitanyakc <22275437+Krishnachaitanyakc@users.noreply.github.com>
@eustlb
Contributor

eustlb commented Apr 22, 2026

@itazap I took the liberty to add @Krishnachaitanyakc, because he had already addressed this in #45006 before 🤗
hope you don't mind

@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: whisper

@github-actions
Contributor

View the CircleCI Test Summary for this PR:

https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=45435&sha=798322

@itazap itazap added this pull request to the merge queue Apr 22, 2026
@itazap
Collaborator Author

itazap commented Apr 22, 2026

run-slow: whisper

@github-actions
Contributor

Workflow Run ⚙️

This comment contains run-slow, running the specified jobs:

models: ["models/whisper"]
quantizations: []

Merged via the queue into main with commit 74a2a4d Apr 22, 2026
29 of 30 checks passed
@itazap itazap deleted the whisper_token_fix branch April 22, 2026 17:33
@github-actions
Contributor

CI Results

Workflow Run ⚙️

Commit Info

| Context | Commit | Description |
| --- | --- | --- |
| RUN | 85be53bb | workflow commit (merge commit) |
| PR | 7983227a | branch commit (from PR) |
| main | 7187177f | base commit (on main) |

✅ No failing test specific to this PR 🎉 👏 !

Development

Successfully merging this pull request may close these issues.

Whisper word timestamp decode crashes on trailing replacement character at end of decoded token stream

4 participants