Whisper: Support ORT 1.17.1 #1016

Merged
jambayk merged 4 commits into main from jambayk/whisper-1.17.1 on Mar 14, 2024
Conversation

@jambayk (Contributor) commented Mar 13, 2024

Describe your changes

ORT 1.17.1 made some changes to the beam search ops: from version 1.17.1 on, the beam search op used is WhisperBeamSearch.

Fixes:
#1014
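The version gate described above could be sketched as follows. This is an illustrative helper, not Olive's actual code; the fallback op name BeamSearch (the generic op used before the switch) is an assumption:

```python
def beam_search_op_type(ort_version: str) -> str:
    # Hypothetical helper (not Olive's actual code): pick the beam search
    # op name from the ONNX Runtime version string. Per this PR, ORT
    # 1.17.1 and later use the WhisperBeamSearch contrib op.
    parts = tuple(int(p) for p in ort_version.split(".")[:3])
    return "WhisperBeamSearch" if parts >= (1, 17, 1) else "BeamSearch"
```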

Checklist before requesting a review

  • Add unit tests for this change.
  • Make sure all tests can pass.
  • Update documents if necessary.
  • Lint and apply fixes to your code by running lintrunner -a
  • Is this a user-facing change? If yes, give a description of this change to be included in the release notes.
  • Is this PR including examples changes? If yes, please remember to update example documentation in a follow-up PR.

@xiaoyu-work (Collaborator):
I'm not sure this is the root cause, because I'm able to run test_transcription.py with ORT 1.17.1 successfully without --predict_timestamps. Do you know the reason?

@jambayk (Contributor, Author) commented Mar 13, 2024

@xiaoyu-work You can see in microsoft/onnxruntime#19509 that some token ids are now part of the model attributes and are no longer inferred using offsets. This affects the logits processor that does the timestamp prediction. I think the lack of these values in the beam search node's attributes causes the issue.
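A rough sketch of the change being described: token ids become explicit attributes on the beam search node instead of being derived from vocabulary offsets. The attribute names and id values below are illustrative assumptions (ids correspond to the multilingual Whisper tokenizer), not the exact set added in microsoft/onnxruntime#19509:

```python
# Illustrative token ids for the multilingual Whisper tokenizer; the
# attribute names are assumptions, not the exact ones from
# microsoft/onnxruntime#19509.
WHISPER_TOKEN_ATTRIBUTES = {
    "decoder_start_token_id": 50258,   # <|startoftranscript|>
    "eos_token_id": 50257,             # <|endoftext|>
    "no_timestamps_token_id": 50363,   # <|notimestamps|>
}


def add_token_attributes(node_attrs: dict) -> dict:
    # Merge the explicit token-id attributes into a beam search node's
    # attribute dict, so the logits processor no longer has to infer
    # them from offsets.
    merged = dict(node_attrs)
    merged.update(WHISPER_TOKEN_ATTRIBUTES)
    return merged
```

Without these attributes on the node, the timestamp logits processor has no way to locate the special tokens, which matches the failure seen with --predict_timestamps.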

@xiaoyu-work (Collaborator):

Can you test if this change works?

@jambayk (Contributor, Author) commented Mar 13, 2024

It's already tested.

xiaoyu-work previously approved these changes Mar 13, 2024
xiaoyu-work previously approved these changes Mar 14, 2024
@jambayk jambayk merged commit 7aa9d1b into main Mar 14, 2024
@jambayk jambayk deleted the jambayk/whisper-1.17.1 branch March 14, 2024 05:29