System Info
- transformers version: 4.44.2
- Platform: macOS-15.0-arm64-arm-64bit
- Python version: 3.12.6
- Huggingface_hub version: 0.24.7
- Safetensors version: 0.4.5
- Accelerate version: 0.34.2
- Accelerate config: not found
- PyTorch version (GPU?): 2.6.0.dev20240916 (False)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using distributed or parallel set-up in script?: No
Who can help?
@kamilakesbi @ArthurZucker @itazap
Information
Tasks
Reproduction
Hi, I am attempting to transcribe several audio files; however, the process intermittently encounters an exception with some of the files. The transcription works successfully in approximately 90% of the cases, but certain files trigger this exception unexpectedly. I am attaching one of the audio files that generates this exception for your review. Thank you.
- I was able to replicate it on macOS (CPU) and on Linux (CUDA).
1. Install stable-ts:
pip install stable-ts
2. Run the code:
import stable_whisper

model = stable_whisper.load_hf_whisper('medium')
result = model.transcribe(
    audio='radio_18596_1726554951_1726554981.mp3',
)
print(result.text)
Audio sample: https://filebin.net/hivqswoer298m65m
Then I receive the following exception:
Traceback (most recent call last):
File "/tests/test.py", line 4, in <module>
result = model.transcribe(
^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/stable_whisper/whisper_word_level/hf_whisper.py", line 236, in transcribe
return transcribe_any(
^^^^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/stable_whisper/non_whisper.py", line 342, in transcribe_any
result = inference_func(**inference_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/stable_whisper/whisper_word_level/hf_whisper.py", line 116, in _inner_transcribe
output = self._pipe(audio, **pipe_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/transformers/pipelines/automatic_speech_recognition.py", line 284, in __call__
return super().__call__(inputs, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/transformers/pipelines/base.py", line 1255, in __call__
return next(
^^^^^
File "/.venv/lib/python3.12/site-packages/transformers/pipelines/pt_utils.py", line 125, in __next__
processed = self.infer(item, **self.params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/transformers/pipelines/automatic_speech_recognition.py", line 587, in postprocess
text, optional = self.tokenizer._decode_asr(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/transformers/models/whisper/tokenization_whisper.py", line 835, in _decode_asr
return _decode_asr(
^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/transformers/models/whisper/tokenization_whisper.py", line 1086, in _decode_asr
resolved_tokens, resolved_token_timestamps = _find_longest_common_sequence(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/transformers/models/whisper/tokenization_whisper.py", line 1193, in _find_longest_common_sequence
matches = sum(
^^^^
File "/.venv/lib/python3.12/site-packages/transformers/models/whisper/tokenization_whisper.py", line 1198, in <genexpr>
and left_token_timestamp_sequence[left_start + idx]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: '<=' not supported between instances of 'NoneType' and 'float'
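The error message indicates that the left operand of the `<=` comparison is `None` while the right operand is a float, which suggests that at least one token timestamp in the left sequence is missing for this file. The snippet below is only a minimal sketch of that failure mode in plain Python; `timestamp_not_after` and `safe_timestamp_not_after` are hypothetical helpers written for illustration, not the code in `tokenization_whisper.py`, and the `None` guard is an assumption about one possible way to avoid the crash, not a confirmed fix.

```python
def timestamp_not_after(left_ts, right_ts):
    # Hypothetical stand-in for the comparison seen in the traceback.
    return left_ts <= right_ts


def safe_timestamp_not_after(left_ts, right_ts):
    # Assumption: treat a missing timestamp as "no match" instead of raising.
    if left_ts is None or right_ts is None:
        return False
    return left_ts <= right_ts


try:
    timestamp_not_after(None, 1.5)
except TypeError as exc:
    # Same message as the last line of the traceback above.
    error_message = str(exc)
    print(error_message)

print(safe_timestamp_not_after(None, 1.5))  # missing timestamp, no exception
print(safe_timestamp_not_after(1.0, 1.5))   # ordinary float comparison
```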
Expected behavior
To be able to transcribe the audio files without this exception.
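In the meantime, a batch run can be kept alive by catching the exception per file, so the files that do transcribe are not lost. This is a rough sketch of a workaround, not a fix: `transcribe_all` is a hypothetical helper, and the `transcribe` argument is assumed to be any callable that takes a path and returns text, for example `lambda p: model.transcribe(audio=p).text` with the model from the reproduction above.

```python
from typing import Callable, Dict, Iterable, Tuple


def transcribe_all(
    paths: Iterable[str],
    transcribe: Callable[[str], str],
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Transcribe each path, collecting failures instead of aborting the batch."""
    results: Dict[str, str] = {}
    failures: Dict[str, str] = {}
    for path in paths:
        try:
            results[path] = transcribe(path)
        except TypeError as exc:
            # Only the TypeError reported in this issue is caught here;
            # any other exception still propagates.
            failures[path] = str(exc)
    return results, failures
```

The `failures` dictionary maps each failing file to its error message, which makes it easier to collect further samples that trigger the problem.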