Whisper models in Foundry Local SDK: v1 models return empty text, v2 models only transcribe first 30 seconds

Hi,

I’m trying to use the Whisper-CPU models through the Foundry Local SDK, but I’m encountering some unexpected behavior.
There appear to be two model versions: v1 and v2.
For v1 models (e.g., base, small, medium), the API consistently returns empty text, even for audio files that produce correct transcriptions when using whisper-tiny.
For v2 models (e.g., tiny, large), the transcription is returned, but only for the first ~30 seconds of the audio. The rest of the audio is not transcribed.
Because of this, I’m currently unable to get a full transcription using the available models.

Could you please clarify:
1. Is this a known issue with these models in the Foundry Local SDK?
2. Is there a recommended configuration or workaround to obtain full transcriptions?
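
One possible workaround for the 30-second limit, sketched here under assumptions rather than confirmed against the SDK: Whisper processes audio in 30-second windows, so the file could be split into chunks of at most 30 seconds, each chunk transcribed separately, and the results joined. The sketch below uses only Python's standard `wave` module and assumes uncompressed WAV input. `transcribe_chunk` is a hypothetical placeholder for whatever call sends audio bytes to the model; it is not a Foundry Local SDK function.

```python
import io
import wave
from typing import Callable, Iterator

CHUNK_SECONDS = 30  # Whisper's native window length


def split_wav(source, chunk_seconds: int = CHUNK_SECONDS) -> Iterator[bytes]:
    """Yield consecutive chunks of a WAV file, each as complete WAV bytes.

    `source` is a file path or a binary file-like object.
    """
    with wave.open(source, "rb") as src:
        channels = src.getnchannels()
        sample_width = src.getsampwidth()
        frame_rate = src.getframerate()
        frames_per_chunk = frame_rate * chunk_seconds
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break
            # Wrap the raw frames in their own WAV header so each chunk
            # is a valid standalone audio file.
            buf = io.BytesIO()
            with wave.open(buf, "wb") as dst:
                dst.setnchannels(channels)
                dst.setsampwidth(sample_width)
                dst.setframerate(frame_rate)
                dst.writeframes(frames)
            yield buf.getvalue()


def transcribe_long(
    source,
    transcribe_chunk: Callable[[bytes], str],  # hypothetical: WAV bytes -> text
    chunk_seconds: int = CHUNK_SECONDS,
) -> str:
    """Transcribe each chunk in order and join the non-empty results."""
    parts = [transcribe_chunk(c).strip() for c in split_wav(source, chunk_seconds)]
    return " ".join(p for p in parts if p)
```

Cutting at fixed 30-second boundaries can split a word in half, so a real implementation would probably cut at silences or overlap adjacent chunks slightly. This also does nothing for the v1 models that return empty text.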

Thank you!

Whisper models in Foundry Local SDK: v1 models return empty text, v2 models only transcribe first 30 seconds #517

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Whisper models in Foundry Local SDK: v1 models return empty text, v2 models only transcribe first 30 seconds #517

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions