Fix TTS output text: audio not image #28133
Merged
elithrar merged 2 commits into production on Feb 13, 2026
Co-authored-by: elithrar <elithrar@users.noreply.github.com>
This pull request requires reviews from CODEOWNERS as it changes files that match the following patterns:
Preview URL: https://261836d6.preview.developers.cloudflare.com
/bonk fix the syntax issues
The two type errors are fixed by adding
mchenco approved these changes Feb 6, 2026
kodster28 approved these changes Feb 13, 2026
For `melotts`, the output is a `oneOf` schema, so `model.schema.output.format` is `undefined` (not `"binary"`), meaning it falls through to the `SchemaViewer` component. This is fine: the issue only affects models where the top-level output has `format: "binary"`.

The image models with binary output have `contentType` values like `"image/png"` or `"image/jpeg"`, which don't start with `"audio/"`, so they'll still show the original "image in JPEG or PNG format" text.

Here's a summary of the change:

Problem: The model page template at `src/pages/workers-ai/models/[name].astro:304` had a hardcoded description for all binary output models saying "The binding returns a `ReadableStream` with the image in JPEG or PNG format". This is incorrect for text-to-speech models like `@cf/deepgram/aura-1`, `@cf/deepgram/aura-2-en`, and `@cf/deepgram/aura-2-es`, which output audio, not images.

Fix: The template now checks the model's output `contentType`. If it starts with `"audio/"`, it displays "audio in MPEG format" (derived from the actual content type). Otherwise, it falls back to the existing "image in JPEG or PNG format" text. This correctly handles both image generation models and TTS models.

Closes #25780
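
For reviewers who want to see the branching in one place, here is a minimal TypeScript sketch of the check described above. The `OutputSchema` interface and the `describeBinaryOutput` function are hypothetical names introduced for illustration and are not taken from the actual `.astro` template. The sketch also hardcodes the "audio in MPEG format" wording, whereas the real change reportedly derives it from the content type.

```typescript
// Hypothetical, simplified shape of a model's top-level output schema.
// The real schema objects likely carry more fields than this.
interface OutputSchema {
  format?: string;
  contentType?: string;
  oneOf?: unknown[];
}

// Returns the description for binary outputs, or null when the page
// would fall through to the schema viewer (for example, a oneOf schema
// where the top-level format is undefined).
function describeBinaryOutput(output: OutputSchema): string | null {
  if (output.format !== "binary") {
    return null;
  }
  // Audio content types (TTS models) get the audio wording;
  // everything else keeps the original image wording.
  const isAudio = output.contentType?.startsWith("audio/") ?? false;
  const payload = isAudio
    ? "audio in MPEG format"
    : "image in JPEG or PNG format";
  return `The binding returns a ReadableStream with the ${payload}`;
}

// Usage with made-up schema objects:
console.log(describeBinaryOutput({ format: "binary", contentType: "audio/mpeg" }));
console.log(describeBinaryOutput({ format: "binary", contentType: "image/png" }));
console.log(describeBinaryOutput({ oneOf: [] }));
```

A missing `contentType` falls back to the image wording in this sketch, which matches the "otherwise" branch described in the fix.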