Skip to content

Reject unsupported js_repl image MIME types#19292

Merged
etraut-openai merged 1 commit intomainfrom
etraut/bad-image-type
Apr 24, 2026
Merged

Reject unsupported js_repl image MIME types#19292
etraut-openai merged 1 commit intomainfrom
etraut/bad-image-type

Conversation

@etraut-openai
Copy link
Copy Markdown
Collaborator

@etraut-openai etraut-openai commented Apr 24, 2026

Summary

codex.emitImage accepted arbitrary image MIME types for byte payloads and data URLs. That allowed a value like image/rgba to be wrapped as an input_image, even though it is not a supported encoded image format, so the invalid image could reach the model-input path and trigger output sanitization.

This results in a panic in debug builds because the output sanitization is meant as a final safety net, not a primary means of rejecting invalid image types. I've hit this case multiple times when executing certain long-running tasks.

This PR rejects unsupported image MIME types before they are emitted from js_repl.

Changes

  • Validate codex.emitImage({ bytes, mimeType }) in the JS kernel so only encoded PNG, JPEG, WebP, or GIF payloads are accepted.
  • Apply the same MIME allowlist to direct image data URLs, including the Rust host-side validation path.
  • Clarify the JS REPL instructions so agents know byte payloads must already be encoded as PNG/JPEG/WebP/GIF.

@etraut-openai etraut-openai requested a review from a team as a code owner April 24, 2026 05:50
Copy link
Copy Markdown
Collaborator

@bolinfest bolinfest left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't feel strongly about any of these comments, FYI!

return mediaType;
}

function validateEmitImageMimeType(mimeType) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would consider naming assertEmitImageMimeType() to underscore the fact that this will throw.

I'm so used to Rust: should we return something Result-like instead of throwing? I'm not familiar with how we handle error-handling in this code.

if (commaIndex < 0) {
throw new Error("codex.emitImage expected a valid image data URL");
}
const mediaType = dataUrl.slice(5, commaIndex).split(";")[0];
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this where the 5 comes from?

Suggested change
const mediaType = dataUrl.slice(5, commaIndex).split(";")[0];
const mediaType = dataUrl.slice('data:'.length, commaIndex).split(";")[0];

Comment on lines +1263 to +1272
if (
normalized !== "image/png" &&
normalized !== "image/jpeg" &&
normalized !== "image/webp" &&
normalized !== "image/gif"
) {
throw new Error(
"codex.emitImage only supports image/png, image/jpeg, image/webp, or image/gif",
);
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternative to the repeated !==?

Suggested change
if (
normalized !== "image/png" &&
normalized !== "image/jpeg" &&
normalized !== "image/webp" &&
normalized !== "image/gif"
) {
throw new Error(
"codex.emitImage only supports image/png, image/jpeg, image/webp, or image/gif",
);
}
const SUPPORTED_IMAGE_TYPES = [
"image/png",
"image/jpeg",
"image/webp",
"image/gif",
];
if (!SUPPORTED_IMAGE_TYPES.includes(normalized)) {
throw new Error(
`codex.emitImage only supports ${SUPPORTED_IMAGE_TYPES.join(", ")}`,
);
}

Comment on lines +1801 to +1806
if !image_url
.get(..5)
.is_some_and(|scheme| scheme.eq_ignore_ascii_case("data:"))
{
return Err("codex.emitImage only accepts data URLs".to_string());
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strip_prefix() to the rescue!

Suggested change
if !image_url
.get(..5)
.is_some_and(|scheme| scheme.eq_ignore_ascii_case("data:"))
{
return Err("codex.emitImage only accepts data URLs".to_string());
}
let data_payload = image_url.strip_prefix("data:").ok_or_else(|| anyhow!("expected data: URI but got `{image_url}`"))?;

Oh, but you need to support eq_ignore_ascii_case()? Is that a thing for URIs?

@etraut-openai etraut-openai force-pushed the etraut/bad-image-type branch from d75e908 to 871e02c Compare April 24, 2026 06:50
@etraut-openai etraut-openai merged commit ac8c9fc into main Apr 24, 2026
25 checks passed
@etraut-openai etraut-openai deleted the etraut/bad-image-type branch April 24, 2026 07:14
@github-actions github-actions Bot locked and limited conversation to collaborators Apr 24, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants