fix: silently drop unknown prediction inputs instead of rejecting them #2943

Merged
michaeldwan merged 4 commits into main from md/allow-unknown-inputs
Apr 15, 2026

Conversation

@michaeldwan
Member

Coglet was rejecting predictions that contain input fields the model doesn't recognize -- e.g. a client sends guidance_scale to a model that never had that input. This broke backwards compatibility with Replicate's API, which has historically ignored unknown inputs. Models upgraded to the new Cog started failing with:

{"detail":[{"loc":["body","input","guidance_scale"],"msg":"Unexpected field 'guidance_scale'","type":"value_error.extra"}]}

The root cause was additionalProperties: false being injected into the JSON Schema validator at startup. I removed that and added a strip_unknown step that silently drops unrecognized fields before validation. Stripped fields are logged at warn level with the prediction ID.

Required field validation, type checking, and training input fallback all still work as before.
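
A minimal sketch of the strip-then-validate order described above, not the code from this PR. It assumes a std-only string map in place of the JSON input, and the `Schema` struct, `Input` alias, and error strings are hypothetical names introduced for illustration.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical stand-in for a JSON input object (field name -> raw value).
/// The PR works on JSON values; a string map keeps this sketch std-only.
type Input = BTreeMap<String, String>;

/// Hypothetical schema summary: which fields exist and which are required.
struct Schema {
    known: BTreeSet<&'static str>,
    required: BTreeSet<&'static str>,
}

/// Remove every field the schema does not declare and return the removed
/// names (in key order) so the caller can log them.
fn strip_unknown(schema: &Schema, input: &mut Input) -> Vec<String> {
    let mut stripped = Vec::new();
    input.retain(|name, _value| {
        let keep = schema.known.contains(name.as_str());
        if !keep {
            stripped.push(name.clone());
        }
        keep
    });
    stripped
}

/// Validation still runs after stripping: a missing required field is an error.
fn validate(schema: &Schema, input: &Input) -> Result<(), Vec<String>> {
    let missing: Vec<String> = schema
        .required
        .iter()
        .filter(|name| !input.contains_key(**name))
        .map(|name| format!("Missing required field '{}'", name))
        .collect();
    if missing.is_empty() {
        Ok(())
    } else {
        Err(missing)
    }
}
```

Calling `strip_unknown` first and `validate` second means an extra field such as guidance_scale is dropped and reported back for logging, while a request that omits a required field is still rejected.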

@ask-bonk
Contributor

ask-bonk bot commented Apr 14, 2026

@michaeldwan Bonk workflow was cancelled.

View workflow run · To retry, trigger Bonk again.

Coglet was injecting additionalProperties: false into the JSON Schema
validator, causing 422 errors for any input field not in the model's
schema. This broke backwards compatibility -- Replicate's API has
historically ignored unknown inputs, and deployed models rely on that.

Strip unknown fields before validation instead. Stripped field names are
logged at warn level for observability.
@michaeldwan michaeldwan force-pushed the md/allow-unknown-inputs branch from e3a902f to c5fc127 Compare April 15, 2026 15:38
Contributor

@ask-bonk ask-bonk bot left a comment


This PR fixes a backward compatibility issue where Coglet was rejecting predictions with unknown input fields. The fix strips unknown fields before validation (matching Replicate's historical API behavior) and logs them at warn level.

Medium: Race condition in training input validation

In service.rs, strip_and_validate_train_input drops the read lock on train_validator before calling strip_and_validate_input, creating a TOCTOU race. If the schema is updated between these two calls, the second validation could use a different validator.

Comment on lines +362 to +365
/// Strip unknown fields from training input and validate in one pass.
///
/// Falls back to the predict validator if no training schema is present.
pub async fn validate_train_input(
pub async fn strip_and_validate_train_input(

Race condition: This method drops the read lock on train_validator before calling strip_and_validate_input. Between dropping the guard and acquiring a new one in strip_and_validate_input, the schema could change, causing the second validation to potentially use a different validator.

Consider restructuring to hold both locks or perform both operations under a single lock acquisition.

Suggested change
/// Strip unknown fields from training input and validate in one pass.
///
/// Falls back to the predict validator if no training schema is present.
pub async fn strip_and_validate_train_input(
    &self,
    input: &mut serde_json::Value,
) -> (
    Vec<String>,
    Result<(), Vec<crate::input_validation::ValidationError>>,
) {
    let train_guard = self.train_validator.read().await;
    if let Some(ref validator) = *train_guard {
        let stripped = validator.strip_unknown(input);
        let result = validator.validate(input);
        return (stripped, result);
    }
    drop(train_guard);
    // Try the predict validator as fallback
    let predict_guard = self.input_validator.read().await;
    if let Some(ref validator) = *predict_guard {
        let stripped = validator.strip_unknown(input);
        let result = validator.validate(input);
        return (stripped, result);
    }
    (Vec::new(), Ok(()))
}


suggestion seems legit

@ask-bonk
Contributor

ask-bonk bot commented Apr 15, 2026

Review posted to PR #2943. I identified one actionable issue: a race condition in the training input validation fallback path where the lock is dropped before the second validation attempt.

github run

Inline the predict-validator fallback instead of re-entering through
strip_and_validate_input. The old code dropped the train_validator lock
before acquiring the input_validator lock, allowing a schema swap between
the two acquisitions.
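
The inlined fallback can be sketched as follows, under stated assumptions: this uses the blocking `std::sync::RwLock` instead of the async lock in service.rs, and the `Validator`, `Service`, and `Input` types here are simplified, hypothetical stand-ins. The point it illustrates is that strip and validate for a given validator both run under one read guard, so both steps see the same validator.

```rust
use std::collections::BTreeMap;
use std::sync::RwLock;

/// Hypothetical stand-in for a JSON input object.
type Input = BTreeMap<String, String>;

/// Hypothetical validator: declared field names, some of them required.
struct Validator {
    known: Vec<String>,
    required: Vec<String>,
}

impl Validator {
    /// Drop undeclared fields and return their names.
    fn strip_unknown(&self, input: &mut Input) -> Vec<String> {
        let unknown: Vec<String> = input
            .keys()
            .filter(|name| !self.known.contains(*name))
            .cloned()
            .collect();
        for name in &unknown {
            input.remove(name);
        }
        unknown
    }

    /// Report the names of required fields that are absent.
    fn validate(&self, input: &Input) -> Result<(), Vec<String>> {
        let missing: Vec<String> = self
            .required
            .iter()
            .filter(|name| !input.contains_key(*name))
            .cloned()
            .collect();
        if missing.is_empty() {
            Ok(())
        } else {
            Err(missing)
        }
    }
}

struct Service {
    train_validator: RwLock<Option<Validator>>,
    input_validator: RwLock<Option<Validator>>,
}

impl Service {
    fn strip_and_validate_train_input(
        &self,
        input: &mut Input,
    ) -> (Vec<String>, Result<(), Vec<String>>) {
        // Strip and validate under the same guard (poisoning ignored for brevity).
        let train_guard = self.train_validator.read().unwrap();
        if let Some(validator) = train_guard.as_ref() {
            let stripped = validator.strip_unknown(input);
            let result = validator.validate(input);
            return (stripped, result);
        }
        drop(train_guard);

        // Fallback handled inline rather than by calling another locking method.
        let predict_guard = self.input_validator.read().unwrap();
        if let Some(validator) = predict_guard.as_ref() {
            let stripped = validator.strip_unknown(input);
            let result = validator.validate(input);
            return (stripped, result);
        }
        (Vec::new(), Ok(()))
    }
}
```

With no training validator installed, the call falls through to the predict validator; once a training validator is set, that one is used for both steps.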

@ask-bonk ask-bonk bot left a comment


Comment thread crates/coglet/src/service.rs

@ask-bonk
Contributor

ask-bonk bot commented Apr 15, 2026

LGTM

github run

@michaeldwan michaeldwan marked this pull request as ready for review April 15, 2026 16:06
@michaeldwan michaeldwan requested a review from a team as a code owner April 15, 2026 16:06
@michaeldwan michaeldwan added this pull request to the merge queue Apr 15, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 15, 2026
@michaeldwan michaeldwan added this pull request to the merge queue Apr 15, 2026
Merged via the queue into main with commit 7e8cb4a Apr 15, 2026
40 checks passed
@michaeldwan michaeldwan deleted the md/allow-unknown-inputs branch April 15, 2026 19:27