Centralize model catalog aliases #183
Walkthrough

Model matching changed from prefix-based heuristics to an explicit per-model registry with metadata and aliases. Alias resolution and canonicalization are used to validate/translate requested completion models (only when `api_listed && enabled`). New model-list endpoints/builders and request rewrite/billing updates were added.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant Server
    participant Resolver as Model Resolver
    participant Registry as Model Registry
    participant Upstream as Upstream Provider
    Client->>Server: POST /v1/chat/completions { model: "auto:quick", ... }
    Server->>Resolver: resolve_completion_model_id("auto:quick")
    Resolver->>Registry: lookup alias/legacy -> canonical provider id
    Registry-->>Resolver: "llama-3-70b" (api_listed && enabled)
    Resolver-->>Server: "llama-3-70b"
    alt Resolved & allowed
        Server->>Server: set outgoing_body["model"]="llama-3-70b"
        Server->>Server: billing_context.model_name="llama-3-70b"
        Server->>Upstream: POST ... { model:"llama-3-70b", ... }
        Upstream-->>Server: response
        Server-->>Client: encrypted proxied response
    else Not listed/enabled or unsupported
        Server-->>Client: ApiError::BadRequest
    end
```
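The resolution path in the diagram can be sketched in Rust roughly as follows. This is a hypothetical, self-contained model of the walkthrough's description — the `ModelEntry`/`AliasEntry` shapes and the `resolve_completion_model_id` signature are assumptions, not the PR's actual code:

```rust
// Assumed shapes modeled on the walkthrough; the real registry in
// src/model_config.rs carries more metadata than shown here.
struct ModelEntry {
    id: &'static str,
    api_listed: bool,
    enabled: bool,
}

struct AliasEntry {
    alias: &'static str,
    target_model: &'static str,
}

const MODEL_CONFIGS: &[ModelEntry] = &[ModelEntry {
    id: "llama-3-70b",
    api_listed: true,
    enabled: true,
}];

const MODEL_ALIAS_ENTRIES: &[AliasEntry] = &[AliasEntry {
    alias: "auto:quick",
    target_model: "llama-3-70b",
}];

/// Map a requested model (alias or canonical id) to a provider id,
/// returning None unless the target is API-listed and enabled.
fn resolve_completion_model_id(requested: &str) -> Option<&'static str> {
    // Translate a known alias to its canonical target; otherwise treat
    // the request as already canonical.
    let canonical = MODEL_ALIAS_ENTRIES
        .iter()
        .find(|a| a.alias == requested)
        .map(|a| a.target_model)
        .unwrap_or(requested);
    // Exact-match lookup gated on api_listed && enabled.
    MODEL_CONFIGS
        .iter()
        .find(|e| e.id == canonical && e.api_listed && e.enabled)
        .map(|e| e.id)
}

fn main() {
    assert_eq!(resolve_completion_model_id("auto:quick"), Some("llama-3-70b"));
    assert_eq!(resolve_completion_model_id("llama-3-70b"), Some("llama-3-70b"));
    assert_eq!(resolve_completion_model_id("unknown-model"), None);
}
```

On the error branch of the diagram, a `None` from this function is what would surface as `ApiError::BadRequest`.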
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs
🚥 Pre-merge checks: ✅ 4 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
🧹 Nitpick comments (1)
src/web/responses/handlers.rs (1)
358-368: Test may not reflect the actual runtime flow.

This test verifies that `build_model_turn_request` preserves the model value from `body.model`. However, in the actual flow through `create_response_stream`, `body.model` is already overwritten with the resolved provider ID at line 2504 before `build_model_turn_request` is called via `stream_one_assistant_turn`.

The test name implies the alias is preserved for provider resolution, but the integration actually sends the already-resolved provider ID. Consider either:
- Renaming the test to clarify it only tests the copy behavior
- Adding an integration-level test that verifies the full resolution flow
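The first option might look like the sketch below: a rename that claims only copy behavior. Everything here is a hypothetical stand-in — `RequestBody`, `ModelTurnRequest`, and the function body are placeholders, not the real types in `handlers.rs`:

```rust
// Stand-in types; the real request/turn structs in handlers.rs differ.
struct RequestBody {
    model: String,
}

struct ModelTurnRequest {
    model: String,
}

// Stand-in for build_model_turn_request: copies body.model verbatim
// and performs no alias resolution.
fn build_model_turn_request(body: &RequestBody) -> ModelTurnRequest {
    ModelTurnRequest {
        model: body.model.clone(),
    }
}

// In handlers.rs this would carry #[test]; it is a plain function here
// so the sketch runs as an ordinary binary.
fn test_build_model_turn_request_copies_model_value() {
    let body = RequestBody { model: "auto:quick".to_string() };
    let req = build_model_turn_request(&body);
    // Asserts only the copy. In the real flow, alias resolution happens
    // earlier, in create_response_stream, which rewrites body.model
    // before this function is reached.
    assert_eq!(req.model, "auto:quick");
}

fn main() {
    test_build_model_turn_request_copies_model_value();
}
```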
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/web/responses/handlers.rs` around lines 358 - 368, The test test_build_model_turn_request_preserves_auto_alias_for_provider_resolution is misleading because at runtime create_response_stream overwrites body.model with the resolved provider ID (see create_response_stream and stream_one_assistant_turn) before build_model_turn_request is invoked; either rename the unit test to something like test_build_model_turn_request_copies_model_value to reflect it only checks copy behavior of build_model_turn_request, or add an integration test that exercises create_response_stream/stream_one_assistant_turn end-to-end: simulate a request with AUTO_QUICK_MODEL_ID, let the resolver run so body.model is replaced by the provider ID, call the streaming flow, and assert the outgoing chat request/model matches the resolved provider ID rather than the alias.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: e382398d-160a-4612-bda5-b91f60aca8f9
📒 Files selected for processing (2)
- src/web/openai.rs
- src/web/responses/handlers.rs
Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

Keep model alias resolution provider-aware

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

Update supported model catalog

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Force-pushed: 56655d3 to 8328d94
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/model_config.rs (1)
510-522: Guard alias exposure with target state.

The catalog always publishes both aliases and defaults, even if a future catalog update disables or hides the target model. In that state `model_catalog_response()` would still advertise `auto:quick`/`auto:powerful`, but `resolve_completion_model_id()` would return `None` for them. Filtering aliases/defaults against a target that is still usable would keep the catalog contract consistent.

Suggested direction

```diff
 let aliases = MODEL_ALIAS_ENTRIES
     .iter()
+    .filter(|alias| {
+        model_entry(alias.target_model)
+            .is_some_and(|entry| entry.listed && entry.api_listed && entry.enabled)
+    })
     .map(|entry| entry.catalog_json())
     .collect::<Vec<_>>();
```
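A self-contained sketch of that direction follows. The `ModelEntry`/`AliasEntry` shapes and the `model_entry` helper are assumptions approximating the PR's registry, not its actual definitions:

```rust
// Assumed registry shapes; the real entries carry more metadata.
struct ModelEntry {
    id: &'static str,
    listed: bool,
    api_listed: bool,
    enabled: bool,
}

struct AliasEntry {
    alias: &'static str,
    target_model: &'static str,
}

const MODEL_CONFIGS: &[ModelEntry] = &[
    ModelEntry { id: "llama-3-70b", listed: true, api_listed: true, enabled: true },
    // A target a future catalog update has disabled.
    ModelEntry { id: "old-model", listed: true, api_listed: true, enabled: false },
];

const MODEL_ALIAS_ENTRIES: &[AliasEntry] = &[
    AliasEntry { alias: "auto:quick", target_model: "llama-3-70b" },
    AliasEntry { alias: "auto:legacy", target_model: "old-model" },
];

fn model_entry(id: &str) -> Option<&'static ModelEntry> {
    MODEL_CONFIGS.iter().find(|e| e.id == id)
}

/// Only publish aliases whose target is still listed, API-listed and
/// enabled, so the catalog never advertises an alias the resolver rejects.
fn published_aliases() -> Vec<&'static str> {
    MODEL_ALIAS_ENTRIES
        .iter()
        .filter(|a| {
            model_entry(a.target_model)
                .is_some_and(|e| e.listed && e.api_listed && e.enabled)
        })
        .map(|a| a.alias)
        .collect()
}

fn main() {
    // "auto:legacy" is filtered out because "old-model" is disabled.
    assert_eq!(published_aliases(), vec!["auto:quick"]);
}
```

The same filter applied to default-model entries would keep the whole catalog response consistent with what the resolver accepts.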
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 1c0deab2-da94-4af9-b989-6d4dc0fdeea2
⛔ Files ignored due to path filters (2)
- pcrDev.json is excluded by !pcrDev.json
- pcrProd.json is excluded by !pcrProd.json
📒 Files selected for processing (6)
- pcrDevHistory.json
- pcrProdHistory.json
- src/model_config.rs
- src/tokens.rs
- src/web/openai.rs
- src/web/responses/handlers.rs
✅ Files skipped from review due to trivial changes (2)
- pcrProdHistory.json
- pcrDevHistory.json
🚧 Files skipped from review as they are similar to previous changes (3)
- src/tokens.rs
- src/web/responses/handlers.rs
- src/web/openai.rs
```diff
 MODEL_CONFIGS
     .iter()
-    .find(|entry| model.starts_with(entry.prefix))
+    .find(|entry| entry.id == canonical)
+    .or_else(|| {
+        MODEL_CONFIGS
+            .iter()
+            .find(|entry| canonical.starts_with(entry.id))
+    })
```
Remove the residual prefix fallback from model_config.
Lines 491-495 still let any unregistered suffix variant inherit a registered config. For example, `gpt-oss-120b-anything` will get GPT-OSS limits/sampling here, while `resolve_completion_model_id` correctly rejects it. That reintroduces the heuristic this PR is trying to remove and can make validation/config/billing disagree.
Suggested fix

```diff
 MODEL_CONFIGS
     .iter()
     .find(|entry| entry.id == canonical)
-    .or_else(|| {
-        MODEL_CONFIGS
-            .iter()
-            .find(|entry| canonical.starts_with(entry.id))
-    })
     .map(|entry| entry.config)
     .unwrap_or(DEFAULT_MODEL_CONFIG)
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```rust
MODEL_CONFIGS
    .iter()
    .find(|entry| entry.id == canonical)
    .map(|entry| entry.config)
    .unwrap_or(DEFAULT_MODEL_CONFIG)
```
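A minimal standalone repro of the problem this comment describes — the prefix fallback lets an unregistered suffix variant match, while an exact lookup rejects it. The identifiers are illustrative, not the real `MODEL_CONFIGS` entries:

```rust
// Single registered canonical id, standing in for the registry.
const MODEL_IDS: &[&str] = &["gpt-oss-120b"];

// Lookup as currently written: exact match, then a starts_with fallback.
fn lookup_with_fallback(canonical: &str) -> Option<&'static str> {
    MODEL_IDS
        .iter()
        .find(|id| **id == canonical)
        .or_else(|| MODEL_IDS.iter().find(|id| canonical.starts_with(**id)))
        .copied()
}

// Lookup as suggested: exact match only.
fn lookup_exact(canonical: &str) -> Option<&'static str> {
    MODEL_IDS.iter().find(|id| **id == canonical).copied()
}

fn main() {
    // With the fallback, the bogus suffix variant inherits the real
    // model's entry...
    assert_eq!(
        lookup_with_fallback("gpt-oss-120b-anything"),
        Some("gpt-oss-120b")
    );
    // ...while the exact-match lookup correctly rejects it, agreeing
    // with what resolve_completion_model_id would decide.
    assert_eq!(lookup_exact("gpt-oss-120b-anything"), None);
    assert_eq!(lookup_exact("gpt-oss-120b"), Some("gpt-oss-120b"));
}
```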
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/model_config.rs` around lines 488 - 495, The current lookup in
MODEL_CONFIGS uses a fallback that matches entries whose id is a prefix of
canonical (the .or_else with canonical.starts_with), which reintroduces the
removed suffix-variant heuristic; change the lookup to only use an exact match
by removing the .or_else block so MODEL_CONFIGS.iter().find(|entry| entry.id ==
canonical) is the sole lookup, ensuring suffix variants like
"gpt-oss-120b-anything" are not accepted via prefix matching.
Summary
Validation
Summary by CodeRabbit

- New Features
- Bug Fixes
- Changes
- Tests
- Chores