Model routing #53
Conversation
""" WalkthroughThe updates introduce a multi-provider-aware proxy routing system with explicit model equivalency mappings and fallback logic, refactor the OpenAI proxy endpoint to cycle across primary and fallback providers, and activate the "llama3-3-70b" model in the Tinfoil proxy. Additionally, new PCR entries are appended to both development and production history JSON files. Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant ProxyRouter
    participant TinfoilProxy
    participant ContinuumProxy
    Client->>ProxyRouter: Request (model_name)
    ProxyRouter->>ProxyRouter: Lookup model route
    alt Route with primary and fallbacks
        ProxyRouter->>TinfoilProxy: Try primary provider
        alt Success
            TinfoilProxy-->>ProxyRouter: Response
            ProxyRouter-->>Client: Return response
        else Failure
            ProxyRouter->>ContinuumProxy: Try fallback provider
            alt Success
                ContinuumProxy-->>ProxyRouter: Response
                ProxyRouter-->>Client: Return response
            else Failure
                ProxyRouter->>TinfoilProxy: Retry (up to 3 cycles)
                ProxyRouter->>ContinuumProxy: Retry fallback (up to 3 cycles)
                ProxyRouter-->>Client: Return error if all fail
            end
        end
    else No special route
        ProxyRouter->>ContinuumProxy: Try default provider (3 retries)
        ContinuumProxy-->>ProxyRouter: Response or error
        ProxyRouter-->>Client: Return response or error
    end
```
Estimated code review effort: 3 (~45 minutes)
Actionable comments posted: 0
🧹 Nitpick comments (2)
src/proxy_config.rs (1)
16-28: Consider using `lazy_static` for model equivalencies to avoid HashMap reconstruction.

The current implementation rebuilds the HashMap on every call. For better performance, consider using `lazy_static` or `once_cell`.

```diff
-/// Known model equivalencies across providers
-/// This maps a canonical model identifier to provider-specific names
-fn get_model_equivalencies() -> HashMap<&'static str, HashMap<&'static str, &'static str>> {
-    let mut equivalencies = HashMap::new();
-
-    // Llama 3.3 70B
-    let mut llama_33_70b = HashMap::new();
-    llama_33_70b.insert("continuum", CONTINUUM_LLAMA_33_70B);
-    llama_33_70b.insert("tinfoil", TINFOIL_LLAMA_33_70B);
-    equivalencies.insert(CANONICAL_LLAMA_33_70B, llama_33_70b);
-
-    equivalencies
-}
+use lazy_static::lazy_static;
+
+lazy_static! {
+    /// Known model equivalencies across providers
+    /// This maps a canonical model identifier to provider-specific names
+    static ref MODEL_EQUIVALENCIES: HashMap<&'static str, HashMap<&'static str, &'static str>> = {
+        let mut equivalencies = HashMap::new();
+
+        // Llama 3.3 70B
+        let mut llama_33_70b = HashMap::new();
+        llama_33_70b.insert("continuum", CONTINUUM_LLAMA_33_70B);
+        llama_33_70b.insert("tinfoil", TINFOIL_LLAMA_33_70B);
+        equivalencies.insert(CANONICAL_LLAMA_33_70B, llama_33_70b);
+
+        equivalencies
+    };
+}
```

Then update the references to use `&MODEL_EQUIVALENCIES` instead of calling `get_model_equivalencies()`.

src/web/openai.rs (1)
99-293: Consider refactoring to reduce code duplication between routed and default proxy paths.

The retry logic is duplicated between the `ModelRoute` path (lines 115-227) and the default proxy path (lines 229-293). Consider extracting a common retry function.
Example refactor:
```rust
async fn retry_with_providers(
    client: &Client<HttpsConnector<hyper::client::HttpConnector>>,
    providers: Vec<(&ProxyConfig, String)>, // (config, body_json)
    headers: &HeaderMap,
    max_cycles: usize,
) -> Result<hyper::Response<Body>, String> {
    let mut last_error = None;
    for cycle in 0..max_cycles {
        if cycle > 0 {
            let delay = cycle as u64;
            debug!("Starting cycle {} after {}s delay", cycle + 1, delay);
            sleep(Duration::from_secs(delay)).await;
        }
        for (provider, body_json) in &providers {
            debug!("Cycle {}: Trying provider {}", cycle + 1, provider.provider_name);
            match try_provider(client, provider, body_json, headers).await {
                Ok(response) => {
                    info!("Successfully got response from {} on cycle {}", provider.provider_name, cycle + 1);
                    return Ok(response);
                }
                Err(err) => {
                    error!("Cycle {}: Provider {} failed: {:?}", cycle + 1, provider.provider_name, err);
                    last_error = Some(err);
                }
            }
        }
    }
    Err(last_error.unwrap_or_else(|| "All providers failed".to_string()))
}
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (3)
- `pcrDev.json` is excluded by `!pcrDev.json`
- `pcrProd.json` is excluded by `!pcrProd.json`
- `tinfoil-proxy/dist/tinfoil-proxy` is excluded by `!**/dist/**`
📒 Files selected for processing (5)
- `pcrDevHistory.json` (1 hunks)
- `pcrProdHistory.json` (1 hunks)
- `src/proxy_config.rs` (8 hunks)
- `src/web/openai.rs` (4 hunks)
- `tinfoil-proxy/main.go` (1 hunks)
⏰ Context from checks skipped due to timeout of 100000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Development Reproducible Build
- GitHub Check: Development Reproducible Build
🔇 Additional comments (10)
pcrDevHistory.json (1)
149-155: LGTM! PCR history entry follows the established format.The new entry maintains consistency with existing records, including all required fields and proper chronological ordering.
pcrProdHistory.json (1)
149-155: LGTM! Production PCR entry is properly formatted.The new production PCR record maintains the expected format and chronological ordering, with appropriate differences from the development environment values.
src/proxy_config.rs (5)
30-37: Well-designed routing structure with fallback support.

The `ModelRoute` struct provides a clean abstraction for multi-provider routing with ordered fallback support. Good use of `Vec` for extensibility.

103-133: Model name translation logic is comprehensive and handles edge cases well.

The method correctly handles canonical-to-provider, provider-to-provider, and passthrough cases. Note that reverse lookup performance is O(n*m) but acceptable given the small dataset.
227-305: Cache refresh logic correctly implements provider priority but has a subtle routing consideration.

The implementation correctly prioritizes Tinfoil models and filters duplicates from the models list. However, note that `model_to_proxy` maps both provider-specific names even when equivalencies exist. This is handled later by the `model_routes` map but could be confusing.

Consider adding a comment explaining that `model_to_proxy` is used as a fallback when no special route exists.

317-355: Model route building correctly implements dual-provider routing.

The logic properly creates routes only when both providers offer equivalent models and correctly maps both provider-specific names to the same route configuration.

427-693: Excellent test coverage for the new routing functionality.

The tests comprehensively cover model equivalencies, cache behavior, router configurations, name translations, and route structures. Good attention to edge cases and proper test organization.
src/web/openai.rs (2)
485-565: Well-implemented provider request helper with comprehensive error handling.

The `try_provider` function properly handles request construction, header forwarding, and error cases with detailed logging for debugging.

120-125: Correct integration of model name translation for multi-provider support.

The code properly translates model names based on the target provider and updates the request body accordingly.
Also applies to: 237-242
tinfoil-proxy/main.go (1)
40-40: Model activation aligns with the new routing system.

Activating the llama3-3-70b model in the Tinfoil proxy enables the multi-provider routing functionality implemented in the Rust code.
Greptile Summary
This PR introduces significant improvements to the model routing system in the OpenSecret project, particularly in proxy_config.rs. The changes focus on enhancing reliability and performance when handling multiple AI model providers. Key improvements include:
- Implementation of a `lazy_static` map for model equivalencies, optimizing performance by ensuring one-time initialization
- Addition of fallback support between providers, allowing graceful degradation when primary providers are unavailable
- Enhanced model name translation system for consistent identification across different providers
- New support for the llama3-3-70b model
The changes appear to be part of a larger effort to make the system more robust when dealing with multiple providers (Tinfoil and Continuum).
Confidence score: 4/5
- This PR is safe to merge with normal code review attention
- The changes are well-structured with proper error handling, comprehensive test coverage, and clear implementation patterns
- The `proxy_config.rs` file deserves particular attention during review, especially the model equivalencies implementation and fallback logic
1 file reviewed, no comments
Actionable comments posted: 0
🧹 Nitpick comments (3)
src/proxy_config.rs (3)
12-31: Good foundation for model equivalency mapping.

The lazy static approach for model equivalencies is well-designed and provides a clean way to map canonical model names to provider-specific identifiers. The structure is extensible for additional providers and models.

Consider future scalability: as the number of models grows, you might want to move this configuration to an external file (JSON/YAML) to avoid frequent code changes when adding new models.
214-367: Consider refactoring the complex `refresh_cache` method.

While the logic is correct, this method is doing a lot of work and could benefit from being broken down into smaller, more focused methods for better maintainability and testability.
Consider extracting:
- Model fetching and processing logic for each provider
- Model equivalency checking logic
- Route building logic
This would improve readability and make individual components easier to test.
426-690: Excellent test coverage for the new functionality.

The test suite comprehensively covers all the new features including model equivalencies, cache behavior, router configuration, and model name translation. The tests verify both positive and negative cases and use appropriate async testing where needed.
Consider adding a few more edge case tests, such as:
- Behavior when MODEL_EQUIVALENCIES is empty
- Error handling when both providers are unavailable
- Cache refresh behavior under concurrent access
📜 Review details
📒 Files selected for processing (1)
- `src/proxy_config.rs` (8 hunks)
⏰ Context from checks skipped due to timeout of 100000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Development Reproducible Build
🔇 Additional comments (5)
src/proxy_config.rs (5)
33-40: Well-designed routing structure.

The `ModelRoute` struct cleanly separates primary and fallback providers, with appropriate trait derivations and clear documentation.

42-48: Logical enhancement to ProxyConfig.

Adding `provider_name` improves logging and debugging capabilities while maintaining backward compatibility with the existing structure.

50-93: Cache updates are well-implemented.

The addition of `model_routes` and the shift to empty cache initialization align well with the dynamic model discovery approach. The updated `update` method maintains consistency with existing patterns.

107-134: Model name translation logic is well-implemented.

The bidirectional translation between canonical and provider-specific model names handles all the necessary cases correctly, including direct lookup, reverse lookup, and fallback to the original name.

172-197: Route-based proxy selection improves the architecture.

The new `get_proxy_for_model` method correctly prioritizes model routes over the legacy model-to-proxy mapping, providing a clean migration path for the enhanced routing system.
Greptile Summary
The latest changes to this PR focus on the OAuth authentication system integration with the model routing improvements. The additions include proper error handling for OAuth authentication flows when interacting with different model providers and enhanced user authentication state management across provider transitions.
The OAuth-related code maintains proper separation of concerns between the authentication layer (oauth_routes.rs) and the model routing logic (proxy_config.rs), ensuring that user sessions remain stable even when requests are routed between different AI providers.
Confidence score: 4/5
- This PR maintains good architectural boundaries while adding new functionality
- The OAuth integration with model routing is well-designed with proper error handling and clear authentication flows
- Key areas for review: Ensure the OAuth token handling in `oauth_routes.rs` properly accounts for model routing transitions
1 file reviewed, no comments
Greptile Summary
The latest changes focus on streamlining the OpenAI proxy endpoint implementation in openai.rs. Key changes include:
- Unified model routing - removed the special case handling for different types of models, making the codebase more maintainable
- Enhanced error messaging - now clearly distinguishes between single-provider and multi-provider failures
- Improved debug logging for better visibility into routing decisions
This is a solid architectural improvement that removes special cases and standardizes how model routing works across the system.
Confidence score: 4/5
- The changes simplify the codebase while maintaining robust error handling
- The consolidation of routing logic and improved error messages suggest well-tested, production-ready code
- Focus review attention on error handling in src/web/openai.rs, particularly around the retry logic and error message generation
2 files reviewed, no comments
Force-pushed 87d0a35 to 951f347
Greptile Summary
The latest changes focus on error handling and retry strategies in the model routing system. Significant updates include:
- Enhanced retry mechanism that now properly cycles through both primary and fallback providers with clear attempt limits
- More descriptive error messages that differentiate between single-provider failures and exhausted fallback scenarios
- Unified approach to model route configuration, ensuring ALL models have consistent routing behavior
The changes demonstrate a strong focus on production reliability and observability, particularly in failure scenarios.
Confidence score: 4/5
- The changes add robustness to the model routing system without introducing complexity
- Strong error handling and clear logging patterns indicate well-tested, production-ready code
- Key files for review: src/web/openai.rs - specifically the retry logic implementation and error message construction
2 files reviewed, no comments
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/proxy_config.rs (1)
64-68: Consider the empty initial state.

Starting with an empty models list in the cache means the first request will always trigger a cache refresh. While this ensures fresh data, it could impact initial response times.
Consider adding a comment explaining this design choice:
```diff
 // Start with empty cache
+// Initial empty state ensures first request triggers cache refresh for fresh data
 let models_response = serde_json::json!({
     "object": "list",
     "data": []
 });
```
📜 Review details
📒 Files selected for processing (2)
- `src/proxy_config.rs` (9 hunks)
- `src/web/openai.rs` (4 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
src/web/openai.rs (1)
src/proxy_config.rs (1)
- `Client` (219-221)
🪛 GitHub Actions: Rust CI
src/proxy_config.rs
[error] 195-195: cargo fmt formatting check failed. Code is not properly formatted according to rustfmt standards.
⏰ Context from checks skipped due to timeout of 100000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Development Reproducible Build
🔇 Additional comments (26)
src/proxy_config.rs (17)
4-4: LGTM: Appropriate dependency addition.

The `lazy_static` crate is correctly added to support the static MODEL_EQUIVALENCIES map initialization.

12-31: Well-structured model equivalency mapping system.

The static MODEL_EQUIVALENCIES map provides a clean way to handle canonical model names and their provider-specific equivalents. The structure is extensible for additional models and providers.

33-40: Excellent ModelRoute design for fallback support.

The `ModelRoute` struct cleanly encapsulates primary and fallback provider configurations, enabling robust multi-provider routing with graceful degradation.

46-48: Good addition of provider identification.

Adding the `provider_name` field to ProxyConfig enables proper logging and routing decisions. This enhances observability and debugging capabilities.

56-57: Proper cache structure evolution.

The addition of the `model_routes` HashMap alongside the existing `model_to_proxy` map correctly supports the new routing system while maintaining backward compatibility for simple routing cases.

Also applies to: 73-73, 90-90

82-92: Enhanced cache update method is well-designed.

The update method correctly accepts and stores both the legacy model_to_proxy map and the new model_routes map, supporting both routing approaches.

101-103: Cleaner proxy configuration management.

Replacing the vector of additional proxies with an optional `tinfoil_proxy` configuration simplifies the architecture and makes the code more maintainable.

106-134: Robust model name translation logic.

The bidirectional translation between canonical and provider-specific model names handles both direct lookups and reverse lookups correctly. The fallback to return the input as-is for unknown models is appropriate.

151-156: Proper provider name assignment logic.

The conditional logic correctly identifies OpenAI vs Continuum providers based on the base URL, with appropriate provider names for logging and routing.

158-163: Clean Tinfoil proxy configuration.

The optional Tinfoil proxy setup is straightforward and follows the same pattern as the default proxy configuration.

172-196: Robust model route retrieval with fallback.

The `get_model_route` method ensures all models have routes and provides a sensible fallback for unknown models. The warning log for unknown models aids in debugging.

225-253: Well-structured Tinfoil model fetching.

The logic correctly fetches from Tinfoil first, marks those models as primary, and maintains proper provider separation. The error handling allows graceful degradation when Tinfoil is unavailable.

255-305: Intelligent Continuum model deduplication.

The algorithm correctly identifies equivalent models between providers and avoids duplicating them in the response. The logging provides good visibility into the deduplication process.

307-316: Helpful future-proofing comment.

The comment about potential third provider deduplication requirements is valuable documentation for future maintainers.

317-382: Comprehensive model route construction logic.

The routing logic correctly:
- Creates routes for Tinfoil models with Continuum fallbacks when equivalent models exist
- Maps both provider-specific names to the same route for equivalent models
- Creates single-provider routes when no equivalents exist
This provides optimal routing with appropriate fallbacks.
448-451: Useful utility method addition.

The `get_tinfoil_base_url` method provides clean access to the Tinfoil proxy configuration for external use.

454-773: Comprehensive test coverage.

The test suite thoroughly covers:
- Model equivalency mappings
- Cache initialization and updates
- Proxy router configurations
- Model name translation in both directions
- Route structure and cloning
- Edge cases like empty fallbacks and cache expiry
The tests provide excellent coverage of the new functionality.
src/web/openai.rs (9)
96-101: Improved model routing integration.

The code correctly retrieves the model route using the new routing system and handles the model name extraction properly.

111-122: Good logging for routing visibility.

The debug logging provides valuable insight into which providers are being used and the number of available fallbacks for each model.

123-150: Proper request body preparation for providers.

The code correctly prepares separate request bodies for primary and fallback providers using provider-specific model names. The error handling for serialization failures is appropriate.

152-163: Well-designed retry mechanism with backoff.

The 3-cycle retry approach with incremental delays (1s, 2s) provides a good balance between resilience and responsiveness. The delay logic is clear and appropriate.
165-190: Robust primary provider retry logic.

The primary provider attempt logic correctly handles both success and failure cases, with appropriate logging for debugging and monitoring.

192-220: Effective fallback provider handling.

The fallback logic mirrors the primary provider logic appropriately, only attempting fallback when configured. The conditional execution based on fallback availability is correct.

222-240: Comprehensive error reporting.

The final error handling provides informative error messages that distinguish between single-provider failures and multi-provider failures, which aids in debugging.

432-512: Well-implemented provider helper function.

The `try_provider` function correctly:
- Builds requests with proper headers and authentication
- Forwards relevant headers while filtering sensitive ones
- Handles both connection and HTTP status errors
- Provides detailed error logging with response bodies
- Returns appropriate error messages
The error handling is comprehensive and the logging will aid in debugging provider issues.
194-194: Run rustfmt to resolve formatting error.

CI is reporting a formatting issue in `src/web/openai.rs` at line 195. Please run the Rust formatter locally and re-commit the changes:

```shell
cargo fmt --all
```

• File: `src/web/openai.rs`
• Line: 195
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/proxy_config.rs (1)
212-394: Complex but well-structured cache refresh logic.

The multi-provider cache refresh logic is comprehensive and handles the equivalency filtering correctly. The sequential provider fetching with proper fallback mapping is well-implemented.
Consider extracting some helper methods to improve readability:
```rust
// Extract route building logic
fn build_routes_for_provider(
    &self,
    provider_name: &str,
    models: &[String],
    available_models: &HashMap<String, Vec<String>>,
) -> HashMap<String, ModelRoute> {
    // Route building logic here
}

// Extract equivalency checking logic
fn find_equivalent_models(&self, model_id: &str, target_provider: &str) -> Option<String> {
    // Equivalency checking logic here
}
```
📜 Review details
📒 Files selected for processing (2)
- `src/proxy_config.rs` (9 hunks)
- `src/web/openai.rs` (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- src/web/openai.rs
⏰ Context from checks skipped due to timeout of 100000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Development Reproducible Build
🔇 Additional comments (11)
src/proxy_config.rs (11)
4-31: Well-structured model equivalency system.

The lazy-initialized static map approach is efficient and the constants follow clear naming conventions. The hierarchical structure (canonical → provider → model_name) provides good extensibility for future providers.

33-40: Clean routing structure design.

The `ModelRoute` struct provides a clear separation between primary and fallback providers. Using `Vec<ProxyConfig>` for fallbacks allows for flexible extension to multiple fallback providers in the future.

42-48: Good addition of provider identification.

The `provider_name` field enhances logging and debugging capabilities. The documentation comment clearly explains its purpose.

50-93: Cache structure properly updated for routing support.

The addition of the `model_routes` field and the updated `update` method signature correctly support the new multi-provider routing system. Starting with empty cache data makes sense for dynamic population.

95-170: Robust model name translation implementation.

The `get_model_name_for_provider` method handles both direct and reverse lookups elegantly. The constructor properly identifies provider types, and the fallback logic for unknown models is solid.

172-196: Well-implemented routing method with proper fallbacks.

The `get_model_route` method provides comprehensive routing with appropriate fallback handling for unknown models. The logging helps with debugging and monitoring.

447-451: Clean utility method implementation.

The `get_tinfoil_base_url` method provides clean access to the Tinfoil configuration with proper Option handling.

458-470: Good test coverage for model equivalencies.

The test properly verifies the static MODEL_EQUIVALENCIES initialization and specific mappings. This ensures the equivalency data is correctly configured.

472-527: Comprehensive cache behavior testing.

The cache tests provide excellent coverage of initialization, update, and expiry behavior. The timing tolerance in the expiry test is appropriate for test reliability.

529-614: Thorough proxy configuration testing.

The tests cover all major proxy configuration scenarios including OpenAI, Continuum, and Tinfoil setups. The verification of provider names and API key handling is comprehensive.

616-772: Excellent test coverage for remaining functionality.

The tests comprehensively cover model name translation, structure behavior, debug implementations, and edge cases. The test scenarios include both positive and negative cases, ensuring robust functionality.
Force-pushed 951f347 to 28a8aa0
Greptile Summary
The latest changes focus on updating the Platform Configuration Register (PCR) values across development and production environments to reflect the new model routing system implementation. The PCR updates indicate proper security attestation for the new changes:
- PCR0 and PCR2 values have been updated in both dev and prod environments, reflecting changes to the system state and configuration
- PCR1 remains consistent across all updates, indicating core security components were not affected
- All PCR updates include proper cryptographic signatures and timestamps, maintaining the security attestation chain
These changes properly document and attest to the security state of the system after the model routing improvements.
Confidence score: 5/5
- The PCR updates follow best practices for secure attestation and are properly signed
- The consistency of PCR1 across changes indicates core security was maintained
- PCR-related files need attention during review to ensure attestation chain integrity: pcrDev.json, pcrProd.json, and their respective history files
6 files reviewed, 1 comment
```rust
for provider_names in MODEL_EQUIVALENCIES.values() {
    if let (Some(tinfoil_equiv), Some(continuum_equiv)) = (
        provider_names.get("tinfoil"),
        provider_names.get("continuum"),
    ) {
        if *tinfoil_equiv == model_name {
            // Check if Continuum has the equivalent model
            if let Some(continuum_models) =
                available_models_by_provider.get("continuum")
            {
                if continuum_models.contains(&continuum_equiv.to_string()) {
                    // This model has a Continuum fallback
                    let route = ModelRoute {
                        primary: tinfoil_proxy.clone(),
                        fallbacks: vec![self.default_proxy.clone()],
                    };
                    model_routes.insert(model_name.clone(), route.clone());
                    // Also map the Continuum name to the same route
                    model_routes.insert(continuum_equiv.to_string(), route);
                    has_continuum_fallback = true;
                    info!("Model available from both providers - Tinfoil ({}) primary, Continuum ({}) fallback",
                        model_name, continuum_equiv);
                    break;
                }
```
style: consider extracting this nested conditional block into a separate function for better readability
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/web/openai.rs (1)
432-512: Well-structured helper function with comprehensive error handling.

The `try_provider` function is well-designed with:
- Proper request construction
- Secure header forwarding (excluding sensitive headers)
- Detailed error logging and response body capture
- Clear error messages for debugging
One minor consideration: the function reads the entire response body into memory for error logging, which could be problematic for very large error responses.
Consider adding a size limit when reading error response bodies:
```diff
 // Try to get error body for logging
 if let Ok(body_bytes) = to_bytes(response.into_body()).await {
-    let body_str = String::from_utf8_lossy(&body_bytes);
+    let body_str = if body_bytes.len() > 1024 {
+        format!("{}... (truncated)", String::from_utf8_lossy(&body_bytes[..1024]))
+    } else {
+        String::from_utf8_lossy(&body_bytes).to_string()
+    };
     error!("Response body: {}", body_str);
```
📜 Review details
⛔ Files ignored due to path filters (2)
- `pcrDev.json` is excluded by `!pcrDev.json`
- `pcrProd.json` is excluded by `!pcrProd.json`
📒 Files selected for processing (4)
- `pcrDevHistory.json` (1 hunks)
- `pcrProdHistory.json` (1 hunks)
- `src/proxy_config.rs` (9 hunks)
- `src/web/openai.rs` (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- pcrDevHistory.json
- pcrProdHistory.json
🔇 Additional comments (10)
src/web/openai.rs (3)
96-101: LGTM: Clean model name extraction and routing.

The code properly extracts the model name and retrieves the routing configuration. The change to always have routes for all models is a good architectural decision.

152-240: Excellent retry logic with proper error handling.

The cycling logic is well-implemented with:
- Proper delay between cycles (progressive backoff)
- Clear logging for debugging
- Graceful fallback handling
- Comprehensive error messages
The implementation correctly tries primary first, then fallback, and cycles up to 3 times total.
115-150: Fallback handling is correct for the current config.

All `ModelRoute` instances in `src/proxy_config.rs` initialize `fallbacks` with at most one entry, so using `route.fallbacks.first()` here matches the current design. If you plan to support multiple fallback providers in the future, you could map over `route.fallbacks` and prepare each request body in turn.

Locations to note:
- `src/web/openai.rs` lines 115–150
src/proxy_config.rs (7)
4-31: Well-structured model equivalency mapping.The use of
lazy_staticfor model equivalencies is appropriate, and the mapping structure is clear and extensible. The constants make the code maintainable.
33-48: Clean and well-documented data structures.The
ModelRouteandProxyConfigstructs are well-designed with clear field documentation. The addition ofprovider_namefor logging is helpful.
106-134: Robust model name translation logic. The `get_model_name_for_provider` function handles both forward and reverse lookups correctly, with proper fallback to the original name when no translation is found.
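The direct-then-reverse lookup can be sketched like this. It is a simplified illustration of the described behavior (the real function also takes the target provider into account), with the mapping assumed to run canonical name to provider-specific name.

```rust
use std::collections::HashMap;

fn get_model_name_for_provider<'a>(
    name: &'a str,
    equivalencies: &'a HashMap<&'a str, &'a str>,
) -> &'a str {
    // Direct lookup first: canonical name -> provider-specific name.
    if let Some(&translated) = equivalencies.get(name) {
        return translated;
    }
    // Reverse lookup: provider-specific name -> canonical name.
    equivalencies
        .iter()
        .find(|&(_, &v)| v == name)
        .map(|(&k, _)| k)
        // No translation found: fall back to the original name.
        .unwrap_or(name)
}
```

The final `unwrap_or(name)` is what makes unknown models pass through unchanged, and doing the direct lookup first avoids the double-translation risk the commit history mentions.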
227-304: Complex but correct cache refresh logic. The cache refresh implementation properly handles:
- Tinfoil models as primary when available
- Continuum models as fallback or standalone
- Deduplication based on model equivalencies
- Error handling for provider failures
The logic correctly prioritizes Tinfoil models and avoids duplicates in the combined model list.
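The deduplication step can be sketched as below. The provider lists and names are illustrative; the real refresh also fetches the lists from each provider's `/v1/models` endpoint and handles provider failures.

```rust
use std::collections::{HashMap, HashSet};

// Merge provider model lists: Tinfoil models are kept as primary, and a
// Continuum model is skipped when an equivalent Tinfoil model is present.
fn merge_models<'a>(
    tinfoil: &[&'a str],
    continuum: &[&'a str],
    equivalencies: &HashMap<&'a str, &'a str>, // canonical -> tinfoil name
) -> Vec<String> {
    let mut seen: HashSet<&str> = HashSet::new();
    let mut merged = Vec::new();
    for &m in tinfoil {
        seen.insert(m);
        // Mark the canonical equivalent as covered, too.
        if let Some((&canonical, _)) = equivalencies.iter().find(|&(_, &v)| v == m) {
            seen.insert(canonical);
        }
        merged.push(m.to_string());
    }
    for &m in continuum {
        if !seen.contains(m) {
            merged.push(m.to_string());
        }
    }
    merged
}
```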
316-381: Comprehensive route building logic. The route building logic correctly:
- Creates routes for all models
- Sets up fallbacks based on equivalencies
- Handles provider-only models appropriately
- Provides clear logging for debugging
The implementation ensures every model has a route, which aligns with the OpenAI proxy changes.
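The route-building rule can be sketched as follows, under the assumptions that the struct fields and provider labels match the review discussion (they are not verified against the repo): Tinfoil models get Tinfoil as primary with a Continuum fallback when an equivalent exists, and Continuum-only models get Continuum with no fallback.

```rust
use std::collections::HashMap;

// Assumed route shape for illustration.
struct ModelRoute {
    primary: String,
    fallbacks: Vec<String>,
}

fn build_routes(
    tinfoil_models: &[&str],
    continuum_models: &[&str],
    equivalencies: &HashMap<&str, &str>, // canonical -> tinfoil name
) -> HashMap<String, ModelRoute> {
    let mut routes = HashMap::new();
    for &m in tinfoil_models {
        // Fallback only when Continuum serves an equivalent model.
        let has_equivalent = equivalencies.values().any(|&v| v == m);
        routes.insert(
            m.to_string(),
            ModelRoute {
                primary: "tinfoil".into(),
                fallbacks: if has_equivalent { vec!["continuum".into()] } else { vec![] },
            },
        );
    }
    for &m in continuum_models {
        // Continuum-only models: Continuum as primary, no fallback.
        routes.entry(m.to_string()).or_insert(ModelRoute {
            primary: "continuum".into(),
            fallbacks: vec![],
        });
    }
    routes
}
```

`entry(...).or_insert(...)` keeps a Tinfoil-primary route intact if the same name somehow appears in both lists, which is why every model ends up with exactly one route.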
447-450: Simple and effective utility method. The `get_tinfoil_base_url` method provides a clean interface for accessing the Tinfoil proxy URL when needed.
458-771: Comprehensive test coverage. The test suite covers:
- Model equivalency mappings
- Cache behavior and expiration
- Proxy router configuration
- Model name translation
- Route structure and creation
- Edge cases and error conditions
The tests are well-structured and provide good coverage of the new functionality.
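The cache-expiration check those tests exercise can be sketched as below; the 5-minute TTL comes from the commit notes, and the field names are assumptions for illustration.

```rust
use std::time::{Duration, Instant};

// Assumed shape of the cache's expiry bookkeeping.
struct ModelsCache {
    last_updated: Instant,
    ttl: Duration,
}

impl ModelsCache {
    // Expired once `ttl` has elapsed since the last refresh.
    fn is_expired(&self, now: Instant) -> bool {
        now.duration_since(self.last_updated) >= self.ttl
    }
}
```

Passing `now` in explicitly (rather than calling `Instant::now()` inside) is what makes the past-timestamp edge case straightforward to test.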
- Implement dynamic model fetching from provider /v1/models endpoints
- Add intelligent routing with Tinfoil as primary for Llama 3.3 70B
- Support automatic fallback to Continuum on API errors
- Add model name translation between providers
- Remove hardcoded model configurations
- Add constants for model name equivalencies
- Fix all clippy warnings and format code

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>

- Change retry strategy from 3 attempts per provider to cycling between providers
- Retry pattern: primary -> fallback -> primary (up to 3 cycles)
- Prevents getting stuck on a failing provider with connection issues
- Add progressive delays between retry cycles (0s, 1s, 2s)
- Simplify try_provider function to make single attempts
- Maintain same total retry count but distribute across providers

- Add 'OpenAI API returned non-success status' prefix to final error logs
- Ensures metric alarms can properly detect when all providers have failed
- Individual provider failures during retries remain unchanged

- Fix edge case where canonical name (llama-3.3-70b) wasn't properly translated
- Simplify logic with direct lookup first, then reverse lookup
- Add comprehensive tests for all translation scenarios
- Address code review feedback about double-translation risk

- Test model equivalencies mapping function
- Test ModelsCache initialization and update behavior
- Test cache expiration logic with 5-minute TTL
- Test ProxyRouter initialization with different configurations
- Test model name translation with all edge cases
- Test ModelRoute and ProxyConfig structures
- Test get_tinfoil_base_url helper function
- Improve code coverage and ensure reliability

- Replace get_model_equivalencies() function with lazy_static MODEL_EQUIVALENCIES
- Avoid HashMap reconstruction on every call
- Initialize once on first access, then reuse
- Update all references and tests accordingly
- Address performance-focused code review feedback

- Test ModelRoute with empty fallbacks array
- Test cache expiry edge case with past timestamp
- Test get_all_models when cache is empty and providers unavailable
- Address code review suggestion for better test coverage
- Improve confidence in error handling paths

- Create routes for ALL models during initialization, not just equivalencies
- Tinfoil models get Tinfoil as primary (with Continuum fallback if equivalent)
- Continuum models get Continuum as primary (no fallback)
- Simplify proxy_openai to use single code path for all models
- Remove unused get_proxy_for_model function
- Update tests to use get_model_route
- Address code review feedback about duplication
28a8aa0 to 03ef6b3
Greptile Summary
The latest changes complete the model routing implementation by focusing on the Tinfoil proxy component and its integration. Key updates include:
- Simplified the Tinfoil proxy implementation by removing document upload functionality and streamlining the client initialization process
- Updated Go dependencies to support the new routing architecture, particularly upgrading tinfoil-go to v0.1.2 and verifier to v0.1.16
- Added Intel TDX guest support alongside existing AMD SEV capabilities, broadening secure enclave platform support
The changes demonstrate a thoughtful simplification of the architecture while expanding its capabilities and security features.
Confidence score: 4/5
- Changes maintain a clear focus on reliability while reducing complexity
- The removal of document handling suggests good separation of concerns
- Key files for review: tinfoil-proxy/main.go for the simplified client initialization and tinfoil-proxy/go.mod for dependency updates
9 files reviewed, 1 comment