Revert llama33 to tinfoil primary#96
Conversation
WalkthroughAppends new PCR entries to pcrDevHistory.json and pcrProdHistory.json. Standardizes proxy routing in Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor Client
participant Router as ProxyRouter
participant Tinfoil as Tinfoil (primary)
participant Continuum as Continuum (fallback)
participant Default as Default Proxy
Client->>Router: Request(model_name)
Note right of Router: Resolve canonical or Continuum-equivalent name\nMap both names to same ModelRoute (primary=Tinfoil, fallback=Continuum)
Router->>Tinfoil: Forward request
alt Tinfoil success
Tinfoil-->>Router: Response
Router-->>Client: Response
else Tinfoil error/timeout
Router->>Continuum: Forward request (fallback)
alt Continuum success
Continuum-->>Router: Response
Router-->>Client: Response
else Continuum unavailable
Router->>Default: Forward request (last resort)
Default-->>Router: Response/Error
Router-->>Client: Response/Error
end
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Poem
Tip 🔌 Remote MCP (Model Context Protocol) integration is now available!Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats. 📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 💡 Knowledge Base configuration:
You can enable these sources in your CodeRabbit configuration. ⛔ Files ignored due to path filters (2)
📒 Files selected for processing (3)
🚧 Files skipped from review as they are similar to previous changes (3)
✨ Finishing Touches
🧪 Generate unit tests
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Greptile Summary
This PR reverts the system's AI model routing configuration from using Llama 3.3 70B as the primary provider back to the standard Tinfoil-primary architecture. The changes span both application logic and AWS Nitro Enclave security configurations across development and production environments.
The core change in src/proxy_config.rs removes special-case handling that was routing Llama 3.3 70B requests through Continuum as the primary provider instead of following the normal Tinfoil-primary pattern. This temporary workaround included conditional logic that would swap primary and fallback providers specifically for the 'llama3-3-70b' model, along with associated warning logs indicating this was a temporary fix for Tinfoil provider issues.
The PCR (Platform Configuration Register) files are updated to reflect the new enclave state after this configuration change. PCR values are cryptographic hashes that verify the integrity of code running in AWS Nitro Enclaves - any change to the routing logic requires corresponding PCR updates. The changes update both development (pcrDev.json) and production (pcrProd.json) PCR configurations, reverting PCR0 and PCR2 values while keeping PCR1 unchanged. These new PCR values match entries being added to the history files (pcrDevHistory.json and pcrProdHistory.json), which maintain an audit trail of all enclave state changes with timestamps and cryptographic signatures.
This revert suggests that the underlying Tinfoil provider issues that necessitated the Llama 3.3 workaround have been resolved, allowing the system to return to its standard dual-provider architecture where Tinfoil serves as primary with Continuum as fallback for all models.
Confidence score: 4/5
- This PR is generally safe to merge as it's reverting to a known working configuration and follows established patterns for PCR management
- Score reflects that while the changes are well-structured, PCR modifications in secure enclave environments require careful validation to ensure attestation continues to work properly
- Pay close attention to the PCR configuration files (
pcrDev.json,pcrProd.json) to ensure the hash values are correct for the enclave deployment
5 files reviewed, no comments
There was a problem hiding this comment.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/proxy_config.rs (1)
268-318: Ensure all “continuum” literals are replaced with a single, dynamic provider keyThe change to use
self.default_proxy.provider_namein your snippet is spot-on—but it isn’t enough on its own. The codebase still has dozens of hard-coded"continuum"buckets sprinkled throughout, and tests/assertions that expect"continuum". To avoid breaking equivalency lookups, routing logic, and CI tests, you must centralize the provider identifier into one constant or method, then swap out every literal occurrence.Please refactor the following areas:
• MODEL_EQUIVALENCIES initialization (lines 27–37):
– Replace inner-map keys"continuum"with aCONTINUUM_PROVIDERconstant (or call a helper).
• Default proxy selection (lines 166–170):
– UseCONTINUUM_PROVIDERrather than the literal.
• Equivalency checks (lines 283–287 & 341–348):
– Changeprovider_names.get("continuum")and any subsequent lookups to use the constant.
• available_models_by_provider buckets (lines 311–316):
– Already suggested: useprovider_keyhere.
• Route-building (lines 383–385):
– Swapavailable_models_by_provider.get("continuum")forget(&CONTINUUM_PROVIDER).
• Tests (lines 478–482, 559–563, 642–646, 654–658, 724–727):
– Update allassert_eq!(… "continuum")or method calls using"continuum"to reference the new constant or dynamic name.A consolidated diff might look like:
--- a/src/proxy_config.rs +++ b/src/proxy_config.rs @@ –15,6 +15,7 @@ use std::collections::{HashMap, HashSet}; pub const CONTINUUM_PROVIDER: &str = "continuum"; // MODEL_EQUIVALENCIES: map canonical name → { provider_name → actual_model } @@ -27,13 +28,13 @@ lazy_static! { llama_33_70b.insert("tinfoil", TINFOIL_LLAMA_33_70B); - llama_33_70b.insert("continuum", CONTINUUM_LLAMA_33_70B); + llama_33_70b.insert(CONTINUUM_PROVIDER, CONTINUUM_LLAMA_33_70B); equivalencies.insert(CANONICAL_LLAMA_33_70B, llama_33_70b); let mut gpt_oss_120b = HashMap::new(); gpt_oss_120b.insert("tinfoil", TINFOIL_GPT_OSS_120B); - gpt_oss_120b.insert("continuum", CONTINUUM_GPT_OSS_120B); + gpt_oss_120b.insert(CONTINUUM_PROVIDER, CONTINUUM_GPT_OSS_120B); equivalencies.insert(CANONICAL_GPT_OSS_120B, gpt_oss_120b);…and then, throughout the file replace every literal
"continuum"(inprovider_names.get,.insert,.get,assert_eq!, log messages, etc.) with eitherCONTINUUM_PROVIDERorprovider_key.clone().This mandatory refactor will ensure you don’t accidentally break any existing logic or tests that assume the bucket name.
🧹 Nitpick comments (4)
src/proxy_config.rs (4)
351-355: Map canonical model name to the same route and make fallback logging provider-agnostic.
- You correctly map both provider-specific names to a single route with Tinfoil primary and default proxy fallback. However, the canonical name (e.g., "llama-3.3-70b") isn’t currently mapped, contrary to the PR summary. Adding it prevents “unknown model” routes when callers use canonical identifiers. Also, the log message hard-codes “Continuum” even if the default proxy is OpenAI.
Apply this focused diff inside the current block to:
- Insert canonical name mapping in addition to Tinfoil and Continuum IDs
- Log the actual default provider name instead of a hard-coded “Continuum”
@@ - // This model has a Continuum fallback - let route = ModelRoute { - primary: tinfoil_proxy.clone(), - fallbacks: vec![self.default_proxy.clone()], - }; + // This model has a fallback via the default proxy provider + let route = ModelRoute { + primary: tinfoil_proxy.clone(), + fallbacks: vec![self.default_proxy.clone()], + }; model_routes.insert(model_name.clone(), route.clone()); // Also map the Continuum name to the same route - model_routes.insert(continuum_equiv.to_string(), route); + model_routes.insert(continuum_equiv.to_string(), route.clone()); + // Also map the canonical name to the same route (when known) + if let Some((canonical, _)) = MODEL_EQUIVALENCIES + .iter() + .find(|(_, m)| m.get("tinfoil") == Some(&model_name.as_str())) + { + model_routes.insert((*canonical).to_string(), route.clone()); + } has_continuum_fallback = true; - info!("Model available from both providers - Tinfoil ({}) primary, Continuum ({}) fallback", - model_name, continuum_equiv); + info!( + "Model available from both providers - Tinfoil ({}) primary, {} ({}) fallback", + model_name, self.default_proxy.provider_name, continuum_equiv + );Also applies to: 358-361
340-345: Iterate with canonical key available to avoid a second lookup.Using values() loses the canonical name you need for route mapping. Iterate over (canonical, mapping) directly to simplify logic and avoid the extra find().
- for provider_names in MODEL_EQUIVALENCIES.values() { + for (canonical, provider_names) in MODEL_EQUIVALENCIES.iter() { if let (Some(tinfoil_equiv), Some(continuum_equiv)) = ( provider_names.get("tinfoil"), provider_names.get("continuum"), ) { - if *tinfoil_equiv == model_name { + if *tinfoil_equiv == model_name { // Check if Continuum has the equivalent model if let Some(continuum_models) = available_models_by_provider.get("continuum") { if continuum_models.contains(&continuum_equiv.to_string()) { // … see canonical insertion in the other diff hunk … } } } } }If you adopt this change, you can insert the canonical mapping with:
model_routes.insert((*canonical).to_string(), route.clone());inside the same branch.
52-58: Avoid leaking API keys via Debug; redact secrets in ProxyConfig.ProxyConfig derives Debug and tests assert api_key appears in Debug output. This creates a risk of secret leakage if configs are ever logged. Prefer a manual Debug impl that redacts api_key.
Example redacted Debug:
use std::fmt; impl fmt::Debug for ProxyConfig { fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { f.debug_struct("ProxyConfig") .field("base_url", &self.base_url) .field("api_key", &self.api_key.as_ref().map(|_| "REDACTED")) .field("provider_name", &self.provider_name) .finish() } }If you adopt this, update test_model_route_structure/test_proxy_config_debug_trait expectations to not require the literal key contents.
Also applies to: 700-714
595-606: Add a test to assert canonical-name routing resolves to Tinfoil primary with default fallback.Given the routing change, a focused unit test will prevent regressions:
Sketch:
#[tokio::test] async fn test_canonical_name_maps_to_same_route() { let router = ProxyRouter::new( "http://continuum.example.com".to_string(), None, Some("http://tinfoil.example.com".to_string()), ); // Pre-populate cache or mock HTTP to include a pair { tinfoil: llama3-3-70b, default: ibnzterrell/... }. // Then assert all three identifiers resolve to identical routes: // - "llama-3.3-70b" (canonical) // - "llama3-3-70b" (tinfoil) // - "ibnzterrell/Meta-Llama-3.3-70B-Instruct-AWQ-INT4" (default) }I can provide a full version with a small mock for fetch_models_from_proxy if you’d like.
Also applies to: 608-628, 630-672
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (2)
pcrDev.jsonis excluded by!pcrDev.jsonpcrProd.jsonis excluded by!pcrProd.json
📒 Files selected for processing (3)
pcrDevHistory.json(1 hunks)pcrProdHistory.json(1 hunks)src/proxy_config.rs(1 hunks)
⏰ Context from checks skipped due to timeout of 100000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Development Reproducible Build
🔇 Additional comments (3)
pcrProdHistory.json (1)
289-295: PCR history verification passed — approving changesAll sanity checks for the appended PCR entry in pcrProdHistory.json and the corresponding dev entry have passed:
- Required fields are present (PCR0, PCR1, PCR2, timestamp, signature)
- PCR values are valid hex strings of sufficient length
- Signatures conform to base64-like format
- Timestamps strictly increase in both prod and dev histories
No further action required.
pcrDevHistory.json (1)
290-294: Dev PCR append is consistent; matches Prod PCR1 and maintains ordering.
- New entry appended, timestamp 1756226379 fits ascending order.
- PCR1 matches the new Prod entry’s PCR1, which is expected.
If helpful, reuse the same shell script from the Prod comment to validate both files in one go.
src/proxy_config.rs (1)
237-256: Review Update: model_to_proxy Is Intended for Overrides OnlyThe
model_to_proxymap is explicitly used to override routing for secondary providers (e.g., Tinfoil) and is not consulted for default‐provider (Continuum) models. Continuum models fall back to the router’sdefault_proxyby design, so they never need entries inmodel_to_proxy. No code path readsmodel_to_proxyto route Continuum models.– No changes required for default‐provider routing.
– The existing logic correctly:
- Populates
model_to_proxyonly when a non-default proxy (Tinfoil) is configured- Falls back to
default_proxyfor all other models✅ This concern can be dismissed.
Likely an incorrect or invalid review comment.
Removed the temporary swap that was using Continuum as primary for Llama 3.3 70B. Tinfoil is now restored as the primary provider with Continuum as fallback, as originally intended. This reverts the temporary workaround introduced in commit 772eb9f. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
Updated PCR values for both dev and prod environments after reverting the Llama 3.3 70B provider priority change. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
11ad28b to
8438451
Compare
There was a problem hiding this comment.
Greptile Summary
This PR reverts the Llama 3.3 70B model routing configuration from using Continuum as the primary provider back to the standard configuration where Tinfoil is primary with Continuum as fallback. The change removes special case handling in src/proxy_config.rs that was temporarily implemented when Tinfoil had issues serving this specific model.
The core change standardizes model routing behavior across all models, ensuring consistent failover patterns. When both Tinfoil and Continuum support a model, the system now uniformly uses Tinfoil as primary and Continuum as fallback, eliminating model-specific exceptions that created unpredictable routing behavior.
As part of this infrastructure change, the PR updates Platform Configuration Register (PCR) values in both development and production environments. PCRs are cryptographic measurements used for AWS Nitro Enclave attestation - they must be updated whenever the enclave code or configuration changes. The updated PCR values in pcrDev.json, pcrProd.json, pcrDevHistory.json, and pcrProdHistory.json reflect the new binary state after the proxy configuration changes. The history files maintain an append-only audit trail with cryptographically signed entries for security verification.
Confidence score: 5/5
- This PR is safe to merge with minimal risk as it reverts to a previously stable configuration
- Score reflects straightforward configuration changes with proper PCR updates following established procedures
- No files require special attention as all changes follow standard deployment patterns
5 files reviewed, no comments
Summary by CodeRabbit
Refactor
Chores