
Revert llama33 to tinfoil primary#96

Merged
AnthonyRonning merged 2 commits into master from revert-llama33-to-tinfoil-primary
Aug 26, 2025
Conversation

Contributor

@AnthonyRonning AnthonyRonning commented Aug 26, 2025

Summary by CodeRabbit

  • Refactor

    • Standardized model routing so models use Tinfoil as primary with Continuum as fallback, improving predictability and failover behavior across models (including Llama 3.3 70B).
  • Chores

    • Appended new PCR history entries to development and production records with updated timestamps and signatures; no existing records were modified.


coderabbitai Bot commented Aug 26, 2025

Walkthrough

Appends new PCR entries to pcrDevHistory.json and pcrProdHistory.json. Standardizes proxy routing in src/proxy_config.rs so models with Continuum equivalents map to a single ModelRoute with Tinfoil as primary and Continuum (default proxy) as fallback; removes the llama3-3-70b special-case and unifies logging/control flow.

Changes

  • PCR histories (pcrDevHistory.json, pcrProdHistory.json): Appended one new PCR record to each file containing PCR0/1/2 values, a timestamp, and a signature. No existing entries were modified or removed.
  • Proxy routing logic (src/proxy_config.rs): Removed the llama3-3-70b special-case. Both the canonical model_name and its Continuum equivalent now map to the same ModelRoute, with primary=Tinfoil and fallback=Continuum (the default proxy). Logging was unified and an early break added to stop further processing once a model is mapped. No public API changes.
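The unified routing described above can be sketched as follows. This is a minimal, self-contained sketch: the `ProxyConfig` and `ModelRoute` field layouts are assumptions based on the snippets quoted later in this review, not the actual definitions in src/proxy_config.rs.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the types discussed in this PR; the real
// definitions live in src/proxy_config.rs and may differ.
#[derive(Clone, Debug, PartialEq)]
struct ProxyConfig {
    provider_name: String,
    base_url: String,
}

#[derive(Clone, Debug, PartialEq)]
struct ModelRoute {
    primary: ProxyConfig,
    fallbacks: Vec<ProxyConfig>,
}

fn build_routes(
    tinfoil: &ProxyConfig,
    default_proxy: &ProxyConfig,
    // (tinfoil_name, continuum_equivalent) pairs
    equivalencies: &[(&str, &str)],
) -> HashMap<String, ModelRoute> {
    let mut routes = HashMap::new();
    for (tinfoil_name, continuum_equiv) in equivalencies {
        // No special cases: every model gets Tinfoil primary and the
        // default proxy (Continuum) as fallback.
        let route = ModelRoute {
            primary: tinfoil.clone(),
            fallbacks: vec![default_proxy.clone()],
        };
        routes.insert(tinfoil_name.to_string(), route.clone());
        // Map the Continuum-equivalent name to the same route.
        routes.insert(continuum_equiv.to_string(), route);
    }
    routes
}

fn main() {
    let tinfoil = ProxyConfig {
        provider_name: "tinfoil".into(),
        base_url: "http://tinfoil.example.com".into(),
    };
    let continuum = ProxyConfig {
        provider_name: "continuum".into(),
        base_url: "http://continuum.example.com".into(),
    };
    let routes = build_routes(&tinfoil, &continuum, &[("llama3-3-70b", "llama3-3-70b-continuum")]);
    // Both names resolve to the identical route with Tinfoil primary.
    assert_eq!(routes["llama3-3-70b"], routes["llama3-3-70b-continuum"]);
    assert_eq!(routes["llama3-3-70b"].primary.provider_name, "tinfoil");
    println!("routes ok");
}
```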

Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    actor Client
    participant Router as ProxyRouter
    participant Tinfoil as Tinfoil (primary)
    participant Continuum as Continuum (fallback)
    participant Default as Default Proxy

    Client->>Router: Request(model_name)
    Note right of Router: Resolve canonical or Continuum-equivalent name\nMap both names to same ModelRoute (primary=Tinfoil, fallback=Continuum)

    Router->>Tinfoil: Forward request
    alt Tinfoil success
        Tinfoil-->>Router: Response
        Router-->>Client: Response
    else Tinfoil error/timeout
        Router->>Continuum: Forward request (fallback)
        alt Continuum success
            Continuum-->>Router: Response
            Router-->>Client: Response
        else Continuum unavailable
            Router->>Default: Forward request (last resort)
            Default-->>Router: Response/Error
            Router-->>Client: Response/Error
        end
    end
```
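The failover flow in the diagram can be sketched as a simple loop over the primary followed by each fallback in order. The `Proxy` type and the `healthy` flag here are hypothetical placeholders for a real forwarded request succeeding or failing.

```rust
#[derive(Clone, Debug)]
struct Proxy {
    name: String,
    healthy: bool, // stand-in for an actual request to this proxy succeeding
}

// Try the primary first, then each fallback in order; return the name
// of the proxy that served the request, or None if all of them failed.
fn forward_with_failover(primary: &Proxy, fallbacks: &[Proxy]) -> Option<String> {
    std::iter::once(primary)
        .chain(fallbacks.iter())
        .find(|p| p.healthy)
        .map(|p| p.name.clone())
}

fn main() {
    let tinfoil = Proxy { name: "tinfoil".into(), healthy: false };
    let continuum = Proxy { name: "continuum".into(), healthy: true };
    // Tinfoil down: the request falls back to Continuum.
    assert_eq!(
        forward_with_failover(&tinfoil, &[continuum]),
        Some("continuum".to_string())
    );
    println!("failover ok");
}
```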

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • Temp continuum priority #88 — Modifies src/proxy_config.rs special-case routing for llama3-3-70b; directly related to the removed special-case here.
  • Model routing #53 — Adjusts model routing and provider mapping (Tinfoil/Continuum) in src/proxy_config.rs; overlaps in logic changes.
  • Append services to apple external oauth #32 — Previously appended PCR entries to the same PCR history files (pcrDevHistory.json, pcrProdHistory.json); directly related to the data updates.

Poem

I thump my paws on cables bright,
Routes aligned by moonlit light,
Tinfoil leads, Continuum stays near,
PCR lines stamped crisp and clear.
Hop, commit — the path is right. 🐇✨



📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 11ad28b and 8438451.

⛔ Files ignored due to path filters (2)
  • pcrDev.json is excluded by !pcrDev.json
  • pcrProd.json is excluded by !pcrProd.json
📒 Files selected for processing (3)
  • pcrDevHistory.json (1 hunks)
  • pcrProdHistory.json (1 hunks)
  • src/proxy_config.rs (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • pcrDevHistory.json
  • src/proxy_config.rs
  • pcrProdHistory.json



@greptile-apps greptile-apps Bot left a comment


Greptile Summary

This PR reverts the system's AI model routing configuration from using Llama 3.3 70B as the primary provider back to the standard Tinfoil-primary architecture. The changes span both application logic and AWS Nitro Enclave security configurations across development and production environments.

The core change in src/proxy_config.rs removes special-case handling that was routing Llama 3.3 70B requests through Continuum as the primary provider instead of following the normal Tinfoil-primary pattern. This temporary workaround included conditional logic that would swap primary and fallback providers specifically for the 'llama3-3-70b' model, along with associated warning logs indicating this was a temporary fix for Tinfoil provider issues.

The PCR (Platform Configuration Register) files are updated to reflect the new enclave state after this configuration change. PCR values are cryptographic hashes that verify the integrity of code running in AWS Nitro Enclaves - any change to the routing logic requires corresponding PCR updates. The changes update both development (pcrDev.json) and production (pcrProd.json) PCR configurations, reverting PCR0 and PCR2 values while keeping PCR1 unchanged. These new PCR values match entries being added to the history files (pcrDevHistory.json and pcrProdHistory.json), which maintain an audit trail of all enclave state changes with timestamps and cryptographic signatures.

This revert suggests that the underlying Tinfoil provider issues that necessitated the Llama 3.3 workaround have been resolved, allowing the system to return to its standard dual-provider architecture where Tinfoil serves as primary with Continuum as fallback for all models.

Confidence score: 4/5

  • This PR is generally safe to merge as it's reverting to a known working configuration and follows established patterns for PCR management
  • Score reflects that while the changes are well-structured, PCR modifications in secure enclave environments require careful validation to ensure attestation continues to work properly
  • Pay close attention to the PCR configuration files (pcrDev.json, pcrProd.json) to ensure the hash values are correct for the enclave deployment

5 files reviewed, no comments

Edit Code Review Bot Settings | Greptile


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/proxy_config.rs (1)

268-318: Ensure all “continuum” literals are replaced with a single, dynamic provider key

The change to use self.default_proxy.provider_name in your snippet is spot-on—but it isn’t enough on its own. The codebase still has dozens of hard-coded "continuum" buckets sprinkled throughout, and tests/assertions that expect "continuum". To avoid breaking equivalency lookups, routing logic, and CI tests, you must centralize the provider identifier into one constant or method, then swap out every literal occurrence.

Please refactor the following areas:

• MODEL_EQUIVALENCIES initialization (lines 27–37):
– Replace inner-map keys "continuum" with a CONTINUUM_PROVIDER constant (or call a helper).
• Default proxy selection (lines 166–170):
– Use CONTINUUM_PROVIDER rather than the literal.
• Equivalency checks (lines 283–287 & 341–348):
– Change provider_names.get("continuum") and any subsequent lookups to use the constant.
• available_models_by_provider buckets (lines 311–316):
– Already suggested: use provider_key here.
• Route-building (lines 383–385):
– Swap available_models_by_provider.get("continuum") for get(&CONTINUUM_PROVIDER).
• Tests (lines 478–482, 559–563, 642–646, 654–658, 724–727):
– Update all assert_eq!(… "continuum") or method calls using "continuum" to reference the new constant or dynamic name.

A consolidated diff might look like:

```diff
--- a/src/proxy_config.rs
+++ b/src/proxy_config.rs
@@ -15,6 +15,7 @@ use std::collections::{HashMap, HashSet};
+pub const CONTINUUM_PROVIDER: &str = "continuum";
 
 // MODEL_EQUIVALENCIES: map canonical name → { provider_name → actual_model }
@@ -27,13 +28,13 @@ lazy_static! {
         llama_33_70b.insert("tinfoil", TINFOIL_LLAMA_33_70B);
-        llama_33_70b.insert("continuum", CONTINUUM_LLAMA_33_70B);
+        llama_33_70b.insert(CONTINUUM_PROVIDER, CONTINUUM_LLAMA_33_70B);
         equivalencies.insert(CANONICAL_LLAMA_33_70B, llama_33_70b);
 
         let mut gpt_oss_120b = HashMap::new();
         gpt_oss_120b.insert("tinfoil", TINFOIL_GPT_OSS_120B);
-        gpt_oss_120b.insert("continuum", CONTINUUM_GPT_OSS_120B);
+        gpt_oss_120b.insert(CONTINUUM_PROVIDER, CONTINUUM_GPT_OSS_120B);
         equivalencies.insert(CANONICAL_GPT_OSS_120B, gpt_oss_120b);
```

…and then, throughout the file replace every literal "continuum" (in provider_names.get, .insert, .get, assert_eq!, log messages, etc.) with either CONTINUUM_PROVIDER or provider_key.clone().

This mandatory refactor will ensure you don’t accidentally break any existing logic or tests that assume the bucket name.

🧹 Nitpick comments (4)
src/proxy_config.rs (4)

351-355: Map canonical model name to the same route and make fallback logging provider-agnostic.

  • You correctly map both provider-specific names to a single route with Tinfoil primary and default proxy fallback. However, the canonical name (e.g., "llama-3.3-70b") isn’t currently mapped, contrary to the PR summary. Adding it prevents “unknown model” routes when callers use canonical identifiers. Also, the log message hard-codes “Continuum” even if the default proxy is OpenAI.

Apply this focused diff inside the current block to:

  • Insert canonical name mapping in addition to Tinfoil and Continuum IDs
  • Log the actual default provider name instead of a hard-coded “Continuum”
```diff
@@
-                                        // This model has a Continuum fallback
-                                        let route = ModelRoute {
-                                            primary: tinfoil_proxy.clone(),
-                                            fallbacks: vec![self.default_proxy.clone()],
-                                        };
+                                        // This model has a fallback via the default proxy provider
+                                        let route = ModelRoute {
+                                            primary: tinfoil_proxy.clone(),
+                                            fallbacks: vec![self.default_proxy.clone()],
+                                        };
                                         model_routes.insert(model_name.clone(), route.clone());
                                         // Also map the Continuum name to the same route
-                                        model_routes.insert(continuum_equiv.to_string(), route);
+                                        model_routes.insert(continuum_equiv.to_string(), route.clone());
+                                        // Also map the canonical name to the same route (when known)
+                                        if let Some((canonical, _)) = MODEL_EQUIVALENCIES
+                                            .iter()
+                                            .find(|(_, m)| m.get("tinfoil") == Some(&model_name.as_str()))
+                                        {
+                                            model_routes.insert((*canonical).to_string(), route.clone());
+                                        }
                                         has_continuum_fallback = true;
-                                        info!("Model available from both providers - Tinfoil ({}) primary, Continuum ({}) fallback", 
-                                              model_name, continuum_equiv);
+                                        info!(
+                                            "Model available from both providers - Tinfoil ({}) primary, {} ({}) fallback",
+                                            model_name, self.default_proxy.provider_name, continuum_equiv
+                                        );
```

Also applies to: 358-361


340-345: Iterate with canonical key available to avoid a second lookup.

Using values() loses the canonical name you need for route mapping. Iterate over (canonical, mapping) directly to simplify logic and avoid the extra find().

```diff
-                    for provider_names in MODEL_EQUIVALENCIES.values() {
+                    for (canonical, provider_names) in MODEL_EQUIVALENCIES.iter() {
                         if let (Some(tinfoil_equiv), Some(continuum_equiv)) = (
                             provider_names.get("tinfoil"),
                             provider_names.get("continuum"),
                         ) {
-                            if *tinfoil_equiv == model_name {
+                            if *tinfoil_equiv == model_name {
                                 // Check if Continuum has the equivalent model
                                 if let Some(continuum_models) =
                                     available_models_by_provider.get("continuum")
                                 {
                                     if continuum_models.contains(&continuum_equiv.to_string()) {
                                         // … see canonical insertion in the other diff hunk …
                                     }
                                 }
                             }
                         }
                     }
```

If you adopt this change, you can insert the canonical mapping with:
model_routes.insert((*canonical).to_string(), route.clone()); inside the same branch.


52-58: Avoid leaking API keys via Debug; redact secrets in ProxyConfig.

ProxyConfig derives Debug and tests assert api_key appears in Debug output. This creates a risk of secret leakage if configs are ever logged. Prefer a manual Debug impl that redacts api_key.

Example redacted Debug:

```rust
use std::fmt;

impl fmt::Debug for ProxyConfig {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("ProxyConfig")
            .field("base_url", &self.base_url)
            .field("api_key", &self.api_key.as_ref().map(|_| "REDACTED"))
            .field("provider_name", &self.provider_name)
            .finish()
    }
}
```

If you adopt this, update test_model_route_structure/test_proxy_config_debug_trait expectations to not require the literal key contents.

Also applies to: 700-714


595-606: Add a test to assert canonical-name routing resolves to Tinfoil primary with default fallback.

Given the routing change, a focused unit test will prevent regressions:

Sketch:

```rust
#[tokio::test]
async fn test_canonical_name_maps_to_same_route() {
    let router = ProxyRouter::new(
        "http://continuum.example.com".to_string(),
        None,
        Some("http://tinfoil.example.com".to_string()),
    );

    // Pre-populate cache or mock HTTP to include a pair { tinfoil: llama3-3-70b, default: ibnzterrell/... }.
    // Then assert all three identifiers resolve to identical routes:
    // - "llama-3.3-70b" (canonical)
    // - "llama3-3-70b"   (tinfoil)
    // - "ibnzterrell/Meta-Llama-3.3-70B-Instruct-AWQ-INT4" (default)
}
```

I can provide a full version with a small mock for fetch_models_from_proxy if you’d like.

Also applies to: 608-628, 630-672

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 0bc5032 and 11ad28b.

⛔ Files ignored due to path filters (2)
  • pcrDev.json is excluded by !pcrDev.json
  • pcrProd.json is excluded by !pcrProd.json
📒 Files selected for processing (3)
  • pcrDevHistory.json (1 hunks)
  • pcrProdHistory.json (1 hunks)
  • src/proxy_config.rs (1 hunks)
⏰ Context from checks skipped due to timeout of 100000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Development Reproducible Build
🔇 Additional comments (3)
pcrProdHistory.json (1)

289-295: PCR history verification passed — approving changes

All sanity checks for the appended PCR entry in pcrProdHistory.json and the corresponding dev entry have passed:

  • Required fields are present (PCR0, PCR1, PCR2, timestamp, signature)
  • PCR values are valid hex strings of sufficient length
  • Signatures conform to base64-like format
  • Timestamps strictly increase in both prod and dev histories

No further action required.
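The sanity checks listed above can be sketched as a standalone validator. This is an illustrative sketch only: the `PcrEntry` struct mirrors the fields named in the comment (PCR0/1/2, timestamp, signature), while the real history files are JSON arrays and the actual verification script is not shown in this PR.

```rust
struct PcrEntry {
    pcr0: String,
    pcr1: String,
    pcr2: String,
    timestamp: u64,
    signature: String,
}

// PCR values must be non-empty hex strings.
fn is_hex(s: &str) -> bool {
    !s.is_empty() && s.chars().all(|c| c.is_ascii_hexdigit())
}

// Signatures should at least conform to the base64 alphabet.
fn looks_base64(s: &str) -> bool {
    !s.is_empty()
        && s.chars()
            .all(|c| c.is_ascii_alphanumeric() || matches!(c, '+' | '/' | '='))
}

// Validate an append-only PCR history: every entry well-formed and
// timestamps strictly increasing across the file.
fn validate_history(entries: &[PcrEntry]) -> Result<(), String> {
    let mut last_ts = 0u64;
    for (i, e) in entries.iter().enumerate() {
        for (name, v) in [("PCR0", &e.pcr0), ("PCR1", &e.pcr1), ("PCR2", &e.pcr2)] {
            if !is_hex(v) {
                return Err(format!("entry {}: {} is not valid hex", i, name));
            }
        }
        if !looks_base64(&e.signature) {
            return Err(format!("entry {}: signature is not base64-like", i));
        }
        if e.timestamp <= last_ts {
            return Err(format!("entry {}: timestamp {} not increasing", i, e.timestamp));
        }
        last_ts = e.timestamp;
    }
    Ok(())
}

fn main() {
    let history = vec![
        PcrEntry {
            pcr0: "ab12".into(), pcr1: "cd34".into(), pcr2: "ef56".into(),
            timestamp: 1756226378, signature: "c2ln".into(),
        },
        PcrEntry {
            pcr0: "ab13".into(), pcr1: "cd34".into(), pcr2: "ef57".into(),
            timestamp: 1756226379, signature: "c2lnMg==".into(),
        },
    ];
    assert!(validate_history(&history).is_ok());
    println!("history valid");
}
```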

pcrDevHistory.json (1)

290-294: Dev PCR append is consistent; matches Prod PCR1 and maintains ordering.

  • New entry appended, timestamp 1756226379 fits ascending order.
  • PCR1 matches the new Prod entry’s PCR1, which is expected.

If helpful, reuse the same shell script from the Prod comment to validate both files in one go.

src/proxy_config.rs (1)

237-256: Review Update: model_to_proxy Is Intended for Overrides Only

The model_to_proxy map is explicitly used to override routing for secondary providers (e.g., Tinfoil) and is not consulted for default‐provider (Continuum) models. Continuum models fall back to the router’s default_proxy by design, so they never need entries in model_to_proxy. No code path reads model_to_proxy to route Continuum models.

– No changes required for default‐provider routing.
– The existing logic correctly:

  • Populates model_to_proxy only when a non-default proxy (Tinfoil) is configured
  • Falls back to default_proxy for all other models

✅ This concern can be dismissed.

Likely an incorrect or invalid review comment.

AnthonyRonning and others added 2 commits August 26, 2025 12:55
Removed the temporary swap that was using Continuum as primary for
Llama 3.3 70B. Tinfoil is now restored as the primary provider with
Continuum as fallback, as originally intended.

This reverts the temporary workaround introduced in commit 772eb9f.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Updated PCR values for both dev and prod environments after
reverting the Llama 3.3 70B provider priority change.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
@AnthonyRonning AnthonyRonning force-pushed the revert-llama33-to-tinfoil-primary branch from 11ad28b to 8438451 on August 26, 2025 17:57

@greptile-apps greptile-apps Bot left a comment


Greptile Summary

This PR reverts the Llama 3.3 70B model routing configuration from using Continuum as the primary provider back to the standard configuration where Tinfoil is primary with Continuum as fallback. The change removes special case handling in src/proxy_config.rs that was temporarily implemented when Tinfoil had issues serving this specific model.

The core change standardizes model routing behavior across all models, ensuring consistent failover patterns. When both Tinfoil and Continuum support a model, the system now uniformly uses Tinfoil as primary and Continuum as fallback, eliminating model-specific exceptions that created unpredictable routing behavior.

As part of this infrastructure change, the PR updates Platform Configuration Register (PCR) values in both development and production environments. PCRs are cryptographic measurements used for AWS Nitro Enclave attestation - they must be updated whenever the enclave code or configuration changes. The updated PCR values in pcrDev.json, pcrProd.json, pcrDevHistory.json, and pcrProdHistory.json reflect the new binary state after the proxy configuration changes. The history files maintain an append-only audit trail with cryptographically signed entries for security verification.

Confidence score: 5/5

  • This PR is safe to merge with minimal risk as it reverts to a previously stable configuration
  • Score reflects straightforward configuration changes with proper PCR updates following established procedures
  • No files require special attention as all changes follow standard deployment patterns

5 files reviewed, no comments

Edit Code Review Bot Settings | Greptile

@AnthonyRonning AnthonyRonning merged commit 8abd0b7 into master Aug 26, 2025
8 checks passed
@AnthonyRonning AnthonyRonning deleted the revert-llama33-to-tinfoil-primary branch August 26, 2025 18:05
@coderabbitai coderabbitai Bot mentioned this pull request Sep 13, 2025
Closed
@coderabbitai coderabbitai Bot mentioned this pull request Oct 13, 2025
