fix: update Gemma models to 20k input tokens #193
Conversation
Updated both Gemma 3 27B model configurations from 70,000 to 20,000 token limit to optimize performance and resource usage. Resolves #192 Co-authored-by: Anthony <AnthonyRonning@users.noreply.github.com>
Walkthrough: Removed the
Sequence Diagram(s): Skipped — changes are configuration-only and do not alter control flow.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~8 minutes
Out-of-scope changes: No out-of-scope functional changes detected.
@claude oh also remove the
Greptile Summary
This PR updates the token limits for both Gemma 3 27B model configurations in the ModelSelector.tsx file, reducing them from 70,000 to 20,000 tokens. The change affects two model entries in the MODEL_CONFIG object: google/gemma-3-27b-it and leon-se/gemma-3-27b-it-fp8-dynamic.
The MODEL_CONFIG object serves as the central configuration registry for all available AI models in the application, defining properties like display names, badges, access requirements, and crucially, token limits. These token limits are used throughout the codebase via the getModelTokenLimit function to enforce input constraints during chat operations, ensuring that conversations don't exceed the model's processing capabilities.
This configuration change integrates seamlessly with the existing model management system - the ModelSelector component uses this configuration to display available models to users, while the chat functionality respects these limits to prevent API errors or performance degradation. The change maintains consistency with the existing model configuration pattern while optimizing resource usage for these specific high-parameter models.
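To make the registry-plus-lookup pattern described above concrete, here is a minimal standalone sketch; the interface shape, property names, and entries are assumptions for illustration, not the actual contents of ModelSelector.tsx.

```typescript
// Hypothetical shape of the MODEL_CONFIG registry; only tokenLimit is
// confirmed by this PR, the other fields are illustrative assumptions.
interface ModelConfig {
  displayName: string;
  tokenLimit: number;
  requiresStarter?: boolean;
  supportsVision?: boolean;
}

const DEFAULT_TOKEN_LIMIT = 64000;

const MODEL_CONFIG: Record<string, ModelConfig> = {
  "leon-se/gemma-3-27b-it-fp8-dynamic": {
    displayName: "Gemma 3 27B (FP8 Dynamic)",
    tokenLimit: 20000, // reduced from 70000 in this PR
    requiresStarter: true,
    supportsVision: true,
  },
};

// Resolve a model's input-token cap, falling back to the default for
// model IDs that are not present in the registry.
function getModelTokenLimit(modelId: string): number {
  return MODEL_CONFIG[modelId]?.tokenLimit ?? DEFAULT_TOKEN_LIMIT;
}
```

Chat code paths would call `getModelTokenLimit` with the currently selected model ID before sending input, trimming or rejecting anything over the returned cap.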
Confidence score: 5/5
- This PR is extremely safe to merge with virtually no risk of causing issues
- Score reflects a simple, well-isolated configuration change that reduces resource usage rather than introducing new functionality or complexity
- No files require special attention as this is a straightforward parameter adjustment
1 file reviewed, no comments
Claude finished @AnthonyRonning's task.
Summary: Successfully removed the unused
Changes made:
The PR now only contains the
Removed the unused google/gemma-3-27b-it model entry from ModelSelector.tsx as requested since it's not being used in the application. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-authored-by: Anthony <AnthonyRonning@users.noreply.github.com>
Greptile Summary
This review covers only the changes made since the last review (commit af7eaa8), not the entire PR.
The most recent changes modify the Gemma 3 27B model configurations in the ModelSelector.tsx component. The update removes one model variant (google/gemma-3-27b-it) entirely and reduces the token limit for the remaining variant (leon-se/gemma-3-27b-it-fp8-dynamic) from 70,000 to 20,000 tokens. This change consolidates the Gemma model offerings while optimizing resource usage.
The modification directly impacts the model configuration object that defines available AI models in the application's UI. The remaining Gemma model retains its "Starter" badge designation, vision support capability, and starter-tier requirement, but with the reduced context window. This change fits into the broader model management system where different AI models are configured with specific parameters like token limits, access requirements, and feature capabilities.
The consolidation from two Gemma variants to one suggests a strategic decision to focus on the FP8 dynamic variant, which likely offers better performance characteristics while maintaining the same user-facing functionality.
Confidence score: 5/5
- This PR is safe to merge with minimal risk
- Score reflects a simple configuration change with clear intent and minimal complexity
- No files require special attention
1 file reviewed, no comments
Deploying maple with
| Latest commit: | 6b12161 |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://437b352f.maple-ca8.pages.dev |
| Branch Preview URL: | https://claude-issue-192-20250813-04.maple-ca8.pages.dev |
Actionable comments posted: 0
🔭 Outside diff range comments (1)
frontend/src/components/ModelSelector.tsx (1)
72-75: Ensure 20k cap also applies to unknown Gemma variants (persisted IDs).
If a user has a persisted model ID like `google/gemma-3-27b-it`, `getModelTokenLimit` will return the default 64000 (not 20000), which violates the 20k cap goal for Gemma. Add a Gemma-specific fallback to enforce 20k for any Gemma variant not explicitly listed in MODEL_CONFIG. Apply this diff:

```diff
 export function getModelTokenLimit(modelId: string): number {
-  return MODEL_CONFIG[modelId]?.tokenLimit || DEFAULT_TOKEN_LIMIT;
+  const cfg = MODEL_CONFIG[modelId];
+  if (cfg) return cfg.tokenLimit;
+  // Safety: enforce 20k cap for Gemma variants not in MODEL_CONFIG
+  if (modelId.toLowerCase().includes("gemma")) return 20000;
+  return DEFAULT_TOKEN_LIMIT;
 }
```
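For illustration, the suggested fallback in the diff above behaves like this self-contained sketch; the registry contents here are placeholders, not the real MODEL_CONFIG.

```typescript
// Placeholder registry standing in for the real MODEL_CONFIG.
const DEFAULT_TOKEN_LIMIT = 64000;
const MODEL_CONFIG: Record<string, { tokenLimit: number }> = {
  "leon-se/gemma-3-27b-it-fp8-dynamic": { tokenLimit: 20000 },
};

function getModelTokenLimit(modelId: string): number {
  const cfg = MODEL_CONFIG[modelId];
  if (cfg) return cfg.tokenLimit;
  // Safety: enforce the 20k cap for Gemma variants not in MODEL_CONFIG,
  // e.g. a persisted "google/gemma-3-27b-it" ID from an older session.
  if (modelId.toLowerCase().includes("gemma")) return 20000;
  return DEFAULT_TOKEN_LIMIT;
}
```

With this in place, a stale persisted Gemma ID is capped at 20000 instead of silently inheriting the 64000 default.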
🧹 Nitpick comments (1)
frontend/src/components/ModelSelector.tsx (1)
237-249: Gracefully migrate deprecated Gemma model ID if it's currently selected.
Unknown models render as "Coming Soon," but if the current selection is the removed `google/gemma-3-27b-it`, the session could still operate with an incorrect default token cap. A small one-time migration ensures users are moved to the supported Gemma variant. Add this effect in the component:

```tsx
useEffect(() => {
  // One-time migration: remap deprecated Gemma model to current one
  if (model === "google/gemma-3-27b-it") {
    setModel("leon-se/gemma-3-27b-it-fp8-dynamic");
  }
}, [model, setModel]);
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
`frontend/src/components/ModelSelector.tsx` (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit Inference Engine (CLAUDE.md)
**/*.{ts,tsx,js,jsx}: Use 2-space indentation, double quotes, and a 100-character line limit for formatting
Use camelCase for variable and function names
Use try/catch with specific error types for error handling
Files:
frontend/src/components/ModelSelector.tsx
**/*.{ts,tsx}
📄 CodeRabbit Inference Engine (CLAUDE.md)
Use strict TypeScript typing and avoid `any` when possible
Files:
frontend/src/components/ModelSelector.tsx
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: build-ios
- GitHub Check: build-linux
- GitHub Check: browseruse-tests
- GitHub Check: build-macos (universal-apple-darwin)
🔇 Additional comments (2)
frontend/src/components/ModelSelector.tsx (2)
35-41: Gemma token limit reduced to 20k — aligns with PR objective.
The `leon-se/gemma-3-27b-it-fp8-dynamic` `tokenLimit` is correctly updated to 20000 and the unused `google/gemma-3-27b-it` entry is no longer present. This fulfills #192 and the PR description.
26-67: No stale Gemma references found
A repo-wide search confirms that the only `70_000` token limits are in ModelSelector.tsx for the two Llama entries, and there are no lingering `google/gemma-3-27b-it` IDs or Gemma tokenLimit declarations.
• frontend/src/components/ModelSelector.tsx:
  - Lines 29, 33 – `tokenLimit: 70000` (Llama models)