
fix: update Gemma models to 20k input tokens#193

Merged
AnthonyRonning merged 2 commits into master from claude/issue-192-20250813-0420 on Aug 13, 2025

Conversation

AnthonyRonning (Contributor) commented Aug 13, 2025

Updated both Gemma 3 27B model configurations from 70,000 to 20,000 token limit to optimize performance and resource usage.

Resolves #192

Generated with Claude Code

Summary by CodeRabbit

  • Chores
    • Removed one Gemma 3 27B model from the publicly listed models (it will appear as unavailable/Coming Soon).
    • Reduced maximum context size for the remaining Gemma 3 27B Instruct (FP8 dynamic) to 20k tokens.
    • Impact: Longer prompts may be truncated or rejected for these models — please adjust prompt lengths accordingly.
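For callers of these models, the impact above can be guarded against before a request is sent. A minimal sketch, assuming a rough four-characters-per-token heuristic (`estimateTokens` is a hypothetical helper for illustration, not a function from this repository):

```typescript
// Hypothetical client-side guard for the new 20k cap.
// estimateTokens is an assumed ~4-chars-per-token heuristic,
// not code from this codebase.
const GEMMA_TOKEN_LIMIT = 20000;

function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function fitsWithinLimit(prompt: string, limit: number = GEMMA_TOKEN_LIMIT): boolean {
  return estimateTokens(prompt) <= limit;
}
```

A caller could check `fitsWithinLimit(prompt)` and truncate or reject before hitting the API, rather than relying on a server-side error.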

coderabbitai (bot) commented Aug 13, 2025

Walkthrough

Removed the google/gemma-3-27b-it entry from MODEL_CONFIG and reduced the tokenLimit for leon-se/gemma-3-27b-it-fp8-dynamic from 70000 to 20000 in frontend/src/components/ModelSelector.tsx. No other logic or control-flow changes.

Changes

Model config updates (frontend/src/components/ModelSelector.tsx): Removed google/gemma-3-27b-it from the exported MODEL_CONFIG; changed tokenLimit for leon-se/gemma-3-27b-it-fp8-dynamic from 70000 to 20000.
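Sketched in TypeScript, the resulting config entry might look like the following. This is illustrative only: field names other than `tokenLimit` (displayName, badge, requiresStarter, supportsVision) are assumed from the discussion in this PR, not copied from ModelSelector.tsx.

```typescript
// Assumed shape of a model config entry; only tokenLimit is
// confirmed by the PR diff, the other fields are illustrative.
interface ModelConfig {
  displayName: string;
  tokenLimit: number;
  badge?: string;
  requiresStarter?: boolean;
  supportsVision?: boolean;
}

// google/gemma-3-27b-it was removed entirely; only the FP8-dynamic
// variant remains, with its tokenLimit reduced from 70000 to 20000.
const MODEL_CONFIG: Record<string, ModelConfig> = {
  "leon-se/gemma-3-27b-it-fp8-dynamic": {
    displayName: "Gemma 3 27B Instruct (FP8 dynamic)",
    tokenLimit: 20000, // was 70000
    badge: "Starter",
    requiresStarter: true,
    supportsVision: true,
  },
};
```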

Sequence Diagram(s)

(Skipped — changes are configuration-only and do not alter control flow.)

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Assessment against linked issues

Objective: Gemma at 20k input tokens (#192). Addressed: yes.

Out-of-scope changes

No out-of-scope functional changes detected.

Poem

I nibble code like clover sprigs—
One Gemma gone, one trimmed to twenty,
My paws tap keys in quiet glee,
Limits set, the fields feel free.
Thump-thump—small changes, big hop.


AnthonyRonning (author) commented:
@claude oh also remove the google/gemma-3-27b-it one, it's not used

greptile-apps (bot) left a comment

Greptile Summary

This PR updates the token limits for both Gemma 3 27B model configurations in the ModelSelector.tsx file, reducing them from 70,000 to 20,000 tokens. The change affects two model entries in the MODEL_CONFIG object: google/gemma-3-27b-it and leon-se/gemma-3-27b-it-fp8-dynamic.

The MODEL_CONFIG object serves as the central configuration registry for all available AI models in the application, defining properties like display names, badges, access requirements, and crucially, token limits. These token limits are used throughout the codebase via the getModelTokenLimit function to enforce input constraints during chat operations, ensuring that conversations don't exceed the model's processing capabilities.

This configuration change integrates seamlessly with the existing model management system - the ModelSelector component uses this configuration to display available models to users, while the chat functionality respects these limits to prevent API errors or performance degradation. The change maintains consistency with the existing model configuration pattern while optimizing resource usage for these specific high-parameter models.
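A rough sketch of the lookup pattern described here, trimmed to the relevant field (the 64000 default is taken from the getModelTokenLimit discussion elsewhere in this thread; the shape is simplified for illustration):

```typescript
// Simplified sketch of the central registry and its lookup helper.
const DEFAULT_TOKEN_LIMIT = 64000;

const MODEL_CONFIG: Record<string, { tokenLimit: number }> = {
  "leon-se/gemma-3-27b-it-fp8-dynamic": { tokenLimit: 20000 },
};

// Known models get their configured limit; unknown IDs fall back
// to the default, which the chat layer uses to cap input size.
function getModelTokenLimit(modelId: string): number {
  return MODEL_CONFIG[modelId]?.tokenLimit || DEFAULT_TOKEN_LIMIT;
}
```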

Confidence score: 5/5

  • This PR is extremely safe to merge with virtually no risk of causing issues
  • Score reflects a simple, well-isolated configuration change that reduces resource usage rather than introducing new functionality or complexity
  • No files require special attention as this is a straightforward parameter adjustment

1 file reviewed, no comments


claude (bot) commented Aug 13, 2025

Claude finished @AnthonyRonning's task.


Todo List

  • Read ModelSelector.tsx to understand current model configurations
  • Remove the unused google/gemma-3-27b-it model entry
  • Run formatting, linting, and build commands
  • Commit and push changes to the PR branch

Summary

Successfully removed the unused google/gemma-3-27b-it model configuration from the ModelSelector.tsx file as requested. The model entry at lines 35-40 has been completely removed.

Changes made:

  • Removed the google/gemma-3-27b-it model configuration object from the MODEL_CONFIG
  • All formatting, linting, and build commands completed successfully
  • Changes committed and pushed to the PR branch claude/issue-192-20250813-0420

The PR now only contains the leon-se/gemma-3-27b-it-fp8-dynamic model with the updated 20k token limit, and the unused model has been cleaned up.


Branch: claude/issue-192-20250813-0420

Removed the unused google/gemma-3-27b-it model entry from ModelSelector.tsx
as requested since it's not being used in the application.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-authored-by: Anthony <AnthonyRonning@users.noreply.github.com>

greptile-apps (bot) left a comment

Greptile Summary

This review covers only the changes made since the last review (commit af7eaa8), not the entire PR.

The most recent changes modify the Gemma 3 27B model configurations in the ModelSelector.tsx component. The update removes one model variant (google/gemma-3-27b-it) entirely and reduces the token limit for the remaining variant (leon-se/gemma-3-27b-it-fp8-dynamic) from 70,000 to 20,000 tokens. This change consolidates the Gemma model offerings while optimizing resource usage.

The modification directly impacts the model configuration object that defines available AI models in the application's UI. The remaining Gemma model retains its "Starter" badge designation, vision support capability, and starter-tier requirement, but with the reduced context window. This change fits into the broader model management system where different AI models are configured with specific parameters like token limits, access requirements, and feature capabilities.

The consolidation from two Gemma variants to one suggests a strategic decision to focus on the FP8 dynamic variant, which likely offers better performance characteristics while maintaining the same user-facing functionality.

Confidence score: 5/5

  • This PR is safe to merge with minimal risk
  • Score reflects a simple configuration change with clear intent and minimal complexity
  • No files require special attention

1 file reviewed, no comments


cloudflare-workers-and-pages (bot) commented
Deploying maple with Cloudflare Pages

Latest commit: 6b12161
Status: ✅  Deploy successful!
Preview URL: https://437b352f.maple-ca8.pages.dev
Branch Preview URL: https://claude-issue-192-20250813-04.maple-ca8.pages.dev


coderabbitai (bot) left a comment

Actionable comments posted: 0

🔭 Outside diff range comments (1)
frontend/src/components/ModelSelector.tsx (1)

72-75: Ensure 20k cap also applies to unknown Gemma variants (persisted IDs).

If a user has a persisted model ID like google/gemma-3-27b-it, getModelTokenLimit will return the default 64000 (not 20000), which violates the 20k cap goal for Gemma. Add a Gemma-specific fallback to enforce 20k for any Gemma variant not explicitly listed in MODEL_CONFIG.

Apply this diff:

 export function getModelTokenLimit(modelId: string): number {
-  return MODEL_CONFIG[modelId]?.tokenLimit || DEFAULT_TOKEN_LIMIT;
+  const cfg = MODEL_CONFIG[modelId];
+  if (cfg) return cfg.tokenLimit;
+  // Safety: enforce 20k cap for Gemma variants not in MODEL_CONFIG
+  if (modelId.toLowerCase().includes("gemma")) return 20000;
+  return DEFAULT_TOKEN_LIMIT;
 }
🧹 Nitpick comments (1)
frontend/src/components/ModelSelector.tsx (1)

237-249: Gracefully migrate deprecated Gemma model ID if it’s currently selected.

Unknown models render as “Coming Soon,” but if the current selection is the removed google/gemma-3-27b-it, the session could still operate with an incorrect default token cap. A small one-time migration ensures users are moved to the supported Gemma variant.

Add this effect in the component:

useEffect(() => {
  // One-time migration: remap deprecated Gemma model to current one
  if (model === "google/gemma-3-27b-it") {
    setModel("leon-se/gemma-3-27b-it-fp8-dynamic");
  }
}, [model, setModel]);
🔇 Additional comments (2)
frontend/src/components/ModelSelector.tsx (2)

35-41: Gemma token limit reduced to 20k — aligns with PR objective.

The leon-se/gemma-3-27b-it-fp8-dynamic tokenLimit is correctly updated to 20000 and the unused google/gemma-3-27b-it entry is no longer present. This fulfills #192 and the PR description.


26-67: No stale Gemma references found
A repo-wide search confirms that the only 70_000 token limits are in ModelSelector.tsx for the two Llama entries, and there are no lingering google/gemma-3-27b-it IDs or Gemma tokenLimit declarations.

• frontend/src/components/ModelSelector.tsx:

  • Lines 29, 33 – tokenLimit: 70000 (Llama models)

@AnthonyRonning AnthonyRonning merged commit b2b84d1 into master Aug 13, 2025
8 checks passed
@AnthonyRonning AnthonyRonning deleted the claude/issue-192-20250813-0420 branch August 13, 2025 16:53


Development

Successfully merging this pull request may close these issues.

Gemma at 20k input tokens
