
Add trimmed gemma270m vocab 3 #800

Open
klei22 wants to merge 3 commits into ReaLLMASIC:master from klei22:add-trimmed-gemma270m-vocab-3

Conversation

Collaborator

@klei22 klei22 commented Apr 16, 2026

This pull request adds a new interactive chat mode to the latin_punct_router_eval.py script, allowing users to input English sentences and compare full vs routed Spanish translations with color-highlighted outputs. It also enhances the validation example output with generated text highlighting and introduces a demo shell script for streamlined testing.

New interactive chat mode and usability improvements:

  • Added a --chat_mode flag and the _run_chat_mode function to enable an interactive mode where users can input English sentences and receive both full and routed Spanish translations, with color highlighting for user input and generated continuations.
  • Introduced ANSI color codes (ANSI_USER, ANSI_GEN, ANSI_RESET) and the _highlight_generated helper to visually distinguish generated segments in both chat and validation outputs.

Output and evaluation enhancements:

  • Modified _generate_translation to return both the plain and color-highlighted generated text, and updated validation example printing to display these highlights.
  • Updated the README.md with documentation for the new chat mode and the highlighting features.

Demo and reproducibility:

  • Added a trim_demo.sh shell script to demonstrate running the evaluation with standard parameters.


Copilot AI left a comment


Pull request overview

Adds an interactive “chat mode” to latin_punct_router_eval.py and enhances output readability by color-highlighting generated continuations, plus includes a small demo script and README updates.

Changes:

  • Add ANSI color helpers and generated-text highlighting in validation/example outputs.
  • Introduce --chat_mode interactive loop to compare full vs routed translations for user-entered English.
  • Add trim_demo.sh and document the new mode/features in README.md.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File descriptions:

  • huggingface_model/gemma/270M/latin_punct_router_eval.py: adds the highlighting helper, changes translation generation to return plain + highlighted text, and introduces the interactive chat mode flag/loop.
  • huggingface_model/gemma/270M/README.md: documents highlighting and the new interactive chat mode.
  • huggingface_model/gemma/270M/trim_demo.sh: adds a convenience script to run the evaluation with standard parameters.


@@ -0,0 +1,13 @@
#!/bin/bash

python latin_punct_router_eval.py \

Copilot AI Apr 16, 2026


trim_demo.sh calls python latin_punct_router_eval.py, which only works if the script is executed from this directory. If someone runs it from the repo root (or any other cwd), it will fail to locate the Python file. Consider invoking via an explicit path (e.g., relative to the script’s own directory) so the demo is runnable from anywhere.

Suggested change:

- python latin_punct_router_eval.py \
+ SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
+ python "$SCRIPT_DIR/latin_punct_router_eval.py" \
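To make the portability point concrete, here is a minimal, hypothetical skeleton of the pattern the suggestion uses (the echo stands in for the real python invocation; the actual trim_demo.sh flags are not reproduced here):

```shell
#!/bin/bash
# Resolve the directory containing this script, regardless of the
# caller's current working directory. "$0" is the path used to invoke
# the script; dirname strips the filename, and the cd/pwd pair
# canonicalizes it to an absolute path.
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"

# Stand-in for the real invocation, which would be:
#   python "$SCRIPT_DIR/latin_punct_router_eval.py" <args...>
echo "would run: python $SCRIPT_DIR/latin_punct_router_eval.py"
```

With this pattern the demo works whether it is launched as `./trim_demo.sh` from its own directory or as `huggingface_model/gemma/270M/trim_demo.sh` from the repo root.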

if "Spanish:" in full_text:
    return full_text.split("Spanish:", 1)[1].strip()
return full_text.strip()
generated = full_text.split("Spanish:", 1)[1].strip()

Copilot AI Apr 16, 2026


full_text.split("Spanish:", 1) splits on the first occurrence of Spanish:. Because the 3-shot prompt itself contains multiple Spanish: lines (one per example plus the final Spanish:), this will include much of the prompt/examples in generated instead of only the model’s continuation. Use the last occurrence (e.g., rsplit("Spanish:", 1)) or otherwise slice based on the final prompt boundary so generated is just the translation for the user’s input.

Suggested change:

- generated = full_text.split("Spanish:", 1)[1].strip()
+ generated = full_text.rsplit("Spanish:", 1)[1].strip()
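The split/rsplit difference is easy to reproduce in isolation. A toy sketch (the prompt text below is made up for illustration, not the PR's actual 3-shot prompt):

```python
# A few-shot prompt contains several "Spanish:" markers, so splitting on
# the FIRST occurrence leaks the in-context examples into the result,
# while rsplit isolates only the final continuation.
full_text = (
    "English: cat Spanish: gato\n"
    "English: dog Spanish: perro\n"
    "English: hello Spanish: hola"
)

first = full_text.split("Spanish:", 1)[1].strip()   # includes later examples
last = full_text.rsplit("Spanish:", 1)[1].strip()   # only the last segment

print(last)  # hola
```

Here `first` starts with "gato" and drags along the remaining examples, while `last` is just "hola", which is the behavior the review comment asks for.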

Collaborator Author

klei22 commented Apr 16, 2026

[Attached charts: latin_trim_sweep_accuracy, latin_trim_reports_combined_accuracy]
