A Python tool to capture a selected screen region on GNOME, extract text using Tesseract OCR (French by default), and translate it using either OpenAI GPT or Google Translate API.
Includes a Tkinter GUI for easy use and editable translation results.
- Region capture using gnome-screenshot
- Image preprocessing for improved OCR accuracy
- OCR with Tesseract (default language French)
- Translation via OpenAI GPT or Google Translate API
- Editable original text with re-translation option
- DPI scaling support based on monitor config
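As an illustration of the region-capture feature, here is a minimal sketch of how gnome-screenshot can be driven from Python. The helper names `build_capture_command` and `capture_region` and the temporary file name are hypothetical and may differ from what the script actually does; the sketch only relies on the `-a` (select an area) and `-f` (output file) options of gnome-screenshot.

```python
import subprocess
import tempfile
from pathlib import Path


def build_capture_command(output_path):
    """Return the gnome-screenshot command for an interactive area capture."""
    # -a lets the user drag a rectangle; -f writes the image to the given file.
    return ["gnome-screenshot", "-a", "-f", str(output_path)]


def capture_region(output_path=None):
    """Capture a user-selected region; return the image path, or None if nothing was saved."""
    if output_path is None:
        # Hypothetical default location, chosen only for this sketch.
        output_path = Path(tempfile.gettempdir()) / "screen_translate_capture.png"
    output_path = Path(output_path)
    # Remove any stale file so a cancelled selection is not mistaken for success.
    output_path.unlink(missing_ok=True)
    subprocess.run(build_capture_command(output_path), check=True)
    return output_path if output_path.exists() else None
```

Checking that the file exists after the command returns is one simple way to detect a cancelled selection, since no image is written in that case.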
Make sure you have the required system dependencies and Python packages installed.
sudo apt update
sudo apt install python3-pip python3.13-venv python3-tk gnome-screenshot tesseract-ocr tesseract-ocr-fra
- Replace tesseract-ocr-fra with your source language pack if needed (see notes below).
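To confirm that the language pack you need is actually installed, you can ask Tesseract for its language list. The sketch below is not part of the script: the function names are hypothetical, and it assumes the `tesseract --list-langs` output starts with a "List of available languages" header followed by one language code per line.

```python
import subprocess


def parse_tesseract_langs(output):
    """Extract language codes from `tesseract --list-langs` output."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    # Skip the header line, e.g. 'List of available languages ... (3):'.
    return [line for line in lines if not line.startswith("List of")]


def installed_tesseract_langs():
    """Run Tesseract and return the installed language codes."""
    result = subprocess.run(
        ["tesseract", "--list-langs"],
        capture_output=True,
        text=True,
        check=True,
    )
    # Assumption: some older releases print the list to stderr, so fall back to it.
    return parse_tesseract_langs(result.stdout or result.stderr)
```

If `"fra"` (or whichever code you set as the source language) is missing from the returned list, install the matching `tesseract-ocr-<code>` package first.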
python3 -m venv venv
source venv/bin/activate
pip install --upgrade pip
pip install \
pillow \
pytesseract \
opencv-python-headless \
numpy \
requests \
python-dotenv \
openai \
tk
Create a .env file in the project root with your API keys:
OPENAI_API_KEY=xxx-your-openai-key-xxx
GOOGLE_API_KEY=xxx-your-google-translate-key-xxx
Make sure to replace the placeholders with your actual API keys.
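A quick way to verify the keys before launching the GUI is to check the environment for them. This is a minimal sketch, not the script's own logic: `REQUIRED_KEYS` and `available_translators` are hypothetical names, and treating values that start with `xxx-` as unset is an assumption based on the placeholders shown above.

```python
import os

# Environment variable needed by each translator (names taken from the .env example).
REQUIRED_KEYS = {
    "OpenAI GPT": "OPENAI_API_KEY",
    "Google Translate": "GOOGLE_API_KEY",
}


def available_translators(env=None):
    """Return the translators whose API key looks configured."""
    env = os.environ if env is None else env
    available = []
    for name, key in REQUIRED_KEYS.items():
        value = env.get(key, "").strip()
        # Treat empty values and the README placeholders (xxx-...-xxx) as missing.
        if value and not value.startswith("xxx-"):
            available.append(name)
    return available
```

Call `load_dotenv()` from python-dotenv before `available_translators()` so that the values from `.env` are present in `os.environ`.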
By default, the script is set to:
SOURCE_LANG = 'fra' # Tesseract language code (e.g. 'fra' for French, 'eng' for English)
TARGET_LANG_OPEN_AI = "English" # Target language for OpenAI GPT translation
TARGET_LANG_GOOGLE_TRANSLATE = "en" # Target language for Google Translate API
You can change these variables in the script to your preferred source and target languages. Examples:
SOURCE_LANG = 'eng' # English OCR
TARGET_LANG_OPEN_AI = "Spanish"
TARGET_LANG_GOOGLE_TRANSLATE = "es"
Make sure you have installed the corresponding Tesseract language pack, e.g.:
sudo apt install tesseract-ocr-eng
The current OpenAI translation uses the gpt-4o model with a prompt designed for comic-style text, preserving tone and humor:
model="gpt-4o"
prompt = (
    f"Translate the following comic-style text to {target_lang}. Preserve any tone, humor, or emotion.\n\n"
    f"Text:\n{text}\n\n"
    f"Translation:"
)
Feel free to adjust the model or prompt in the translate_with_chatgpt() function to suit your use case.
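If you plan to experiment with the prompt, it can help to pull it into its own function so the wording can be changed and inspected without calling the API. The helper below is a hypothetical sketch (`build_translation_prompt` and its `style` parameter are not in the script); it reproduces the prompt shown above and only makes the text style configurable.

```python
def build_translation_prompt(text, target_lang="English", style="comic-style"):
    """Build the translation prompt; `style` describes the kind of source text."""
    return (
        f"Translate the following {style} text to {target_lang}. "
        "Preserve any tone, humor, or emotion.\n\n"
        f"Text:\n{text.strip()}\n\n"
        "Translation:"
    )


if __name__ == "__main__":
    # Print the prompt to review it before sending it to the model.
    print(build_translation_prompt("Bonjour tout le monde !", "Spanish"))
```

With the default `style`, the result matches the prompt above apart from stripping leading and trailing whitespace from the OCR text.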
Run the script:
python3 screen_translate.py
- Select your preferred translator in the GUI dropdown.
- Click Capture & Translate and select the screen region.
- The extracted text and translation will appear in a popup window.
- You can edit the OCR-extracted text in the top field if corrections are needed.
- Then click 🔄 Retranslate to get an updated translation based on your edits.
Below is an example of the GUI in action:
- This tool currently works on GNOME/Linux environments due to reliance on gnome-screenshot and .config/monitors.xml for DPI scaling.
- If you want to add support for other desktop environments or OS, adapt the screenshot method and scaling retrieval accordingly.
- Error handling is basic; check the console output if something goes wrong.
- The code was generated and refined with the help of ChatGPT.
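For anyone adapting the scaling retrieval mentioned in the notes, here is a rough sketch of reading the scale factor from monitors.xml. It is an assumption about the file layout rather than the script's actual code: it expects `<logicalmonitor>` elements containing `<scale>` and, for the primary display, `<primary>yes</primary>`. The function names are hypothetical, and if the file holds several configurations the first primary monitor found is used.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

MONITORS_XML = Path.home() / ".config" / "monitors.xml"


def parse_primary_scale(xml_text, default=1.0):
    """Return the scale of the primary logical monitor, else the first scale found."""
    root = ET.fromstring(xml_text)
    first_scale = None
    for logical in root.iter("logicalmonitor"):
        scale_text = logical.findtext("scale")
        if scale_text is None:
            continue
        scale = float(scale_text)
        if logical.findtext("primary") == "yes":
            return scale
        if first_scale is None:
            first_scale = scale
    return first_scale if first_scale is not None else default


def read_scale(path=MONITORS_XML, default=1.0):
    """Read the scale from monitors.xml, falling back to `default` on any problem."""
    try:
        return parse_primary_scale(Path(path).read_text(encoding="utf-8"), default)
    except (OSError, ET.ParseError, ValueError):
        return default
```

On another desktop environment, replacing `read_scale()` with that environment's own way of reporting the scale factor should be enough for this part.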
MIT License
- Tesseract OCR
- OpenAI API
- Google Translate API
- ChatGPT for code generation assistance
🤖 Generated with ChatGPT
This codebase was generated and refined using ChatGPT-4 to speed up prototyping and development.
