Code-Sapling/RuyaAI

RuyaAI

A simple, high-performance, fully offline Python app for real-time image captioning using BLIP on local GPUs.

Notes

  • If you have a CUDA device (GPU), install PyTorch before installing the rest of the libraries:
pip3 install torch==2.10.0 torchvision==0.25.0 --index-url https://download.pytorch.org/whl/cu130
  • The project uses Python 3.13, which is also the default version.
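
To confirm that the CUDA build of PyTorch was picked up, a quick check like the one below may help. It is a minimal sketch rather than code from this repository: the helper name `pick_device` is hypothetical, and it falls back to the CPU when PyTorch is not installed yet.

```python
def pick_device() -> str:
    """Return "cuda" when a CUDA-enabled PyTorch build can see a GPU, else "cpu"."""
    try:
        import torch  # imported lazily so the check also works before PyTorch is installed
    except ImportError:
        return "cpu"
    # torch.cuda.is_available() is False for CPU-only builds or when no GPU/driver is found.
    return "cuda" if torch.cuda.is_available() else "cpu"


print(f"Selected device: {pick_device()}")
```

If this prints `cpu` on a machine with an NVIDIA GPU, the CPU-only wheel was probably installed, and rerunning the pip3 command above with the `--index-url` option is the likely fix.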

Installing Libraries

First upgrade pip:

python.exe -m pip install --upgrade pip

Then install the dependencies:

pip install -r requirements.txt

Running Code

python main.py
usage: main.py [-h] [--camera CAMERA] [--model-path MODEL_PATH] [--voice VOICE] [--voice-cuda] [--rate-limit RATE_LIMIT] [--headless]

Offline Real-time Camera Captioning

options:
  -h, --help            show this help message and exit
  --camera CAMERA       Camera index
  --model-path MODEL_PATH
                        HuggingFace model ID or local path for BLIP
  --voice VOICE         Path to a Piper .onnx voice model file. Download voices from https://huggingface.co/rhasspy/piper-voices Example: en_US-lessac-medium.onnx
  --voice-cuda          Run Piper voice inference on CUDA (requires onnxruntime-gpu)
  --rate-limit RATE_LIMIT
                        Seconds between captions in auto mode
  --headless            Run without OpenCV window
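
For readers who want to see how such a command line could be declared, the help text above maps onto `argparse` roughly as follows. This is a sketch reconstructed from the help output only, not the actual contents of main.py: the function name `build_parser`, the argument types, and every default value are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the options listed in the help output."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Offline Real-time Camera Captioning",
    )
    # NOTE: all defaults and types below are placeholders; the real ones are not documented here.
    parser.add_argument("--camera", type=int, default=0, help="Camera index")
    parser.add_argument("--model-path", default=None,
                        help="HuggingFace model ID or local path for BLIP")
    parser.add_argument("--voice", default=None,
                        help="Path to a Piper .onnx voice model file")
    parser.add_argument("--voice-cuda", action="store_true",
                        help="Run Piper voice inference on CUDA (requires onnxruntime-gpu)")
    parser.add_argument("--rate-limit", type=float, default=5.0,
                        help="Seconds between captions in auto mode")
    parser.add_argument("--headless", action="store_true",
                        help="Run without OpenCV window")
    return parser


# Parse an explicit argument list instead of sys.argv so the example is reproducible.
example = build_parser().parse_args(["--camera", "1", "--rate-limit", "2.5", "--headless"])
print(example.camera, example.rate_limit, example.headless)
```

Note that `argparse` converts dashes in option names to underscores, so `--rate-limit` is read back as `rate_limit` and `--voice-cuda` as `voice_cuda`.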

Usage

While the program is running, the following keys are available:

Q - Quit the program
P - Pause the detection
R - Resume the detection
A - Auto mode (generate a caption and speak it at regular intervals, as set by RATE_LIMIT)
M - Manual mode (generate a caption and speak it only when you press G)
G - Generate a caption and speak it (Manual mode only)
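
The key bindings above amount to a small state machine. The sketch below shows one way that logic could be modelled; it is not taken from the repository. The names `CaptionState`, `handle_key`, and `auto_caption_due` are hypothetical, and the starting mode and the behaviour of G while paused are assumptions.

```python
from dataclasses import dataclass


@dataclass
class CaptionState:
    running: bool = True
    paused: bool = False
    auto: bool = True  # assumption: the app starts in Auto mode
    last_caption_time: float = float("-inf")  # so the first auto caption fires at once


def handle_key(state: CaptionState, key: str) -> bool:
    """Apply one key press; return True if a caption should be generated right now."""
    key = key.lower()
    if key == "q":
        state.running = False
    elif key == "p":
        state.paused = True
    elif key == "r":
        state.paused = False
    elif key == "a":
        state.auto = True
    elif key == "m":
        state.auto = False
    elif key == "g":
        # Assumption: G only fires in Manual mode and is ignored while paused.
        return not state.auto and not state.paused
    return False


def auto_caption_due(state: CaptionState, now: float, rate_limit: float) -> bool:
    """Return True when Auto mode should emit a caption, at most once per rate_limit seconds."""
    if not state.auto or state.paused:
        return False
    if now - state.last_caption_time >= rate_limit:
        state.last_caption_time = now
        return True
    return False
```

In a real capture loop, `now` would typically come from `time.monotonic()` and `key` from the window's key-press event, with `rate_limit` taken from the `--rate-limit` option.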
