Gemini-powered RAG CLI for Markdown documents — index, search, and answer questions using Vertex AI embeddings and DuckDB vector storage.
- Markdown indexing: Heading-aware chunking with JP/EN sentence boundary detection
- Semantic search: Gemini text-embedding-005 with DuckDB cosine similarity
- Context-aware answers: Vector search + adjacent chunk expansion + Gemini chat
- Cross-platform: Python + uv, runs on macOS, Linux, and Windows
- Prompt injection defense: Nonce-tagged XML wrapping for all user content
- Python 3.11+
- uv package manager
- Google Cloud project with Vertex AI API enabled
- Application Default Credentials:
gcloud auth application-default login
# From source
git clone https://github.com/nlink-jp/gem-rag.git
cd gem-rag
uv sync
# Or install as a tool
uv tool install gem-ragConfiguration is resolved in the following priority order:
- CLI flags (highest priority)
- Environment variables (
GEM_RAG_*) .envfile (in current directory)- Config file (
~/.config/gem-rag/config.toml) - Defaults (lowest priority)
Create ~/.config/gem-rag/config.toml:
project = "your-gcp-project-id"
location = "us-central1"
chat_model = "gemini-2.5-flash"
embedding_model = "text-embedding-005"
db_path = "./gem-rag.db"See config.example.toml for a full example.
Set environment variables (or create a .env file):
GEM_RAG_PROJECT=your-gcp-project-id # Required
GEM_RAG_LOCATION=us-central1 # Default
GEM_RAG_CHAT_MODEL=gemini-2.5-flash # Default
GEM_RAG_EMBEDDING_MODEL=text-embedding-005 # Default
GEM_RAG_DB_PATH=./gem-rag.db # Default# Index Markdown files
gem-rag index --dir ./docs
# Ask a question
gem-rag ask "How does authentication work?"
# JSON output with sources
gem-rag ask --json "What are the API endpoints?"
# Manage documents
gem-rag docs list
gem-rag docs delete <id>
# Re-embed after changing embedding model
gem-rag reindexmake build # Build package to dist/
make test # Run tests
make lint # Run linter- Database Schema — DuckDB tables, vector search, idempotency
- Chunking Architecture — heading-aware splitting, token estimation, JP/EN boundaries
- Cross-Language Search — JA/EN query rewriting, parallel search, result merging
- 日本語ドキュメント
MIT