lite-rag

A CLI-based Retrieval-Augmented Generation (RAG) tool for Markdown documents. Index a directory of Markdown files into a local DuckDB database, then ask natural-language questions answered by a local LLM via an OpenAI-compatible API (e.g., LM Studio).

Japanese and English mixed documents are fully supported with NFKC normalization and mixed-script token estimation.

Features

Index: recursively scans a directory for *.md files, chunks them, embeds with a local model, and stores vectors in DuckDB.
Ask: embeds your question, finds the most relevant chunks, expands context with adjacent chunks, and streams an LLM-generated answer.
Serve: starts a local HTTP API server (POST /api/ask SSE, GET /api/status) with an embedded Web UI that renders Markdown answers in the browser.
Incremental updates: re-indexing only re-processes files whose content has changed (SHA-256 hash check).
Japanese support: Unicode NFKC normalization, Japanese sentence boundaries for chunking, mixed CJK/ASCII token estimation.
Context window expansion: each vector-search hit is expanded by ±N adjacent chunks for continuity; overlapping spans are deduplicated automatically.

Requirements

Tool	Version	Purpose
Go	1.26+	Build toolchain
make	any	Build automation
cc (clang/gcc)	any	CGo (required by go-duckdb)
LM Studio	any	Local LLM inference server

See docs/setup.md for full setup instructions.

Quick Start

git clone <repo-url> lite-rag
cd lite-rag

# Copy and configure
mkdir -p ~/.config/lite-rag
cp config.example.toml ~/.config/lite-rag/config.toml
$EDITOR ~/.config/lite-rag/config.toml   # set api.base_url, api.api_key, models.*

# Install git hooks and linter
make setup

# Build
make build

# Index a directory of Markdown files
./dist/lite-rag index --dir /path/to/docs

# Index a single file
./dist/lite-rag index --file /path/to/doc.md

# Ask a question (CLI)
./dist/lite-rag ask "How do I configure the retry policy?"

# Or start the Web UI server
./dist/lite-rag serve        # opens at http://127.0.0.1:8080

Configuration

Copy config.example.toml to ~/.config/lite-rag/config.toml and adjust:

[api]
base_url = "http://localhost:1234/v1"   # LM Studio default
api_key  = "lm-studio"                 # arbitrary; LM Studio does not validate

[models]
embedding = "nomic-ai/nomic-embed-text-v1.5-GGUF"
chat      = "openai/gpt-oss-20b"

[database]
path = "./lite-rag.db"

[retrieval]
top_k          = 5     # vector search hits
context_window = 1     # adjacent chunks to expand around each hit
chunk_size     = 512
chunk_overlap  = 64
query_rewrite  = false # enable LLM-assisted query rewriting (improves recall, +~2 s/query)

[server]
addr      = "127.0.0.1:8080"  # listen address for `serve`
log_level = "info"             # info | debug | warn | error

Environment variables override file settings:

Variable	Overrides
`LITE_RAG_API_BASE_URL`	`api.base_url`
`LITE_RAG_API_KEY`	`api.api_key`
`LITE_RAG_EMBEDDING_MODEL`	`models.embedding`
`LITE_RAG_CHAT_MODEL`	`models.chat`
`LITE_RAG_DB_PATH`	`database.path`

Usage

lite-rag [--config <path>] [--db <path>] <command>

Commands:
  index   --dir <directory>   Index all *.md files under a directory
          --file <file>        Index a single file (any extension)
  ask     <question>           Answer a question using the indexed documents
  serve                        Start the HTTP API server with embedded Web UI
  docs                         Manage indexed documents
  reindex                      Re-embed documents after changing the embedding model
  version                      Print version information

Global flags:
  --config <path>   Config file path (default: ~/.config/lite-rag/config.toml)
  --db <path>       Database file path (overrides config database.path)

index

# Index all *.md files under a directory (recursive)
./dist/lite-rag index --dir ./docs

# Index a single file (any extension)
./dist/lite-rag index --file ./docs/notes.md
./dist/lite-rag index --file ./release-notes.txt

--dir: walks the directory recursively; processes only *.md files.
--file: indexes the specified file directly, regardless of extension.
Skips files whose SHA-256 hash matches the stored value (no re-embedding).
--dir per-file errors are logged and do not abort the overall run; --file errors are returned immediately.

reindex

# Re-embed all documents after changing models.embedding in the config
./dist/lite-rag reindex

# Target a specific database
./dist/lite-rag --db ./project.db reindex

Finds all documents whose stored embedding_model differs from the current config.
Re-embeds them using the chunk text already in the database — source files do not need to exist on disk.
Updates only the embedding vectors and embedding_model; chunk content and file hashes are unchanged.

ask

./dist/lite-rag ask "What is the default chunk size?"
./dist/lite-rag --config /etc/lite-rag.toml ask "Installation steps?"
./dist/lite-rag --db ./project-b.db ask "What is the default chunk size?"

# JSON output (answer + sources as a single JSON object)
./dist/lite-rag ask --json "What is the default chunk size?"

Embeds the question, searches DuckDB for the top-K similar chunks.
Expands each hit by ±context_window adjacent chunks.
Streams the LLM answer to stdout.
Set query_rewrite = true to enable multilingual query rewriting: the LLM rewrites the query into both Japanese and English declarative statements, and three parallel vector searches (original + JA + EN) are merged, improving recall across multilingual document collections (improves score in ~88% of queries; adds ~2 s latency).
--json: buffers the full answer and outputs a single JSON object. Progress messages are suppressed; stdout contains only valid JSON.

{
  "answer": "The default chunk size is 512 tokens.",
  "sources": [
    {"file_path": "docs/README.md", "heading_path": "Configuration", "score": 0.872}
  ]
}

serve

./dist/lite-rag serve                     # listen on 127.0.0.1:8080 (default)
./dist/lite-rag serve --addr 0.0.0.0:9090

Starts a local HTTP server. Open http://127.0.0.1:8080 in a browser.

Endpoint	Method	Description
`/api/ask`	POST	SSE-streamed answer (`{"query":"..."}`)
`/api/status`	GET	Health check and version
`/`	GET	Embedded Web UI

docs

Manage documents stored in the index database.

# List all indexed documents (text table)
./dist/lite-rag docs list

# List as JSON (machine-readable)
./dist/lite-rag docs list --json

# Show reconstructed content of a document by ID
./dist/lite-rag docs show <document-id>

# Delete a document and all its chunks
./dist/lite-rag docs delete <document-id>

<document-id> is the 64-character SHA-256 hex shown in docs list.

Building

# Current platform
make build

# All darwin platforms (macOS host)
make cross-build-darwin

# Linux platforms (requires podman or docker, or run on a Linux host)
make cross-build-linux

Binaries are placed in dist/. See docs/setup.md for cross-compilation details.

Development

make test    # run all tests
make vet     # go vet
make lint    # golangci-lint
make check   # full quality gate: vet + lint + test + build

Git hooks installed by make setup run make check automatically before each commit and push.

Architecture

See docs/design/architecture.md for the full design.

index command
  └─ Indexer: walk → normalize → chunk → embed → DuckDB

ask command
  └─ Retriever: embed query → SimilarChunks → AdjacentChunks → LLM Chat

License

MIT

日本語ドキュメント: docs/ja/README.md

Name		Name	Last commit message	Last commit date
Latest commit History 37 Commits
cmd		cmd
docs		docs
internal		internal
pkg/chunker		pkg/chunker
scripts/hooks		scripts/hooks
.gitignore		.gitignore
AGENTS.md		AGENTS.md
CHANGELOG.md		CHANGELOG.md
LICENSE		LICENSE
Makefile		Makefile
README.ja.md		README.ja.md
README.md		README.md
RULES.md		RULES.md
config.example.toml		config.example.toml
eval		eval
go.mod		go.mod
go.sum		go.sum
lite-rag		lite-rag

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

lite-rag

Features

Requirements

Quick Start

Configuration

Usage

index

reindex

ask

serve

docs

Building

Development

Architecture

License

About

Uh oh!

Releases 10

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

lite-rag

Features

Requirements

Quick Start

Configuration

Usage

index

reindex

ask

serve

docs

Building

Development

Architecture

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 10

Contributors

Uh oh!

Languages