A production-grade Retrieval-Augmented Generation (RAG) system for 🇮🇳 Indian Bare Acts ⚖️. Query 9 major Indian laws in plain English and get precise, cited answers grounded strictly in statutory text.
100% evaluation pass rate across 12 test cases including negative (out-of-domain) tests. Built to never hallucinate — if the answer isn't in the law, LexGrid says so.
- Key Features
- Tech Stack
- Prerequisites
- Getting Started
- Architecture
- Environment Variables
- Available Scripts
- API Reference
- Testing
- Evaluation
- Deployment
- Troubleshooting
- Contributing
- License
- Hybrid Retrieval — Combines pgvector ANN search with PostgreSQL full-text search (tsvector), fused via Reciprocal Rank Fusion (RRF k=60) for superior recall and precision
- Query Intelligence — Regex-based detection of "Section 302 IPC"-style queries bypasses embedding entirely and hits the database directly, returning sub-10ms responses
- Anti-Hallucination — LLM temperature=0, strict 5-rule system prompt, answers grounded only in retrieved context, mandatory `[Section X, Act Name]` citations
- Out-of-Domain Rejection — Cosine distance threshold (0.75) means physics questions and other off-topic queries return empty results and a clean "cannot find" response — no fabrication (see the sketch after this list)
- Redis Query Cache — SHA256-keyed cache (TTL 3600s) for repeated queries, with cache-hit flag in every API response
- Async Ingestion — Celery workers with Redis broker handle embedding + upsert in the background, enabling non-blocking ingestion of entire acts
- Evaluation Suite — 12 test cases covering direct lookup, comparative, procedural, and negative query types with P@K, Recall@K, MRR, and Legal Accuracy metrics
- 9 Indian Acts — BNS, CPC, CrPC, HMA, IDA, IEA, IPC, MVA, NIA (~2,284 sections indexed)
- Mobile-Responsive UI — Hamburger + slide-in sidebar on mobile, full-width chat panel, pinned header and input, works on all screen sizes
- Markdown Rendering — Assistant responses rendered as formatted Markdown (headings, bold, lists, inline code) in both desktop and mobile browsers
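Why "no fabrication" holds in practice: the vector retriever discards any chunk whose cosine distance exceeds the threshold, so an off-topic query simply produces an empty context and never reaches the LLM. A minimal sketch of that check, assuming the retriever yields `(chunk, distance)` pairs (names are illustrative, not the project's actual API):

```python
MAX_DISTANCE = 0.75  # mirrors the MAX_DISTANCE setting


def filter_in_domain(
    candidates: list[tuple[dict, float]], max_distance: float = MAX_DISTANCE
) -> list[dict]:
    """Keep only chunks whose cosine distance to the query embedding is within the cutoff."""
    return [chunk for chunk, distance in candidates if distance <= max_distance]


def answer_or_refuse(candidates: list[tuple[dict, float]]) -> list[dict] | None:
    chunks = filter_in_domain(candidates)
    if not chunks:
        # Nothing relevant retrieved -> short-circuit: no LLM call, no fabricated answer.
        return None
    return chunks  # passed on to RRF fusion, reranking, and the LLM
```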
| Layer | Technology |
|---|---|
| API | FastAPI ≥0.111.0, Python 3.11+, uvicorn[standard] |
| Embeddings | OpenAI text-embedding-3-small (1536-dim) via OpenRouter |
| LLM | gpt-4o-mini via OpenRouter, temperature=0 |
| Vector Store | PostgreSQL 16 + pgvector extension |
| Full-Text Search | PostgreSQL tsvector/tsquery (GENERATED ALWAYS, GIN index) |
| Cache | Redis 7 (query cache TTL 3600s) |
| Task Queue | Celery 5 + Redis broker (concurrency=4) |
| UI | Next.js 14 (App Router) + Tailwind CSS + react-markdown |
| ORM | SQLAlchemy async (asyncpg driver) |
| Config | pydantic-settings (all env-var driven) |
| Logging | structlog (structured JSON logs) |
| Token Counting | tiktoken (cl100k_base), 4000-token context budget |
| Linting | ruff (line-length=100) |
| Type Checking | mypy |
| Testing | pytest, pytest-asyncio |
| Build | hatchling |
| Infra | Docker Compose (5 containers) |
- Docker 24+
- Docker Compose v2+ (`docker compose`, not `docker-compose`)
- An OpenRouter API key (used for both embedding and LLM calls)
- `curl` or any HTTP client (for testing the API)
No Python, Node.js, or PostgreSQL installation is needed locally — everything runs in Docker.
git clone https://github.com/srajasimman/lexgrid.git
cd lexgrid

cp .env.example .env

Open `.env` and configure at minimum:
# Required: your OpenRouter API key
OPENAI_API_KEY=sk-or-v1-your-openrouter-key-here
OPENAI_BASE_URL=https://openrouter.ai/api/v1
# These defaults work with Docker Compose as-is
DATABASE_URL=postgresql+asyncpg://lexgrid:lexgrid@lexgrid-postgres:5432/lexgrid
REDIS_URL=redis://lexgrid-redis:6379/0
CELERY_BROKER_URL=redis://lexgrid-redis:6379/1
CELERY_RESULT_BACKEND=redis://lexgrid-redis:6379/2

See Environment Variables for the full reference.
docker compose -f infra/docker-compose.yml up -d

This starts 5 containers:
| Container | Role | Port |
|---|---|---|
| `lexgrid-postgres` | PostgreSQL 16 + pgvector | 5432 |
| `lexgrid-redis` | Redis 7 (cache + broker) | 6379 |
| `lexgrid-backend` | FastAPI API | 8000 |
| `lexgrid-celery` | Celery worker (concurrency=4) | — |
| `lexgrid-ui` | Next.js frontend | 3000 |
Wait ~15 seconds for PostgreSQL to initialize, then verify everything is healthy:
curl http://localhost:8000/health

Expected response:

{"status":"healthy","database":"connected","redis":"connected","version":"0.1.0"}

Ingest all 9 acts (dispatches async Celery tasks for embedding and upsert):

docker exec lexgrid-backend python scripts/ingest.py --act all

Or ingest specific acts:

docker exec lexgrid-backend python scripts/ingest.py --act ipc,crpc,bns

Monitor ingestion progress:

docker compose -f infra/docker-compose.yml logs celery -f

Ingestion of all 9 acts takes ~5–10 minutes depending on OpenRouter API latency. When done, verify:
docker exec lexgrid-postgres psql -U lexgrid -d lexgrid \
-c "SELECT act_code, COUNT(*) FROM sections GROUP BY act_code ORDER BY act_code;"curl -X POST http://localhost:8000/query/ \
-H "Content-Type: application/json" \
-d '{
"query": "What is the punishment for murder under IPC?",
"top_k": 5
  }'

Important: The `/query/` endpoint requires a trailing slash. See Troubleshooting.
Expected response:
{
"answer": "Under Section 302 of the Indian Penal Code, whoever commits murder shall be punished with death, or imprisonment for life, and shall also be liable to fine. [Section 302, Indian Penal Code]",
"citations": [
{
"act_code": "ipc",
"act_name": "Indian Penal Code",
"section_number": "302",
"section_title": "Punishment of murder",
"source_url": "https://..."
}
],
"retrieved_chunks": [...],
"query": "What is the punishment for murder under IPC?",
"cache_hit": false,
"latency_ms": 1243.7
}

Navigate to http://localhost:3000 for the Next.js chat interface.
UI features:
- Chat with the RAG system — ask questions in plain English and get cited answers
- Conversation history — conversations are stored locally and listed in the sidebar
- Act filter — filter conversations by act code (BNS, IPC, CrPC, etc.)
- Pin / delete conversations
- Mobile responsive — hamburger menu + slide-in sidebar on screens < 768px
- Markdown rendering — assistant responses render formatted text (bold, lists, headings, code)
API auto-docs (Swagger): http://localhost:8000/docs
┌─────────────────────────────────────────────────────────────────┐
│ FastAPI Backend │
│ │
│ POST /query/ │
│ │ │
│ ├─► Redis Cache (SHA256 key, TTL 3600s) │
│ │ └── HIT → return cached response │
│ │ │
│ └─► Query Intelligence (regex) │
│ ├── "Section 302 IPC" → Direct DB Lookup │
│ └── Natural language → Hybrid Retrieval │
│ │ │
│ ┌─────────┴──────────┐ │
│ │ │ │
│ pgvector ANN PostgreSQL FTS │
│ (cosine dist) (tsvector GIN) │
│ │ │ │
│ └─────────┬──────────┘ │
│ │ │
│ RRF Fusion (k=60) │
│ │ │
│ LLM Reranker │
│ (gpt-4o-mini) │
│ │ │
│ Context Builder (4000 tokens) │
│ │ │
│ LLM Answer │
│ (gpt-4o-mini, temp=0) │
│ │ │
│ Citation Parser + Cache Write │
└─────────────────────────────────────────────────────────────────┘
lexgrid/
├── backend/
│ ├── Dockerfile
│ ├── pyproject.toml
│ └── app/
│ ├── main.py # FastAPI app factory, lifespan, CORS
│ ├── config.py # pydantic-settings (all env vars)
│ ├── api/routes/
│ │ ├── query.py # POST /query/ — main RAG endpoint
│ │ ├── search.py # GET /search/ — raw retrieval, no LLM
│ │ └── health.py # GET /health, GET /metrics
│ ├── retrieval/
│ │ ├── hybrid.py # RRF fusion of vector + keyword results
│ │ ├── query_intelligence.py # Regex direct section lookup
│ │ ├── vector_retriever.py # pgvector cosine ANN search
│ │ ├── keyword_retriever.py # PostgreSQL tsvector FTS
│ │ └── reranker.py # LLM reranking (gpt-4o-mini, temp=0)
│ ├── llm/
│ │ ├── client.py # OpenAI async client (OpenRouter)
│ │ ├── prompt_builder.py # System prompt + user prompt assembly
│ │ └── context_builder.py # tiktoken context window (4000 tokens)
│ ├── vector_store/
│ │ ├── store.py # Upsert, ANN search, direct lookup
│ │ ├── schema.py # SQLAlchemy Section ORM model
│ │ └── database.py # Async engine factory (lru_cache)
│ ├── cache/
│ │ ├── client.py # Redis async client
│ │ └── query_cache.py # SHA256-keyed cache (get/set/invalidate)
│ ├── models/
│ │ ├── chunk.py # LegalChunk, LegalChunkWithEmbedding
│ │ └── query.py # QueryRequest, QueryResponse, Citation
│ ├── ingestion/
│ │ ├── chunker.py # JSON → LegalChunk (section + explanation)
│ │ └── pipeline.py # Embed + upsert pipeline (Celery task body)
│ ├── workers/
│ │ └── celery_app.py # Celery app config, autodiscover tasks
│ └── evaluation/
│ ├── test_cases.py # 12 test cases
│ └── metrics.py # P@K, Recall@K, MRR, Legal Accuracy
├── infra/
│ ├── docker-compose.yml # 5-container stack
│ └── postgres/init.sql # Schema: sections table, pgvector, GIN index
├── scripts/
│ ├── ingest.py # CLI: dispatch Celery ingestion tasks
│ └── evaluate.py # CLI: run evaluation suite → JSON report
├── legal-acts/ # Raw JSON source data (per act)
├── ui/ # Next.js 14 + Tailwind CSS chat interface
│ ├── src/
│ │ ├── app/ # Next.js App Router pages + layout
│ │ └── components/
│ │ ├── ChatShell.tsx # Root shell: sidebar state, mobile layout
│ │ ├── MobileHeader.tsx # Hamburger + title + new-chat (mobile-only)
│ │ ├── Sidebar.tsx # Conversation list, act filters, close button
│ │ ├── ChatPanel.tsx # Message thread + input area
│ │ ├── MessageBubble.tsx # Per-message bubble with Markdown rendering
│ │ └── ...
│ ├── Dockerfile # Multi-stage: deps → build → runner (node:20-alpine)
│ └── .dockerignore
└── docs/
├── architecture.md # System design deep-dive
├── developer-guide.md # Local dev, conventions, adding acts
├── api-reference.md # Full API spec with examples
├── evaluation.md # Evaluation framework and test cases
└── ingestion.md # Data pipeline: JSON → pgvector
For a query like "What is the punishment for murder under IPC?":
1. Cache Check — SHA256 key = `query:{sha256(query.lower() + sorted(act_codes))}`. Cache hit → return immediately with `cache_hit: true`. (A sketch of steps 1–2 follows this list.)
2. Query Intelligence — Two regex patterns checked:
   - `_PATTERN_SECTION_FIRST` — matches "Section 302 IPC", "Section 120A CrPC"
   - `_PATTERN_ACT_FIRST` — matches "IPC 302", "BNS Section 103"
   - Match → direct DB lookup by `(act_code, section_number)`. No embedding, no vector search.
3. Embedding — Query text → `text-embedding-3-small` → 1536-dim float vector (OpenRouter)
4. Vector Search — IVFFlat index (`lists=100`, `vector_cosine_ops`) returns top-K with distance ≤ 0.75. Out-of-domain queries return 0 results here and short-circuit immediately.
5. Keyword Search — tsvector FTS with weighted fields:
   - Weight A: `section_title` (highest relevance)
   - Weight B: `act_name`
   - Weight C: `content`
6. RRF Fusion — `score(d) = Σ 1 / (60 + rank(d))`. Documents appearing in both result sets score significantly higher.
7. Short-Circuit — If both retrievers return empty → return `[]`, skip reranker + LLM entirely.
8. Reranker — Top fused results sent to `gpt-4o-mini` for LLM-based relevance reranking. Falls back to RRF order on any LLM failure.
9. Context Building — tiktoken (cl100k_base) counts tokens. Chunks are added greedily top-to-bottom until the 4000-token budget is exhausted.
10. LLM Answer — System prompt + context + query → `gpt-4o-mini` (temperature=0). Always cites in the format `[Section X, Act Name]`.
11. Citation Parsing — Regex extracts citations from the LLM answer → typed `Citation` objects.
12. Cache Write — Response written to Redis with TTL 3600s.
13. Query Logging — Every query logged to the `query_logs` PostgreSQL table (text, hash, retrieved_section_ids, latency_ms, cache_hit).
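A rough sketch of steps 1–2 (cache key and direct-lookup detection). The key format follows the scheme above; the regex patterns and helper names are illustrative stand-ins for `_PATTERN_SECTION_FIRST` / `_PATTERN_ACT_FIRST`, not the exact expressions in `query_intelligence.py`:

```python
import hashlib
import re

ACTS = "bns|cpc|crpc|hma|ida|iea|ipc|mva|nia"


def cache_key(query: str, act_codes: list[str] | None = None) -> str:
    """Step 1: query:{sha256(query.lower() + sorted(act_codes))}."""
    payload = query.lower() + "".join(sorted(act_codes or []))
    return f"query:{hashlib.sha256(payload.encode()).hexdigest()}"


# Step 2: illustrative patterns for "Section 302 IPC" and "IPC 302" style queries.
_SECTION_FIRST = re.compile(rf"section\s+(\d+[a-z]?)\s+({ACTS})", re.IGNORECASE)
_ACT_FIRST = re.compile(rf"({ACTS})\s+(?:section\s+)?(\d+[a-z]?)", re.IGNORECASE)


def detect_direct_lookup(query: str) -> tuple[str, str] | None:
    """Return (act_code, section_number) when the query names an exact section."""
    if m := _SECTION_FIRST.search(query):
        return m.group(2).lower(), m.group(1).upper()
    if m := _ACT_FIRST.search(query):
        return m.group(1).lower(), m.group(2).upper()
    return None


# detect_direct_lookup("Section 302 IPC")            -> ("ipc", "302")
# detect_direct_lookup("what is culpable homicide?") -> None
```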
Pure vector search misses exact legal references. Searching "Section 302" semantically finds related concepts but may not rank the exact section first. Pure keyword search misses natural language questions like "what constitutes culpable homicide?". RRF fusion gives you both — documents relevant to either signal get a score boost, and documents relevant to both score highest.
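A minimal sketch of the fusion step itself, assuming each retriever returns an ordered list of section IDs (names are illustrative):

```python
from collections import defaultdict


def rrf_fuse(vector_ids: list[str], keyword_ids: list[str], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: score(d) = sum over result lists of 1 / (k + rank(d))."""
    scores: defaultdict[str, float] = defaultdict(float)
    for result_list in (vector_ids, keyword_ids):
        for rank, doc_id in enumerate(result_list, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Documents that appear in both lists accumulate two terms and rise to the top.
    return sorted(scores, key=scores.get, reverse=True)


# "ipc-302" wins because both retrievers surface it:
# rrf_fuse(["ipc-302", "ipc-300"], ["ipc-302", "ipc-304"])
# -> ["ipc-302", "ipc-300", "ipc-304"]
```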
| Variable | Description | Example |
|---|---|---|
| `OPENAI_API_KEY` | OpenRouter (or OpenAI) API key | `sk-or-v1-...` |
| `OPENAI_BASE_URL` | API base URL | `https://openrouter.ai/api/v1` |
| `DATABASE_URL` | Async PostgreSQL DSN | `postgresql+asyncpg://lexgrid:lexgrid@lexgrid-postgres:5432/lexgrid` |
| `REDIS_URL` | Redis for query cache (db=0) | `redis://lexgrid-redis:6379/0` |
| `CELERY_BROKER_URL` | Redis for Celery broker (db=1) | `redis://lexgrid-redis:6379/1` |
| `CELERY_RESULT_BACKEND` | Redis for Celery results (db=2) | `redis://lexgrid-redis:6379/2` |
| Variable | Description | Default |
|---|---|---|
| `LLM_MODEL` | LLM model identifier | `gpt-4o-mini` |
| `EMBEDDING_MODEL` | Embedding model identifier | `text-embedding-3-small` |
| `LLM_TEMPERATURE` | LLM temperature (0 = deterministic) | `0` |
| `CACHE_TTL` | Redis query cache TTL in seconds | `3600` |
| `TOP_K` | Default number of retrieved chunks | `5` |
| `MAX_DISTANCE` | Cosine distance cutoff for vector search | `0.75` |
OpenRouter provides an OpenAI-compatible API that routes to multiple model providers. The same OpenAI Python SDK works with base_url=OPENAI_BASE_URL. This means you can switch from gpt-4o-mini to claude-3-haiku or mistral-7b by changing one environment variable — no code changes.
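For reference, pointing the OpenAI SDK at OpenRouter looks roughly like this (a generic sketch built from the settings above, not the project's actual `llm/client.py`):

```python
import os

from openai import AsyncOpenAI

# Same SDK, different base URL: OpenRouter speaks the OpenAI API, so only the
# key, base URL, and model name (all env-var driven) need to change.
client = AsyncOpenAI(
    api_key=os.environ["OPENAI_API_KEY"],
    base_url=os.environ.get("OPENAI_BASE_URL", "https://openrouter.ai/api/v1"),
)


async def answer(prompt: str) -> str:
    response = await client.chat.completions.create(
        model=os.environ.get("LLM_MODEL", "gpt-4o-mini"),
        temperature=0,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content or ""
```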
| Script | Command | Description |
|---|---|---|
| Ingest all acts | `python scripts/ingest.py --act all` | Dispatch Celery tasks for all 9 acts |
| Ingest specific acts | `python scripts/ingest.py --act ipc,crpc` | Acts: bns, cpc, crpc, hma, ida, iea, ipc, mva, nia |
| Run evaluation | `python scripts/evaluate.py --api-url http://localhost:8000 --output eval_report.json` | Run 12-case eval suite |
| Lint | `ruff check .` | Check Python style (line-length=100, rules E/F/I/UP) |
| Auto-fix lint | `ruff check . --fix` | Fix auto-fixable lint issues |
| Type check | `mypy backend/` | Run mypy type checks |
| Tests | `pytest backend/tests/ -v` | Run test suite |
`POST /query/` — the main RAG endpoint. Returns a cited, LLM-generated answer.
Trailing slash required. FastAPI redirects `/query` → `/query/` with HTTP 307. httpx does not follow 307 redirects on POST requests.
Request:
{
"query": "What is the punishment for murder?",
"act_filter": ["ipc"],
"top_k": 5,
"use_cache": true
}

Response:
{
"answer": "Section 302 IPC: Whoever commits murder shall be punished with death...",
"citations": [{"act_code": "ipc", "section_number": "302", ...}],
"retrieved_chunks": [...],
"cache_hit": false,
"latency_ms": 1200.3
}

Errors:

- `422` — Validation error (query too short/long, top_k out of range)
- `503` — LLM or database unavailable
`GET /search/` — raw retrieval without LLM synthesis. Use it for debugging retrieval quality.
curl "http://localhost:8000/search/?q=murder+punishment&act=ipc&top_k=3"curl http://localhost:8000/health
# {"status":"healthy","database":"connected","redis":"connected","version":"0.1.0"}Query log statistics and section counts.
See docs/api-reference.md for full specification with all fields and curl examples.
# Run all tests
docker exec lexgrid-backend pytest backend/tests/ -v
# Run with coverage
docker exec lexgrid-backend pytest backend/tests/ --cov=app --cov-report=term-missing

Test configuration in `pyproject.toml`:

- `asyncio_mode = "auto"` — async test functions work without `@pytest.mark.asyncio`
- Tests use actual async DB/Redis connections (integration tests)
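With `asyncio_mode = "auto"`, an async test is collected and awaited without any decorator. A hypothetical example of that style (not a test from the suite; assumes the stack is running locally):

```python
# backend/tests/test_health_example.py — hypothetical, shown only to illustrate
# the asyncio_mode = "auto" convention.
import httpx


async def test_health_reports_connected_dependencies():
    async with httpx.AsyncClient(base_url="http://localhost:8000") as client:
        response = await client.get("/health")
    assert response.status_code == 200
    body = response.json()
    assert body["status"] == "healthy"
    assert body["database"] == "connected"
```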
docker exec lexgrid-backend \
python scripts/evaluate.py \
--api-url http://localhost:8000 \
  --output eval_report.json

| Metric | Score |
|---|---|
| Pass Rate | 100% (12/12) |
| MRR | 0.833 |
| Recall@5 | 0.814 |
| P@5 | 0.233 |
| Legal Accuracy | 0.703 |
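The ranking metrics follow their standard definitions; a rough sketch (not the project's `metrics.py`):

```python
def precision_at_k(retrieved: list[str], relevant: set[str], k: int = 5) -> float:
    """Fraction of the top-k retrieved sections that are relevant."""
    return sum(doc in relevant for doc in retrieved[:k]) / k


def recall_at_k(retrieved: list[str], relevant: set[str], k: int = 5) -> float:
    """Fraction of relevant sections that appear in the top k."""
    return sum(doc in relevant for doc in retrieved[:k]) / max(len(relevant), 1)


def mrr(retrieved: list[str], relevant: set[str]) -> float:
    """Reciprocal rank of the first relevant result (0.0 if none retrieved)."""
    for rank, doc in enumerate(retrieved, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0
```

A P@5 well below Recall@5 is the expected shape here: when a test case has only one or two relevant sections, even a perfect top-5 caps P@5 at 0.2–0.4.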
| Type | Count | Purpose |
|---|---|---|
| Direct lookup | 4 | Verify exact section retrieval |
| Comparative | 3 | Multi-section reasoning |
| Procedural | 3 | Multi-step legal process queries |
| Negative (out-of-domain) | 2 | Verify hallucination rejection |
Negative tests (tc-06: dowry under HMA, tc-12: quantum physics) confirm LexGrid correctly refuses to answer when relevant law is absent from the index.
See docs/evaluation.md for all 12 test case definitions and how to add new ones.
Docker Compose is the primary deployment method. The entire stack is self-contained.
# Build and start all services
docker compose -f infra/docker-compose.yml up -d --build
# Verify health
curl http://localhost:8000/health
# Ingest data (first-time setup)
docker exec lexgrid-backend python scripts/ingest.py --act all
# Validate quality
docker exec lexgrid-backend python scripts/evaluate.py \
  --api-url http://localhost:8000 --output eval_report.json

- Set a strong PostgreSQL password (not the default `lexgrid`)
- Use a persistent Redis instance (or enable RDB/AOF persistence)
- Set `OPENAI_API_KEY` to a valid, rate-limit-sufficient key
- Place nginx or a load balancer in front of port 8000
- Confirm section count after ingestion (`SELECT COUNT(*) FROM sections` → ~2284)
- Run `evaluate.py` and confirm 100% pass rate
# All services
docker compose -f infra/docker-compose.yml logs -f
# Backend API only
docker compose -f infra/docker-compose.yml logs backend -f
# Celery ingestion worker
docker compose -f infra/docker-compose.yml logs celery -f

# Stop (data preserved in volumes)
docker compose -f infra/docker-compose.yml down
# Full reset (destroys all data — re-ingestion required)
docker compose -f infra/docker-compose.yml down -v

Cause: Missing trailing slash. FastAPI redirects `/query` → `/query/` via HTTP 307. httpx does not follow 307 redirects on POST requests.
Fix: Use POST http://localhost:8000/query/ (note the /).
Check 1 — Is data ingested?
docker exec lexgrid-postgres psql -U lexgrid -d lexgrid \
-c "SELECT COUNT(*) FROM sections;"
# Should be ~2284. If 0, run: docker exec lexgrid-backend python scripts/ingest.py --act all

Check 2 — Is the act in scope?
LexGrid only covers 9 acts: bns, cpc, crpc, hma, ida, iea, ipc, mva, nia. Questions about other laws correctly return no results.
Check 3 — Is distance threshold too strict?
Default `MAX_DISTANCE=0.75`. Try raising it to `0.85` in `.env` and restarting the backend.
Cause: pgvector logs a warning when building an IVFFlat index (lists=100) with fewer rows than lists. This is expected on a fresh database before ingestion.
Fix: Run scripts/ingest.py. The warning disappears once rows are loaded.
# Check worker is running
docker compose -f infra/docker-compose.yml ps celery
# Check worker logs
docker compose -f infra/docker-compose.yml logs celery --tail=50
# Test Redis broker connectivity
docker exec lexgrid-backend python -c \
"import redis; r=redis.from_url('redis://lexgrid-redis:6379/1'); print(r.ping())"
# Should print: True

Cause: `DATABASE_URL` points to localhost instead of the container name.
Fix: Use lexgrid-postgres as the hostname in DATABASE_URL:
DATABASE_URL=postgresql+asyncpg://lexgrid:lexgrid@lexgrid-postgres:5432/lexgrid
Docker Compose places all containers on the same network. Container-to-container communication uses service names, not localhost.
- Fork the repository and create a feature branch: `git checkout -b feat/your-feature`
- Follow code conventions (see below)
- Add/update tests for changed retrieval logic
- Run the evaluation suite and confirm 100% pass rate
- Open a pull request with a clear description of what changed and why
- Linting: `ruff check . --fix` (line-length=100, rules E/F/I/UP)
- Types: `mypy backend/` — all public functions must be typed
- Logging: `structlog.get_logger()` — never `print()` or raw `logging`
- Async: All I/O operations must be async (`asyncpg`, `aioredis`, `httpx`)
- Config: All settings go through `app/config.py` pydantic-settings — no hardcoded values (see the sketch below)
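A minimal illustration of the logging and config conventions together (an illustrative snippet, not code from the repository):

```python
import structlog
from pydantic_settings import BaseSettings


class Settings(BaseSettings):
    # Every value is env-var driven; no connection strings hardcoded in modules.
    database_url: str
    redis_url: str
    max_distance: float = 0.75


logger = structlog.get_logger()


async def log_direct_lookup(act_code: str, section_number: str) -> None:
    # Structured key-value logging instead of print() or the stdlib logging module.
    logger.info("direct_lookup", act_code=act_code, section_number=section_number)
```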
MIT License — see LICENSE for details.