A production-ready Retrieval-Augmented Generation (RAG) system built for enterprise internal knowledge bases (SOPs, policies, procedures).
Unlike generic RAG chatbots that "guess" answers, this system is engineered for trust and compliance:
- **Mandatory Citations**: every answer links back to source documents (`[S1]`, `[S2]`)
- **No Hallucinations**: "Not in KB yet" fallback when retrieval confidence is low
- **Full Audit Trail**: JSONL + SQLite logging of every Q&A interaction with sources
- **Quantitative Evaluation**: built-in recall@k metrics via a golden dataset
- **Manifest-Driven Ingestion**: controlled, reproducible KB updates (JSON/YAML)
- Document ingestion (PDF, TXT, Markdown)
- Vector search with ChromaDB
- OpenAI GPT integration
- Interactive Streamlit interface
- Grounded answers with citations (`[S1]`, `[S2]`)
- Safety: if retrieval confidence is below a threshold, the bot responds: "Not in KB yet. Please add the relevant SOP/policy document to the knowledge base."
- Audit logs written to `logs/qa.jsonl` + SQLite audit DB (`logs/audit.db`) with Admin viewer
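The low-confidence fallback can be sketched as follows. This is illustrative only, assuming retrieval returns `(chunk_text, similarity_score)` pairs ranked best-first; the function name and shapes are not the repo's actual code, and the threshold mirrors the `RETRIEVAL_THRESHOLD` default from `config.py`:

```python
FALLBACK = ("Not in KB yet. Please add the relevant SOP/policy "
            "document to the knowledge base.")
RETRIEVAL_THRESHOLD = 0.35  # same default as config.py

def answer_or_fallback(hits: list[tuple[str, float]]) -> str:
    """Return cited context if the best hit clears the threshold, else the fallback.

    `hits` is a list of (chunk_text, similarity_score) pairs, best first.
    """
    if not hits or hits[0][1] < RETRIEVAL_THRESHOLD:
        return FALLBACK
    # Tag each retained chunk with an [S#] citation marker.
    cited = [f"[S{i + 1}] {text}"
             for i, (text, score) in enumerate(hits)
             if score >= RETRIEVAL_THRESHOLD]
    return "\n".join(cited)
```

The key design point is that the citation markers are attached at retrieval time, so the answer can only ever reference chunks that actually came out of the vector store.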
- LangChain - RAG pipeline orchestration
- ChromaDB - Vector database
- OpenAI API - Language model
- Streamlit - Web interface
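Conceptually, the ChromaDB lookup is a nearest-neighbour search over embedding vectors. A dependency-free sketch of the underlying cosine-similarity ranking (illustrative only; in the real pipeline ChromaDB does this, and embeddings come from the configured provider):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec: list[float],
          doc_vecs: dict[str, list[float]],
          k: int = 5) -> list[tuple[str, float]]:
    """Rank document ids by cosine similarity to the query vector."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```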
- Python 3.9+
```bash
git clone https://github.com/savinoo/simple-rag-chatbot.git
cd simple-rag-chatbot
pip install -r requirements.txt
export OPENAI_API_KEY="your-api-key-here"
streamlit run app.py
```

This repo ships with small demo docs + a golden set so you can generate a report and launch the UI quickly.
Minimum required:

**Gemini**

```bash
export PROVIDER=gemini
export GOOGLE_API_KEY=...
# recommended (avoids Gemini embeddings 404s)
export EMBEDDINGS_PROVIDER=local
```

**OpenAI**

```bash
export PROVIDER=openai
export OPENAI_API_KEY=...
export EMBEDDINGS_PROVIDER=openai
```

Optional:

- `MODEL_NAME` overrides the LLM model name (applies to both providers).
- For Gemini, you may need to use a model your key supports (often with a `models/...` prefix).
```bash
# Option A (OpenAI)
export PROVIDER=openai
export OPENAI_API_KEY=...
./run_demo.sh
```

```bash
# Option B (Gemini)
export PROVIDER=gemini
export GOOGLE_API_KEY=...
# Recommended: use local embeddings (avoids Gemini embeddings 404s)
export EMBEDDINGS_PROVIDER=local
./run_demo.sh
```

Gemini uses `GOOGLE_API_KEY` (not `GEMINI_API_KEY`). If Gemini embeddings are not available for your key/account (common), use local embeddings:

```bash
export EMBEDDINGS_PROVIDER=local
```

If you get a Gemini model NOT_FOUND (404), list the models your key supports and set one explicitly:

```bash
python -c "import os, google.generativeai as genai; genai.configure(api_key=os.environ['GOOGLE_API_KEY']); print('\\n'.join([m.name + ' :: ' + ','.join(m.supported_generation_methods) for m in genai.list_models()]))"
export GEMINI_MODEL=models/gemini-1.0-pro
```
What it does:

- Loads docs from `manifest.example.yaml`
- Runs `eval_retrieval.py` and writes `reports/latest/report.md`
- Starts the Streamlit UI (Chat + Admin)
Use a different manifest:

```bash
export MANIFEST_PATH=path/to/your-manifest.yaml
./run_demo.sh
```

You can either:
- Upload documents in the sidebar, or
- Load documents from a local manifest (see below).
Create a manifest JSON file (example: `manifest.example.json`):

```json
{
  "documents": [
    "docs/policies/returns.md",
    "docs/sops/qc_checklist.pdf"
  ]
}
```

Then set `MANIFEST_PATH` (env var) or paste it in the sidebar:

```bash
export MANIFEST_PATH=manifest.example.json
```

Configuration is via env vars (see `config.py`):
Core:

- `PROVIDER=openai|gemini`
- `MODEL_NAME` (default depends on provider)
- `OPENAI_API_KEY` (when `PROVIDER=openai`)
- `GOOGLE_API_KEY` (when `PROVIDER=gemini`)

Embeddings:

- `EMBEDDINGS_PROVIDER=openai|gemini|local` (`local` recommended for Gemini)
- `LOCAL_EMBEDDINGS_MODEL` (default: `all-MiniLM-L6-v2`)

RAG:

- `K_DOCUMENTS` (default: `5`)
- `RETRIEVAL_THRESHOLD` (default: `0.35`)

Logging:

- `LOG_PATH` (default: `logs/qa.jsonl`)
- `AUDIT_DB_PATH` (default: `logs/audit.db`)
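The dual logging scheme (append-only JSONL plus a SQLite table the Admin viewer can query) can be sketched with the standard library alone. The function name and column layout here are illustrative assumptions, not necessarily the repo's actual schema:

```python
import json
import sqlite3
import time
from pathlib import Path

def log_interaction(question: str, answer: str, sources: list[str],
                    log_path: str = "logs/qa.jsonl",
                    db_path: str = "logs/audit.db") -> None:
    """Record one Q&A turn in both the JSONL log and the SQLite audit DB."""
    record = {"ts": time.time(), "question": question,
              "answer": answer, "sources": sources}
    Path(log_path).parent.mkdir(parents=True, exist_ok=True)
    # Append-only JSONL: one JSON object per line, easy to grep or tail.
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    # SQLite table: queryable history for an admin UI.
    with sqlite3.connect(db_path) as con:
        con.execute("""CREATE TABLE IF NOT EXISTS audit
                       (ts REAL, question TEXT, answer TEXT, sources TEXT)""")
        con.execute("INSERT INTO audit VALUES (?, ?, ?, ?)",
                    (record["ts"], question, answer, json.dumps(sources)))
```

Writing both formats is deliberate: JSONL survives as a flat, portable audit trail even if the database file is lost, while SQLite supports filtering and pagination in the viewer.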
Golden set JSONL format (one object per line):

```json
{"question":"What is the return window?","expected_sources":["returns.md"]}
```

Run:

```bash
python eval_retrieval.py --golden data/golden.sample.jsonl --k 5 --out-dir reports/latest
```

Note: for evaluation, the pipeline loads docs via `MANIFEST_PATH`.
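Recall@k here means: for each golden question, the fraction of its `expected_sources` that appear among the top-k retrieved sources, averaged over all questions. A minimal sketch of that metric (illustrative; `eval_retrieval.py` is the authoritative implementation, and `retrieve` below is a stand-in for the real retriever):

```python
def recall_at_k(expected: list[str], retrieved: list[str], k: int = 5) -> float:
    """Fraction of expected source files found in the top-k retrieved sources."""
    if not expected:
        return 0.0
    top = set(retrieved[:k])
    return sum(1 for src in expected if src in top) / len(expected)

def mean_recall(golden: list[dict], retrieve, k: int = 5) -> float:
    """Average recall@k over a golden set.

    `retrieve` is any callable mapping a question string to a ranked
    list of source names (hypothetical hook, not the repo's API).
    """
    scores = [recall_at_k(g["expected_sources"], retrieve(g["question"]), k)
              for g in golden]
    return sum(scores) / len(scores) if scores else 0.0
```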
To fully match the Upwork job requirements, the next steps are:
- Google Drive/Docs/Sheets ingestion (via Google APIs)
- Scheduled daily sync + manual re-index controls
- Doc-level / role-based access control
- Slack bot interface
- Better section-level citations (heading-aware parsing)
```
simple-rag-chatbot/
├── app.py
├── rag_pipeline.py
├── config.py
├── eval_retrieval.py
├── manifest.example.json
├── requirements.txt
└── README.md
```
MIT
Lucas Lorenzo Savino
AI Engineer | Agent Development & MLOps