An advanced Vietnamese Retrieval-Augmented Generation system with intelligent query routing, multi-strategy retrieval, and automatic visualization capabilities.
- Automatically routes queries between the business database and the general knowledge base
- Vietnamese language understanding and classification
- Natural language to SQL conversion for Vietnamese queries
- Automatic error recovery and query optimization
- Smart visualization generation with Plotly
- Hybrid Retrieval: Dense + sparse search combination
- Query Rewriting: Simple, decomposition, and HyDE strategies
- Confidence-based fallback to web search
- Document reranking with multilingual models
- Context-aware chart generation
- Interactive Plotly visualizations
- Multiple export formats (HTML, PNG)
- Vietnamese Wikipedia integration
- GCP-hosted business database
- Web search fallback via Tavily
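
The hybrid retrieval item above combines a dense (embedding) ranking with a sparse (keyword) ranking. This README does not say how the two result lists are merged, so the following is only a minimal sketch of one common option, reciprocal rank fusion. The function name, the constant `k=60`, and the document IDs are illustrative assumptions, not taken from the project code.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document gets 1 / (k + rank) from every list it appears in;
    documents ranked highly by several retrievers end up on top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a dense and a sparse retriever.
dense_hits = ["doc_a", "doc_b", "doc_c"]
sparse_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)
```

Rank-based fusion avoids having to put dense similarity scores and sparse keyword scores on the same scale, which is why it is a frequent default when the two retrievers are very different.
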
| Phase | Focus Area | Description |
|---|---|---|
| 1 | 🖼️ Image Generation | Integrate Gemini 1.5 Pro-Vision & Stable Diffusion XL for text-to-image answers in Vietnamese. |
| 2 | 🧩 Model Zoo Expansion | Add more Vietnamese embedding models (e.g. MiniLM-Vi, bge-base-vi) and lightweight generator back-ends to lower cost. |
| 3 | 📏 Metrics | Implement end-to-end RAG evaluation (EM, F1, ROUGE-L) plus SQL-accuracy & chart-quality scores. |
| 4 | 🏋️ Benchmarking Suite | Reproducible benchmarks on public (ViQuAD, UIT-SDS) and private business datasets; automatic report generation. |
| 5 | 🏭 LLMOps Production-Ready Components | Add CI/CD, Docker/K8s Helm charts, Prometheus + Grafana monitoring, Canary & A/B rollout scripts. |
- Python 3.9+
- uv package manager
- OpenAI and Google Gemini API keys
- Access to the required external services (Weaviate and Tavily)
- Clone the repository

      git clone https://github.com/Johnx69/ViRAG.git
      cd ViRAG

- Navigate to the demo directory and set up the environment

      cd demo-api-layer && \
      uv venv && \
      source .venv/bin/activate && \
      uv sync --active

- Configure environment variables

      cp sample.env .env

  Fill in your API keys and configuration in `.env`:

      # LLM Configuration
      GEMINI_API_KEY=your_gemini_api_key_here
      OPENAI_API_KEY=your_openai_api_key_here

      # Vector Database
      WEAVIATE_URL=your_weaviate_url_here
      WEAVIATE_API_KEY=your_weaviate_api_key_here

      # Search
      TAVILY_API_KEY=your_tavily_api_key_here

      # LangSmith (Optional)
      LANGCHAIN_API_KEY=your_langchain_api_key_here
      LANGCHAIN_PROJECT=vietnamese_rag

Run the demo:

    uv run run.py

| Service | Purpose | Required |
|---|---|---|
| OpenAI API | Query rewriting with GPT-4o | ✅ Required |
| Google Gemini | Main LLM for response generation | ✅ Required |
| Weaviate | Vector database for knowledge storage | ✅ Required |
| Tavily | Web search fallback | ✅ Required |
| LangSmith | Monitoring and tracing | ⚪ Optional |
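
Since four of the services above are required, it can help to check the environment before starting the demo. The following is a small sketch, not part of the project: the helper name `missing_env_vars` is hypothetical, and the variable names are the ones listed in `.env`. It only inspects variables already present in the process environment and does not parse the `.env` file itself.

```python
import os

# Variable names as listed in the .env template.
REQUIRED_VARS = [
    "GEMINI_API_KEY",
    "OPENAI_API_KEY",
    "WEAVIATE_URL",
    "WEAVIATE_API_KEY",
    "TAVILY_API_KEY",
]


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing required variables: " + ", ".join(missing))
    else:
        print("All required variables are set.")
```
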
    # Vietnamese business questions
    query = "Nhân viên nào bán được nhiều nhất tháng 8?"  # Which employee sold the most in August?
    query = "Doanh thu chi nhánh Hà Nội như thế nào?"  # How is the Hanoi branch's revenue?
    query = "Tạo biểu đồ doanh thu theo tháng"  # Create a chart of revenue by month

    # General knowledge questions
    query = "Lịch sử của Việt Nam như thế nào?"  # What is the history of Vietnam?
    query = "Trí tuệ nhân tạo là gì?"  # What is artificial intelligence?
    query = "Các di sản thế giới ở Việt Nam"  # World heritage sites in Vietnam

Project structure:

    vietnamese-rag-system/
    ├── src/
    │   ├── pipeline/      # RAG pipeline implementations
    │   ├── router/        # Query routing logic
    │   ├── sql_agent/     # SQL generation and execution
    │   ├── retrieval/     # Retrieval and reranking
    │   ├── search/        # Web search integration
    │   ├── core/          # LLM and embeddings
    │   ├── ingestion/     # Data loading and indexing
    │   ├── prompts/       # Prompt templates
    │   ├── config/        # Configuration management
    │   └── utils/         # Helper utilities
    ├── demo-api-layer/    # Demo application
    ├── sample.env         # Environment template
    └── README.md

Key configuration parameters in `src/config/settings.py`:

    # Model Settings
    embedding_model = "AITeamVN/Vietnamese_Embedding_v2"
    chunk_size = 1024
    chunk_overlap = 50
    confidence_threshold = 0.7

    # Collections
    wikipedia_collection = "VietnameseWikipedia"

The query router uses Gemini 2.0 Flash to classify each query as either a business query or a general-knowledge query.
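
The routing step can be sketched as a classifier plus a dispatch. The project uses an LLM for classification; the keyword matcher below is only a hypothetical stand-in so the example runs offline, and the handler labels `sql_agent` and `rag_pipeline` are illustrative names borrowed from the directory layout, not confirmed function names.

```python
from typing import Callable

# Hypothetical keyword list; the real router asks an LLM instead.
BUSINESS_KEYWORDS = ("doanh thu", "nhân viên", "chi nhánh")


def keyword_classifier(query: str) -> str:
    """Stand-in classifier: label a query 'business' or 'general'."""
    text = query.lower()
    if any(keyword in text for keyword in BUSINESS_KEYWORDS):
        return "business"
    return "general"


def route(query: str, classify: Callable[[str], str] = keyword_classifier) -> str:
    """Return the name of the component that should handle the query."""
    label = classify(query)
    return "sql_agent" if label == "business" else "rag_pipeline"


print(route("Doanh thu chi nhánh Hà Nội như thế nào?"))
print(route("Trí tuệ nhân tạo là gì?"))
```

Passing the classifier in as a parameter keeps the dispatch logic testable without network access; an LLM-backed classifier with the same signature could be swapped in.
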
- Converts Vietnamese to SQL using comprehensive schema knowledge
- Automatic error recovery with 3-retry mechanism
- Generates visualizations for numerical data
- Hybrid retrieval (dense + sparse)
- Multi-strategy query rewriting
- BGE-M3 multilingual reranking
- Confidence-based web search fallback
- AI-powered visualization code generation
- Context-aware chart type selection
- Multiple export formats
- Multi-language support: Optimized for Vietnamese
- Confidence thresholding: Smart fallback mechanisms
- Error recovery: Automatic SQL query fixing
- Process logging: Detailed execution tracking
- Visualization: Automatic chart generation
- Monitoring: LangSmith integration for tracing
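
The SQL error recovery with a 3-retry mechanism listed above can be illustrated with a short sketch. It assumes the limit means one initial attempt plus up to three corrected attempts, uses an in-memory SQLite database in place of the GCP-hosted one, and replaces the LLM repair step with a hypothetical `demo_fixer` function that fixes a single known typo.

```python
import sqlite3

MAX_RETRIES = 3  # assumption: three retries after the first attempt


def run_with_retries(conn, sql, fix_query, max_retries=MAX_RETRIES):
    """Execute sql; on a database error, ask fix_query for a corrected
    statement (given the failing SQL and the error text) and try again."""
    attempt = 0
    while True:
        try:
            return conn.execute(sql).fetchall()
        except sqlite3.Error as exc:
            if attempt >= max_retries:
                raise  # give up once the retry budget is spent
            attempt += 1
            sql = fix_query(sql, str(exc))


def demo_fixer(sql, error):
    # Hypothetical stand-in for an LLM repair step.
    return sql.replace("SELEC ", "SELECT ")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (employee TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)", [("An", 120.0), ("Binh", 95.5)]
)

# The first attempt fails with a syntax error and is repaired on retry.
rows = run_with_retries(
    conn, "SELEC employee FROM sales ORDER BY amount DESC", demo_fixer
)
print(rows)
```

Feeding the error message back into the repair step is what lets a generator correct schema or syntax mistakes instead of repeating the same query.
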
This project is licensed under the MIT License - see the LICENSE file for details.
- LangChain for the RAG framework
- Weaviate for vector database capabilities
- Vietnamese Wikipedia for knowledge base
- BGE-M3 for multilingual reranking
- Plotly for visualization capabilities
