This project is a lightweight code assistant API for internal experimentation and learning. It reads local reference files, builds a searchable vector index, and answers coding questions through a simple FastAPI service. The goal is practical support for day-to-day development tasks, not a production-grade platform.
How it works:

- `ingest.py` reads `documents/sample.txt`.
- The text is split into chunks.
- Embeddings are created with local Ollama embedding models.
- Chunks are stored in a FAISS vector index under `vectorstore/`.
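The exact code lives in `ingest.py`; as a rough sketch, assuming the LangChain community integrations for Ollama embeddings and FAISS (the chunk size, overlap, and splitter choice below are illustrative, not the project's actual values), the pipeline looks something like this:

```python
# Illustrative sketch of the ingestion pipeline, not the project's exact code.
# Assumes the LangChain community packages are installed (see requirements.txt).
from pathlib import Path

from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Read the source document.
text = Path("documents/sample.txt").read_text(encoding="utf-8")

# Split the text into overlapping chunks (sizes are illustrative).
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_text(text)

# Embed each chunk with a local Ollama embedding model.
embeddings = OllamaEmbeddings(model="nomic-embed-text")

# Build the FAISS index and persist it as vectorstore/index.faiss + index.pkl.
vectorstore = FAISS.from_texts(chunks, embedding=embeddings)
vectorstore.save_local("vectorstore")
```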
`app.py` loads that index and exposes a question-answer API.

Endpoints:

- `GET /` - Basic service info and endpoint list.
- `GET /health` - Health check endpoint.
- `POST /ask` - Accepts a JSON payload and returns an answer from retrieval + the Ollama chat model.
Request body for `/ask`:

```
{
  "question": "What does this project do?"
}
```

Response example:

```
{
  "answer": "..."
}
```
Setup:

- Install dependencies:
```
./env3/bin/python -m pip install -r requirements.txt
```

- Make sure Ollama is running and required models are available:
```
ollama serve
ollama pull nomic-embed-text
ollama pull llama3.2:latest
```

- Build vector index:
```
./env3/bin/python ingest.py
```

- Start API server:
```
./env3/bin/python -m uvicorn app:app --reload
```

- Open docs: http://127.0.0.1:8000/docs
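Once the server is running, a quick smoke test of `/ask` from Python might look like this (it assumes the `requests` package is installed; it is not necessarily listed in `requirements.txt`):

```python
# Quick smoke test for the /ask endpoint; assumes the `requests` package is available.
import requests

resp = requests.post(
    "http://127.0.0.1:8000/ask",
    json={"question": "What does this project do?"},
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["answer"])
```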
Environment variables:

- `OLLAMA_BASE_URL` (default: `http://localhost:11434`)
- `OLLAMA_EMBED_MODEL` (default: `nomic-embed-text`)
- `OLLAMA_CHAT_MODEL` (default: `llama3.2:latest`)
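These are plain environment overrides; a sketch of how they might be read (the names and defaults come from the list above, the surrounding structure is illustrative):

```python
# Illustrative configuration loading; the real app.py/ingest.py may structure this differently.
import os

OLLAMA_BASE_URL = os.getenv("OLLAMA_BASE_URL", "http://localhost:11434")
OLLAMA_EMBED_MODEL = os.getenv("OLLAMA_EMBED_MODEL", "nomic-embed-text")
OLLAMA_CHAT_MODEL = os.getenv("OLLAMA_CHAT_MODEL", "llama3.2:latest")
```

Overriding one before launch, e.g. `OLLAMA_CHAT_MODEL=llama3.1 ./env3/bin/python -m uvicorn app:app --reload`, switches models without code changes.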
Project structure:

- `app.py`: FastAPI application and QA endpoint.
- `ingest.py`: document loading, chunking, embedding, and FAISS index creation.
- `documents/sample.txt`: source content used for indexing.
- `vectorstore/`: generated FAISS index files (`index.faiss`, `index.pkl`).
Notes:

- Run ingestion before starting the API; otherwise startup fails because the FAISS index is missing.
- This is a learning/demo project, so error handling and production hardening are intentionally minimal.
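For context on the first note: the API loads the saved index during startup, so a missing `vectorstore/` aborts the process. A hypothetical fail-fast guard (not part of the current code) would look like:

```python
# Hypothetical fail-fast guard illustrating why the API needs ingest.py to run first.
import sys
from pathlib import Path

INDEX_FILE = Path("vectorstore/index.faiss")

if not INDEX_FILE.exists():
    # Without the FAISS index, retrieval cannot work, so stop with a clear hint.
    sys.exit("vectorstore/index.faiss not found - run `./env3/bin/python ingest.py` first.")
```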