AI-powered assistant that answers questions about LangChain’s Python docs. It ingests the official documentation into Pinecone, then serves grounded answers via a Streamlit chat UI.
- Ask focused questions about LangChain docs and get concise answers
- Sources are shown for every response (pulled from the retrieved context)
- Stylish Streamlit UI themed for quick triage and reading
- Asynchronous ingestion pipeline to crawl, chunk, and index docs into Pinecone
- Streamlit UI (main.py) with custom styling in backend/ui_theme.py
- LangChain agent (backend/core.py) using Google Gemini and a retrieval tool over Pinecone
- Ingestion pipeline (ingestion.py) leveraging Tavily for crawling and Google embeddings for vectorization
- Pinecone for vector storage
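Conceptually, the retrieval tool hands the agent the top-k chunks most similar to the query embedding. A minimal pure-Python sketch of that idea, with toy 3-dimensional vectors standing in for real embeddings and a plain list standing in for the Pinecone index (not the project's actual code):

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve_top_k(query_vec, index, k=2):
    # index: list of (chunk_text, embedding) pairs, standing in for Pinecone
    scored = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in scored[:k]]

# Toy "embeddings" for three doc chunks
index = [
    ("LCEL lets you compose runnables", [0.9, 0.1, 0.0]),
    ("Agents call tools in a loop",     [0.1, 0.9, 0.0]),
    ("Retrievers return documents",     [0.8, 0.2, 0.1]),
]
print(retrieve_top_k([1.0, 0.0, 0.0], index, k=2))
```

In the real app, Pinecone performs this similarity search server-side and the agent decides when to call the tool.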
- main.py — Streamlit entrypoint and chat orchestration
- backend/core.py — LLM agent and retrieval tool wiring
- backend/ui_theme.py — UI theming and header/input components
- ingestion.py — Crawl, split, and index LangChain docs
- logger.py — Colored CLI logging helpers
- Pipfile — Runtime dependencies (Python 3.13)
- Python 3.13
- Pipenv installed (pip install pipenv)
- Pinecone index created (supply INDEX_NAME below)
Create a .env file in the repo root with:
GOOGLE_API_KEY=your_google_genai_key
PINECONE_API_KEY=your_pinecone_key
INDEX_NAME=your_pinecone_index_name
TAVILY_API_KEY=your_tavily_key
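You can sanity-check that all four variables are set before running; a small stdlib-only sketch (the variable names match the .env above, but this fail-fast helper is a suggestion, not part of the project):

```python
import os

REQUIRED = ["GOOGLE_API_KEY", "PINECONE_API_KEY", "INDEX_NAME", "TAVILY_API_KEY"]

def missing_env_vars(env=os.environ):
    # Return the names of required variables that are unset or empty
    return [name for name in REQUIRED if not env.get(name)]

# Simulated environment with only two of the four keys present
fake_env = {"GOOGLE_API_KEY": "x", "PINECONE_API_KEY": "y"}
print(missing_env_vars(fake_env))  # lists the absent variable names
```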
- Install dependencies
pipenv install
- Activate the virtual environment
pipenv shell
This crawls the LangChain Python docs, chunks them, and indexes them into Pinecone.
python ingestion.py
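The chunking step above can be pictured as a sliding window with overlap, so consecutive chunks share context. A simplified character-based sketch of what a LangChain text splitter does (the pipeline's actual splitter and chunk sizes may differ):

```python
def chunk_text(text, chunk_size=50, overlap=10):
    # Slide a fixed-size window over the text, stepping by chunk_size - overlap
    # so each chunk shares `overlap` characters with the previous one.
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

doc = "LangChain provides composable building blocks for LLM applications."
for c in chunk_text(doc, chunk_size=30, overlap=5):
    print(repr(c))
```

Overlap matters for retrieval quality: a sentence cut at a chunk boundary still appears intact in one of the two neighboring chunks.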
streamlit run main.py
Open the provided local URL (typically http://localhost:8501) and start asking questions.
- Answers are grounded in the retrieved docs; if context is missing, the assistant says so.
- Each response includes the source URLs used for grounding.
- Rerun ingestion after Pinecone index changes or when you want fresher docs.
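The per-response sources described in the notes above typically come from deduplicating the URLs in the retrieved chunks' metadata. A hedged sketch with plain dicts standing in for LangChain Document objects (the "source" metadata key is an assumption):

```python
def collect_sources(retrieved_docs):
    # Deduplicate source URLs while preserving retrieval order
    seen, sources = set(), []
    for doc in retrieved_docs:
        url = doc.get("metadata", {}).get("source")
        if url and url not in seen:
            seen.add(url)
            sources.append(url)
    return sources

docs = [
    {"page_content": "...", "metadata": {"source": "https://python.langchain.com/docs/concepts/"}},
    {"page_content": "...", "metadata": {"source": "https://python.langchain.com/docs/tutorials/"}},
    {"page_content": "...", "metadata": {"source": "https://python.langchain.com/docs/concepts/"}},
]
print(collect_sources(docs))
```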
- No responses or empty sources: verify Pinecone index name and that ingestion completed.
- Authentication errors: confirm GOOGLE_API_KEY, PINECONE_API_KEY, and TAVILY_API_KEY are set in your shell or .env.
- SSL issues on macOS: certifi is already configured in ingestion.py; ensure dependencies are installed via Pipenv.
MIT