*A Modern Agentic AI + LLMOps Project*
ai4sci is a production-grade LLM agent designed to perform agentic reasoning and interpretation for scientific discovery. Before moving into research, we experiment with and implement the agentic-reasoning tooling on two other tasks:
- Root cause analysis (RCA) for ML engineers: given a set of logs, identify the root cause of a failure
- ESG intelligence: document scoring and auditing
Tech stack:
- Ollama + Llama 3.1 (local inference)
- FastAPI (async LLM server)
- VectorDB-based retrieval pipeline
- Streaming evaluation + canary testing
- Custom monitoring layer
- Streamlit operations dashboard
- Dockerized deployment
- Simulated autoscaling
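To make the first two pieces of the stack concrete, here is a minimal sketch of talking to a locally running Ollama server from Python. The payload shape follows Ollama's `/api/chat` endpoint; the prompt and the `chat` helper name are illustrative, not part of this repo's API:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # Ollama's default local endpoint

def build_chat_request(prompt: str, model: str = "llama3.1:8b") -> dict:
    """Build a non-streaming chat payload for Ollama's /api/chat endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

def chat(prompt: str) -> str:
    """Send the prompt to a local Ollama server and return the model's reply."""
    payload = json.dumps(build_chat_request(prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]
```

`ollama serve` must be running before `chat` is called; `build_chat_request` is pure and can be unit-tested offline.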
This is Task 1 of a 3-part AI project:
- Root Cause Analysis (SciRCA) - Completed
- ESG Intelligence (GreenDocs) - Document parsing, ESG scoring, LLM-based auditing
- AI4Science Reasoning Module - Model-driven scientific insight & anomaly interpretation
```
┌─────────────────────┐
│      Client App     │
│(Streamlit Dashboard)│
└──────────┬──────────┘
           │ HTTP
           ▼
┌─────────────────────┐
│   FastAPI Server    │
│ - RCA Endpoint      │
│ - Monitoring Layer  │
└──────────┬──────────┘
           │ Calls Agent
           ▼
┌─────────────────────┐
│      RCA Agent      │
│ - Tool calls        │
│ - Multi-step plan   │
└──────────┬──────────┘
           │ LLM Chat
           ▼
┌─────────────────────┐
│   Llama 3.1 (8B)    │
│     via Ollama      │
└─────────────────────┘
```
- Tool-using LLM agent (multi-step reasoning)
- Local inference via Ollama
- Chat + tool call parsing
- RAG pipeline integration
- FastAPI async server
- Monitoring:
  - request count
  - latency
  - error rate
- Canary evaluation
- Model registry
- Quantized Llama models
- Async batching of tool calls
- Local GPU/Metal acceleration
- Real-time dashboard (Streamlit)
- Logs viewer
- Inference tester
- Latency charts
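The tool-call loop above can be sketched as follows. This is an illustrative reduction, not the code in `src/agent/`: the tool registry, the `{"tool": ..., "args": ...}` JSON convention, and the canned list of model replies are all assumptions made for the example.

```python
import json
import re
from typing import Callable

# Hypothetical tool registry -- the real agent's tools live in src/agent/.
TOOLS: dict[str, Callable[[str], str]] = {
    "grep_logs": lambda query: f"3 lines matched '{query}'",
    "summarize": lambda text: f"summary({text})",
}

def parse_tool_call(reply: str):
    """Extract a {"tool": ..., "args": ...} object from a model reply.

    Returns (tool_name, args), or None if the reply contains no tool call.
    """
    match = re.search(r"\{.*\}", reply, re.DOTALL)
    if not match:
        return None
    try:
        call = json.loads(match.group())
        return call["tool"], call["args"]
    except (json.JSONDecodeError, KeyError):
        return None

def run_agent(replies: list[str]) -> str:
    """Multi-step loop: run tool calls until the model answers in plain text.

    `replies` stands in for successive LLM turns; in the real agent each
    observation would be appended to the chat and sent back to the model.
    """
    observation = ""
    for reply in replies:
        call = parse_tool_call(reply)
        if call is None:
            return reply  # plain-text reply: final answer, stop looping
        name, args = call
        observation = TOOLS[name](args)
    return observation
```

The parsing step is deliberately forgiving (regex plus `json.loads`), since small local models often wrap tool calls in surrounding prose.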
TODO
```
scirca/
│
├── src/
│   ├── agent/       → RCA agent + LLM client
│   ├── retriever/   → RAG embedding + search
│   ├── serve/       → FastAPI server + monitors
│   ├── utils/       → YAML loader, logger
│   └── models/      → Model registry files
│
├── dashboard/
│   ├── app.py       → Streamlit GUI
│   └── components/  → metrics, logs, registry, tester
│
├── scripts/         → CLI scripts (eval, run agent, benchmark)
├── requirements.txt
└── Dockerfile
```
```shell
brew install ollama
ollama pull llama3.1:8b
ollama serve
uvicorn src.serve.api:app --reload --port 8000
streamlit run dashboard/app.py
```

Open: http://localhost:8501
Dashboard Features:
- Metrics (request rate, latency, errors)
- Logs viewer (Docker/FastAPI logs)
- Model registry viewer
- Inference runner for RCA
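The metrics panel boils down to three counters kept by the monitoring layer. A minimal in-memory sketch (the class name and field layout are illustrative; the real implementation lives in `src/serve/`):

```python
import time
from statistics import median

class Metrics:
    """In-memory counters behind the metrics view: requests, latency, errors."""

    def __init__(self) -> None:
        self.started = time.time()
        self.latencies: list[float] = []  # one entry per completed request
        self.errors = 0

    def record(self, latency_s: float, ok: bool = True) -> None:
        """Record one request's latency and whether it succeeded."""
        self.latencies.append(latency_s)
        if not ok:
            self.errors += 1

    def snapshot(self) -> dict:
        """Summarize counters for the dashboard: count, error rate, p50 latency."""
        n = len(self.latencies)
        return {
            "requests": n,
            "error_rate": self.errors / n if n else 0.0,
            "p50_latency_s": median(self.latencies) if n else 0.0,
        }
```

`snapshot()` returns plain dicts, so the Streamlit side can render it directly without extra serialization.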
```shell
docker build -t scirca-api .
docker run -p 8000:8000 scirca-api
streamlit run dashboard/app.py
```

```shell
curl -X POST http://localhost:8000/rca -H "Content-Type: application/json" -d '{
  "run_summary": "Training failed with NaN loss",
  "logs": ["loss=0.5", "loss=0.7", "loss=nan"],
  "metrics": {"loss": [0.5, 0.7, "nan"]},
  "model_tag": "rca-v2"
}'
```

```shell
python scripts/load_test.py \
  --api http://localhost:8000 \
  --concurrency 50 \
  --total 500
```

or (this second one has been tested):

```shell
bash scripts/run_load_test.sh
```

Planned capabilities:
- ESG report ingestion
- Compliance summarisation
- Automated ESG scoring
- Greenwashing detection
- Multi-document RAG
Planned:
- Scientific anomaly reasoning
- Embedding-based pattern detection
- Hypothesis generation
- LLM-assisted interpretation of experimental results
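Since this module is still planned, here is only a toy sketch of what embedding-based pattern detection could look like: flag experiment runs whose embedding diverges from the centroid of all runs. The cosine threshold and the 2-D vectors are placeholders, not tuned values.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def flag_anomalies(embeddings: list[list[float]], threshold: float = 0.5) -> list[int]:
    """Return indices of runs whose embedding strays from the centroid.

    threshold is a placeholder cutoff on cosine similarity to the centroid.
    """
    dim = len(embeddings[0])
    n = len(embeddings)
    centroid = [sum(e[i] for e in embeddings) / n for i in range(dim)]
    return [i for i, e in enumerate(embeddings) if cosine(e, centroid) < threshold]
```

In the real module the vectors would come from the retriever's embedding model rather than being hand-written.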