API Map

loglux edited this page Feb 6, 2026 · 5 revisions

API Map - Visual Structure

Base URL: http://localhost:8004/api/v1

/api/v1
│
├── /health                              GET    ✓  Check API health
├── /ready                               GET    ✓  Check dependencies
├── /info                                GET    ✓  Get API info
│
├── /knowledge-bases
│   ├── /                                POST   ✓  Create KB
│   ├── /                                GET    ✓  List KBs (paginated)
│   ├── /{kb_id}                         GET    ✓  Get KB details
│   ├── /{kb_id}                         PUT    ✓  Update KB
│   ├── /{kb_id}/retrieval-settings      GET    ✓  Get KB retrieval settings
│   ├── /{kb_id}/retrieval-settings      PUT    ✓  Update KB retrieval settings
│   ├── /{kb_id}/retrieval-settings      DELETE ✓  Clear KB retrieval settings
│   ├── /{kb_id}                         DELETE ✓  Delete KB (soft)
│   ├── /{kb_id}/reprocess               POST   ✓  Reprocess all docs
│   ├── /{kb_id}/regenerate_chat_titles  POST   🤖 Regenerate chat titles
│   └── /{kb_id}/cleanup-orphaned-chunks POST   ✓  Clean orphans
│
├── /documents
│   ├── /                                POST   ✓  Upload document (multipart/form-data)
│   ├── /                                GET    ✓  List documents (paginated, filterable)
│   ├── /{doc_id}                        GET    ✓  Get document + content
│   ├── /{doc_id}                        DELETE ✓  Delete document + vectors
│   ├── /{doc_id}/status                 GET    ⚡ Get processing status (optimized for polling)
│   ├── /{doc_id}/reprocess              POST   ✓  Reprocess document
│   ├── /{doc_id}/analyze                POST   🤖 Analyze structure (LLM)
│   ├── /{doc_id}/structure/apply        POST   ✓  Apply structure
│   └── /{doc_id}/structure              GET    ✓  Get structure
│
├── /chat
│   ├── /                                POST   🤖 Query KB (RAG)
│   ├── /knowledge-bases/{kb_id}/stats   GET    📊 Get chat stats
│   ├── /conversations                   GET    ✓  List conversations
│   ├── /conversations/{id}              GET    ✓  Get conversation
│   ├── /conversations/{id}              PATCH  ✓  Update conversation (title)
│   ├── /conversations/{id}              DELETE ✓  Delete conversation
│   ├── /conversations/{id}/settings     PATCH  ✓  Update settings
│   └── /conversations/{id}/messages     GET    ✓  Get messages
│
├── /retrieve
│   └── /                                POST   ✓  Retrieve chunks (no LLM)
│
├── /prompts
│   ├── /                                GET    ✓  List chat prompt versions
│   ├── /                                POST   ✓  Create chat prompt version
│   ├── /active                          GET    ✓  Get active chat prompt
│   ├── /{id}                            GET    ✓  Get chat prompt version
│   ├── /{id}/activate                   POST   ✓  Activate chat prompt
│   ├── /self-check                      GET    ✓  List self-check prompts
│   ├── /self-check                      POST   ✓  Create self-check prompt
│   ├── /self-check/active               GET    ✓  Get active self-check prompt
│   ├── /self-check/{id}                 GET    ✓  Get self-check prompt version
│   └── /self-check/{id}/activate        POST   ✓  Activate self-check prompt
│
├── /embeddings
│   ├── /models                          GET    ✓  List all models
│   ├── /models/{name}                   GET    ✓  Get model details
│   ├── /providers                       GET    ✓  List providers
│   └── /providers/{provider}/models     GET    ✓  Get provider models
│
├── /llm
│   ├── /models                          GET    ✓  List LLM models
│   └── /providers                       GET    ✓  List LLM providers
│
├── /ollama
│   ├── /status                          GET    ✓  Check Ollama status
│   ├── /models                          GET    ✓  List all Ollama models
│   ├── /models/embeddings               GET    ✓  List embedding models
│   └── /models/llm                      GET    ✓  List LLM models
│
└── /settings
    ├── /                                GET    ✓  Get app settings
    ├── /                                PUT    ✓  Update settings
    ├── /reset                           POST   ✓  Reset to defaults
    └── /metadata                        GET    ✓  Get metadata (options)

Legend

  • ✓ Standard CRUD operation
  • ⚡ Optimized for polling/real-time updates
  • 🤖 Uses LLM/AI processing
  • 📊 Returns statistics/analytics

Data Flow Diagrams

Document Upload & Processing Flow

┌─────────────┐
│   Client    │
└──────┬──────┘
       │ POST /documents/
       │ (multipart/form-data)
       ├─ file
       ├─ knowledge_base_id
       └─ filename (optional)
       │
       ▼
┌──────────────────────┐
│   Document Upload    │
│    (documents.py)    │
└──────┬───────────────┘
       │ Returns: 201 Created
       │ {id, status: "pending", progress: 0}
       │
       ▼
┌──────────────────────┐
│  Background Task     │
│ (document_processor) │
└──────┬───────────────┘
       │
       ├─ 5%  "Loading document..."
       ├─ 15% "Preparing to chunk..."
       ├─ 30% "Chunking completed (N chunks)"
       ├─ 35% "Generating embeddings (0/N)"
       │      ├─ Batch 1 processed
       │      ├─ 48% "Generating embeddings (100/N)"
       │      ├─ Batch 2 processed
       │      └─ 62% "Generating embeddings (200/N)"
       ├─ 75% "Embeddings created (N)"
       ├─ 80% "Indexing in Qdrant..."
       ├─ 85% "Qdrant indexing completed"
       ├─ 90% "Indexing BM25..."
       ├─ 95% "BM25 indexing completed"
       └─ 100% "Completed"
       │
       ▼
┌──────────────────────┐
│   Status Polling     │
│ GET /documents/{id}/ │
│       status         │
└──────┬───────────────┘
       │ Poll every 1s
       │ Check progress_percentage
       │ Check processing_stage
       │
       ▼
  status = "completed"
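The upload/poll flow above can be mirrored client-side with a small loop. A minimal Python sketch, assuming the status payload shape shown in the diagram ({status, progress_percentage, processing_stage}); `fetch_status` is a stub standing in for `GET /documents/{id}/status`:

```python
TERMINAL = {"completed", "failed"}

def poll(fetch_status, on_update):
    """Call fetch_status until the document reaches a terminal state.

    A real client would sleep ~1s between calls (see the Polling Loop
    pattern below); this sketch just drains the stub.
    """
    while True:
        data = fetch_status()
        on_update(data["progress_percentage"], data["processing_stage"])
        if data["status"] in TERMINAL:
            return data["status"]

# Simulated responses mirroring the staged updates above.
responses = iter([
    {"status": "processing", "progress_percentage": 30,
     "processing_stage": "Chunking completed"},
    {"status": "processing", "progress_percentage": 80,
     "processing_stage": "Indexing in Qdrant..."},
    {"status": "completed", "progress_percentage": 100,
     "processing_stage": "Completed"},
])
updates = []
final = poll(lambda: next(responses), lambda p, s: updates.append(p))
```

Stopping on `failed` as well as `completed` matters: a document that errors out never reaches 100%, so a loop keyed only on progress would poll forever.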

Chat/RAG Query Flow

┌─────────────┐
│   Client    │
└──────┬──────┘
       │ POST /chat/
       │ {
       │   question,
       │   knowledge_base_id,
       │   conversation_id (optional),
       │   top_k, temperature, ...
       │ }
       ▼
┌──────────────────────┐
│   Chat Endpoint      │
│     (chat.py)        │
└──────┬───────────────┘
       │
       ├─ Load KB config
       ├─ Load/create conversation
       │
       ▼
┌──────────────────────┐
│  Retrieval Engine    │
│   (retrieval.py)     │
└──────┬───────────────┘
       │
       ├─ Generate query embedding
       │
       ├─ Retrieval Mode?
       │  ├─ Dense:
       │  │  └─ Qdrant vector search
       │  │
       │  └─ Hybrid:
       │     ├─ Qdrant vector search (dense)
       │     ├─ OpenSearch BM25 (lexical)
       │     └─ Merge + rerank results
       │
       ├─ Filter by score_threshold
       ├─ Apply MMR if enabled
       └─ Return top_k chunks
       │
       ▼
┌──────────────────────┐
│   LLM Generation     │
│   (assistant.py)     │
└──────┬───────────────┘
       │
       ├─ Build prompt with context
       ├─ Call LLM (OpenAI/Ollama)
       └─ Generate answer
       │
       ▼
┌──────────────────────┐
│  Save to Database    │
│ (conversation msgs)  │
└──────┬───────────────┘
       │
       ▼
┌─────────────┐
│   Client    │ Returns: {answer, sources, ...}
└─────────────┘
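The hybrid branch above merges dense and lexical hits before the threshold and top_k cuts. The platform's exact fusion method is not documented here; the sketch below shows one plausible approach, a weighted sum over scores assumed normalized to [0, 1], using the `hybrid_dense_weight` / `hybrid_lexical_weight` parameters from the chat request:

```python
def merge_hybrid(dense, lexical, dense_w=0.6, lexical_w=0.4,
                 score_threshold=0.0, top_k=5):
    """Weighted-sum fusion of dense and lexical result lists.

    dense/lexical: dicts mapping chunk_id -> score in [0, 1].
    Returns up to top_k chunk ids whose fused score clears
    score_threshold, best first.
    """
    fused = {}
    for cid, score in dense.items():
        fused[cid] = fused.get(cid, 0.0) + dense_w * score
    for cid, score in lexical.items():
        fused[cid] = fused.get(cid, 0.0) + lexical_w * score
    ranked = sorted(fused.items(), key=lambda kv: kv[1], reverse=True)
    return [cid for cid, score in ranked if score >= score_threshold][:top_k]

# "b" appears in both lists, so its fused score beats either alone.
result = merge_hybrid({"a": 1.0, "b": 0.5}, {"b": 1.0, "c": 0.8})
```

This is why hybrid mode helps with exact technical terms: a chunk that matches both semantically and lexically outranks chunks that match only one way.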

Knowledge Base Creation Flow

┌─────────────┐
│   Client    │
└──────┬──────┘
       │ POST /knowledge-bases/
       │ {name, embedding_model, chunking_strategy, ...}
       │
       ▼
┌──────────────────────┐
│   KB Endpoint        │
│ (knowledge_bases.py) │
└──────┬───────────────┘
       │
       ├─ Validate embedding model exists
       ├─ Get model dimension
       ├─ Generate collection name (kb_{hash})
       │
       ▼
┌──────────────────────┐
│   Create Collection  │
│   in Qdrant          │
└──────┬───────────────┘
       │ vector_size = embedding_dimension
       │ distance = cosine
       │
       ▼
┌──────────────────────┐
│  Save to Database    │
│  (PostgreSQL)        │
└──────┬───────────────┘
       │
       ▼
┌─────────────┐
│   Client    │ Returns: 201 Created {id, ...}
└─────────────┘
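The `kb_{hash}` collection name above can be derived deterministically from the KB id, so the same KB always maps to the same Qdrant collection. A hypothetical sketch; the platform only documents the `kb_{hash}` shape, so the hash algorithm (SHA-256) and truncation length here are assumptions:

```python
import hashlib

def collection_name(kb_id: str) -> str:
    """Map a KB id to a stable Qdrant collection name (kb_{hash}).

    Hypothetical: SHA-256 truncated to 12 hex chars is an assumption,
    not the platform's documented algorithm.
    """
    digest = hashlib.sha256(kb_id.encode("utf-8")).hexdigest()
    return f"kb_{digest[:12]}"
```

Hashing (rather than using the raw name) sidesteps characters that are invalid in collection names and keeps renames from orphaning vectors.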

Request/Response Patterns

Pagination Pattern

Request:

GET /api/v1/documents/?page=2&page_size=10&knowledge_base_id=uuid

page_size default is 10, max is 100.

Response:

{
  "items": [...],
  "total": 150,
  "page": 2,
  "page_size": 10,
  "pages": 15
}
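The envelope fields follow directly from the request: `pages` is the ceiling of `total / page_size`, with `page_size` clamped to the documented maximum of 100. A small sketch:

```python
import math

MAX_PAGE_SIZE = 100

def page_meta(total: int, page: int, page_size: int = 10) -> dict:
    """Build the pagination envelope fields for a list response."""
    page_size = min(page_size, MAX_PAGE_SIZE)
    return {
        "total": total,
        "page": page,
        "page_size": page_size,
        "pages": math.ceil(total / page_size) if page_size else 0,
    }
```

So for the example request (`total=150`, `page_size=10`) the last valid page is 15; requesting `page=16` would return an empty `items` list.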

Filtering Pattern

Request:

GET /api/v1/documents/?status=processing&knowledge_base_id=uuid

Progress Tracking Pattern

Polling Loop (Client-side):

async function pollDocumentStatus(docId) {
  const interval = setInterval(async () => {
    const status = await fetch(`/api/v1/documents/${docId}/status`)
    const data = await status.json()

    // Update UI with progress
    updateProgressBar(data.progress_percentage)
    updateStatusText(data.processing_stage)

    // Stop when completed or failed
    if (data.status === 'completed' || data.status === 'failed') {
      clearInterval(interval)
    }
  }, 1000) // Poll every 1 second
}

Integration Points

External Services

Knowledge Base Platform
│
├─ OpenAI API
│  ├─ Embeddings (text-embedding-3-*)
│  └─ Chat Completions (gpt-4o, gpt-4o-mini)
│
├─ Voyage AI API
│  └─ Embeddings (voyage-4, voyage-code-3)
│
├─ Ollama (Local)
│  ├─ Embeddings (nomic-embed-text, mxbai-embed-large)
│  └─ Chat (llama3.1, mistral, qwen)
│
├─ Qdrant (Vector Store)
│  ├─ Collections management
│  ├─ Vector CRUD
│  └─ Similarity search
│
├─ OpenSearch (Lexical Store)
│  ├─ BM25 indexing
│  └─ Keyword search
│
└─ PostgreSQL (Metadata)
   ├─ Knowledge Bases
   ├─ Documents
   ├─ Conversations
   └─ Settings

Common Use Cases

1. Build a Documentation Search

# Create KB
KB=$(curl -s -X POST http://localhost:8004/api/v1/knowledge-bases/ \
  -H 'Content-Type: application/json' \
  -d '{"name":"Docs","chunking_strategy":"semantic"}' | jq -r .id)

# Upload docs
for file in docs/*.md; do
  curl -s -X POST http://localhost:8004/api/v1/documents/ -F file=@"$file" -F knowledge_base_id=$KB
done

# Query
curl -s -X POST http://localhost:8004/api/v1/chat/ \
  -H 'Content-Type: application/json' \
  -d '{"question":"How to install?","knowledge_base_id":"'$KB'"}'

2. Monitor Processing Progress

# Upload
DOC=$(curl -s -X POST http://localhost:8004/api/v1/documents/ -F file=@large.pdf -F knowledge_base_id=$KB | jq -r .id)

# Watch progress
watch -n 1 "curl -s http://localhost:8004/api/v1/documents/$DOC/status | jq '{progress:.progress_percentage,stage:.processing_stage}'"

3. Hybrid Search for Better Results

curl -s -X POST http://localhost:8004/api/v1/chat/ \
  -H 'Content-Type: application/json' \
  -d '{
  "question": "specific technical term",
  "knowledge_base_id": "'$KB'",
  "retrieval_mode": "hybrid",
  "hybrid_dense_weight": 0.6,
  "hybrid_lexical_weight": 0.4,
  "lexical_top_k": 15,
  "top_k": 5
}'

4. Conversation with Context

# First message
CONV=$(curl -s -X POST http://localhost:8004/api/v1/chat/ \
  -H 'Content-Type: application/json' \
  -d '{
  "question": "What is Python?",
  "knowledge_base_id": "'$KB'"
}' | jq -r .conversation_id)

# Follow-up (with context)
curl -s -X POST http://localhost:8004/api/v1/chat/ \
  -H 'Content-Type: application/json' \
  -d '{
  "question": "How do I install it?",
  "knowledge_base_id": "'$KB'",
  "conversation_id": "'$CONV'"
}'

Performance Tips

1. Batch Document Uploads

Upload multiple documents without waiting for each to complete.
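Since processing runs as a background task, the client only pays for the upload itself. One way to fire uploads concurrently is a thread pool; in this sketch `upload_document` is a stub standing in for the multipart `POST /documents/` call:

```python
from concurrent.futures import ThreadPoolExecutor

def upload_document(path: str) -> dict:
    """Stub: a real client would POST multipart/form-data
    (file, knowledge_base_id) to /api/v1/documents/ and return
    the {id, status: "pending", ...} response."""
    return {"path": path, "status": "pending"}

def upload_all(paths, max_workers=4):
    """Submit every upload at once; chunking/embedding/indexing
    continues server-side, so there is no need to wait for one
    document to complete before sending the next."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(upload_document, paths))

results = upload_all(["a.md", "b.md"])
```

Each response's `id` can then be fed into the status-polling pattern above to track all documents in parallel.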

2. Use Appropriate Chunk Sizes

  • Small docs (< 10 pages): 500-800 chars
  • Medium docs (10-50 pages): 800-1200 chars
  • Large docs (50+ pages): 1200-2000 chars
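The guidance above can be folded into a small helper. The page-count boundaries mirror the list; the returned sizes are illustrative midpoints of each range, not platform defaults:

```python
def suggested_chunk_size(pages: int) -> int:
    """Pick a chunk size in characters from the guidance above.

    Returned values are illustrative picks within each documented
    range, not values the platform itself uses.
    """
    if pages < 10:
        return 650    # small docs: 500-800 chars
    if pages <= 50:
        return 1000   # medium docs: 800-1200 chars
    return 1600       # large docs: 1200-2000 chars
```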

3. Choose Right Retrieval Mode

  • Dense only: General questions, semantic similarity
  • Hybrid: Technical terms, specific phrases, entity names

4. Optimize Polling

  • Poll every 1s during processing
  • Stop polling when status = completed/failed
  • Use status endpoint (lighter than full document GET)

5. Conversation Settings

  • Save settings per conversation
  • Reuse conversation_id for multi-turn chats
  • Use lower temperature (0.5) for factual answers

Last Updated: 2026-02-05