Feature updates #1447
Open
javi2481 wants to merge 28 commits into
Conversation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- debug flag now reads from DEBUG env var (default false)
- add scripts/setup-droplet.sh for initial Digital Ocean Droplet setup

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
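A minimal sketch of such an env-driven flag (the helper name is illustrative; the commit only states that the flag reads `DEBUG` and defaults to false):

```python
import os

def read_debug_flag() -> bool:
    """Read the DEBUG env var; unset or unrecognized values mean False."""
    return os.getenv("DEBUG", "false").strip().lower() in {"1", "true", "yes"}
```

Accepting a small set of truthy spellings avoids the common pitfall where `DEBUG=false` is truthy simply because the string is non-empty.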
- Add docs/ROADMAP-2026.md with project overview and Q1-Q4 plan
- Add docs/specs/api-rate-limiting.md with specification
- Add .atl/skill-registry.md from sdd-init
- Add docs/specs/api-rate-limiting-design.md with technical design
- Update roadmap with design status
- Document approach, data flow, file changes
- Add docs/specs/api-rate-limiting-tasks.md with implementation checklist
- 18 tasks across 5 phases (Foundation, Core, Integration, Testing, Cleanup)
- Add docs/PROJECT-COMPLETE.md: complete project documentation with all info
- Add docs/PROJECT-OVERVIEW.md: detailed project exploration
- Update docs/ROADMAP-2026.md: add links to complete docs
- Includes: stack, architecture, features, B2B/B2C, gaps, SDD phases

This commit consolidates all exploration work into the repository:
- Project overview and purpose
- Complete tech stack
- Architecture diagrams
- Implemented features
- API documentation
- Business verticals (B2B/B2C)
- Gap analysis and solutions
- SDD phases for Rate Limiting (proposal, spec, design, tasks)
- Add tl;dr section explaining 90% is already resolved
- Document what comes ready from the OpenRAG base: Docling, OpenSearch, Langflow, APIs, MCP, Auth, Connectors, Langfuse
- Clarify developer role: assemble, configure, customize
- Only 10% left: Rate Limiting
- Technical guide with code examples
- Step 1: Add Redis to docker-compose
- Step 2: Create rate_limit_middleware.py
- Step 3: Connect to main.py
- Includes tiered rate limiting notes
- Follows SDD design
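Step 2 can be sketched as a plain ASGI middleware, which is what a Starlette middleware reduces to. This is not the repository's actual `rate_limit_middleware.py`; `check_limit` is a hypothetical callable standing in for the real limiter:

```python
class RateLimitMiddleware:
    """Minimal ASGI middleware sketch: short-circuits /v1/* requests with a
    429 when the limit check fails, otherwise forwards to the wrapped app."""

    def __init__(self, app, check_limit):
        self.app = app              # the downstream ASGI application
        self.check_limit = check_limit  # callable(scope) -> bool (hypothetical)

    async def __call__(self, scope, receive, send):
        if scope["type"] == "http" and scope["path"].startswith("/v1/"):
            if not self.check_limit(scope):
                # Rate limit exceeded: answer directly, never reach the app
                await send({"type": "http.response.start", "status": 429,
                            "headers": [(b"retry-after", b"60")]})
                await send({"type": "http.response.body",
                            "body": b"rate limit exceeded"})
                return
        await self.app(scope, receive, send)
```

Step 3 then amounts to wrapping the FastAPI app with this class (FastAPI exposes `app.add_middleware(...)` for exactly this).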
- Add 90% resolved insight (from the OpenRAG base)
- Document all Rate Limiting SDD files
- Add implementation guide reference
- Update Q1 deliverables and changelog
- Comprehensive report in Spanish
- What Axioma 2.0 is and does
- Tech stack breakdown
- What's resolved (90%)
- What's missing (10%)
- Use cases (B2B/B2C)
- 2026 roadmap
- How to get started
- Current status and metrics
IMPORTANT DISCOVERY:
- SSO/SAML, DLS/FLS, Audit Logs are OpenSearch NATIVE features (config only)
- Multi-language is .env config (Granite embedding)
- Only Rate Limiting and White-label require CODE

Updated:
- INFORME-COMPLETO.md: added TL;DR section, corrected state table
- ROADMAP-2026.md: updated Q2 to show config-based features
- Fixed section numbering in INFORME-COMPLETO.md
…lback
- Add RateLimitMiddleware (Starlette) intercepting all /v1/* requests
- Add RateLimiter service backed by Redis with in-memory fallback
- Tier-based limits: free (100 req/min), pro (1000/min), enterprise (unlimited)
- Inject X-RateLimit-* headers on every passing response; return 429 when exceeded
- Redis key uses sha256(api_key); the raw key is never stored
- Add Redis 7 service to docker-compose with healthcheck and volume
- Add redis[asyncio]>=5.0.0 dependency
- Fix test_encryption: clear provider env vars to avoid test pollution
- Fix test_opensearch_security_setup: correct call count (6→7) and make cluster.health awaitable
- Add 13 unit tests for RateLimiter (TDD)
- Update all docs to reflect rate limiting as complete
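The in-memory fallback side of such a limiter can be sketched with the stdlib alone. This is an assumption-laden illustration, not the PR's code: the class name is invented, the Redis path is omitted, and a fixed one-minute window stands in for whatever algorithm the real RateLimiter uses. Only the documented facts are kept: the tier limits and the sha256-hashed key.

```python
import hashlib
import time

# Tier limits (requests per minute) as stated in the commit; None = unlimited
TIER_LIMITS = {"free": 100, "pro": 1000, "enterprise": None}

class InMemoryRateLimiter:
    """Fixed-window fallback limiter used when Redis is unreachable."""

    def __init__(self):
        self._windows = {}  # hashed key -> (window_start, request_count)

    @staticmethod
    def _bucket_key(api_key: str) -> str:
        # Store only the sha256 digest of the API key, never the raw key
        return "rl:" + hashlib.sha256(api_key.encode()).hexdigest()

    def check(self, api_key, tier, now=None):
        """Return (allowed, remaining); remaining is -1 for unlimited tiers."""
        limit = TIER_LIMITS.get(tier, TIER_LIMITS["free"])
        if limit is None:
            return True, -1
        now = time.time() if now is None else now
        window = int(now // 60)  # one-minute fixed window
        key = self._bucket_key(api_key)
        start, count = self._windows.get(key, (window, 0))
        if start != window:  # a new minute started: reset the counter
            start, count = window, 0
        count += 1
        self._windows[key] = (start, count)
        return count <= limit, max(limit - count, 0)
```

The `remaining` value is what would feed the `X-RateLimit-*` response headers the commit mentions.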
- Add docs/MCP-GUIDE.md: practical guide for connecting Cursor and Claude Desktop to Axioma via MCP (tools, config JSON, env vars, troubleshooting)
- Rename FastAPI app title from 'OpenRAG API' to 'Axioma API'
- Add app description with auth instructions (X-API-Key / Bearer)
- Add openapi_tags descriptions for public and internal groups
- Add summary and description to all 15 public v1 add_api_route calls
- Add response_model=SettingsResponse to GET /v1/settings
- Expand vertical section into 3 distinct go-to-market segments: B2C SaaS (90% ready), B2B Cloud/Managed (65%), B2B Air-gap (40%)
- Add SGLang infrastructure section: RadixAttention, supported models, hardware requirements, and drop-in integration approach
- Add SGLang to Q4 2026 deliverables
- Renumber sections to accommodate new content
- Add semantic cache (langchain-redis) to avoid redundant LLM calls for identical or near-identical prompts via cosine similarity
- SemanticCache wraps RedisSemanticCache with graceful passthrough fallback if Redis/RediSearch is unavailable
- Inject cache check in ChatService.langflow_chat() before the Langflow call (non-streaming only); cache set after response
- Fix granite-embedding-278m-multilingual dimensions: 1024 → 768 (IBM official specs; the mismatch caused a fatal OpenSearch mapping error)
- Change Redis image to redis/redis-stack:7-latest (includes the RediSearch module required for vector similarity search; backward-compatible)
- Add LANGCACHE_ENABLED, LANGCACHE_SIMILARITY_THRESHOLD, LANGCACHE_TTL settings with env var overrides
- Add langchain-redis>=0.2.0 and langchain-openai>=0.2.0 dependencies
- Document multilingual config, 768-dim note, and re-ingestion strategy in PROJECT-OVERVIEW.md
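The "graceful passthrough fallback" pattern can be sketched as follows. The class and method names here are illustrative, not the langchain-redis API; the point is that a missing or failing backend degrades to a cache miss rather than an error:

```python
class SemanticCacheFallback:
    """Wraps a semantic-cache backend (e.g. a RedisSemanticCache instance).
    If the backend is absent or raises, every lookup becomes a miss and every
    store becomes a no-op, so chat requests are never broken by the cache."""

    def __init__(self, backend=None):
        self._backend = backend  # None means: caching disabled, pass through

    def lookup(self, prompt):
        if self._backend is None:
            return None  # miss: caller falls through to the real LLM call
        try:
            return self._backend.lookup(prompt)
        except Exception:
            return None  # backend outage degrades to a miss, not an error

    def store(self, prompt, response):
        if self._backend is None:
            return
        try:
            self._backend.store(prompt, response)
        except Exception:
            pass  # best-effort write; drop on failure
```

In the flow described by the commit, `lookup()` would run before the Langflow call and `store()` after the response (non-streaming path only).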
Replace redis/redis-stack with valkey/valkey-bundle in docker-compose, rename REDIS_URL → VALKEY_URL in settings and rate_limiter constructor, update all docs and spec migration notes. No logic changes — redis-py is protocol-compatible with Valkey.
…ause)
Rename redis_url → valkey_url parameter, update docstrings and test fixtures. The langchain-redis kwarg is kept as redis_url, since Valkey is protocol-compatible.
- Valkey: enable I/O multi-threading (4 threads, lazyfree) for 230%+ throughput
- OpenSearch: replace dis_max with a hybrid query + RRF normalization pipeline (neural-search plugin, rank_constant=60, graceful fallback to dis_max)
- Ragas: add nightly batch eval script with Langfuse integration (Faithfulness, Answer Relevancy, Context Precision)
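For the OpenSearch item, the request-side half is a `hybrid` query combining a lexical and a neural sub-query; the RRF normalization itself is configured server-side in a search pipeline (which is why a graceful fallback to dis_max is possible when the plugin is absent). A sketch of the query body, with placeholder field and model names not taken from the PR:

```python
def build_hybrid_query(text, embedding_field, model_id, k=10):
    """Build an OpenSearch hybrid query body (neural-search plugin).
    'content' and embedding_field are placeholder field names."""
    return {
        "query": {
            "hybrid": {
                "queries": [
                    # Lexical sub-query (BM25 over the text field)
                    {"match": {"content": {"query": text}}},
                    # Neural sub-query (k-NN over the embedding field)
                    {"neural": {embedding_field: {
                        "query_text": text,
                        "model_id": model_id,
                        "k": k,
                    }}},
                ]
            }
        }
    }
```

The body would be sent with a `search_pipeline` parameter naming the RRF pipeline; consult the OpenSearch docs for the exact pipeline processor settings for your version.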
- Item 3: LLMRouter service with Ollama backend for Granite 4.0 H-Tiny
  - GRANITE_MODEL / GRANITE_ENDPOINT / GRANITE_BACKEND env vars
  - Abstraction layer ready for SGLang swap in Phase 3 (GRANITE_BACKEND=sglang)
- Item 2: Granite Guardian 3.3 async guardrail service
  - Fire-and-forget evaluation via asyncio.create_task(); never blocks responses
  - Hooks in async_chat() and async_langflow_chat() post-response
  - Scores uploaded to Langfuse: guardian/safe, guardian/faithful, guardian/evaluation_ms
  - GUARDIAN_ENABLED / GUARDIAN_MODEL / GUARDIAN_SAMPLE_RATE env vars
- Item 1: HybridChunker + Context Expansion
  - extract_with_hybrid_chunker() in document_processing.py
  - Preserves section_title, parent_section, chunk_index metadata per chunk
  - Conditional dispatch in processors.py (HYBRID_CHUNKER_ENABLED flag)
  - _expand_chunk_contexts() + _fetch_adjacent_chunks() in search_service.py
  - HYBRID_CHUNKER_ENABLED / CONTEXT_EXPANSION_ENABLED env vars (default false)
- docling>=2.0.0 added to pyproject.toml

All features deploy-safe: disabled by default via env flags
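The fire-and-forget pattern from Item 2 can be shown in a self-contained form. The function bodies are stand-ins (the real service calls Granite Guardian and uploads scores to Langfuse); what is faithful to the commit is the shape: `asyncio.create_task()` schedules the evaluation without awaiting it, so the user response returns immediately:

```python
import asyncio

async def evaluate_with_guardian(response_text, scores):
    """Stand-in for the Granite Guardian evaluation; the real version would
    score the response and upload guardian/* metrics to Langfuse."""
    await asyncio.sleep(0)  # simulate the async model call
    scores.append({"guardian/safe": True, "guardian/evaluation_ms": 0})

async def async_chat(prompt, scores):
    response = f"echo: {prompt}"  # stand-in for the actual LLM response
    # Fire-and-forget: schedule evaluation without awaiting it, so the
    # user-facing response is never blocked on the guardrail.
    task = asyncio.create_task(evaluate_with_guardian(response, scores))
    async_chat.pending = task  # keep a reference so the task isn't GC'd
    return response
```

Keeping a reference to the created task matters: the asyncio docs warn that the event loop holds only a weak reference, so an unreferenced fire-and-forget task can be garbage-collected before it runs.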
- PROJECT-COMPLETE.md: add Phase 2 features to done table + roadmap
- ROADMAP-2026.md: mark Phase 0/1/2 complete, add Phase 3/4 items
- PROJECT-OVERVIEW.md: add llm_router/guardrail_service to tree + stack table
- INFORME-COMPLETO.md: update TL;DR box + roadmap sections
Documents the "cuarto de máquinas" ("engine room") architectural pattern for Phase 3: DLS/FLS, ISM lifecycle, UBI, Search Relevance Workbench, ML Commons, alerting, and OpenSearch Assistant, all config-only except UBI (1 hook). Includes an anti-pattern warning: no AI agent logic in Dashboards.