Feat/kb integration v2.2.0 #17
Open
racmac57 wants to merge 10 commits into
Looks fine to me. qa.
- Added batch processing (100 files per cycle) to prevent system overload
- Implemented stability skip optimization (files older than 10 minutes bypass stability checks)
- Enhanced parallel processing with optional multiprocessing and fallback
- Added archive reprocessing script (reprocess_output.py)
- Added OneDrive migration script (migrate_to_onedrive.py)
- Refactored department configuration with 18 domain-specific departments
- Added auto-archival of old output sessions (>90 days)
- Implemented long path handling for Windows MAX_PATH limits
- Added version conflict resolution for sidecars and manifests

Performance: Reduced processing time for 6,500 files from ~3.5 hours to ~53 minutes (roughly a 75% reduction, about 4x faster)

Updated README.md, SUMMARY.md, and CHANGELOG.md with v2.1.9 improvements
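The batch-processing and stability-skip items above can be illustrated with a minimal sketch. The function names, the oldest-first ordering, and the `(name, mtime)` tuple representation are assumptions made for illustration; only the 100-file batch size and the 10-minute threshold come from the commit message, and the actual logic in watcher_splitter.py may differ.

```python
from typing import Iterable, List, Tuple

BATCH_SIZE = 100              # files handled per cycle (from the commit message)
STABILITY_SKIP_SECONDS = 600  # 10 minutes (from the commit message)


def needs_stability_check(mtime: float, now: float) -> bool:
    """Return True only for files modified within the last 10 minutes.

    Older files are assumed to be fully written, so the (hypothetical)
    stability check is skipped for them.
    """
    return (now - mtime) <= STABILITY_SKIP_SECONDS


def select_batch(
    files: Iterable[Tuple[str, float]], batch_size: int = BATCH_SIZE
) -> List[Tuple[str, float]]:
    """Return at most batch_size (name, mtime) pairs for one cycle.

    Oldest-first ordering is an assumption, not something the PR states.
    """
    return sorted(files, key=lambda item: item[1])[:batch_size]
```

With 250 pending files, `select_batch` would hand back the 100 oldest and leave the rest for later cycles, which is one way to keep a single cycle from overloading the system.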
…(v2.1.9 - 2025-11-19)

- Added analyze_failed_files.py script for comprehensive failed file analysis
  - Analyzes file types, sizes, time patterns, and reprocessing potential
  - Identifies files that might succeed with updated code
  - Saves analysis results to JSON for review
- Updated config.json to use OneDrive path for failed directory
  - Added failed_dir: %OneDriveCommercial%\\KB_Shared\\03_archive\\failed
  - Ensures consistency with archive and output directories
- Enhanced watcher_splitter.py load_cfg() function
  - Added failed_dir to environment variable expansion list
  - Ensures proper path resolution for failed directory
- Added comprehensive HANDOFF_PROMPT.md
  - Complete project context for AI assistants
  - Current system state and findings
  - Recommendations and next steps
- Updated documentation (README, SUMMARY, CHANGELOG)
  - Added failed file analysis tools section
  - Updated v2.1.9 changes for November 19
  - Documented OneDrive failed directory configuration
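The environment variable expansion for failed_dir might look roughly like the sketch below. This is not the actual load_cfg() from watcher_splitter.py: `PATH_KEYS`, `expand_percent_vars`, and `load_cfg_from_text` are hypothetical names. A small regex is used instead of `os.path.expandvars` because the latter only expands `%NAME%` syntax on Windows. The doubled backslashes in the config value are consistent with JSON string escaping.

```python
import json
import os
import re

# Assumption: keys whose values are paths that may contain %NAME% references.
PATH_KEYS = ("failed_dir",)
_PERCENT_VAR = re.compile(r"%([^%]+)%")


def expand_percent_vars(value: str, env=None) -> str:
    """Expand Windows-style %NAME% references; unknown names are left as-is."""
    env = os.environ if env is None else env
    return _PERCENT_VAR.sub(lambda m: env.get(m.group(1), m.group(0)), value)


def load_cfg_from_text(text: str, env=None) -> dict:
    """Parse JSON config text and expand variables in the listed path keys."""
    cfg = json.loads(text)
    for key in PATH_KEYS:
        if isinstance(cfg.get(key), str):
            cfg[key] = expand_percent_vars(cfg[key], env)
    return cfg
```

Leaving unknown variables untouched makes a missing `OneDriveCommercial` variable visible in the resolved path rather than silently producing an empty prefix.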
…rom 13 to 15 file types
…ring (v2.2.0)

- Automatic KB insertion: Chunks automatically inserted into ChromaDB during processing
- Enterprise retry logic: @backoff decorator with exponential backoff (3 retries)
- Duplicate prevention: Pre-insertion checks prevent duplicate chunks
- Graceful degradation: Errors logged but processing continues
- Real-time monitoring dashboard: Streamlit app with live metrics, charts, and RAG search
- Comprehensive testing: Integration and unit tests with mocked embeddings
- Updated PowerShell scripts: Start/Stop scripts now manage both watcher and dashboard
- Configuration options: New KB config keys for easy tuning
- Documentation: Complete KB integration guide and updated README/CHANGELOG/SUMMARY

Key files:

- watcher_splitter.py: Added insert_chunks_to_kb() with retry and duplicate checking
- dashboard_kb_monitoring.py: New Streamlit monitoring dashboard
- rag_integration.py: Fixed metadata compatibility (tags/keywords as JSON strings)
- config.json: Added auto_kb_insertion, kb_insertion_batch_size, kb_insertion_retry_attempts
- requirements.txt: Added backoff, plotly, pytest-asyncio
- tests/: Comprehensive test coverage for KB integration
- scripts/: Updated PowerShell scripts for dual-service management
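The retry, duplicate-prevention, and graceful-degradation behaviour listed above can be sketched with the standard library alone. This is not the PR's insert_chunks_to_kb(): the hand-rolled `retry_with_backoff` decorator stands in for the `backoff` package, `InMemoryStore` stands in for a ChromaDB collection, and reading "3 retries" as three total attempts is an assumption.

```python
import functools
import logging
import time

log = logging.getLogger("kb_sketch")


def retry_with_backoff(max_tries=3, base_delay=0.5, sleep=time.sleep):
    """Retry the wrapped call, doubling the delay after each failure."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_tries + 1):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == max_tries:
                        raise  # out of attempts: let the caller decide
                    sleep(base_delay * 2 ** (attempt - 1))
        return wrapper
    return decorator


class InMemoryStore:
    """Hypothetical stand-in for a vector store keyed by chunk ID."""
    def __init__(self):
        self.docs = {}

    def existing_ids(self, ids):
        return {i for i in ids if i in self.docs}

    def add(self, ids, documents):
        self.docs.update(zip(ids, documents))


def insert_chunks(store, chunks, sleep=time.sleep):
    """Insert only unseen chunks (id -> text); return how many were added."""
    known = store.existing_ids(chunks)
    new = {cid: text for cid, text in chunks.items() if cid not in known}
    if not new:
        return 0

    @retry_with_backoff(max_tries=3, sleep=sleep)
    def _add():
        store.add(list(new), list(new.values()))

    try:
        _add()
    except Exception as exc:
        # Graceful degradation: log the failure and let processing continue.
        log.error("KB insertion failed after retries: %s", exc)
        return 0
    return len(new)
```

Passing `sleep` as a parameter keeps the sketch testable without real delays; a call such as `insert_chunks(InMemoryStore(), {"chunk-1": "text"})` uses `time.sleep` by default.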
hy5guy previously approved these changes on Nov 23, 2025
- nltk is required by watcher_splitter.py and chunker_core.py
- Fixes ModuleNotFoundError in CI test suite (8 failing tests)
This was referenced Apr 14, 2026
Summary
Changes
Test
Checklist