-
Notifications
You must be signed in to change notification settings - Fork 0
Tracking changes: v3.1-vec development branch vs original v3.0 (not to be merged) #5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: v3.0
Are you sure you want to change the base?
Conversation
|
@gemini-code-assist stop |
|
It appears you've used the |
- Split exec_admin into exec_admin (shows errors) and exec_admin_silent - Execute each SET command individually to catch specific failures - Add proper error checking and early exit on configuration failures - Fix MCP endpoint test URL from /config to /mcp/config - Update displayed endpoint URLs to include /mcp/ prefix and all 5 endpoints
- Change /config to /mcp/config - Change /query to /mcp/query
- Initialize MySQL_Tool_Handler in ProxySQL_MCP_Server constructor with MySQL configuration from MCP variables - Use GloVars.get_SSL_pem_mem() to get SSL certificates correctly - Add MySQL_Tool_Handler cleanup in destructor - Change configure_mcp.sh default MySQL port from 3307 to 3306 - Change configure_mcp.sh default password from test123 to empty - Update help text and examples to match new defaults
The script correctly uses ${VAR:-default} syntax, so it already
respects environment variables. The issue was with the MCP server
not reinitializing when MySQL configuration changes, which is a
separate issue.
When LOAD MCP VARIABLES TO RUNTIME is called and the MCP server is already running, the MySQL Tool Handler is now recreated with the current configuration values. This allows changing MySQL connection parameters without restarting ProxySQL. The reinitialization: 1. Deletes the old MySQL Tool Handler 2. Creates a new one with current mcp-mysql_* values 3. Initializes the new handler 4. Logs success or failure
Change MYSQL_PASSWORD from using :- to = for default value. This allows setting an empty password via environment variable: - MYSQL_PASSWORD="" will now use empty password - Previously, empty string was treated as unset, forcing the default
Print environment variables at the start of each script when they are set: - setup_test_db.sh: Shows MYSQL_HOST, MYSQL_PORT, MYSQL_USER, MYSQL_PASSWORD, TEST_DB_NAME - configure_mcp.sh: Shows MYSQL_HOST, MYSQL_PORT, MYSQL_USER, MYSQL_PASSWORD, TEST_DB_NAME, MCP_PORT - test_mcp_tools.sh: Shows MCP_HOST, MCP_PORT This helps users verify that the correct environment variables are being used.
This commit implements Option 1 (Multiple Tool Handlers) for the MCP module, where each of the 5 endpoints has its own dedicated tool handler with specific tools. ## Architecture Changes - Created MCP_Tool_Handler base class interface for all tool handlers - Each endpoint now has its own dedicated tool handler: - /mcp/config → Config_Tool_Handler (configuration management) - /mcp/query → Query_Tool_Handler (database exploration) - /mcp/admin → Admin_Tool_Handler (administrative operations) - /mcp/cache → Cache_Tool_Handler (cache management) - /mcp/observe → Observe_Tool_Handler (monitoring & metrics) ## New Files Base Interface: - include/MCP_Tool_Handler.h - Base class for all tool handlers Tool Handlers: - include/Config_Tool_Handler.h, lib/Config_Tool_Handler.cpp - include/Query_Tool_Handler.h, lib/Query_Tool_Handler.cpp - include/Admin_Tool_Handler.h, lib/Admin_Tool_Handler.cpp - include/Cache_Tool_Handler.h, lib/Cache_Tool_Handler.cpp - include/Observe_Tool_Handler.h, lib/Observe_Tool_Handler.cpp Documentation: - doc/MCP/Architecture.md - Comprehensive architecture documentation ## Modified Files - include/MCP_Thread.h, lib/MCP_Thread.cpp - Added 5 tool handler pointers - include/MCP_Endpoint.h, lib/MCP_Endpoint.cpp - Use tool_handler base class - lib/ProxySQL_MCP_Server.cpp - Create and pass handlers to endpoints - lib/Makefile - Added new source files ## Implementation Status - Config_Tool_Handler: Functional (get_config, set_config, list_variables, get_status) - Query_Tool_Handler: Functional (wraps MySQL_Tool_Handler, all 18 tools) - Admin_Tool_Handler: Stub implementations (TODO: implement) - Cache_Tool_Handler: Stub implementations (TODO: implement) - Observe_Tool_Handler: Stub implementations (TODO: implement) See GitHub Issue #8 for detailed TODO list. Co-authored-by: Claude <claude@anthropic.com>
This commit implements Phase 2 of the MCP multi-endpoint architecture:
per-endpoint Bearer token authentication.
## Changes
### lib/MCP_Endpoint.cpp
- Implemented `authenticate_request()` method with:
- Per-endpoint token validation (mcp-{endpoint}_endpoint_auth)
- Bearer token support via Authorization header
- Query parameter fallback (?token=xxx) for simple testing
- No authentication when token is not configured (backward compatible)
- Proper 401 Unauthorized response on auth failure
- Token whitespace trimming
- Debug logging for troubleshooting
### doc/MCP/Architecture.md
- Updated Per-Endpoint Authentication section with complete implementation
- Marked Phase 3 authentication task as completed (✅)
- Added authentication implementation code example
## Authentication Flow
1. Client sends request with Bearer token:
- Header: `Authorization: Bearer <token>`
- Or query param: `?token=<token>`
2. Server validates against endpoint-specific variable:
- `/mcp/config` → `mcp-config_endpoint_auth`
- `/mcp/observe` → `mcp-observe_endpoint_auth`
- `/mcp/query` → `mcp-query_endpoint_auth`
- `/mcp/admin` → `mcp-admin_endpoint_auth`
- `/mcp/cache` → `mcp-cache_endpoint_auth`
3. Returns 401 Unauthorized if:
- Auth is required but not provided
- Token doesn't match expected value
4. Allows request if:
- No auth token configured (backward compatible)
- Token matches expected value
## Testing
```bash
# Set auth token for /mcp/query endpoint
mysql -h 127.0.0.1 -P 6032 -u admin -padmin \
-e "SET mcp-query_endpoint_auth='my-secret-token'; LOAD MCP VARIABLES TO RUNTIME;"
# Test with Bearer token
curl -k -X POST https://127.0.0.1:6071/mcp/query \
-H "Content-Type: application/json" \
-H "Authorization: Bearer my-secret-token" \
-d '{"jsonrpc":"2.0","method":"tools/list","id":1}'
# Test with query parameter
curl -k -X POST "https://127.0.0.1:6071/mcp/query?token=my-secret-token" \
-H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","method":"tools/list","id":1}'
```
## Status
✅ Authentication fully implemented and functional
⚠️ Testing with running ProxySQL instance still needed
Co-authored-by: Claude <claude@anthropic.com>
- Add discover_tools() function that calls tools/list on each endpoint - Store discovered tools in temp file to avoid bash associative array issues - Define all expected tools with test configurations - Only test tools that are discovered via tools/list - Add support for all 5 endpoints: config, query, admin, cache, observe - Add --list-only flag to show discovered tools without testing - Add --endpoint flag to test specific endpoint - Improve help output with endpoint descriptions
Verbose output was being printed to stdout, contaminating the captured response value and causing tool discovery to fail. Redirect debug output to stderr instead.
The code was creating a dangling pointer by calling c_str() on a temporary std::string object, causing undefined behavior and crashes when processing query results. Before: const char* col_name = columns[i].get<std::string>().c_str(); // ^ temporary string destroyed here, col_name is dangling After: std::string col_name = columns[i].get<std::string>(); // ^ col_name is valid until end of scope This bug was causing ProxySQL to crash when running MCP tool tests.
MySQL returns column names in uppercase for information_schema tables, but the code was expecting lowercase column names. This caused crashes when accessing JSON keys that didn't exist. Changes: 1. Convert all column names to lowercase in execute_query() 2. Store lowercase column names in a vector for efficient access 3. Use lowercase column names as keys in JSON row objects This ensures consistent column name casing across all queries, preventing JSON access errors for information_schema columns. Also includes the previous use-after-free fix.
The nlohmann::json value() method can throw "basic_string: construction from null is not valid" when trying to convert a JSON null value to std::string. Added helper functions get_json_string() and get_json_int() that: - Check if key exists before accessing - Check if value is not null - Check if value has correct type - Return default value if any check fails This prevents crashes when: 1. Arguments are missing (returns default) 2. Arguments are explicitly null (returns default) 3. Arguments have wrong type (returns default)
- Fixed NULL value handling in execute_query: use empty string instead of nullptr to avoid "basic_string: construction from null" errors - Fixed validate_readonly_query: corrected substring length check from substr(0,6)!="SELECT " to substr(0,6)!="SELECT" - Fixed test script: added proper variable_name parameter for get_config/set_config tools Query endpoint tools now pass all tests.
Comprehensive architecture for AI-powered database discovery agent: - Mixture-of-experts approach (Structural, Statistical, Semantic, Query) - Orchestrator pattern for coordinated exploration - Four-phase discovery process (Blind → Patterns → Hypotheses → Synthesis) - Domain-agnostic design for any database complexity or type - Catalog as shared memory for collaborative expert findings - Multiple real-world examples (law firm, scientific research, ecommerce)
Comprehensive guide for discovering and using MCP Query endpoint tools: - Tool discovery via tools/list method - Complete list of all Query endpoint tools with parameters - cURL and Python examples for tool discovery and execution - Complete database exploration example - Test script usage guide
…3-fix Fix malformed Bind packets when client provides a single parameter format
Signed-off-by: René Cannaò <rene@proxysql.com>
Fix: pgsql_users.use_ssl not being enforced on frontend connections
rework spec files, use rpm macros to handle systemd
[skip-ci] Remove deprecated read_only_action implementations from MySQL and PgSQL HostGroups managers
Improve build performance by preserving test dependencies during 'make clean'
Implement async GenAI module with socketpair-based non-blocking architecture
Comprehensive implementation documentation for two new search capabilities: FTS (Full Text Search): - 6 tools for lexical search using SQLite FTS5 - Separate mcp_fts.db database - Keyword matching and phrase search - Tools: fts_index_table, fts_search, fts_list_indexes, fts_delete_index, fts_reindex, fts_rebuild_all Vector Embeddings: - 6 tools for semantic search using sqlite-vec - Separate mcp_embeddings.db database - Vector similarity search with sqlite-rembed integration - Placeholder for future GenAI module - Tools: embed_index_table, embed_search, embed_list_indexes, embed_delete_index, embed_reindex, embed_rebuild_all Both systems: - Follow MySQL_Catalog patterns for SQLite management - Integrate with existing MCP Query endpoint - Work alongside Catalog for AI agent memory - 13-step implementation plans with detailed code examples
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. Comment |
- Add #include <string> for C++ std::string support - Add #include <cmath> for sqrt() function - Change format %lld to %ld for chrono duration types (long int, not long long) Resolves compilation errors for vector_db_performance-t test.
- Add #include <string> for C++ std::string support Resolves compilation errors for ai_llm_retry_scenarios-t test.
…leanup
Security fixes:
- Add escape_identifier() helper for proper SQLite identifier escaping
- Replace sanitize_name() with allowlist validation (ASCII letters, digits, underscore only)
- Fix MATCH clause FTS5 operator injection by wrapping query in double quotes
- Apply escape_identifier() to all DDL statements (CREATE, DROP, triggers)
Memory safety fixes:
- Replace VLA with std::vector in MySQL_FTS::init(), add delete on error path
- Fix memory leak: free error string before return in list_indexes()
- Fix reindex_json["error"] potential exception using .value() with default
Thread safety fixes:
- reinit_fts(): Add mutex lock around pointer swap
- reset_fts_path(): Move blocking init() outside lock, only swap pointer under lock
Code cleanup:
- Remove 7 debug fprintf statements from Query_Tool_Handler.cpp
- Remove unused #include <memory> from MySQL_FTS.h
Test script security fixes:
- Use MYSQL_PWD environment variable instead of -p"..." for password
- Add escape_sql() function and apply to INSERT statement
- Fix CURL_OPTS quoting: ${CURL_OPTS:+"${CURL_OPTS}"}
- Remove unused FTS_INDEX_NAME and SEARCH_QUERIES variables
Documentation fixes:
- Fix bare URL to markdown link format
- Add code block language identifiers (text, bash)
This commit addresses all recommendations from CodeRabbit, Gemini Code Assist, and Copilot for PR #25 (FTS security and code quality improvements). Critical Security Fixes: - MCP_Thread.cpp: Rollback fts_path on reset failure to keep config consistent - MySQL_FTS.cpp: Add escape_mysql_identifier() for MySQL query identifier escaping - MySQL_FTS.cpp: Add unique hash-based fallback to sanitize_name() for empty strings - MySQL_FTS.cpp: Add where_clause validation to block dangerous SQL patterns Memory Safety Fixes: - MySQL_FTS.cpp: Fix indexes_result memory leak on early return in search() - MySQL_FTS.h: Delete copy/move operations to prevent accidental resource duplication Thread Safety Documentation: - MySQL_Tool_Handler.cpp: Add comment explaining FTS lock design rationale Test Script Improvements: - test_mcp_fts.sh: Add curl timeouts (5s connect, 30s max) - test_mcp_fts.sh: Remove unused delete_response variable - test_mcp_fts_detailed.sh: Make cleanup tolerant of non-existent indexes Build Fixes: - Makefile: Fix EXCLUDE_TRACKING_VARAIABLES typo to EXCLUDE_TRACKING_VARIABLES - vector_db_performance-t.cpp: Fix printf format specifiers to %lld with cast Schema Fixes: - Query_Tool_Handler.cpp: Change fts_index_table columns schema from string to array Code Cleanup: - MySQL_Tool_Handler.cpp: Remove all remaining debug fprintf statements (34 lines) - Documentation: Change "Full Text" to "Full-Text" (hyphenated) Total: ~50 fixes across 10 files
Fix PR #21 Review: Security, Memory Safety, Thread Safety + Additional Fixes
Conflict resolution summary: **Features preserved from v3.1-vec:** - mcp_use_ssl variable (HTTP/HTTPS mode support) - RAG tool handler support - FTS (Full Text Search) functionality in MySQL_Tool_Handler **Changes accepted from v3.1-MCP2:** - Schema isolation for catalog (removed mcp_catalog_path/fts_path variables) - Schema parameter added to catalog functions - Comprehensive query rewriting improvements in Query_Tool_Handler - Format string fixes in vector_db_performance-t.cpp **Key decisions:** - catalog_path/fts_path: Removed per v3.1-MCP2 commit 35b0b22 (intentional) - MySQL_Catalog.cpp: Accepted v3.1-MCP2 version (schema isolation improvements) - Query_Tool_Handler.cpp: Accepted v3.1-MCP2 version (significant improvements) - Makefile: Kept v3.1-vec's EXCLUDE_TRACKING_VARIABLES (correct spelling) **Build notes:** - MySQL_Tool_Handler now requires schema parameter for catalog operations - Catalog path is hardcoded to datadir/mcp_catalog.db - FTS functionality preserved but may need updates for schema isolation
- Fixed invalid memory accesses for tables:
+ mcp_query_rules
+ stats_mcp_query_rules
+ stats_mcp_query_digests
- Fixed inactive 'mcp_query_rules' being loaded to runtime.
- Fixed hash computation in 'compute_mcp_digest'.
- Fixed invalid escaping during 'stats_mcp_query_digests' gen.
- Fixed digest generation for MCP arguments:
+ SQL queries are now preserved using
'mysql_query_digest_and_first_comment'.
+ TODO: Options for the tokenizer are right now hardcoded.
- Added initial testing and testing plan for MCP query_(rules/digests).
+ TODO: Test finished on phase8. Timeouts destroy the MCP
connection, leaving it unusable for subsequent queries this should
be fixed for continuing testing.
- TODO: There are several limitations to fix in
'validate_readonly_query'. This reflect in some query hacks in the
testing.
+ 'SELECT' starting with comments (--) gets flagged as non-read.
+ 'SELECT' must have a 'SELECT .* FROM' structure. While common,
simple testing queries many times lack this form.
…ility) v3.1-MCP2 introduced lib/proxy_sqlite3_symbols.cpp which provides canonical definitions for proxy_sqlite3_* function pointers. Tests that link against libproxysql.a must NOT define MAIN_PROXY_SQLITE3, otherwise sqlite3db.h will provide conflicting definitions. Removed #define MAIN_PROXY_SQLITE3 from: - test/tap/tests/sqlite3-t.cpp - test/tap/tests/unit_test.h - test/tap/tests/setparser_test.cpp - test/tap/tests/setparser_test_common.h
- lib/MySQL_Catalog.cpp: Convert search/list/remove to use SQLite prepared statements instead of string concatenation for user parameters - lib/RAG_Tool_Handler.cpp: Add escape_fts_query() function to properly escape single quotes in FTS5 MATCH clauses; update all FTS and vector MATCH queries to use escaped values - lib/Static_Harvester.cpp: Add is_valid_schema_name() validation function to ensure schema names only contain safe characters (alphanumeric, underscore, dollar sign) before using in INFORMATION_SCHEMA queries - lib/Query_Tool_Handler.cpp: Add clarifying comments to validate_readonly_query explaining the blacklist (quick exit) + whitelist (allowed query types) approach - Remove backup file lib/Anomaly_Detector.cpp.bak Addresses gemini-code-assist review comments from PR #26.
Add proxy_sqlite3_enable_load_extension and proxy_sqlite3_global_stats_row_step to the #else (MAIN_PROXY_SQLITE3) branch in sqlite3db.h to maintain symmetry with the extern declarations and prevent ODR violations. Addresses coderabbitai review comment.
Add assert(proxy_sqlite3_bind_blob) to the assertion block in SQLite3DB::LoadPlugin to ensure the symbol is provided by plugins. Without this, if a plugin fails to provide the symbol, the code will crash at runtime with no safety check. proxy_sqlite3_bind_blob is actively used in Anomaly_Detector.cpp:765 to bind embeddings. Addresses coderabbitai review comment.
…pserts Add catalog_au AFTER UPDATE trigger in MySQL_Catalog that mirrors the delete+insert pattern used in catalog_ad/catalog_ai. This keeps the FTS index current when upserts occur (INSERT OR REPLACE ... ON CONFLICT ... DO UPDATE), since the UPDATE doesn't trigger INSERT/DELETE triggers. The trigger first deletes the old entry from catalog_fts then inserts the new entry, ensuring the full-text search index stays synchronized with the catalog table. Addresses coderabbitai review comment.
Change free(resultset) to delete resultset in Query_Tool_Handler (extract_schema_name function). SQLite3_result is a C++ class allocated with new, so it must be deallocated with delete, not free(). Using free() causes mixed allocator UB. Addresses coderabbitai review comment.
The execute_parameterized_query function in RAG_Tool_Handler was creating a prepared statement and binding parameters, but then executing the raw query string instead of the prepared statement. This completely bypassed the parameter binding, making the function useless for preventing SQL injection. Changed to use vector_db->execute_prepared(stmt, ...) to execute the bound statement instead of the raw query, so the bound parameters are actually used. Addresses coderabbitai review comment.
Fix off-by-one errors in row->fields indices. The SELECT clause returns: object_id(0), schema_name(1), object_name(2), object_type(3), engine(4), table_rows_est(5), data_length(6), index_length(7), has_primary_key(8), has_foreign_keys(9), has_time_column(10) But the code was reading from fields[3..9] instead of [4..10]. Added comment documenting the correct column order. Addresses coderabbitai review comment.
Add escape_sql_string() helper function that doubles single quotes to prevent SQL injection when strings are used in SQL string concatenation. Update harvest_view_definitions to use this function for view_def, schema_name, and view_name. This prevents SQL injection in the UPDATE statement that stores view definitions in the catalog. Addresses coderabbitai review comment.
Fix two issues in Query_Tool_Handler's execute_query functions: 1. mysql_query() failure path now returns immediately after return_connection() instead of continuing to process on bad state. 2. Capture affected_rows BEFORE return_connection() to avoid race condition. Previously, mysql_affected_rows() was called after return_connection(), potentially accessing a stale connection. Apply fixes to both execute_query() and execute_query_with_schema(). Addresses coderabbitai review comments.
…rules_to_runtime Change free(resultset) to delete resultset in ProxySQL_Admin::load_mcp_query_rules_to_runtime. SQLite3_result is a C++ class allocated with new, so it must be deallocated with delete, not free(). Using free() causes undefined behavior and memory leaks. Addresses coderabbitai review comment.
Fix multi-statement execution in Admin_Handler.cpp for MCP query rules. The previous code built a single SQL string with "DELETE ...; INSERT ..." but SQLite only executed the first statement. Changed to execute statements as an explicit transaction: 1. BEGIN 2. DELETE FROM target_table 3. INSERT OR REPLACE INTO target_table SELECT * FROM source_table 4. COMMIT (or ROLLBACK on error) Applied to both: - LOAD MCP QUERY RULES FROM DISK/TO MEMORY - SAVE MCP QUERY RULES TO DISK Addresses coderabbitai review comment.
…njection Fix two SQL injection vulnerabilities identified by coderabbitai in ProxySQL_Admin_Stats.cpp by converting sprintf/snprintf interpolation to SQLite prepared statements. 1. stats___mcp_query_digest (lines 2581-2636): - Changed from sprintf with format string to prepared statement - digest_text field now properly bound as parameter (was unsafe) - All 10 columns now use positional parameters (?1-?10) 2. stats___mcp_query_tools_counters (lines 1587-1635): - Changed from snprintf with unescaped fields to prepared statement - Fixed incorrect table name logic (was appending _reset incorrectly) - All 8 columns now use positional parameters (?1-?8) These changes prevent SQL injection when resultset fields contain quotes, backslashes, or other SQL metacharacters.
…ccess Fix compilation errors in the SQL injection fixes: 1. ProxySQL_Admin_Stats.cpp: Use public statsdb->prepare_v2() API - Changed from direct proxy_sqlite3_prepare_v2() calls with statsdb->db - statsdb->db is private, must use public prepare_v2(query, &stmt) method 2. Admin_Handler.cpp: Add SPA cast for template function access - Added ProxySQL_Admin *SPA=(ProxySQL_Admin *)pa; declaration - Changed all admindb->execute to SPA->admindb->execute - Removed unused 'error' and 'success' variables The build now completes successfully.
… SQL injection Fix issues identified by coderabbitai review: 1. Admin_Handler.cpp: Fix typo in strncasecmp for LOAD MCP QUERY RULES - Line 2365 had "LOAD MCP RULES FROM DISK" instead of "LOAD MCP QUERY RULES FROM DISK" 2. Admin_Handler.cpp: Fix use-after-free and missing client response - Removed l_free(*ql,*q) which freed *q before caller used it - Added send_error_msg_to_client calls on all error paths - Added send_ok_msg_to_client call on success path - Changed return value from true to false (to match handler pattern) - Applied to both LOAD MCP QUERY RULES and SAVE MCP QUERY RULES handlers 3. ProxySQL_Admin_Stats.cpp: Fix sprintf SQL injection in stats___mcp_query_rules - Replaced sprintf with prepared statement using positional parameters - Changed from char* query with malloc/free to sqlite3_stmt with prepare_v2 - Both columns now bound as parameters (?1, ?2)
Merge v3.1-MCP2 into v3.1-vec (conflicts resolved)
- Fix Discovery_Schema.cpp fingerprint output JSON consistency - Quote placeholders (e.g., "?" instead of ?) for valid JSON - Fix test_mcp_query_rules_block.sh exec_admin_silent function - Add -B and -N flags for batch mode with no headers - Remove unused YELLOW variable from test_phase1_crud.sh - Fix test_phase2_load_save.sh runtime DELETE to be non-fatal - Remove unused INITIAL_COUNT from test_phase3_runtime.sh - Fix test_phase3_runtime.sh to verify exact ID ordering - Fix test_phase4_stats.sh read-only test assertion - Make INSERT/DELETE non-fatal and fix assertion logic - Remove unused DIGEST_TEXT_CHECK from test_phase5_digest.sh - Fix test_phase8_eval_timeout.sh non-timeout response handling - Return failure for non-timeout responses - Remove stray exit 1 from test_phase8_eval_timeout.sh
Resolved merge conflict in lib/ProxySQL_Admin_Stats.cpp: - Kept v3.1-MCP2_QR version with proper NULL handling - Uses stmt_unique_ptr (RAII smart pointer) - Checks for NULL before calling atoll() on numeric fields - Discarded v3.1-vec version which lacked NULL checks The merged code correctly handles NULL values for: - run_id - count_star - first_seen - last_seen - sum_time - min_time - max_time
Address coderabbitai review - implement full JSON escaping for SQL digest:
- Handle backslash (\) and double quote (")
- Handle control characters: newline (\n), carriage return (\r), tab (\t)
- Handle other control characters (U+0000 through U+001F) with \uXXXX escapes
This ensures digest_text in stats_mcp_query_digest is always valid JSON,
preventing parsing errors for consumers of this data.
- Fix comment mismatch: Changed _2 suffix to match actual function name (mysql_query_digest_and_first_comment, not _2) - Make get_def_mysql_opts() static to avoid symbol pollution - Fix NULL first_comment parameter to prevent potential segfault - Pass valid char* pointer instead of NULL - Free first_comment if allocated by the function
fix: Multiple issues with MCP query_(rules/digests)
This addresses an issue from PR #22 where LoadPlugin() was completely disabled. The function performs necessary initialization even when no plugin is loaded (initializes built-in sqlite3 function pointers). Changes: - Added `const bool allow_load_plugin = false` flag in LoadPlugin() - Modified `if (plugin_name)` to `if (plugin_name && allow_load_plugin == true)` - Re-enabled the LoadPlugin() call in LoadPlugins() The plugin loading code remains disabled (allow_load_plugin=false) while the function pointer initialization from built-in SQLite3 now works correctly. TODO: Revisit plugin loading safety mechanism to allow actual plugin loading.
fix: Re-enable SQLite3DB::LoadPlugin() with allow_load_plugin flag
Summary
This pull request tracks all changes in the development branch
v3.1-vecagainst the original fork branchv3.0. This PR is not intended to be merged; it serves as a living changelog and allows identification of potential conflicts between the development branch and the upstream fork.Overview of Changes
The
v3.1-vecbranch extends ProxySQL with modern vector search and AI embedding generation capabilities, turning ProxySQL into a vector‑enabled SQL proxy that can perform similarity searches and generate text embeddings directly within its SQLite databases.Key Additions
sqlite‑vec static extension – Adds vector similarity search to SQLite via the
sqlite‑vecproject. Enables:vec0) for efficient indexingsqlite‑rembed integration – Adds text‑embedding generation via remote AI APIs using the
sqlite‑rembedRust extension. Supports:rembed()SQL function that calls embedding APIstemp.rembed_clientsvirtual table for managing API clientsSQLite FTS5 support – Enables full‑text search capabilities within SQLite, complementing the vector‑search features.
Build‑system modifications – Updated
Makefiles (deps/Makefile,lib/Makefile,src/Makefile) to:sqlite‑rembedsqlite‑rembedfrom a packaged.tar.gzsqlite‑vecandsqlite‑rembedintolibproxysql.aCore‑code integration – Modified
lib/Admin_Bootstrap.cppto:sqlite3_vec_initandsqlite3_rembed_initentry pointsComprehensive documentation – Added detailed guides:
SQLite3‑Server.md– Overview of SQLite3 server capabilitiessqlite‑rembed‑integration.md– Integration architecture and usageposts‑embeddings‑setup.md– Example using StackExchange Posts dataSQLITE‑REMBED‑TEST‑README.md– Testing proceduresdoc/vector‑search‑test/)Utility scripts – Added Python and shell scripts for practical workflows:
scripts/copy_stackexchange_Posts_mysql_to_sqlite3.py– Migrate Posts table from MySQL to SQLitescripts/process_posts_embeddings.py– Generate embeddings with exponential backoffsqlite‑rembed‑demo.sh,sqlite‑rembed‑examples.sh, etc.)Scope of Changes
Purpose of This PR
Because
v3.1‑vecis the development branch for all vector‑search enhancements, this PR provides a permanent, up‑to‑date diff against the originalv3.0fork. It allows:v3.0evolves.Commit History
The branch contains 20 commits (excluding merge commits) that introduce the features described above. For a detailed list, see the commit graph below.
File Changes
The modified/added files are listed below. Most additions are under
deps/sqlite3/,doc/, andscripts/.