Skip to content

Conversation

@renecannao
Copy link

Summary

This pull request tracks all changes in the development branch v3.1-vec against the original fork branch v3.0. This PR is not intended to be merged; it serves as a living changelog and allows identification of potential conflicts between the development branch and the upstream fork.

Overview of Changes

The v3.1-vec branch extends ProxySQL with modern vector search and AI embedding generation capabilities, turning ProxySQL into a vector‑enabled SQL proxy that can perform similarity searches and generate text embeddings directly within its SQLite databases.

Key Additions

  1. sqlite‑vec static extension – Adds vector similarity search to SQLite via the sqlite‑vec project. Enables:

    • Storage of high‑dimensional vectors in SQLite tables
    • Cosine, Euclidean, and inner‑product distance calculations
    • Approximate nearest‑neighbor (ANN) search
    • Virtual table interface (vec0) for efficient indexing
  2. sqlite‑rembed integration – Adds text‑embedding generation via remote AI APIs using the sqlite‑rembed Rust extension. Supports:

    • Multiple providers: OpenAI, Nomic, Cohere, Ollama, Llamafile
    • rembed() SQL function that calls embedding APIs
    • temp.rembed_clients virtual table for managing API clients
    • Static linking of the Rust library into ProxySQL
  3. SQLite FTS5 support – Enables full‑text search capabilities within SQLite, complementing the vector‑search features.

  4. Build‑system modifications – Updated Makefiles (deps/Makefile, lib/Makefile, src/Makefile) to:

    • Detect and compile the Rust toolchain for sqlite‑rembed
    • Extract sqlite‑rembed from a packaged .tar.gz
    • Statically link both sqlite‑vec and sqlite‑rembed into libproxysql.a
  5. Core‑code integration – Modified lib/Admin_Bootstrap.cpp to:

    • Declare the sqlite3_vec_init and sqlite3_rembed_init entry points
    • Enable extension loading on all ProxySQL databases (admin, stats, config, monitor, stats_disk)
    • Register both extensions for auto‑loading on every new database connection
  6. Comprehensive documentation – Added detailed guides:

    • SQLite3‑Server.md – Overview of SQLite3 server capabilities
    • sqlite‑rembed‑integration.md – Integration architecture and usage
    • posts‑embeddings‑setup.md – Example using StackExchange Posts data
    • SQLITE‑REMBED‑TEST‑README.md – Testing procedures
    • Vector‑search test suite (doc/vector‑search‑test/)
  7. Utility scripts – Added Python and shell scripts for practical workflows:

    • scripts/copy_stackexchange_Posts_mysql_to_sqlite3.py – Migrate Posts table from MySQL to SQLite
    • scripts/process_posts_embeddings.py – Generate embeddings with exponential backoff
    • Several demonstration and test scripts (sqlite‑rembed‑demo.sh, sqlite‑rembed‑examples.sh, etc.)

Scope of Changes

  • 26 files changed, 13803 insertions(+), 8 deletions(‑)
  • The changes are additive – no existing functionality is removed or broken.
  • All new features are opt‑in; vector search and embedding generation are available only when explicitly used in SQL queries.

Purpose of This PR

Because v3.1‑vec is the development branch for all vector‑search enhancements, this PR provides a permanent, up‑to‑date diff against the original v3.0 fork. It allows:

  • Tracking progress – See every commit that has been added to the development branch.
  • Conflict detection – Identify any merge conflicts that may arise as upstream v3.0 evolves.
  • Living changelog – A single place to review what has been added since the fork.

Commit History

The branch contains 20 commits (excluding merge commits) that introduce the features described above. For a detailed list, see the commit graph below.

File Changes

The modified/added files are listed below. Most additions are under deps/sqlite3/, doc/, and scripts/.

gemini-code-assist[bot]

This comment was marked as resolved.

@renecannao
Copy link
Author

@gemini-code-assist stop

@ProxySQL ProxySQL deleted a comment from gemini-code-assist bot Dec 24, 2025
@gemini-code-assist
Copy link

It appears you've used the stop command, which is not a supported command for Gemini Code Assist. If you were intending to stop a code review or summary, this command is not supported. If you were intending to invoke a command, please see the available commands in the help message by using the /gemini help command.

renecannao and others added 25 commits January 11, 2026 23:00
- Split exec_admin into exec_admin (shows errors) and exec_admin_silent
- Execute each SET command individually to catch specific failures
- Add proper error checking and early exit on configuration failures
- Fix MCP endpoint test URL from /config to /mcp/config
- Update displayed endpoint URLs to include /mcp/ prefix and all 5 endpoints
- Change /config to /mcp/config
- Change /query to /mcp/query
- Initialize MySQL_Tool_Handler in ProxySQL_MCP_Server constructor
  with MySQL configuration from MCP variables
- Use GloVars.get_SSL_pem_mem() to get SSL certificates correctly
- Add MySQL_Tool_Handler cleanup in destructor
- Change configure_mcp.sh default MySQL port from 3307 to 3306
- Change configure_mcp.sh default password from test123 to empty
- Update help text and examples to match new defaults
The script correctly uses ${VAR:-default} syntax, so it already
respects environment variables. The issue was with the MCP server
not reinitializing when MySQL configuration changes, which is a
separate issue.
When LOAD MCP VARIABLES TO RUNTIME is called and the MCP server is
already running, the MySQL Tool Handler is now recreated with the
current configuration values. This allows changing MySQL connection
parameters without restarting ProxySQL.

The reinitialization:
1. Deletes the old MySQL Tool Handler
2. Creates a new one with current mcp-mysql_* values
3. Initializes the new handler
4. Logs success or failure
Change MYSQL_PASSWORD from using :- to = for default value.
This allows setting an empty password via environment variable:
- MYSQL_PASSWORD="" will now use empty password
- Previously, empty string was treated as unset, forcing the default
Print environment variables at the start of each script when they are set:
- setup_test_db.sh: Shows MYSQL_HOST, MYSQL_PORT, MYSQL_USER, MYSQL_PASSWORD, TEST_DB_NAME
- configure_mcp.sh: Shows MYSQL_HOST, MYSQL_PORT, MYSQL_USER, MYSQL_PASSWORD, TEST_DB_NAME, MCP_PORT
- test_mcp_tools.sh: Shows MCP_HOST, MCP_PORT

This helps users verify that the correct environment variables are being used.
This commit implements Option 1 (Multiple Tool Handlers) for the MCP module,
where each of the 5 endpoints has its own dedicated tool handler with specific tools.

## Architecture Changes

- Created MCP_Tool_Handler base class interface for all tool handlers
- Each endpoint now has its own dedicated tool handler:
  - /mcp/config → Config_Tool_Handler (configuration management)
  - /mcp/query → Query_Tool_Handler (database exploration)
  - /mcp/admin → Admin_Tool_Handler (administrative operations)
  - /mcp/cache → Cache_Tool_Handler (cache management)
  - /mcp/observe → Observe_Tool_Handler (monitoring & metrics)

## New Files

Base Interface:
- include/MCP_Tool_Handler.h - Base class for all tool handlers

Tool Handlers:
- include/Config_Tool_Handler.h, lib/Config_Tool_Handler.cpp
- include/Query_Tool_Handler.h, lib/Query_Tool_Handler.cpp
- include/Admin_Tool_Handler.h, lib/Admin_Tool_Handler.cpp
- include/Cache_Tool_Handler.h, lib/Cache_Tool_Handler.cpp
- include/Observe_Tool_Handler.h, lib/Observe_Tool_Handler.cpp

Documentation:
- doc/MCP/Architecture.md - Comprehensive architecture documentation

## Modified Files

- include/MCP_Thread.h, lib/MCP_Thread.cpp - Added 5 tool handler pointers
- include/MCP_Endpoint.h, lib/MCP_Endpoint.cpp - Use tool_handler base class
- lib/ProxySQL_MCP_Server.cpp - Create and pass handlers to endpoints
- lib/Makefile - Added new source files

## Implementation Status

- Config_Tool_Handler: Functional (get_config, set_config, list_variables, get_status)
- Query_Tool_Handler: Functional (wraps MySQL_Tool_Handler, all 18 tools)
- Admin_Tool_Handler: Stub implementations (TODO: implement)
- Cache_Tool_Handler: Stub implementations (TODO: implement)
- Observe_Tool_Handler: Stub implementations (TODO: implement)

See GitHub Issue #8 for detailed TODO list.

Co-authored-by: Claude <claude@anthropic.com>
This commit implements Phase 2 of the MCP multi-endpoint architecture:
per-endpoint Bearer token authentication.

## Changes

### lib/MCP_Endpoint.cpp
- Implemented `authenticate_request()` method with:
  - Per-endpoint token validation (mcp-{endpoint}_endpoint_auth)
  - Bearer token support via Authorization header
  - Query parameter fallback (?token=xxx) for simple testing
  - No authentication when token is not configured (backward compatible)
  - Proper 401 Unauthorized response on auth failure
  - Token whitespace trimming
  - Debug logging for troubleshooting

### doc/MCP/Architecture.md
- Updated Per-Endpoint Authentication section with complete implementation
- Marked Phase 3 authentication task as completed (✅)
- Added authentication implementation code example

## Authentication Flow

1. Client sends request with Bearer token:
   - Header: `Authorization: Bearer <token>`
   - Or query param: `?token=<token>`

2. Server validates against endpoint-specific variable:
   - `/mcp/config` → `mcp-config_endpoint_auth`
   - `/mcp/observe` → `mcp-observe_endpoint_auth`
   - `/mcp/query` → `mcp-query_endpoint_auth`
   - `/mcp/admin` → `mcp-admin_endpoint_auth`
   - `/mcp/cache` → `mcp-cache_endpoint_auth`

3. Returns 401 Unauthorized if:
   - Auth is required but not provided
   - Token doesn't match expected value

4. Allows request if:
   - No auth token configured (backward compatible)
   - Token matches expected value

## Testing

```bash
# Set auth token for /mcp/query endpoint
mysql -h 127.0.0.1 -P 6032 -u admin -padmin \
  -e "SET mcp-query_endpoint_auth='my-secret-token'; LOAD MCP VARIABLES TO RUNTIME;"

# Test with Bearer token
curl -k -X POST https://127.0.0.1:6071/mcp/query \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer my-secret-token" \
  -d '{"jsonrpc":"2.0","method":"tools/list","id":1}'

# Test with query parameter
curl -k -X POST "https://127.0.0.1:6071/mcp/query?token=my-secret-token" \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"tools/list","id":1}'
```

## Status

✅ Authentication fully implemented and functional
⚠️ Testing with running ProxySQL instance still needed

Co-authored-by: Claude <claude@anthropic.com>
- Add discover_tools() function that calls tools/list on each endpoint
- Store discovered tools in temp file to avoid bash associative array issues
- Define all expected tools with test configurations
- Only test tools that are discovered via tools/list
- Add support for all 5 endpoints: config, query, admin, cache, observe
- Add --list-only flag to show discovered tools without testing
- Add --endpoint flag to test specific endpoint
- Improve help output with endpoint descriptions
Verbose output was being printed to stdout, contaminating the
captured response value and causing tool discovery to fail.
Redirect debug output to stderr instead.
The code was creating a dangling pointer by calling c_str() on a
temporary std::string object, causing undefined behavior and crashes
when processing query results.

Before:
  const char* col_name = columns[i].get<std::string>().c_str();
  // ^ temporary string destroyed here, col_name is dangling

After:
  std::string col_name = columns[i].get<std::string>();
  // ^ col_name is valid until end of scope

This bug was causing ProxySQL to crash when running MCP tool tests.
MySQL returns column names in uppercase for information_schema tables,
but the code was expecting lowercase column names. This caused crashes
when accessing JSON keys that didn't exist.

Changes:
1. Convert all column names to lowercase in execute_query()
2. Store lowercase column names in a vector for efficient access
3. Use lowercase column names as keys in JSON row objects

This ensures consistent column name casing across all queries,
preventing JSON access errors for information_schema columns.

Also includes the previous use-after-free fix.
The nlohmann::json value() method can throw "basic_string: construction
from null is not valid" when trying to convert a JSON null value to std::string.

Added helper functions get_json_string() and get_json_int() that:
- Check if key exists before accessing
- Check if value is not null
- Check if value has correct type
- Return default value if any check fails

This prevents crashes when:
1. Arguments are missing (returns default)
2. Arguments are explicitly null (returns default)
3. Arguments have wrong type (returns default)
- Fixed NULL value handling in execute_query: use empty string instead
  of nullptr to avoid "basic_string: construction from null" errors
- Fixed validate_readonly_query: corrected substring length check
  from substr(0,6)!="SELECT " to substr(0,6)!="SELECT"
- Fixed test script: added proper variable_name parameter for
  get_config/set_config tools

Query endpoint tools now pass all tests.
Comprehensive architecture for AI-powered database discovery agent:
- Mixture-of-experts approach (Structural, Statistical, Semantic, Query)
- Orchestrator pattern for coordinated exploration
- Four-phase discovery process (Blind → Patterns → Hypotheses → Synthesis)
- Domain-agnostic design for any database complexity or type
- Catalog as shared memory for collaborative expert findings
- Multiple real-world examples (law firm, scientific research, ecommerce)
Comprehensive guide for discovering and using MCP Query endpoint tools:
- Tool discovery via tools/list method
- Complete list of all Query endpoint tools with parameters
- cURL and Python examples for tool discovery and execution
- Complete database exploration example
- Test script usage guide
…3-fix

Fix malformed Bind packets when client provides a single parameter format
Signed-off-by: René Cannaò <rene@proxysql.com>
Fix: pgsql_users.use_ssl not being enforced on frontend connections
rework spec files, use rpm macros to handle systemd
[skip-ci] Remove deprecated read_only_action implementations from MySQL and PgSQL HostGroups managers
Improve build performance by preserving test dependencies during 'make clean'
Implement async GenAI module with socketpair-based non-blocking architecture
Comprehensive implementation documentation for two new search capabilities:

FTS (Full Text Search):
- 6 tools for lexical search using SQLite FTS5
- Separate mcp_fts.db database
- Keyword matching and phrase search
- Tools: fts_index_table, fts_search, fts_list_indexes, fts_delete_index, fts_reindex, fts_rebuild_all

Vector Embeddings:
- 6 tools for semantic search using sqlite-vec
- Separate mcp_embeddings.db database
- Vector similarity search with sqlite-rembed integration
- Placeholder for future GenAI module
- Tools: embed_index_table, embed_search, embed_list_indexes, embed_delete_index, embed_reindex, embed_rebuild_all

Both systems:
- Follow MySQL_Catalog patterns for SQLite management
- Integrate with existing MCP Query endpoint
- Work alongside Catalog for AI agent memory
- 13-step implementation plans with detailed code examples
@coderabbitai
Copy link

coderabbitai bot commented Jan 12, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.


Comment @coderabbitai help to get the list of available commands and usage tips.

renecannao and others added 30 commits January 21, 2026 14:13
- Add #include <string> for C++ std::string support
- Add #include <cmath> for sqrt() function
- Change format %lld to %ld for chrono duration types (long int, not long long)

Resolves compilation errors for vector_db_performance-t test.
- Add #include <string> for C++ std::string support

Resolves compilation errors for ai_llm_retry_scenarios-t test.
…leanup

Security fixes:
- Add escape_identifier() helper for proper SQLite identifier escaping
- Replace sanitize_name() with allowlist validation (ASCII letters, digits, underscore only)
- Fix MATCH clause FTS5 operator injection by wrapping query in double quotes
- Apply escape_identifier() to all DDL statements (CREATE, DROP, triggers)

Memory safety fixes:
- Replace VLA with std::vector in MySQL_FTS::init(), add delete on error path
- Fix memory leak: free error string before return in list_indexes()
- Fix reindex_json["error"] potential exception using .value() with default

Thread safety fixes:
- reinit_fts(): Add mutex lock around pointer swap
- reset_fts_path(): Move blocking init() outside lock, only swap pointer under lock

Code cleanup:
- Remove 7 debug fprintf statements from Query_Tool_Handler.cpp
- Remove unused #include <memory> from MySQL_FTS.h

Test script security fixes:
- Use MYSQL_PWD environment variable instead of -p"..." for password
- Add escape_sql() function and apply to INSERT statement
- Fix CURL_OPTS quoting: ${CURL_OPTS:+"${CURL_OPTS}"}
- Remove unused FTS_INDEX_NAME and SEARCH_QUERIES variables

Documentation fixes:
- Fix bare URL to markdown link format
- Add code block language identifiers (text, bash)
This commit addresses all recommendations from CodeRabbit, Gemini Code Assist,
and Copilot for PR #25 (FTS security and code quality improvements).

Critical Security Fixes:
- MCP_Thread.cpp: Rollback fts_path on reset failure to keep config consistent
- MySQL_FTS.cpp: Add escape_mysql_identifier() for MySQL query identifier escaping
- MySQL_FTS.cpp: Add unique hash-based fallback to sanitize_name() for empty strings
- MySQL_FTS.cpp: Add where_clause validation to block dangerous SQL patterns

Memory Safety Fixes:
- MySQL_FTS.cpp: Fix indexes_result memory leak on early return in search()
- MySQL_FTS.h: Delete copy/move operations to prevent accidental resource duplication

Thread Safety Documentation:
- MySQL_Tool_Handler.cpp: Add comment explaining FTS lock design rationale

Test Script Improvements:
- test_mcp_fts.sh: Add curl timeouts (5s connect, 30s max)
- test_mcp_fts.sh: Remove unused delete_response variable
- test_mcp_fts_detailed.sh: Make cleanup tolerant of non-existent indexes

Build Fixes:
- Makefile: Fix EXCLUDE_TRACKING_VARAIABLES typo to EXCLUDE_TRACKING_VARIABLES
- vector_db_performance-t.cpp: Fix printf format specifiers to %lld with cast

Schema Fixes:
- Query_Tool_Handler.cpp: Change fts_index_table columns schema from string to array

Code Cleanup:
- MySQL_Tool_Handler.cpp: Remove all remaining debug fprintf statements (34 lines)
- Documentation: Change "Full Text" to "Full-Text" (hyphenated)

Total: ~50 fixes across 10 files
Fix PR #21 Review: Security, Memory Safety, Thread Safety + Additional Fixes
Conflict resolution summary:

**Features preserved from v3.1-vec:**
- mcp_use_ssl variable (HTTP/HTTPS mode support)
- RAG tool handler support
- FTS (Full Text Search) functionality in MySQL_Tool_Handler

**Changes accepted from v3.1-MCP2:**
- Schema isolation for catalog (removed mcp_catalog_path/fts_path variables)
- Schema parameter added to catalog functions
- Comprehensive query rewriting improvements in Query_Tool_Handler
- Format string fixes in vector_db_performance-t.cpp

**Key decisions:**
- catalog_path/fts_path: Removed per v3.1-MCP2 commit 35b0b22 (intentional)
- MySQL_Catalog.cpp: Accepted v3.1-MCP2 version (schema isolation improvements)
- Query_Tool_Handler.cpp: Accepted v3.1-MCP2 version (significant improvements)
- Makefile: Kept v3.1-vec's EXCLUDE_TRACKING_VARIABLES (correct spelling)

**Build notes:**
- MySQL_Tool_Handler now requires schema parameter for catalog operations
- Catalog path is hardcoded to datadir/mcp_catalog.db
- FTS functionality preserved but may need updates for schema isolation
- Fixed invalid memory accesses for tables:
    + mcp_query_rules
    + stats_mcp_query_rules
    + stats_mcp_query_digests
- Fixed inactive 'mcp_query_rules' being loaded to runtime.
- Fixed hash computation in 'compute_mcp_digest'.
- Fixed invalid escaping during 'stats_mcp_query_digests' gen.
- Fixed digest generation for MCP arguments:
    + SQL queries are now preserved using
      'mysql_query_digest_and_first_comment'.
    + TODO: Options for the tokenizer are right now hardcoded.
- Added initial testing and testing plan for MCP query_(rules/digests).
    + TODO: Test finished on phase8. Timeouts destroy the MCP
      connection, leaving it unusable for subsequent queries this should
      be fixed for continuing testing.
- TODO: There are several limitations to fix in
  'validate_readonly_query'. This reflect in some query hacks in the
  testing.
    + 'SELECT' starting with comments (--) gets flagged as non-read.
    + 'SELECT' must have a 'SELECT .* FROM' structure. While common,
      simple testing queries many times lack this form.
…ility)

v3.1-MCP2 introduced lib/proxy_sqlite3_symbols.cpp which provides
canonical definitions for proxy_sqlite3_* function pointers.

Tests that link against libproxysql.a must NOT define MAIN_PROXY_SQLITE3,
otherwise sqlite3db.h will provide conflicting definitions.

Removed #define MAIN_PROXY_SQLITE3 from:
- test/tap/tests/sqlite3-t.cpp
- test/tap/tests/unit_test.h
- test/tap/tests/setparser_test.cpp
- test/tap/tests/setparser_test_common.h
- lib/MySQL_Catalog.cpp: Convert search/list/remove to use SQLite prepared
  statements instead of string concatenation for user parameters

- lib/RAG_Tool_Handler.cpp: Add escape_fts_query() function to properly
  escape single quotes in FTS5 MATCH clauses; update all FTS and vector
  MATCH queries to use escaped values

- lib/Static_Harvester.cpp: Add is_valid_schema_name() validation function
  to ensure schema names only contain safe characters (alphanumeric,
  underscore, dollar sign) before using in INFORMATION_SCHEMA queries

- lib/Query_Tool_Handler.cpp: Add clarifying comments to validate_readonly_query
  explaining the blacklist (quick exit) + whitelist (allowed query types) approach

- Remove backup file lib/Anomaly_Detector.cpp.bak

Addresses gemini-code-assist review comments from PR #26.
Add proxy_sqlite3_enable_load_extension and
proxy_sqlite3_global_stats_row_step to the #else (MAIN_PROXY_SQLITE3)
branch in sqlite3db.h to maintain symmetry with the extern declarations
and prevent ODR violations.

Addresses coderabbitai review comment.
Add assert(proxy_sqlite3_bind_blob) to the assertion block in
SQLite3DB::LoadPlugin to ensure the symbol is provided by plugins.
Without this, if a plugin fails to provide the symbol, the code will
crash at runtime with no safety check.

proxy_sqlite3_bind_blob is actively used in Anomaly_Detector.cpp:765
to bind embeddings.

Addresses coderabbitai review comment.
…pserts

Add catalog_au AFTER UPDATE trigger in MySQL_Catalog that mirrors the
delete+insert pattern used in catalog_ad/catalog_ai. This keeps the FTS
index current when upserts occur (INSERT OR REPLACE ... ON CONFLICT ...
DO UPDATE), since the UPDATE doesn't trigger INSERT/DELETE triggers.

The trigger first deletes the old entry from catalog_fts then inserts
the new entry, ensuring the full-text search index stays synchronized
with the catalog table.

Addresses coderabbitai review comment.
Change free(resultset) to delete resultset in Query_Tool_Handler
(extract_schema_name function). SQLite3_result is a C++ class allocated
with new, so it must be deallocated with delete, not free(). Using free()
causes mixed allocator UB.

Addresses coderabbitai review comment.
The execute_parameterized_query function in RAG_Tool_Handler was creating
a prepared statement and binding parameters, but then executing the raw query
string instead of the prepared statement. This completely bypassed the
parameter binding, making the function useless for preventing SQL injection.

Changed to use vector_db->execute_prepared(stmt, ...) to execute the bound
statement instead of the raw query, so the bound parameters are actually used.

Addresses coderabbitai review comment.
Fix off-by-one errors in row->fields indices. The SELECT clause returns:
object_id(0), schema_name(1), object_name(2), object_type(3), engine(4),
table_rows_est(5), data_length(6), index_length(7), has_primary_key(8),
has_foreign_keys(9), has_time_column(10)

But the code was reading from fields[3..9] instead of [4..10].
Added comment documenting the correct column order.

Addresses coderabbitai review comment.
Add escape_sql_string() helper function that doubles single quotes
to prevent SQL injection when strings are used in SQL string
concatenation. Update harvest_view_definitions to use this function
for view_def, schema_name, and view_name.

This prevents SQL injection in the UPDATE statement that stores
view definitions in the catalog.

Addresses coderabbitai review comment.
Fix two issues in Query_Tool_Handler's execute_query functions:

1. mysql_query() failure path now returns immediately after
   return_connection() instead of continuing to process on bad state.

2. Capture affected_rows BEFORE return_connection() to avoid race
   condition. Previously, mysql_affected_rows() was called after
   return_connection(), potentially accessing a stale connection.

Apply fixes to both execute_query() and execute_query_with_schema().

Addresses coderabbitai review comments.
…rules_to_runtime

Change free(resultset) to delete resultset in
ProxySQL_Admin::load_mcp_query_rules_to_runtime.

SQLite3_result is a C++ class allocated with new, so it must be
deallocated with delete, not free(). Using free() causes undefined
behavior and memory leaks.

Addresses coderabbitai review comment.
Fix multi-statement execution in Admin_Handler.cpp for MCP query rules.
The previous code built a single SQL string with "DELETE ...; INSERT ..." but
SQLite only executed the first statement.

Changed to execute statements as an explicit transaction:
1. BEGIN
2. DELETE FROM target_table
3. INSERT OR REPLACE INTO target_table SELECT * FROM source_table
4. COMMIT (or ROLLBACK on error)

Applied to both:
- LOAD MCP QUERY RULES FROM DISK/TO MEMORY
- SAVE MCP QUERY RULES TO DISK

Addresses coderabbitai review comment.
…njection

Fix two SQL injection vulnerabilities identified by coderabbitai in
ProxySQL_Admin_Stats.cpp by converting sprintf/snprintf interpolation
to SQLite prepared statements.

1. stats___mcp_query_digest (lines 2581-2636):
   - Changed from sprintf with format string to prepared statement
   - digest_text field now properly bound as parameter (was unsafe)
   - All 10 columns now use positional parameters (?1-?10)

2. stats___mcp_query_tools_counters (lines 1587-1635):
   - Changed from snprintf with unescaped fields to prepared statement
   - Fixed incorrect table name logic (was appending _reset incorrectly)
   - All 8 columns now use positional parameters (?1-?8)

These changes prevent SQL injection when resultset fields contain
quotes, backslashes, or other SQL metacharacters.
…ccess

Fix compilation errors in the SQL injection fixes:

1. ProxySQL_Admin_Stats.cpp: Use public statsdb->prepare_v2() API
   - Changed from direct proxy_sqlite3_prepare_v2() calls with statsdb->db
   - statsdb->db is private, must use public prepare_v2(query, &stmt) method

2. Admin_Handler.cpp: Add SPA cast for template function access
   - Added ProxySQL_Admin *SPA=(ProxySQL_Admin *)pa; declaration
   - Changed all admindb->execute to SPA->admindb->execute
   - Removed unused 'error' and 'success' variables

The build now completes successfully.
… SQL injection

Fix issues identified by coderabbitai review:

1. Admin_Handler.cpp: Fix typo in strncasecmp for LOAD MCP QUERY RULES
   - Line 2365 had "LOAD MCP RULES FROM DISK" instead of "LOAD MCP QUERY RULES FROM DISK"

2. Admin_Handler.cpp: Fix use-after-free and missing client response
   - Removed l_free(*ql,*q) which freed *q before caller used it
   - Added send_error_msg_to_client calls on all error paths
   - Added send_ok_msg_to_client call on success path
   - Changed return value from true to false (to match handler pattern)
   - Applied to both LOAD MCP QUERY RULES and SAVE MCP QUERY RULES handlers

3. ProxySQL_Admin_Stats.cpp: Fix sprintf SQL injection in stats___mcp_query_rules
   - Replaced sprintf with prepared statement using positional parameters
   - Changed from char* query with malloc/free to sqlite3_stmt with prepare_v2
   - Both columns now bound as parameters (?1, ?2)
Merge v3.1-MCP2 into v3.1-vec (conflicts resolved)
- Fix Discovery_Schema.cpp fingerprint output JSON consistency
  - Quote placeholders (e.g., "?" instead of ?) for valid JSON
- Fix test_mcp_query_rules_block.sh exec_admin_silent function
  - Add -B and -N flags for batch mode with no headers
- Remove unused YELLOW variable from test_phase1_crud.sh
- Fix test_phase2_load_save.sh runtime DELETE to be non-fatal
- Remove unused INITIAL_COUNT from test_phase3_runtime.sh
- Fix test_phase3_runtime.sh to verify exact ID ordering
- Fix test_phase4_stats.sh read-only test assertion
  - Make INSERT/DELETE non-fatal and fix assertion logic
- Remove unused DIGEST_TEXT_CHECK from test_phase5_digest.sh
- Fix test_phase8_eval_timeout.sh non-timeout response handling
  - Return failure for non-timeout responses
- Remove stray exit 1 from test_phase8_eval_timeout.sh
Resolved merge conflict in lib/ProxySQL_Admin_Stats.cpp:
- Kept v3.1-MCP2_QR version with proper NULL handling
- Uses stmt_unique_ptr (RAII smart pointer)
- Checks for NULL before calling atoll() on numeric fields
- Discarded v3.1-vec version which lacked NULL checks

The merged code correctly handles NULL values for:
- run_id
- count_star
- first_seen
- last_seen
- sum_time
- min_time
- max_time
Address coderabbitai review - implement full JSON escaping for SQL digest:
- Handle backslash (\) and double quote (")
- Handle control characters: newline (\n), carriage return (\r), tab (\t)
- Handle other control characters (U+0000 through U+001F) with \uXXXX escapes

This ensures digest_text in stats_mcp_query_digest is always valid JSON,
preventing parsing errors for consumers of this data.
- Fix comment mismatch: Changed _2 suffix to match actual function name
  (mysql_query_digest_and_first_comment, not _2)
- Make get_def_mysql_opts() static to avoid symbol pollution
- Fix NULL first_comment parameter to prevent potential segfault
  - Pass valid char* pointer instead of NULL
  - Free first_comment if allocated by the function
fix: Multiple issues with MCP query_(rules/digests)
This addresses an issue from PR #22 where LoadPlugin() was completely
disabled. The function performs necessary initialization even when no
plugin is loaded (initializes built-in sqlite3 function pointers).

Changes:
- Added `const bool allow_load_plugin = false` flag in LoadPlugin()
- Modified `if (plugin_name)` to `if (plugin_name && allow_load_plugin == true)`
- Re-enabled the LoadPlugin() call in LoadPlugins()

The plugin loading code remains disabled (allow_load_plugin=false) while
the function pointer initialization from built-in SQLite3 now works correctly.

TODO: Revisit plugin loading safety mechanism to allow actual plugin loading.
fix: Re-enable SQLite3DB::LoadPlugin() with allow_load_plugin flag
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants