feat: Add MCP query rules and digest statistics [WIP] #19
Conversation
… schema

- Add ADMIN_SQLITE_TABLE_RUNTIME_MCP_QUERY_RULES schema (17 columns, same as mcp_query_rules)
- Fix STATS_SQLITE_TABLE_MCP_QUERY_RULES to only have rule_id and hits columns
- Add runtime_mcp_query_rules detection and refresh in ProxySQL_Admin
- Implement save_mcp_query_rules_from_runtime(bool _runtime) for both config and runtime tables
- Update get_mcp_query_rules() to return 17 columns (no hits)
- get_stats_mcp_query_rules() returns 2 columns (rule_id, hits)

Mirrors the MySQL query rules pattern:
- mcp_query_rules: config table (17 cols)
- runtime_mcp_query_rules: runtime state (17 cols)
- stats_mcp_query_rules: hit counters (2 cols)
- Add separate MCP QUERY RULES command block in Admin_Handler
- Fix string length comparison (21 chars for "SAVE/LOAD MCP QUERY RULES ")
- Add handlers for:
  - LOAD MCP QUERY RULES TO RUNTIME
  - SAVE MCP QUERY RULES TO DISK
  - SAVE MCP QUERY RULES TO MEMORY / FROM RUNTIME
- Register mcp_query_rules in disk database (tables_defs_config)

Previously MCP commands were incorrectly nested inside the MYSQL/PGSQL block and could not be reached. Now they have their own conditional block.
- Add LOAD MCP QUERY RULES FROM DISK command
- Add LOAD MCP QUERY RULES TO MEMORY command
- Both commands copy rules from disk.mcp_query_rules to main.mcp_query_rules

This completes the full set of MCP query rules LOAD/SAVE commands, matching the MySQL query rules pattern.
Change function signature from stats___mcp_query_rules(bool reset) to stats___mcp_query_rules() to match MySQL query rules pattern. The reset parameter was never used in the function body and MySQL's stats___mysql_query_rules() has no reset parameter.
Previously re_modifiers was hardcoded to 1 (CASELESS), ignoring the value stored in the database. Now properly reads from row->fields[7].
Add detailed function-level documentation to all MCP query rules, query digest, static harvester, and catalog components.

Static_Harvester.cpp:
- Document all 18+ harvest functions (schemas, objects, columns, indexes, FKs, views)
- Document lifecycle methods (init, close, connect, disconnect)
- Document helper methods (is_time_type, is_id_like_name)
- Document run management (start_run, finish_run, run_full_harvest)
- Document statistics methods (get_harvest_stats)

Query_Tool_Handler.cpp:
- Document JSON helper functions (json_string, json_int, json_double)
- Document digest tracking section with flow explanation

MySQL_Catalog.cpp:
- Document schema isolation architecture
- Document CRUD operations (upsert, get, search, list, remove, merge)

Discovery_Schema.cpp:
- Document MCP query rules evaluation (evaluate_mcp_query_rules)
- Document digest functions (compute_mcp_digest, fingerprint_mcp_args)
- Document update/get functions for rules and digests

ProxySQL_Admin_Stats.cpp:
- Document stats collection functions

ProxySQL_Admin.cpp:
- Document load/save functions for query rules

Admin_Handler.cpp:
- Document MCP query rules command handlers

include/ProxySQL_Admin_Tables_Definitions.h:
- Add comments explaining table purposes
Important: Review skipped. Auto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI.

Note: Other AI code review bot(s) detected. CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Adds a new MCP subsystem: in-memory rule storage and evaluation, digest/statistics tracking, admin/runtime table definitions and commands, catalog lifecycle and schema-aware catalog ops, plus a Static_Harvester for MySQL metadata harvesting.

Changes
Sequence Diagram(s)

sequenceDiagram
participant QTH as Query_Tool_Handler
participant DS as Discovery_Schema
participant DB as AdminDB
participant QP as MCP_Processor
QTH->>DS: evaluate_mcp_query_rules(tool_name, schemaname, arguments, original_query)
activate DS
DS->>QP: iterate active rules (regex match / filters)
activate QP
QP->>QP: evaluate match_pattern, username, schemaname, flags
QP-->>DS: MCP_Query_Processor_Output (actions: rewrite/error/OK/timeout, flags)
deactivate QP
DS-->>QTH: MCP_Query_Processor_Output*
deactivate DS
alt OK_msg set
QTH->>QTH: return success with OK_msg (early)
else error_msg set
QTH->>QTH: return error_msg (early)
else new_query provided
QTH->>QTH: execute rewritten query
QTH->>DS: compute_mcp_digest(tool_name, arguments)
DS-->>QTH: digest, digest_text
QTH->>DS: update_mcp_query_digest(tool_name, run_id, digest, digest_text, duration, ts)
DS->>DB: (optional) persist digest stats / later export via stats___mcp_query_digest
end
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Summary of Changes

Hello @renecannao, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly extends ProxySQL's capabilities by integrating advanced management and monitoring features for Model Context Protocol (MCP) queries. It introduces a flexible rule-based system to control MCP query behavior, allowing for dynamic modifications or restrictions. Concurrently, it provides deep insights into MCP query performance and usage patterns through detailed digest statistics, enhancing observability and operational control for MCP workloads.

Highlights
Code Review
This pull request introduces significant new functionality by adding MCP query rules and digest statistics, mirroring existing infrastructure for MySQL/PostgreSQL. The overall design is consistent with the project's patterns. However, there are several areas for improvement. The new code heavily relies on manual memory management with raw pointers, which is error-prone and should be updated to use modern C++ idioms like smart pointers and std::string for better safety and maintainability. I've also identified some duplicated code, an incomplete feature implementation in the statistics handling, and a few inconsistencies between comments and code that should be addressed. My detailed comments highlight these points to help improve the robustness of this new feature.
MCP_Query_Processor_Output* evaluate_mcp_query_rules(
    const std::string& tool_name,
    const std::string& schemaname,
    const nlohmann::json& arguments,
    const std::string& original_query
);
The function evaluate_mcp_query_rules returns a raw pointer MCP_Query_Processor_Output*, and the documentation states that the caller is responsible for destroying it. This manual ownership transfer is error-prone and can easily lead to memory leaks if the caller forgets to delete the object.
To make the code safer and more idiomatic, you should use a smart pointer to manage the lifetime of the returned object automatically.
std::unique_ptr<MCP_Query_Processor_Output> evaluate_mcp_query_rules(
const std::string& tool_name,
const std::string& schemaname,
const nlohmann::json& arguments,
const std::string& original_query
);
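With the smart-pointer signature, the ownership story at the call site becomes trivial. A minimal, self-contained sketch of the pattern (the struct and rule logic below are hypothetical stand-ins, not the actual ProxySQL types):

```cpp
#include <memory>
#include <string>

// Hypothetical stand-in for the real ProxySQL type, for illustration only.
struct MCP_Query_Processor_Output {
    std::string error_msg;
};

// Sketch: the factory returns ownership explicitly via unique_ptr, so a
// caller that forgets to call delete simply cannot leak the object.
std::unique_ptr<MCP_Query_Processor_Output> evaluate_mcp_query_rules_sketch(
        const std::string& tool_name) {
    auto out = std::make_unique<MCP_Query_Processor_Output>();
    if (tool_name == "blocked_tool") {
        out->error_msg = "Blocked by query rule";
    }
    return out;  // destruction is automatic on every caller path
}
```

The caller simply uses the returned object; no explicit delete appears anywhere.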
lib/ProxySQL_Admin_Stats.cpp (outdated)
// For now, we'll leave the table empty since MCP digest stats are stored in memory
// in the Discovery_Schema and would need to be accessed differently
// TODO: Implement proper access to Discovery_Schema digest statistics
This function is intended to populate the stats_mcp_query_digest table, but the implementation is incomplete. It currently only deletes existing data and contains a TODO comment, leaving the table empty. To complete this feature, you should fetch the digest statistics from the Discovery_Schema catalog (using get_mcp_query_digest) and insert the data into the stats table.
struct MCP_Query_Rule {
    int rule_id;
    bool active;
    char *username;
    char *schemaname;
    char *tool_name;
    char *match_pattern;
    bool negate_match_pattern;
    int re_modifiers; // bitmask: 1=CASELESS
    int flagIN;
    int flagOUT;
    char *replace_pattern;
    int timeout_ms;
    char *error_msg;
    char *ok_msg;
    bool log;
    bool apply;
    char *comment;
    uint64_t hits; // in-memory only, not persisted to table
    void* regex_engine; // compiled regex (RE2)

    MCP_Query_Rule() : rule_id(0), active(false), username(NULL), schemaname(NULL),
        tool_name(NULL), match_pattern(NULL), negate_match_pattern(false),
        re_modifiers(1), flagIN(0), flagOUT(0), replace_pattern(NULL),
        timeout_ms(0), error_msg(NULL), ok_msg(NULL), log(false), apply(true),
        comment(NULL), hits(0), regex_engine(NULL) {}
};
The MCP_Query_Rule struct uses C-style char* for string members and a void* for the regex engine. This requires manual memory management (e.g., strdup/free, new/delete) which is complex and error-prone in C++. Using modern C++ features would make the code safer and more maintainable.
I recommend the following changes:

- Replace char* members with std::string to automate memory management.
- Replace void* regex_engine with std::unique_ptr<re2::RE2> to manage the lifetime of the regex object automatically and provide type safety.
These changes would eliminate the need for manual memory management in the Discovery_Schema destructor and load_mcp_query_rules function, reducing the risk of memory leaks or corruption.
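A self-contained sketch of the modernized struct (a subset of fields; std::regex stands in for re2::RE2 here only so the example compiles without the RE2 dependency — the real code would hold std::unique_ptr<re2::RE2>):

```cpp
#include <cstdint>
#include <memory>
#include <regex>   // stand-in for <re2/re2.h>, used only to keep this sketch dependency-free
#include <string>

// Sketch of MCP_Query_Rule with automatic memory management. No destructor,
// no strdup/free: copies and destruction are handled by std::string and
// unique_ptr. In the real code the engine member would be
// std::unique_ptr<re2::RE2>.
struct MCP_Query_Rule_Sketch {
    int rule_id = 0;
    bool active = false;
    std::string username;        // empty string stands in for a NULL char*
    std::string match_pattern;
    int re_modifiers = 1;        // bitmask: 1 = CASELESS
    uint64_t hits = 0;
    std::unique_ptr<std::regex> regex_engine;  // typed and auto-freed
};

MCP_Query_Rule_Sketch make_rule(const std::string& pattern) {
    MCP_Query_Rule_Sketch r;
    r.match_pattern = pattern;
    r.regex_engine = std::make_unique<std::regex>(pattern, std::regex::icase);
    return r;  // move semantics: the unique_ptr transfers cleanly
}
```

With this layout, the cleanup loops in the destructor and in load_mcp_query_rules() collapse to a plain `rules.clear()`.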
db->execute(
    "CREATE TABLE IF NOT EXISTS mcp_query_rules ("
    " rule_id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL ,"
    " active INT CHECK (active IN (0,1)) NOT NULL DEFAULT 0 ,"
    " tool_name VARCHAR ,"
    " run_id INT ,"
    " match_pattern VARCHAR ,"
    " negate_match_pattern INT CHECK (negate_match_pattern IN (0,1)) NOT NULL DEFAULT 0 ,"
    " re_modifiers VARCHAR DEFAULT 'CASELESS' ,"
    " flagIN INT NOT NULL DEFAULT 0 ,"
    " flagOUT INT CHECK (flagOUT >= 0) ,"
    " action VARCHAR CHECK (action IN ('allow','block','rewrite','timeout')) NOT NULL DEFAULT 'allow' ,"
    " replace_pattern VARCHAR ,"
    " timeout_ms INT CHECK (timeout_ms >= 0) ,"
    " error_msg VARCHAR ,"
    " OK_msg VARCHAR ,"
    " log INT CHECK (log IN (0,1)) ,"
    " apply INT CHECK (apply IN (0,1)) NOT NULL DEFAULT 1 ,"
    " comment VARCHAR ,"
    " hits INTEGER NOT NULL DEFAULT 0"
    ");"
);
The CREATE TABLE statement for mcp_query_rules within this function appears to be dead code. The schema defined here (with run_id and action columns) is inconsistent with the mcp_query_rules table defined in ProxySQL_Admin_Tables_Definitions.h and used by the admin interface. The application logic loads rules from the admin database into an in-memory structure and does not appear to use this table. This code should be removed to avoid confusion and reduce maintenance overhead.
// Clear existing rules
for (auto rule : mcp_query_rules) {
    if (rule->regex_engine) {
        delete (re2::RE2*)rule->regex_engine;
    }
    free(rule->username);
    free(rule->schemaname);
    free(rule->tool_name);
    free(rule->match_pattern);
    free(rule->replace_pattern);
    free(rule->error_msg);
    free(rule->ok_msg);
    free(rule->comment);
    delete rule;
}
mcp_query_rules.clear();
This block of code for clearing existing rules is nearly identical to the cleanup logic in the Discovery_Schema destructor. Duplicating this complex memory management logic increases the risk of bugs and makes maintenance harder.
Consider extracting this cleanup logic into a private helper function (e.g., clear_mcp_query_rules()) that can be called from both the destructor and load_mcp_query_rules to avoid code duplication.
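A sketch of such a helper, using a trimmed-down stand-in struct so it runs standalone (only two of the char* members are shown; the real helper would free all eight and delete the regex engine as well):

```cpp
#include <cstdlib>
#include <cstring>
#include <vector>

// Trimmed stand-in for the rule struct: only two char* members shown.
struct MCP_Query_Rule_Min {
    char* username;
    char* comment;
};

// Sketch: a single private helper owns the cleanup, so the destructor and
// load_mcp_query_rules() cannot drift apart over time.
void clear_mcp_query_rules(std::vector<MCP_Query_Rule_Min*>& rules) {
    for (MCP_Query_Rule_Min* rule : rules) {
        free(rule->username);  // free(NULL) is a no-op, so no null checks needed
        free(rule->comment);
        delete rule;
    }
    rules.clear();
}
```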
if (rule->username) {
    // For now, we don't have username in MCP context, skip if set
    // TODO: Add username matching when available
    continue;
}
The current logic for username matching will cause any rule with a username specified to be skipped, as the code immediately continues if rule->username is not null. The TODO comment indicates this might be intentional for now, but it effectively makes the username field in mcp_query_rules non-functional. This could be confusing for users trying to create rules based on usernames.
// Input: {"filters": {"status": "active", "age": 25}}
// Output: {"filters":{"?":"?","?":"?"}}
The example in the documentation for fingerprint_mcp_args is incorrect and does not match the implementation. The implementation preserves object keys, while the example suggests keys are also replaced with ?.
The example:
Input: {"filters": {"status": "active", "age": 25}}
Output: {"filters":{"?":"?","?":"?"}}
Based on the implementation, the output should be:
Output: {"filters":{"status":"?","age":"?"}}
Please update the comment to accurately reflect the function's behavior.
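The keys-preserved, values-masked behavior can be illustrated with a stdlib-only sketch (std::map stands in for nlohmann::json so the example is self-contained; the real fingerprint_mcp_args walks arbitrary JSON recursively):

```cpp
#include <map>
#include <string>

// Stdlib-only sketch of the fingerprinting rule: object keys are preserved,
// scalar values are replaced with "?". Note that std::map iterates keys in
// sorted order, unlike the insertion order a JSON object may carry.
std::string fingerprint_flat(const std::map<std::string, std::string>& obj) {
    std::string out = "{";
    bool first = true;
    for (const auto& entry : obj) {
        if (!first) out += ",";
        out += "\"" + entry.first + "\":\"?\"";  // the concrete value never appears
        first = false;
    }
    out += "}";
    return out;
}
```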
char* a = (char*)"INSERT INTO stats_mcp_query_rules VALUES (\"%s\",\"%s\")";
for (std::vector<SQLite3_row*>::iterator it = resultset->rows.begin(); it != resultset->rows.end(); ++it) {
    SQLite3_row* r = *it;
    int arg_len = 0;
    for (int i = 0; i < 2; i++) {
        arg_len += strlen(r->fields[i]);
    }
    char* query = (char*)malloc(strlen(a) + arg_len + 32);
    sprintf(query, a, r->fields[0], r->fields[1]);
    statsdb->execute(query);
    free(query);
}
Using sprintf to construct SQL queries is generally unsafe and can lead to SQL injection vulnerabilities, even if the inputs seem safe. It's better to use prepared statements to handle parameter binding securely.
While in this specific case the inputs are generated from integers, adopting prepared statements as a standard practice will make the code more robust and secure against future changes.
@coderabbitai review

✅ Actions performed: Review triggered.
Pull request overview
This pull request implements firewall functionality and query digest statistics for MCP (Model Context Protocol) queries in ProxySQL, mirroring the existing MySQL query rules and statistics infrastructure.
Changes:
- Adds MCP query rules tables (mcp_query_rules, runtime_mcp_query_rules) with 17 columns for defining firewall rules
- Implements query digest computation and statistics tracking (stats_mcp_query_digest, stats_mcp_query_rules)
- Adds LOAD/SAVE commands for MCP query rules following ProxySQL patterns
- Includes comprehensive documentation for Static Harvester, Query Tool Handler, and catalog isolation features
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 17 comments.
Show a summary per file
| File | Description |
|---|---|
| lib/Static_Harvester.cpp | Adds comprehensive documentation for all static harvester functions and workflow |
| lib/Query_Tool_Handler.cpp | Implements MCP query rules evaluation and digest tracking integration into query execution flow |
| lib/ProxySQL_Admin_Stats.cpp | Adds stats collection functions for MCP query digest and rules (incomplete implementation) |
| lib/ProxySQL_Admin.cpp | Implements LOAD/SAVE commands for MCP query rules with proper escaping |
| lib/MySQL_Catalog.cpp | Documents schema isolation for multi-tenancy support |
| lib/Discovery_Schema.cpp | Core implementation of MCP query rules evaluation, digest computation, and statistics (has critical schema issues) |
| lib/Admin_Handler.cpp | Adds command parsing for MCP query rules LOAD/SAVE operations |
| lib/Admin_Bootstrap.cpp | Registers MCP query rules and stats tables in admin and config databases |
| include/proxysql_admin.h | Declares MCP query rules load/save and stats functions |
| include/ProxySQL_Admin_Tables_Definitions.h | Defines table schemas for MCP query rules and digest statistics |
| include/Discovery_Schema.h | Declares MCP query rule structures, processor output, and digest functions |
lib/Discovery_Schema.cpp (outdated)
SQLite3_result* Discovery_Schema::get_mcp_query_digest(bool reset) {
    SQLite3_result* result = new SQLite3_result();

    // Define columns (10 columns, not 11 - digest_text was duplicated)
Copilot AI · Jan 20, 2026
The comment at line 2839 states "10 columns, not 11 - digest_text was duplicated", but counting the add_column_definition calls from lines 2840-2849, there are exactly 10 columns as expected. The comment is confusing because it suggests there was a duplication issue that was fixed, but this should be clarified or removed if it's no longer relevant.
Suggested change:
- // Define columns (10 columns, not 11 - digest_text was duplicated)
+ // Define columns for MCP query digest statistics
void ProxySQL_Admin::stats___mcp_query_digest(bool reset) {
    if (!GloMCPH) return;
    Query_Tool_Handler* qth = GloMCPH->query_tool_handler;
    if (!qth) return;

    // Get the discovery schema catalog
    // Note: This is a simplified implementation that queries the catalog database
    // In a full implementation, we would access the Discovery_Schema directly
    statsdb->execute("BEGIN");

    if (reset) {
        statsdb->execute("DELETE FROM stats_mcp_query_digest_reset");
    } else {
        statsdb->execute("DELETE FROM stats_mcp_query_digest");
    }

    // For now, we'll leave the table empty since MCP digest stats are stored in memory
    // in the Discovery_Schema and would need to be accessed differently
    // TODO: Implement proper access to Discovery_Schema digest statistics

    statsdb->execute("COMMIT");
}
Copilot AI · Jan 20, 2026
The function signature and documentation describe a parameter 'reset' for stats___mcp_query_digest, but the implementation is incomplete. The TODO comment at line 2599 indicates that the digest statistics access needs proper implementation. The function currently only deletes rows from the table but doesn't populate it with actual data from the Discovery_Schema. This will result in empty stats tables when queried.
char* a = (char*)"INSERT INTO stats_mcp_query_rules VALUES (\"%s\",\"%s\")";
for (std::vector<SQLite3_row*>::iterator it = resultset->rows.begin(); it != resultset->rows.end(); ++it) {
    SQLite3_row* r = *it;
    int arg_len = 0;
    for (int i = 0; i < 2; i++) {
        arg_len += strlen(r->fields[i]);
    }
    char* query = (char*)malloc(strlen(a) + arg_len + 32);
    sprintf(query, a, r->fields[0], r->fields[1]);
    statsdb->execute(query);
    free(query);
}
Copilot AI · Jan 20, 2026
Memory leak potential: The result set returned from get_stats_mcp_query_rules is deleted at line 2647, but if the function returns early at lines 2582, 2584, or 2629, the resultset would not be cleaned up. However, reviewing the code more carefully, the resultset is only created inside the get_stats_mcp_query_rules call at line 2628, so early returns before that are safe. But there's still a potential issue - if any of the malloc/sprintf/execute operations fail between lines 2634-2645, the function should handle errors gracefully.
lib/Discovery_Schema.cpp (outdated)
rule->tool_name = row->fields[4] ? strdup(row->fields[4]) : NULL;                // tool_name
rule->match_pattern = row->fields[5] ? strdup(row->fields[5]) : NULL;            // match_pattern
rule->negate_match_pattern = row->fields[6] ? atoi(row->fields[6]) != 0 : false; // negate_match_pattern
rule->re_modifiers = row->fields[7] ? atoi(row->fields[7]) : 1;                  // default CASELESS
Copilot AI · Jan 20, 2026
Type mismatch for re_modifiers: The table definition stores re_modifiers as VARCHAR DEFAULT 'CASELESS' (line 342 in ProxySQL_Admin_Tables_Definitions.h and line 359 in Discovery_Schema.cpp), but the loading code at line 2404 attempts to parse it as an integer using atoi(). This will fail - atoi('CASELESS') returns 0, not the expected bitmask value. The code expects re_modifiers to be an integer bitmask (1=CASELESS per line 2420), but the schema defines it as a string. Either change the schema to INT or update the loading logic to parse string values like 'CASELESS' to the corresponding bitmask.
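One way to resolve the mismatch without a schema change is to parse the VARCHAR value into the bitmask at load time. A sketch, assuming CASELESS (bit 1, per the struct comment) is the only flag currently defined; any further flag names would be assumptions:

```cpp
#include <string>
#include <strings.h>  // strcasecmp (POSIX)

// Sketch: parse the VARCHAR re_modifiers column (e.g. "CASELESS", possibly
// comma-separated) into the integer bitmask the evaluator expects.
// Only CASELESS (bit 1) is handled, matching the struct's documented bitmask.
int parse_re_modifiers(const std::string& value) {
    int mask = 0;
    size_t start = 0;
    while (true) {
        size_t comma = value.find(',', start);
        std::string token = (comma == std::string::npos)
            ? value.substr(start)
            : value.substr(start, comma - start);
        // Trim surrounding spaces.
        while (!token.empty() && token.front() == ' ') token.erase(token.begin());
        while (!token.empty() && token.back() == ' ') token.pop_back();
        if (strcasecmp(token.c_str(), "CASELESS") == 0) mask |= 1;
        if (comma == std::string::npos) break;
        start = comma + 1;
    }
    return mask;
}
```

The loader would then call this instead of atoi() on row->fields[7].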
MCP_Query_Processor_Output* qpo = catalog->evaluate_mcp_query_rules(
    tool_name,
    schema,
    arguments,
    sql
);

// Check for OK_msg (return success without executing)
if (qpo->OK_msg) {
    unsigned long long duration = monotonic_time() - start_time;
    track_tool_invocation(this, tool_name, schema, duration);
    catalog->log_query_tool_call(tool_name, schema, 0, start_time, duration, "OK message from query rule");
    result = create_success_response(qpo->OK_msg);
    delete qpo;
    return result;
}

// Check for error_msg (block the query)
if (qpo->error_msg) {
    unsigned long long duration = monotonic_time() - start_time;
    track_tool_invocation(this, tool_name, schema, duration);
    catalog->log_query_tool_call(tool_name, schema, 0, start_time, duration, "Blocked by query rule");
    result = create_error_response(qpo->error_msg);
    delete qpo;
    return result;
}

// Apply rewritten query if provided
if (qpo->new_query) {
    sql = *qpo->new_query;
}

// Apply timeout if provided
if (qpo->timeout_ms > 0) {
    timeout_sec = qpo->timeout_ms / 1000;
}

// Apply log flag if set
if (qpo->log == 1) {
    // TODO: Implement query logging if needed
}

delete qpo;
Copilot AI · Jan 20, 2026
Memory leak: The qpo object allocated at line 1602 is deleted at lines 1615, 1625, and 1644, but if execution continues past line 1644 and an exception is thrown in the query execution code (lines 1647-1693), the qpo object will not be deleted, causing a memory leak. Consider using RAII (std::unique_ptr) or ensuring cleanup in all code paths.
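The leak can be closed without changing evaluate_mcp_query_rules itself by wrapping the raw pointer at the call site. A self-contained sketch with hypothetical stand-in types (not the real ProxySQL API):

```cpp
#include <memory>
#include <string>

// Hypothetical stand-ins so this sketch compiles on its own.
struct MCP_Query_Processor_Output {
    std::string error_msg;
};

MCP_Query_Processor_Output* evaluate_rules_raw(bool blocked) {
    auto* out = new MCP_Query_Processor_Output{};
    if (blocked) out->error_msg = "Blocked by query rule";
    return out;
}

// Sketch: wrap the raw pointer in unique_ptr immediately, so every early
// return (and any exception thrown while executing the query) frees qpo.
std::string handle_query(bool blocked) {
    std::unique_ptr<MCP_Query_Processor_Output> qpo(evaluate_rules_raw(blocked));
    if (!qpo->error_msg.empty()) {
        return qpo->error_msg;  // no manual delete needed on this path
    }
    // ... query execution would happen here; a throw no longer leaks qpo ...
    return "executed";
}
```

The three scattered `delete qpo;` statements then disappear entirely.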
if (success == false) {
    if (resultset) {
        free(resultset);
    }
}
Copilot AI · Jan 20, 2026
Potential memory leak in error path: If resultset allocation succeeds but the execute_statement fails (returning an error), the function returns NULL at line 7789 without freeing the resultset. The condition at line 7789-7792 attempts to free the resultset only when success is false, but the resultset pointer may already be allocated by execute_statement even on error. The cleanup logic should verify whether resultset was allocated before freeing it.
// Periodically persist to SQLite (every 100 updates or so)
static thread_local unsigned int update_count = 0;
if (++update_count % 100 == 0) {
    // TODO: Implement batch persistence
}
Copilot AI · Jan 20, 2026
Thread safety issue: The static thread_local variable update_count at line 2801 is incremented without any synchronization. While thread_local ensures each thread has its own copy, if the same thread processes multiple queries concurrently (which is unlikely but possible in some threading models), this could lead to race conditions. However, since the persistence is only a TODO at line 2803, this is not currently a functional issue but should be addressed when implementing batch persistence.
// Apply timeout if provided
if (qpo->timeout_ms > 0) {
    timeout_sec = qpo->timeout_ms / 1000;
}
Copilot AI · Jan 20, 2026
The timeout_sec variable is declared but never used after being set at line 1636. The variable appears to be intended for future query timeout enforcement but is not actually applied to the query execution. This could mislead developers into thinking timeout enforcement is active when it's not.
Suggested change:
- // Apply timeout if provided
- if (qpo->timeout_ms > 0) {
-     timeout_sec = qpo->timeout_ms / 1000;
- }
+ // Note: qpo->timeout_ms is currently not enforced. Timeout handling
+ // should be implemented where queries are actually executed.
// Output: {"sql":"?","timeout":"?"}
//
// Input: {"filters": {"status": "active", "age": 25}}
// Output: {"filters":{"?":"?","?":"?"}}
Copilot AI · Jan 20, 2026
The comment at line 2971 shows an example output {"filters":{"?":"?","?":"?"}} which incorrectly suggests that object keys are replaced with "?". However, the code at line 2988 preserves the key using it.key(), so the correct output should be {"filters":{"status":"?","age":"?"}}. The documentation should be corrected to accurately reflect the fingerprinting behavior.
Suggested change:
- // Output: {"filters":{"?":"?","?":"?"}}
+ // Output: {"filters":{"status":"?","age":"?"}}
// Check schemaname match
if (rule->schemaname) {
    if (!schemaname.empty() && strcmp(rule->schemaname, schemaname.c_str()) != 0) {
Copilot AI · Jan 20, 2026
The condition at line 2511 checks if schemaname is not empty before comparing with rule->schemaname, but it doesn't handle the case where rule->schemaname is NULL and schemaname is empty. If rule->schemaname is NULL, it should match an empty schemaname parameter (meaning "match all schemas"). The current logic will skip the rule if schemaname is empty but rule->schemaname is not NULL, which may not be the intended behavior. Consider: if rule->schemaname is NULL, it should match any schema; if rule->schemaname is not NULL and schemaname is not empty, only match if they're equal.
Suggested change:
- if (!schemaname.empty() && strcmp(rule->schemaname, schemaname.c_str()) != 0) {
+ // Rules with an explicit schemaname only apply when a matching
+ // non-empty schemaname is provided.
+ if (schemaname.empty() || strcmp(rule->schemaname, schemaname.c_str()) != 0) {
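The intended semantics can be captured in a small predicate. A sketch following the reviewer's reading (a NULL rule schemaname matches every schema; a non-NULL one requires an exact, non-empty match):

```cpp
#include <cstring>
#include <string>

// Sketch of the schema-matching predicate described above.
bool schema_matches(const char* rule_schemaname, const std::string& schemaname) {
    if (rule_schemaname == nullptr) {
        return true;  // no schema filter on the rule: match all schemas
    }
    // Rules with an explicit schemaname only apply when a matching,
    // non-empty schemaname is provided.
    return !schemaname.empty() && schemaname == rule_schemaname;
}
```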
Actionable comments posted: 5
Caution: Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (8)
lib/MySQL_Catalog.cpp (3)
278-301: SQL injection vulnerability: user input concatenated directly into query.

The schema, kind, tags, and query parameters are concatenated directly into the SQL string without sanitization or parameterized binding. An attacker could inject malicious SQL.

🔒 Recommended fix using parameterized queries

 std::string MySQL_Catalog::search(
     const std::string& schema,
     const std::string& query,
     const std::string& kind,
     const std::string& tags,
     int limit,
     int offset
 ) {
-    std::ostringstream sql;
-    sql << "SELECT schema, kind, key, document, tags , links FROM catalog WHERE 1=1";
-
-    // Add schema filter
-    if (!schema.empty()) {
-        sql << " AND schema = '" << schema << "'";
-    }
-
-    // Add kind filter
-    if (!kind.empty()) {
-        sql << " AND kind = '" << kind << "'";
-    }
-
-    // Add tags filter
-    if (!tags.empty()) {
-        sql << " AND tags LIKE '%" << tags << "%'";
-    }
-
-    // Add search query
-    if (!query.empty()) {
-        sql << " AND (key LIKE '%" << query << "%' "
-            << "OR document LIKE '%" << query << "%' "
-            << "OR tags LIKE '%" << query << "%')";
-    }
+    // Use parameterized query to prevent SQL injection
+    sqlite3_stmt* stmt = NULL;
+    std::string sql = "SELECT schema, kind, key, document, tags, links FROM catalog WHERE 1=1";
+
+    std::vector<std::string> bindings;
+    if (!schema.empty()) {
+        sql += " AND schema = ?";
+        bindings.push_back(schema);
+    }
+    if (!kind.empty()) {
+        sql += " AND kind = ?";
+        bindings.push_back(kind);
+    }
+    // ... continue with parameterized bindings for tags and query
368-390: SQL injection vulnerability in list() method.

Same issue as `search()`: `schema` and `kind` are concatenated directly into SQL without sanitization.
505-511: SQL injection vulnerability in remove() method.

The `schema`, `kind`, and `key` parameters are directly concatenated into the DELETE statement.

🔒 Recommended fix

 int MySQL_Catalog::remove(
     const std::string& schema,
     const std::string& kind,
     const std::string& key
 ) {
-    std::ostringstream sql;
-    sql << "DELETE FROM catalog WHERE 1=1";
-
-    if (!schema.empty()) {
-        sql << " AND schema = '" << schema << "'";
-    }
-    sql << " AND kind = '" << kind << "' AND key = '" << key << "'";
-
-    if (!db->execute(sql.str().c_str())) {
+    sqlite3_stmt* stmt = NULL;
+    const char* delete_sql = schema.empty()
+        ? "DELETE FROM catalog WHERE kind = ?1 AND key = ?2"
+        : "DELETE FROM catalog WHERE schema = ?1 AND kind = ?2 AND key = ?3";
+
+    int rc = db->prepare_v2(delete_sql, &stmt);
+    if (rc != SQLITE_OK) {
+        proxy_error("Failed to prepare catalog remove: %d\n", rc);
+        return -1;
+    }
+
+    int idx = 1;
+    if (!schema.empty()) {
+        (*proxy_sqlite3_bind_text)(stmt, idx++, schema.c_str(), -1, SQLITE_TRANSIENT);
+    }
+    (*proxy_sqlite3_bind_text)(stmt, idx++, kind.c_str(), -1, SQLITE_TRANSIENT);
+    (*proxy_sqlite3_bind_text)(stmt, idx++, key.c_str(), -1, SQLITE_TRANSIENT);
+
+    SAFE_SQLITE3_STEP2(stmt);
+    (*proxy_sqlite3_finalize)(stmt);
+
+    return 0;

lib/Static_Harvester.cpp (3)
398-423: SQL injection in fetch_schemas; use parameterized query.

The `filter` parameter is concatenated directly into the SQL query. While this may only be called internally, using parameterized queries is a best practice for defense-in-depth.

🔒 Suggested fix

 std::vector<Static_Harvester::SchemaRow> Static_Harvester::fetch_schemas(const std::string& filter) {
     std::vector<SchemaRow> schemas;
-    std::ostringstream sql;
-    sql << "SELECT SCHEMA_NAME, DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME "
-        << "FROM information_schema.SCHEMATA";
-
-    if (!filter.empty()) {
-        sql << " WHERE SCHEMA_NAME = '" << filter << "'";
-    }
-
-    sql << " ORDER BY SCHEMA_NAME;";
+    std::string sql = "SELECT SCHEMA_NAME, DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME "
+                      "FROM information_schema.SCHEMATA";
+
+    if (!filter.empty()) {
+        // Escape single quotes in filter to prevent injection
+        std::string escaped_filter;
+        for (char c : filter) {
+            if (c == '\'') escaped_filter += "''";
+            else escaped_filter += c;
+        }
+        sql += " WHERE SCHEMA_NAME = '" + escaped_filter + "'";
+    }
+
+    sql += " ORDER BY SCHEMA_NAME;";

Or better, use MySQL prepared statements via the `mysql_stmt_*` APIs.
479-513: Same SQL injection pattern in fetch_tables_views.

The `filter` parameter is directly concatenated. Apply the same fix as `fetch_schemas`.
1036-1055: SQL injection in harvest_view_definitions via view_def.

Line 1045 inserts `view_def` directly into an UPDATE statement. View definitions from MySQL could contain single quotes or malicious content.

🔒 Use parameterized update

-    std::ostringstream update_sql;
-    update_sql << "UPDATE objects SET definition_sql = '" << view_def << "' "
-               << "WHERE run_id = " << current_run_id
-               << " AND schema_name = '" << schema_name << "'"
-               << " AND object_name = '" << view_name << "'"
-               << " AND object_type = 'view';";
-
-    catalog->get_db()->execute_statement(update_sql.str().c_str(), &error, &cols, &affected);
+    sqlite3_stmt* stmt = NULL;
+    const char* update_sql = "UPDATE objects SET definition_sql = ?1 "
+        "WHERE run_id = ?2 AND schema_name = ?3 AND object_name = ?4 AND object_type = 'view'";
+
+    int rc = catalog->get_db()->prepare_v2(update_sql, &stmt);
+    if (rc == SQLITE_OK) {
+        (*proxy_sqlite3_bind_text)(stmt, 1, view_def.c_str(), -1, SQLITE_TRANSIENT);
+        (*proxy_sqlite3_bind_int)(stmt, 2, current_run_id);
+        (*proxy_sqlite3_bind_text)(stmt, 3, schema_name.c_str(), -1, SQLITE_TRANSIENT);
+        (*proxy_sqlite3_bind_text)(stmt, 4, view_name.c_str(), -1, SQLITE_TRANSIENT);
+        SAFE_SQLITE3_STEP2(stmt);
+        (*proxy_sqlite3_finalize)(stmt);
+    }

lib/Discovery_Schema.cpp (2)
1256-1267: SQL injection in fts_search via query parameter.

The `query`, `object_type`, and `schema_name` parameters are directly concatenated into the FTS MATCH clause and WHERE conditions.

🔒 Escape or validate FTS query input

FTS5 MATCH queries require special handling. At minimum, escape single quotes:

-    sql << "SELECT object_key, schema_name, object_name, object_type, tags , bm25(fts_objects) AS score "
-        << "FROM fts_objects WHERE fts_objects MATCH '" << query << "'";
+    // Escape single quotes for FTS query
+    std::string escaped_query;
+    for (char c : query) {
+        if (c == '\'') escaped_query += "''";
+        else escaped_query += c;
+    }
+    sql << "SELECT object_key, schema_name, object_name, object_type, tags, bm25(fts_objects) AS score "
+        << "FROM fts_objects WHERE fts_objects MATCH '" << escaped_query << "'";
2075-2087: SQL injection in fts_search_llm.

Same issue: the `query` parameter is directly concatenated into the MATCH clause.
🤖 Fix all issues with AI agents
In `@lib/Admin_Handler.cpp`:
- Around line 2362-2383: The current INSERT OR REPLACE SELECT statements in the
LOAD and SAVE branches (where *q is set via l_strdup to "INSERT OR REPLACE INTO
main.mcp_query_rules SELECT * FROM disk.mcp_query_rules" and "INSERT OR REPLACE
INTO disk.mcp_query_rules SELECT * FROM main.mcp_query_rules") do not remove
source-deleted rows and can resurrect stale rules; modify the SQL assigned to *q
in both branches so the target table is cleared before copying (e.g., issue a
DELETE FROM <target_table> or use a transactional sequence "DELETE FROM
<target_table>; INSERT INTO <target_table> SELECT ...") while keeping the same
surrounding logic (the blocks that test query_no_space_length and call l_free,
l_strdup, set *ql and return true) and ensure both the LOAD (target
main.mcp_query_rules) and SAVE (target disk.mcp_query_rules) cases are updated.
In `@lib/Discovery_Schema.cpp`:
- Around line 2655-2684: The code allocates pta with malloc and many strdup
calls (symbols: pta, malloc, strdup, result->add_row) without checking for NULL;
add checks after malloc and after each strdup (or use a helper that allocates
and duplicates safely) so that on allocation failure you free any
already-allocated entries, free pta, and avoid calling result->add_row (return
an error or handle gracefully). Specifically: verify pta != NULL immediately
after malloc, check each strdup result before storing it into pta (or wrap
duplication to return NULL on failure and propagate), on any allocation failure
free any non-NULL pta[i] and pta itself, and only call result->add_row when all
required allocations succeeded.
- Around line 2851-2895: The code has a TOCTOU race: it unlocks the read lock
(pthread_rwlock_rdlock / pthread_rwlock_unlock) then later acquires the write
lock (pthread_rwlock_wrlock) when reset is true, allowing concurrent
modification of mcp_digest_umap; change the locking so that if reset is true you
acquire the write lock before iterating (i.e., call pthread_rwlock_wrlock
instead of pthread_rwlock_rdlock at the start when reset is true) and hold that
lock for both building rows and clearing/deleting entries (use mcp_digest_umap,
MCP_Query_Digest_Stats, result->add_row references), otherwise keep the current
read-lock path—ensure you do not double-lock and that the unlock used matches
the lock acquired.
In `@lib/ProxySQL_Admin.cpp`:
- Around line 1358-1363: The bug is that stats_mcp_query_digest_reset contains
stats_mcp_query_digest so both flags get set; update the checks in the block
that reads query_no_space so the reset path is handled exclusively (mirror
MySQL/PGSQL): test for "stats_mcp_query_digest_reset" before testing for
"stats_mcp_query_digest" (or use a precise token/boundary match) so only
stats_mcp_query_digest_reset becomes true and refresh is set once; adjust the
logic around the stats_mcp_query_digest_reset and stats_mcp_query_digest flag
assignments to avoid double-triggering the reset and non-reset handling.
In `@lib/Query_Tool_Handler.cpp`:
- Around line 1635-1637: The conversion of qpo->timeout_ms to timeout_sec uses
integer division which truncates sub-second values to zero (e.g., 500ms => 0);
change the conversion logic in Query_Tool_Handler.cpp around the timeout
handling (variable timeout_sec and qpo->timeout_ms) to round up instead of
truncating so any non-zero millisecond timeout becomes at least 1 second (e.g.,
replace the direct /1000 integer division with a ceil-style conversion such as
(timeout_ms + 999)/1000 or use floating-point division plus ceil) and keep
existing behavior when timeout_ms is 0 or negative.
♻️ Duplicate comments (4)
lib/ProxySQL_Admin_Stats.cpp (2)
2581-2602: Implementation remains incomplete as previously noted.

The function deletes existing rows but never inserts new data, leaving the table permanently empty. The TODO comment at lines 2597-2599 acknowledges this. The digest statistics are stored in memory in the `Discovery_Schema` and need to be accessed via `get_mcp_query_digest()` to complete this implementation.
2634-2645: The sprintf pattern follows existing codebase conventions.

This implementation mirrors the pattern used in `stats___mysql_query_rules()` (lines 1385-1396) and `stats___pgsql_query_rules()` (lines 1407-1418). While prepared statements would be safer as previously noted, this is consistent with the established codebase style. The input data (rule_id and hits) comes from the catalog's internal integer counters, limiting injection risk in practice.

lib/Discovery_Schema.cpp (2)
2372-2387: Duplicate cleanup code: consider extracting a helper.

This cleanup logic is nearly identical to the destructor code at lines 34-48. Extracting a private helper method (e.g., `clear_mcp_query_rules()`) would reduce duplication and maintenance burden.
2503-2507: Username matching effectively disabled.

The current logic skips any rule with a username set, making the `username` field non-functional. The TODO indicates this is intentional for now, but users may be confused when their username-based rules don't work.
🧹 Nitpick comments (7)
lib/Admin_Bootstrap.cpp (1)
911-914: Minor: Clarify the comment on line 914.

The comment "Reuse same schema for stats" is slightly misleading. The `stats_mcp_query_rules` table has a different, simpler schema (just `rule_id` and `hits`) compared to the full `mcp_query_rules` schema. Consider updating the comment to better reflect this.

📝 Suggested comment improvement

 // MCP query digest stats
 insert_into_tables_defs(tables_defs_stats,"stats_mcp_query_digest", STATS_SQLITE_TABLE_MCP_QUERY_DIGEST);
 insert_into_tables_defs(tables_defs_stats,"stats_mcp_query_digest_reset", STATS_SQLITE_TABLE_MCP_QUERY_DIGEST_RESET);
-insert_into_tables_defs(tables_defs_stats,"stats_mcp_query_rules", STATS_SQLITE_TABLE_MCP_QUERY_RULES); // Reuse same schema for stats
+insert_into_tables_defs(tables_defs_stats,"stats_mcp_query_rules", STATS_SQLITE_TABLE_MCP_QUERY_RULES);

lib/MySQL_Catalog.cpp (2)
53-66: VLA is non-standard C++; prefer stack-allocated fixed buffer or std::vector.

Line 56 uses a variable-length array (VLA), which is a GCC extension and not standard C++. This can cause stack overflow for large paths.

♻️ Suggested fix

 int MySQL_Catalog::init() {
     // Initialize database connection
     db = new SQLite3DB();
-    char path_buf[db_path.size() + 1];
-    strcpy(path_buf, db_path.c_str());
-    int rc = db->open(path_buf, SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE);
+    int rc = db->open((char*)db_path.c_str(), SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE);
     if (rc != SQLITE_OK) {
476-488: Merge uses hardcoded empty schema and fixed kinds; document or parameterize.

The merge operation hardcodes the empty schema "" and only searches for the "table" and "view" kinds. This may limit flexibility for future use cases.

include/Discovery_Schema.h (2)
93-132: Mixed ownership semantics in MCP_Query_Processor_Output.

`new_query` uses `new`/`delete` while `error_msg` and `OK_msg` use `strdup`/`free`. This inconsistency increases cognitive load and bug risk. Consider unifying ownership using `std::string` or `std::unique_ptr<std::string>` for all string members.

♻️ Suggested modernization

 struct MCP_Query_Processor_Output {
-    std::string *new_query;  // Rewritten query (caller must delete)
+    std::unique_ptr<std::string> new_query;  // Rewritten query
     int timeout_ms;          // Query timeout in milliseconds (-1 = not set)
-    char *error_msg;         // Error message to return (NULL = not set)
-    char *OK_msg;            // OK message to return (NULL = not set)
+    std::string error_msg;   // Error message (empty = not set)
+    std::string OK_msg;      // OK message (empty = not set)
     int log;                 // Whether to log this query (-1 = not set, 0 = no, 1 = yes)
     int next_query_flagIN;   // Flag for next query (-1 = not set)
149-156: Consider using a typed pointer instead of void* for the digest stats map.

The `mcp_digest_umap` stores `void*`, which loses type safety. Since the value type is always `MCP_Query_Digest_Stats*`, using the explicit type would prevent casting errors.

♻️ Suggested improvement

 // MCP query digest statistics
-std::unordered_map<std::string, std::unordered_map<uint64_t, void*>> mcp_digest_umap;
+std::unordered_map<std::string, std::unordered_map<uint64_t, MCP_Query_Digest_Stats*>> mcp_digest_umap;

lib/Static_Harvester.cpp (1)
150-154: Hardcoded 30-second timeout; consider making configurable.

The connection timeouts are hardcoded. For environments with varying network latencies, this might be too short or unnecessarily long.
lib/Discovery_Schema.cpp (1)
62-75: VLA usage; same issue as MySQL_Catalog.

Line 65 uses a variable-length array. Apply the same fix as recommended for MySQL_Catalog.
// LOAD MCP QUERY RULES FROM DISK / TO MEMORY
// Copies rules from persistent storage (disk.mcp_query_rules) to working memory (main.mcp_query_rules)
if (
    (query_no_space_length == strlen("LOAD MCP QUERY RULES FROM DISK") && !strncasecmp("LOAD MCP QUERY RULES FROM DISK", query_no_space, query_no_space_length))
    ||
    (query_no_space_length == strlen("LOAD MCP QUERY RULES TO MEMORY") && !strncasecmp("LOAD MCP QUERY RULES TO MEMORY", query_no_space, query_no_space_length))
) {
    l_free(*ql,*q);
    *q=l_strdup("INSERT OR REPLACE INTO main.mcp_query_rules SELECT * FROM disk.mcp_query_rules");
    *ql=strlen(*q)+1;
    return true;
}

// SAVE MCP QUERY RULES TO DISK
// Copies rules from working memory (main.mcp_query_rules) to persistent storage (disk.mcp_query_rules)
if (
    (query_no_space_length == strlen("SAVE MCP QUERY RULES TO DISK") && !strncasecmp("SAVE MCP QUERY RULES TO DISK", query_no_space, query_no_space_length))
) {
    l_free(*ql,*q);
    *q=l_strdup("INSERT OR REPLACE INTO disk.mcp_query_rules SELECT * FROM main.mcp_query_rules");
    *ql=strlen(*q)+1;
    return true;
Prevent stale MCP query rules when syncing disk ↔ memory.
INSERT OR REPLACE ... SELECT doesn’t remove rows deleted on the source side, so obsolete rules can persist and later be re-activated on runtime loads. Clear the target table before copying (or use a dedicated flush helper) to keep disk/memory in sync.
🧩 Proposed fix (clear target before copy)

-    l_free(*ql,*q);
-    *q=l_strdup("INSERT OR REPLACE INTO main.mcp_query_rules SELECT * FROM disk.mcp_query_rules");
-    *ql=strlen(*q)+1;
-    return true;
+    ProxySQL_Admin* SPA = (ProxySQL_Admin*)pa;
+    SPA->admindb->execute("DELETE FROM main.mcp_query_rules");
+    SPA->admindb->execute("INSERT OR REPLACE INTO main.mcp_query_rules SELECT * FROM disk.mcp_query_rules");
+    SPA->send_ok_msg_to_client(sess, NULL, 0, query_no_space);
+    return false;

-    l_free(*ql,*q);
-    *q=l_strdup("INSERT OR REPLACE INTO disk.mcp_query_rules SELECT * FROM main.mcp_query_rules");
-    *ql=strlen(*q)+1;
-    return true;
+    ProxySQL_Admin* SPA = (ProxySQL_Admin*)pa;
+    SPA->admindb->execute("DELETE FROM disk.mcp_query_rules");
+    SPA->admindb->execute("INSERT OR REPLACE INTO disk.mcp_query_rules SELECT * FROM main.mcp_query_rules");
+    SPA->send_ok_msg_to_client(sess, NULL, 0, query_no_space);
+    return false;

🤖 Prompt for AI Agents
In `@lib/Admin_Handler.cpp` around lines 2362 - 2383, The current INSERT OR
REPLACE SELECT statements in the LOAD and SAVE branches (where *q is set via
l_strdup to "INSERT OR REPLACE INTO main.mcp_query_rules SELECT * FROM
disk.mcp_query_rules" and "INSERT OR REPLACE INTO disk.mcp_query_rules SELECT *
FROM main.mcp_query_rules") do not remove source-deleted rows and can resurrect
stale rules; modify the SQL assigned to *q in both branches so the target table
is cleared before copying (e.g., issue a DELETE FROM <target_table> or use a
transactional sequence "DELETE FROM <target_table>; INSERT INTO <target_table>
SELECT ...") while keeping the same surrounding logic (the blocks that test
query_no_space_length and call l_free, l_strdup, set *ql and return true) and
ensure both the LOAD (target main.mcp_query_rules) and SAVE (target
disk.mcp_query_rules) cases are updated.
char** pta = (char**)malloc(sizeof(char*) * 17);

pta[0] = strdup(std::to_string(rule->rule_id).c_str());                      // rule_id
pta[1] = strdup(std::to_string(rule->active ? 1 : 0).c_str());               // active
pta[2] = rule->username ? strdup(rule->username) : NULL;                     // username
pta[3] = rule->schemaname ? strdup(rule->schemaname) : NULL;                 // schemaname
pta[4] = rule->tool_name ? strdup(rule->tool_name) : NULL;                   // tool_name
pta[5] = rule->match_pattern ? strdup(rule->match_pattern) : NULL;           // match_pattern
pta[6] = strdup(std::to_string(rule->negate_match_pattern ? 1 : 0).c_str()); // negate_match_pattern
pta[7] = strdup(std::to_string(rule->re_modifiers).c_str());                 // re_modifiers
pta[8] = strdup(std::to_string(rule->flagIN).c_str());                       // flagIN
pta[9] = strdup(std::to_string(rule->flagOUT).c_str());                      // flagOUT
pta[10] = rule->replace_pattern ? strdup(rule->replace_pattern) : NULL;      // replace_pattern
pta[11] = strdup(std::to_string(rule->timeout_ms).c_str());                  // timeout_ms
pta[12] = rule->error_msg ? strdup(rule->error_msg) : NULL;                  // error_msg
pta[13] = rule->ok_msg ? strdup(rule->ok_msg) : NULL;                        // OK_msg
pta[14] = strdup(std::to_string(rule->log ? 1 : 0).c_str());                 // log
pta[15] = strdup(std::to_string(rule->apply ? 1 : 0).c_str());               // apply
pta[16] = rule->comment ? strdup(rule->comment) : NULL;                      // comment

result->add_row(pta);

// Free the row data
for (int j = 0; j < 17; j++) {
    if (pta[j]) {
        free(pta[j]);
    }
}
free(pta);
}
No null checks after malloc/strdup allocations.
If malloc or strdup returns NULL due to memory exhaustion, the subsequent operations will cause a crash. While rare, this is flagged by static analysis.
Consider adding null checks or using a wrapper that handles allocation failures consistently across the codebase.
🧰 Tools
🪛 Cppcheck (2.19.0)
[warning] lines 2657-2673, 2679: If memory allocation fails, then there is a possible null pointer dereference (nullPointerOutOfMemory)
🤖 Prompt for AI Agents
In `@lib/Discovery_Schema.cpp` around lines 2655 - 2684, The code allocates pta
with malloc and many strdup calls (symbols: pta, malloc, strdup,
result->add_row) without checking for NULL; add checks after malloc and after
each strdup (or use a helper that allocates and duplicates safely) so that on
allocation failure you free any already-allocated entries, free pta, and avoid
calling result->add_row (return an error or handle gracefully). Specifically:
verify pta != NULL immediately after malloc, check each strdup result before
storing it into pta (or wrap duplication to return NULL on failure and
propagate), on any allocation failure free any non-NULL pta[i] and pta itself,
and only call result->add_row when all required allocations succeeded.
if (qpo->timeout_ms > 0) {
    timeout_sec = qpo->timeout_ms / 1000;
}
Integer division truncates sub-second timeouts to zero.
The conversion qpo->timeout_ms / 1000 truncates values less than 1000ms to 0 seconds. If a rule specifies a 500ms timeout, timeout_sec becomes 0, which may cause unexpected behavior depending on how zero is handled downstream.
Consider rounding up to ensure sub-second timeouts are at least 1 second:
Suggested fix
// Apply timeout if provided
if (qpo->timeout_ms > 0) {
- timeout_sec = qpo->timeout_ms / 1000;
+ timeout_sec = (qpo->timeout_ms + 999) / 1000; // Round up to at least 1 sec
}

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
if (qpo->timeout_ms > 0) {
    timeout_sec = (qpo->timeout_ms + 999) / 1000; // Round up to at least 1 sec
}
🤖 Prompt for AI Agents
In `@lib/Query_Tool_Handler.cpp` around lines 1635 - 1637, The conversion of
qpo->timeout_ms to timeout_sec uses integer division which truncates sub-second
values to zero (e.g., 500ms => 0); change the conversion logic in
Query_Tool_Handler.cpp around the timeout handling (variable timeout_sec and
qpo->timeout_ms) to round up instead of truncating so any non-zero millisecond
timeout becomes at least 1 second (e.g., replace the direct /1000 integer
division with a ceil-style conversion such as (timeout_ms + 999)/1000 or use
floating-point division plus ceil) and keep existing behavior when timeout_ms is
0 or negative.
Fixed 6 legitimate issues from AI code review: Critical fixes: - Implement stats___mcp_query_digest to populate table with actual data - Fix double trigger bug (stats_mcp_query_digest_reset substring matching) Important fixes: - Fix re_modifiers parsing (VARCHAR "CASELESS" -> int 1) - Fix TOCTOU race condition in reset path (use write lock from start) - Add column count validation before accessing row fields Documentation: - Add memory ownership documentation for evaluate_mcp_query_rules False positives ignored (Issue 2: Schema mismatch, Issue 6: SQL injection)
@coderabbitai review

✅ Actions performed

Review triggered.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@lib/ProxySQL_Admin_Stats.cpp`:
- Around line 2605-2625: The current loop builds SQL via sprintf into query
(using template a) which allows quotes/JSON in digest_text (and other
r->fields[]) to break SQL or permit injection; replace this with a prepared
INSERT into stats_mcp_query_digest and bind the 10 values from
r->fields[0]..r->fields[9] for each SQLite3_row instead of sprintf/malloc.
Locate the loop using SQLite3_row* r and statsdb->execute and change it to
prepare a parameterized statement once (e.g., "INSERT INTO
stats_mcp_query_digest VALUES (?,?,?,?,?,?,?,?,?,?)"), bind each r->fields[i]
per row, execute/reset the statement for the next row, and remove the
sprintf/malloc usage to eliminate injection and quoting issues.
- Around line 2594-2606: The reset path deletes from
stats_mcp_query_digest_reset but the subsequent insert always writes into
stats_mcp_query_digest; update the insert target based on the reset flag so the
reset branch re-populates stats_mcp_query_digest_reset. Concretely, change how
the INSERT string (variable a) is chosen before the loop in the block that calls
statsdb->execute and iterates resultset->rows: if reset is true set a to an
INSERT INTO stats_mcp_query_digest_reset ... string, otherwise keep the INSERT
INTO stats_mcp_query_digest ... string, then use that chosen string inside the
insertion loop so inserts go to the correct table.
In `@lib/ProxySQL_Admin.cpp`:
- Around line 7789-7793: The cleanup code is calling free(resultset) on an
object returned by execute_statement (an instance of SQLite3_result), which is
undefined behavior; replace that call with the appropriate C++ delete (i.e.,
delete resultset) and ensure any other cleanup paths that free resultset are
updated similarly so SQLite3_result is destroyed properly (look for the
resultset variable and usages around execute_statement and the success==false
branch in ProxySQL_Admin.cpp).
♻️ Duplicate comments (8)
lib/ProxySQL_Admin_Stats.cpp (1)
2662-2673: Same concern as prior review: raw sprintf with unescaped fields.

This is the same SQL construction pattern that was previously flagged. Please use prepared statements or escape values to avoid injection/breakage.
lib/Discovery_Schema.cpp (7)
34-48: Duplicate cleanup logic with load_mcp_query_rules.

This cleanup block is nearly identical to lines 2373-2387 in `load_mcp_query_rules`. Consider extracting to a private helper function (e.g., `clear_mcp_query_rules()`) to eliminate duplication and reduce maintenance risk.
351-372: Schema mismatch between Discovery_Schema and admin tables.

This `mcp_query_rules` table schema differs from the admin table definitions (it includes `run_id`, `action`, and `hits` columns not present in admin tables). If rules are loaded from admin tables via `load_mcp_query_rules`, the column layouts won't align. Verify whether this table is actually used or if it should be removed/synchronized with the admin table definitions.
2529-2533: Username matching is non-functional.

Rules with `username` specified are always skipped since username context isn't available in MCP. This makes the `username` field in `mcp_query_rules` effectively unusable. The TODO should be tracked as a follow-up issue.
2632-2635: Redundant rule->apply check.

At line 2633, `rule->apply` is guaranteed to be true because log-only rules (`!rule->apply`) already `continue` at line 2575. The condition is logically redundant.

♻️ Suggested simplification

-    // If apply is true and not a log-only rule, stop processing further rules
-    if (rule->apply) {
-        break;
-    }
+    // Stop processing further rules (log-only rules already continued above)
+    break;
2865-2865: Confusing historical comment.

The comment mentions "digest_text was duplicated", which references a past bug fix rather than describing the current state. Consider simplifying to just document the current column count.

📝 Suggested simplification

-// Define columns (10 columns, not 11 - digest_text was duplicated)
+// Define columns for MCP query digest statistics (10 columns)
2681-2710: No null checks after malloc/strdup allocations.

If `malloc` or `strdup` returns NULL due to memory exhaustion, subsequent operations will cause undefined behavior. While OOM is rare, the static analyzer flags this consistently across the function. Consider adding a null check after `malloc` or using a wrapper that handles allocation failures consistently.
3000-3001: Incorrect example in documentation.

The example output {"filters":{"?":"?","?":"?"}} is wrong. The code at line 3017 uses `it.key()`, which preserves object keys, so the correct output is {"filters":{"status":"?","age":"?"}}.

📝 Suggested fix

-// Output: {"filters":{"?":"?","?":"?"}}
+// Output: {"filters":{"status":"?","age":"?"}}
statsdb->execute("BEGIN");

if (reset) {
    statsdb->execute("DELETE FROM stats_mcp_query_digest_reset");
} else {
    statsdb->execute("DELETE FROM stats_mcp_query_digest");
}

// Insert digest statistics into the stats table
// Columns: tool_name, run_id, digest, digest_text, count_star,
//          first_seen, last_seen, sum_time, min_time, max_time
char* a = (char*)"INSERT INTO stats_mcp_query_digest VALUES (\"%s\",\"%s\",\"%s\",\"%s\",\"%s\",\"%s\",\"%s\",\"%s\",\"%s\",\"%s\")";
for (std::vector<SQLite3_row*>::iterator it = resultset->rows.begin(); it != resultset->rows.end(); ++it) {
Reset path inserts into the wrong table.
When reset is true you delete stats_mcp_query_digest_reset, but the insert always targets stats_mcp_query_digest. That leaves the reset table empty and overwrites the non‑reset view on reset reads.
🛠️ Suggested fix
- char* a = (char*)"INSERT INTO stats_mcp_query_digest VALUES (\"%s\",\"%s\",\"%s\",\"%s\",\"%s\",\"%s\",\"%s\",\"%s\",\"%s\",\"%s\")";
+ const char* table = reset ? "stats_mcp_query_digest_reset" : "stats_mcp_query_digest";
+ std::string a = std::string("INSERT INTO ") + table +
+ " VALUES (\"%s\",\"%s\",\"%s\",\"%s\",\"%s\",\"%s\",\"%s\",\"%s\",\"%s\",\"%s\")";
...
- char* query = (char*)malloc(strlen(a) + arg_len + 32);
- sprintf(query, a,
+ char* query = (char*)malloc(a.size() + arg_len + 32);
+ sprintf(query, a.c_str(),
r->fields[0], r->fields[1], r->fields[2], r->fields[3], r->fields[4],
        r->fields[5], r->fields[6], r->fields[7], r->fields[8], r->fields[9]);

🤖 Prompt for AI Agents
In `@lib/ProxySQL_Admin_Stats.cpp` around lines 2594 - 2606, The reset path
deletes from stats_mcp_query_digest_reset but the subsequent insert always
writes into stats_mcp_query_digest; update the insert target based on the reset
flag so the reset branch re-populates stats_mcp_query_digest_reset. Concretely,
change how the INSERT string (variable a) is chosen before the loop in the block
that calls statsdb->execute and iterates resultset->rows: if reset is true set a
to an INSERT INTO stats_mcp_query_digest_reset ... string, otherwise keep the
INSERT INTO stats_mcp_query_digest ... string, then use that chosen string
inside the insertion loop so inserts go to the correct table.
char* a = (char*)"INSERT INTO stats_mcp_query_digest VALUES (\"%s\",\"%s\",\"%s\",\"%s\",\"%s\",\"%s\",\"%s\",\"%s\",\"%s\",\"%s\")";
for (std::vector<SQLite3_row*>::iterator it = resultset->rows.begin(); it != resultset->rows.end(); ++it) {
	SQLite3_row* r = *it;
	int arg_len = 0;
	for (int i = 0; i < 10; i++) {
		arg_len += strlen(r->fields[i]);
	}
	char* query = (char*)malloc(strlen(a) + arg_len + 32);
	sprintf(query, a,
		r->fields[0], // tool_name
		r->fields[1], // run_id
		r->fields[2], // digest
		r->fields[3], // digest_text
		r->fields[4], // count_star
		r->fields[5], // first_seen
		r->fields[6], // last_seen
		r->fields[7], // sum_time
		r->fields[8], // min_time
		r->fields[9]  // max_time
	);
	statsdb->execute(query);
Avoid raw sprintf inserts for digest rows.
digest_text and other fields can contain quotes/JSON, which will break the SQL and can enable injection into the stats DB. Use a prepared statement with bound parameters (as you already do in other stats functions).
🛠️ Suggested approach (prepared statement)
- char* a = (char*)"INSERT INTO stats_mcp_query_digest VALUES (\"%s\",\"%s\",\"%s\",\"%s\",\"%s\",\"%s\",\"%s\",\"%s\",\"%s\",\"%s\")";
- for (...) {
- ...
- sprintf(query, a, ...);
- statsdb->execute(query);
- free(query);
- }
+ const char* sql = reset
+ ? "INSERT INTO stats_mcp_query_digest_reset VALUES (?1,?2,?3,?4,?5,?6,?7,?8,?9,?10)"
+ : "INSERT INTO stats_mcp_query_digest VALUES (?1,?2,?3,?4,?5,?6,?7,?8,?9,?10)";
+ sqlite3_stmt* stmt = nullptr;
+ int rc = statsdb->prepare_v2(sql, &stmt);
+ ASSERT_SQLITE_OK(rc, statsdb);
+ for (SQLite3_row* r : resultset->rows) {
+ rc = (*proxy_sqlite3_bind_text)(stmt, 1, r->fields[0], -1, SQLITE_TRANSIENT); ASSERT_SQLITE_OK(rc, statsdb);
+ rc = (*proxy_sqlite3_bind_text)(stmt, 2, r->fields[1], -1, SQLITE_TRANSIENT); ASSERT_SQLITE_OK(rc, statsdb);
+ rc = (*proxy_sqlite3_bind_text)(stmt, 3, r->fields[2], -1, SQLITE_TRANSIENT); ASSERT_SQLITE_OK(rc, statsdb);
+ rc = (*proxy_sqlite3_bind_text)(stmt, 4, r->fields[3], -1, SQLITE_TRANSIENT); ASSERT_SQLITE_OK(rc, statsdb);
+ rc = (*proxy_sqlite3_bind_text)(stmt, 5, r->fields[4], -1, SQLITE_TRANSIENT); ASSERT_SQLITE_OK(rc, statsdb);
+ rc = (*proxy_sqlite3_bind_text)(stmt, 6, r->fields[5], -1, SQLITE_TRANSIENT); ASSERT_SQLITE_OK(rc, statsdb);
+ rc = (*proxy_sqlite3_bind_text)(stmt, 7, r->fields[6], -1, SQLITE_TRANSIENT); ASSERT_SQLITE_OK(rc, statsdb);
+ rc = (*proxy_sqlite3_bind_text)(stmt, 8, r->fields[7], -1, SQLITE_TRANSIENT); ASSERT_SQLITE_OK(rc, statsdb);
+ rc = (*proxy_sqlite3_bind_text)(stmt, 9, r->fields[8], -1, SQLITE_TRANSIENT); ASSERT_SQLITE_OK(rc, statsdb);
+ rc = (*proxy_sqlite3_bind_text)(stmt,10, r->fields[9], -1, SQLITE_TRANSIENT); ASSERT_SQLITE_OK(rc, statsdb);
+ SAFE_SQLITE3_STEP2(stmt);
+ rc = (*proxy_sqlite3_clear_bindings)(stmt); ASSERT_SQLITE_OK(rc, statsdb);
+ rc = (*proxy_sqlite3_reset)(stmt); ASSERT_SQLITE_OK(rc, statsdb);
+ }
+ (*proxy_sqlite3_finalize)(stmt);

🤖 Prompt for AI Agents
In `@lib/ProxySQL_Admin_Stats.cpp` around lines 2605-2625, the current loop
builds SQL via sprintf into query (using template a) which allows quotes/JSON in
digest_text (and other r->fields[]) to break SQL or permit injection; replace
this with a prepared INSERT into stats_mcp_query_digest and bind the 10 values
from r->fields[0]..r->fields[9] for each SQLite3_row instead of sprintf/malloc.
Locate the loop using SQLite3_row* r and statsdb->execute and change it to
prepare a parameterized statement once (e.g., "INSERT INTO
stats_mcp_query_digest VALUES (?,?,?,?,?,?,?,?,?,?)"), bind each r->fields[i]
per row, execute/reset the statement for the next row, and remove the
sprintf/malloc usage to eliminate injection and quoting issues.
if (success == false) {
	if (resultset) {
		free(resultset);
	}
}
Use delete for SQLite3_result cleanup.
execute_statement returns a C++ object; freeing it with free is undefined behavior.
🐛 Proposed fix
- if (resultset) {
- free(resultset);
- }
+ if (resultset) {
+ delete resultset;
+ }
🤖 Prompt for AI Agents
In `@lib/ProxySQL_Admin.cpp` around lines 7789-7793, the cleanup code is calling
free(resultset) on an object returned by execute_statement (an instance of
SQLite3_result), which is undefined behavior; replace that call with the
appropriate C++ delete (i.e., delete resultset) and ensure any other cleanup
paths that free resultset are updated similarly so SQLite3_result is destroyed
properly (look for the resultset variable and usages around execute_statement
and the success==false branch in ProxySQL_Admin.cpp).
- Add special build rule in Makefile for anomaly_detection-t that includes:
  - $(OBJ) for global variables (GloVars, GloGATH)
  - -Wl,--allow-multiple-definition to allow test's main() to override ProxySQL's
  - ClickHouse client libraries (libclickhouse-cpp-lib.a, libzstdstatic.a, liblz4.a)
  - SQLite rembed library (libsqlite_rembed.a)
  - -lscram -lusual for PostgreSQL SCRAM support
- Add stub function SQLite3_Server_session_handler required by SQLite3_Server.cpp

Resolves compilation errors for anomaly_detection-t test.
- Add #include <string> for C++ std::string support
- Add #include <cmath> for sqrt() function
- Change format %lld to %ld for chrono duration types (long int, not long long)

Resolves compilation errors for vector_db_performance-t test.
- Add #include <string> for C++ std::string support

Resolves compilation errors for ai_llm_retry_scenarios-t test.
This commit addresses valid concerns raised by coding agents (Gemini, Copilot, CoderabbitAI):

1. Fix stats_mcp_query_digest naming conflict (ProxySQL_Admin.cpp):
   - Made reset and non-reset paths mutually exclusive using else block
   - Prevents both flags from being true, matching MySQL pattern
   - Ensures reset takes precedence over non-reset

2. Fix INSERT OR REPLACE sync issue (Admin_Handler.cpp):
   - Added DELETE before INSERT OR REPLACE in LOAD/SAVE MCP QUERY RULES
   - Prevents stale rules from persisting when syncing disk <-> memory
   - Ensures deleted source rows are also removed from target

3. Fix integer division truncation for timeout (Query_Tool_Handler.cpp):
   - Changed timeout_ms/1000 to (timeout_ms+999)/1000 for ceiling division
   - Ensures sub-second timeouts (e.g., 500ms) become at least 1 second
   - Prevents zero-second timeouts from causing unexpected behavior

4. Remove confusing comment (Discovery_Schema.cpp):
   - Simplified column count comment to be clear and accurate

Note: The re_modifiers parsing code already correctly handles VARCHAR "CASELESS" to int conversion (lines 2414-2425), so that review comment was already addressed.
Thank you to the three coding agents for the thorough code review!

Summary of Fixes

I've addressed the valid concerns raised by @gemini-code-assist[bot], @copilot, and @coderabbitai[bot] in commit :

Issues Fixed:
Already Correct:
Items Requiring Further Discussion (deferred to future work):

Some excellent suggestions were noted for future enhancements:
These are good ideas for follow-up PRs to improve code quality and robustness. Thanks again for the thoughtful reviews! 🙏
@renecannao I've opened a new pull request, #24, to work on those changes. Once the pull request is ready, I'll request review from you.
Hi @renecannao, Thank you for the detailed summary of the fixes and for addressing the feedback from the review! It's great to see these improvements implemented. I particularly appreciate the fixes for:
I also agree that the suggestions regarding

Thanks again for your thorough work on this pull request! 🙏
Your fixes look solid:
Great to hear the re_modifiers parsing was already handling VARCHAR-to-int conversion correctly! Deferring the enhancement suggestions (smart pointers, NULL checks, helper extraction, TODO completion) to future PRs is a pragmatic approach that keeps this PR focused. Those are excellent candidates for follow-up quality improvements. Thanks for being so responsive to the feedback! If you'd like me to review the new commit with these fixes, just let me know. 🙏
Summary
Implement firewall functionality and query digest statistics for MCP (Model Context Protocol) queries in ProxySQL, mirroring the existing MySQL query rules and statistics infrastructure.
Key Design
Action is inferred from rule properties (like MySQL/PostgreSQL query rules):
- error_msg != NULL → block
- replace_pattern != NULL → rewrite
- timeout_ms > 0 → timeout
- OK_msg != NULL → return OK message

Actions are NOT mutually exclusive - a single rule can perform multiple actions simultaneously.
Changes
- mcp_query_rules table for firewall rules (17 columns)
- runtime_mcp_query_rules table (same 17 columns, shows in-memory state)
- stats_mcp_query_rules table with rule_id and hits
- stats_mcp_query_digest table for query statistics (10 columns)

Tables Added
- mcp_query_rules - Config table for defining rules
- runtime_mcp_query_rules - Runtime view of active rules
- stats_mcp_query_rules - Hit counters per rule
- stats_mcp_query_digest - Query digest statistics
- stats_mcp_query_digest_reset - Reset-on-read stats

Commands Added
- LOAD MCP QUERY RULES FROM DISK
- LOAD MCP QUERY RULES TO MEMORY
- LOAD MCP QUERY RULES TO RUNTIME
- SAVE MCP QUERY RULES TO DISK
- SAVE MCP QUERY RULES TO MEMORY
- SAVE MCP QUERY RULES FROM RUNTIME

Documentation
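Since the commands mirror the MySQL query rules workflow, a typical admin-interface session might look like the following sketch (the INSERT column list is illustrative, not the full 17-column schema):

```sql
-- Define a blocking rule in the in-memory config table
INSERT INTO mcp_query_rules (rule_id, active, error_msg)
    VALUES (1, 1, 'blocked by MCP firewall');

LOAD MCP QUERY RULES TO RUNTIME;   -- activate the rule in memory
SAVE MCP QUERY RULES TO DISK;      -- persist it across restarts

SELECT * FROM runtime_mcp_query_rules; -- verify the active rule set
SELECT * FROM stats_mcp_query_rules;   -- per-rule hit counters
```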
This PR also includes comprehensive documentation for:
Test plan