📝 Walkthrough

This PR comprehensively hardens error handling and security across the codebase by adding explicit GitHub Actions permissions, standardizing error responses to avoid exposing exception details, improving server-side logging with proper traceback capture, hardening file path validation and plugin argument whitelisting, and refactoring SQL execution engine exception handling for consistency.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (12)
.github/workflows/django.yml (2)
95-102: ⚠️ Potential issue | 🔴 Critical

Critical: Remove hardcoded `SECRET_KEY` from the GitHub Actions workflow.

The workflow sets `env: SECRET_KEY: XViQ1NfutP1RraqhHV3M3akMsB3JWZr9`, which is effectively a leaked secret: it will trigger many security scans (CodeQL/hardcoded-secrets) and is a compliance risk, even if it is "just for CI". Use a GitHub Actions secret instead.

🔧 Suggested fix
```diff
       - name: Run Tests
         env:
-          SECRET_KEY: XViQ1NfutP1RraqhHV3M3akMsB3JWZr9
+          SECRET_KEY: ${{ secrets.DJANGO_SECRET_KEY }}
         run: |
           python manage.py makemigrations
           python manage.py makemigrations sql
           pytest --cov --cov-report xml
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/django.yml around lines 95 - 102, The workflow currently hardcodes SECRET_KEY in the "Run Tests" job which leaks a secret; update the env usage for the Run Tests step to read the key from GitHub Actions secrets (e.g. use secrets.SECRET_KEY) instead of the literal value, or remove the env entry if tests do not require it; ensure the repository secret is added in Settings > Secrets before referencing it and update any references in the workflow that rely on SECRET_KEY accordingly.
70-76: ⚠️ Potential issue | 🟠 Major

Major: Add a timeout/attempt limit to the MySQL readiness loop.

The `until docker exec mysql mysqladmin ping ...; do ... sleep 2; done` loop has no upper bound. If MySQL never becomes ready, the workflow can hang indefinitely, consuming CI time and masking the real failure mode. Add a max-attempts/timeout check and fail with logs.

🔧 Suggested fix
```diff
           docker cp my.cnf mysql:/etc/mysql/conf.d/
           docker restart mysql
-          until docker exec mysql mysqladmin ping -h127.0.0.1 -uroot -proot --silent; do
+          max_attempts=30
+          attempt=1
+          until docker exec mysql mysqladmin ping -h127.0.0.1 -uroot -proot --silent; do
             if [ "$(docker inspect -f '{{.State.Running}}' mysql 2>/dev/null)" != "true" ]; then
               docker logs mysql
               exit 1
             fi
+            if [ "$attempt" -ge "$max_attempts" ]; then
+              echo "MySQL did not become ready in time."
+              docker logs mysql
+              exit 1
+            fi
+            attempt=$((attempt+1))
             sleep 2
           done
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/django.yml around lines 70 - 76, The readiness loop that runs "until docker exec mysql mysqladmin ping -h127.0.0.1 -uroot -proot --silent; do ... sleep 2 done" has no timeout and can hang indefinitely; modify this loop to introduce a max attempts counter or overall timeout (e.g., MAX_ATTEMPTS and an increment each iteration) and on exceeding it print docker logs for the "mysql" container and exit non‑zero; implement the counter/check within the same block that currently checks container running state so the workflow fails fast with diagnostic logs when MySQL never becomes ready.

common/check.py (5)
177-182: ⚠️ Potential issue | 🟠 Major

Major: exception logging includes `str(e)` (possible sensitive details), even though the client message is generic.

In `validate_file_storage_payload()`, the client response is generic (line 179), but the server log includes the exception string:

- Lines 180-181: `error_msg = f"Connection test exception: {str(e)}"` and `logging.error(..., exc_info=True)`

If `str(e)` can contain hostnames, credentials, query text, or other sensitive data, this can reintroduce the same alert surface the PR is trying to reduce. Prefer fixed log messages and rely on `exc_info=True` for stack traces (without interpolating `str(e)`).

🛠️ Proposed fix
```diff
     except Exception as e:
         result["status"] = 1
         result["msg"] = "Storage connectivity test raised an exception."
-        error_msg = f"Connection test exception: {str(e)}"
-        logging.error(error_msg, exc_info=True)
+        logger.error("Storage connectivity test raised an exception.", exc_info=True)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@common/check.py` around lines 177 - 182, In validate_file_storage_payload(), avoid interpolating the exception string into the log; change the logging call that currently builds error_msg using str(e) to a fixed, non-sensitive message (e.g. "Storage connectivity test failed") and pass exc_info=True so the stack trace is recorded without embedding potentially sensitive exception content; keep the generic client-facing result["msg"] and result["status"] as-is.
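The pattern this item asks for can be sketched in isolation (a minimal stand-in, not the project's actual `validate_file_storage_payload`):

```python
import logging

logger = logging.getLogger(__name__)

def check_storage(connect):
    # Hypothetical helper: the client sees only a fixed message, while the
    # full traceback (including the original error text) goes to server logs
    # via exc_info=True -- no str(e) interpolation anywhere.
    result = {"status": 0, "msg": "ok"}
    try:
        connect()
    except Exception:
        result["status"] = 1
        result["msg"] = "Storage connectivity test raised an exception."
        logger.error("Storage connectivity test failed.", exc_info=True)
    return result
```

Even if the underlying exception message contains credentials, they never reach the API response.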
91-117: ⚠️ Potential issue | 🟠 Major

Major: email test returns `sender_response` directly to the client.

In `validate_email_payload()`, when `sender_response != "success"` you set:

- Line 115: `result["msg"] = sender_response`

If `sender_response` contains SMTP/server details or exception text, this again risks leaking sensitive information to API callers. Return a generic message and keep any diagnostic detail server-side (ideally not including raw exception strings either).

🛠️ Proposed fix
```diff
         sender_response = sender.send_email(subj, bd, [user_email])
         if sender_response != "success":
             result["status"] = 1
-            result["msg"] = sender_response
+            result["msg"] = "Email delivery test failed."
             return result
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@common/check.py` around lines 91 - 117, In validate_email_payload(), avoid returning raw sender_response to the client when sender_response != "success" because it may contain sensitive SMTP/exception details; instead set result["msg"] to a generic failure message (e.g., "Email delivery failed; check server logs") and record the detailed sender_response in a server-side log or diagnostics (use the module logger or error-tracking facility) linked to the send operation (sender.send_email / sender_response) for debugging. Ensure result["status"] remains non-zero and do not include sender_response in the API response.
37-75: ⚠️ Potential issue | 🔴 Critical

Critical: client responses still leak raw exception details (`str(e)`) for DB connectivity.

In `validate_go_inception_payload()`, both failure paths build `result["msg"]` with `str(e)`:

- Line 50: `"Unable to connect to goInception\n{}".format(str(e))`
- Lines 69-71: the backup-DB equivalent

Even if you are already logging `traceback.format_exc()` server-side, returning exception text to the API is an information disclosure risk (and can trigger CodeQL alerts depending on sink type). Prefer fixed, user-safe messages and keep detailed exceptions in server logs only.

🛠️ Proposed fix
```diff
 def validate_go_inception_payload(payload):
     result = {"status": 0, "msg": "ok", "data": []}
@@
     except Exception as e:
         logger.error(traceback.format_exc())
         result["status"] = 1
-        result["msg"] = "Unable to connect to goInception\n{}".format(str(e))
+        result["msg"] = "Unable to connect to goInception."
         return result
@@
     except Exception as e:
         logger.error(traceback.format_exc())
         result["status"] = 1
-        result["msg"] = "Unable to connect to goInception backup database\n{}".format(
-            str(e)
-        )
+        result["msg"] = "Unable to connect to goInception backup database."
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@common/check.py` around lines 37 - 75, The code in validate_go_inception_payload builds result["msg"] using str(e) after MySQLdb.connect failures, leaking exception details; change both places (the exception handlers around MySQLdb.connect calls) to set result["msg"] to a fixed, user-safe string (e.g., "Unable to connect to goInception" and "Unable to connect to goInception backup database") without including str(e), while keeping logger.error(traceback.format_exc()) as the place that records full error details; update the exception blocks for the first connect (around MySQLdb.connect / cur = conn.cursor()) and the second backup connect to remove the inclusion of str(e) in the returned message.
200-218: ⚠️ Potential issue | 🟠 Major

Major: the `instance()` endpoint still returns `test_result.error`/`str(e)` to the client, and `Instance.objects.get()` isn't protected.

Issues:

- Line 205: `instance = Instance.objects.get(id=instance_id)` can raise if the ID is missing/invalid; not caught.
- Lines 211-213: returns `test_result.error` directly.
- Lines 216-217: returns `str(e)` directly.

To match the PR's security intent, make client messages generic and handle a missing instance cleanly. Log details server-side only.
🛠️ Proposed fix (sketch)
```diff
 def instance(request):
     result = {"status": 0, "msg": "ok", "data": []}
     instance_id = request.POST.get("instance_id")
-    instance = Instance.objects.get(id=instance_id)
+    try:
+        instance = Instance.objects.get(id=instance_id)
+    except Instance.DoesNotExist:
+        result["status"] = 1
+        result["msg"] = "Instance not found."
+        return HttpResponse(json.dumps(result), content_type="application/json")
     try:
         engine = get_engine(instance=instance)
         test_result = engine.test_connection()
         if test_result.error:
             result["status"] = 1
-            result["msg"] = "Unable to connect to instance,\n{}".format(
-                test_result.error
-            )
+            result["msg"] = "Unable to connect to instance."
     except Exception as e:
         result["status"] = 1
-        result["msg"] = "Unable to connect to instance,\n{}".format(str(e))
+        result["msg"] = "Unable to connect to instance."
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@common/check.py` around lines 200 - 218, The instance() view currently exposes internal error strings and calls Instance.objects.get(id=instance_id) without guarding against missing/invalid IDs; change it to validate and safely fetch the instance (catch Instance.DoesNotExist and invalid ID cases) and always return a generic client message like "Unable to connect to instance" or "Instance not found" instead of test_result.error or str(e). Log full details server-side (use Python logging and logging.exception or logger.error with the exception) when get_engine(instance=instance) or engine.test_connection() fails, and when Instance.objects.get fails, but ensure the HttpResponse JSON only contains non-sensitive generic messages and status codes. Keep references: instance(), Instance.objects.get, get_engine(...), engine.test_connection().
120-165: ⚠️ Potential issue | 🟠 Major

Major correctness/security gap: `max_export_rows` is validated but not used, and `storage_type` is not validated/allowlisted here.

Observations:

- Lines 124-132: `max_export_rows` is parsed/validated, but it never appears in `config_dict` (lines 144-164). If `DynamicStorage` expects it, the current behavior may silently ignore the limit.
- Lines 122-136: `storage_type = payload.get("storage_type")` and then `custom_param_key = f"{storage_type}_custom_params"` without checking that `storage_type` is present/expected. Depending on how `DynamicStorage` uses `storage_type` (especially given the PR objective about plugin command execution), this could be a hardening gap.

Recommend either:

- include `max_export_rows` in `config_dict` (if intended), and/or
- add an explicit `storage_type` allowlist + presence check before constructing the config.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@common/check.py` around lines 120 - 165, The validate_file_storage_payload function currently validates max_export_rows but never includes it in the returned config and also uses storage_type without presence/allowlist checks; update validate_file_storage_payload to (1) require and validate storage_type against an allowlist (e.g., check storage_type is one of expected values before using custom_param_key and return an error if missing/invalid) and (2) include the validated max_export_rows (converted to int) in config_dict (e.g., add "max_export_rows": int(max_export_rows)) so DynamicStorage receives the limit; reference validate_file_storage_payload, storage_type, custom_param_key, max_export_rows, config_dict, and DynamicStorage when making changes.

api_admin/settings.py (1)
359-363: ⚠️ Potential issue | 🟡 Minor

Use `raise ... from None` to properly suppress the exception chain.

The change correctly hides the original `ValueError` message from clients, but it should use exception chaining syntax to avoid implicit exception context. This prevents the original exception from being unnecessarily preserved in the traceback.

```diff
     def validate_gh_ost(self, value):
         try:
             return validate_binary_path(value, "gh-ost")
         except ValueError as exc:
-            raise serializers.ValidationError("Operation failed.")
+            raise serializers.ValidationError("Operation failed.") from None
```

As per coding guidelines, use Black for Python formatting. Additionally, per Ruff B904: within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api_admin/settings.py` around lines 359 - 363, In validate_gh_ost, when catching the ValueError from validate_binary_path, re-raise the serializers.ValidationError using exception chaining suppression (i.e., "raise serializers.ValidationError('Operation failed.') from None") so the original ValueError context isn't preserved; update the except block in the validate_gh_ost method accordingly and run Black/ruff to ensure formatting and B904 compliance.

sql/resource_group.py (1)
3-3: ⚠️ Potential issue | 🟡 Minor

Remove the unused `traceback` import.

The `traceback` module is imported on line 3 but is not used anywhere in the file. This import should be removed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sql/resource_group.py` at line 3, Remove the unused import of the traceback module: delete the line that reads "import traceback" so there are no unused imports left in this module (ensure no other code references traceback before removing). Run linters/tests to confirm no further import warnings.

sql/engines/doris.py (1)
175-189: ⚠️ Potential issue | 🔴 Critical

Confirm that raw exception messages in `errormessage` are exposed to API clients and need sanitization.

The `errormessage` field containing `f"Exception: {e}"` is confirmed to be exposed to end users. The `ReviewSet.to_dict()` method serializes all `ReviewResult.__dict__` attributes without filtering, and this data flows directly to API responses via the `execute_rows` field. The same unsanitized exception pattern exists across multiple engine implementations (redis.py, phoenix.py, oracle.py, mssql.py, pgsql.py, cassandra.py), while offlinedownload.py demonstrates safer, controlled error messages. Apply consistent sanitization across all affected engine files.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sql/engines/doris.py` around lines 175 - 189, The except block currently injects raw exception text into execute_result.error and ReviewResult.errormessage (e.g., f"Exception: {e}") which is serialized to API clients; change the behavior to log the full exception internally (use logger.exception or logger.error with exc_info=True) but replace the public fields with a sanitized message (e.g., "Execution failed" or "Internal execution error") and optionally a short, non-sensitive error code/ID; update the except handling in doris.py (the except block that sets execute_result.error and creates ReviewResult) to stop exposing str(e) and the f"Exception: {e}" string, and apply the same change to the equivalent except blocks in redis.py, phoenix.py, oracle.py, mssql.py, pgsql.py, cassandra.py so all engine implementations consistently sanitize errors returned to clients while retaining full exception details in logs.

api_instances/views.py (1)
957-983: ⚠️ Potential issue | 🟠 Major

Don't derive the export path from the escaped database name.

For single-db exports, the file is written using `query_engine.escape_string(db_name)` but reopened using the raw `db_name`. Names containing escapable characters will be written to one path and downloaded from another, and the escaped value is also the wrong thing to feed into `get_tables_metas_data()` when that method already parameterizes internally.

Suggested fix
```diff
     try:
         query_engine = get_engine(instance=instance)
         if db_name:
-            databases = [query_engine.escape_string(db_name)]
+            databases = [{"raw": db_name, "query": db_name}]
         elif request.user.is_superuser:
-            databases = query_engine.get_all_databases().rows
+            databases = [
+                {"raw": database, "query": database}
+                for database in query_engine.get_all_databases().rows
+            ]
         else:
             raise PermissionDenied(
                 "Only admins can export dictionary data for a full instance."
             )
@@
-        for database in databases:
-            table_metas = query_engine.get_tables_metas_data(db_name=database)
+        for database in databases:
+            table_metas = query_engine.get_tables_metas_data(
+                db_name=database["query"]
+            )
             context = {
-                "db_name": database,
+                "db_name": database["raw"],
                 "tables": table_metas,
                 "export_time": timezone.now(),
             }
@@
             full_path = _safe_dictionary_export_path(
-                export_dir, instance.instance_name, database
+                export_dir, instance.instance_name, database["raw"]
             )
```

Also applies to: 1009-1012
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api_instances/views.py` around lines 957 - 983, The code currently uses query_engine.escape_string(db_name) to build the databases list and then calls get_tables_metas_data(db_name=database) and _safe_dictionary_export_path with that escaped value, causing mismatched read/write paths and incorrect parameterization; instead, keep the original raw db_name in the databases list (do not call escape_string when db_name was provided), pass the raw database name into query_engine.get_tables_metas_data and into the context, and only use a filesystem-safe transformation (or rely on _safe_dictionary_export_path) when constructing the export file path; apply the same change to the other occurrence around lines 1009-1012 so all metadata fetches use the raw db name while any filename/path generation uses a sanitized form.

api_queries/views.py (1)
342-354: ⚠️ Potential issue | 🟠 Major

Close `query_engine` in `QueryDescribe.post`.

This method opens an engine connection but never closes it on either the success or error path. Repeated describe requests can leak DB connections until the worker exhausts them.
Suggested fix
```diff
-        query_engine = get_engine(instance=instance)
-        db_name = query_engine.escape_string(data["db_name"])
-        schema_name = query_engine.escape_string(data.get("schema_name", ""))
-        tb_name = query_engine.escape_string(data["tb_name"])
-
-        try:
+        query_engine = get_engine(instance=instance)
+        try:
+            db_name = query_engine.escape_string(data["db_name"])
+            schema_name = query_engine.escape_string(data.get("schema_name", ""))
+            tb_name = query_engine.escape_string(data["tb_name"])
             query_result = query_engine.describe_table(
                 db_name=db_name,
                 tb_name=tb_name,
                 schema_name=schema_name,
             )
-        except Exception as exc:
+        except Exception as exc:
             raise serializers.ValidationError({"errors": "Operation failed."})
+        finally:
+            query_engine.close()
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api_queries/views.py` around lines 342 - 354, QueryDescribe.post opens a DB engine via get_engine() and calls query_engine.describe_table but never closes the engine, leaking connections; wrap the describe_table call in a try/except/finally (or use a context manager if get_engine returns one) and ensure query_engine.close() (or query_engine.dispose() if that’s the API) is called in the finally block so the engine is closed on both success and exception paths; reference the variables/methods: QueryDescribe.post, get_engine, query_engine, and describe_table.
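The close-on-both-paths behavior the suggested fix aims for can be checked in isolation with a stand-in engine (all names here are illustrative, not the real engine API):

```python
class FakeEngine:
    # Hypothetical stand-in for the real query engine; records close() calls.
    def __init__(self, fail=False):
        self.closed = False
        self.fail = fail

    def describe_table(self):
        if self.fail:
            raise RuntimeError("boom")
        return "ok"

    def close(self):
        self.closed = True


def describe(engine):
    # The pattern from the suggested fix: the finally block guarantees the
    # engine is closed on both the success and the error path.
    try:
        return engine.describe_table()
    except Exception:
        return None
    finally:
        engine.close()
```

With this shape, a raised exception can no longer leak the connection.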
🧹 Nitpick comments (8)
common/utils/wx_api.py (1)
15-16: Consider consistent exception handling across files.

While not part of this PR's changes, note that `wx_api.py` still uses the pattern `except Exception as e: logger.error(f"...{e}")`, which differs from the updated pattern in `feishu_api.py` that uses `logger.exception()`. For consistency, and to fully align with the security hardening goals, consider updating this pattern here as well.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@common/utils/wx_api.py` around lines 15 - 16, Update the exception handling in the except block that currently reads "except Exception as e: logger.error(f'Failed to read WeCom token from cache: {e}')" to use logger.exception instead so the stack trace is captured; specifically replace the logger.error call with logger.exception("Failed to read WeCom token from cache") in the code path that reads the WeCom token from cache (the except block catching Exception when attempting to read the token).

common/utils/sendmsg.py (2)
223-229: Inconsistent logging in the `send_feishu_user` method.

This method still logs the URL, request data, and full response on both success (lines 223-225) and failure (lines 227-229), which is inconsistent with the security hardening applied to similar methods like `send_feishu_webhook`. Consider applying the same log reduction for comprehensive coverage.

♻️ Suggested refactor for consistency
```diff
         if r["code"] == 0:
-            logger.debug(
-                f"Feishu direct message sent successfully\nTarget:{url}\nContent:{content}"
-            )
+            logger.debug("Feishu direct message sent successfully")
         else:
-            logger.error(
-                f"Feishu direct message failed\nRequest url:{url}\nRequest data:{data}\nResponse:{r}"
-            )
+            logger.error("Feishu direct message failed: code=%s", r.get("code"))
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@common/utils/sendmsg.py` around lines 223 - 229, In send_feishu_user, remove or reduce sensitive/request-specific details from the success and failure logs to match send_feishu_webhook's hardened logging: replace the logger.debug that prints url and content with a minimal success message (e.g., "Feishu direct message sent successfully") and change the logger.error to avoid logging raw request data and full response object (do not print data or r directly); instead log a concise error summary such as the HTTP status code or r.text/short error string along with a non-sensitive identifier. Update the logger.debug/logger.error calls that reference url, data, r, and content accordingly.
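A minimal sketch of the reduced logging both Feishu items recommend, assuming the response JSON carries a `code` field as in the snippets above (the helper name is illustrative):

```python
def log_feishu_result(r_json, logger):
    # Sketch: log only a short status indicator on failure; never dump the
    # whole response payload, which may echo back message content.
    if r_json.get("code") == 0:
        logger.debug("Feishu webhook sent successfully")
    else:
        logger.error("Feishu webhook failed: code=%s", r_json.get("code"))
```

The error code alone is usually enough to correlate with the provider's documentation, without putting message bodies in the log stream.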
203-203: Consider further log reduction.

While this change removes the URL and request data from logs, the full response JSON is still logged on failure. Consider logging only a status indicator (similar to the success case) or a specific error code if the API provides one, to further reduce potential sensitive data exposure.
♻️ Optional refactor for consistency
```diff
-            logger.error("Feishu webhook failed: response=%s", r_json)
+            logger.error("Feishu webhook failed: code=%s", r_json.get("code"))
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@common/utils/sendmsg.py` at line 203, The logger.error in sendmsg.py currently logs the full Feishu response JSON (logger.error("Feishu webhook failed: response=%s", r_json)); change it to log a minimal status indicator or specific error code instead: parse the response for a short identifier (e.g., r_json.get("code") or r_json.get("status") or r_json.get("message")) and log that value with a concise message (e.g., "Feishu webhook failed: code=%s"), avoiding dumping the entire r_json to reduce sensitive data exposure; update the error logging site in the function handling Feishu webhook responses accordingly.

api_access/views.py (1)
586-592: Add logging for permission request review audit failures.

The security fix is correct, but this handler should log the failure for debugging. The permission request review is an important audit operation that may need troubleshooting.
♻️ Proposed refactoring
```diff
         try:
             workflow_audit_detail = auditor.operate(
                 action, request.user, data.get("audit_remark", "")
             )
         except AuditException as exc:
+            logger.warning(
+                "Permission request review failed for request_id=%s",
+                request_id,
+                exc_info=True,
+            )
-            raise serializers.ValidationError({"errors": "Audit failed."})
+            raise serializers.ValidationError({"errors": "Audit failed."}) from None
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api_access/views.py` around lines 586 - 592, The except block for AuditException in the permission review handler currently raises serializers.ValidationError without logging; update the except AuditException as exc handler (the block around auditor.operate and before _permission_request_audit_callback) to log the exception and context (use the module logger or existing logger variable) — e.g., logger.exception or logger.error with exc info and include identifying details like request.user, action, and any audit_remark — then re-raise the serializers.ValidationError as before.

api_workflows/views.py (3)
1693-1697: Add logging and proper exception chaining for MySQL DDL executor resolution failures.

Same concern as line 1419: consider adding logging for debugging visibility and using `from None` for clean exception chaining.

♻️ Proposed refactoring
```diff
         except MysqlDDLExecutorError as exc:
+            logger.warning(
+                "MySQL DDL executor resolution failed for workflow %s",
+                workflow_id,
+                exc_info=True,
+            )
             raise serializers.ValidationError(
                 {"errors": "Operation failed."}
-            )
+            ) from None
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api_workflows/views.py` around lines 1693 - 1697, When catching MysqlDDLExecutorError in the executor resolution block, log the exception with details (e.g., using logger.error or logger.exception and include exc_info or the exception message) before raising the ValidationError, and raise the ValidationError with exception chaining suppressed using "from None"; specifically, update the except block handling MysqlDDLExecutorError to log the caught exc and then raise serializers.ValidationError({"errors": "Operation failed."}) from None so the resolved_executor.executor_id/selected_executor code remains reachable with clear debug logs.
1572-1575: Add logging and proper exception chaining for audit operation failures.

The security fix is correct, but this handler lacks logging for operational visibility. Audit operation failures should be logged for troubleshooting, and the exception should use `from None` for proper chaining.

♻️ Proposed refactoring
```diff
         try:
             workflow_audit_detail = auditor.operate(action, user, data["audit_remark"])
         except AuditException as e:
+            logger.warning(
+                "Audit operation failed for workflow %s",
+                workflow_id,
+                exc_info=True,
+            )
-            raise serializers.ValidationError({"errors": "Operation failed."})
+            raise serializers.ValidationError({"errors": "Operation failed."}) from None
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api_workflows/views.py` around lines 1572 - 1575, The except block catching AuditException should log the failure and re-raise a ValidationError with exception chaining suppressed; specifically, inside the except AuditException as e: for the call to auditor.operate (and variable workflow_audit_detail) add a process/logging call (e.g., logger.error) that includes context like action, user id/name, audit_remark and the exception e, then raise serializers.ValidationError({"errors": "Operation failed."}) from None so the original traceback is not chained; ensure you reference auditor.operate, AuditException, workflow_audit_detail and serializers.ValidationError when applying the change.
1417-1420: Add logging for MySQL DDL executor preflight failures.

The exception details are correctly sanitized, but unlike other error handlers in this file (e.g., line 1318), there's no logging here. Consider adding a warning log to capture the actual error for debugging purposes.
♻️ Proposed refactoring
```diff
         except MysqlDDLExecutorError as exc:
+            logger.warning(
+                "MySQL DDL executor preflight failed for workflow %s",
+                workflow.id,
+                exc_info=True,
+            )
             raise serializers.ValidationError({"errors": "Operation failed."}) from None
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api_workflows/views.py` around lines 1417 - 1420, Add a warning log when catching MysqlDDLExecutorError before raising the sanitized serializers.ValidationError: inside the except block for MysqlDDLExecutorError (the block that currently raises serializers.ValidationError({"errors": "Operation failed."})), call the module logger (e.g., logger.warning or logger.warn) including a concise message and the exception details (exc) or exc_info=True so the real error is recorded for debugging; then re-raise the same sanitized serializers.ValidationError and leave the subsequent assignment to selected_executor = resolved_executor.executor_id unchanged.api_archives/views.py (1)
910-916: Add logging and proper exception chaining for archive review failures.

The security fix is correct, but this handler lacks logging for operational visibility. Archive review operation failures should be logged, and the exception should use `from None` for proper chaining.

♻️ Proposed refactoring
```diff
         try:
             workflow_audit_detail = auditor.operate(
                 action, request.user, data["audit_remark"]
             )
         except AuditException as exc:
+            logger.warning(
+                "Archive review operation failed for archive_id=%s",
+                archive_id,
+                exc_info=True,
+            )
-            raise serializers.ValidationError({"errors": "Operation failed."})
+            raise serializers.ValidationError({"errors": "Operation failed."}) from None
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api_archives/views.py` around lines 910 - 916, When handling AuditException from auditor.operate (called to produce workflow_audit_detail using request.user and data["audit_remark"]), add a process/logging statement that records the failure with the exception details (include context such as action and request.user) and then re-raise serializers.ValidationError using "from None" to suppress chaining; specifically, catch AuditException as exc, call the logger (e.g., logger.error or processLogger.error) with a clear message and exc info, then raise serializers.ValidationError({"errors": "Operation failed."}) from None.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@api_admin/settings.py`:
- Around line 365-369: The validate_pt_osc method currently catches ValueError
and re-raises serializers.ValidationError without suppressing the original
exception chain; update the exception raise to use exception chaining
suppression (i.e., raise serializers.ValidationError("Operation failed.") from
None) similar to validate_gh_ost so the original ValueError context is not
preserved. Ensure the change is applied inside validate_pt_osc's except
ValueError block.
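The effect of `from None` that this item (and the `validate_gh_ost` comment above) relies on can be demonstrated with stand-in classes (illustrative only; the real code raises DRF's `serializers.ValidationError`):

```python
class ValidationError(Exception):
    """Stand-in for serializers.ValidationError."""


def validate_binary_path(value, name):
    # Hypothetical validator; the real one lives in the project.
    if not value.startswith("/usr/"):
        raise ValueError(f"unexpected path for {name}: {value}")
    return value


def validate_pt_osc(value):
    try:
        return validate_binary_path(value, "pt-online-schema-change")
    except ValueError:
        # "from None" sets __cause__ to None and suppresses the implicit
        # __context__, so the original ValueError text is not shown in the
        # reported traceback.
        raise ValidationError("Operation failed.") from None
```

This satisfies Ruff B904 while keeping the client-facing error generic.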
In `@api_instances/views.py`:
- Around line 153-159: _safe_dictionary_export_path currently produces a lossy,
truncated safe_name from f"{instance_name}_{db_name}", which can cause distinct
pairs to map to the same filename and silently overwrite exports; fix it by
incorporating a deterministic short fingerprint (e.g., a hex digest or first N
chars of a hash of the original f"{instance_name}_{db_name}") into the filename
before truncation so uniqueness is preserved, then rebuild full_path and retain
the existing os.path.commonpath check to prevent escaping base_dir; update any
validation that returns "" to handle collisions (e.g., always generate the
hashed filename) and reference the function name _safe_dictionary_export_path
and the variables safe_name and full_path when making the change.
In `@api_workflows/views.py`:
- Around line 1317-1319: The except block currently logs the original exception
with logger.error("Failed to load rollback SQL", exc_info=True) and then
re-raises a generic serializers.ValidationError; change the re-raise to suppress
exception chaining by using "raise serializers.ValidationError({'errors':
'Operation failed.'}) from None" so that the original exception context is not
attached; update the raise site where serializers.ValidationError is thrown in
the except Exception as exc: block (near the logger.error call) to use the "from
None" form.
In `@sql/engines/mysql.py`:
- Around line 585-592: The check for mysql.user fails because backticks are
replaced with spaces leaving "mysql. user" so the literal "mysql.user" lookup
misses it; update the normalization before computing references_mysql_user to
collapse whitespace around dots (e.g. normalize dots with a regex like
re.sub(r"\s*\.\s*", ".", normalized_sql)) so "mysql.`user`" and "mysql. user"
both become "mysql.user"; keep using normalized_sql, references_mysql_user,
references_user_table and db_name for the subsequent logic.
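The dot-normalization step above can be sketched as follows; the regex is the one from the prompt, while the standalone helper wrapper is illustrative:

```python
import re


def references_mysql_user(sql):
    # Strip backtick quoting, then collapse whitespace around dots so
    # mysql.`user`, mysql . user, and mysql.user all normalize to "mysql.user".
    normalized_sql = sql.lower().replace("`", " ")
    normalized_sql = re.sub(r"\s*\.\s*", ".", normalized_sql)
    return "mysql.user" in normalized_sql
```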
In `@sql/plugins/plugin.py`:
- Around line 77-78: The regex safety check currently in generate_args2cmd
(SAFE_ARGUMENT_RE.fullmatch on arg) can raise ValueError bypassing the expected
{status,msg} validation flow; move the argument-name validation into
check_args() so malformed option names are caught there and returned as the
standard validation response, or alternatively catch the invalid-name case
inside generate_args2cmd and convert it into the same {status,msg} error object;
update check_args() to iterate the same argument keys validated by
generate_args2cmd (using the same SAFE_ARGUMENT_RE) and return a normalized
failure, or wrap generate_args2cmd to return {status:"error", msg:...} instead
of raising ValueError.
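The first option (validating option names inside `check_args()` so callers always get the normalized `{status, msg}` shape) might look like the sketch below; the regex character class and response fields are assumptions, not the project's actual definitions:

```python
import re

# Assumed whitelist: option names start alphanumeric, then alnum/underscore/dash.
SAFE_ARGUMENT_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9_-]*")


def check_args(args):
    # Reject malformed option names here, instead of letting
    # generate_args2cmd raise a bare ValueError later.
    for name in args:
        if not SAFE_ARGUMENT_RE.fullmatch(name):
            return {"status": 1, "msg": f"Invalid argument name: {name!r}"}
    return {"status": 0, "msg": "ok"}
```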
In `@sql/plugins/tests.py`:
- Around line 146-153: The test's Popen assertions are too loose for
stdout/stderr; instead of assertIsNotNone use exact equality to ensure
execute_cmd() still pipes output: assertEqual(popen_kwargs["stdout"],
subprocess.PIPE) and assertEqual(popen_kwargs["stderr"], subprocess.PIPE); keep
the other assertions (mock_subprocess.Popen.assert_called_once(), shell False,
universal_newlines True, and the argv checks) intact to strictly validate the
behavior of execute_cmd() when invoking mock_subprocess.Popen.
---
Outside diff comments:
In @.github/workflows/django.yml:
- Around line 95-102: The workflow currently hardcodes SECRET_KEY in the "Run
Tests" job which leaks a secret; update the env usage for the Run Tests step to
read the key from GitHub Actions secrets (e.g. use secrets.SECRET_KEY) instead
of the literal value, or remove the env entry if tests do not require it; ensure
the repository secret is added in Settings > Secrets before referencing it and
update any references in the workflow that rely on SECRET_KEY accordingly.
- Around line 70-76: The readiness loop that runs "until docker exec mysql
mysqladmin ping -h127.0.0.1 -uroot -proot --silent; do ... sleep 2 done" has no
timeout and can hang indefinitely; modify this loop to introduce a max attempts
counter or overall timeout (e.g., MAX_ATTEMPTS and an increment each iteration)
and on exceeding it print docker logs for the "mysql" container and exit
non‑zero; implement the counter/check within the same block that currently
checks container running state so the workflow fails fast with diagnostic logs
when MySQL never becomes ready.
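A bounded version of that readiness loop, sketched as a reusable shell function; `wait_for_ready` and its parameters are illustrative, and in the workflow the probe would be the existing `docker exec mysql mysqladmin ping ...` command:

```shell
#!/bin/sh
# Retry a probe command until it succeeds or max_attempts is exhausted.
wait_for_ready() {
  check_cmd="$1"
  max_attempts="${2:-30}"
  attempt=0
  until $check_cmd; do
    attempt=$((attempt + 1))
    if [ "$attempt" -ge "$max_attempts" ]; then
      # In CI you would also dump `docker logs mysql` here for diagnostics.
      echo "service never became ready after $max_attempts attempts" >&2
      return 1
    fi
    sleep 1
  done
  return 0
}
```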
In `@api_admin/settings.py`:
- Around line 359-363: In validate_gh_ost, when catching the ValueError from
validate_binary_path, re-raise the serializers.ValidationError using exception
chaining suppression (i.e., "raise serializers.ValidationError('Operation
failed.') from None") so the original ValueError context isn't preserved; update
the except block in the validate_gh_ost method accordingly and run Black/ruff to
ensure formatting and B904 compliance.
In `@api_instances/views.py`:
- Around line 957-983: The code currently uses
query_engine.escape_string(db_name) to build the databases list and then calls
get_tables_metas_data(db_name=database) and _safe_dictionary_export_path with
that escaped value, causing mismatched read/write paths and incorrect
parameterization; instead, keep the original raw db_name in the databases list
(do not call escape_string when db_name was provided), pass the raw database
name into query_engine.get_tables_metas_data and into the context, and only use
a filesystem-safe transformation (or rely on _safe_dictionary_export_path) when
constructing the export file path; apply the same change to the other occurrence
around lines 1009-1012 so all metadata fetches use the raw db name while any
filename/path generation uses a sanitized form.
In `@api_queries/views.py`:
- Around line 342-354: QueryDescribe.post opens a DB engine via get_engine() and
calls query_engine.describe_table but never closes the engine, leaking
connections; wrap the describe_table call in a try/except/finally (or use a
context manager if get_engine returns one) and ensure query_engine.close() (or
query_engine.dispose() if that’s the API) is called in the finally block so the
engine is closed on both success and exception paths; reference the
variables/methods: QueryDescribe.post, get_engine, query_engine, and
describe_table.
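The close-on-both-paths pattern the prompt asks for, reduced to a sketch; `DummyEngine` stands in for whatever `get_engine()` returns, and `close()` is assumed to be the real engine's cleanup API:

```python
class DummyEngine:
    """Stand-in for the real engine returned by get_engine()."""

    def __init__(self, fail=False):
        self.fail = fail
        self.closed = False

    def describe_table(self, db_name, tb_name):
        if self.fail:
            raise RuntimeError("describe failed")
        return {"rows": []}

    def close(self):
        self.closed = True


def describe(engine, db_name, tb_name):
    try:
        return engine.describe_table(db_name, tb_name)
    except Exception:
        return {"errors": "Operation failed."}
    finally:
        # Runs on success and failure alike, so the connection never leaks.
        engine.close()
```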
In `@common/check.py`:
- Around line 177-182: In validate_file_storage_payload(), avoid interpolating
the exception string into the log; change the logging call that currently builds
error_msg using str(e) to a fixed, non-sensitive message (e.g. "Storage
connectivity test failed") and pass exc_info=True so the stack trace is recorded
without embedding potentially sensitive exception content; keep the generic
client-facing result["msg"] and result["status"] as-is.
- Around line 91-117: In validate_email_payload(), avoid returning raw
sender_response to the client when sender_response != "success" because it may
contain sensitive SMTP/exception details; instead set result["msg"] to a generic
failure message (e.g., "Email delivery failed; check server logs") and record
the detailed sender_response in a server-side log or diagnostics (use the module
logger or error-tracking facility) linked to the send operation
(sender.send_email / sender_response) for debugging. Ensure result["status"]
remains non-zero and do not include sender_response in the API response.
- Around line 37-75: The code in validate_go_inception_payload builds
result["msg"] using str(e) after MySQLdb.connect failures, leaking exception
details; change both places (the exception handlers around MySQLdb.connect
calls) to set result["msg"] to a fixed, user-safe string (e.g., "Unable to
connect to goInception" and "Unable to connect to goInception backup database")
without including str(e), while keeping logger.error(traceback.format_exc()) as
the place that records full error details; update the exception blocks for the
first connect (around MySQLdb.connect / cur = conn.cursor()) and the second
backup connect to remove the inclusion of str(e) in the returned message.
- Around line 200-218: The instance() view currently exposes internal error
strings and calls Instance.objects.get(id=instance_id) without guarding against
missing/invalid IDs; change it to validate and safely fetch the instance (catch
Instance.DoesNotExist and invalid ID cases) and always return a generic client
message like "Unable to connect to instance" or "Instance not found" instead of
test_result.error or str(e). Log full details server-side (use Python logging
and logging.exception or logger.error with the exception) when
get_engine(instance=instance) or engine.test_connection() fails, and when
Instance.objects.get fails, but ensure the HttpResponse JSON only contains
non-sensitive generic messages and status codes. Keep references: instance(),
Instance.objects.get, get_engine(...), engine.test_connection().
- Around line 120-165: The validate_file_storage_payload function currently
validates max_export_rows but never includes it in the returned config and also
uses storage_type without presence/allowlist checks; update
validate_file_storage_payload to (1) require and validate storage_type against
an allowlist (e.g., check storage_type is one of expected values before using
custom_param_key and return an error if missing/invalid) and (2) include the
validated max_export_rows (converted to int) in config_dict (e.g., add
"max_export_rows": int(max_export_rows)) so DynamicStorage receives the limit;
reference validate_file_storage_payload, storage_type, custom_param_key,
max_export_rows, config_dict, and DynamicStorage when making changes.
In `@sql/engines/doris.py`:
- Around line 175-189: The except block currently injects raw exception text
into execute_result.error and ReviewResult.errormessage (e.g., f"Exception:
{e}") which is serialized to API clients; change the behavior to log the full
exception internally (use logger.exception or logger.error with exc_info=True)
but replace the public fields with a sanitized message (e.g., "Execution failed"
or "Internal execution error") and optionally a short, non-sensitive error
code/ID; update the except handling in doris.py (the except block that sets
execute_result.error and creates ReviewResult) to stop exposing str(e) and the
f"Exception: {e}" string, and apply the same change to the equivalent except
blocks in redis.py, phoenix.py, oracle.py, mssql.py, pgsql.py, cassandra.py so
all engine implementations consistently sanitize errors returned to clients
while retaining full exception details in logs.
In `@sql/resource_group.py`:
- Line 3: Remove the unused import of the traceback module: delete the line that
reads "import traceback" so there are no unused imports left in this module
(ensure no other code references traceback before removing). Run linters/tests
to confirm no further import warnings.
---
Nitpick comments:
In `@api_access/views.py`:
- Around line 586-592: The except block for AuditException in the permission
review handler currently raises serializers.ValidationError without logging;
update the except AuditException as exc handler (the block around
auditor.operate and before _permission_request_audit_callback) to log the
exception and context (use the module logger or existing logger variable) —
e.g., logger.exception or logger.error with exc info and include identifying
details like request.user, action, and any audit_remark — then re-raise the
serializers.ValidationError as before.
In `@api_archives/views.py`:
- Around line 910-916: When handling AuditException from auditor.operate (called
to produce workflow_audit_detail using request.user and data["audit_remark"]),
add a process/logging statement that records the failure with the exception
details (include context such as action and request.user) and then re-raise
serializers.ValidationError using "from None" to suppress chaining;
specifically, catch AuditException as exc, call the logger (e.g., logger.error
or processLogger.error) with a clear message and exc info, then raise
serializers.ValidationError({"errors": "Operation failed."}) from None.
In `@api_workflows/views.py`:
- Around line 1693-1697: When catching MysqlDDLExecutorError in the executor
resolution block, log the exception with details (e.g., using logger.error or
logger.exception and include exc_info or the exception message) before raising
the ValidationError, and raise the ValidationError with exception chaining
suppressed using "from None"; specifically, update the except block handling
MysqlDDLExecutorError to log the caught exc and then raise
serializers.ValidationError({"errors": "Operation failed."}) from None so the
resolved_executor.executor_id/selected_executor code remains reachable with
clear debug logs.
- Around line 1572-1575: The except block catching AuditException should log the
failure and re-raise a ValidationError with exception chaining suppressed;
specifically, inside the except AuditException as e: for the call to
auditor.operate (and variable workflow_audit_detail) add a process/logging call
(e.g., logger.error) that includes context like action, user id/name,
audit_remark and the exception e, then raise
serializers.ValidationError({"errors": "Operation failed."}) from None so the
original traceback is not chained; ensure you reference auditor.operate,
AuditException, workflow_audit_detail and serializers.ValidationError when
applying the change.
- Around line 1417-1420: Add a warning log when catching MysqlDDLExecutorError
before raising the sanitized serializers.ValidationError: inside the except
block for MysqlDDLExecutorError (the block that currently raises
serializers.ValidationError({"errors": "Operation failed."})), call the module
logger (e.g., logger.warning or logger.warn) including a concise message and the
exception details (exc) or exc_info=True so the real error is recorded for
debugging; then re-raise the same sanitized serializers.ValidationError and
leave the subsequent assignment to selected_executor =
resolved_executor.executor_id unchanged.
In `@common/utils/sendmsg.py`:
- Around line 223-229: In send_feishu_user, remove or reduce
sensitive/request-specific details from the success and failure logs to match
send_feishu_webhook's hardened logging: replace the logger.debug that prints url
and content with a minimal success message (e.g., "Feishu direct message sent
successfully") and change the logger.error to avoid logging raw request data and
full response object (do not print data or r directly); instead log a concise
error summary such as the HTTP status code or r.text/short error string along
with a non-sensitive identifier. Update the logger.debug/logger.error calls that
reference url, data, r, and content accordingly.
- Line 203: The logger.error in sendmsg.py currently logs the full Feishu
response JSON (logger.error("Feishu webhook failed: response=%s", r_json));
change it to log a minimal status indicator or specific error code instead:
parse the response for a short identifier (e.g., r_json.get("code") or
r_json.get("status") or r_json.get("message")) and log that value with a concise
message (e.g., "Feishu webhook failed: code=%s"), avoiding dumping the entire
r_json to reduce sensitive data exposure; update the error logging site in the
function handling Feishu webhook responses accordingly.
In `@common/utils/wx_api.py`:
- Around line 15-16: Update the exception handling in the except block that
currently reads "except Exception as e: logger.error(f'Failed to read WeCom
token from cache: {e}')" to use logger.exception instead so the stack trace is
captured; specifically replace the logger.error call with
logger.exception("Failed to read WeCom token from cache") in the code path that
reads the WeCom token from cache (the except block catching Exception when
attempting to read the token).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 0913dcb4-3f84-496d-881d-d521c4406631
📒 Files selected for processing (27)
.github/workflows/black.yml
.github/workflows/codeql-analysis.yml
.github/workflows/django.yml
api_access/views.py
api_admin/settings.py
api_archives/views.py
api_core/views.py
api_instances/views.py
api_queries/views.py
api_users/views.py
api_workflows/views.py
common/check.py
common/utils/feishu_api.py
common/utils/sendmsg.py
common/utils/wx_api.py
sql/data_dictionary.py
sql/engines/doris.py
sql/engines/goinception.py
sql/engines/mongo.py
sql/engines/mysql.py
sql/instance.py
sql/offlinedownload.py
sql/plugins/plugin.py
sql/plugins/pt_archiver.py
sql/plugins/tests.py
sql/resource_group.py
sql/views.py
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (6)
sql/plugins/plugin.py (1)
123-125: ⚠️ Potential issue | 🟠 Major
Keep the wrapped error generic here too.
This still logs the full traceback and re-embeds `str(e)` into the raised `RuntimeError`, so path/OS details can leak right back into logs or API-facing errors. That undercuts the hardening goal of this PR.
🔐 Proposed fix
```diff
-        except Exception as e:
-            logger.error("Command execution failed\n{}".format(traceback.format_exc()))
-            raise RuntimeError("Command execution failed, reason: %s" % str(e))
+        except Exception:
+            logger.error("Command execution failed")
+            raise RuntimeError("Command execution failed")
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sql/plugins/plugin.py` around lines 123 - 125, Replace the detailed traceback logging and error embedding with a generic message: stop using traceback.format_exc() in the logger and remove str(e) from the raised RuntimeError. Specifically, update the except block that currently calls logger.error("Command execution failed\n{}".format(traceback.format_exc())) and raises RuntimeError("Command execution failed, reason: %s" % str(e)) so it instead logs a generic message via logger.error("Command execution failed") (no traceback) and raises a generic RuntimeError("Command execution failed") without including the original exception details; if you need to suppress exception chaining, raise the RuntimeError using "from None".
api_queries/views.py (1)
224-233: ⚠️ Potential issue | 🟠 Major
Stop logging raw SQL in the masking fallback.
When `query_check` is off, this branch still writes the full `sql_content` and exception text to the application log. That can leak secrets or PII from ad-hoc queries and undercuts the PR's security-hardening goal.
🔐 Suggested fix
```diff
     except Exception as exc:
         logger.exception("Data masking failed")
         if config.get("query_check"):
             error_message = "Data masking error, contact admin."
         else:
-            logger.warning(
-                "Data masking error, allowed by config. SQL: %s error: %s",
-                sql_content,
-                str(exc),
-            )
+            logger.warning(
+                "Data masking error, allowed by config. Returning unmasked result."
+            )
             query_result.error = None
             result_data = query_result.__dict__.copy()
             result_data["rows"] = query_result.to_dict()
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api_queries/views.py` around lines 224 - 233, The current masking-exception fallback logs raw SQL and exception text via logger.warning using sql_content and str(exc) when config.get("query_check") is false; change this to avoid leaking queries by removing sql_content and exc from the log message and instead log a generic, non-sensitive warning (optionally include a safe fingerprint such as a query hash or a fixed identifier) and set error_message appropriately; update the block that calls logger.exception("Data masking failed") / logger.warning(...) and references sql_content to only emit non-sensitive information while retaining the existing control flow around config.get("query_check") and error_message.
api_instances/views.py (1)
2480-2518: ⚠️ Potential issue | 🟠 Major
`query_engine` still leaks in `InstanceResource.get`.
This endpoint opens an engine for every request but never closes it. The new generic error handling changed this block, but unlike the neighboring handlers it still has no `finally`, so both successful and failed requests can leave database connections hanging.
♻️ Suggested fix
```diff
         try:
             query_engine = get_engine(instance=instance)
             db_name = query_engine.escape_string(db_name)
             schema_name = query_engine.escape_string(schema_name)
             tb_name = query_engine.escape_string(tb_name)
             if resource_type == "database":
                 resource = query_engine.get_all_databases()
                 resource.rows = filter_db_list(
                     db_list=resource.rows,
                     db_name_regex=query_engine.instance.show_db_name_regex,
                     is_match_regex=True,
                 )
                 resource.rows = filter_db_list(
                     db_list=resource.rows,
                     db_name_regex=query_engine.instance.denied_db_name_regex,
                     is_match_regex=False,
                 )
             elif resource_type == "schema" and db_name:
                 resource = query_engine.get_all_schemas(db_name=db_name)
             elif resource_type == "table" and db_name:
                 resource = query_engine.get_all_tables(
                     db_name=db_name, schema_name=schema_name
                 )
             elif resource_type == "column" and db_name and tb_name:
                 resource = query_engine.get_all_columns_by_tb(
                     db_name=db_name, tb_name=tb_name, schema_name=schema_name
                 )
             else:
                 raise serializers.ValidationError(
                     {"errors": "Unsupported resource type or incomplete parameters."}
                 )
-        except Exception as msg:
-            raise serializers.ValidationError({"errors": "Operation failed."})
+        except Exception:
+            raise serializers.ValidationError({"errors": "Operation failed."}) from None
+        finally:
+            if "query_engine" in locals():
+                query_engine.close()
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api_instances/views.py` around lines 2480 - 2518, InstanceResource.get opens a query_engine but never closes it, leaking connections; update the try/except/else to include a finally that safely closes the engine (in InstanceResource.get / get_engine usage). After the existing else block (or surrounding the entire try), add a finally that checks if query_engine is defined and then calls its close/dispose method (e.g., call query_engine.close() or query_engine.dispose() depending on the engine implementation), wrapping that call in a small try/except to avoid masking original errors.
sql/engines/mssql.py (1)
771-785: ⚠️ Potential issue | 🔴 Critical
Abort immediately when `USE [db_name]` fails.
This branch records the failure but still falls through to execute `all_statements`. If the database switch failed, the rest of the batch can run in the connection's previous/default database.
Suggested fix
```diff
         except Exception as e:
             logger.warning("MSSQL USE statement failed", exc_info=True)
             execute_result.error = "Execution failed"
             execute_result.rows.append(
                 ReviewResult(
@@
                     execute_time=0,
                 )
             )
             rowid += 1
+            for remaining_statement in all_statements:
+                execute_result.rows.append(
+                    ReviewResult(
+                        id=rowid,
+                        errlevel=0,
+                        stagestatus="Audit completed",
+                        errormessage="Database switch failed, statement not executed",
+                        sql=remaining_statement,
+                        affected_rows=0,
+                        execute_time=0,
+                    )
+                )
+                rowid += 1
+            if close_conn:
+                self.close()
+            return execute_result
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sql/engines/mssql.py` around lines 771 - 785, The catch block for the `USE` statement currently records the failure but continues to run `all_statements`; update the `except Exception as e` branch that handles `use_sql` to abort immediately by stopping further execution and returning the current `execute_result` (or raising) after populating it (the same fields you already set on `execute_result` and pushing the `ReviewResult`); ensure you reference the `use_sql` handling block and prevent processing `all_statements` or any subsequent logic when the DB switch fails.
sql/engines/phoenix.py (1)
176-206: ⚠️ Potential issue | 🟠 Major
Advance the skipped-statement IDs after a failure.
Because `break` skips line 203, the first "Previous statement failed" row reuses the failed statement's `id`. That produces duplicate statement IDs in one `ReviewSet`.
Suggested fix
```diff
         if execute_result.error:
             # If failed, append remaining statements as not executed.
             for statement in split_sql[rowid:]:
+                rowid += 1
                 execute_result.rows.append(
                     ReviewResult(
                         id=rowid,
                         errlevel=2,
                         stagestatus="Execute Failed",
@@
                         execute_time=0,
                     )
                 )
-                rowid += 1
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sql/engines/phoenix.py` around lines 176 - 206, In the except block where a failed ReviewResult is appended (uses rowid), ensure subsequent skipped "Previous statement failed" rows don’t reuse that same id by advancing rowid after recording the failure; specifically, increment rowid (the loop counter used for IDs) immediately after appending the failure ReviewResult in the except handler (or adjust the final loop over split_sql to start IDs from rowid+1) so each ReviewResult created for split_sql keeps a unique id.
sql/engines/cassandra.py (1)
218-247: ⚠️ Potential issue | 🟠 Major
Fix the duplicated `ReviewResult.id` values after a failed statement.
On this path, `break` skips line 233, so the first "Previous statement failed, not executed" row is appended with the same `id` as the failed statement.
Suggested fix
```diff
         if execute_result.error:
             for statement in sql_result[rowid:]:
+                rowid += 1
                 execute_result.rows.append(
                     ReviewResult(
                         id=rowid,
                         errlevel=2,
                         stagestatus="Execute Failed",
@@
                         affected_rows=0,
                         execute_time=0,
                     )
                 )
-                rowid += 1
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sql/engines/cassandra.py` around lines 218 - 247, When an exception occurs in the try/except inside execute_result handling, the code breaks out of the loop without incrementing rowid, causing the subsequent loop that appends "Previous statement failed, not executed" ReviewResult entries to reuse the same id as the failed statement; fix by incrementing rowid immediately after creating the failed ReviewResult in the except block (the block that sets execute_result.error and appends ReviewResult) so that the later loop over sql_result[rowid:] creates ReviewResult entries with correct, unique ids; update the except path where execute_result.error is set and ReviewResult is appended to increment rowid afterwards.
🧹 Nitpick comments (3)
common/check.py (2)
180-183: Use the module logger consistently.
This block mixes `logging.*` and `logger.*`; keep one logger path for consistent handlers/format/filters.
🔧 Proposed fix
```diff
-            logging.error("Storage connectivity test failed")
-            logging.debug("Storage connectivity test returned an error.")
+            logger.error("Storage connectivity test failed")
+            logger.debug("Storage connectivity test returned an error.")
         else:
-            logging.info("Storage connectivity test succeeded")
+            logger.info("Storage connectivity test succeeded")
```
Also applies to: 187-187
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@common/check.py` around lines 180 - 183, The code block uses the root logging calls (logging.error, logging.debug, logging.info) instead of the module-level logger; update the storage connectivity messages to use the module logger (replace logging.error, logging.debug, logging.info with logger.error, logger.debug, logger.info) so handlers/format/filters remain consistent—also make the same replacement for the other occurrence mentioned around line 187 (where logging.* is used).
49-53: Narrow overly broad exception handling for DB connectivity checks.
`except Exception` on lines 49 and 68 catches all exceptions, hiding non-connection defects and triggering BLE001. Replace with specific exception types and use `logger.exception(...)` to properly log the exception context:
🔧 Proposed fix
```diff
-        except Exception:
-            logger.error(traceback.format_exc())
+        except (MySQLdb.Error, ValueError, TypeError):
+            logger.exception("goInception connection test failed")
             result["status"] = 1
             result["msg"] = "Unable to connect to goInception"
             return result
@@
-        except Exception:
-            logger.error(traceback.format_exc())
+        except (MySQLdb.Error, ValueError, TypeError):
+            logger.exception("goInception backup connection test failed")
             result["status"] = 1
             result["msg"] = "Unable to connect to goInception backup database"
```
Also applies to: 68-72
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@common/check.py` around lines 49 - 53, Replace the broad "except Exception" blocks that wrap the goInception connectivity checks with specific DB/connectivity exception types from the DB client you use (e.g., OperationalError, InterfaceError, DatabaseError or the driver's connection exceptions) and call logger.exception(...) instead of logger.error(traceback.format_exc()) so the stack/exception context is preserved; keep the existing result["status"] = 1 and result["msg"] = "Unable to connect to goInception" behavior. Apply the same change to both exception handlers (the blocks that currently use traceback.format_exc()) so only connection-related exceptions are caught and other errors propagate.
api_queries/views.py (1)
832-837: Log audit-operation failures here too.
This now hides the exception from the client, but unlike the sibling review endpoints it also drops the server-side trail entirely. Please log `apply_id`, user, and action before returning the generic "Audit failed." response.
🪵 Suggested fix
```diff
-        except AuditException as exc:
-            raise serializers.ValidationError({"errors": "Audit failed."})
+        except AuditException:
+            logger.exception(
+                "Query privilege audit failed for apply_id=%s user=%s action=%s",
+                data["apply_id"],
+                request.user,
+                audit_status,
+            )
+            raise serializers.ValidationError({"errors": "Audit failed."}) from None
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api_queries/views.py` around lines 832 - 837, The except block swallowing AuditException needs to log the failure details before raising the generic ValidationError; in the except AuditException as exc: handler for auditor.operate(...) add a server-side log call (using your existing logger) that records apply_id (from data or context), request.user, and the attempted action/audit_status plus exc, then re-raise serializers.ValidationError({"errors": "Audit failed."}) as before; reference the auditor.operate call and the AuditException/serializers.ValidationError to locate the correct spot.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@common/check.py`:
- Around line 115-117: The logger.error call currently logs the raw
sender_response (variable sender_response) which may contain sensitive provider
diagnostics; replace this with a sanitized summary: extract and log only
non-sensitive fields (e.g., status_code, error_code, and a short message) or a
generic message plus a hashed/trimmed identifier, and avoid printing the full
sender_response; update the error call in common/check.py (the logger.error
line) to use the sanitized_summary (or logger.debug the full response behind a
feature flag) so production logs do not contain raw backend responses.
In `@common/utils/sendmsg.py`:
- Around line 220-224: The requests.post calls in sendmsg.py (notably the call
assigning response around the variable response and the earlier call near line
194) lack an explicit timeout and can hang; update both requests.post
invocations to include timeout=5 (consistent with other calls in this file) so
the Feishu API requests use a bounded 5-second timeout—keep the same url,
json/data and headers parameters and simply add timeout=5 to each requests.post
call.
In `@sql/engines/mysql.py`:
- Around line 585-592: The current check flags any standalone "user" token
anywhere in the SQL (e.g., SELECT user FROM other_table); change
references_user_table to only detect "user" when it appears as a table reference
by matching SQL keywords that introduce table names (e.g., FROM, JOIN, INTO,
UPDATE, DELETE) rather than any token. Update the logic around normalized_sql,
references_mysql_user, and references_user_table so references_user_table uses a
regex search like looking for patterns
(from|join|into|update|delete)\s+`?user`?\b (case-insensitive) against
normalized_sql, and keep the existing references_mysql_user check for
"mysql.user"; then use the combined condition if references_mysql_user or
(db_name == "mysql" and references_user_table).
In `@sql/engines/oracle.py`:
- Around line 1234-1248: The except block currently masks all errors as
"Execution failed"; update it to preserve the earlier safe compile diagnostic
carried in the raised exception (e) by assigning execute_result.error = str(e)
(or e.args[0] if present) and use that same string when constructing the
ReviewResult errormessage instead of the constant "Execution failed", while
keeping the logger.warning(..., exc_info=True) and existing ReviewResult fields;
reference the except block handling workflow.id, execute_result, ReviewResult
and the earlier raised Exception(stagestatus) so the sanitized stagestatus
message is propagated into execute_result.error and ReviewResult.errormessage.
---
Outside diff comments:
In `@api_instances/views.py`:
- Around line 2480-2518: InstanceResource.get opens a query_engine but never
closes it, leaking connections; update the try/except/else to include a finally
that safely closes the engine (in InstanceResource.get / get_engine usage).
After the existing else block (or surrounding the entire try), add a finally
that checks if query_engine is defined and then calls its close/dispose method
(e.g., call query_engine.close() or query_engine.dispose() depending on the
engine implementation), wrapping that call in a small try/except to avoid
masking original errors.
In `@api_queries/views.py`:
- Around line 224-233: The current masking-exception fallback logs raw SQL and
exception text via logger.warning using sql_content and str(exc) when
config.get("query_check") is false; change this to avoid leaking queries by
removing sql_content and exc from the log message and instead log a generic,
non-sensitive warning (optionally include a safe fingerprint such as a query
hash or a fixed identifier) and set error_message appropriately; update the
block that calls logger.exception("Data masking failed") / logger.warning(...)
and references sql_content to only emit non-sensitive information while
retaining the existing control flow around config.get("query_check") and
error_message.
In `@sql/engines/cassandra.py`:
- Around line 218-247: When an exception occurs in the try/except inside
execute_result handling, the code breaks out of the loop without incrementing
rowid, causing the subsequent loop that appends "Previous statement failed, not
executed" ReviewResult entries to reuse the same id as the failed statement; fix
by incrementing rowid immediately after creating the failed ReviewResult in the
except block (the block that sets execute_result.error and appends ReviewResult)
so that the later loop over sql_result[rowid:] creates ReviewResult entries with
correct, unique ids; update the except path where execute_result.error is set
and ReviewResult is appended to increment rowid afterwards.
In `@sql/engines/mssql.py`:
- Around line 771-785: The catch block for the `USE` statement currently records
the failure but continues to run `all_statements`; update the `except Exception
as e` branch that handles `use_sql` to abort immediately by stopping further
execution and returning the current `execute_result` (or raising) after
populating it (the same fields you already set on `execute_result` and pushing
the `ReviewResult`); ensure you reference the `use_sql` handling block and
prevent processing `all_statements` or any subsequent logic when the DB switch
fails.
In `@sql/engines/phoenix.py`:
- Around line 176-206: In the except block where a failed ReviewResult is
appended (uses rowid), ensure subsequent skipped "Previous statement failed"
rows don’t reuse that same id by advancing rowid after recording the failure;
specifically, increment rowid (the loop counter used for IDs) immediately after
appending the failure ReviewResult in the except handler (or adjust the final
loop over split_sql to start IDs from rowid+1) so each ReviewResult created for
split_sql keeps a unique id.
In `@sql/plugins/plugin.py`:
- Around line 123-125: Replace the detailed traceback logging and error
embedding with a generic message: stop using traceback.format_exc() in the
logger and remove str(e) from the raised RuntimeError. Specifically, update the
except block that currently calls logger.error("Command execution
failed\n{}".format(traceback.format_exc())) and raises RuntimeError("Command
execution failed, reason: %s" % str(e)) so it instead logs a generic message via
logger.error("Command execution failed") (no traceback) and raises a generic
RuntimeError("Command execution failed") without including the original
exception details; if you need to suppress exception chaining, raise the
RuntimeError using "from None".
---
Nitpick comments:
In `@api_queries/views.py`:
- Around line 832-837: The except block swallowing AuditException needs to log
the failure details before raising the generic ValidationError; in the except
AuditException as exc: handler for auditor.operate(...) add a server-side log
call (using your existing logger) that records apply_id (from data or context),
request.user, and the attempted action/audit_status plus exc, then re-raise
serializers.ValidationError({"errors": "Audit failed."}) as before; reference
the auditor.operate call and the AuditException/serializers.ValidationError to
locate the correct spot.
In `@common/check.py`:
- Around line 180-183: The code block uses the root logging calls
(logging.error, logging.debug, logging.info) instead of the module-level logger;
update the storage connectivity messages to use the module logger (replace
logging.error, logging.debug, logging.info with logger.error, logger.debug,
logger.info) so handlers/format/filters remain consistent—also make the same
replacement for the other occurrence mentioned around line 187 (where logging.*
is used).
- Around line 49-53: Replace the broad "except Exception" blocks that wrap the
goInception connectivity checks with specific DB/connectivity exception types
from the DB client you use (e.g., OperationalError, InterfaceError,
DatabaseError or the driver's connection exceptions) and call
logger.exception(...) instead of logger.error(traceback.format_exc()) so the
stack/exception context is preserved; keep the existing result["status"] = 1 and
result["msg"] = "Unable to connect to goInception" behavior. Apply the same
change to both exception handlers (the blocks that currently use
traceback.format_exc()) so only connection-related exceptions are caught and
other errors propagate.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 4d645e01-15c8-473c-a704-c0544e39e1db
📒 Files selected for processing (21)
- .github/workflows/django.yml
- api_access/views.py
- api_admin/settings.py
- api_archives/views.py
- api_instances/views.py
- api_queries/views.py
- api_workflows/views.py
- common/check.py
- common/utils/sendmsg.py
- common/utils/wx_api.py
- sql/engines/cassandra.py
- sql/engines/doris.py
- sql/engines/mssql.py
- sql/engines/mysql.py
- sql/engines/oracle.py
- sql/engines/pgsql.py
- sql/engines/phoenix.py
- sql/engines/redis.py
- sql/plugins/plugin.py
- sql/plugins/tests.py
- sql/resource_group.py
🚧 Files skipped from review as they are similar to previous changes (4)
- api_admin/settings.py
- .github/workflows/django.yml
- common/utils/wx_api.py
- api_workflows/views.py
```python
result["msg"] = "Email delivery failed; check server logs."
logger.error("Email delivery test failed: %s", sender_response)
return result
```
Avoid logging raw email backend responses.
Line 116 logs `sender_response` verbatim, which can include provider/internal diagnostics and expand sensitive log surface. Keep client message generic (good), but sanitize server logs too.
🔧 Proposed fix
```diff
- logger.error("Email delivery test failed: %s", sender_response)
+ logger.error("Email delivery test failed")
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
result["msg"] = "Email delivery failed; check server logs."
logger.error("Email delivery test failed")
return result
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@common/check.py` around lines 115 - 117, The logger.error call currently logs
the raw sender_response (variable sender_response) which may contain sensitive
provider diagnostics; replace this with a sanitized summary: extract and log
only non-sensitive fields (e.g., status_code, error_code, and a short message)
or a generic message plus a hashed/trimmed identifier, and avoid printing the
full sender_response; update the error call in common/check.py (the logger.error
line) to use the sanitized_summary (or logger.debug the full response behind a
feature flag) so production logs do not contain raw backend responses.
```diff
 response = requests.post(
     url=url,
     json=data,
     headers={"Authorization": "Bearer " + get_feishu_access_token()},
-).json()
+)
```
🧩 Analysis chain
🏁 Script executed:
```shell
#!/bin/bash
python - <<'PY'
import ast
from pathlib import Path

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options", "request"}
for path in Path(".").rglob("*.py"):
    try:
        src = path.read_text(encoding="utf-8")
        tree = ast.parse(src)
    except Exception:
        continue
    for node in ast.walk(tree):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute):
            base = node.func.value
            if isinstance(base, ast.Name) and base.id == "requests" and node.func.attr in HTTP_METHODS:
                has_timeout = any(k.arg == "timeout" for k in node.keywords if k.arg is not None)
                if not has_timeout:
                    print(f"{path}:{node.lineno} -> requests.{node.func.attr} without timeout")
PY
```

Repository: jruszo/Datamingle
Length of output: 293
🏁 Script executed:
```shell
cat -n common/utils/sendmsg.py | sed -n '210,230p'
```

Repository: jruszo/Datamingle
Length of output: 951
🏁 Script executed:
```shell
# Check for timeout patterns in the entire sendmsg.py file
rg -n "timeout" common/utils/sendmsg.py
```

Repository: jruszo/Datamingle
Length of output: 381
🏁 Script executed:
```shell
cat -n common/utils/sendmsg.py | sed -n '190,200p'
```

Repository: jruszo/Datamingle
Length of output: 445
Add explicit timeout to Feishu API requests.
Lines 194 and 220 use `requests.post(...)` without a `timeout`; these can block indefinitely on network hangs and tie up worker threads. Use a bounded timeout of 5 seconds, consistent with the other `requests` calls in this file (lines 149, 171).
🔧 Proposed fixes
Line 194:

```diff
-r = requests.post(url=url, json=data)
+r = requests.post(url=url, json=data, timeout=5)
```

Line 220:

```diff
 response = requests.post(
     url=url,
     json=data,
     headers={"Authorization": "Bearer " + get_feishu_access_token()},
+    timeout=5,
 )
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
response = requests.post(
    url=url,
    json=data,
    headers={"Authorization": "Bearer " + get_feishu_access_token()},
    timeout=5,
)
```
🧰 Tools
🪛 Ruff (0.15.12)
[error] 220-220: Probable use of requests call without timeout
(S113)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@common/utils/sendmsg.py` around lines 220 - 224, The requests.post calls in
sendmsg.py (notably the call assigning response around the variable response and
the earlier call near line 194) lack an explicit timeout and can hang; update
both requests.post invocations to include timeout=5 (consistent with other calls
in this file) so the Feishu API requests use a bounded 5-second timeout—keep the
same url, json/data and headers parameters and simply add timeout=5 to each
requests.post call.
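One way to make the bounded-timeout convention hard to forget is a thin wrapper that injects a default at a single choke point. A minimal sketch; the `post_with_timeout` helper and `fake_post` stand-in are hypothetical, not part of sendmsg.py:

```python
def post_with_timeout(post_func, *args, timeout=5, **kwargs):
    # Inject a default timeout unless the caller explicitly passed one,
    # so no call site can silently block forever.
    kwargs.setdefault("timeout", timeout)
    return post_func(*args, **kwargs)

# Stand-in for requests.post that just echoes the kwargs it received.
def fake_post(url, **kwargs):
    return {"url": url, "kwargs": kwargs}

call = post_with_timeout(fake_post, "https://open.feishu.cn/x", json={"a": 1})
print(call["kwargs"]["timeout"])  # 5
```

In real code the wrapper would be called with `requests.post` as `post_func`; a caller can still override with `timeout=2` when a tighter bound is needed.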
```python
normalized_sql = sql.lower().replace("`", " ")
normalized_sql = " ".join(normalized_sql.replace("\n", " ").split())
normalized_sql = re.sub(r"\s*\.\s*", ".", normalized_sql)
references_mysql_user = "mysql.user" in normalized_sql
references_user_table = any(
    token.strip(";") == "user" for token in normalized_sql.split()
)
if references_mysql_user or (db_name == "mysql" and references_user_table):
```
Limit the `user` check to table positions.
In the `mysql` database this blocks any query containing a standalone `user` token, not just reads from the `user` table. For example, `SELECT user FROM some_other_table` now gets rejected even though it never touches `mysql.user`.
Suggested fix
```diff
 normalized_sql = sql.lower().replace("`", " ")
 normalized_sql = " ".join(normalized_sql.replace("\n", " ").split())
 normalized_sql = re.sub(r"\s*\.\s*", ".", normalized_sql)
 references_mysql_user = "mysql.user" in normalized_sql
-references_user_table = any(
-    token.strip(";") == "user" for token in normalized_sql.split()
-)
+references_user_table = bool(
+    re.search(
+        r"\b(from|join|update|into|desc(?:ribe)?|table)\s+user\b",
+        normalized_sql,
+    )
+)
 if references_mysql_user or (db_name == "mysql" and references_user_table):
     result["bad_query"] = True
     result["msg"] = "You do not have permission to view this table"
```
Verify each finding against the current code and only fix it if needed.
In `@sql/engines/mysql.py` around lines 585 - 592, The current check flags any
standalone "user" token anywhere in the SQL (e.g., SELECT user FROM
other_table); change references_user_table to only detect "user" when it appears
as a table reference by matching SQL keywords that introduce table names (e.g.,
FROM, JOIN, INTO, UPDATE, DELETE) rather than any token. Update the logic around
normalized_sql, references_mysql_user, and references_user_table so
references_user_table uses a regex search like looking for patterns
(from|join|into|update|delete)\s+`?user`?\b (case-insensitive) against
normalized_sql, and keep the existing references_mysql_user check for
"mysql.user"; then use the combined condition if references_mysql_user or
(db_name == "mysql" and references_user_table).
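The behavior difference the suggested regex buys can be checked in isolation. This is a standalone sketch of the proposed table-position logic, not the engine code itself (a slightly reduced keyword list is used for brevity):

```python
import re

def references_user_table(sql: str) -> bool:
    # Normalize roughly the way the engine does, then look for "user"
    # only where a table name can appear.
    normalized = " ".join(sql.lower().replace("`", " ").split())
    normalized = re.sub(r"\s*\.\s*", ".", normalized)
    return bool(re.search(r"\b(from|join|update|into|delete)\s+user\b", normalized))

print(references_user_table("SELECT user FROM some_other_table"))  # False
print(references_user_table("SELECT * FROM `user` WHERE host = '%'"))  # True
```

A column named `user` no longer trips the check, while reads of the `user` table still do; fully qualified `mysql.user` references remain covered by the separate substring check.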
```diff
     return validate_binary_path(value, "gh-ost")
 except ValueError as exc:
-    raise serializers.ValidationError(str(exc))
+    raise serializers.ValidationError(str(exc)) from None
```
```diff
     return validate_binary_path(value, "pt-online-schema-change")
 except ValueError as exc:
-    raise serializers.ValidationError(str(exc))
+    raise serializers.ValidationError(str(exc)) from None
```
```diff
 except MysqlDDLExecutorError as exc:
-    raise serializers.ValidationError({"errors": str(exc)})
+    logger.warning("Failed to resolve MySQL DDL executor", exc_info=True)
+    raise serializers.ValidationError({"errors": str(exc)}) from None
```
```python
        exc_info=True,
    )
    raise serializers.ValidationError(
        {"errors": str(exc)}
```
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
sql/engines/mssql.py (1)
771-784: ⚠️ Potential issue | 🟠 Major
Sanitize execution error payloads returned to clients.
Line 773/807 and Line 779/814 still expose raw exception text in `execute_result`/`ReviewResult`. This can leak DB internals even though logging is now hardened.
🔐 Suggested fix

```diff
 except Exception as e:
     logger.warning("MSSQL USE statement failed", exc_info=True)
-    execute_result.error = str(e)
+    safe_error = "Execution failed"
+    execute_result.error = safe_error
     execute_result.rows.append(
         ReviewResult(
             id=rowid,
             errlevel=2,
             stagestatus="Execute Failed",
-            errormessage=f"Exception info: {e}",
+            errormessage=safe_error,
             sql=use_sql,
             affected_rows=0,
             execute_time=0,
         )
     )
@@
 except Exception as e:
     logger.warning("Mssql command execution failed", exc_info=True)
-    execute_result.error = str(e)
+    safe_error = "Execution failed"
+    execute_result.error = safe_error
     # Append failed statement to execution results.
     execute_result.rows.append(
         ReviewResult(
             id=rowid,
             errlevel=2,
             stagestatus="Execute Failed",
-            errormessage=f"Exception info: {e}",
+            errormessage=safe_error,
             sql=statement,
             affected_rows=0,
             execute_time=0,
         )
     )
```

Also applies to: 805-815
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sql/engines/mssql.py` around lines 771 - 784, The current exception handling in the MSSQL USE statement block assigns raw exception text into execute_result.error and ReviewResult.errormessage (see execute_result, ReviewResult, use_sql, rowid), which may leak DB internals; change these assignments to a sanitized generic message (e.g., "Internal execution error" or "Execution failed") or a minimal error code/token before returning to clients, while keeping logger.warning(..., exc_info=True) to record full details internally; do this for both the USE block and the other identical block handling execution failures so clients never receive raw exception strings.sql/engines/cassandra.py (1)
218-227: ⚠️ Potential issue | 🟠 Major
Do not return raw Cassandra exception text in workflow results.
Line 220 and Line 226 still include unfiltered exception details in returned objects. Keep detailed info in server logs and return a generic safe message to clients.
🔐 Suggested fix

```diff
 except Exception as e:
     logger.warning("%s command execution error", self.name, exc_info=True)
-    execute_result.error = str(e)
+    safe_error = "Execution failed"
+    execute_result.error = safe_error
     execute_result.rows.append(
         ReviewResult(
             id=rowid,
             errlevel=2,
             stagestatus="Execute Failed",
-            errormessage=f"Exception: {e}",
+            errormessage=safe_error,
             sql=statement,
             affected_rows=0,
             execute_time=0,
         )
     )
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sql/engines/cassandra.py` around lines 218 - 227, The code is returning raw Cassandra exception text to clients; keep the full exception in server logs but replace client-facing fields. In the except block around logger.warning(self.name...), stop assigning str(e) into execute_result.error and ReviewResult.errormessage; instead set execute_result.error to a generic message like "Internal execution error" and set ReviewResult(errormessage="Internal execution error", errlevel=2, stagestatus="Execute Failed", ...) while keeping logger.warning(..., exc_info=True) to retain full details in logs; ensure no other fields use str(e).
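The split this comment asks for (full detail in the server log, only a generic string in the returned result) can be sketched engine-agnostically. `run_statement` and `boom` are illustrative names only, not the engine API:

```python
import logging

logger = logging.getLogger("engine")

def run_statement(execute, statement):
    # Full diagnostics go to the server log; the client-facing result
    # only ever carries a generic, safe message.
    try:
        return {"error": None, "rows": execute(statement)}
    except Exception:
        logger.warning("statement execution failed", exc_info=True)
        return {"error": "Execution failed", "rows": []}

def boom(_statement):
    raise RuntimeError("connection refused: 10.0.0.5:9042")

outcome = run_statement(boom, "SELECT * FROM t")
print(outcome["error"])  # Execution failed
```

The host/port detail stays in the log record's traceback and never reaches the caller.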
♻️ Duplicate comments (1)
common/check.py (1)
113-117: ⚠️ Potential issue | 🟠 Major
Do not return/log raw email backend responses.
Line 115 and Line 116 still expose `sender_response` directly. That can leak provider/internal diagnostics to clients and logs.
🔧 Proposed fix

```diff
 if sender_response != "success":
     result["status"] = 1
-    result["msg"] = sender_response
-    logger.error("Email delivery test failed: %s", sender_response)
+    result["msg"] = "Email delivery test failed."
+    logger.error("Email delivery test failed")
     return result
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@common/check.py` around lines 113 - 117, Do not expose raw sender_response to clients or logs: update the failure path where sender_response is used (the sender_response variable, result dict assignment, and logger.error call) to set result["msg"] to a generic message like "Email delivery failed" (no provider/internal details) and change the logger.error to record only a sanitized/limited identifier (e.g., an error code, truncated or redacted excerpt, or hashed value) instead of the full sender_response so internal diagnostics are not leaked.
🧹 Nitpick comments (3)
common/check.py (2)
213-217: Use warning-level logging (without traceback) for expected lookup failures.
`Instance.DoesNotExist` / validation failures are expected user-input paths; `logger.exception(...)` adds noisy stack traces. Prefer a warning log with instance context.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@common/check.py` around lines 213 - 217, Replace the noisy logger.exception call in the except handling for Instance.DoesNotExist/ValueError/TypeError/ValidationError with a warning-level log that does not emit a traceback; in the block that currently catches Instance.DoesNotExist etc. (around the connection test logic using result and returning an HttpResponse), call logger.warning with concise context (e.g. instance identifier, name or request/user info available in scope) like "Instance lookup failed for connection test: %s" and pass the identifier as a parameter, then keep setting result["status"], result["msg"] and returning the HttpResponse as before.
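The traceback-noise trade-off this nitpick describes is easy to see with a toy logger setup (illustrative only; the real code logs through the project's logging configuration):

```python
import io
import logging

stream = io.StringIO()
logging.basicConfig(stream=stream, level=logging.INFO, force=True)
log = logging.getLogger("demo")

try:
    int("not-a-port")
except ValueError:
    # logger.exception attaches the full traceback of the active exception...
    log.exception("lookup failed")
    # ...while logger.warning with plain context stays to one line.
    log.warning("lookup failed for instance %s", "db-01")

output = stream.getvalue()
print("Traceback" in output)  # True
```

For expected user-input failures, the single-line warning with an identifier carries the same diagnostic value without the stack-trace noise.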
49-52: Narrow the exception handling to expected failure modes.
Catching all exceptions masks the actual error types and makes debugging harder. For these connection checks, `MySQLdb.Error` covers the documented DB-API 2.0 exception hierarchy for connection/authentication/network failures. Additionally, the `int(port)` conversion can raise `ValueError`, making it appropriate to catch alongside database errors.
🔧 Proposed fix

```diff
-except Exception:
-    logger.error(traceback.format_exc())
+except (MySQLdb.Error, ValueError):
+    logger.exception("goInception primary connection check failed")
     result["status"] = 1
     result["msg"] = "Unable to connect to goInception"
     return result
@@
-except Exception:
-    logger.error(traceback.format_exc())
+except (MySQLdb.Error, ValueError):
+    logger.exception("goInception backup connection check failed")
     result["status"] = 1
     result["msg"] = "Unable to connect to goInception backup database"
```

Also applies to: 68-71
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@common/check.py` around lines 49 - 52, The broad except Exception that logs traceback.format_exc() and sets result["status"]/result["msg"] should be narrowed to only catch the expected errors — replace except Exception with except (MySQLdb.Error, ValueError) around the goInception connection checks so DB-API connection/authentication/network failures (MySQLdb.Error) and port conversion failures (ValueError from int(port)) are handled; keep the same logger.error(traceback.format_exc()) and result assignments, and apply the same change to the second identical block (the other try/except that currently sets result["status"]=1 and result["msg"]="Unable to connect to goInception").api_instances/views.py (1)
755-760: Consider using `from None` to suppress exception chaining.
The generic error messages are correctly hiding sensitive details from API responses. However, adding `from None` would explicitly suppress exception chaining, making the intent clearer and silencing the B904 linter warnings. This applies to all similar patterns in this file (Lines 760, 823, 919, 1009, 1108, 1171, 1295, 1346, 1452, 1507, 1551, 1621, 1714, 1869, 1968, 2013, 2058, 2106, 2152, 2196, 2515).
♻️ Example fix pattern

```diff
 except Exception as exc:
     logger.exception(
         "Failed to list data dictionary databases for instance_id=%s",
         instance_id,
     )
-    raise serializers.ValidationError({"errors": "Operation failed."})
+    raise serializers.ValidationError({"errors": "Operation failed."}) from None
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api_instances/views.py` around lines 755 - 760, The except blocks that log the exception and then raise a generic serializers.ValidationError (e.g., the logger.exception call followed by raise serializers.ValidationError({"errors": "Operation failed."}) around instance_id handling) should suppress exception chaining by raising the ValidationError with "from None"; update each similar pattern in this file (all except handlers at the listed locations) to use "raise serializers.ValidationError({...}) from None" so the original exception is not chained into the API response or linter warnings.
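What `from None` changes is visible on the exception object itself. A toy `parse_port` stands in for the view code here; the attribute behavior is standard Python:

```python
def parse_port(value):
    try:
        return int(value)
    except ValueError:
        # "from None" hides the internal ValueError from the public error,
        # setting __suppress_context__ and leaving __cause__ empty.
        raise RuntimeError("Operation failed.") from None

try:
    parse_port("abc")
except RuntimeError as exc:
    caught = exc

print(caught.__suppress_context__, caught.__cause__)  # True None
```

The original `ValueError` is still reachable via `__context__` for debugging, but default traceback rendering no longer prints the "During handling of the above exception" chain.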
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@api_admin/settings.py`:
- Line 363: The code currently raises serializers.ValidationError(str(exc))
(seen around the ValidationError raises in api_admin/settings.py that wrap
validate_binary_path(...) failures), which can leak internal exception text to
clients; change those raises to return a fixed, client-safe error message (e.g.,
"Invalid binary path" or "Invalid configuration value") instead of str(exc),
preserving the original exception for internal logging if needed (log the exc
inside the same scope using logger.exception or logger.error) and keep the
ValidationError type so callers still get a 4xx response.
In `@api_instances/views.py`:
- Around line 2514-2515: The except block currently swallows the caught
exception (named msg) and re-raises a generic serializers.ValidationError;
update this handler to log the original exception before re-raising by calling
logger.exception(...) (or logger.error with exc_info=True) using the caught
exception variable (rename msg to e if helpful) and then raise the same
serializers.ValidationError({"errors": "Operation failed."}) so the error
details are recorded while preserving the external API response.
In `@api_workflows/views.py`:
- Around line 1700-1707: Replace the insecure exposure of the raw exception
string in the MySQL DDL executor error handler: inside the except
MysqlDDLExecutorError as exc block (where logger.warning is called and
serializers.ValidationError is raised), stop returning {"errors": str(exc)} to
the client; instead raise serializers.ValidationError with a generic,
non-sensitive error message such as {"errors": "Failed to resolve MySQL DDL
executor"} while keeping the detailed exception logged via
logger.warning/logger.exception (exc_info=True) so internal diagnostics remain
available without leaking exception details to the API consumer.
- Around line 1418-1421: The catch for MysqlDDLExecutorError currently returns
the exception text to clients (raise serializers.ValidationError({"errors":
str(exc)})), which can leak subprocess output; instead keep detailed exc_info in
server logs but raise a generic client-safe message. Update the two places
handling MysqlDDLExecutorError (the handler around selected_executor =
resolved_executor.executor_id and the similar handler in the same file tied to
executor.dry_run()/_run_subprocess()) to keep logger.warning(..., exc_info=True)
but change the ValidationError payload to a generic string (e.g. {"errors":
"MySQL DDL executor dry-run failed"}), so internal stderr/stdout are not
returned to API clients.
---
Outside diff comments:
In `@sql/engines/cassandra.py`:
- Around line 218-227: The code is returning raw Cassandra exception text to
clients; keep the full exception in server logs but replace client-facing
fields. In the except block around logger.warning(self.name...), stop assigning
str(e) into execute_result.error and ReviewResult.errormessage; instead set
execute_result.error to a generic message like "Internal execution error" and
set ReviewResult(errormessage="Internal execution error", errlevel=2,
stagestatus="Execute Failed", ...) while keeping logger.warning(...,
exc_info=True) to retain full details in logs; ensure no other fields use
str(e).
In `@sql/engines/mssql.py`:
- Around line 771-784: The current exception handling in the MSSQL USE statement
block assigns raw exception text into execute_result.error and
ReviewResult.errormessage (see execute_result, ReviewResult, use_sql, rowid),
which may leak DB internals; change these assignments to a sanitized generic
message (e.g., "Internal execution error" or "Execution failed") or a minimal
error code/token before returning to clients, while keeping logger.warning(...,
exc_info=True) to record full details internally; do this for both the USE block
and the other identical block handling execution failures so clients never
receive raw exception strings.
---
Duplicate comments:
In `@common/check.py`:
- Around line 113-117: Do not expose raw sender_response to clients or logs:
update the failure path where sender_response is used (the sender_response
variable, result dict assignment, and logger.error call) to set result["msg"] to
a generic message like "Email delivery failed" (no provider/internal details)
and change the logger.error to record only a sanitized/limited identifier (e.g.,
an error code, truncated or redacted excerpt, or hashed value) instead of the
full sender_response so internal diagnostics are not leaked.
---
Nitpick comments:
In `@api_instances/views.py`:
- Around line 755-760: The except blocks that log the exception and then raise a
generic serializers.ValidationError (e.g., the logger.exception call followed by
raise serializers.ValidationError({"errors": "Operation failed."}) around
instance_id handling) should suppress exception chaining by raising the
ValidationError with "from None"; update each similar pattern in this file (all
except handlers at the listed locations) to use "raise
serializers.ValidationError({...}) from None" so the original exception is not
chained into the API response or linter warnings.
In `@common/check.py`:
- Around line 213-217: Replace the noisy logger.exception call in the except
handling for Instance.DoesNotExist/ValueError/TypeError/ValidationError with a
warning-level log that does not emit a traceback; in the block that currently
catches Instance.DoesNotExist etc. (around the connection test logic using
result and returning an HttpResponse), call logger.warning with concise context
(e.g. instance identifier, name or request/user info available in scope) like
"Instance lookup failed for connection test: %s" and pass the identifier as a
parameter, then keep setting result["status"], result["msg"] and returning the
HttpResponse as before.
- Around line 49-52: The broad except Exception that logs traceback.format_exc()
and sets result["status"]/result["msg"] should be narrowed to only catch the
expected errors — replace except Exception with except (MySQLdb.Error,
ValueError) around the goInception connection checks so DB-API
connection/authentication/network failures (MySQLdb.Error) and port conversion
failures (ValueError from int(port)) are handled; keep the same
logger.error(traceback.format_exc()) and result assignments, and apply the same
change to the second identical block (the other try/except that currently sets
result["status"]=1 and result["msg"]="Unable to connect to goInception").
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: c0396a1f-4f5b-4402-bcfa-cda771428cea
📒 Files selected for processing (8)
- api_admin/settings.py
- api_instances/views.py
- api_workflows/views.py
- common/check.py
- sql/data_dictionary.py
- sql/engines/cassandra.py
- sql/engines/mssql.py
- sql/engines/oracle.py
✅ Files skipped from review due to trivial changes (1)
- sql/data_dictionary.py
```diff
     return validate_binary_path(value, "gh-ost")
 except ValueError as exc:
-    raise serializers.ValidationError(str(exc))
+    raise serializers.ValidationError(str(exc)) from None
```
Avoid exposing raw validation exception text to clients.
On Line 363 and Line 369, returning str(exc) can leak internal details from validate_binary_path(...) into API responses. Use a fixed, client-safe message instead.
🔒 Suggested patch
def validate_gh_ost(self, value):
try:
return validate_binary_path(value, "gh-ost")
- except ValueError as exc:
- raise serializers.ValidationError(str(exc)) from None
+ except ValueError:
+ raise serializers.ValidationError("Operation failed.") from None
def validate_pt_osc(self, value):
try:
return validate_binary_path(value, "pt-online-schema-change")
- except ValueError as exc:
- raise serializers.ValidationError(str(exc)) from None
+ except ValueError:
+ raise serializers.ValidationError("Operation failed.") from None
Also applies to: 369-369
🧰 Tools
🪛 GitHub Check: CodeQL
[warning] 363-363: Information exposure through an exception
Stack trace information flows to this location and may be exposed to an external user.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@api_admin/settings.py` at line 363, The code currently raises
serializers.ValidationError(str(exc)) (seen around the ValidationError raises in
api_admin/settings.py that wrap validate_binary_path(...) failures), which can
leak internal exception text to clients; change those raises to return a fixed,
client-safe error message (e.g., "Invalid binary path" or "Invalid configuration
value") instead of str(exc), preserving the original exception for internal
logging if needed (log the exc inside the same scope using logger.exception or
logger.error) and keep the ValidationError type so callers still get a 4xx
response.
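The log-then-generic-raise pattern this prompt describes can be sketched with stdlib stand-ins (the `ValidationError` class and `validate_binary_path` below are hypothetical substitutes for `rest_framework.serializers.ValidationError` and the project's validator):

```python
import logging

logger = logging.getLogger(__name__)


class ValidationError(Exception):
    """Minimal stand-in for rest_framework.serializers.ValidationError."""


def validate_binary_path(value: str, name: str) -> str:
    # Hypothetical validator: reject anything that is not an absolute path.
    if not value.startswith("/"):
        raise ValueError(f"{name} path {value!r} is not absolute (internal detail)")
    return value


def validate_gh_ost(value: str) -> str:
    try:
        return validate_binary_path(value, "gh-ost")
    except ValueError:
        # Record the specific reason server-side; the client only ever sees
        # the fixed message, so internals never reach the API response.
        logger.warning("gh-ost binary path rejected", exc_info=True)
        raise ValidationError("Invalid binary path.") from None
```

The `from None` keeps the original `ValueError` text out of any traceback that might be rendered to the caller, while `exc_info=True` preserves it in the server log.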
  except Exception as msg:
-     raise serializers.ValidationError({"errors": str(msg)})
+     raise serializers.ValidationError({"errors": "Operation failed."})
Missing exception logging before re-raising generic error.
Unlike other handlers in this file that call logger.exception(...) before raising the generic error, this handler silently swallows the exception details without logging. The captured variable msg is also unused.
🐛 Proposed fix
- except Exception as msg:
- raise serializers.ValidationError({"errors": "Operation failed."})
+ except Exception:
+ logger.exception(
+ "Failed to retrieve instance resources for instance_id=%s resource_type=%s",
+ instance_id,
+ resource_type,
+ )
+ raise serializers.ValidationError({"errors": "Operation failed."}) from None
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
- except Exception as msg:
-     raise serializers.ValidationError({"errors": str(msg)})
-     raise serializers.ValidationError({"errors": "Operation failed."})
+ except Exception:
+     logger.exception(
+         "Failed to retrieve instance resources for instance_id=%s resource_type=%s",
+         instance_id,
+         resource_type,
+     )
+     raise serializers.ValidationError({"errors": "Operation failed."}) from None
🧰 Tools
🪛 Ruff (0.15.12)
[warning] 2514-2514: Do not catch blind exception: Exception
(BLE001)
[warning] 2515-2515: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@api_instances/views.py` around lines 2514 - 2515, The except block currently
swallows the caught exception (named msg) and re-raises a generic
serializers.ValidationError; update this handler to log the original exception
before re-raising by calling logger.exception(...) (or logger.error with
exc_info=True) using the caught exception variable (rename msg to e if helpful)
and then raise the same serializers.ValidationError({"errors": "Operation
failed."}) so the error details are recorded while preserving the external API
response.
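Ruff's B904 warning cited above is about exception chaining. A minimal stdlib-only illustration of what `from None` changes (the function names are invented for the example):

```python
def reraise_default() -> None:
    try:
        int("nope")
    except ValueError:
        # Implicit chaining: the ValueError is kept as __context__, and both
        # tracebacks are printed if this RuntimeError propagates.
        raise RuntimeError("Operation failed.")


def reraise_from_none() -> None:
    try:
        int("nope")
    except ValueError:
        # `from None` sets __suppress_context__, so only the generic message
        # is rendered; the original error text stays out of the output.
        raise RuntimeError("Operation failed.") from None
```

Either form satisfies B904; the review's point is that when the generic message is the one shown to clients, `from None` also keeps the chained internal traceback from leaking alongside it.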
  except MysqlDDLExecutorError as exc:
-     raise serializers.ValidationError({"errors": str(exc)})
+     logger.warning("Failed to resolve MySQL DDL executor", exc_info=True)
+     raise serializers.ValidationError({"errors": str(exc)}) from None
  selected_executor = resolved_executor.executor_id
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for MysqlDDLExecutorError usage and message construction
ast-grep --pattern 'class MysqlDDLExecutorError($_) {
$$$
}'
# Also search for how it's raised
rg -n -A3 'raise MysqlDDLExecutorError'
Repository: jruszo/Datamingle
Length of output: 1455
🏁 Script executed:
# Find MysqlDDLExecutorError class definition
fd -t f '*.py' | xargs grep -l 'class MysqlDDLExecutorError'
Repository: jruszo/Datamingle
Length of output: 347
🏁 Script executed:
# Get full context around lines 1418-1421 in api_workflows/views.py
cat -n api_workflows/views.py | sed -n '1410,1430p'
Repository: jruszo/Datamingle
Length of output: 1097
🏁 Script executed:
# Check the `reason` variable at line 428 in mysql_ddl.py to understand what it contains
cat -n sql/engines/mysql_ddl.py | sed -n '420,435p'
Repository: jruszo/Datamingle
Length of output: 787
🏁 Script executed:
# Look for any sanitization or error message handling patterns in the codebase
rg -n 'ValidationError.*str\(' api_workflows/ -B2 -A2
Repository: jruszo/Datamingle
Length of output: 481
🏁 Script executed:
# Search for the inspection object and blockers definition
rg -n 'blockers' sql/engines/mysql_ddl.py -B5 -A5
Repository: jruszo/Datamingle
Length of output: 2002
🏁 Script executed:
# Look at the context around line 170 where output is used
cat -n sql/engines/mysql_ddl.py | sed -n '155,175p'
Repository: jruszo/Datamingle
Length of output: 1005
🏁 Script executed:
# Search for the inspection class/object to understand what blockers contains
rg -n 'class.*Inspection|inspection.*=' sql/engines/mysql_ddl.py -A3
Repository: jruszo/Datamingle
Length of output: 689
🏁 Script executed:
# Find _resolve_mysql_ddl_executor function definition
rg -n 'def _resolve_mysql_ddl_executor' api_workflows/ -A20
Repository: jruszo/Datamingle
Length of output: 1412
🏁 Script executed:
# Also check if preflight parameter affects whether subprocess is called
rg -n 'preflight' sql/engines/mysql_ddl.py -B2 -A2
Repository: jruszo/Datamingle
Length of output: 599
🏁 Script executed:
# Find preflight_ddl_executor method to see what it does
rg -n 'def preflight_ddl_executor' sql/engines/mysql_ddl.py -A15
Repository: jruszo/Datamingle
Length of output: 43
🏁 Script executed:
# Check the full preflight method implementation
cat -n sql/engines/mysql_ddl.py | sed -n '452,475p'
Repository: jruszo/Datamingle
Length of output: 1176
🏁 Script executed:
# Search for preflight_ddl_executor more broadly
rg -n 'preflight_ddl_executor' sql/engines/
Repository: jruszo/Datamingle
Length of output: 152
🏁 Script executed:
# Check if it's inherited or in a parent class
rg -n 'class.*MysqlDDL' sql/engines/mysql_ddl.py -A3
Repository: jruszo/Datamingle
Length of output: 1499
🏁 Script executed:
# Check the preflight_ddl_executor implementation in mysql.py
cat -n sql/engines/mysql.py | sed -n '755,780p'
Repository: jruszo/Datamingle
Length of output: 1214
🏁 Script executed:
# Also check what calls .preflight() method
rg -n '\.preflight\(' sql/engines/mysql_ddl.py sql/engines/mysql.py -B2 -A2
Repository: jruszo/Datamingle
Length of output: 353
🏁 Script executed:
# Check the dry_run method implementation for BaseMysqlDDLExecutor
rg -n 'def dry_run' sql/engines/mysql_ddl.py -A10
Repository: jruszo/Datamingle
Length of output: 1288
🏁 Script executed:
# Also verify that _run_subprocess is called from dry_run
rg -n '_run_subprocess' sql/engines/mysql_ddl.py
Repository: jruszo/Datamingle
Length of output: 309
🏁 Script executed:
# Check if the code snippet at lines 1418-1421 complies with Black formatting
cat -n api_workflows/views.py | sed -n '1418,1421p'
Repository: jruszo/Datamingle
Length of output: 368
🏁 Script executed:
# Also check if similar patterns exist elsewhere in the file that handle MysqlDDLExecutorError
rg -n 'MysqlDDLExecutorError' api_workflows/views.py
Repository: jruszo/Datamingle
Length of output: 236
🏁 Script executed:
# Check the other occurrence at line 1700
cat -n api_workflows/views.py | sed -n '1695,1710p'
Repository: jruszo/Datamingle
Length of output: 989
Replace str(exc) with a generic error message to prevent exposing subprocess output.
The preflight path calls executor.dry_run(), which invokes _run_subprocess(). When subprocess execution fails, this raises MysqlDDLExecutorError with the raw stderr/stdout output (line 170-172 of sql/engines/mysql_ddl.py). Using str(exc) exposes this unsanitized output to API clients, potentially leaking database details, connection parameters, and system information.
This pattern appears at two locations:
- Line 1420 (the one under review)
- Line 1706 (same issue)
Apply the suggested fix to both occurrences:
Suggested fix
except MysqlDDLExecutorError as exc:
logger.warning("Failed to resolve MySQL DDL executor", exc_info=True)
- raise serializers.ValidationError({"errors": str(exc)}) from None
+ raise serializers.ValidationError(
+ {"errors": "MySQL DDL executor resolution failed."}
+ ) from None
🧰 Tools
🪛 GitHub Check: CodeQL
[warning] 1420-1420: Information exposure through an exception
Stack trace information flows to this location and may be exposed to an external user.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@api_workflows/views.py` around lines 1418 - 1421, The catch for
MysqlDDLExecutorError currently returns the exception text to clients (raise
serializers.ValidationError({"errors": str(exc)})), which can leak subprocess
output; instead keep detailed exc_info in server logs but raise a generic
client-safe message. Update the two places handling MysqlDDLExecutorError (the
handler around selected_executor = resolved_executor.executor_id and the similar
handler in the same file tied to executor.dry_run()/_run_subprocess()) to keep
logger.warning(..., exc_info=True) but change the ValidationError payload to a
generic string (e.g. {"errors": "MySQL DDL executor dry-run failed"}), so
internal stderr/stdout are not returned to API clients.
  except MysqlDDLExecutorError as exc:
-     raise serializers.ValidationError({"errors": str(exc)})
+     logger.warning(
+         "Failed to resolve MySQL DDL executor for execution",
+         exc_info=True,
+     )
+     raise serializers.ValidationError(
+         {"errors": str(exc)}
+     ) from None
Same str(exc) exposure issue as the schedule endpoint.
This handler has the same CodeQL-flagged pattern. Apply the same fix here for consistency.
🛡️ Proposed fix for consistency
except MysqlDDLExecutorError as exc:
logger.warning(
"Failed to resolve MySQL DDL executor for execution",
exc_info=True,
)
raise serializers.ValidationError(
- {"errors": str(exc)}
+ {"errors": "MySQL DDL executor resolution failed."}
) from None
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
- except MysqlDDLExecutorError as exc:
-     raise serializers.ValidationError({"errors": str(exc)})
-     logger.warning(
-         "Failed to resolve MySQL DDL executor for execution",
-         exc_info=True,
-     )
-     raise serializers.ValidationError(
-         {"errors": str(exc)}
-     ) from None
+ except MysqlDDLExecutorError as exc:
+     logger.warning(
+         "Failed to resolve MySQL DDL executor for execution",
+         exc_info=True,
+     )
+     raise serializers.ValidationError(
+         {"errors": "MySQL DDL executor resolution failed."}
+     ) from None
🧰 Tools
🪛 GitHub Check: CodeQL
[warning] 1706-1706: Information exposure through an exception
Stack trace information flows to this location and may be exposed to an external user.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@api_workflows/views.py` around lines 1700 - 1707, Replace the insecure
exposure of the raw exception string in the MySQL DDL executor error handler:
inside the except MysqlDDLExecutorError as exc block (where logger.warning is
called and serializers.ValidationError is raised), stop returning {"errors":
str(exc)} to the client; instead raise serializers.ValidationError with a
generic, non-sensitive error message such as {"errors": "Failed to resolve MySQL
DDL executor"} while keeping the detailed exception logged via
logger.warning/logger.exception (exc_info=True) so internal diagnostics remain
available without leaking exception details to the API consumer.
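The sanitize-subprocess-output pattern behind this prompt can be sketched as follows. The `MysqlDDLExecutorError` class and `run_executor` function here are illustrative stand-ins (the real error type and `_run_subprocess` live in sql/engines/mysql_ddl.py):

```python
import logging
import subprocess

logger = logging.getLogger(__name__)


class MysqlDDLExecutorError(Exception):
    """Stand-in for the project's executor error type."""


def run_executor(cmd: list[str]) -> str:
    """Run an external DDL tool, keeping raw output in server logs only."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode != 0:
        # stderr may contain hosts, credential flags, or SQL fragments:
        # record it server-side, then raise a fixed, client-safe message.
        logger.warning(
            "DDL executor failed (rc=%s): %s", proc.returncode, proc.stderr.strip()
        )
        raise MysqlDDLExecutorError("MySQL DDL executor resolution failed.")
    return proc.stdout
```

Callers that surface this error to an API response can then pass `str(exc)` through safely, because the exception text is already a constant.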
Summary
Testing
- black --check --diff . (passed)
- docker exec datamingle-app python manage.py test sql.plugins.tests (passed)
- docker exec datamingle-app python manage.py check (passed)
- docker exec datamingle-app python manage.py makemigrations sql --check (passed)
Summary by CodeRabbit
Release Notes