📝 Walkthrough

This PR comprehensively hardens error handling and security across the codebase by adding explicit GitHub Actions permissions, standardizing error responses to avoid exposing exception details, improving server-side logging with proper traceback capture, hardening file path validation and plugin argument whitelisting, and refactoring SQL execution engine exception handling for consistency.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (12)
.github/workflows/django.yml (2)
95-102: ⚠️ Potential issue | 🔴 Critical

Critical: Remove hardcoded `SECRET_KEY` from the GitHub Actions workflow.

The workflow sets `env: SECRET_KEY: XViQ1NfutP1RraqhHV3M3akMsB3JWZr9`, which is effectively a leaked secret: it will trigger many security scans (CodeQL/hardcoded-secrets) and is a compliance risk, even if it is "just for CI". Use a GitHub Actions secret instead.

🔧 Suggested fix
```diff
       - name: Run Tests
         env:
-          SECRET_KEY: XViQ1NfutP1RraqhHV3M3akMsB3JWZr9
+          SECRET_KEY: ${{ secrets.DJANGO_SECRET_KEY }}
         run: |
           python manage.py makemigrations
           python manage.py makemigrations sql
           pytest --cov --cov-report xml
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/django.yml around lines 95 - 102, The workflow currently hardcodes SECRET_KEY in the "Run Tests" job which leaks a secret; update the env usage for the Run Tests step to read the key from GitHub Actions secrets (e.g. use secrets.SECRET_KEY) instead of the literal value, or remove the env entry if tests do not require it; ensure the repository secret is added in Settings > Secrets before referencing it and update any references in the workflow that rely on SECRET_KEY accordingly.
70-76: ⚠️ Potential issue | 🟠 Major

Major: Add a timeout/attempt limit to the MySQL readiness loop.

The `until docker exec mysql mysqladmin ping ...; do ... sleep 2; done` loop has no upper bound. If MySQL never becomes ready, the workflow can hang indefinitely, consuming CI time and masking the real failure mode. Add a max-attempts/timeout check and fail with logs.

🔧 Suggested fix
```diff
           docker cp my.cnf mysql:/etc/mysql/conf.d/
           docker restart mysql
-          until docker exec mysql mysqladmin ping -h127.0.0.1 -uroot -proot --silent; do
+          max_attempts=30
+          attempt=1
+          until docker exec mysql mysqladmin ping -h127.0.0.1 -uroot -proot --silent; do
             if [ "$(docker inspect -f '{{.State.Running}}' mysql 2>/dev/null)" != "true" ]; then
               docker logs mysql
               exit 1
             fi
+            if [ "$attempt" -ge "$max_attempts" ]; then
+              echo "MySQL did not become ready in time."
+              docker logs mysql
+              exit 1
+            fi
+            attempt=$((attempt+1))
             sleep 2
           done
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/django.yml around lines 70 - 76, The readiness loop that runs "until docker exec mysql mysqladmin ping -h127.0.0.1 -uroot -proot --silent; do ... sleep 2 done" has no timeout and can hang indefinitely; modify this loop to introduce a max attempts counter or overall timeout (e.g., MAX_ATTEMPTS and an increment each iteration) and on exceeding it print docker logs for the "mysql" container and exit non‑zero; implement the counter/check within the same block that currently checks container running state so the workflow fails fast with diagnostic logs when MySQL never becomes ready.

common/check.py (5)
177-182: ⚠️ Potential issue | 🟠 Major

Major: exception logging includes `str(e)` (possible sensitive details), even though the client message is generic.

In `validate_file_storage_payload()`, the client response is generic (line 179), but the server log includes the exception string:

- Lines 180-181: `error_msg = f"Connection test exception: {str(e)}"` and `logging.error(..., exc_info=True)`

If `str(e)` can contain hostnames, credentials, query text, or other sensitive data, this can reintroduce the same alert surface the PR is trying to reduce. Prefer fixed log messages and rely on `exc_info=True` for stack traces (without interpolating `str(e)`).

🛠️ Proposed fix
```diff
     except Exception as e:
         result["status"] = 1
         result["msg"] = "Storage connectivity test raised an exception."
-        error_msg = f"Connection test exception: {str(e)}"
-        logging.error(error_msg, exc_info=True)
+        logger.error("Storage connectivity test raised an exception.", exc_info=True)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@common/check.py` around lines 177 - 182, In validate_file_storage_payload(), avoid interpolating the exception string into the log; change the logging call that currently builds error_msg using str(e) to a fixed, non-sensitive message (e.g. "Storage connectivity test failed") and pass exc_info=True so the stack trace is recorded without embedding potentially sensitive exception content; keep the generic client-facing result["msg"] and result["status"] as-is.
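The pattern this item asks for can be sketched in isolation (a minimal stand-in, not the project's actual `validate_file_storage_payload`):

```python
import logging

logger = logging.getLogger(__name__)

def check_storage(connect):
    # Hypothetical helper: the client sees only a fixed message, while the
    # full traceback (including the original error text) goes to server logs
    # via exc_info=True -- no str(e) interpolation anywhere.
    result = {"status": 0, "msg": "ok"}
    try:
        connect()
    except Exception:
        result["status"] = 1
        result["msg"] = "Storage connectivity test raised an exception."
        logger.error("Storage connectivity test failed.", exc_info=True)
    return result
```

Even if the underlying exception message contains credentials, they never reach the API response.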
91-117: ⚠️ Potential issue | 🟠 Major

Major: email test returns `sender_response` directly to the client.

In `validate_email_payload()`, when `sender_response != "success"` you set:

- Line 115: `result["msg"] = sender_response`

If `sender_response` contains SMTP/server details or exception text, this again risks leaking sensitive information to API callers. Return a generic message and keep any diagnostic detail server-side (ideally not including raw exception strings either).

🛠️ Proposed fix
```diff
         sender_response = sender.send_email(subj, bd, [user_email])
         if sender_response != "success":
             result["status"] = 1
-            result["msg"] = sender_response
+            result["msg"] = "Email delivery test failed."
             return result
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@common/check.py` around lines 91 - 117, In validate_email_payload(), avoid returning raw sender_response to the client when sender_response != "success" because it may contain sensitive SMTP/exception details; instead set result["msg"] to a generic failure message (e.g., "Email delivery failed; check server logs") and record the detailed sender_response in a server-side log or diagnostics (use the module logger or error-tracking facility) linked to the send operation (sender.send_email / sender_response) for debugging. Ensure result["status"] remains non-zero and do not include sender_response in the API response.
37-75: ⚠️ Potential issue | 🔴 Critical

Critical: client responses still leak raw exception details (`str(e)`) for DB connectivity.

In `validate_go_inception_payload()`, both failure paths build `result["msg"]` with `str(e)`:

- Line 50: `"Unable to connect to goInception\n{}".format(str(e))`
- Lines 69-71: the backup-DB equivalent

Even if you are already logging `traceback.format_exc()` server-side, returning exception text to the API is an information disclosure risk (and can trigger CodeQL alerts depending on sink type). Prefer fixed, user-safe messages and keep detailed exceptions in server logs only.

🛠️ Proposed fix
```diff
 def validate_go_inception_payload(payload):
     result = {"status": 0, "msg": "ok", "data": []}
@@
     except Exception as e:
         logger.error(traceback.format_exc())
         result["status"] = 1
-        result["msg"] = "Unable to connect to goInception\n{}".format(str(e))
+        result["msg"] = "Unable to connect to goInception."
         return result
@@
     except Exception as e:
         logger.error(traceback.format_exc())
         result["status"] = 1
-        result["msg"] = "Unable to connect to goInception backup database\n{}".format(
-            str(e)
-        )
+        result["msg"] = "Unable to connect to goInception backup database."
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@common/check.py` around lines 37 - 75, The code in validate_go_inception_payload builds result["msg"] using str(e) after MySQLdb.connect failures, leaking exception details; change both places (the exception handlers around MySQLdb.connect calls) to set result["msg"] to a fixed, user-safe string (e.g., "Unable to connect to goInception" and "Unable to connect to goInception backup database") without including str(e), while keeping logger.error(traceback.format_exc()) as the place that records full error details; update the exception blocks for the first connect (around MySQLdb.connect / cur = conn.cursor()) and the second backup connect to remove the inclusion of str(e) in the returned message.
200-218: ⚠️ Potential issue | 🟠 Major

Major: the `instance()` endpoint still returns `test_result.error`/`str(e)` to the client, and `Instance.objects.get()` isn't protected.

Issues:

- Line 205: `instance = Instance.objects.get(id=instance_id)` can raise if the ID is missing/invalid; not caught.
- Lines 211-213: returns `test_result.error` directly.
- Lines 216-217: returns `str(e)` directly.

To match the PR's security intent, make client messages generic and handle a missing instance cleanly. Log details server-side only.
🛠️ Proposed fix (sketch)
```diff
 def instance(request):
     result = {"status": 0, "msg": "ok", "data": []}
     instance_id = request.POST.get("instance_id")
-    instance = Instance.objects.get(id=instance_id)
+    try:
+        instance = Instance.objects.get(id=instance_id)
+    except Instance.DoesNotExist:
+        result["status"] = 1
+        result["msg"] = "Instance not found."
+        return HttpResponse(json.dumps(result), content_type="application/json")
     try:
         engine = get_engine(instance=instance)
         test_result = engine.test_connection()
         if test_result.error:
             result["status"] = 1
-            result["msg"] = "Unable to connect to instance,\n{}".format(
-                test_result.error
-            )
+            result["msg"] = "Unable to connect to instance."
     except Exception as e:
         result["status"] = 1
-        result["msg"] = "Unable to connect to instance,\n{}".format(str(e))
+        result["msg"] = "Unable to connect to instance."
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@common/check.py` around lines 200 - 218, The instance() view currently exposes internal error strings and calls Instance.objects.get(id=instance_id) without guarding against missing/invalid IDs; change it to validate and safely fetch the instance (catch Instance.DoesNotExist and invalid ID cases) and always return a generic client message like "Unable to connect to instance" or "Instance not found" instead of test_result.error or str(e). Log full details server-side (use Python logging and logging.exception or logger.error with the exception) when get_engine(instance=instance) or engine.test_connection() fails, and when Instance.objects.get fails, but ensure the HttpResponse JSON only contains non-sensitive generic messages and status codes. Keep references: instance(), Instance.objects.get, get_engine(...), engine.test_connection().
120-165: ⚠️ Potential issue | 🟠 Major

Major correctness/security gap: `max_export_rows` is validated but not used, and `storage_type` is not validated/allowlisted here.

Observations:

- Lines 124-132: `max_export_rows` is parsed/validated, but it never appears in `config_dict` (lines 144-164). If `DynamicStorage` expects it, the current behavior may silently ignore the limit.
- Lines 122-136: `storage_type = payload.get("storage_type")` and then `custom_param_key = f"{storage_type}_custom_params"` without checking that `storage_type` is present/expected. Depending on how `DynamicStorage` uses `storage_type` (especially given the PR objective about plugin command execution), this could be a hardening gap.

Recommend either:

- include `max_export_rows` in `config_dict` (if intended), and/or
- add an explicit `storage_type` allowlist + presence check before constructing the config.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@common/check.py` around lines 120 - 165, The validate_file_storage_payload function currently validates max_export_rows but never includes it in the returned config and also uses storage_type without presence/allowlist checks; update validate_file_storage_payload to (1) require and validate storage_type against an allowlist (e.g., check storage_type is one of expected values before using custom_param_key and return an error if missing/invalid) and (2) include the validated max_export_rows (converted to int) in config_dict (e.g., add "max_export_rows": int(max_export_rows)) so DynamicStorage receives the limit; reference validate_file_storage_payload, storage_type, custom_param_key, max_export_rows, config_dict, and DynamicStorage when making changes.

api_admin/settings.py (1)
359-363: ⚠️ Potential issue | 🟡 Minor

Use `raise ... from None` to properly suppress the exception chain.

The change correctly hides the original `ValueError` message from clients, but it should use exception chaining syntax to avoid implicit exception context. This prevents the original exception from being unnecessarily preserved in the traceback.

```diff
     def validate_gh_ost(self, value):
         try:
             return validate_binary_path(value, "gh-ost")
         except ValueError as exc:
-            raise serializers.ValidationError("Operation failed.")
+            raise serializers.ValidationError("Operation failed.") from None
```

As per coding guidelines, use Black for Python formatting. Additionally, per Ruff B904: within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api_admin/settings.py` around lines 359 - 363, In validate_gh_ost, when catching the ValueError from validate_binary_path, re-raise the serializers.ValidationError using exception chaining suppression (i.e., "raise serializers.ValidationError('Operation failed.') from None") so the original ValueError context isn't preserved; update the except block in the validate_gh_ost method accordingly and run Black/ruff to ensure formatting and B904 compliance.

sql/resource_group.py (1)
3-3: ⚠️ Potential issue | 🟡 Minor

Remove the unused `traceback` import.

The `traceback` module is imported on line 3 but is not used anywhere in the file. This import should be removed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sql/resource_group.py` at line 3, Remove the unused import of the traceback module: delete the line that reads "import traceback" so there are no unused imports left in this module (ensure no other code references traceback before removing). Run linters/tests to confirm no further import warnings.

sql/engines/doris.py (1)
175-189: ⚠️ Potential issue | 🔴 Critical

Confirm that raw exception messages in `errormessage` are exposed to API clients and need sanitization.

The `errormessage` field containing `f"Exception: {e}"` is confirmed to be exposed to end users. The `ReviewSet.to_dict()` method serializes all `ReviewResult.__dict__` attributes without filtering, and this data flows directly to API responses via the `execute_rows` field. The same unsanitized exception pattern exists across multiple engine implementations (redis.py, phoenix.py, oracle.py, mssql.py, pgsql.py, cassandra.py), while offlinedownload.py demonstrates safer, controlled error messages. Apply consistent sanitization across all affected engine files.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sql/engines/doris.py` around lines 175 - 189, The except block currently injects raw exception text into execute_result.error and ReviewResult.errormessage (e.g., f"Exception: {e}") which is serialized to API clients; change the behavior to log the full exception internally (use logger.exception or logger.error with exc_info=True) but replace the public fields with a sanitized message (e.g., "Execution failed" or "Internal execution error") and optionally a short, non-sensitive error code/ID; update the except handling in doris.py (the except block that sets execute_result.error and creates ReviewResult) to stop exposing str(e) and the f"Exception: {e}" string, and apply the same change to the equivalent except blocks in redis.py, phoenix.py, oracle.py, mssql.py, pgsql.py, cassandra.py so all engine implementations consistently sanitize errors returned to clients while retaining full exception details in logs.

api_instances/views.py (1)
957-983: ⚠️ Potential issue | 🟠 Major

Don't derive the export path from the escaped database name.

For single-db exports, the file is written using `query_engine.escape_string(db_name)` but reopened using the raw `db_name`. Names containing escapable characters will be written to one path and downloaded from another, and the escaped value is also the wrong thing to feed into `get_tables_metas_data()` when that method already parameterizes internally.

Suggested fix
```diff
     try:
         query_engine = get_engine(instance=instance)
         if db_name:
-            databases = [query_engine.escape_string(db_name)]
+            databases = [{"raw": db_name, "query": db_name}]
         elif request.user.is_superuser:
-            databases = query_engine.get_all_databases().rows
+            databases = [
+                {"raw": database, "query": database}
+                for database in query_engine.get_all_databases().rows
+            ]
         else:
             raise PermissionDenied(
                 "Only admins can export dictionary data for a full instance."
             )
@@
-        for database in databases:
-            table_metas = query_engine.get_tables_metas_data(db_name=database)
+        for database in databases:
+            table_metas = query_engine.get_tables_metas_data(
+                db_name=database["query"]
+            )
             context = {
-                "db_name": database,
+                "db_name": database["raw"],
                 "tables": table_metas,
                 "export_time": timezone.now(),
             }
@@
             full_path = _safe_dictionary_export_path(
-                export_dir, instance.instance_name, database
+                export_dir, instance.instance_name, database["raw"]
             )
```

Also applies to: 1009-1012
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api_instances/views.py` around lines 957 - 983, The code currently uses query_engine.escape_string(db_name) to build the databases list and then calls get_tables_metas_data(db_name=database) and _safe_dictionary_export_path with that escaped value, causing mismatched read/write paths and incorrect parameterization; instead, keep the original raw db_name in the databases list (do not call escape_string when db_name was provided), pass the raw database name into query_engine.get_tables_metas_data and into the context, and only use a filesystem-safe transformation (or rely on _safe_dictionary_export_path) when constructing the export file path; apply the same change to the other occurrence around lines 1009-1012 so all metadata fetches use the raw db name while any filename/path generation uses a sanitized form.

api_queries/views.py (1)
342-354: ⚠️ Potential issue | 🟠 Major

Close `query_engine` in `QueryDescribe.post`.

This method opens an engine connection but never closes it on either the success or error path. Repeated describe requests can leak DB connections until the worker exhausts them.
Suggested fix
```diff
-        query_engine = get_engine(instance=instance)
-        db_name = query_engine.escape_string(data["db_name"])
-        schema_name = query_engine.escape_string(data.get("schema_name", ""))
-        tb_name = query_engine.escape_string(data["tb_name"])
-
-        try:
+        query_engine = get_engine(instance=instance)
+        try:
+            db_name = query_engine.escape_string(data["db_name"])
+            schema_name = query_engine.escape_string(data.get("schema_name", ""))
+            tb_name = query_engine.escape_string(data["tb_name"])
             query_result = query_engine.describe_table(
                 db_name=db_name,
                 tb_name=tb_name,
                 schema_name=schema_name,
             )
-        except Exception as exc:
+        except Exception as exc:
             raise serializers.ValidationError({"errors": "Operation failed."})
+        finally:
+            query_engine.close()
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api_queries/views.py` around lines 342 - 354, QueryDescribe.post opens a DB engine via get_engine() and calls query_engine.describe_table but never closes the engine, leaking connections; wrap the describe_table call in a try/except/finally (or use a context manager if get_engine returns one) and ensure query_engine.close() (or query_engine.dispose() if that’s the API) is called in the finally block so the engine is closed on both success and exception paths; reference the variables/methods: QueryDescribe.post, get_engine, query_engine, and describe_table.
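The close-on-both-paths behavior the suggested fix aims for can be checked in isolation with a stand-in engine (all names here are illustrative, not the real engine API):

```python
class FakeEngine:
    # Hypothetical stand-in for the real query engine; records close() calls.
    def __init__(self, fail=False):
        self.closed = False
        self.fail = fail

    def describe_table(self):
        if self.fail:
            raise RuntimeError("boom")
        return "ok"

    def close(self):
        self.closed = True


def describe(engine):
    # The pattern from the suggested fix: the finally block guarantees the
    # engine is closed on both the success and the error path.
    try:
        return engine.describe_table()
    except Exception:
        return None
    finally:
        engine.close()
```

With this shape, a raised exception can no longer leak the connection.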
🧹 Nitpick comments (8)
common/utils/wx_api.py (1)
15-16: Consider consistent exception handling across files.

While not part of this PR's changes, note that `wx_api.py` still uses the pattern `except Exception as e: logger.error(f"...{e}")`, which differs from the updated pattern in `feishu_api.py` that uses `logger.exception()`. For consistency, and to fully align with the security hardening goals, consider updating this pattern here as well.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@common/utils/wx_api.py` around lines 15 - 16, Update the exception handling in the except block that currently reads "except Exception as e: logger.error(f'Failed to read WeCom token from cache: {e}')" to use logger.exception instead so the stack trace is captured; specifically replace the logger.error call with logger.exception("Failed to read WeCom token from cache") in the code path that reads the WeCom token from cache (the except block catching Exception when attempting to read the token).

common/utils/sendmsg.py (2)
223-229: Inconsistent logging in the `send_feishu_user` method.

This method still logs the URL, request data, and full response on both success (lines 223-225) and failure (lines 227-229), which is inconsistent with the security hardening applied to similar methods like `send_feishu_webhook`. Consider applying the same log reduction for comprehensive coverage.

♻️ Suggested refactor for consistency
```diff
         if r["code"] == 0:
-            logger.debug(
-                f"Feishu direct message sent successfully\nTarget:{url}\nContent:{content}"
-            )
+            logger.debug("Feishu direct message sent successfully")
         else:
-            logger.error(
-                f"Feishu direct message failed\nRequest url:{url}\nRequest data:{data}\nResponse:{r}"
-            )
+            logger.error("Feishu direct message failed: code=%s", r.get("code"))
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@common/utils/sendmsg.py` around lines 223 - 229, In send_feishu_user, remove or reduce sensitive/request-specific details from the success and failure logs to match send_feishu_webhook's hardened logging: replace the logger.debug that prints url and content with a minimal success message (e.g., "Feishu direct message sent successfully") and change the logger.error to avoid logging raw request data and full response object (do not print data or r directly); instead log a concise error summary such as the HTTP status code or r.text/short error string along with a non-sensitive identifier. Update the logger.debug/logger.error calls that reference url, data, r, and content accordingly.
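A minimal sketch of the reduced logging both Feishu items recommend, assuming the response JSON carries a `code` field as in the snippets above (the helper name is illustrative):

```python
def log_feishu_result(r_json, logger):
    # Sketch: log only a short status indicator on failure; never dump the
    # whole response payload, which may echo back message content.
    if r_json.get("code") == 0:
        logger.debug("Feishu webhook sent successfully")
    else:
        logger.error("Feishu webhook failed: code=%s", r_json.get("code"))
```

The error code alone is usually enough to correlate with the provider's documentation, without putting message bodies in the log stream.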
203-203: Consider further log reduction.

While this change removes the URL and request data from logs, the full response JSON is still logged on failure. Consider logging only a status indicator (similar to the success case) or a specific error code if the API provides one, to further reduce potential sensitive data exposure.
♻️ Optional refactor for consistency
```diff
-            logger.error("Feishu webhook failed: response=%s", r_json)
+            logger.error("Feishu webhook failed: code=%s", r_json.get("code"))
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@common/utils/sendmsg.py` at line 203, The logger.error in sendmsg.py currently logs the full Feishu response JSON (logger.error("Feishu webhook failed: response=%s", r_json)); change it to log a minimal status indicator or specific error code instead: parse the response for a short identifier (e.g., r_json.get("code") or r_json.get("status") or r_json.get("message")) and log that value with a concise message (e.g., "Feishu webhook failed: code=%s"), avoiding dumping the entire r_json to reduce sensitive data exposure; update the error logging site in the function handling Feishu webhook responses accordingly.

api_access/views.py (1)
586-592: Add logging for permission request review audit failures.

The security fix is correct, but this handler should log the failure for debugging. The permission request review is an important audit operation that may need troubleshooting.
♻️ Proposed refactoring
```diff
         try:
             workflow_audit_detail = auditor.operate(
                 action, request.user, data.get("audit_remark", "")
             )
         except AuditException as exc:
+            logger.warning(
+                "Permission request review failed for request_id=%s",
+                request_id,
+                exc_info=True,
+            )
-            raise serializers.ValidationError({"errors": "Audit failed."})
+            raise serializers.ValidationError({"errors": "Audit failed."}) from None
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api_access/views.py` around lines 586 - 592, The except block for AuditException in the permission review handler currently raises serializers.ValidationError without logging; update the except AuditException as exc handler (the block around auditor.operate and before _permission_request_audit_callback) to log the exception and context (use the module logger or existing logger variable) — e.g., logger.exception or logger.error with exc info and include identifying details like request.user, action, and any audit_remark — then re-raise the serializers.ValidationError as before.

api_workflows/views.py (3)
1693-1697: Add logging and proper exception chaining for MySQL DDL executor resolution failures.

Same concern as line 1419: consider adding logging for debugging visibility and using `from None` for clean exception chaining.

♻️ Proposed refactoring
```diff
         except MysqlDDLExecutorError as exc:
+            logger.warning(
+                "MySQL DDL executor resolution failed for workflow %s",
+                workflow_id,
+                exc_info=True,
+            )
             raise serializers.ValidationError(
                 {"errors": "Operation failed."}
-            )
+            ) from None
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api_workflows/views.py` around lines 1693 - 1697, When catching MysqlDDLExecutorError in the executor resolution block, log the exception with details (e.g., using logger.error or logger.exception and include exc_info or the exception message) before raising the ValidationError, and raise the ValidationError with exception chaining suppressed using "from None"; specifically, update the except block handling MysqlDDLExecutorError to log the caught exc and then raise serializers.ValidationError({"errors": "Operation failed."}) from None so the resolved_executor.executor_id/selected_executor code remains reachable with clear debug logs.
1572-1575: Add logging and proper exception chaining for audit operation failures.

The security fix is correct, but this handler lacks logging for operational visibility. Audit operation failures should be logged for troubleshooting, and the exception should use `from None` for proper chaining.

♻️ Proposed refactoring
```diff
         try:
             workflow_audit_detail = auditor.operate(action, user, data["audit_remark"])
         except AuditException as e:
+            logger.warning(
+                "Audit operation failed for workflow %s",
+                workflow_id,
+                exc_info=True,
+            )
-            raise serializers.ValidationError({"errors": "Operation failed."})
+            raise serializers.ValidationError({"errors": "Operation failed."}) from None
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api_workflows/views.py` around lines 1572 - 1575, The except block catching AuditException should log the failure and re-raise a ValidationError with exception chaining suppressed; specifically, inside the except AuditException as e: for the call to auditor.operate (and variable workflow_audit_detail) add a process/logging call (e.g., logger.error) that includes context like action, user id/name, audit_remark and the exception e, then raise serializers.ValidationError({"errors": "Operation failed."}) from None so the original traceback is not chained; ensure you reference auditor.operate, AuditException, workflow_audit_detail and serializers.ValidationError when applying the change.
1417-1420: Add logging for MySQL DDL executor preflight failures.

The exception details are correctly sanitized, but unlike other error handlers in this file (e.g., line 1318), there's no logging here. Consider adding a warning log to capture the actual error for debugging purposes.
♻️ Proposed refactoring
```diff
         except MysqlDDLExecutorError as exc:
+            logger.warning(
+                "MySQL DDL executor preflight failed for workflow %s",
+                workflow.id,
+                exc_info=True,
+            )
             raise serializers.ValidationError({"errors": "Operation failed."}) from None
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api_workflows/views.py` around lines 1417 - 1420, Add a warning log when catching MysqlDDLExecutorError before raising the sanitized serializers.ValidationError: inside the except block for MysqlDDLExecutorError (the block that currently raises serializers.ValidationError({"errors": "Operation failed."})), call the module logger (e.g., logger.warning or logger.warn) including a concise message and the exception details (exc) or exc_info=True so the real error is recorded for debugging; then re-raise the same sanitized serializers.ValidationError and leave the subsequent assignment to selected_executor = resolved_executor.executor_id unchanged.api_archives/views.py (1)
910-916: Add logging and proper exception chaining for archive review failures.

The security fix is correct, but this handler lacks logging for operational visibility. Archive review operation failures should be logged, and the exception should use `from None` for proper chaining.

♻️ Proposed refactoring
```diff
         try:
             workflow_audit_detail = auditor.operate(
                 action, request.user, data["audit_remark"]
             )
         except AuditException as exc:
+            logger.warning(
+                "Archive review operation failed for archive_id=%s",
+                archive_id,
+                exc_info=True,
+            )
-            raise serializers.ValidationError({"errors": "Operation failed."})
+            raise serializers.ValidationError({"errors": "Operation failed."}) from None
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api_archives/views.py` around lines 910 - 916, When handling AuditException from auditor.operate (called to produce workflow_audit_detail using request.user and data["audit_remark"]), add a process/logging statement that records the failure with the exception details (include context such as action and request.user) and then re-raise serializers.ValidationError using "from None" to suppress chaining; specifically, catch AuditException as exc, call the logger (e.g., logger.error or processLogger.error) with a clear message and exc info, then raise serializers.ValidationError({"errors": "Operation failed."}) from None.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@api_admin/settings.py`:
- Around line 365-369: The validate_pt_osc method currently catches ValueError
and re-raises serializers.ValidationError without suppressing the original
exception chain; update the exception raise to use exception chaining
suppression (i.e., raise serializers.ValidationError("Operation failed.") from
None) similar to validate_gh_ost so the original ValueError context is not
preserved. Ensure the change is applied inside validate_pt_osc's except
ValueError block.
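The effect of `from None` that this item (and the `validate_gh_ost` comment above) relies on can be demonstrated with stand-in classes (illustrative only; the real code raises DRF's `serializers.ValidationError`):

```python
class ValidationError(Exception):
    """Stand-in for serializers.ValidationError."""


def validate_binary_path(value, name):
    # Hypothetical validator; the real one lives in the project.
    if not value.startswith("/usr/"):
        raise ValueError(f"unexpected path for {name}: {value}")
    return value


def validate_pt_osc(value):
    try:
        return validate_binary_path(value, "pt-online-schema-change")
    except ValueError:
        # "from None" sets __cause__ to None and suppresses the implicit
        # __context__, so the original ValueError text is not shown in the
        # reported traceback.
        raise ValidationError("Operation failed.") from None
```

This satisfies Ruff B904 while keeping the client-facing error generic.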
In `@api_instances/views.py`:
- Around line 153-159: _safe_dictionary_export_path currently produces a lossy,
truncated safe_name from f"{instance_name}_{db_name}", which can cause distinct
pairs to map to the same filename and silently overwrite exports; fix it by
incorporating a deterministic short fingerprint (e.g., a hex digest or first N
chars of a hash of the original f"{instance_name}_{db_name}") into the filename
before truncation so uniqueness is preserved, then rebuild full_path and retain
the existing os.path.commonpath check to prevent escaping base_dir; update any
validation that returns "" to handle collisions (e.g., always generate the
hashed filename) and reference the function name _safe_dictionary_export_path
and the variables safe_name and full_path when making the change.
In `@api_workflows/views.py`:
- Around line 1317-1319: The except block currently logs the original exception
with logger.error("Failed to load rollback SQL", exc_info=True) and then
re-raises a generic serializers.ValidationError; change the re-raise to suppress
exception chaining by using "raise serializers.ValidationError({'errors':
'Operation failed.'}) from None" so that the original exception context is not
attached; update the raise site where serializers.ValidationError is thrown in
the except Exception as exc: block (near the logger.error call) to use the "from
None" form.
In `@sql/engines/mysql.py`:
- Around line 585-592: The check for mysql.user fails because backticks are
replaced with spaces leaving "mysql. user" so the literal "mysql.user" lookup
misses it; update the normalization before computing references_mysql_user to
collapse whitespace around dots (e.g. normalize dots with a regex like
re.sub(r"\s*\.\s*", ".", normalized_sql)) so "mysql.`user`" and "mysql. user"
both become "mysql.user"; keep using normalized_sql, references_mysql_user,
references_user_table and db_name for the subsequent logic.
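The dot-normalization step above can be sketched as follows; the regex is the one from the prompt, while the standalone helper wrapper is illustrative:

```python
import re


def references_mysql_user(sql):
    # Strip backtick quoting, then collapse whitespace around dots so
    # mysql.`user`, mysql . user, and mysql.user all normalize to "mysql.user".
    normalized_sql = sql.lower().replace("`", " ")
    normalized_sql = re.sub(r"\s*\.\s*", ".", normalized_sql)
    return "mysql.user" in normalized_sql
```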
In `@sql/plugins/plugin.py`:
- Around line 77-78: The regex safety check currently in generate_args2cmd
(SAFE_ARGUMENT_RE.fullmatch on arg) can raise ValueError bypassing the expected
{status,msg} validation flow; move the argument-name validation into
check_args() so malformed option names are caught there and returned as the
standard validation response, or alternatively catch the invalid-name case
inside generate_args2cmd and convert it into the same {status,msg} error object;
update check_args() to iterate the same argument keys validated by
generate_args2cmd (using the same SAFE_ARGUMENT_RE) and return a normalized
failure, or wrap generate_args2cmd to return {status:"error", msg:...} instead
of raising ValueError.
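The first option (validating option names inside `check_args()` so callers always get the normalized `{status, msg}` shape) might look like the sketch below; the regex character class and response fields are assumptions, not the project's actual definitions:

```python
import re

# Assumed whitelist: option names start alphanumeric, then alnum/underscore/dash.
SAFE_ARGUMENT_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9_-]*")


def check_args(args):
    # Reject malformed option names here, instead of letting
    # generate_args2cmd raise a bare ValueError later.
    for name in args:
        if not SAFE_ARGUMENT_RE.fullmatch(name):
            return {"status": 1, "msg": f"Invalid argument name: {name!r}"}
    return {"status": 0, "msg": "ok"}
```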
In `@sql/plugins/tests.py`:
- Around line 146-153: The test's Popen assertions are too loose for
stdout/stderr; instead of assertIsNotNone use exact equality to ensure
execute_cmd() still pipes output: assertEqual(popen_kwargs["stdout"],
subprocess.PIPE) and assertEqual(popen_kwargs["stderr"], subprocess.PIPE); keep
the other assertions (mock_subprocess.Popen.assert_called_once(), shell False,
universal_newlines True, and the argv checks) intact to strictly validate the
behavior of execute_cmd() when invoking mock_subprocess.Popen.
---
Outside diff comments:
In @.github/workflows/django.yml:
- Around line 95-102: The workflow currently hardcodes SECRET_KEY in the "Run
Tests" job which leaks a secret; update the env usage for the Run Tests step to
read the key from GitHub Actions secrets (e.g. use secrets.SECRET_KEY) instead
of the literal value, or remove the env entry if tests do not require it; ensure
the repository secret is added in Settings > Secrets before referencing it and
update any references in the workflow that rely on SECRET_KEY accordingly.
- Around line 70-76: The readiness loop that runs "until docker exec mysql
mysqladmin ping -h127.0.0.1 -uroot -proot --silent; do ... sleep 2 done" has no
timeout and can hang indefinitely; modify this loop to introduce a max attempts
counter or overall timeout (e.g., MAX_ATTEMPTS and an increment each iteration)
and on exceeding it print docker logs for the "mysql" container and exit
non‑zero; implement the counter/check within the same block that currently
checks container running state so the workflow fails fast with diagnostic logs
when MySQL never becomes ready.
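A bounded version of that readiness loop, sketched as a reusable shell function; `wait_for_ready` and its parameters are illustrative, and in the workflow the probe would be the existing `docker exec mysql mysqladmin ping ...` command:

```shell
#!/bin/sh
# Retry a probe command until it succeeds or max_attempts is exhausted.
wait_for_ready() {
  check_cmd="$1"
  max_attempts="${2:-30}"
  attempt=0
  until $check_cmd; do
    attempt=$((attempt + 1))
    if [ "$attempt" -ge "$max_attempts" ]; then
      # In CI you would also dump `docker logs mysql` here for diagnostics.
      echo "service never became ready after $max_attempts attempts" >&2
      return 1
    fi
    sleep 1
  done
  return 0
}
```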
In `@api_admin/settings.py`:
- Around line 359-363: In validate_gh_ost, when catching the ValueError from
validate_binary_path, re-raise the serializers.ValidationError using exception
chaining suppression (i.e., "raise serializers.ValidationError('Operation
failed.') from None") so the original ValueError context isn't preserved; update
the except block in the validate_gh_ost method accordingly and run Black/ruff to
ensure formatting and B904 compliance.
In `@api_instances/views.py`:
- Around line 957-983: The code currently uses
query_engine.escape_string(db_name) to build the databases list and then calls
get_tables_metas_data(db_name=database) and _safe_dictionary_export_path with
that escaped value, causing mismatched read/write paths and incorrect
parameterization; instead, keep the original raw db_name in the databases list
(do not call escape_string when db_name was provided), pass the raw database
name into query_engine.get_tables_metas_data and into the context, and only use
a filesystem-safe transformation (or rely on _safe_dictionary_export_path) when
constructing the export file path; apply the same change to the other occurrence
around lines 1009-1012 so all metadata fetches use the raw db name while any
filename/path generation uses a sanitized form.
In `@api_queries/views.py`:
- Around line 342-354: QueryDescribe.post opens a DB engine via get_engine() and
calls query_engine.describe_table but never closes the engine, leaking
connections; wrap the describe_table call in a try/except/finally (or use a
context manager if get_engine returns one) and ensure query_engine.close() (or
query_engine.dispose() if that’s the API) is called in the finally block so the
engine is closed on both success and exception paths; reference the
variables/methods: QueryDescribe.post, get_engine, query_engine, and
describe_table.
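The close-on-both-paths pattern the prompt asks for, reduced to a sketch; `DummyEngine` stands in for whatever `get_engine()` returns, and `close()` is assumed to be the real engine's cleanup API:

```python
class DummyEngine:
    """Stand-in for the real engine returned by get_engine()."""

    def __init__(self, fail=False):
        self.fail = fail
        self.closed = False

    def describe_table(self, db_name, tb_name):
        if self.fail:
            raise RuntimeError("describe failed")
        return {"rows": []}

    def close(self):
        self.closed = True


def describe(engine, db_name, tb_name):
    try:
        return engine.describe_table(db_name, tb_name)
    except Exception:
        return {"errors": "Operation failed."}
    finally:
        # Runs on success and failure alike, so the connection never leaks.
        engine.close()
```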
In `@common/check.py`:
- Around line 177-182: In validate_file_storage_payload(), avoid interpolating
the exception string into the log; change the logging call that currently builds
error_msg using str(e) to a fixed, non-sensitive message (e.g. "Storage
connectivity test failed") and pass exc_info=True so the stack trace is recorded
without embedding potentially sensitive exception content; keep the generic
client-facing result["msg"] and result["status"] as-is.
- Around line 91-117: In validate_email_payload(), avoid returning raw
sender_response to the client when sender_response != "success" because it may
contain sensitive SMTP/exception details; instead set result["msg"] to a generic
failure message (e.g., "Email delivery failed; check server logs") and record
the detailed sender_response in a server-side log or diagnostics (use the module
logger or error-tracking facility) linked to the send operation
(sender.send_email / sender_response) for debugging. Ensure result["status"]
remains non-zero and do not include sender_response in the API response.
- Around line 37-75: The code in validate_go_inception_payload builds
result["msg"] using str(e) after MySQLdb.connect failures, leaking exception
details; change both places (the exception handlers around MySQLdb.connect
calls) to set result["msg"] to a fixed, user-safe string (e.g., "Unable to
connect to goInception" and "Unable to connect to goInception backup database")
without including str(e), while keeping logger.error(traceback.format_exc()) as
the place that records full error details; update the exception blocks for the
first connect (around MySQLdb.connect / cur = conn.cursor()) and the second
backup connect to remove the inclusion of str(e) in the returned message.
- Around line 200-218: The instance() view currently exposes internal error
strings and calls Instance.objects.get(id=instance_id) without guarding against
missing/invalid IDs; change it to validate and safely fetch the instance (catch
Instance.DoesNotExist and invalid ID cases) and always return a generic client
message like "Unable to connect to instance" or "Instance not found" instead of
test_result.error or str(e). Log full details server-side (use Python logging
and logging.exception or logger.error with the exception) when
get_engine(instance=instance) or engine.test_connection() fails, and when
Instance.objects.get fails, but ensure the HttpResponse JSON only contains
non-sensitive generic messages and status codes. Keep references: instance(),
Instance.objects.get, get_engine(...), engine.test_connection().
- Around line 120-165: The validate_file_storage_payload function currently
validates max_export_rows but never includes it in the returned config and also
uses storage_type without presence/allowlist checks; update
validate_file_storage_payload to (1) require and validate storage_type against
an allowlist (e.g., check storage_type is one of expected values before using
custom_param_key and return an error if missing/invalid) and (2) include the
validated max_export_rows (converted to int) in config_dict (e.g., add
"max_export_rows": int(max_export_rows)) so DynamicStorage receives the limit;
reference validate_file_storage_payload, storage_type, custom_param_key,
max_export_rows, config_dict, and DynamicStorage when making changes.
In `@sql/engines/doris.py`:
- Around line 175-189: The except block currently injects raw exception text
into execute_result.error and ReviewResult.errormessage (e.g., f"Exception:
{e}") which is serialized to API clients; change the behavior to log the full
exception internally (use logger.exception or logger.error with exc_info=True)
but replace the public fields with a sanitized message (e.g., "Execution failed"
or "Internal execution error") and optionally a short, non-sensitive error
code/ID; update the except handling in doris.py (the except block that sets
execute_result.error and creates ReviewResult) to stop exposing str(e) and the
f"Exception: {e}" string, and apply the same change to the equivalent except
blocks in redis.py, phoenix.py, oracle.py, mssql.py, pgsql.py, cassandra.py so
all engine implementations consistently sanitize errors returned to clients
while retaining full exception details in logs.
In `@sql/resource_group.py`:
- Line 3: Remove the unused import of the traceback module: delete the line that
reads "import traceback" so there are no unused imports left in this module
(ensure no other code references traceback before removing). Run linters/tests
to confirm no further import warnings.
---
Nitpick comments:
In `@api_access/views.py`:
- Around line 586-592: The except block for AuditException in the permission
review handler currently raises serializers.ValidationError without logging;
update the except AuditException as exc handler (the block around
auditor.operate and before _permission_request_audit_callback) to log the
exception and context (use the module logger or existing logger variable) —
e.g., logger.exception or logger.error with exc info and include identifying
details like request.user, action, and any audit_remark — then re-raise the
serializers.ValidationError as before.
In `@api_archives/views.py`:
- Around line 910-916: When handling AuditException from auditor.operate (called
to produce workflow_audit_detail using request.user and data["audit_remark"]),
add a process/logging statement that records the failure with the exception
details (include context such as action and request.user) and then re-raise
serializers.ValidationError using "from None" to suppress chaining;
specifically, catch AuditException as exc, call the logger (e.g., logger.error
or processLogger.error) with a clear message and exc info, then raise
serializers.ValidationError({"errors": "Operation failed."}) from None.
In `@api_workflows/views.py`:
- Around line 1693-1697: When catching MysqlDDLExecutorError in the executor
resolution block, log the exception with details (e.g., using logger.error or
logger.exception and include exc_info or the exception message) before raising
the ValidationError, and raise the ValidationError with exception chaining
suppressed using "from None"; specifically, update the except block handling
MysqlDDLExecutorError to log the caught exc and then raise
serializers.ValidationError({"errors": "Operation failed."}) from None so the
resolved_executor.executor_id/selected_executor code remains reachable with
clear debug logs.
- Around line 1572-1575: The except block catching AuditException should log the
failure and re-raise a ValidationError with exception chaining suppressed;
specifically, inside the except AuditException as e: for the call to
auditor.operate (and variable workflow_audit_detail) add a process/logging call
(e.g., logger.error) that includes context like action, user id/name,
audit_remark and the exception e, then raise
serializers.ValidationError({"errors": "Operation failed."}) from None so the
original traceback is not chained; ensure you reference auditor.operate,
AuditException, workflow_audit_detail and serializers.ValidationError when
applying the change.
- Around line 1417-1420: Add a warning log when catching MysqlDDLExecutorError
before raising the sanitized serializers.ValidationError: inside the except
block for MysqlDDLExecutorError (the block that currently raises
serializers.ValidationError({"errors": "Operation failed."})), call the module
logger (e.g., logger.warning or logger.warn) including a concise message and the
exception details (exc) or exc_info=True so the real error is recorded for
debugging; then re-raise the same sanitized serializers.ValidationError and
leave the subsequent assignment to selected_executor =
resolved_executor.executor_id unchanged.
In `@common/utils/sendmsg.py`:
- Around line 223-229: In send_feishu_user, remove or reduce
sensitive/request-specific details from the success and failure logs to match
send_feishu_webhook's hardened logging: replace the logger.debug that prints url
and content with a minimal success message (e.g., "Feishu direct message sent
successfully") and change the logger.error to avoid logging raw request data and
full response object (do not print data or r directly); instead log a concise
error summary such as the HTTP status code or r.text/short error string along
with a non-sensitive identifier. Update the logger.debug/logger.error calls that
reference url, data, r, and content accordingly.
- Line 203: The logger.error in sendmsg.py currently logs the full Feishu
response JSON (logger.error("Feishu webhook failed: response=%s", r_json));
change it to log a minimal status indicator or specific error code instead:
parse the response for a short identifier (e.g., r_json.get("code") or
r_json.get("status") or r_json.get("message")) and log that value with a concise
message (e.g., "Feishu webhook failed: code=%s"), avoiding dumping the entire
r_json to reduce sensitive data exposure; update the error logging site in the
function handling Feishu webhook responses accordingly.
In `@common/utils/wx_api.py`:
- Around line 15-16: Update the exception handling in the except block that
currently reads "except Exception as e: logger.error(f'Failed to read WeCom
token from cache: {e}')" to use logger.exception instead so the stack trace is
captured; specifically replace the logger.error call with
logger.exception("Failed to read WeCom token from cache") in the code path that
reads the WeCom token from cache (the except block catching Exception when
attempting to read the token).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 0913dcb4-3f84-496d-881d-d521c4406631
📒 Files selected for processing (27)
.github/workflows/black.yml
.github/workflows/codeql-analysis.yml
.github/workflows/django.yml
api_access/views.py
api_admin/settings.py
api_archives/views.py
api_core/views.py
api_instances/views.py
api_queries/views.py
api_users/views.py
api_workflows/views.py
common/check.py
common/utils/feishu_api.py
common/utils/sendmsg.py
common/utils/wx_api.py
sql/data_dictionary.py
sql/engines/doris.py
sql/engines/goinception.py
sql/engines/mongo.py
sql/engines/mysql.py
sql/instance.py
sql/offlinedownload.py
sql/plugins/plugin.py
sql/plugins/pt_archiver.py
sql/plugins/tests.py
sql/resource_group.py
sql/views.py
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (6)
sql/plugins/plugin.py (1)
123-125: ⚠️ Potential issue | 🟠 Major
Keep the wrapped error generic here too.
This still logs the full traceback and re-embeds `str(e)` into the raised `RuntimeError`, so path/OS details can leak right back into logs or API-facing errors. That undercuts the hardening goal of this PR.
🔐 Proposed fix
```diff
-        except Exception as e:
-            logger.error("Command execution failed\n{}".format(traceback.format_exc()))
-            raise RuntimeError("Command execution failed, reason: %s" % str(e))
+        except Exception:
+            logger.error("Command execution failed")
+            raise RuntimeError("Command execution failed")
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sql/plugins/plugin.py` around lines 123 - 125, Replace the detailed traceback logging and error embedding with a generic message: stop using traceback.format_exc() in the logger and remove str(e) from the raised RuntimeError. Specifically, update the except block that currently calls logger.error("Command execution failed\n{}".format(traceback.format_exc())) and raises RuntimeError("Command execution failed, reason: %s" % str(e)) so it instead logs a generic message via logger.error("Command execution failed") (no traceback) and raises a generic RuntimeError("Command execution failed") without including the original exception details; if you need to suppress exception chaining, raise the RuntimeError using "from None".
api_queries/views.py (1)
224-233: ⚠️ Potential issue | 🟠 Major
Stop logging raw SQL in the masking fallback.
When `query_check` is off, this branch still writes the full `sql_content` and exception text to the application log. That can leak secrets or PII from ad-hoc queries and undercuts the PR's security-hardening goal.
🔐 Suggested fix
```diff
     except Exception as exc:
         logger.exception("Data masking failed")
         if config.get("query_check"):
             error_message = "Data masking error, contact admin."
         else:
-            logger.warning(
-                "Data masking error, allowed by config. SQL: %s error: %s",
-                sql_content,
-                str(exc),
-            )
+            logger.warning(
+                "Data masking error, allowed by config. Returning unmasked result."
+            )
             query_result.error = None
             result_data = query_result.__dict__.copy()
             result_data["rows"] = query_result.to_dict()
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api_queries/views.py` around lines 224 - 233, The current masking-exception fallback logs raw SQL and exception text via logger.warning using sql_content and str(exc) when config.get("query_check") is false; change this to avoid leaking queries by removing sql_content and exc from the log message and instead log a generic, non-sensitive warning (optionally include a safe fingerprint such as a query hash or a fixed identifier) and set error_message appropriately; update the block that calls logger.exception("Data masking failed") / logger.warning(...) and references sql_content to only emit non-sensitive information while retaining the existing control flow around config.get("query_check") and error_message.
api_instances/views.py (1)
2480-2518: ⚠️ Potential issue | 🟠 Major
`query_engine` still leaks in `InstanceResource.get`.
This endpoint opens an engine for every request but never closes it. The new generic error handling changed this block, but unlike the neighboring handlers it still has no `finally`, so both successful and failed requests can leave database connections hanging.
♻️ Suggested fix
```diff
         try:
             query_engine = get_engine(instance=instance)
             db_name = query_engine.escape_string(db_name)
             schema_name = query_engine.escape_string(schema_name)
             tb_name = query_engine.escape_string(tb_name)
             if resource_type == "database":
                 resource = query_engine.get_all_databases()
                 resource.rows = filter_db_list(
                     db_list=resource.rows,
                     db_name_regex=query_engine.instance.show_db_name_regex,
                     is_match_regex=True,
                 )
                 resource.rows = filter_db_list(
                     db_list=resource.rows,
                     db_name_regex=query_engine.instance.denied_db_name_regex,
                     is_match_regex=False,
                 )
             elif resource_type == "schema" and db_name:
                 resource = query_engine.get_all_schemas(db_name=db_name)
             elif resource_type == "table" and db_name:
                 resource = query_engine.get_all_tables(
                     db_name=db_name, schema_name=schema_name
                 )
             elif resource_type == "column" and db_name and tb_name:
                 resource = query_engine.get_all_columns_by_tb(
                     db_name=db_name, tb_name=tb_name, schema_name=schema_name
                 )
             else:
                 raise serializers.ValidationError(
                     {"errors": "Unsupported resource type or incomplete parameters."}
                 )
-        except Exception as msg:
-            raise serializers.ValidationError({"errors": "Operation failed."})
+        except Exception:
+            raise serializers.ValidationError({"errors": "Operation failed."}) from None
+        finally:
+            if "query_engine" in locals():
+                query_engine.close()
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api_instances/views.py` around lines 2480 - 2518, InstanceResource.get opens a query_engine but never closes it, leaking connections; update the try/except/else to include a finally that safely closes the engine (in InstanceResource.get / get_engine usage). After the existing else block (or surrounding the entire try), add a finally that checks if query_engine is defined and then calls its close/dispose method (e.g., call query_engine.close() or query_engine.dispose() depending on the engine implementation), wrapping that call in a small try/except to avoid masking original errors.
sql/engines/mssql.py (1)
771-785: ⚠️ Potential issue | 🔴 Critical
Abort immediately when `USE [db_name]` fails.
This branch records the failure but still falls through to execute `all_statements`. If the database switch failed, the rest of the batch can run in the connection's previous/default database.
Suggested fix
```diff
         except Exception as e:
             logger.warning("MSSQL USE statement failed", exc_info=True)
             execute_result.error = "Execution failed"
             execute_result.rows.append(
                 ReviewResult(
@@
                     execute_time=0,
                 )
             )
             rowid += 1
+            for remaining_statement in all_statements:
+                execute_result.rows.append(
+                    ReviewResult(
+                        id=rowid,
+                        errlevel=0,
+                        stagestatus="Audit completed",
+                        errormessage="Database switch failed, statement not executed",
+                        sql=remaining_statement,
+                        affected_rows=0,
+                        execute_time=0,
+                    )
+                )
+                rowid += 1
+            if close_conn:
+                self.close()
+            return execute_result
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sql/engines/mssql.py` around lines 771 - 785, The catch block for the `USE` statement currently records the failure but continues to run `all_statements`; update the `except Exception as e` branch that handles `use_sql` to abort immediately by stopping further execution and returning the current `execute_result` (or raising) after populating it (the same fields you already set on `execute_result` and pushing the `ReviewResult`); ensure you reference the `use_sql` handling block and prevent processing `all_statements` or any subsequent logic when the DB switch fails.
sql/engines/phoenix.py (1)
176-206: ⚠️ Potential issue | 🟠 Major
Advance the skipped-statement IDs after a failure.
Because `break` skips line 203, the first "Previous statement failed" row reuses the failed statement's `id`. That produces duplicate statement IDs in one `ReviewSet`.
Suggested fix
```diff
         if execute_result.error:
             # If failed, append remaining statements as not executed.
             for statement in split_sql[rowid:]:
+                rowid += 1
                 execute_result.rows.append(
                     ReviewResult(
                         id=rowid,
                         errlevel=2,
                         stagestatus="Execute Failed",
@@
                         execute_time=0,
                     )
                 )
-                rowid += 1
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sql/engines/phoenix.py` around lines 176 - 206, In the except block where a failed ReviewResult is appended (uses rowid), ensure subsequent skipped "Previous statement failed" rows don’t reuse that same id by advancing rowid after recording the failure; specifically, increment rowid (the loop counter used for IDs) immediately after appending the failure ReviewResult in the except handler (or adjust the final loop over split_sql to start IDs from rowid+1) so each ReviewResult created for split_sql keeps a unique id.
sql/engines/cassandra.py (1)
218-247: ⚠️ Potential issue | 🟠 Major
Fix the duplicated `ReviewResult.id` values after a failed statement.
On this path, `break` skips line 233, so the first "Previous statement failed, not executed" row is appended with the same `id` as the failed statement.
Suggested fix
```diff
         if execute_result.error:
             for statement in sql_result[rowid:]:
+                rowid += 1
                 execute_result.rows.append(
                     ReviewResult(
                         id=rowid,
                         errlevel=2,
                         stagestatus="Execute Failed",
@@
                         affected_rows=0,
                         execute_time=0,
                     )
                 )
-                rowid += 1
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sql/engines/cassandra.py` around lines 218 - 247, When an exception occurs in the try/except inside execute_result handling, the code breaks out of the loop without incrementing rowid, causing the subsequent loop that appends "Previous statement failed, not executed" ReviewResult entries to reuse the same id as the failed statement; fix by incrementing rowid immediately after creating the failed ReviewResult in the except block (the block that sets execute_result.error and appends ReviewResult) so that the later loop over sql_result[rowid:] creates ReviewResult entries with correct, unique ids; update the except path where execute_result.error is set and ReviewResult is appended to increment rowid afterwards.
🧹 Nitpick comments (3)
common/check.py (2)
180-183: Use the module logger consistently.
This block mixes `logging.*` and `logger.*`; keep one logger path for consistent handlers/format/filters.
🔧 Proposed fix
```diff
-            logging.error("Storage connectivity test failed")
-            logging.debug("Storage connectivity test returned an error.")
+            logger.error("Storage connectivity test failed")
+            logger.debug("Storage connectivity test returned an error.")
         else:
-            logging.info("Storage connectivity test succeeded")
+            logger.info("Storage connectivity test succeeded")
```
Also applies to: 187-187
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@common/check.py` around lines 180 - 183, The code block uses the root logging calls (logging.error, logging.debug, logging.info) instead of the module-level logger; update the storage connectivity messages to use the module logger (replace logging.error, logging.debug, logging.info with logger.error, logger.debug, logger.info) so handlers/format/filters remain consistent—also make the same replacement for the other occurrence mentioned around line 187 (where logging.* is used).
49-53: Narrow overly broad exception handling for DB connectivity checks.
`except Exception` on lines 49 and 68 catches all exceptions, hiding non-connection defects and triggering BLE001. Replace with specific exception types and use `logger.exception(...)` to properly log the exception context:
🔧 Proposed fix
```diff
-        except Exception:
-            logger.error(traceback.format_exc())
+        except (MySQLdb.Error, ValueError, TypeError):
+            logger.exception("goInception connection test failed")
             result["status"] = 1
             result["msg"] = "Unable to connect to goInception"
             return result
@@
-        except Exception:
-            logger.error(traceback.format_exc())
+        except (MySQLdb.Error, ValueError, TypeError):
+            logger.exception("goInception backup connection test failed")
             result["status"] = 1
             result["msg"] = "Unable to connect to goInception backup database"
```
Also applies to: 68-72
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@common/check.py` around lines 49 - 53, Replace the broad "except Exception" blocks that wrap the goInception connectivity checks with specific DB/connectivity exception types from the DB client you use (e.g., OperationalError, InterfaceError, DatabaseError or the driver's connection exceptions) and call logger.exception(...) instead of logger.error(traceback.format_exc()) so the stack/exception context is preserved; keep the existing result["status"] = 1 and result["msg"] = "Unable to connect to goInception" behavior. Apply the same change to both exception handlers (the blocks that currently use traceback.format_exc()) so only connection-related exceptions are caught and other errors propagate.
api_queries/views.py (1)
832-837: Log audit-operation failures here too.
This now hides the exception from the client, but unlike the sibling review endpoints it also drops the server-side trail entirely. Please log `apply_id`, user, and action before returning the generic "Audit failed." response.
🪵 Suggested fix
```diff
-        except AuditException as exc:
-            raise serializers.ValidationError({"errors": "Audit failed."})
+        except AuditException:
+            logger.exception(
+                "Query privilege audit failed for apply_id=%s user=%s action=%s",
+                data["apply_id"],
+                request.user,
+                audit_status,
+            )
+            raise serializers.ValidationError({"errors": "Audit failed."}) from None
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api_queries/views.py` around lines 832 - 837, The except block swallowing AuditException needs to log the failure details before raising the generic ValidationError; in the except AuditException as exc: handler for auditor.operate(...) add a server-side log call (using your existing logger) that records apply_id (from data or context), request.user, and the attempted action/audit_status plus exc, then re-raise serializers.ValidationError({"errors": "Audit failed."}) as before; reference the auditor.operate call and the AuditException/serializers.ValidationError to locate the correct spot.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@common/check.py`:
- Around line 115-117: The logger.error call currently logs the raw
sender_response (variable sender_response) which may contain sensitive provider
diagnostics; replace this with a sanitized summary: extract and log only
non-sensitive fields (e.g., status_code, error_code, and a short message) or a
generic message plus a hashed/trimmed identifier, and avoid printing the full
sender_response; update the error call in common/check.py (the logger.error
line) to use the sanitized_summary (or logger.debug the full response behind a
feature flag) so production logs do not contain raw backend responses.
In `@common/utils/sendmsg.py`:
- Around line 220-224: The requests.post calls in sendmsg.py (notably the call
assigning response around the variable response and the earlier call near line
194) lack an explicit timeout and can hang; update both requests.post
invocations to include timeout=5 (consistent with other calls in this file) so
the Feishu API requests use a bounded 5-second timeout—keep the same url,
json/data and headers parameters and simply add timeout=5 to each requests.post
call.
In `@sql/engines/mysql.py`:
- Around line 585-592: The current check flags any standalone "user" token
anywhere in the SQL (e.g., SELECT user FROM other_table); change
references_user_table to only detect "user" when it appears as a table reference
by matching SQL keywords that introduce table names (e.g., FROM, JOIN, INTO,
UPDATE, DELETE) rather than any token. Update the logic around normalized_sql,
references_mysql_user, and references_user_table so references_user_table uses a
regex search like looking for patterns
(from|join|into|update|delete)\s+`?user`?\b (case-insensitive) against
normalized_sql, and keep the existing references_mysql_user check for
"mysql.user"; then use the combined condition if references_mysql_user or
(db_name == "mysql" and references_user_table).
In `@sql/engines/oracle.py`:
- Around line 1234-1248: The except block currently masks all errors as
"Execution failed"; update it to preserve the earlier safe compile diagnostic
carried in the raised exception (e) by assigning execute_result.error = str(e)
(or e.args[0] if present) and use that same string when constructing the
ReviewResult errormessage instead of the constant "Execution failed", while
keeping the logger.warning(..., exc_info=True) and existing ReviewResult fields;
reference the except block handling workflow.id, execute_result, ReviewResult
and the earlier raised Exception(stagestatus) so the sanitized stagestatus
message is propagated into execute_result.error and ReviewResult.errormessage.
---
Outside diff comments:
In `@api_instances/views.py`:
- Around line 2480-2518: InstanceResource.get opens a query_engine but never
closes it, leaking connections; update the try/except/else to include a finally
that safely closes the engine (in InstanceResource.get / get_engine usage).
After the existing else block (or surrounding the entire try), add a finally
that checks if query_engine is defined and then calls its close/dispose method
(e.g., call query_engine.close() or query_engine.dispose() depending on the
engine implementation), wrapping that call in a small try/except to avoid
masking original errors.
In `@api_queries/views.py`:
- Around line 224-233: The current masking-exception fallback logs raw SQL and
exception text via logger.warning using sql_content and str(exc) when
config.get("query_check") is false; change this to avoid leaking queries by
removing sql_content and exc from the log message and instead log a generic,
non-sensitive warning (optionally include a safe fingerprint such as a query
hash or a fixed identifier) and set error_message appropriately; update the
block that calls logger.exception("Data masking failed") / logger.warning(...)
and references sql_content to only emit non-sensitive information while
retaining the existing control flow around config.get("query_check") and
error_message.
In `@sql/engines/cassandra.py`:
- Around line 218-247: When an exception occurs in the try/except inside
execute_result handling, the code breaks out of the loop without incrementing
rowid, causing the subsequent loop that appends "Previous statement failed, not
executed" ReviewResult entries to reuse the same id as the failed statement; fix
by incrementing rowid immediately after creating the failed ReviewResult in the
except block (the block that sets execute_result.error and appends ReviewResult)
so that the later loop over sql_result[rowid:] creates ReviewResult entries with
correct, unique ids; update the except path where execute_result.error is set
and ReviewResult is appended to increment rowid afterwards.
In `@sql/engines/mssql.py`:
- Around line 771-785: The catch block for the `USE` statement currently records
the failure but continues to run `all_statements`; update the `except Exception
as e` branch that handles `use_sql` to abort immediately by stopping further
execution and returning the current `execute_result` (or raising) after
populating it (the same fields you already set on `execute_result` and pushing
the `ReviewResult`); ensure you reference the `use_sql` handling block and
prevent processing `all_statements` or any subsequent logic when the DB switch
fails.
In `@sql/engines/phoenix.py`:
- Around line 176-206: In the except block where a failed ReviewResult is
appended (uses rowid), ensure subsequent skipped "Previous statement failed"
rows don’t reuse that same id by advancing rowid after recording the failure;
specifically, increment rowid (the loop counter used for IDs) immediately after
appending the failure ReviewResult in the except handler (or adjust the final
loop over split_sql to start IDs from rowid+1) so each ReviewResult created for
split_sql keeps a unique id.
In `@sql/plugins/plugin.py`:
- Around line 123-125: Replace the detailed traceback logging and error
embedding with a generic message: stop using traceback.format_exc() in the
logger and remove str(e) from the raised RuntimeError. Specifically, update the
except block that currently calls logger.error("Command execution
failed\n{}".format(traceback.format_exc())) and raises RuntimeError("Command
execution failed, reason: %s" % str(e)) so it instead logs a generic message via
logger.error("Command execution failed") (no traceback) and raises a generic
RuntimeError("Command execution failed") without including the original
exception details; if you need to suppress exception chaining, raise the
RuntimeError using "from None".
---
Nitpick comments:
In `@api_queries/views.py`:
- Around line 832-837: The except block swallowing AuditException needs to log
the failure details before raising the generic ValidationError; in the except
AuditException as exc: handler for auditor.operate(...) add a server-side log
call (using your existing logger) that records apply_id (from data or context),
request.user, and the attempted action/audit_status plus exc, then re-raise
serializers.ValidationError({"errors": "Audit failed."}) as before; reference
the auditor.operate call and the AuditException/serializers.ValidationError to
locate the correct spot.
In `@common/check.py`:
- Around line 180-183: The code block uses the root logging calls
(logging.error, logging.debug, logging.info) instead of the module-level logger;
update the storage connectivity messages to use the module logger (replace
logging.error, logging.debug, logging.info with logger.error, logger.debug,
logger.info) so handlers/format/filters remain consistent—also make the same
replacement for the other occurrence mentioned around line 187 (where logging.*
is used).
- Around line 49-53: Replace the broad "except Exception" blocks that wrap the
goInception connectivity checks with specific DB/connectivity exception types
from the DB client you use (e.g., OperationalError, InterfaceError,
DatabaseError or the driver's connection exceptions) and call
logger.exception(...) instead of logger.error(traceback.format_exc()) so the
stack/exception context is preserved; keep the existing result["status"] = 1 and
result["msg"] = "Unable to connect to goInception" behavior. Apply the same
change to both exception handlers (the blocks that currently use
traceback.format_exc()) so only connection-related exceptions are caught and
other errors propagate.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 4d645e01-15c8-473c-a704-c0544e39e1db
📒 Files selected for processing (21)
- .github/workflows/django.yml
- api_access/views.py
- api_admin/settings.py
- api_archives/views.py
- api_instances/views.py
- api_queries/views.py
- api_workflows/views.py
- common/check.py
- common/utils/sendmsg.py
- common/utils/wx_api.py
- sql/engines/cassandra.py
- sql/engines/doris.py
- sql/engines/mssql.py
- sql/engines/mysql.py
- sql/engines/oracle.py
- sql/engines/pgsql.py
- sql/engines/phoenix.py
- sql/engines/redis.py
- sql/plugins/plugin.py
- sql/plugins/tests.py
- sql/resource_group.py
🚧 Files skipped from review as they are similar to previous changes (4)
- api_admin/settings.py
- .github/workflows/django.yml
- common/utils/wx_api.py
- api_workflows/views.py
```python
result["msg"] = "Email delivery failed; check server logs."
logger.error("Email delivery test failed: %s", sender_response)
return result
```
Avoid logging raw email backend responses.
Line 116 logs `sender_response` verbatim, which can include provider/internal diagnostics and expand sensitive log surface. Keep client message generic (good), but sanitize server logs too.
🔧 Proposed fix
```diff
- logger.error("Email delivery test failed: %s", sender_response)
+ logger.error("Email delivery test failed")
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
result["msg"] = "Email delivery failed; check server logs."
logger.error("Email delivery test failed")
return result
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@common/check.py` around lines 115 - 117, The logger.error call currently logs
the raw sender_response (variable sender_response) which may contain sensitive
provider diagnostics; replace this with a sanitized summary: extract and log
only non-sensitive fields (e.g., status_code, error_code, and a short message)
or a generic message plus a hashed/trimmed identifier, and avoid printing the
full sender_response; update the error call in common/check.py (the logger.error
line) to use the sanitized_summary (or logger.debug the full response behind a
feature flag) so production logs do not contain raw backend responses.
```diff
 response = requests.post(
     url=url,
     json=data,
     headers={"Authorization": "Bearer " + get_feishu_access_token()},
-).json()
+)
```
🧩 Analysis chain
🏁 Script executed:
```shell
#!/bin/bash
python - <<'PY'
import ast
from pathlib import Path

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options", "request"}
for path in Path(".").rglob("*.py"):
    try:
        src = path.read_text(encoding="utf-8")
        tree = ast.parse(src)
    except Exception:
        continue
    for node in ast.walk(tree):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute):
            base = node.func.value
            if isinstance(base, ast.Name) and base.id == "requests" and node.func.attr in HTTP_METHODS:
                has_timeout = any(k.arg == "timeout" for k in node.keywords if k.arg is not None)
                if not has_timeout:
                    print(f"{path}:{node.lineno} -> requests.{node.func.attr} without timeout")
PY
```

Repository: jruszo/Datamingle
Length of output: 293
🏁 Script executed:
```shell
cat -n common/utils/sendmsg.py | sed -n '210,230p'
```

Repository: jruszo/Datamingle
Length of output: 951
🏁 Script executed:
```shell
# Check for timeout patterns in the entire sendmsg.py file
rg -n "timeout" common/utils/sendmsg.py
```

Repository: jruszo/Datamingle
Length of output: 381
🏁 Script executed:
```shell
cat -n common/utils/sendmsg.py | sed -n '190,200p'
```

Repository: jruszo/Datamingle
Length of output: 445
Add explicit timeout to Feishu API requests.
Lines 194 and 220 use `requests.post(...)` without a `timeout`; these can block indefinitely on network hangs and tie up worker threads. Use a bounded timeout of 5 seconds, consistent with the other `requests` calls in this file (lines 149, 171).
🔧 Proposed fixes
Line 194:

```diff
-r = requests.post(url=url, json=data)
+r = requests.post(url=url, json=data, timeout=5)
```

Line 220:

```diff
 response = requests.post(
     url=url,
     json=data,
     headers={"Authorization": "Bearer " + get_feishu_access_token()},
+    timeout=5,
 )
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
response = requests.post(
    url=url,
    json=data,
    headers={"Authorization": "Bearer " + get_feishu_access_token()},
    timeout=5,
)
```
🧰 Tools
🪛 Ruff (0.15.12)
[error] 220-220: Probable use of requests call without timeout
(S113)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@common/utils/sendmsg.py` around lines 220 - 224, The requests.post calls in
sendmsg.py (notably the call assigning response around the variable response and
the earlier call near line 194) lack an explicit timeout and can hang; update
both requests.post invocations to include timeout=5 (consistent with other calls
in this file) so the Feishu API requests use a bounded 5-second timeout—keep the
same url, json/data and headers parameters and simply add timeout=5 to each
requests.post call.
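One way to make the bounded-timeout convention hard to forget is a thin wrapper that injects a default at a single choke point. A minimal sketch; the `post_with_timeout` helper and `fake_post` stand-in are hypothetical, not part of sendmsg.py:

```python
def post_with_timeout(post_func, *args, timeout=5, **kwargs):
    # Inject a default timeout unless the caller explicitly passed one,
    # so no call site can silently block forever.
    kwargs.setdefault("timeout", timeout)
    return post_func(*args, **kwargs)

# Stand-in for requests.post that just echoes the kwargs it received.
def fake_post(url, **kwargs):
    return {"url": url, "kwargs": kwargs}

call = post_with_timeout(fake_post, "https://open.feishu.cn/x", json={"a": 1})
print(call["kwargs"]["timeout"])  # 5
```

In real code the wrapper would be called with `requests.post` as `post_func`; a caller can still override with `timeout=2` when a tighter bound is needed.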
```python
normalized_sql = sql.lower().replace("`", " ")
normalized_sql = " ".join(normalized_sql.replace("\n", " ").split())
normalized_sql = re.sub(r"\s*\.\s*", ".", normalized_sql)
references_mysql_user = "mysql.user" in normalized_sql
references_user_table = any(
    token.strip(";") == "user" for token in normalized_sql.split()
)
if references_mysql_user or (db_name == "mysql" and references_user_table):
```
Limit the `user` check to table positions.
In the `mysql` database this blocks any query containing a standalone `user` token, not just reads from the `user` table. For example, `SELECT user FROM some_other_table` now gets rejected even though it never touches `mysql.user`.
Suggested fix
```diff
 normalized_sql = sql.lower().replace("`", " ")
 normalized_sql = " ".join(normalized_sql.replace("\n", " ").split())
 normalized_sql = re.sub(r"\s*\.\s*", ".", normalized_sql)
 references_mysql_user = "mysql.user" in normalized_sql
-references_user_table = any(
-    token.strip(";") == "user" for token in normalized_sql.split()
-)
+references_user_table = bool(
+    re.search(
+        r"\b(from|join|update|into|desc(?:ribe)?|table)\s+user\b",
+        normalized_sql,
+    )
+)
 if references_mysql_user or (db_name == "mysql" and references_user_table):
     result["bad_query"] = True
     result["msg"] = "You do not have permission to view this table"
```
Verify each finding against the current code and only fix it if needed.
In `@sql/engines/mysql.py` around lines 585 - 592, The current check flags any
standalone "user" token anywhere in the SQL (e.g., SELECT user FROM
other_table); change references_user_table to only detect "user" when it appears
as a table reference by matching SQL keywords that introduce table names (e.g.,
FROM, JOIN, INTO, UPDATE, DELETE) rather than any token. Update the logic around
normalized_sql, references_mysql_user, and references_user_table so
references_user_table uses a regex search like looking for patterns
(from|join|into|update|delete)\s+`?user`?\b (case-insensitive) against
normalized_sql, and keep the existing references_mysql_user check for
"mysql.user"; then use the combined condition if references_mysql_user or
(db_name == "mysql" and references_user_table).
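The behavior difference the suggested regex buys can be checked in isolation. This is a standalone sketch of the proposed table-position logic, not the engine code itself (a slightly reduced keyword list is used for brevity):

```python
import re

def references_user_table(sql: str) -> bool:
    # Normalize roughly the way the engine does, then look for "user"
    # only where a table name can appear.
    normalized = " ".join(sql.lower().replace("`", " ").split())
    normalized = re.sub(r"\s*\.\s*", ".", normalized)
    return bool(re.search(r"\b(from|join|update|into|delete)\s+user\b", normalized))

print(references_user_table("SELECT user FROM some_other_table"))  # False
print(references_user_table("SELECT * FROM `user` WHERE host = '%'"))  # True
```

A column named `user` no longer trips the check, while reads of the `user` table still do; fully qualified `mysql.user` references remain covered by the separate substring check.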
```diff
     return validate_binary_path(value, "gh-ost")
 except ValueError as exc:
-    raise serializers.ValidationError(str(exc))
+    raise serializers.ValidationError(str(exc)) from None
```
```diff
     return validate_binary_path(value, "pt-online-schema-change")
 except ValueError as exc:
-    raise serializers.ValidationError(str(exc))
+    raise serializers.ValidationError(str(exc)) from None
```
```diff
 except MysqlDDLExecutorError as exc:
-    raise serializers.ValidationError({"errors": str(exc)})
+    logger.warning("Failed to resolve MySQL DDL executor", exc_info=True)
+    raise serializers.ValidationError({"errors": str(exc)}) from None
```
```python
        exc_info=True,
    )
    raise serializers.ValidationError(
        {"errors": str(exc)}
```
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
sql/engines/mssql.py (1)
771-784: ⚠️ Potential issue | 🟠 Major
Sanitize execution error payloads returned to clients.
Line 773/807 and Line 779/814 still expose raw exception text in `execute_result`/`ReviewResult`. This can leak DB internals even though logging is now hardened.
🔐 Suggested fix

```diff
 except Exception as e:
     logger.warning("MSSQL USE statement failed", exc_info=True)
-    execute_result.error = str(e)
+    safe_error = "Execution failed"
+    execute_result.error = safe_error
     execute_result.rows.append(
         ReviewResult(
             id=rowid,
             errlevel=2,
             stagestatus="Execute Failed",
-            errormessage=f"Exception info: {e}",
+            errormessage=safe_error,
             sql=use_sql,
             affected_rows=0,
             execute_time=0,
         )
     )
@@
 except Exception as e:
     logger.warning("Mssql command execution failed", exc_info=True)
-    execute_result.error = str(e)
+    safe_error = "Execution failed"
+    execute_result.error = safe_error
     # Append failed statement to execution results.
     execute_result.rows.append(
         ReviewResult(
             id=rowid,
             errlevel=2,
             stagestatus="Execute Failed",
-            errormessage=f"Exception info: {e}",
+            errormessage=safe_error,
             sql=statement,
             affected_rows=0,
             execute_time=0,
         )
     )
```

Also applies to: 805-815
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sql/engines/mssql.py` around lines 771 - 784, The current exception handling in the MSSQL USE statement block assigns raw exception text into execute_result.error and ReviewResult.errormessage (see execute_result, ReviewResult, use_sql, rowid), which may leak DB internals; change these assignments to a sanitized generic message (e.g., "Internal execution error" or "Execution failed") or a minimal error code/token before returning to clients, while keeping logger.warning(..., exc_info=True) to record full details internally; do this for both the USE block and the other identical block handling execution failures so clients never receive raw exception strings.sql/engines/cassandra.py (1)
218-227: ⚠️ Potential issue | 🟠 Major
Do not return raw Cassandra exception text in workflow results.
Line 220 and Line 226 still include unfiltered exception details in returned objects. Keep detailed info in server logs and return a generic safe message to clients.
🔐 Suggested fix

```diff
 except Exception as e:
     logger.warning("%s command execution error", self.name, exc_info=True)
-    execute_result.error = str(e)
+    safe_error = "Execution failed"
+    execute_result.error = safe_error
     execute_result.rows.append(
         ReviewResult(
             id=rowid,
             errlevel=2,
             stagestatus="Execute Failed",
-            errormessage=f"Exception: {e}",
+            errormessage=safe_error,
             sql=statement,
             affected_rows=0,
             execute_time=0,
         )
     )
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sql/engines/cassandra.py` around lines 218 - 227, The code is returning raw Cassandra exception text to clients; keep the full exception in server logs but replace client-facing fields. In the except block around logger.warning(self.name...), stop assigning str(e) into execute_result.error and ReviewResult.errormessage; instead set execute_result.error to a generic message like "Internal execution error" and set ReviewResult(errormessage="Internal execution error", errlevel=2, stagestatus="Execute Failed", ...) while keeping logger.warning(..., exc_info=True) to retain full details in logs; ensure no other fields use str(e).
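The split this comment asks for (full detail in the server log, only a generic string in the returned result) can be sketched engine-agnostically. `run_statement` and `boom` are illustrative names only, not the engine API:

```python
import logging

logger = logging.getLogger("engine")

def run_statement(execute, statement):
    # Full diagnostics go to the server log; the client-facing result
    # only ever carries a generic, safe message.
    try:
        return {"error": None, "rows": execute(statement)}
    except Exception:
        logger.warning("statement execution failed", exc_info=True)
        return {"error": "Execution failed", "rows": []}

def boom(_statement):
    raise RuntimeError("connection refused: 10.0.0.5:9042")

outcome = run_statement(boom, "SELECT * FROM t")
print(outcome["error"])  # Execution failed
```

The host/port detail stays in the log record's traceback and never reaches the caller.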
♻️ Duplicate comments (1)
common/check.py (1)
113-117: ⚠️ Potential issue | 🟠 Major
Do not return/log raw email backend responses.
Line 115 and Line 116 still expose `sender_response` directly. That can leak provider/internal diagnostics to clients and logs.
🔧 Proposed fix

```diff
 if sender_response != "success":
     result["status"] = 1
-    result["msg"] = sender_response
-    logger.error("Email delivery test failed: %s", sender_response)
+    result["msg"] = "Email delivery test failed."
+    logger.error("Email delivery test failed")
     return result
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@common/check.py` around lines 113 - 117, Do not expose raw sender_response to clients or logs: update the failure path where sender_response is used (the sender_response variable, result dict assignment, and logger.error call) to set result["msg"] to a generic message like "Email delivery failed" (no provider/internal details) and change the logger.error to record only a sanitized/limited identifier (e.g., an error code, truncated or redacted excerpt, or hashed value) instead of the full sender_response so internal diagnostics are not leaked.
🧹 Nitpick comments (3)
common/check.py (2)
213-217: Use warning-level logging (without traceback) for expected lookup failures.
`Instance.DoesNotExist` / validation failures are expected user-input paths; `logger.exception(...)` adds noisy stack traces. Prefer a warning log with instance context.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@common/check.py` around lines 213 - 217, Replace the noisy logger.exception call in the except handling for Instance.DoesNotExist/ValueError/TypeError/ValidationError with a warning-level log that does not emit a traceback; in the block that currently catches Instance.DoesNotExist etc. (around the connection test logic using result and returning an HttpResponse), call logger.warning with concise context (e.g. instance identifier, name or request/user info available in scope) like "Instance lookup failed for connection test: %s" and pass the identifier as a parameter, then keep setting result["status"], result["msg"] and returning the HttpResponse as before.
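The traceback-noise trade-off this nitpick describes is easy to see with a toy logger setup (illustrative only; the real code logs through the project's logging configuration):

```python
import io
import logging

stream = io.StringIO()
logging.basicConfig(stream=stream, level=logging.INFO, force=True)
log = logging.getLogger("demo")

try:
    int("not-a-port")
except ValueError:
    # logger.exception attaches the full traceback of the active exception...
    log.exception("lookup failed")
    # ...while logger.warning with plain context stays to one line.
    log.warning("lookup failed for instance %s", "db-01")

output = stream.getvalue()
print("Traceback" in output)  # True
```

For expected user-input failures, the single-line warning with an identifier carries the same diagnostic value without the stack-trace noise.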
49-52: Narrow the exception handling to expected failure modes.
Catching all exceptions masks the actual error types and makes debugging harder. For these connection checks, `MySQLdb.Error` covers the documented DB-API 2.0 exception hierarchy for connection/authentication/network failures. Additionally, the `int(port)` conversion can raise `ValueError`, making it appropriate to catch alongside database errors.
🔧 Proposed fix

```diff
-except Exception:
-    logger.error(traceback.format_exc())
+except (MySQLdb.Error, ValueError):
+    logger.exception("goInception primary connection check failed")
     result["status"] = 1
     result["msg"] = "Unable to connect to goInception"
     return result
@@
-except Exception:
-    logger.error(traceback.format_exc())
+except (MySQLdb.Error, ValueError):
+    logger.exception("goInception backup connection check failed")
     result["status"] = 1
     result["msg"] = "Unable to connect to goInception backup database"
```

Also applies to: 68-71
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@common/check.py` around lines 49 - 52, The broad except Exception that logs traceback.format_exc() and sets result["status"]/result["msg"] should be narrowed to only catch the expected errors — replace except Exception with except (MySQLdb.Error, ValueError) around the goInception connection checks so DB-API connection/authentication/network failures (MySQLdb.Error) and port conversion failures (ValueError from int(port)) are handled; keep the same logger.error(traceback.format_exc()) and result assignments, and apply the same change to the second identical block (the other try/except that currently sets result["status"]=1 and result["msg"]="Unable to connect to goInception").api_instances/views.py (1)
755-760: Consider using `from None` to suppress exception chaining.
The generic error messages are correctly hiding sensitive details from API responses. However, adding `from None` would explicitly suppress exception chaining, making the intent clearer and silencing the B904 linter warnings. This applies to all similar patterns in this file (Lines 760, 823, 919, 1009, 1108, 1171, 1295, 1346, 1452, 1507, 1551, 1621, 1714, 1869, 1968, 2013, 2058, 2106, 2152, 2196, 2515).
♻️ Example fix pattern

```diff
 except Exception as exc:
     logger.exception(
         "Failed to list data dictionary databases for instance_id=%s",
         instance_id,
     )
-    raise serializers.ValidationError({"errors": "Operation failed."})
+    raise serializers.ValidationError({"errors": "Operation failed."}) from None
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api_instances/views.py` around lines 755 - 760, The except blocks that log the exception and then raise a generic serializers.ValidationError (e.g., the logger.exception call followed by raise serializers.ValidationError({"errors": "Operation failed."}) around instance_id handling) should suppress exception chaining by raising the ValidationError with "from None"; update each similar pattern in this file (all except handlers at the listed locations) to use "raise serializers.ValidationError({...}) from None" so the original exception is not chained into the API response or linter warnings.
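What `from None` changes is visible on the exception object itself. A toy `parse_port` stands in for the view code here; the attribute behavior is standard Python:

```python
def parse_port(value):
    try:
        return int(value)
    except ValueError:
        # "from None" hides the internal ValueError from the public error,
        # setting __suppress_context__ and leaving __cause__ empty.
        raise RuntimeError("Operation failed.") from None

try:
    parse_port("abc")
except RuntimeError as exc:
    caught = exc

print(caught.__suppress_context__, caught.__cause__)  # True None
```

The original `ValueError` is still reachable via `__context__` for debugging, but default traceback rendering no longer prints the "During handling of the above exception" chain.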
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@api_admin/settings.py`:
- Line 363: The code currently raises serializers.ValidationError(str(exc))
(seen around the ValidationError raises in api_admin/settings.py that wrap
validate_binary_path(...) failures), which can leak internal exception text to
clients; change those raises to return a fixed, client-safe error message (e.g.,
"Invalid binary path" or "Invalid configuration value") instead of str(exc),
preserving the original exception for internal logging if needed (log the exc
inside the same scope using logger.exception or logger.error) and keep the
ValidationError type so callers still get a 4xx response.
In `@api_instances/views.py`:
- Around line 2514-2515: The except block currently swallows the caught
exception (named msg) and re-raises a generic serializers.ValidationError;
update this handler to log the original exception before re-raising by calling
logger.exception(...) (or logger.error with exc_info=True) using the caught
exception variable (rename msg to e if helpful) and then raise the same
serializers.ValidationError({"errors": "Operation failed."}) so the error
details are recorded while preserving the external API response.
In `@api_workflows/views.py`:
- Around line 1700-1707: Replace the insecure exposure of the raw exception
string in the MySQL DDL executor error handler: inside the except
MysqlDDLExecutorError as exc block (where logger.warning is called and
serializers.ValidationError is raised), stop returning {"errors": str(exc)} to
the client; instead raise serializers.ValidationError with a generic,
non-sensitive error message such as {"errors": "Failed to resolve MySQL DDL
executor"} while keeping the detailed exception logged via
logger.warning/logger.exception (exc_info=True) so internal diagnostics remain
available without leaking exception details to the API consumer.
- Around line 1418-1421: The catch for MysqlDDLExecutorError currently returns
the exception text to clients (raise serializers.ValidationError({"errors":
str(exc)})), which can leak subprocess output; instead keep detailed exc_info in
server logs but raise a generic client-safe message. Update the two places
handling MysqlDDLExecutorError (the handler around selected_executor =
resolved_executor.executor_id and the similar handler in the same file tied to
executor.dry_run()/_run_subprocess()) to keep logger.warning(..., exc_info=True)
but change the ValidationError payload to a generic string (e.g. {"errors":
"MySQL DDL executor dry-run failed"}), so internal stderr/stdout are not
returned to API clients.
---
Outside diff comments:
In `@sql/engines/cassandra.py`:
- Around line 218-227: The code is returning raw Cassandra exception text to
clients; keep the full exception in server logs but replace client-facing
fields. In the except block around logger.warning(self.name...), stop assigning
str(e) into execute_result.error and ReviewResult.errormessage; instead set
execute_result.error to a generic message like "Internal execution error" and
set ReviewResult(errormessage="Internal execution error", errlevel=2,
stagestatus="Execute Failed", ...) while keeping logger.warning(...,
exc_info=True) to retain full details in logs; ensure no other fields use
str(e).
In `@sql/engines/mssql.py`:
- Around line 771-784: The current exception handling in the MSSQL USE statement
block assigns raw exception text into execute_result.error and
ReviewResult.errormessage (see execute_result, ReviewResult, use_sql, rowid),
which may leak DB internals; change these assignments to a sanitized generic
message (e.g., "Internal execution error" or "Execution failed") or a minimal
error code/token before returning to clients, while keeping logger.warning(...,
exc_info=True) to record full details internally; do this for both the USE block
and the other identical block handling execution failures so clients never
receive raw exception strings.
---
Duplicate comments:
In `@common/check.py`:
- Around line 113-117: Do not expose raw sender_response to clients or logs:
update the failure path where sender_response is used (the sender_response
variable, result dict assignment, and logger.error call) to set result["msg"] to
a generic message like "Email delivery failed" (no provider/internal details)
and change the logger.error to record only a sanitized/limited identifier (e.g.,
an error code, truncated or redacted excerpt, or hashed value) instead of the
full sender_response so internal diagnostics are not leaked.
---
Nitpick comments:
In `@api_instances/views.py`:
- Around line 755-760: The except blocks that log the exception and then raise a
generic serializers.ValidationError (e.g., the logger.exception call followed by
raise serializers.ValidationError({"errors": "Operation failed."}) around
instance_id handling) should suppress exception chaining by raising the
ValidationError with "from None"; update each similar pattern in this file (all
except handlers at the listed locations) to use "raise
serializers.ValidationError({...}) from None" so the original exception is not
chained into the API response or linter warnings.
In `@common/check.py`:
- Around line 213-217: Replace the noisy logger.exception call in the except
handling for Instance.DoesNotExist/ValueError/TypeError/ValidationError with a
warning-level log that does not emit a traceback; in the block that currently
catches Instance.DoesNotExist etc. (around the connection test logic using
result and returning an HttpResponse), call logger.warning with concise context
(e.g. instance identifier, name or request/user info available in scope) like
"Instance lookup failed for connection test: %s" and pass the identifier as a
parameter, then keep setting result["status"], result["msg"] and returning the
HttpResponse as before.
- Around line 49-52: The broad except Exception that logs traceback.format_exc()
and sets result["status"]/result["msg"] should be narrowed to only catch the
expected errors — replace except Exception with except (MySQLdb.Error,
ValueError) around the goInception connection checks so DB-API
connection/authentication/network failures (MySQLdb.Error) and port conversion
failures (ValueError from int(port)) are handled; keep the same
logger.error(traceback.format_exc()) and result assignments, and apply the same
change to the second identical block (the other try/except that currently sets
result["status"]=1 and result["msg"]="Unable to connect to goInception").
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: c0396a1f-4f5b-4402-bcfa-cda771428cea
📒 Files selected for processing (8)
- api_admin/settings.py
- api_instances/views.py
- api_workflows/views.py
- common/check.py
- sql/data_dictionary.py
- sql/engines/cassandra.py
- sql/engines/mssql.py
- sql/engines/oracle.py
✅ Files skipped from review due to trivial changes (1)
- sql/data_dictionary.py
```diff
     return validate_binary_path(value, "gh-ost")
 except ValueError as exc:
-    raise serializers.ValidationError(str(exc))
+    raise serializers.ValidationError(str(exc)) from None
```
Avoid exposing raw validation exception text to clients.
On Line 363 and Line 369, returning str(exc) can leak internal details from validate_binary_path(...) into API responses. Use a fixed, client-safe message instead.
🔒 Suggested patch
def validate_gh_ost(self, value):
try:
return validate_binary_path(value, "gh-ost")
- except ValueError as exc:
- raise serializers.ValidationError(str(exc)) from None
+ except ValueError:
+ raise serializers.ValidationError("Operation failed.") from None
def validate_pt_osc(self, value):
try:
return validate_binary_path(value, "pt-online-schema-change")
- except ValueError as exc:
- raise serializers.ValidationError(str(exc)) from None
+ except ValueError:
+ raise serializers.ValidationError("Operation failed.") from None
Also applies to: 369-369
🧰 Tools
🪛 GitHub Check: CodeQL
[warning] 363-363: Information exposure through an exception
Stack trace information flows to this location and may be exposed to an external user.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@api_admin/settings.py` at line 363, The code currently raises
serializers.ValidationError(str(exc)) (seen around the ValidationError raises in
api_admin/settings.py that wrap validate_binary_path(...) failures), which can
leak internal exception text to clients; change those raises to return a fixed,
client-safe error message (e.g., "Invalid binary path" or "Invalid configuration
value") instead of str(exc), preserving the original exception for internal
logging if needed (log the exc inside the same scope using logger.exception or
logger.error) and keep the ValidationError type so callers still get a 4xx
response.
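The log-then-generic-raise pattern this prompt describes can be sketched with stdlib stand-ins (the `ValidationError` class and `validate_binary_path` below are hypothetical substitutes for `rest_framework.serializers.ValidationError` and the project's validator):

```python
import logging

logger = logging.getLogger(__name__)


class ValidationError(Exception):
    """Minimal stand-in for rest_framework.serializers.ValidationError."""


def validate_binary_path(value: str, name: str) -> str:
    # Hypothetical validator: reject anything that is not an absolute path.
    if not value.startswith("/"):
        raise ValueError(f"{name} path {value!r} is not absolute (internal detail)")
    return value


def validate_gh_ost(value: str) -> str:
    try:
        return validate_binary_path(value, "gh-ost")
    except ValueError:
        # Record the specific reason server-side; the client only ever sees
        # the fixed message, so internals never reach the API response.
        logger.warning("gh-ost binary path rejected", exc_info=True)
        raise ValidationError("Invalid binary path.") from None
```

The `from None` keeps the original `ValueError` text out of any traceback that might be rendered to the caller, while `exc_info=True` preserves it in the server log.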
  except Exception as msg:
-     raise serializers.ValidationError({"errors": str(msg)})
+     raise serializers.ValidationError({"errors": "Operation failed."})
Missing exception logging before re-raising generic error.
Unlike other handlers in this file that call logger.exception(...) before raising the generic error, this handler silently swallows the exception details without logging. The captured variable msg is also unused.
🐛 Proposed fix
- except Exception as msg:
- raise serializers.ValidationError({"errors": "Operation failed."})
+ except Exception:
+ logger.exception(
+ "Failed to retrieve instance resources for instance_id=%s resource_type=%s",
+ instance_id,
+ resource_type,
+ )
+ raise serializers.ValidationError({"errors": "Operation failed."}) from None
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
- except Exception as msg:
-     raise serializers.ValidationError({"errors": str(msg)})
-     raise serializers.ValidationError({"errors": "Operation failed."})
+ except Exception:
+     logger.exception(
+         "Failed to retrieve instance resources for instance_id=%s resource_type=%s",
+         instance_id,
+         resource_type,
+     )
+     raise serializers.ValidationError({"errors": "Operation failed."}) from None
🧰 Tools
🪛 Ruff (0.15.12)
[warning] 2514-2514: Do not catch blind exception: Exception
(BLE001)
[warning] 2515-2515: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@api_instances/views.py` around lines 2514 - 2515, The except block currently
swallows the caught exception (named msg) and re-raises a generic
serializers.ValidationError; update this handler to log the original exception
before re-raising by calling logger.exception(...) (or logger.error with
exc_info=True) using the caught exception variable (rename msg to e if helpful)
and then raise the same serializers.ValidationError({"errors": "Operation
failed."}) so the error details are recorded while preserving the external API
response.
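Ruff's B904 warning cited above is about exception chaining. A minimal stdlib-only illustration of what `from None` changes (the function names are invented for the example):

```python
def reraise_default() -> None:
    try:
        int("nope")
    except ValueError:
        # Implicit chaining: the ValueError is kept as __context__, and both
        # tracebacks are printed if this RuntimeError propagates.
        raise RuntimeError("Operation failed.")


def reraise_from_none() -> None:
    try:
        int("nope")
    except ValueError:
        # `from None` sets __suppress_context__, so only the generic message
        # is rendered; the original error text stays out of the output.
        raise RuntimeError("Operation failed.") from None
```

Either form satisfies B904; the review's point is that when the generic message is the one shown to clients, `from None` also keeps the chained internal traceback from leaking alongside it.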
  except MysqlDDLExecutorError as exc:
-     raise serializers.ValidationError({"errors": str(exc)})
+     logger.warning("Failed to resolve MySQL DDL executor", exc_info=True)
+     raise serializers.ValidationError({"errors": str(exc)}) from None
  selected_executor = resolved_executor.executor_id
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for MysqlDDLExecutorError usage and message construction
ast-grep --pattern 'class MysqlDDLExecutorError($_) {
$$$
}'
# Also search for how it's raised
rg -n -A3 'raise MysqlDDLExecutorError'
Repository: jruszo/Datamingle
Length of output: 1455
🏁 Script executed:
# Find MysqlDDLExecutorError class definition
fd -t f '*.py' | xargs grep -l 'class MysqlDDLExecutorError'
Repository: jruszo/Datamingle
Length of output: 347
🏁 Script executed:
# Get full context around lines 1418-1421 in api_workflows/views.py
cat -n api_workflows/views.py | sed -n '1410,1430p'
Repository: jruszo/Datamingle
Length of output: 1097
🏁 Script executed:
# Check the `reason` variable at line 428 in mysql_ddl.py to understand what it contains
cat -n sql/engines/mysql_ddl.py | sed -n '420,435p'
Repository: jruszo/Datamingle
Length of output: 787
🏁 Script executed:
# Look for any sanitization or error message handling patterns in the codebase
rg -n 'ValidationError.*str\(' api_workflows/ -B2 -A2
Repository: jruszo/Datamingle
Length of output: 481
🏁 Script executed:
# Search for the inspection object and blockers definition
rg -n 'blockers' sql/engines/mysql_ddl.py -B5 -A5
Repository: jruszo/Datamingle
Length of output: 2002
🏁 Script executed:
# Look at the context around line 170 where output is used
cat -n sql/engines/mysql_ddl.py | sed -n '155,175p'
Repository: jruszo/Datamingle
Length of output: 1005
🏁 Script executed:
# Search for the inspection class/object to understand what blockers contains
rg -n 'class.*Inspection|inspection.*=' sql/engines/mysql_ddl.py -A3
Repository: jruszo/Datamingle
Length of output: 689
🏁 Script executed:
# Find _resolve_mysql_ddl_executor function definition
rg -n 'def _resolve_mysql_ddl_executor' api_workflows/ -A20
Repository: jruszo/Datamingle
Length of output: 1412
🏁 Script executed:
# Also check if preflight parameter affects whether subprocess is called
rg -n 'preflight' sql/engines/mysql_ddl.py -B2 -A2
Repository: jruszo/Datamingle
Length of output: 599
🏁 Script executed:
# Find preflight_ddl_executor method to see what it does
rg -n 'def preflight_ddl_executor' sql/engines/mysql_ddl.py -A15
Repository: jruszo/Datamingle
Length of output: 43
🏁 Script executed:
# Check the full preflight method implementation
cat -n sql/engines/mysql_ddl.py | sed -n '452,475p'
Repository: jruszo/Datamingle
Length of output: 1176
🏁 Script executed:
# Search for preflight_ddl_executor more broadly
rg -n 'preflight_ddl_executor' sql/engines/
Repository: jruszo/Datamingle
Length of output: 152
🏁 Script executed:
# Check if it's inherited or in a parent class
rg -n 'class.*MysqlDDL' sql/engines/mysql_ddl.py -A3
Repository: jruszo/Datamingle
Length of output: 1499
🏁 Script executed:
# Check the preflight_ddl_executor implementation in mysql.py
cat -n sql/engines/mysql.py | sed -n '755,780p'
Repository: jruszo/Datamingle
Length of output: 1214
🏁 Script executed:
# Also check what calls .preflight() method
rg -n '\.preflight\(' sql/engines/mysql_ddl.py sql/engines/mysql.py -B2 -A2
Repository: jruszo/Datamingle
Length of output: 353
🏁 Script executed:
# Check the dry_run method implementation for BaseMysqlDDLExecutor
rg -n 'def dry_run' sql/engines/mysql_ddl.py -A10
Repository: jruszo/Datamingle
Length of output: 1288
🏁 Script executed:
# Also verify that _run_subprocess is called from dry_run
rg -n '_run_subprocess' sql/engines/mysql_ddl.py
Repository: jruszo/Datamingle
Length of output: 309
🏁 Script executed:
# Check if the code snippet at lines 1418-1421 complies with Black formatting
cat -n api_workflows/views.py | sed -n '1418,1421p'
Repository: jruszo/Datamingle
Length of output: 368
🏁 Script executed:
# Also check if similar patterns exist elsewhere in the file that handle MysqlDDLExecutorError
rg -n 'MysqlDDLExecutorError' api_workflows/views.py
Repository: jruszo/Datamingle
Length of output: 236
🏁 Script executed:
# Check the other occurrence at line 1700
cat -n api_workflows/views.py | sed -n '1695,1710p'
Repository: jruszo/Datamingle
Length of output: 989
Replace str(exc) with a generic error message to prevent exposing subprocess output.
The preflight path calls executor.dry_run(), which invokes _run_subprocess(). When subprocess execution fails, this raises MysqlDDLExecutorError with the raw stderr/stdout output (line 170-172 of sql/engines/mysql_ddl.py). Using str(exc) exposes this unsanitized output to API clients, potentially leaking database details, connection parameters, and system information.
This pattern appears at two locations:
- Line 1420 (the one under review)
- Line 1706 (same issue)
Apply the suggested fix to both occurrences:
Suggested fix
except MysqlDDLExecutorError as exc:
logger.warning("Failed to resolve MySQL DDL executor", exc_info=True)
- raise serializers.ValidationError({"errors": str(exc)}) from None
+ raise serializers.ValidationError(
+ {"errors": "MySQL DDL executor resolution failed."}
+ ) from None
🧰 Tools
🪛 GitHub Check: CodeQL
[warning] 1420-1420: Information exposure through an exception
Stack trace information flows to this location and may be exposed to an external user.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@api_workflows/views.py` around lines 1418 - 1421, The catch for
MysqlDDLExecutorError currently returns the exception text to clients (raise
serializers.ValidationError({"errors": str(exc)})), which can leak subprocess
output; instead keep detailed exc_info in server logs but raise a generic
client-safe message. Update the two places handling MysqlDDLExecutorError (the
handler around selected_executor = resolved_executor.executor_id and the similar
handler in the same file tied to executor.dry_run()/_run_subprocess()) to keep
logger.warning(..., exc_info=True) but change the ValidationError payload to a
generic string (e.g. {"errors": "MySQL DDL executor dry-run failed"}), so
internal stderr/stdout are not returned to API clients.
  except MysqlDDLExecutorError as exc:
-     raise serializers.ValidationError({"errors": str(exc)})
+     logger.warning(
+         "Failed to resolve MySQL DDL executor for execution",
+         exc_info=True,
+     )
+     raise serializers.ValidationError(
+         {"errors": str(exc)}
+     ) from None
Same str(exc) exposure issue as the schedule endpoint.
This handler has the same CodeQL-flagged pattern. Apply the same fix here for consistency.
🛡️ Proposed fix for consistency
except MysqlDDLExecutorError as exc:
logger.warning(
"Failed to resolve MySQL DDL executor for execution",
exc_info=True,
)
raise serializers.ValidationError(
- {"errors": str(exc)}
+ {"errors": "MySQL DDL executor resolution failed."}
) from None
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
- except MysqlDDLExecutorError as exc:
-     raise serializers.ValidationError({"errors": str(exc)})
-     logger.warning(
-         "Failed to resolve MySQL DDL executor for execution",
-         exc_info=True,
-     )
-     raise serializers.ValidationError(
-         {"errors": str(exc)}
-     ) from None
+ except MysqlDDLExecutorError as exc:
+     logger.warning(
+         "Failed to resolve MySQL DDL executor for execution",
+         exc_info=True,
+     )
+     raise serializers.ValidationError(
+         {"errors": "MySQL DDL executor resolution failed."}
+     ) from None
🧰 Tools
🪛 GitHub Check: CodeQL
[warning] 1706-1706: Information exposure through an exception
Stack trace information flows to this location and may be exposed to an external user.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@api_workflows/views.py` around lines 1700 - 1707, Replace the insecure
exposure of the raw exception string in the MySQL DDL executor error handler:
inside the except MysqlDDLExecutorError as exc block (where logger.warning is
called and serializers.ValidationError is raised), stop returning {"errors":
str(exc)} to the client; instead raise serializers.ValidationError with a
generic, non-sensitive error message such as {"errors": "Failed to resolve MySQL
DDL executor"} while keeping the detailed exception logged via
logger.warning/logger.exception (exc_info=True) so internal diagnostics remain
available without leaking exception details to the API consumer.
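The sanitize-subprocess-output pattern behind this prompt can be sketched as follows. The `MysqlDDLExecutorError` class and `run_executor` function here are illustrative stand-ins (the real error type and `_run_subprocess` live in sql/engines/mysql_ddl.py):

```python
import logging
import subprocess

logger = logging.getLogger(__name__)


class MysqlDDLExecutorError(Exception):
    """Stand-in for the project's executor error type."""


def run_executor(cmd: list[str]) -> str:
    """Run an external DDL tool, keeping raw output in server logs only."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode != 0:
        # stderr may contain hosts, credential flags, or SQL fragments:
        # record it server-side, then raise a fixed, client-safe message.
        logger.warning(
            "DDL executor failed (rc=%s): %s", proc.returncode, proc.stderr.strip()
        )
        raise MysqlDDLExecutorError("MySQL DDL executor resolution failed.")
    return proc.stdout
```

Callers that surface this error to an API response can then pass `str(exc)` through safely, because the exception text is already a constant.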
Summary
Testing
- black --check --diff . (passed)
- docker exec datamingle-app python manage.py test sql.plugins.tests (passed)
- docker exec datamingle-app python manage.py check (passed)
- docker exec datamingle-app python manage.py makemigrations sql --check (passed)
Summary by CodeRabbit
Release Notes