fix: Sanitize database credentials in error messages #11621

Merged: Cristhianzl merged 3 commits into main from cz/fix-db-exposed on Mar 3, 2026
Conversation

Member

@Cristhianzl Cristhianzl commented Feb 6, 2026

OBJECTIVE: Prevent sensitive database credentials (username, password) from being exposed in application logs when DATABASE_URL is misconfigured.

CHANGES:

  • Add sanitize_database_url function to mask credentials with ***
  • Update database_url validator to use sanitized URL in error messages
  • Add comprehensive unit tests for credential sanitization

Before:

  File "/Users/criszl/Documents/langflow/.venv/lib/python3.13/site-packages/pydantic/main.py", line 253, in __init__
    validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
pydantic_core._pydantic_core.ValidationError: 1 validation error for Settings
database_url
  Value error, Invalid database_url provided: 'postgresql+psycopg://myuser:mysecretpassword@127.0.0.1::5432/mydb' [type=value_error, input_value='postgresql+psycopg://myu...rd@127.0.0.1::5432/mydb', input_type=str]
    For further information visit https://errors.pydantic.dev/2.11/v/value_error

After:

  File "/Users/criszl/Documents/langflow/.venv/lib/python3.13/site-packages/pydantic/main.py", line 253, in __init__
    validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
pydantic_core._pydantic_core.ValidationError: 1 validation error for Settings
database_url
  Value error, Invalid database_url provided: 'postgresql+psycopg://***:***@127.0.0.1::5432/mydb' [type=value_error, input_value='postgresql+psycopg://myu...rd@127.0.0.1::5432/mydb', input_type=str]
    For further information visit https://errors.pydantic.dev/2.11/v/value_error
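The before/after difference comes from building the error message out of a masked copy of the URL rather than the raw value. A minimal standard-library sketch of that pattern follows. The names `mask_credentials` and `check_database_url` and the `is_valid` parameter are hypothetical. The masking here splits the netloc at its last `@`, which is a stand-in technique for illustration and not the PR's `sanitize_database_url` implementation.

```python
from urllib.parse import urlsplit, urlunsplit


def mask_credentials(url: str) -> str:
    """Illustrative stand-in for sanitize_database_url (not the PR's code)."""
    parts = urlsplit(url)
    # Everything before the last "@" in the netloc is the credentials segment.
    _userinfo, at, hostinfo = parts.netloc.rpartition("@")
    if not at:
        return url  # no credentials segment to mask
    return urlunsplit(parts._replace(netloc=f"***:***@{hostinfo}"))


def check_database_url(url: str, is_valid) -> str:
    """Return url if valid; otherwise raise with the credentials masked."""
    if not is_valid(url):
        msg = f"Invalid database_url provided: '{mask_credentials(url)}'"
        raise ValueError(msg)
    return url


try:
    check_database_url(
        "postgresql+psycopg://myuser:mysecretpassword@127.0.0.1::5432/mydb",
        is_valid=lambda _url: False,  # force the failure path for this demo
    )
except ValueError as exc:
    message = str(exc)
```

The raw URL never reaches the exception text, so anything that logs the exception sees only the masked form.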

Summary by CodeRabbit

  • New Features

    • Database credentials are now masked in error messages to prevent accidental exposure.
    • Enhanced database URL validation with improved error handling.
  • Tests

    • Added comprehensive test coverage for database URL validation and credential masking scenarios.

@Cristhianzl Cristhianzl self-assigned this Feb 6, 2026
Contributor

coderabbitai Bot commented Feb 6, 2026

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

This change adds database URL sanitization to prevent credential exposure in error messages. A new sanitize_database_url function masks credentials in database URLs, with fallback to regex-based masking. The function is integrated into settings validation error handling, and comprehensive tests validate the sanitization behavior across various URL formats.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Core Sanitization Logic: src/lfx/src/lfx/utils/util_strings.py | Added new sanitize_database_url() function that masks username/password in database URLs using SQLAlchemy parsing with regex fallback. Updated is_valid_database_url() to explicitly handle empty input and return boolean result. |
| Settings Integration: src/lfx/src/lfx/services/settings/base.py | Imported sanitize_database_url() and applied it to database URL error messages to prevent credential exposure in validation error output. |
| Test Coverage: src/lfx/tests/unit/utils/test_database_url_sanitization.py | Added comprehensive unit tests covering sanitization across PostgreSQL, MySQL, and SQLite URLs; validation error message masking; malformed URL handling; and edge cases with empty/None inputs. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes


Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error, 2 warnings)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Test Coverage For New Implementations | ❌ Error | Test coverage is comprehensive but contains a security-critical flaw: weak assertions at lines 132-133 use 'or' logic that short-circuits and allows credential leaks to pass undetected. | Replace 'or' assertions with direct checks: assert "user" not in sanitized and assert "pass" not in sanitized. Use distinctive credentials like "testuser123" instead of common substrings to ensure proper masking validation. |
| Test Quality And Coverage | ⚠️ Warning | Test suite contains weak assertions using 'or' logic that fail to properly validate credential masking, allowing tests to pass even when credentials are not actually masked. | Replace weak 'or' assertions with stronger 'and' logic or individual assertions that independently verify both credential absence and masking placeholder presence. |
| Test File Naming And Structure | ⚠️ Warning | Test assertions use or operators that fail to properly validate credential masking, allowing the test to pass even when credentials are not actually masked. | Replace weak assertions with independent assertions that enforce credential masking without short-circuiting logic. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'fix: Sanitize database credentials in error messages' accurately and clearly describes the main change: preventing credential exposure in error messages by sanitizing database URLs. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 92.31% which is sufficient. The required threshold is 80.00%. |
| Excessive Mock Usage Warning | ✅ Passed | Test file demonstrates appropriate and minimal use of test doubles with 8 test methods validating real function behavior without excessive mocking. |
@github-actions github-actions Bot added the bug Something isn't working label Feb 6, 2026
@github-actions github-actions Bot added bug Something isn't working and removed bug Something isn't working labels Feb 6, 2026
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/lfx/tests/unit/utils/test_database_url_sanitization.py`:
- Around line 125-135: The test
test_should_handle_malformed_url_with_regex_fallback uses weak assertions that
always pass because it uses "or" with "***" present; update the assertions to
require both that the original credentials are not present and that the mask is
present (use "and" logic) so that sanitize_database_url(url) is verified to
remove credentials and also to include the mask; additionally replace the
placeholder credentials with distinctive strings (e.g.,
"s3cr3tuser"/"s3cr3tpass") to avoid accidental substring matches against
host/path components when checking the sanitized result.
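The weakness described above can be reproduced in a few lines. The sketch below is not the PR's test code: the two helper functions and the sample strings are hypothetical, written only to show why an `or` between "credential absent" and "mask present" lets a partial leak through, while independent checks do not.

```python
def weak_check(sanitized: str) -> bool:
    """Flawed pattern: true as soon as either side of the 'or' is true."""
    return "s3cr3tuser" not in sanitized or "***" in sanitized


def strong_check(sanitized: str) -> bool:
    """Every condition must hold: no username, no password, mask present."""
    return (
        "s3cr3tuser" not in sanitized
        and "s3cr3tpass" not in sanitized
        and "***" in sanitized
    )


# Hypothetical buggy output: the password is masked but the username leaks.
leaky = "postgresql://s3cr3tuser:***@localhost/db"
# Output of a correct sanitizer.
masked = "postgresql://***:***@localhost/db"

weak_passes_on_leak = weak_check(leaky)      # the leak goes unnoticed
strong_passes_on_leak = strong_check(leaky)  # the leak is caught
strong_passes_on_masked = strong_check(masked)
```

In a real pytest test the three conditions would normally be three separate `assert` statements, so a failure names the exact condition that broke.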
🧹 Nitpick comments (1)
src/lfx/src/lfx/utils/util_strings.py (1)

37-50: Type hint doesn't reflect None handling.

The function signature declares url: str, but the implementation handles None at line 49 (and the test file exercises this path). Consider updating the signature to url: str | None to match the actual contract.

Proposed fix
-def sanitize_database_url(url: str) -> str:
+def sanitize_database_url(url: str | None) -> str | None:

Comment thread src/lfx/tests/unit/utils/test_database_url_sanitization.py Outdated
Member

@dkaushik94 dkaushik94 left a comment

Changes look good, except that the check tests only for the presence of a username.

Comment thread src/lfx/src/lfx/utils/util_strings.py Outdated
from sqlalchemy.engine import make_url

parsed_url = make_url(url)
if parsed_url.username:
Member

@Cristhianzl
This check is brittle, since some systems use password auth with no username. For example:
postgresql://:password123@localhost/db

This URL would fail the username check, but we would still want to mask the password.

It also fails the regex fallback, since the regex requires the username to be at least 1 character long.
We can improve the regex, but I think our use case is fairly contained as long as we aren't extending this utility function to handle all kinds of datastore connection strings (that would add more edge cases to handle when make_url fails).
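The empty-username case can be checked directly. The PR's original regex is not quoted in this thread, so `STRICT` below is a hypothetical pattern that merely shares the property described (a username of at least one character). `LENIENT` matches everything between `://` and the first `@`.

```python
import re

# Hypothetical stand-in: requires one or more username characters before ":".
STRICT = re.compile(r"(://)[^:@/]+:[^@]+@")
# Permissive alternative: anything between "://" and the first "@".
LENIENT = re.compile(r"(://).*?@")


def mask(pattern: re.Pattern, url: str) -> str:
    """Replace the matched credentials segment with ***:***."""
    return pattern.sub(r"\1***:***@", url)


url = "postgresql://:password123@localhost/db"

strict_result = mask(STRICT, url)    # no match, so the password stays visible
lenient_result = mask(LENIENT, url)  # credentials segment is masked
```

With the strict pattern the URL comes back unchanged, which is exactly the leak the comment warns about.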

Member

A quick pass with Gemini produced this improvement:

import re
from urllib.parse import urlparse, urlunparse


def sanitize_database_url(url: str) -> str:
    """Mask the username and password in a database URL with ***."""
    if not url:
        return url

    # Strategy 1: SQLAlchemy (The Gold Standard)
    try:
        from sqlalchemy.engine import make_url

        parsed_url = make_url(url)
        # Mask if there is a username OR a password
        if parsed_url.username or parsed_url.password:
            return str(parsed_url.set(username="***", password="***"))
        return str(parsed_url)
    except Exception:  # also covers ImportError when SQLAlchemy is not installed
        # Strategy 2: urllib.parse (Standard Library Fallback)
        try:
            p = urlparse(url)
            if p.username or p.password:
                # Rebuild netloc without credentials (p.port raises ValueError on a malformed port)
                port_str = f":{p.port}" if p.port else ""
                new_netloc = f"***:***@{p.hostname}{port_str}"
                return urlunparse(p._replace(netloc=new_netloc))
            return url
        except Exception:
            # Final Fallback: Defensive Regex
            # This pattern is more aggressive, catching the :// and everything up to the first @
            return re.sub(r"(://).*?@", r"\1***:***@", url)
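To see when the final regex stage is reached, the standard-library stages can be exercised on the malformed URL from the PR description. This sketch covers only the `urllib.parse` and regex steps. It makes no claim about how SQLAlchemy's `make_url` treats the same input, and the helper names `port_parses` and `regex_mask` are hypothetical.

```python
import re
from urllib.parse import urlparse

# Malformed URL from the PR description (note the doubled colon before the port).
MALFORMED = "postgresql+psycopg://myuser:mysecretpassword@127.0.0.1::5432/mydb"


def port_parses(url: str) -> bool:
    """Return False when urllib cannot read the port (it raises ValueError)."""
    try:
        _ = urlparse(url).port
    except ValueError:
        return False
    return True


def regex_mask(url: str) -> str:
    """Last-resort masking: replace everything between :// and the first @."""
    return re.sub(r"(://).*?@", r"\1***:***@", url)


fallback_needed = not port_parses(MALFORMED)  # urllib rejects ":5432" as a port
masked = regex_mask(MALFORMED)
```

Because the pattern is lazy, it stops at the first `@`. A password that contains an unencoded `@` would therefore be only partly masked: `postgresql://user:p@ss@host/db` becomes `postgresql://***:***@ss@host/db`.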

Member Author

Done! Thank you!

@github-actions github-actions Bot added bug Something isn't working and removed bug Something isn't working labels Mar 2, 2026
@Cristhianzl Cristhianzl requested a review from dkaushik94 March 2, 2026 19:25
Contributor

github-actions Bot commented Mar 2, 2026

Frontend Unit Test Coverage Report

Coverage Summary

| Lines | Statements | Branches | Functions |
| --- | --- | --- | --- |
| Coverage: 22% | 22.04% (7566/34320) | 14.69% (3947/26855) | 14.88% (1081/7261) |

Unit Test Results

| Tests | Skipped | Failures | Errors | Time |
| --- | --- | --- | --- | --- |
| 2507 | 0 💤 | 0 ❌ | 0 🔥 | 40.238s ⏱️ |

Member

@dkaushik94 dkaushik94 left a comment

LGTM

@github-actions github-actions Bot added the lgtm This PR has been approved by a maintainer label Mar 2, 2026

codecov Bot commented Mar 2, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 36.42%. Comparing base (e601ead) to head (7f0ec54).
⚠️ Report is 79 commits behind head on main.

❌ Your project status has failed because the head coverage (41.60%) is below the target coverage (60.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files

@@            Coverage Diff             @@
##             main   #11621      +/-   ##
==========================================
+ Coverage   34.95%   36.42%   +1.46%     
==========================================
  Files        1506     1570      +64     
  Lines       71881    76690    +4809     
  Branches    10692    11638     +946     
==========================================
+ Hits        25124    27931    +2807     
- Misses      45425    47184    +1759     
- Partials     1332     1575     +243     
| Flag | Coverage Δ |
| --- | --- |
| backend | 56.29% <ø> (+0.75%) ⬆️ |
| frontend | 19.80% <ø> (+3.72%) ⬆️ |
| lfx | 41.60% <100.00%> (-0.24%) ⬇️ |

Flags with carried forward coverage won't be shown. Click here to find out more.

| Files with missing lines | Coverage Δ |
| --- | --- |
| src/lfx/src/lfx/services/settings/base.py | 73.58% <100.00%> (+2.43%) ⬆️ |
| src/lfx/src/lfx/utils/util_strings.py | 55.31% <100.00%> (+45.94%) ⬆️ |

... and 285 files with indirect coverage changes

@Cristhianzl Cristhianzl added this pull request to the merge queue Mar 3, 2026
Merged via the queue into main with commit 15b9853 Mar 3, 2026
95 of 96 checks passed
@Cristhianzl Cristhianzl deleted the cz/fix-db-exposed branch March 3, 2026 10:42
HimavarshaVS pushed a commit that referenced this pull request Mar 10, 2026
* sanitize error from db

* [autofix.ci] apply automated fixes

* improve regex

---------

Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Labels

bug Something isn't working lgtm This PR has been approved by a maintainer

2 participants