@fogelito fogelito commented Oct 16, 2025

Support case-insensitive and accent-insensitive matching in Postgres ONLY on the unique index of the $id attribute.

Summary by CodeRabbit

  • Chores
    • Updated the PostgreSQL collation definition and unique-index configuration so that $id values are compared case-insensitively and accent-insensitively.
    • Improved test infrastructure for exception handling validation.

coderabbitai bot commented Oct 16, 2025

Walkthrough

PostgreSQL adapter collation definitions updated from utf8_ci to utf8_ci_ai with adjusted locale settings. COLLATE clauses added to indexed columns (_uid, _document) across shared and non-shared table configurations. Test exception handling refactored from expectException to try-catch blocks for duplicate detection scenarios.

Changes

| Cohort / File(s) | Summary |
|---|---|
| PostgreSQL Collation & Index Updates: src/Database/Adapter/Postgres.php | Updated collation definition from utf8_ci (locale 'und-u-ks-primary') to utf8_ci_ai (locale 'und-u-ks-level1'). Added COLLATE utf8_ci_ai to unique indexes on "_uid" and "_document" fields in both shared and non-shared table modes. Simplified index attribute handling by removing conditional COLLATE logic in createIndex for UNIQUE indexes. Adjusted variable naming in deleteIndex from $name to $collection. |
| Test Exception Handling: tests/e2e/Adapter/Scopes/DocumentTests.php | Replaced expectException declarations with try-catch blocks in four test methods (testUniqueIndexDuplicate, testUniqueIndexDuplicateUpdate, testExceptionDuplicate, testExceptionCaseInsensitiveDuplicate) to assert DuplicateException on document creation/update failures. |
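
For illustration, the collation and index changes summarized above could produce SQL along these lines. This is a minimal sketch, not the adapter's code: the helper names `buildCollationSql` and `buildUidIndexSql` are hypothetical, and the `deterministic = false` option and the `"_tenant"` column in shared mode are assumptions. Postgres only treats non-identical strings as equal under a nondeterministic collation, which is why that option is assumed here.

```php
<?php
// Hypothetical helpers for illustration only; they are not part of the adapter.

/**
 * Build the statement defining an ICU collation at primary strength
 * ('ks-level1'), which ignores both case and accent differences.
 */
function buildCollationSql(string $name = 'utf8_ci_ai'): string
{
    // Assumption: deterministic = false, so 'cafe' and 'CAFÉ' compare as equal.
    return "CREATE COLLATION IF NOT EXISTS {$name} "
        . "(provider = icu, locale = 'und-u-ks-level1', deterministic = false);";
}

/**
 * Build a unique index on "_uid" that uses the collation.
 * Assumption: shared tables add "_tenant" so IDs are unique per tenant.
 */
function buildUidIndexSql(string $table, string $indexName, bool $shared, string $collation = 'utf8_ci_ai'): string
{
    $columns = "\"_uid\" COLLATE {$collation}";
    if ($shared) {
        $columns .= ', "_tenant"';
    }

    return "CREATE UNIQUE INDEX \"{$indexName}\" ON {$table} ({$columns});";
}

echo buildCollationSql(), PHP_EOL;
echo buildUidIndexSql('"documents"', 'documents_uid', false), PHP_EOL;
echo buildUidIndexSql('"documents"', 'documents_uid', true), PHP_EOL;
```

Attaching the collation to the index expression rather than to the column keeps the effect limited to uniqueness checks on that index, which matches the stated goal of changing behavior only for $id.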

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • utopia-php/database#661: Modifies Postgres index creation logic and index/column handling for spatial enhancements, overlapping with collation and index attribute changes.
  • utopia-php/database#613: Adjusts Postgres uniqueness handling for _uid/_tenant indexes with related index-creation logic modifications.

Suggested reviewers

  • abnegate

Poem

🐰 Collations shift like carrot gardens fair,
With utf8_ci_ai blooming everywhere,
Indexes dressed in COLLATE's bright dress,
Try-catch blocks put tests to the test,
Postgres hops on with indexed finesse! 🥕✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 88.89% which is sufficient. The required threshold is 80.00%. |
| Title Check | ✅ Passed | The title clearly and concisely summarizes the primary change, indicating that Utopia’s Postgres adapter now treats UID comparisons as case-insensitive, which aligns with the pull request’s objective to apply a utf8_ci_ai collation to the UID index. It avoids extraneous detail while accurately reflecting the core enhancement. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
tests/e2e/Adapter/Scopes/DocumentTests.php (1)

5331-5342: Add accent‑insensitive duplicate check for $id (Postgres‑only)

To fully cover the PR objective, also assert that 'café' vs 'cafe' (and/or 'CAFÉ') duplicates on $id under Postgres.

Example test to add (scoped to Postgres):

public function testExceptionAccentInsensitiveDuplicateUID(): void
{
    /** @var Database $database */
    $database = static::getDatabase();

    // Only relevant for the Postgres adapter
    if (!($database->getAdapter() instanceof \Utopia\Database\Adapter\Postgres)) {
        $this->expectNotToPerformAssertions();
        return;
    }

    $doc = new Document([
        '$id' => 'café',
        '$permissions' => [Permission::read(Role::any())],
    ]);

    // First insert
    $doc->removeAttribute('$sequence');
    $database->createDocument('documents', $doc);

    // Same ID without accent should conflict under accent-insensitive collation
    $doc->setAttribute('$id', 'cafe');
    $doc->removeAttribute('$sequence');

    try {
        $database->createDocument('documents', $doc);
        $this->fail('Failed to throw exception');
    } catch (\Throwable $e) {
        $this->assertInstanceOf(DuplicateException::class, $e);
    }
}
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1ce7ffc and 6b932c1.

📒 Files selected for processing (2)
  • src/Database/Adapter/Postgres.php (5 hunks)
  • tests/e2e/Adapter/Scopes/DocumentTests.php (4 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-03T02:04:17.803Z
Learnt from: abnegate
PR: utopia-php/database#721
File: tests/e2e/Adapter/Scopes/DocumentTests.php:6418-6439
Timestamp: 2025-10-03T02:04:17.803Z
Learning: In tests/e2e/Adapter/Scopes/DocumentTests::testSchemalessDocumentInvalidInteralAttributeValidation (PHP), when the adapter reports getSupportForAttributes() === false (schemaless), the test should not expect exceptions from createDocuments for “invalid” internal attributes; remove try/catch and ensure the test passes without exceptions, keeping at least one assertion.

Applied to files:

  • tests/e2e/Adapter/Scopes/DocumentTests.php
🧬 Code graph analysis (2)
tests/e2e/Adapter/Scopes/DocumentTests.php (5)
src/Database/Adapter/Postgres.php (2)
  • createDocument (950-1050)
  • create (135-162)
src/Database/Adapter/MariaDB.php (2)
  • createDocument (822-933)
  • create (28-43)
src/Database/Adapter/SQLite.php (2)
  • createDocument (522-630)
  • create (113-116)
src/Database/Database.php (2)
  • createDocument (4176-4270)
  • create (1241-1260)
src/Database/Document.php (3)
  • Document (12-470)
  • getId (63-66)
  • removeAttribute (287-293)
src/Database/Adapter/Postgres.php (3)
src/Database/Adapter/SQLite.php (1)
  • getSQLTable (1080-1083)
src/Database/Adapter/SQL.php (1)
  • getSQLTable (1704-1707)
src/Database/Adapter.php (1)
  • filter (1196-1205)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Setup & Build Docker Image
🔇 Additional comments (7)
tests/e2e/Adapter/Scopes/DocumentTests.php (3)

4737-4765: LGTM: duplicate on unique index asserted correctly

Try/catch pattern and DuplicateException assertion look good and consistent with the suite.
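
A standalone sketch of why the try/catch pattern is useful: with expectException, a PHPUnit test method stops executing at the first thrown exception, whereas try/catch lets one method check several failing operations in sequence. Everything below is a stand-in for illustration (the `DuplicateException` class, the in-memory `createDocument`, and the `throwsDuplicate` helper are hypothetical and do not come from the test suite).

```php
<?php
// Stand-in exception; the real suite uses the library's DuplicateException.
class DuplicateException extends \Exception
{
}

/** Toy in-memory store that rejects IDs differing only by case. */
function createDocument(array &$store, string $id): void
{
    $key = strtolower($id);
    if (isset($store[$key])) {
        throw new DuplicateException("Document '{$id}' already exists");
    }
    $store[$key] = true;
}

/** Returns true only if the operation throws a DuplicateException. */
function throwsDuplicate(callable $operation): bool
{
    try {
        $operation();
        return false; // no exception: the duplicate was not detected
    } catch (\Throwable $e) {
        return $e instanceof DuplicateException;
    }
}

$store = [];
createDocument($store, 'Doc1');

// Both checks run; execution continues after the first caught exception.
$first = throwsDuplicate(function () use (&$store) {
    createDocument($store, 'doc1');
});
$second = throwsDuplicate(function () use (&$store) {
    createDocument($store, 'DOC1');
});

var_dump($first, $second);
```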


4801-4807: LGTM: update collision handled

Update path correctly asserts DuplicateException via try/catch.


5309-5319: LGTM: preparing re-insert by clearing $sequence

Clearing '$sequence' before each create prevents a previously assigned sequence value from carrying over into the re-insert; the duplicate $id is asserted properly.

src/Database/Adapter/Postgres.php (4)

247-247: LGTM! Collation correctly applied to _uid indexes.

The COLLATE clause is properly added to the unique indexes on the _uid column for both shared and non-shared table modes, ensuring case-insensitive and accent-insensitive matching for the $id attribute as intended by the PR objectives.

Also applies to: 254-254
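
To see what case- and accent-insensitive matching means for IDs, the same primary-strength comparison can be approximated outside the database with PHP's intl extension. This is only a sketch under assumptions: `idsCollide` is a hypothetical helper, the intl extension may not be installed (the helper returns null in that case), and ICU in PHP is not guaranteed to agree with the Postgres collation on every input.

```php
<?php
/**
 * Hypothetical helper: reports whether two IDs compare as equal at
 * ICU primary strength (case and accents ignored).
 * Returns null when the intl extension is unavailable.
 */
function idsCollide(string $a, string $b): ?bool
{
    if (!extension_loaded('intl')) {
        return null;
    }

    $collator = new \Collator('und');
    // PRIMARY strength corresponds to 'ks-level1': only base letters matter.
    $collator->setStrength(\Collator::PRIMARY);

    return $collator->compare($a, $b) === 0;
}

foreach ([['café', 'cafe'], ['CAFÉ', 'cafe'], ['cafe', 'cake']] as [$left, $right]) {
    var_dump(idsCollide($left, $right));
}
```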


855-855: LGTM! Appropriate simplification of index attribute formatting.

The removal of per-attribute COLLATE logic from createIndex is correct, as the collation is now explicitly applied only to the built-in indexes on _uid and _document columns in createCollection. This makes the behavior more explicit and aligns with the PR objective to limit the collation changes to the $id attribute rather than all unique indexes.


903-903: LGTM! Improved variable naming for clarity.

Renaming the variable from $name to $collection improves code readability by using a more descriptive identifier.


282-282: Verify whether COLLATE on _document should be applied to shared mode permissions index.

The asymmetry is confirmed: in non-shared mode (line 282), the _document column in the unique permissions index gets COLLATE utf8_ci_ai, while in shared mode (lines 273–274), the _document column in the unique index has no COLLATE clause. No inline comments explain this difference.

Both modes use _document in the unique index structure, so this difference may be intentional or an oversight. Confirm whether shared mode should also apply COLLATE utf8_ci_ai on _document for consistency, or document the architectural reason for the asymmetry.

@fogelito fogelito changed the title from "Postgres case sensitive for UID" to "Postgres case insensitive for UID" Oct 16, 2025
@abnegate abnegate merged commit 141338a into main Oct 16, 2025
15 checks passed
@abnegate abnegate deleted the postgres-case-insensitive branch October 16, 2025 08:57