Skip to content

Conversation

@abnegate
Copy link
Member

@abnegate abnegate commented Aug 6, 2025

…improvements

Summary by CodeRabbit

  • New Features

    • Batch-oriented relationship population for faster, breadth-first loading across many documents.
    • New "relationships" CLI task and executable wrapper to populate and benchmark relational test data.
    • Multi-adapter DB support including PostgreSQL and a sharedTables option for CLI tasks.
  • Bug Fixes

    • More consistent depth-aware traversal with two-way back-reference handling to reduce redundant queries.
  • Tests

    • New end-to-end tests validating relationship population and nested creation depth handling.
  • Chores

    • Increased query value limits to support much larger bulk operations.

…improvements

Co-authored-by: jakeb994 <jakeb994@gmail.com>
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Aug 6, 2025

Walkthrough

Adds a breadth-first, batch-oriented relationship population flow to Database with new batch helpers, larger query chunking, and routing of single-document paths through the batch flow. Adds a relationships benchmarking CLI task, adapter generalization (including Postgres and sharedTables), tests for relationships (duplicated), and validator defaults increased.

Changes

Cohort / File(s) Change Summary
Batch population core & handlers
src/Database/Database.php
Added populateDocumentsRelationships(...), many per-relationship batch helpers (populateSingleRelationshipBatch, populateOneToOneRelationshipsBatch, populateOneToManyRelationshipsBatch, populateManyToOneRelationshipsBatch, populateManyToManyRelationshipsBatch), applySelectFiltersToDocuments, shouldSkipRelationshipFetchBatch, RELATION_QUERY_CHUNK_SIZE and routing single-document flows through batch population.
Relationship traversal / controls
src/Database/Database.php
Integrated depth/stack controls and back-reference handling into batch traversal; chunked large ID lists and updated MAX_DEPTH enforcement for batch processing.
Call sites — batch routing
src/Database/Database.php
Reworked getDocument, createDocument, createDocuments, updateDocument, find, upsertDocumentsWithIncrease, and related paths to wrap single docs into arrays, call batch population, and extract results.
Validators
src/Database/Validator/Queries/Documents.php
src/Database/Validator/Query/Filter.php
Increased default maxValuesCount from 1005000; added getMaxValuesCount() to Filter (duplicate addition appears in patch).
CLI adapter generalization & tasks
bin/tasks/*
bin/tasks/index.php
bin/tasks/load.php
bin/tasks/query.php
Replaced per-adapter branching with a data-driven adapter map; added Postgres support; added sharedTables parameter and Database::setSharedTables usage; refactored tasks to use unified PDO/adapter flow.
Relationships task & wrapper
bin/tasks/relationships.php
bin/relationships
New relationships CLI task and helper functions to create schema, generate relationship data, and run benchmarks; added shell wrapper bin/relationships.
CLI error handling
bin/cli.php
Added Utopia\CLI\Console import and a CLI error handler using Console::error.
View & JS tweaks
bin/view/index.php
Dynamic datatable header generation from results, JS fixes (let, semicolons), and consistent PHP escaping for injected JSON.
Tests — e2e & unit
tests/e2e/Adapter/Scopes/RelationshipTests.php
tests/unit/Validator/Query/FilterTest.php
Added testSimpleRelationshipPopulation() and testNestedDocumentCreationWithDepthHandling() (each duplicated in file); unit test updated to use Filter type and call new getter.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Client
    participant Database
    participant Storage
    Note over Database: Batch-oriented, breadth-first relationship population

    Client->>Database: get/create/update/find (single or many)
    Database->>Database: Normalize input -> array of documents
    Database->>Database: populateDocumentsRelationships(docs, collection, depth, stack, selects)
    loop per relationship depth (breadth-first)
        Database->>Storage: Bulk query related IDs/values (chunked by RELATION_QUERY_CHUNK_SIZE)
        Storage-->>Database: Related documents (bulk)
        Database->>Database: Map results -> source documents (apply selects, back-refs)
    end
    Database-->>Client: Documents with relationships populated
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • fogelito

Poem

In my burrow I batch every hop and leap,
I fetch relations in packs instead of one-by-one heap.
Breadth-first hops, chunked queries—no more lonely digs,
I stitch back-references, then nibble on my figs. 🐇✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title “Optimize relationship fetching” accurately and concisely captures the main focus of the changeset, which reorganizes internal relationship population into batch processes and improves query performance when fetching relationships. It’s specific enough to signal the primary intent without unnecessary detail.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch optimize-relationship-performance

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cefe7c6 and ef5bf42.

📒 Files selected for processing (1)
  • src/Database/Database.php (11 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/Database/Database.php (5)
src/Database/Mirror.php (1)
  • silent (172-175)
src/Database/Document.php (9)
  • Document (12-470)
  • getAttribute (224-231)
  • setAttribute (244-261)
  • getId (63-66)
  • removeAttribute (287-293)
  • getCollection (85-88)
  • isEmpty (396-399)
  • find (304-322)
  • getArrayCopy (423-458)
src/Database/Query.php (7)
  • getAttribute (159-162)
  • setAttribute (200-205)
  • Query (8-1116)
  • equal (435-438)
  • limit (623-626)
  • getMethod (151-154)
  • getValues (167-170)
src/Database/Adapter/SQL.php (2)
  • find (2334-2513)
  • count (2525-2587)
src/Database/Adapter.php (2)
  • find (802-802)
  • count (825-825)
🪛 PHPMD (2.15.0)
src/Database/Database.php

3577-3577: Avoid unused parameters such as '$relationshipFetchStack'. (undefined)

(UnusedFormalParameter)


3632-3632: Avoid unused local variables such as '$relationType'. (undefined)

(UnusedLocalVariable)


3726-3726: Avoid unused local variables such as '$key'. (undefined)

(UnusedLocalVariable)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Adapter Tests (SharedTables/SQLite)
  • GitHub Check: Adapter Tests (SharedTables/MySQL)
  • GitHub Check: Adapter Tests (SQLite)
  • GitHub Check: Adapter Tests (Pool)
  • GitHub Check: Adapter Tests (SharedTables/MariaDB)
  • GitHub Check: Adapter Tests (Postgres)
  • GitHub Check: Adapter Tests (MariaDB)
  • GitHub Check: Adapter Tests (SharedTables/Postgres)
  • GitHub Check: Adapter Tests (Mirror)
  • GitHub Check: Adapter Tests (MySQL)

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
src/Database/Database.php (2)

3598-3600: Address unused variables flagged by static analysis.

The static analysis correctly identifies that $twoWay, $twoWayKey, and $side variables are extracted but never used in the main loop of populateDocumentsRelationships. These variables are only used within the specific batch population methods.

Consider removing these unused variables to improve code clarity:

-            $twoWay = $relationship['options']['twoWay'];
-            $twoWayKey = $relationship['options']['twoWayKey'];
-            $side = $relationship['options']['side'];

3643-3694: Complex relationship cycle detection logic.

The cycle detection logic handles reflexive, symmetric, and transitive relationships correctly. However, the transitive detection conditions are quite complex and could benefit from better documentation or simplification.

Consider breaking down the transitive relationship detection into separate, well-named boolean variables for better readability:

// Transitive relationship detection
$sameKeyDifferentSide = $existingKey === $twoWayKey 
    && $existingCollection === $relatedCollectionId 
    && $existingSide !== $side;
    
$sameTwoWayKeyDifferentSide = $existingTwoWayKey === $key 
    && $existingRelatedCollection === $collection->getId() 
    && $existingSide !== $side;
    
// ... other conditions
$transitive = $sameKeyDifferentSide || $sameTwoWayKeyDifferentSide || ...;
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 669f38d and 32b571a.

📒 Files selected for processing (1)
  • src/Database/Database.php (4 hunks)
🧰 Additional context used
🪛 PHPMD (2.15.0)
src/Database/Database.php

3598-3598: Avoid unused local variables such as '$twoWay'. (Unused Code Rules)

(UnusedLocalVariable)


3599-3599: Avoid unused local variables such as '$twoWayKey'. (Unused Code Rules)

(UnusedLocalVariable)


3600-3600: Avoid unused local variables such as '$side'. (Unused Code Rules)

(UnusedLocalVariable)

🔇 Additional comments (8)
src/Database/Database.php (8)

3579-3634: Well-designed batch relationship population approach.

The breadth-first approach effectively addresses the N+1 query problem by batching relationship fetches. The implementation correctly handles empty documents, filters for relationship attributes, and delegates to specialized methods based on relationship type.


3706-3764: Efficient one-to-one batch implementation.

The method correctly collects all related IDs, performs a single bulk query, and maps results back to documents. The cycle detection mapping and relationship stack management are properly implemented.


3777-3844: Proper handling of one-to-many relationship sides.

The implementation correctly distinguishes between parent and child sides of the relationship, with appropriate handling for each case. The back-reference removal to avoid cycles is a good practice.


3857-3931: Consistent many-to-one relationship handling.

The method maintains consistency with the one-to-many implementation by properly handling parent/child sides and applying similar cycle prevention strategies.


3944-4033: Sophisticated many-to-many relationship resolution.

The implementation efficiently handles junction tables by fetching junction records first, then related documents, maintaining proper ordering and relationships. The two-step approach (junctions → related documents) is optimal for this relationship type.


4227-4230: Appropriate batch processing in createDocuments.

The integration of batch relationship population in the document creation flow improves performance when creating multiple documents with relationships.


5608-5611: Consistent batch processing in createOrUpdateDocumentsWithIncrease.

The batch relationship population is consistently applied across different document creation/update methods, maintaining API consistency.


6611-6620: Smart conditional batch processing in find method.

The logic intelligently chooses between batch processing for multiple documents and single-document processing for one document, optimizing performance based on result set size.

- Update batch methods to use Document->getAttribute() instead of array access
- Fix shouldSkipRelationshipFetchBatch to accept Document parameter
- Convert Document to array when adding to relationship fetch stack
- Maintain compatibility with existing relationship fetch stack structure
- Use array access syntax (['key']) instead of getAttribute()
- Matches the original implementation pattern
- Always use batch processing for all document counts
- Should resolve relationship population issues in tests
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
src/Database/Database.php (2)

3598-3600: Remove unused local variables.

The variables $twoWay, $twoWayKey, and $side are extracted but not used in this method scope. They're only used within the specialized batch methods.

-            $twoWay = $relationship->getAttribute('options')['twoWay'];
-            $twoWayKey = $relationship->getAttribute('options')['twoWayKey'];
-            $side = $relationship->getAttribute('options')['side'];
             $queries = $selects[$key] ?? [];

6615-6624: Smart optimization with appropriate fallback logic.

The conditional logic correctly chooses between batch and single-document processing based on result count, maintaining performance benefits while preserving existing behavior.

Consider using elseif instead of else if for consistency with PHP conventions:

-            } else if (count($results) === 1) {
+            } elseif (count($results) === 1) {
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 32b571a and 8da5e64.

📒 Files selected for processing (1)
  • src/Database/Database.php (4 hunks)
🧰 Additional context used
🪛 PHPMD (2.15.0)
src/Database/Database.php

3598-3598: Avoid unused local variables such as '$twoWay'. (Unused Code Rules)

(UnusedLocalVariable)


3599-3599: Avoid unused local variables such as '$twoWayKey'. (Unused Code Rules)

(UnusedLocalVariable)


3600-3600: Avoid unused local variables such as '$side'. (Unused Code Rules)

(UnusedLocalVariable)

🔇 Additional comments (2)
src/Database/Database.php (2)

4231-4234: LGTM! Excellent performance optimization.

The switch to batch relationship population will significantly reduce database queries during bulk document creation, addressing the N+1 query problem.


5612-5615: LGTM! Consistent performance optimization.

The batch relationship population is consistently applied across document creation/update methods.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (1)
src/Database/Database.php (1)

3568-3634: Excellent batch relationship population implementation with minor cleanup needed.

The breadth-first batch approach effectively addresses the N+1 query problem. The method structure is well-organized with proper early returns and delegation to specialized batch methods.

However, address these unused variables flagged by static analysis:

-        $twoWay = $relationship['options']['twoWay'];
-        $twoWayKey = $relationship['options']['twoWayKey'];
-        $side = $relationship['options']['side'];

These variables are extracted but only used within the specialized batch methods, not in this orchestration method.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8da5e64 and f9a0a2d.

📒 Files selected for processing (1)
  • src/Database/Database.php (4 hunks)
🧰 Additional context used
🪛 PHPMD (2.15.0)
src/Database/Database.php

3598-3598: Avoid unused local variables such as '$twoWay'. (Unused Code Rules)

(UnusedLocalVariable)


3599-3599: Avoid unused local variables such as '$twoWayKey'. (Unused Code Rules)

(UnusedLocalVariable)


3600-3600: Avoid unused local variables such as '$side'. (Unused Code Rules)

(UnusedLocalVariable)

🪛 GitHub Actions: Linter
src/Database/Database.php

[error] 1-1: PHP CS Fixer (Pint) PSR-12 style check failed: 'no_whitespace_in_blank_line' violation.

🪛 GitHub Actions: CodeQL
src/Database/Database.php

[error] 3742-3742: PHPStan: Property Utopia\Database\Database::$relationshipFetchStack (array<Utopia\Database\Document>) does not accept array<array<string, mixed>|Utopia\Database\Document>.


[error] 3816-3816: PHPStan: Property Utopia\Database\Database::$relationshipFetchStack (array<Utopia\Database\Document>) does not accept array<array<string, mixed>|Utopia\Database\Document>.


[error] 3904-3904: PHPStan: Property Utopia\Database\Database::$relationshipFetchStack (array<Utopia\Database\Document>) does not accept array<array<string, mixed>|Utopia\Database\Document>.


[error] 3975-3975: PHPStan: Property Utopia\Database\Database::$relationshipFetchStack (array<Utopia\Database\Document>) does not accept array<array<string, mixed>|Utopia\Database\Document>.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Adapter Tests (MySQL)
  • GitHub Check: Adapter Tests (SQLite)
  • GitHub Check: Adapter Tests (SharedTables/MySQL)
  • GitHub Check: Adapter Tests (MariaDB)
  • GitHub Check: Adapter Tests (Pool)
  • GitHub Check: Adapter Tests (SharedTables/SQLite)
  • GitHub Check: Adapter Tests (SharedTables/Postgres)
  • GitHub Check: Adapter Tests (SharedTables/MariaDB)
  • GitHub Check: Adapter Tests (Mirror)
  • GitHub Check: Adapter Tests (Postgres)
🔇 Additional comments (4)
src/Database/Database.php (4)

3636-3694: LGTM! Comprehensive cycle detection and depth limiting.

The method properly handles reflexive, symmetric, and transitive relationship detection to prevent infinite loops and excessive depth. The logic is thorough and necessary for safe batch relationship population.


4231-4234: Clean integration of batch relationship population.

The replacement of individual document processing with batch processing maintains the same behavior while improving performance. The silent wrapper usage is consistent with existing patterns.


5612-5615: Consistent batch integration in createOrUpdateDocumentsWithIncrease.

The integration follows the same clean pattern as createDocuments, maintaining consistency across the codebase.


6615-6621: Smart conditional batch processing in find method.

The integration properly checks for relationships and selections before applying batch processing, ensuring the optimization is only used when beneficial.

- Add batchRelationshipProcessing flag to prevent infinite recursion
- When already in batch processing, fall back to original single-document method
- This should resolve issues with nested relationship queries returning empty arrays
- Use batch processing only for top-level calls (relationshipFetchDepth === 1)
- Use original single-document processing for nested relationship calls
- Remove custom flag and rely on existing relationshipFetchDepth system
- This should resolve missing relationship data in nested scenarios
- Use empty relationshipFetchStack to detect top-level calls for batch processing
- Use non-empty stack to detect nested calls that should use single processing
- This should properly distinguish between top-level and nested relationship calls
- Should resolve issues with one-to-many and other relationship types returning empty arrays
- Remove conditional logic that mixed batch and single processing
- Always use batch processing for all relationship population
- Eliminates N+1 problems at all nesting levels
- Simplifies logic and ensures consistent behavior
- Maximum performance benefits across all relationship scenarios
- Convert getDocument to use populateDocumentsRelationships with single-element array
- Convert createDocument to use batch processing for relationships
- Convert updateDocument to use batch processing for relationships
- This ensures consistent batch processing at all levels, including nested relationships
- Should resolve 2nd level relationship population issues
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🔭 Outside diff range comments (1)
src/Database/Database.php (1)

3324-3567: Remove unused method populateDocumentRelationships

Static analysis and a project-wide search confirm that the private method populateDocumentRelationships in src/Database/Database.php (lines 3324–3567) is never invoked. Please delete this dead code to keep the codebase clean.

  • File: src/Database/Database.php
  • Lines: 3324–3567
♻️ Duplicate comments (5)
src/Database/Database.php (5)

3644-3695: Type consistency issue already identified

The comparison at line 3664 has a type mismatch that's related to the broader relationshipFetchStack type issue already flagged in previous reviews.


3741-3743: Type issue with relationshipFetchStack assignment

This issue has already been identified in previous reviews. The suggested fix using clone should be applied.


3815-3817: Same relationshipFetchStack type issue

This is the same type mismatch issue already identified in previous reviews.


3903-3905: Consistent type issue with relationshipFetchStack

Same type mismatch as in other batch methods, already covered in previous reviews.


3974-3976: Type issue consistency across all batch methods

Same relationshipFetchStack type issue as in the other batch methods.

🧹 Nitpick comments (1)
src/Database/Database.php (1)

6619-6624: Simplify condition check

The nested condition can be simplified since !empty($results) already ensures count > 0.

-        if ($this->resolveRelationships && !empty($relationships) && (empty($selects) || !empty($nestedSelections))) {
-            if (count($results) > 0) {
-                // Always use batch processing for all cases (single and multiple documents, nested or top-level)
-                $results = $this->silent(fn () => $this->populateDocumentsRelationships($collection, $results, $nestedSelections));
-            }
+        if ($this->resolveRelationships && !empty($relationships) && (empty($selects) || !empty($nestedSelections)) && !empty($results)) {
+            // Always use batch processing for all cases (single and multiple documents, nested or top-level)
+            $results = $this->silent(fn () => $this->populateDocumentsRelationships($collection, $results, $nestedSelections));
         }
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f9a0a2d and c01c352.

📒 Files selected for processing (1)
  • src/Database/Database.php (7 hunks)
🧰 Additional context used
🪛 PHPMD (2.15.0)
src/Database/Database.php

3599-3599: Avoid unused local variables such as '$twoWay'. (Unused Code Rules)

(UnusedLocalVariable)


3600-3600: Avoid unused local variables such as '$twoWayKey'. (Unused Code Rules)

(UnusedLocalVariable)


3601-3601: Avoid unused local variables such as '$side'. (Unused Code Rules)

(UnusedLocalVariable)

🪛 GitHub Actions: Linter
src/Database/Database.php

[error] 1-1: PHP CS Fixer (laravel/pint) PSR-12 style check failed: 'no_whitespace_in_blank_line' violation detected.

🪛 GitHub Actions: CodeQL
src/Database/Database.php

[error] 3324-3324: PHPStan: Method Utopia\Database\Database::populateDocumentRelationships() is unused.


[error] 3743-3743: PHPStan: Property Utopia\Database\Database::$relationshipFetchStack (array<Utopia\Database\Document>) does not accept array<array<string, mixed>|Utopia\Database\Document>.


[error] 3817-3817: PHPStan: Property Utopia\Database\Database::$relationshipFetchStack (array<Utopia\Database\Document>) does not accept array<array<string, mixed>|Utopia\Database\Document>.


[error] 3905-3905: PHPStan: Property Utopia\Database\Database::$relationshipFetchStack (array<Utopia\Database\Document>) does not accept array<array<string, mixed>|Utopia\Database\Document>.


[error] 3976-3976: PHPStan: Property Utopia\Database\Database::$relationshipFetchStack (array<Utopia\Database\Document>) does not accept array<array<string, mixed>|Utopia\Database\Document>.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Adapter Tests (Mirror)
  • GitHub Check: Adapter Tests (SQLite)
  • GitHub Check: Adapter Tests (SharedTables/MySQL)
  • GitHub Check: Adapter Tests (SharedTables/Postgres)
  • GitHub Check: Adapter Tests (MySQL)
  • GitHub Check: Adapter Tests (SharedTables/MariaDB)
  • GitHub Check: Adapter Tests (Pool)
  • GitHub Check: Adapter Tests (SharedTables/SQLite)
  • GitHub Check: Adapter Tests (Postgres)
  • GitHub Check: Adapter Tests (MariaDB)
  • GitHub Check: Unit Test
🔇 Additional comments (4)
src/Database/Database.php (4)

3293-3294: Good refactoring to use batch processing consistently

Converting single document relationship population to use the batch method improves consistency and leverages the performance optimizations.


3569-3635: Well-structured batch relationship population method

The breadth-first approach and delegation to specialized batch methods is a solid design that will improve performance by eliminating N+1 query problems.


4134-4135: Consistent use of batch processing

Good to see uniform usage of the batch method across all document operations.


4233-4236: Optimal use of batch processing for document creation

This is where the batch processing will show the most performance improvement when creating multiple documents with relationships.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (4)
src/Database/Database.php (4)

3745-3745: Fix type issue with relationshipFetchStack assignment.

The batch one-to-one implementation efficiently collects IDs and performs a single query. However, there's a type mismatch issue that needs to be addressed.

The relationshipFetchStack expects Document objects, not arrays. This issue was identified in previous reviews and should be fixed as suggested.


3820-3820: Fix type issue with relationshipFetchStack assignment.

The asymmetric handling of parent vs child sides is correct, and the grouping logic efficiently organizes results. However, there's the same type issue with relationshipFetchStack identified in previous reviews.


3906-3906: Fix type issue with relationshipFetchStack assignment.

The role reversal logic compared to one-to-many is correct, and the twoWay requirement for child side relationships is appropriate. However, there's the same type issue with relationshipFetchStack.


3975-3975: Fix type issue with relationshipFetchStack assignment.

The two-phase approach (junction records then related documents) is correctly implemented, and the order preservation logic ensures consistent results. However, there's the same type issue with relationshipFetchStack.

🧹 Nitpick comments (3)
src/Database/Database.php (3)

3580-3639: Remove unused variables and clean up commented code.

The batch relationship population logic is well-structured and documented. However, there are some cleanup issues:

  1. Unused variables detected by static analysis (lines 3599-3601)
  2. Large block of commented-out cycle detection code (lines 3607-3617)

Apply this diff to remove unused variables:

         foreach ($relationships as $relationship) {
             $key = $relationship['key'];
             $relatedCollection = $this->getCollection($relationship['options']['relatedCollection']);
             $relationType = $relationship['options']['relationType'];
-            $twoWay = $relationship['options']['twoWay'];
-            $twoWayKey = $relationship['options']['twoWayKey'];
-            $side = $relationship['options']['side'];
             $queries = $selects[$key] ?? [];

Consider either implementing the cycle detection properly or removing the commented code entirely if it's no longer needed.


3784-3784: Remove unused parameter.

The $collection parameter is flagged as unused by static analysis.

If this parameter is not needed for the method's functionality, consider removing it from the signature and updating all call sites.


3863-3863: Remove unused parameter.

The $collection parameter is flagged as unused by static analysis, same as in the one-to-many method.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c01c352 and f43a9c2.

📒 Files selected for processing (1)
  • src/Database/Database.php (7 hunks)
🧰 Additional context used
🪛 PHPMD (2.15.0)
src/Database/Database.php

3599-3599: Avoid unused local variables such as '$twoWay'. (Unused Code Rules)

(UnusedLocalVariable)


3600-3600: Avoid unused local variables such as '$twoWayKey'. (Unused Code Rules)

(UnusedLocalVariable)


3601-3601: Avoid unused local variables such as '$side'. (Unused Code Rules)

(UnusedLocalVariable)


3648-3699: Avoid unused private methods such as 'shouldSkipRelationshipFetchBatch'. (Unused Code Rules)

(UnusedPrivateMethod)


3784-3784: Avoid unused parameters such as '$collection'. (Unused Code Rules)

(UnusedFormalParameter)


3863-3863: Avoid unused parameters such as '$collection'. (Unused Code Rules)

(UnusedFormalParameter)

🪛 GitHub Actions: Linter
src/Database/Database.php

[error] 1-1: PHP CS Fixer (laravel/pint) PSR-12 style check failed: no_trailing_whitespace, no_whitespace_in_blank_line violations detected.

🪛 GitHub Actions: CodeQL
src/Database/Database.php

[error] 3324-3324: PHPStan: Method Utopia\Database\Database::populateDocumentRelationships() is unused.


[error] 3648-3648: PHPStan: Method Utopia\Database\Database::shouldSkipRelationshipFetchBatch() is unused.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Adapter Tests (SharedTables/Postgres)
  • GitHub Check: Adapter Tests (SQLite)
  • GitHub Check: Adapter Tests (SharedTables/SQLite)
  • GitHub Check: Adapter Tests (Postgres)
  • GitHub Check: Adapter Tests (Mirror)
  • GitHub Check: Adapter Tests (MariaDB)
  • GitHub Check: Adapter Tests (Pool)
  • GitHub Check: Adapter Tests (SharedTables/MySQL)
  • GitHub Check: Adapter Tests (MySQL)
  • GitHub Check: Adapter Tests (SharedTables/MariaDB)
🔇 Additional comments (6)
src/Database/Database.php (6)

3293-3294: LGTM! Clean reuse of batch logic for single documents.

The approach of wrapping single documents in arrays to reuse the batch relationship population logic is elegant and maintains consistency across the codebase.


4133-4134: LGTM! Consistent application of batch relationship population.

The modification to use batch relationship population for single documents maintains consistency with the new approach.


4232-4235: LGTM! Proper batch optimization for document creation.

Using batch relationship population for the entire batch of created documents will significantly improve performance by reducing the number of queries.


4777-4778: LGTM! Consistent with other single document modifications.

The updateDocument method now uses the same batch approach as getDocument and createDocument for consistency.


5614-5617: LGTM! Batch optimization for upsert operations.

Applying batch relationship population to the createOrUpdateDocumentsWithIncrease method ensures consistent performance benefits across all bulk operations.


6617-6623: LGTM! Unified batch processing for all query results.

The modification to always use batch relationship population for all documents (single or multiple) creates a unified approach and ensures optimal performance regardless of result size.

…ping

- Fixed TypeError: Cannot access offset of type Document in isset
- Handle cases where twoWayKey returns Document objects instead of string IDs
- Extract ID from Document objects when using as array keys
- Re-enabled cycle detection after fixing core issue
…ument object equality

- Replace Document == comparison with specific property matching
- Compare key, collection, and relatedCollection for reflexive detection
- Should fix cycle detection issues while preventing infinite recursion
- Complex cycle detection was over-blocking legitimate relationships
- Simpler approach showed better results (8 errors vs 15 errors)
- Focus on max depth limit for two-way relationships only
- Keep back-reference removal as in original implementation
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
src/Database/Database.php (1)

3647-3701: Remove unused cycle detection method.

This method implements sophisticated cycle detection logic but is currently unused, creating dead code in the codebase.

The method should either be integrated into the relationship population logic or removed entirely to maintain clean code.

🧹 Nitpick comments (3)
src/Database/Database.php (3)

3569-3638: Excellent batch optimization with minor cleanup needed.

The breadth-first approach and batch processing design effectively solves the N+1 query problem. The implementation correctly handles relationship collection, cycle detection, and delegation to specialized methods.

Clean up unused variables flagged by static analysis:

 foreach ($relationships as $relationship) {
     $key = $relationship['key'];
     $relatedCollection = $this->getCollection($relationship['options']['relatedCollection']);
     $relationType = $relationship['options']['relationType'];
-    $twoWay = $relationship['options']['twoWay'];
-    $twoWayKey = $relationship['options']['twoWayKey'];
-    $side = $relationship['options']['side'];
     $queries = $selects[$key] ?? [];

3786-3854: Asymmetric one-to-many handling is architecturally sound.

The distinction between parent and child sides is correctly implemented. The parent side performs batch fetching while child side delegates to one-to-one logic, which is appropriate for the relationship semantics.

Remove the unused $collection parameter flagged by static analysis:

-private function populateOneToManyRelationshipsBatch(array $documents, Document $relationship, Document $relatedCollection, array $queries, Document $collection): void
+private function populateOneToManyRelationshipsBatch(array $documents, Document $relationship, Document $relatedCollection, array $queries): void

Also update the method call in line 3624.


3867-3942: Many-to-one logic correctly mirrors one-to-many with role reversal.

The implementation correctly handles the inverted relationship semantics, with parent side using one-to-one logic and child side performing batch operations. The twoWay requirement for child side relationships is appropriate.

Remove the unused $collection parameter:

-private function populateManyToOneRelationshipsBatch(array $documents, Document $relationship, Document $relatedCollection, array $queries, Document $collection): void
+private function populateManyToOneRelationshipsBatch(array $documents, Document $relationship, Document $relatedCollection, array $queries): void

Also update the method call in line 3628.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f43a9c2 and f6cecd6.

📒 Files selected for processing (1)
  • src/Database/Database.php (7 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
src/Database/Database.php (5)
src/Database/Mirror.php (1)
  • silent (172-175)
src/Database/Document.php (7)
  • Document (12-462)
  • getAttribute (216-223)
  • getCollection (79-82)
  • setAttribute (236-253)
  • getId (63-66)
  • removeAttribute (279-285)
  • find (296-314)
src/Database/Query.php (5)
  • getAttribute (120-123)
  • setAttribute (161-166)
  • Query (8-730)
  • equal (355-358)
  • limit (496-499)
src/Database/Adapter/MariaDB.php (1)
  • find (1353-1516)
src/Database/Adapter/Postgres.php (1)
  • find (1444-1606)
🪛 PHPMD (2.15.0)
src/Database/Database.php

3600-3600: Avoid unused local variables such as '$twoWayKey'. (Unused Code Rules)

(UnusedLocalVariable)


3601-3601: Avoid unused local variables such as '$side'. (Unused Code Rules)

(UnusedLocalVariable)


3647-3701: Avoid unused private methods such as 'shouldSkipRelationshipFetchBatch'. (Unused Code Rules)

(UnusedPrivateMethod)


3786-3786: Avoid unused parameters such as '$collection'. (Unused Code Rules)

(UnusedFormalParameter)


3867-3867: Avoid unused parameters such as '$collection'. (Unused Code Rules)

(UnusedFormalParameter)

🪛 GitHub Actions: Linter
src/Database/Database.php

[error] 1-1: PHP CS Fixer (Pint) PSR-12 style issue: 'no_whitespace_in_blank_line' violation detected.

🪛 GitHub Actions: CodeQL
src/Database/Database.php

[error] 3324-3324: PHPStan: Method Utopia\Database\Database::populateDocumentRelationships() is unused.


[error] 3647-3647: PHPStan: Method Utopia\Database\Database::shouldSkipRelationshipFetchBatch() is unused.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Adapter Tests (MariaDB)
  • GitHub Check: Adapter Tests (MySQL)
  • GitHub Check: Adapter Tests (SharedTables/Postgres)
  • GitHub Check: Adapter Tests (SharedTables/SQLite)
  • GitHub Check: Adapter Tests (Mirror)
  • GitHub Check: Adapter Tests (SharedTables/MySQL)
  • GitHub Check: Adapter Tests (SQLite)
  • GitHub Check: Adapter Tests (Postgres)
  • GitHub Check: Adapter Tests (SharedTables/MariaDB)
  • GitHub Check: Adapter Tests (Pool)
  • GitHub Check: Unit Test
🔇 Additional comments (8)
src/Database/Database.php (8)

3293-3294: LGTM! Clean integration of batch relationship population.

The approach of wrapping a single document in an array and extracting the first element after batch processing is consistent and maintains the existing API while leveraging the optimized batch functionality.


3713-3773: Solid one-to-one batch implementation with excellent performance gains.

The approach of collecting all related IDs, performing a single query, and then mapping results back to documents is optimal. The null handling for missing related documents is correct.

The cycle detection via the map property and proper cleanup of the relationship stack show good attention to preventing infinite loops.


3955-4043: Sophisticated many-to-many implementation with proper two-phase approach.

The junction-first, then related documents approach is the correct strategy for many-to-many relationships. The order preservation through junction indexing ensures consistent results.

The complexity is justified for this relationship type, and the batch optimization provides significant performance benefits over the traditional per-document approach.


4139-4140: LGTM! Consistent batch integration pattern.

The same pattern of wrapping single documents for batch processing is correctly applied here.


4238-4243: Perfect use case for batch relationship population!

Since createDocuments already processes multiple documents, using batch relationship population directly (without wrapping/unwrapping) is the optimal approach and provides maximum performance benefit.


4783-4784: LGTM! Consistent single document batch integration.

Correctly applies the established pattern for integrating batch processing with single document operations.


5620-5625: LGTM! Optimal batch processing for bulk operations.

Like createDocuments, this method naturally benefits from batch relationship population since it processes multiple documents simultaneously.


6623-6631: Excellent unification of relationship population strategy!

Always using batch processing regardless of result count is a smart optimization. This eliminates the performance penalty for single document results while providing maximum benefit for multiple documents, creating a consistent and efficient approach.

The comment accurately describes the performance improvement from replacing per-document processing with batch processing at all levels.

- Replace depth-first cycle detection with collection-level cycle detection
- Track collection→collection paths instead of document-level relationships
- Detect direct cycles (A→B, B→A) and indirect cycles (A→B→A)
- Designed specifically for batch/breadth-first relationship traversal
- Should prevent infinite loops while allowing legitimate nested relationships
- Only block exact same relationship (same key + collections) being processed
- Only block direct back-references for two-way relationships
- Remove overly broad collection-level blocking that prevented legitimate relationships
- Should allow multiple different relationships to same collection
…essing

- Remove complex collection-level cycle tracking that was over-blocking relationships
- Use simple depth-based limits only: max depth for all, max-1 for two-way
- Should allow legitimate nested relationships like veterinarians→presidents→animals
- Still prevents infinite recursion through depth limits
Per user insight: In breadth-first batch processing:
- Depth control prevents infinite recursion (process by levels)
- Map prevents duplicate fetches (no need to track cycles)
- Level-by-level processing can't create infinite loops

Testing if cycle detection is unnecessary in batch scenario.
- Remove all cycle detection logic, rely only on depth control
- Add map tracking to all batch methods to prevent duplicate fetches
- Test theory: level-by-level processing + depth control = no infinite recursion possible
- Should allow maximum relationship population with minimal restrictions
Your insight implemented:
- Use map to check if relationships already processed BEFORE collecting IDs
- Skip duplicate relationship processing automatically
- No cycle detection needed - map + depth control is sufficient
- Should eliminate redundant database queries and improve performance significantly
…ed deduplication

- Skip one-to-one relationships already populated (value is Document object)
- Skip array relationships already populated (array contains Document objects)
- Map prevents duplicate processing across calls
- No cycle detection needed - breadth-first + depth control + smart skipping
- Should achieve maximum efficiency with minimal redundant work
- Create testSimpleRelationshipPopulation for isolated debugging
- Test basic one-to-many user->posts relationship
- Add debug output to trace relationship population flow
- Add debug logs for map blocking behavior
- Debug skip logic for already-populated relationships
- Track relationship assignment in one-to-many
- Temporarily force-include IDs that would be blocked by map
- Monitor existing values during processing
- Added debugging before/after each relationship processing
- Added debugging to show nested query processing
- Added debugging to show related document attributes after fetching
- This will help trace why Tiger's relationships are strings vs Documents
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

♻️ Duplicate comments (4)
src/Database/Database.php (3)

3652-3668: Remove unused cycle detection method.

This method is not called anywhere and represents dead code that should be cleaned up.

The static analysis correctly identifies this as unused code that should be removed or integrated.


3719-3720: Remove debug logging from one-to-one batch method.

Multiple debug statements should be removed for production code.

This continues the pattern of debug logging that needs to be cleaned up across all batch methods.

Also applies to: 3723-3724, 3735-3747


3599-3601: Remove unused variables.

These variables are extracted but never used in the method logic.

-            $twoWay = $relationship['options']['twoWay'];
-            $twoWayKey = $relationship['options']['twoWayKey'];
-            $side = $relationship['options']['side'];
tests/e2e/Adapter/Scopes/RelationshipTests.php (1)

222-252: Remove all debug output before merging.

The debug statements throughout this section should be removed as they're not appropriate for production test code:

  • Line 222: Debug separator
  • Lines 224-226: Error log redirection to /tmp/debug.log
  • Lines 242-243, 247-248, 251-252: var_dump() statements

These will clutter test output and the hardcoded /tmp/debug.log path may not work in all environments.

-        var_dump('=== start === === start === === start === === start === === start === === start === === start === === start === === start ===');
-        
-        // Clear error log and start fresh
-        file_put_contents('/tmp/debug.log', '');
-        ini_set('error_log', '/tmp/debug.log');

         $docs = $database->find(
             'veterinarians',
             [
                 Query::select([
                     '*',
                     'animals.*',
                     'animals.zoo.*',
                     'animals.president.*',
                     'presidents.*',
                     'presidents.animal.*',
                 ])
             ]
         );

-        var_dump('=== VETERINARIANS RESULT ===');
-        var_dump($docs);
-        
-        // Let's also check what presidents look like directly
-        $presidents = $database->find('presidents');
-        var_dump('=== PRESIDENTS DIRECTLY ===');
-        var_dump($presidents);
-        
-        // Dump error log contents
-        var_dump('=== ERROR LOG ===');
-        var_dump(file_get_contents('/tmp/debug.log'));
+        // Add proper assertions here to validate the relationship data
+        $this->assertNotEmpty($docs);
+        
+        $presidents = $database->find('presidents');
+        $this->assertNotEmpty($presidents);
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bffb3be and 1bbc241.

📒 Files selected for processing (2)
  • src/Database/Database.php (7 hunks)
  • tests/e2e/Adapter/Scopes/RelationshipTests.php (2 hunks)
🧰 Additional context used
🪛 PHPMD (2.15.0)
src/Database/Database.php

3599-3599: Avoid unused local variables such as '$twoWay'. (Unused Code Rules)

(UnusedLocalVariable)


3600-3600: Avoid unused local variables such as '$twoWayKey'. (Unused Code Rules)

(UnusedLocalVariable)


3601-3601: Avoid unused local variables such as '$side'. (Unused Code Rules)

(UnusedLocalVariable)


3652-3668: Avoid unused private methods such as 'shouldSkipRelationshipFetchBatch'. (Unused Code Rules)

(UnusedPrivateMethod)


3652-3652: Avoid unused parameters such as '$collection'. (Unused Code Rules)

(UnusedFormalParameter)

tests/e2e/Adapter/Scopes/RelationshipTests.php

284-284: Avoid unused local variables such as '$user'. (Unused Code Rules)

(UnusedLocalVariable)


290-290: Avoid unused local variables such as '$post1'. (Unused Code Rules)

(UnusedLocalVariable)


297-297: Avoid unused local variables such as '$post2'. (Unused Code Rules)

(UnusedLocalVariable)

🔇 Additional comments (10)
src/Database/Database.php (10)

3293-3294: Clean refactoring to use batch relationship population.

The approach of wrapping a single document in an array and extracting the first result is correct and maintains backward compatibility while leveraging the optimized batch processing.


3680-3771: Excellent one-to-one batch implementation.

The logic correctly handles multiple documents referencing the same related ID, uses proper mapping to avoid duplicates, and handles missing related documents gracefully.


3784-3881: Solid one-to-many batch implementation with correct asymmetric handling.

The differentiation between parent and child sides is correct, and the grouping logic efficiently organizes results. The back-reference removal prevents cycles appropriately.


3894-3988: Well-implemented many-to-one batch processing.

The role reversal logic compared to one-to-many is correct, and the twoWay requirement for child side relationships is appropriate.


4001-4108: Sophisticated many-to-many batch implementation.

The two-phase approach (junction records then related documents) is correctly implemented, and the order preservation logic ensures consistent results. The complexity is warranted for this relationship type.


4204-4205: Consistent refactoring pattern.

Same clean approach as in getDocument - wrapping single document for batch processing while maintaining backward compatibility.


4304-4306: Optimal use of batch relationship population.

This is the ideal scenario where multiple documents are processed together, maximizing the efficiency gains from batch processing.


4848-4849: Consistent single document handling.

Another correct application of the batch processing pattern for single document operations.


5686-5688: Appropriate batch processing for multiple documents.

Another good application of batch relationship population where multiple documents benefit from the optimization.


6689-6696: Excellent unified approach for relationship population.

The find method now consistently uses batch processing for all cases, which will provide significant performance improvements for queries returning multiple documents.

- Add detailed logging to processRelationshipQueries to track nested selection creation
- Add debugging in populateOneToManyRelationshipsBatch around find calls
- Track specific zoo/president relationship population
- Clean up excessive debugging to focus on critical paths
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (3)
src/Database/Database.php (3)

3580-3638: ****

Past reviews have already identified the debug logging and unused variables issues in this method. The implementation logic is sound for batch relationship population.


3652-3668: ****

Past reviews have already identified that this method is unused and should be removed or integrated.


6707-6717: ****

Past reviews have already identified the debug logging issue in this section. The logic change to always use batch processing for consistency is sound.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7f65e74 and 2eb50a8.

📒 Files selected for processing (1)
  • src/Database/Database.php (11 hunks)
🧰 Additional context used
🪛 PHPMD (2.15.0)
src/Database/Database.php

3599-3599: Avoid unused local variables such as '$twoWay'. (Unused Code Rules)

(UnusedLocalVariable)


3600-3600: Avoid unused local variables such as '$twoWayKey'. (Unused Code Rules)

(UnusedLocalVariable)


3601-3601: Avoid unused local variables such as '$side'. (Unused Code Rules)

(UnusedLocalVariable)


3652-3668: Avoid unused private methods such as 'shouldSkipRelationshipFetchBatch'. (Unused Code Rules)

(UnusedPrivateMethod)


3652-3652: Avoid unused parameters such as '$collection'. (Unused Code Rules)

(UnusedFormalParameter)

🪛 GitHub Actions: CodeQL
src/Database/Database.php

[error] 3324-3324: PHPStan: Method Utopia\Database\Database::populateDocumentRelationships() is unused.


[error] 3652-3652: PHPStan: Method Utopia\Database\Database::shouldSkipRelationshipFetchBatch() is unused.

🪛 GitHub Actions: Linter
src/Database/Database.php

[error] 1-1: Laravel Pint PSR-12 style issues: single_space_around_construct, function_declaration_spacing.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Adapter Tests (Pool)
  • GitHub Check: Adapter Tests (SharedTables/MySQL)
  • GitHub Check: Adapter Tests (SharedTables/SQLite)
  • GitHub Check: Adapter Tests (SharedTables/Postgres)
  • GitHub Check: Adapter Tests (Postgres)
  • GitHub Check: Adapter Tests (SharedTables/MariaDB)
  • GitHub Check: Adapter Tests (Mirror)
  • GitHub Check: Adapter Tests (MariaDB)
  • GitHub Check: Adapter Tests (MySQL)
  • GitHub Check: Adapter Tests (SQLite)
  • GitHub Check: Unit Test
🔇 Additional comments (9)
src/Database/Database.php (9)

3293-3294: LGTM: Clean integration of batch processing.

The change correctly adapts single document processing to use the new batch method by wrapping the document in an array and extracting the result. This maintains backward compatibility while leveraging the performance benefits of batch processing.


3680-3771: Well-designed one-to-one batch implementation with proper relationship tracking.

The implementation correctly handles multiple documents referencing the same related ID by maintaining a mapping structure. The cycle detection via relationship stack and depth tracking is appropriate.


3784-3900: Robust one-to-many implementation with proper side handling.

The asymmetric handling between parent and child sides is correct, and the delegation to one-to-one for child side relationships is a smart optimization. The grouping logic efficiently organizes results while properly removing back-references to prevent cycles.


3913-4007: Consistent many-to-one implementation with appropriate role reversal.

The implementation correctly reverses the relationship logic compared to one-to-many, and the twoWay requirement for child-side relationships is appropriate for this relationship type.


4020-4127: Sophisticated many-to-many implementation with proper two-phase approach.

The junction-based approach correctly implements the complex many-to-many relationship pattern. The two-phase fetching (junction records then related documents) and order preservation logic ensures consistent results.


4223-4224: LGTM: Consistent batch integration pattern.

Follows the same correct approach as other single document adaptations, maintaining interface compatibility while leveraging batch processing performance benefits.


4322-4327: Excellent optimization for multi-document creation.

This change efficiently leverages the batch relationship population directly since createDocuments already processes multiple documents, avoiding the wrap-unwrap pattern needed for single document methods.


4867-4868: LGTM: Maintains consistency in batch integration.

Correctly follows the established pattern for adapting single document operations to use batch processing.


5704-5709: Efficient direct batch processing for multiple documents.

Another good example of directly leveraging batch processing when the method already handles multiple documents, avoiding unnecessary overhead.

cursoragent and others added 7 commits August 8, 2025 03:29
- Updated all populateXXXRelationshipsBatch methods to use consistent parameter signatures
- Fixed parameter order: documents, relationship, relationshipFetchDepth, relationshipFetchStack, queries
- All batch methods now return documents array for proper chaining
- Added proper collection extraction from relationship options within batch methods
- Fixed depth parameter references to use passed parameter instead of instance variable
- Cleaned up debug output for cleaner testing
- All method calls updated to use correct parameter order
- Added comprehensive debug logging to populateDocumentsRelationships to track call flow
- Fixed early return statements in one-to-many batch method to properly return documents
- Fixed depth parameter reference to use passed parameter instead of instance variable
- All batch methods now have proper depth/stack management around find calls
…-performance

# Conflicts:
#	tests/e2e/Adapter/Scopes/RelationshipTests.php
…-performance

# Conflicts:
#	src/Database/Database.php
@abnegate abnegate changed the title Optimize relationship fetching with batch population and performance … Optimize relationship fetching Oct 1, 2025
@abnegate abnegate marked this pull request as draft October 1, 2025 11:27
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/Database/Database.php (1)

3552-3560: Fix caching condition; don’t disable caching for all collections with relationships.

Current check prevents caching for any collection that defines relationships. Gate on batch mode instead.

-        // Don't save to cache if it's part of a relationship
-        if (empty($relationships)) {
+        // Skip caching only when invoked from batch population
+        if (!$this->inBatchRelationshipPopulation) {
             try {
                 $this->cache->save($documentKey, $document->getArrayCopy(), $hashKey);
                 $this->cache->save($collectionKey, 'empty', $documentKey);
             } catch (Exception $e) {
                 Console::warning('Failed to save document to cache: ' . $e->getMessage());
             }
         }
🧹 Nitpick comments (12)
tests/e2e/Adapter/Scopes/RelationshipTests.php (1)

431-449: Remove unused local variables.

The variables $user, $post1, and $post2 are assigned but never referenced. The test subsequently fetches fresh documents from the database instead of using these variables.

Apply this diff to remove the unused variables:

-        $user = $database->createDocument('users_simple', new Document([
+        $database->createDocument('users_simple', new Document([
             '$id' => 'user1',
             '$permissions' => [Permission::read(Role::any())],
             'name' => 'John Doe',
         ]));

-        $post1 = $database->createDocument('posts_simple', new Document([
+        $database->createDocument('posts_simple', new Document([
             '$id' => 'post1',
             '$permissions' => [Permission::read(Role::any())],
             'title' => 'First Post',
             'author' => 'user1',
         ]));

-        $post2 = $database->createDocument('posts_simple', new Document([
+        $database->createDocument('posts_simple', new Document([
             '$id' => 'post2', 
             '$permissions' => [Permission::read(Role::any())],
             'title' => 'Second Post',
             'author' => 'user1',
         ]));
src/Database/Database.php (11)

3645-3666: Avoid mutating schema attribute; clone relationship Document before augmentation.

Setting 'collection' on the relationship attribute mutates metadata. Clone before passing to population.

-                        $relationship->setAttribute('collection', $coll->getId());
+                        $relationshipDoc = clone $relationship;
+                        $relationshipDoc->setAttribute('collection', $coll->getId());
@@
-                        $relatedDocs = $this->populateSingleRelationshipBatch(
-                            $docs,
-                            $relationship,
-                            $coll,
-                            $queries
-                        );
+                        $relatedDocs = $this->populateSingleRelationshipBatch(
+                            $docs,
+                            $relationshipDoc,
+                            $queries
+                        );
@@
-                        $twoWay = $relationship['options']['twoWay'];
-                        $twoWayKey = $relationship['options']['twoWayKey'];
+                        $twoWay = $relationshipDoc['options']['twoWay'];
+                        $twoWayKey = $relationshipDoc['options']['twoWayKey'];

3747-3764: Remove unused parameter and local in populateSingleRelationshipBatch.

$collection (param) and $key (local) are unused. Simplify signature and call sites.

-    private function populateSingleRelationshipBatch(array $documents, Document $relationship, Document $collection, array $queries): array
+    private function populateSingleRelationshipBatch(array $documents, Document $relationship, array $queries): array
     {
-        $key = $relationship['key'];
         $relationType = $relationship['options']['relationType'];

3642-3645: Remove unused local.

$relationType is not used in this scope.

-                        $relationType = $relationship['options']['relationType'];

3778-3794: Delete unused private method.

shouldSkipRelationshipFetchBatch is never called. Remove to silence PHPMD and reduce noise.

-    private function shouldSkipRelationshipFetchBatch(Document $relationship, Document $collection): bool
-    {
-        $twoWay = $relationship['options']['twoWay'] ?? false;
-
-        // Respect max depth limit to prevent infinite recursion
-        if ($this->relationshipFetchDepth >= Database::RELATION_MAX_DEPTH) {
-            return true;
-        }
-
-        // For breadth-first batch processing, be more permissive
-        // Only block if we're at the very edge of max depth for two-way relationships
-        if ($twoWay && ($this->relationshipFetchDepth >= (Database::RELATION_MAX_DEPTH - 1))) {
-            return true;
-        }
-
-        return false;
-    }

3579-3585: Drop unused parameter from populateDocumentsRelationships and update call sites.

$relationshipFetchStack is unused. Remove from signature and all invocations.

-    private function populateDocumentsRelationships(array $documents, Document $collection, int $relationshipFetchDepth = 0, array $relationshipFetchStack = [], array $selects = []): array
+    private function populateDocumentsRelationships(array $documents, Document $collection, int $relationshipFetchDepth = 0, array $selects = []): array

3541-3545: Update call site (getDocument).

-            $documents = $this->silent(fn () => $this->populateDocumentsRelationships([$document], $collection, $this->relationshipFetchDepth, $this->relationshipFetchStack, $nestedSelections));
+            $documents = $this->silent(fn () => $this->populateDocumentsRelationships([$document], $collection, $this->relationshipFetchDepth, $nestedSelections));

4292-4297: Update call site (createDocument).

-            $documents = $this->silent(fn () => $this->populateDocumentsRelationships([$document], $collection, $fetchDepth, $this->relationshipFetchStack));
+            $documents = $this->silent(fn () => $this->populateDocumentsRelationships([$document], $collection, $fetchDepth));

4396-4400: Update call site (createDocuments).

-            if (!$this->inBatchRelationshipPopulation && $this->resolveRelationships) {
-                $batch = $this->silent(fn () => $this->populateDocumentsRelationships($batch, $collection, $this->relationshipFetchDepth, $this->relationshipFetchStack));
-            }
+            if (!$this->inBatchRelationshipPopulation && $this->resolveRelationships) {
+                $batch = $this->silent(fn () => $this->populateDocumentsRelationships($batch, $collection, $this->relationshipFetchDepth));
+            }

4949-4952: Update call site (updateDocument).

-            $documents = $this->silent(fn () => $this->populateDocumentsRelationships([$document], $collection, $this->relationshipFetchDepth, $this->relationshipFetchStack));
+            $documents = $this->silent(fn () => $this->populateDocumentsRelationships([$document], $collection, $this->relationshipFetchDepth));

5821-5824: Update call site (upsertDocumentsWithIncrease).

-            if (!$this->inBatchRelationshipPopulation && $this->resolveRelationships) {
-                $batch = $this->silent(fn () => $this->populateDocumentsRelationships($batch, $collection, $this->relationshipFetchDepth, $this->relationshipFetchStack));
-            }
+            if (!$this->inBatchRelationshipPopulation && $this->resolveRelationships) {
+                $batch = $this->silent(fn () => $this->populateDocumentsRelationships($batch, $collection, $this->relationshipFetchDepth));
+            }

6833-6840: Update call site (find).

-            if (count($results) > 0) {
-                // Always use batch processing for all cases (single and multiple documents, nested or top-level)
-                $results = $this->silent(fn () => $this->populateDocumentsRelationships($results, $collection, $this->relationshipFetchDepth, $this->relationshipFetchStack, $nestedSelections));
-            }
+            if (count($results) > 0) {
+                // Always use batch processing for all cases (single and multiple documents, nested or top-level)
+                $results = $this->silent(fn () => $this->populateDocumentsRelationships($results, $collection, $this->relationshipFetchDepth, $nestedSelections));
+            }
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2eb50a8 and c7cfab6.

📒 Files selected for processing (2)
  • src/Database/Database.php (9 hunks)
  • tests/e2e/Adapter/Scopes/RelationshipTests.php (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
src/Database/Database.php (5)
src/Database/Mirror.php (1)
  • silent (172-175)
src/Database/Document.php (9)
  • Document (12-470)
  • getAttribute (224-231)
  • setAttribute (244-261)
  • getId (63-66)
  • removeAttribute (287-293)
  • getCollection (85-88)
  • isEmpty (396-399)
  • find (304-322)
  • getArrayCopy (423-458)
src/Database/Query.php (7)
  • getAttribute (159-162)
  • setAttribute (200-205)
  • Query (8-1116)
  • equal (435-438)
  • limit (623-626)
  • getMethod (151-154)
  • getValues (167-170)
src/Database/Adapter/SQL.php (2)
  • find (2334-2513)
  • count (2525-2587)
src/Database/Adapter.php (2)
  • find (802-802)
  • count (825-825)
tests/e2e/Adapter/Scopes/RelationshipTests.php (6)
src/Database/Database.php (6)
  • getDatabase (776-779)
  • Database (37-7742)
  • createRelationship (2655-2832)
  • createDocument (4211-4305)
  • getDocument (3444-3565)
  • find (6732-6856)
src/Database/Adapter.php (6)
  • getDatabase (160-163)
  • getSupportForRelationships (992-992)
  • createRelationship (615-615)
  • createDocument (701-701)
  • getDocument (691-691)
  • find (802-802)
tests/e2e/Adapter/Base.php (1)
  • getDatabase (36-36)
src/Database/Adapter/Pool.php (5)
  • getSupportForRelationships (378-381)
  • createRelationship (193-196)
  • createDocument (228-231)
  • getDocument (223-226)
  • find (263-266)
src/Database/Helpers/Role.php (2)
  • Role (5-178)
  • any (159-162)
src/Database/Helpers/Permission.php (2)
  • Permission (9-264)
  • read (186-195)
🪛 GitHub Actions: CodeQL
src/Database/Database.php

[error] 1-1: PHPStan analyse failed with 11 errors. Command './vendor/bin/phpstan analyse --level 7 src tests --memory-limit 2G' exited with code 1. Please check PHPStan reports for details.

🪛 GitHub Actions: Linter
src/Database/Database.php

[error] 1-1: PSR-12: Trailing whitespace in function declaration detected by Pint. Run 'pint' to fix formatting.

tests/e2e/Adapter/Scopes/RelationshipTests.php

[error] 1-1: PSR-12: Trailing whitespace detected by Pint in test file. Run 'pint' to fix formatting.

🪛 PHPMD (2.15.0)
src/Database/Database.php

3579-3579: Avoid unused parameters such as '$relationshipFetchStack'. (undefined)

(UnusedFormalParameter)


3643-3643: Avoid unused local variables such as '$relationType'. (undefined)

(UnusedLocalVariable)


3747-3747: Avoid unused parameters such as '$collection'. (undefined)

(UnusedFormalParameter)


3749-3749: Avoid unused local variables such as '$key'. (undefined)

(UnusedLocalVariable)


3778-3794: Avoid unused private methods such as 'shouldSkipRelationshipFetchBatch'. (undefined)

(UnusedPrivateMethod)


3778-3778: Avoid unused parameters such as '$collection'. (undefined)

(UnusedFormalParameter)

tests/e2e/Adapter/Scopes/RelationshipTests.php

431-431: Avoid unused local variables such as '$user'. (undefined)

(UnusedLocalVariable)


437-437: Avoid unused local variables such as '$post1'. (undefined)

(UnusedLocalVariable)


444-444: Avoid unused local variables such as '$post2'. (undefined)

(UnusedLocalVariable)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Setup & Build Docker Image
🔇 Additional comments (2)
tests/e2e/Adapter/Scopes/RelationshipTests.php (2)

451-473: Well-structured relationship population test.

The test effectively validates both directions of the one-to-many relationship:

  • Parent-to-children: User with populated posts
  • Child-to-parent: Posts with populated author

The assertions appropriately verify both structure (types, counts) and values (attributes).


404-474: No duplicate method declaration found. The testSimpleRelationshipPopulation method appears only once in RelationshipTests.php, so no changes are needed.

Likely an incorrect or invalid review comment.

@abnegate abnegate marked this pull request as ready for review October 2, 2025 11:07
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

♻️ Duplicate comments (1)
src/Database/Database.php (1)

4052-4054: Call getInternalAttributes() through the instance

getInternalAttributes() isn’t static; invoking it via self:: will fatally error at runtime (Non-static method ... cannot be called statically). Switch to $this->getInternalAttributes().

-        $internalKeys = array_map(fn ($attr) => $attr['$id'], self::getInternalAttributes());
+        $internalKeys = array_map(fn ($attr) => $attr['$id'], $this->getInternalAttributes());
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c7cfab6 and cefe7c6.

📒 Files selected for processing (12)
  • bin/cli.php (2 hunks)
  • bin/relationships (1 hunks)
  • bin/tasks/index.php (1 hunks)
  • bin/tasks/load.php (2 hunks)
  • bin/tasks/query.php (3 hunks)
  • bin/tasks/relationships.php (1 hunks)
  • bin/view/index.php (6 hunks)
  • src/Database/Database.php (11 hunks)
  • src/Database/Validator/Queries/Documents.php (1 hunks)
  • src/Database/Validator/Query/Filter.php (2 hunks)
  • tests/e2e/Adapter/Scopes/RelationshipTests.php (2 hunks)
  • tests/unit/Validator/Query/FilterTest.php (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (7)
tests/unit/Validator/Query/FilterTest.php (3)
src/Database/Validator/Query/Filter.php (2)
  • Filter (15-342)
  • getMaxValuesCount (333-336)
src/Database/Database.php (1)
  • Database (37-7729)
src/Database/Query.php (2)
  • Query (8-1116)
  • equal (435-438)
tests/e2e/Adapter/Scopes/RelationshipTests.php (1)
src/Database/Database.php (2)
  • Database (37-7729)
  • createDocument (4198-4292)
bin/tasks/query.php (5)
src/Database/Adapter.php (3)
  • Adapter (16-1318)
  • setSharedTables (174-179)
  • create (488-488)
src/Database/Database.php (3)
  • Database (37-7729)
  • setSharedTables (1023-1028)
  • create (1241-1260)
src/Database/Adapter/Postgres.php (2)
  • Postgres (19-2259)
  • create (135-162)
src/Database/Adapter/SQL.php (1)
  • getPDOAttributes (1739-1749)
src/Database/PDO.php (1)
  • PDO (13-143)
bin/tasks/index.php (3)
src/Database/Database.php (2)
  • Database (37-7729)
  • setSharedTables (1023-1028)
src/Database/PDO.php (1)
  • PDO (13-143)
src/Database/Adapter/SQL.php (1)
  • getPDOAttributes (1739-1749)
bin/tasks/relationships.php (5)
src/Database/Database.php (3)
  • Database (37-7729)
  • skipRelationships (653-663)
  • findOne (6911-6926)
src/Database/DateTime.php (2)
  • DateTime (7-86)
  • now (19-23)
src/Database/Helpers/Permission.php (2)
  • Permission (9-264)
  • read (186-195)
src/Database/Helpers/Role.php (2)
  • Role (5-178)
  • any (159-162)
src/Database/Query.php (2)
  • Query (8-1116)
  • limit (623-626)
src/Database/Database.php (4)
src/Database/Mirror.php (1)
  • silent (172-175)
src/Database/Document.php (8)
  • Document (12-470)
  • getAttribute (224-231)
  • setAttribute (244-261)
  • getId (63-66)
  • removeAttribute (287-293)
  • isEmpty (396-399)
  • find (304-322)
  • getArrayCopy (423-458)
src/Database/Query.php (7)
  • getAttribute (159-162)
  • setAttribute (200-205)
  • Query (8-1116)
  • equal (435-438)
  • limit (623-626)
  • getMethod (151-154)
  • getValues (167-170)
src/Database/Adapter/SQL.php (2)
  • find (2334-2513)
  • count (2525-2587)
bin/tasks/load.php (2)
src/Database/Database.php (4)
  • Database (37-7729)
  • setSharedTables (1023-1028)
  • createDocuments (4308-4408)
  • create (1241-1260)
src/Database/DateTime.php (2)
  • DateTime (7-86)
  • now (19-23)
🪛 PHPMD (2.15.0)
tests/e2e/Adapter/Scopes/RelationshipTests.php

431-431: Avoid unused local variables such as '$user'. (undefined)

(UnusedLocalVariable)


437-437: Avoid unused local variables such as '$post1'. (undefined)

(UnusedLocalVariable)


444-444: Avoid unused local variables such as '$post2'. (undefined)

(UnusedLocalVariable)

bin/tasks/relationships.php

438-438: Avoid unused local variables such as '$total'. (undefined)

(UnusedLocalVariable)

src/Database/Database.php

3580-3580: Avoid unused parameters such as '$relationshipFetchStack'. (undefined)

(UnusedFormalParameter)


3639-3639: Avoid unused local variables such as '$relationType'. (undefined)

(UnusedLocalVariable)


3737-3737: Avoid unused local variables such as '$key'. (undefined)

(UnusedLocalVariable)

@@ -0,0 +1,3 @@
#!/bin/sh

php /usr/src/code/bin/cli.php relationships $@
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Quote forwarded CLI arguments.

$@ without quotes re-splits arguments containing spaces, so the CLI task receives mangled input. Wrap it in quotes.

-php /usr/src/code/bin/cli.php relationships $@
+php /usr/src/code/bin/cli.php relationships "$@"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
php /usr/src/code/bin/cli.php relationships $@
php /usr/src/code/bin/cli.php relationships "$@"
🤖 Prompt for AI Agents
In bin/relationships around line 3, the script forwards CLI args using $@
unquoted which re-splits arguments containing spaces; change the invocation to
pass the arguments quoted (i.e. use "$@") so each original argument is preserved
and forwarded intact to cli.php.

Comment on lines +120 to 136
for ($i = 0; $i < $limit / 1000; $i++) {
//\go(function () use ($cfg, $pool, $name, $namespace, $sharedTables, $cache) {
try {
//$pdo = $pool->get();

$database = (new Database(new ($cfg['adapter'])($pdo), $cache))
->setDatabase($name)
->setNamespace($namespace)
->setSharedTables($sharedTables);

createDocuments($database);
//$pool->put($pdo);
} catch (\Throwable $error) {
Console::error('Coroutine error: ' . $error->getMessage());
}
//});
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Respect the requested record limit when batching inserts.

$limit / 1000 combined with a hard-coded createDocuments() batch causes us to insert 1 000 documents even when the caller asks for fewer, and overshoots whenever $limit is not an exact multiple of 1 000. That breaks the CLI contract from previous releases. Please compute batches with intdiv/% and pass the actual chunk size into the writer so we never exceed $limit.

-        for ($i = 0; $i < $limit / 1000; $i++) {
-            try {
-                $database = (new Database(new ($cfg['adapter'])($pdo), $cache))
-                    ->setDatabase($name)
-                    ->setNamespace($namespace)
-                    ->setSharedTables($sharedTables);
-
-                createDocuments($database);
-            } catch (\Throwable $error) {
-                Console::error('Coroutine error: ' . $error->getMessage());
-            }
-        }
+        $batchSize = 1000;
+        $fullBatches = intdiv($limit, $batchSize);
+        $remainder = $limit % $batchSize;
+
+        $chunks = array_fill(0, $fullBatches, $batchSize);
+        if ($remainder > 0) {
+            $chunks[] = $remainder;
+        }
+
+        foreach ($chunks as $chunkSize) {
+            if ($chunkSize === 0) {
+                continue;
+            }
+
+            try {
+                $database = (new Database(new ($cfg['adapter'])($pdo), $cache))
+                    ->setDatabase($name)
+                    ->setNamespace($namespace)
+                    ->setSharedTables($sharedTables);
+
+                createDocuments($database, $chunkSize);
+            } catch (\Throwable $error) {
+                Console::error('Coroutine error: ' . $error->getMessage());
+            }
+        }

Also adjust createDocuments() to honour the new $chunkSize.

🤖 Prompt for AI Agents
In bin/tasks/load.php around lines 120–136, the loop currently uses $limit /
1000 and always calls createDocuments() with a fixed 1000, which can overshoot
the requested $limit; change batching to compute fullBatches = intdiv($limit,
1000) and remainder = $limit % 1000 (or compute totalBatches and derive
per-iteration size), iterate fullBatches times passing chunkSize = 1000 to
createDocuments, then if remainder > 0 call createDocuments once with chunkSize
= remainder (or in a single loop compute chunkSize = ($i === $totalBatches - 1)
? $remainder : 1000); update the createDocuments() function signature to accept
$chunkSize and generate/insert exactly $chunkSize documents (honouring zero/edge
cases) so the total inserted never exceeds $limit.

Comment on lines +164 to 201
function createDocuments(Database $database): void
{
$database->createDocument('articles', new Document([
// Five random users out of 10,000 get read access
// Three random users out of 10,000 get mutate access
'$permissions' => [
Permission::read(Role::any()),
Permission::read(Role::user($faker->randomNumber(9))),
Permission::read(Role::user($faker->randomNumber(9))),
Permission::read(Role::user($faker->randomNumber(9))),
Permission::read(Role::user($faker->randomNumber(9))),
Permission::create(Role::user($faker->randomNumber(9))),
Permission::create(Role::user($faker->randomNumber(9))),
Permission::create(Role::user($faker->randomNumber(9))),
Permission::update(Role::user($faker->randomNumber(9))),
Permission::update(Role::user($faker->randomNumber(9))),
Permission::update(Role::user($faker->randomNumber(9))),
Permission::delete(Role::user($faker->randomNumber(9))),
Permission::delete(Role::user($faker->randomNumber(9))),
Permission::delete(Role::user($faker->randomNumber(9))),
],
'author' => $faker->name(),
'created' => \Utopia\Database\DateTime::format($faker->dateTime()),
'text' => $faker->realTextBetween(1000, 4000),
'genre' => $faker->randomElement(['fashion', 'food', 'travel', 'music', 'lifestyle', 'fitness', 'diy', 'sports', 'finance']),
'views' => $faker->randomNumber(6),
'tags' => $faker->randomElements(['short', 'quick', 'easy', 'medium', 'hard'], $faker->numberBetween(1, 5)),
]));
global $namesPool, $genresPool, $tagsPool;

$documents = [];

$start = \microtime(true);
for ($i = 0; $i < 1000; $i++) {
$length = \mt_rand(1000, 4000);
$bytes = \random_bytes(intdiv($length + 1, 2));
$text = \substr(\bin2hex($bytes), 0, $length);
$tagCount = \mt_rand(1, count($tagsPool));
$tagKeys = (array) \array_rand($tagsPool, $tagCount);
$tags = \array_map(fn ($k) => $tagsPool[$k], $tagKeys);

$documents[] = new Document([
'$permissions' => [
Permission::read(Role::any()),
...array_map(fn () => Permission::read(Role::user(mt_rand(0, 999999999))), range(1, 4)),
...array_map(fn () => Permission::create(Role::user(mt_rand(0, 999999999))), range(1, 3)),
...array_map(fn () => Permission::update(Role::user(mt_rand(0, 999999999))), range(1, 3)),
...array_map(fn () => Permission::delete(Role::user(mt_rand(0, 999999999))), range(1, 3)),
],
'author' => $namesPool[\array_rand($namesPool)],
'created' => DateTime::now(),
'text' => $text,
'genre' => $genresPool[\array_rand($genresPool)],
'views' => \mt_rand(0, 999999),
'tags' => $tags,
]);
}
$time = \microtime(true) - $start;
Console::info("Prepared 1000 documents in {$time} seconds");
$start = \microtime(true);
$database->createDocuments('articles', $documents, 1000);
$time = \microtime(true) - $start;
Console::success("Inserted 1000 documents in {$time} seconds");
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Update createDocuments() to accept the batch size.

To finish the fix above, the helper should create exactly $count documents instead of always producing 1 000.

-function createDocuments(Database $database): void
+function createDocuments(Database $database, int $count = 1000): void
 {
     global $namesPool, $genresPool, $tagsPool;
 
     $documents = [];
 
     $start = \microtime(true);
-    for ($i = 0; $i < 1000; $i++) {
+    for ($i = 0; $i < $count; $i++) {
         $length = \mt_rand(1000, 4000);
         $bytes = \random_bytes(intdiv($length + 1, 2));
         $text = \substr(\bin2hex($bytes), 0, $length);
         $tagCount = \mt_rand(1, count($tagsPool));
         $tagKeys = (array) \array_rand($tagsPool, $tagCount);
         $tags = \array_map(fn ($k) => $tagsPool[$k], $tagKeys);
 
         $documents[] = new Document([
             '$permissions' => [
                 Permission::read(Role::any()),
                 ...array_map(fn () => Permission::read(Role::user(mt_rand(0, 999999999))), range(1, 4)),
                 ...array_map(fn () => Permission::create(Role::user(mt_rand(0, 999999999))), range(1, 3)),
                 ...array_map(fn () => Permission::update(Role::user(mt_rand(0, 999999999))), range(1, 3)),
                 ...array_map(fn () => Permission::delete(Role::user(mt_rand(0, 999999999))), range(1, 3)),
             ],
             'author'  => $namesPool[\array_rand($namesPool)],
             'created' => DateTime::now(),
             'text'    => $text,
             'genre'   => $genresPool[\array_rand($genresPool)],
             'views'   => \mt_rand(0, 999999),
             'tags'    => $tags,
         ]);
     }
     $time = \microtime(true) - $start;
-    Console::info("Prepared 1000 documents in {$time} seconds");
+    Console::info("Prepared {$count} documents in {$time} seconds");
     $start = \microtime(true);
-    $database->createDocuments('articles', $documents, 1000);
+    $database->createDocuments('articles', $documents, $count);
     $time = \microtime(true) - $start;
-    Console::success("Inserted 1000 documents in {$time} seconds");
+    Console::success("Inserted {$count} documents in {$time} seconds");
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
function createDocuments(Database $database): void
{
$database->createDocument('articles', new Document([
// Five random users out of 10,000 get read access
// Three random users out of 10,000 get mutate access
'$permissions' => [
Permission::read(Role::any()),
Permission::read(Role::user($faker->randomNumber(9))),
Permission::read(Role::user($faker->randomNumber(9))),
Permission::read(Role::user($faker->randomNumber(9))),
Permission::read(Role::user($faker->randomNumber(9))),
Permission::create(Role::user($faker->randomNumber(9))),
Permission::create(Role::user($faker->randomNumber(9))),
Permission::create(Role::user($faker->randomNumber(9))),
Permission::update(Role::user($faker->randomNumber(9))),
Permission::update(Role::user($faker->randomNumber(9))),
Permission::update(Role::user($faker->randomNumber(9))),
Permission::delete(Role::user($faker->randomNumber(9))),
Permission::delete(Role::user($faker->randomNumber(9))),
Permission::delete(Role::user($faker->randomNumber(9))),
],
'author' => $faker->name(),
'created' => \Utopia\Database\DateTime::format($faker->dateTime()),
'text' => $faker->realTextBetween(1000, 4000),
'genre' => $faker->randomElement(['fashion', 'food', 'travel', 'music', 'lifestyle', 'fitness', 'diy', 'sports', 'finance']),
'views' => $faker->randomNumber(6),
'tags' => $faker->randomElements(['short', 'quick', 'easy', 'medium', 'hard'], $faker->numberBetween(1, 5)),
]));
global $namesPool, $genresPool, $tagsPool;
$documents = [];
$start = \microtime(true);
for ($i = 0; $i < 1000; $i++) {
$length = \mt_rand(1000, 4000);
$bytes = \random_bytes(intdiv($length + 1, 2));
$text = \substr(\bin2hex($bytes), 0, $length);
$tagCount = \mt_rand(1, count($tagsPool));
$tagKeys = (array) \array_rand($tagsPool, $tagCount);
$tags = \array_map(fn ($k) => $tagsPool[$k], $tagKeys);
$documents[] = new Document([
'$permissions' => [
Permission::read(Role::any()),
...array_map(fn () => Permission::read(Role::user(mt_rand(0, 999999999))), range(1, 4)),
...array_map(fn () => Permission::create(Role::user(mt_rand(0, 999999999))), range(1, 3)),
...array_map(fn () => Permission::update(Role::user(mt_rand(0, 999999999))), range(1, 3)),
...array_map(fn () => Permission::delete(Role::user(mt_rand(0, 999999999))), range(1, 3)),
],
'author' => $namesPool[\array_rand($namesPool)],
'created' => DateTime::now(),
'text' => $text,
'genre' => $genresPool[\array_rand($genresPool)],
'views' => \mt_rand(0, 999999),
'tags' => $tags,
]);
}
$time = \microtime(true) - $start;
Console::info("Prepared 1000 documents in {$time} seconds");
$start = \microtime(true);
$database->createDocuments('articles', $documents, 1000);
$time = \microtime(true) - $start;
Console::success("Inserted 1000 documents in {$time} seconds");
}
function createDocuments(Database $database, int $count = 1000): void
{
global $namesPool, $genresPool, $tagsPool;
$documents = [];
$start = \microtime(true);
for ($i = 0; $i < $count; $i++) {
$length = \mt_rand(1000, 4000);
$bytes = \random_bytes(intdiv($length + 1, 2));
$text = \substr(\bin2hex($bytes), 0, $length);
$tagCount = \mt_rand(1, count($tagsPool));
$tagKeys = (array) \array_rand($tagsPool, $tagCount);
$tags = \array_map(fn ($k) => $tagsPool[$k], $tagKeys);
$documents[] = new Document([
'$permissions' => [
Permission::read(Role::any()),
...array_map(fn () => Permission::read(Role::user(mt_rand(0, 999999999))), range(1, 4)),
...array_map(fn () => Permission::create(Role::user(mt_rand(0, 999999999))), range(1, 3)),
...array_map(fn () => Permission::update(Role::user(mt_rand(0, 999999999))), range(1, 3)),
...array_map(fn () => Permission::delete(Role::user(mt_rand(0, 999999999))), range(1, 3)),
],
'author' => $namesPool[\array_rand($namesPool)],
'created' => DateTime::now(),
'text' => $text,
'genre' => $genresPool[\array_rand($genresPool)],
'views' => \mt_rand(0, 999999),
'tags' => $tags,
]);
}
$time = \microtime(true) - $start;
Console::info("Prepared {$count} documents in {$time} seconds");
$start = \microtime(true);
$database->createDocuments('articles', $documents, $count);
$time = \microtime(true) - $start;
Console::success("Inserted {$count} documents in {$time} seconds");
}
🤖 Prompt for AI Agents
In bin/tasks/load.php around lines 164 to 201, update createDocuments() to
accept an integer $count parameter (e.g., createDocuments(Database $database,
int $count)): replace the hardcoded 1000 with $count for the for-loop bound, for
preparing message text ("Prepared X documents") and the
database->createDocuments('articles', $documents, $count) call; ensure types are
updated and any callers of createDocuments() are adjusted to pass the desired
batch size.

Comment on lines +108 to +137
$pool = new PDOPool(
(new PDOConfig())
->withHost($cfg['host'])
->withPort($cfg['port'])
->withDbName($name)
->withCharset('utf8mb4')
->withUsername($cfg['user'])
->withPassword($cfg['pass']),
size: 64
);

$start = \microtime(true);

for ($i = 0; $i < $limit / 1000; $i++) {
go(function () use ($cfg, $pool, $name, $namespace, $sharedTables, $cache, $globalDocs) {
try {
$pdo = $pool->get();

$database = (new Database(new ($cfg['adapter'])($pdo), $cache))
->setDatabase($name)
->setNamespace($namespace)
->setSharedTables($sharedTables);

createRelationshipDocuments($database, $globalDocs['categories'], $globalDocs['users']);
$pool->put($pdo);
} catch (\Throwable $error) {
// Errors caught but documents still created successfully - likely concurrent update race conditions
}
});
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Instantiate the expected PDO wrapper inside each coroutine.

$pool->get() returns Swoole’s PDOProxy, but every adapter constructor is type-hinted to Utopia\Database\PDO. Passing the proxy raises a TypeError the moment the coroutine runs. Please create a proper Utopia\Database\PDO inside each task (or wrap the proxy) instead of leasing raw connections from the pool, and make sure exceptions are surfaced so failures aren’t silently swallowed.

-        $pdo = null;
-
-        $pool = new PDOPool(
-            (new PDOConfig())
-                ->withHost($cfg['host'])
-                ->withPort($cfg['port'])
-                ->withDbName($name)
-                ->withCharset('utf8mb4')
-                ->withUsername($cfg['user'])
-                ->withPassword($cfg['pass']),
-            size: 64
-        );
-
         $start = \microtime(true);
 
         for ($i = 0; $i < $limit / 1000; $i++) {
-            go(function () use ($cfg, $pool, $name, $namespace, $sharedTables, $cache, $globalDocs) {
-                try {
-                    $pdo = $pool->get();
-
-                    $database = (new Database(new ($cfg['adapter'])($pdo), $cache))
-                        ->setDatabase($name)
-                        ->setNamespace($namespace)
-                        ->setSharedTables($sharedTables);
-
-                    createRelationshipDocuments($database, $globalDocs['categories'], $globalDocs['users']);
-                    $pool->put($pdo);
-                } catch (\Throwable $error) {
-                    // Errors caught but documents still created successfully - likely concurrent update race conditions
-                }
-            });
+            go(function () use ($cfg, $name, $namespace, $sharedTables, $cache, $globalDocs) {
+                try {
+                    $pdo = new PDO(
+                        ($cfg['dsn'])($cfg['host'], $cfg['port']),
+                        $cfg['user'],
+                        $cfg['pass'],
+                        $cfg['attrs']
+                    );
+
+                    $database = (new Database(new ($cfg['adapter'])($pdo), $cache))
+                        ->setDatabase($name)
+                        ->setNamespace($namespace)
+                        ->setSharedTables($sharedTables);
+
+                    createRelationshipDocuments($database, $globalDocs['categories'], $globalDocs['users']);
+                } catch (\Throwable $error) {
+                    Console::error('Coroutine error: ' . $error->getMessage());
+                }
+            });
         }

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +24 to +26
<th v-for="set in results[0].data" :key="set.roles">
{{ set.roles }} {{ set.roles === 1 ? 'role' : 'roles' }}
</th>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Guard header rendering when results are empty.

When the results directory is empty, results[0] is undefined, so the header v-for throws at runtime. Please guard the access so the view can render with zero results.

-                <th v-for="set in results[0].data" :key="set.roles">
+                <th v-for="(set, roleIndex) in (results[0]?.data ?? [])" :key="set.roles ?? roleIndex">
                     {{ set.roles }} {{ set.roles === 1 ? 'role' : 'roles' }}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
<th v-for="set in results[0].data" :key="set.roles">
{{ set.roles }} {{ set.roles === 1 ? 'role' : 'roles' }}
</th>
<th v-for="(set, roleIndex) in (results[0]?.data ?? [])" :key="set.roles ?? roleIndex">
{{ set.roles }} {{ set.roles === 1 ? 'role' : 'roles' }}
</th>
🤖 Prompt for AI Agents
In bin/view/index.php around lines 24 to 26, the header v-for uses
results[0].data which throws when results is empty; change the template to guard
access by iterating over (results && results.length ? results[0].data : []) or
use the optional chaining fallback (results?.[0]?.data || []) so the v-for runs
over an empty array when there are no results, or wrap the header block in a
v-if="results && results.length" so headers only render when data exists.

Comment on lines +4044 to +4069
// Collect all fields to keep from select queries
$fieldsToKeep = [];
foreach ($selectQueries as $selectQuery) {
foreach ($selectQuery->getValues() as $value) {
$fieldsToKeep[] = $value;
}
}

$this->relationshipFetchDepth--;
\array_pop($this->relationshipFetchStack);
// Always preserve internal attributes
$internalKeys = array_map(fn ($attr) => $attr['$id'], self::getInternalAttributes());
$fieldsToKeep = array_merge($fieldsToKeep, $internalKeys);

// Early return if wildcard selector present
if (in_array('*', $fieldsToKeep)) {
return;
}

foreach ($relatedDocuments as $related) {
$related->removeAttribute($twoWayKey);
}
// Filter each document to only include selected fields
foreach ($documents as $doc) {
$allKeys = array_keys($doc->getArrayCopy());
foreach ($allKeys as $attrKey) {
// Keep if: explicitly selected OR is internal attribute
if (!in_array($attrKey, $fieldsToKeep) && !str_starts_with($attrKey, '$')) {
$doc->removeAttribute($attrKey);
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Fix select filtering so nested relationships keep their data

When a select like Query::select(['comments.author.name']) is processed, this block only keeps the literal author.name. The top-level author attribute is stripped from each comment before the next breadth-first level runs, so the nested author relationship never gets populated and serialized. This breaks any request that drills deeper than one level. Update the filter to retain the first segment of dotted paths (and deduplicate) so downstream fetches continue to see the relationship entry.

-        $fieldsToKeep = [];
-        foreach ($selectQueries as $selectQuery) {
-            foreach ($selectQuery->getValues() as $value) {
-                $fieldsToKeep[] = $value;
-            }
-        }
+        $fieldsToKeep = [];
+        foreach ($selectQueries as $selectQuery) {
+            foreach ($selectQuery->getValues() as $value) {
+                $fieldsToKeep[] = $value;
+                if (\str_contains($value, '.')) {
+                    $fieldsToKeep[] = \explode('.', $value, 2)[0];
+                }
+            }
+        }
+        $fieldsToKeep = \array_values(\array_unique($fieldsToKeep));
@@
-        $internalKeys = array_map(fn ($attr) => $attr['$id'], self::getInternalAttributes());
+        $internalKeys = array_map(fn ($attr) => $attr['$id'], $this->getInternalAttributes());
         $fieldsToKeep = array_merge($fieldsToKeep, $internalKeys);
 
-        // Early return if wildcard selector present
-        if (in_array('*', $fieldsToKeep)) {
+        // Early return if wildcard selector present
+        if (\in_array('*', $fieldsToKeep, true)) {
             return;
         }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Collect all fields to keep from select queries
$fieldsToKeep = [];
foreach ($selectQueries as $selectQuery) {
foreach ($selectQuery->getValues() as $value) {
$fieldsToKeep[] = $value;
}
}
$this->relationshipFetchDepth--;
\array_pop($this->relationshipFetchStack);
// Always preserve internal attributes
$internalKeys = array_map(fn ($attr) => $attr['$id'], self::getInternalAttributes());
$fieldsToKeep = array_merge($fieldsToKeep, $internalKeys);
// Early return if wildcard selector present
if (in_array('*', $fieldsToKeep)) {
return;
}
foreach ($relatedDocuments as $related) {
$related->removeAttribute($twoWayKey);
}
// Filter each document to only include selected fields
foreach ($documents as $doc) {
$allKeys = array_keys($doc->getArrayCopy());
foreach ($allKeys as $attrKey) {
// Keep if: explicitly selected OR is internal attribute
if (!in_array($attrKey, $fieldsToKeep) && !str_starts_with($attrKey, '$')) {
$doc->removeAttribute($attrKey);
}
}
// Collect all fields to keep from select queries
$fieldsToKeep = [];
foreach ($selectQueries as $selectQuery) {
foreach ($selectQuery->getValues() as $value) {
$fieldsToKeep[] = $value;
if (\str_contains($value, '.')) {
$fieldsToKeep[] = \explode('.', $value, 2)[0];
}
}
}
$fieldsToKeep = \array_values(\array_unique($fieldsToKeep));
// Always preserve internal attributes
$internalKeys = array_map(fn ($attr) => $attr['$id'], $this->getInternalAttributes());
$fieldsToKeep = array_merge($fieldsToKeep, $internalKeys);
// Early return if wildcard selector present
if (\in_array('*', $fieldsToKeep, true)) {
return;
}
// Filter each document to only include selected fields
foreach ($documents as $doc) {
$allKeys = array_keys($doc->getArrayCopy());
foreach ($allKeys as $attrKey) {
// Keep if: explicitly selected OR is internal attribute
if (!in_array($attrKey, $fieldsToKeep) && !str_starts_with($attrKey, '$')) {
$doc->removeAttribute($attrKey);
}
}
}
🤖 Prompt for AI Agents
In src/Database/Database.php around lines 4044 to 4069, the select-filtering
currently only preserves literal dotted paths like "author.name" and strips the
parent relationship ("author") so deeper nested fetches fail; update the logic
that builds $fieldsToKeep to, for every selected value that contains a dot, also
add its first segment (i.e., the substring before the first '.') to the keep
list, then deduplicate the list (e.g., array_unique + reindex) so you still
allow the early wildcard check and the subsequent per-document filter to retain
top-level relationship keys that downstream code expects.

@abnegate abnegate force-pushed the optimize-relationship-performance branch from ef5bf42 to cefe7c6 Compare October 2, 2025 11:47
@abnegate abnegate closed this Oct 2, 2025
@abnegate abnegate deleted the optimize-relationship-performance branch October 2, 2025 12:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants