-
Notifications
You must be signed in to change notification settings - Fork 3
Revert "Optimize reports fetching" #93
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughThe changes refactor internal logic for resource reporting and data fetching in the Appwrite migration source. Pagination and caching are removed in favor of direct API calls with default limits. Resource counts, such as users, teams, buckets, files, functions, and database tables, are now retrieved through simplified queries and explicit per-item API calls where needed. Changes
Sequence Diagram(s)sequenceDiagram
participant MigrationSource
participant AppwriteAPI
MigrationSource->>AppwriteAPI: listUsers()
MigrationSource->>AppwriteAPI: listTeams()
loop for each team
MigrationSource->>AppwriteAPI: listMemberships(teamId)
end
MigrationSource->>AppwriteAPI: listBuckets()
loop for each bucket (paginated)
MigrationSource->>AppwriteAPI: listFiles(bucketId, limit)
loop for each file
MigrationSource->>AppwriteAPI: getFile(bucketId, fileId)
end
end
MigrationSource->>AppwriteAPI: listFunctions()
loop for each function
MigrationSource->>AppwriteAPI: listDeployments(functionId)
MigrationSource->>AppwriteAPI: listVariables(functionId)
end
MigrationSource->>AppwriteAPI: listDatabases()
loop for each database
MigrationSource->>AppwriteAPI: listCollections(databaseId)
loop for each collection
MigrationSource->>AppwriteAPI: listDocuments(databaseId, collectionId)
MigrationSource->>AppwriteAPI: listAttributes(databaseId, collectionId)
MigrationSource->>AppwriteAPI: listIndexes(databaseId, collectionId)
end
end
Estimated code review effort3 (~45 minutes) Possibly related PRs
Poem
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
🔭 Outside diff range comments (1)
src/Migration/Sources/Appwrite.php (1)
212-218: Performance concern: Fetching all teams to count membershipsThis approach fetches all teams first, then makes individual API calls for each team to count memberships. This is inefficient and won't scale with many teams.
Consider implementing pagination for both teams and memberships to handle large datasets efficiently.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/Migration/Sources/Appwrite.php(5 hunks)src/Migration/Sources/Appwrite/Reader/API.php(1 hunks)
🧠 Learnings (3)
📓 Common learnings
Learnt from: ItzNotABug
PR: utopia-php/migration#80
File: src/Migration/Sources/Appwrite/Reader/API.php:8-8
Timestamp: 2025-06-28T09:47:11.436Z
Learning: In the Appwrite migration codebase, commented-out Tables service references (import statements and constructor parameters) are intentionally kept for future implementation when the Tables service becomes available in the Appwrite SDK, rather than being dead code that should be removed.
Learnt from: ItzNotABug
PR: utopia-php/migration#89
File: src/Migration/Sources/Appwrite/Reader/API.php:64-84
Timestamp: 2025-07-19T08:29:22.290Z
Learning: In the Appwrite API, the default page limit for listing collections is 25 records, so when using cursor-based pagination with Query::cursorAfter(), there's no need to explicitly specify Query::limit(25) as the API will default to this limit.
Learnt from: ItzNotABug
PR: utopia-php/migration#80
File: src/Migration/Sources/Supabase.php:300-308
Timestamp: 2025-06-28T09:47:58.757Z
Learning: In the utopia-php/migration codebase, during the terminology swap from Collection/Attribute/Document to Table/Column/Row, the user ItzNotABug prefers to keep the existing query logic unchanged even if it becomes semantically incorrect with the new naming. The focus is purely on resource type renaming, not on fixing logical issues that become apparent after the terminology change.
Learnt from: ItzNotABug
PR: utopia-php/migration#80
File: src/Migration/Sources/Appwrite.php:843-851
Timestamp: 2025-06-28T09:47:08.333Z
Learning: In the utopia-php/migration codebase, during the terminology swap from Collection/Attribute/Document to Table/Column/Row, the class constructors and method parameters use the new terminology (like "relatedTable"), but the underlying data structures and API responses still use the legacy keys (like "relatedCollection"). This is an intentional design pattern to allow gradual migration while maintaining compatibility with existing data sources.
src/Migration/Sources/Appwrite.php (3)
Learnt from: ItzNotABug
PR: #80
File: src/Migration/Sources/Supabase.php:300-308
Timestamp: 2025-06-28T09:47:58.757Z
Learning: In the utopia-php/migration codebase, during the terminology swap from Collection/Attribute/Document to Table/Column/Row, the user ItzNotABug prefers to keep the existing query logic unchanged even if it becomes semantically incorrect with the new naming. The focus is purely on resource type renaming, not on fixing logical issues that become apparent after the terminology change.
Learnt from: ItzNotABug
PR: #80
File: src/Migration/Sources/Appwrite/Reader/API.php:8-8
Timestamp: 2025-06-28T09:47:11.436Z
Learning: In the Appwrite migration codebase, commented-out Tables service references (import statements and constructor parameters) are intentionally kept for future implementation when the Tables service becomes available in the Appwrite SDK, rather than being dead code that should be removed.
Learnt from: ItzNotABug
PR: #80
File: src/Migration/Sources/Appwrite.php:843-851
Timestamp: 2025-06-28T09:47:08.333Z
Learning: In the utopia-php/migration codebase, during the terminology swap from Collection/Attribute/Document to Table/Column/Row, the class constructors and method parameters use the new terminology (like "relatedTable"), but the underlying data structures and API responses still use the legacy keys (like "relatedCollection"). This is an intentional design pattern to allow gradual migration while maintaining compatibility with existing data sources.
src/Migration/Sources/Appwrite/Reader/API.php (5)
Learnt from: ItzNotABug
PR: #80
File: src/Migration/Sources/Appwrite.php:843-851
Timestamp: 2025-06-28T09:47:08.333Z
Learning: In the utopia-php/migration codebase, during the terminology swap from Collection/Attribute/Document to Table/Column/Row, the class constructors and method parameters use the new terminology (like "relatedTable"), but the underlying data structures and API responses still use the legacy keys (like "relatedCollection"). This is an intentional design pattern to allow gradual migration while maintaining compatibility with existing data sources.
Learnt from: ItzNotABug
PR: #80
File: src/Migration/Sources/Supabase.php:300-308
Timestamp: 2025-06-28T09:47:58.757Z
Learning: In the utopia-php/migration codebase, during the terminology swap from Collection/Attribute/Document to Table/Column/Row, the user ItzNotABug prefers to keep the existing query logic unchanged even if it becomes semantically incorrect with the new naming. The focus is purely on resource type renaming, not on fixing logical issues that become apparent after the terminology change.
Learnt from: ItzNotABug
PR: #80
File: src/Migration/Sources/Appwrite/Reader/API.php:8-8
Timestamp: 2025-06-28T09:47:11.436Z
Learning: In the Appwrite migration codebase, commented-out Tables service references (import statements and constructor parameters) are intentionally kept for future implementation when the Tables service becomes available in the Appwrite SDK, rather than being dead code that should be removed.
Learnt from: ItzNotABug
PR: #89
File: src/Migration/Sources/Appwrite/Reader/API.php:64-84
Timestamp: 2025-07-19T08:29:22.290Z
Learning: In the Appwrite API, the default page limit for listing collections is 25 records, so when using cursor-based pagination with Query::cursorAfter(), there's no need to explicitly specify Query::limit(25) as the API will default to this limit.
Learnt from: ItzNotABug
PR: #80
File: src/Migration/Resources/Database/Columns/Relationship.php:86-89
Timestamp: 2025-06-28T09:45:57.650Z
Learning: In the utopia-php/migration codebase Relationship column class, the getRelatedTable() method intentionally returns $this->options['relatedCollection'] (not relatedTable) because the underlying API still uses "collection" terminology, even though the internal codebase has been refactored to use "table" terminology.
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: ItzNotABug
PR: utopia-php/migration#80
File: src/Migration/Sources/Appwrite/Reader/API.php:8-8
Timestamp: 2025-06-28T09:47:11.436Z
Learning: In the Appwrite migration codebase, commented-out Tables service references (import statements and constructor parameters) are intentionally kept for future implementation when the Tables service becomes available in the Appwrite SDK, rather than being dead code that should be removed.
Learnt from: ItzNotABug
PR: utopia-php/migration#89
File: src/Migration/Sources/Appwrite/Reader/API.php:64-84
Timestamp: 2025-07-19T08:29:22.290Z
Learning: In the Appwrite API, the default page limit for listing collections is 25 records, so when using cursor-based pagination with Query::cursorAfter(), there's no need to explicitly specify Query::limit(25) as the API will default to this limit.
Learnt from: ItzNotABug
PR: utopia-php/migration#80
File: src/Migration/Sources/Supabase.php:300-308
Timestamp: 2025-06-28T09:47:58.757Z
Learning: In the utopia-php/migration codebase, during the terminology swap from Collection/Attribute/Document to Table/Column/Row, the user ItzNotABug prefers to keep the existing query logic unchanged even if it becomes semantically incorrect with the new naming. The focus is purely on resource type renaming, not on fixing logical issues that become apparent after the terminology change.
Learnt from: ItzNotABug
PR: utopia-php/migration#80
File: src/Migration/Sources/Appwrite.php:843-851
Timestamp: 2025-06-28T09:47:08.333Z
Learning: In the utopia-php/migration codebase, during the terminology swap from Collection/Attribute/Document to Table/Column/Row, the class constructors and method parameters use the new terminology (like "relatedTable"), but the underlying data structures and API responses still use the legacy keys (like "relatedCollection"). This is an intentional design pattern to allow gradual migration while maintaining compatibility with existing data sources.
src/Migration/Sources/Appwrite.php (3)
Learnt from: ItzNotABug
PR: #80
File: src/Migration/Sources/Supabase.php:300-308
Timestamp: 2025-06-28T09:47:58.757Z
Learning: In the utopia-php/migration codebase, during the terminology swap from Collection/Attribute/Document to Table/Column/Row, the user ItzNotABug prefers to keep the existing query logic unchanged even if it becomes semantically incorrect with the new naming. The focus is purely on resource type renaming, not on fixing logical issues that become apparent after the terminology change.
Learnt from: ItzNotABug
PR: #80
File: src/Migration/Sources/Appwrite/Reader/API.php:8-8
Timestamp: 2025-06-28T09:47:11.436Z
Learning: In the Appwrite migration codebase, commented-out Tables service references (import statements and constructor parameters) are intentionally kept for future implementation when the Tables service becomes available in the Appwrite SDK, rather than being dead code that should be removed.
Learnt from: ItzNotABug
PR: #80
File: src/Migration/Sources/Appwrite.php:843-851
Timestamp: 2025-06-28T09:47:08.333Z
Learning: In the utopia-php/migration codebase, during the terminology swap from Collection/Attribute/Document to Table/Column/Row, the class constructors and method parameters use the new terminology (like "relatedTable"), but the underlying data structures and API responses still use the legacy keys (like "relatedCollection"). This is an intentional design pattern to allow gradual migration while maintaining compatibility with existing data sources.
src/Migration/Sources/Appwrite/Reader/API.php (5)
Learnt from: ItzNotABug
PR: #80
File: src/Migration/Sources/Appwrite.php:843-851
Timestamp: 2025-06-28T09:47:08.333Z
Learning: In the utopia-php/migration codebase, during the terminology swap from Collection/Attribute/Document to Table/Column/Row, the class constructors and method parameters use the new terminology (like "relatedTable"), but the underlying data structures and API responses still use the legacy keys (like "relatedCollection"). This is an intentional design pattern to allow gradual migration while maintaining compatibility with existing data sources.
Learnt from: ItzNotABug
PR: #80
File: src/Migration/Sources/Supabase.php:300-308
Timestamp: 2025-06-28T09:47:58.757Z
Learning: In the utopia-php/migration codebase, during the terminology swap from Collection/Attribute/Document to Table/Column/Row, the user ItzNotABug prefers to keep the existing query logic unchanged even if it becomes semantically incorrect with the new naming. The focus is purely on resource type renaming, not on fixing logical issues that become apparent after the terminology change.
Learnt from: ItzNotABug
PR: #80
File: src/Migration/Sources/Appwrite/Reader/API.php:8-8
Timestamp: 2025-06-28T09:47:11.436Z
Learning: In the Appwrite migration codebase, commented-out Tables service references (import statements and constructor parameters) are intentionally kept for future implementation when the Tables service becomes available in the Appwrite SDK, rather than being dead code that should be removed.
Learnt from: ItzNotABug
PR: #89
File: src/Migration/Sources/Appwrite/Reader/API.php:64-84
Timestamp: 2025-07-19T08:29:22.290Z
Learning: In the Appwrite API, the default page limit for listing collections is 25 records, so when using cursor-based pagination with Query::cursorAfter(), there's no need to explicitly specify Query::limit(25) as the API will default to this limit.
Learnt from: ItzNotABug
PR: #80
File: src/Migration/Resources/Database/Columns/Relationship.php:86-89
Timestamp: 2025-06-28T09:45:57.650Z
Learning: In the utopia-php/migration codebase Relationship column class, the getRelatedTable() method intentionally returns $this->options['relatedCollection'] (not relatedTable) because the underlying API still uses "collection" terminology, even though the internal codebase has been refactored to use "table" terminology.
🔇 Additional comments (1)
src/Migration/Sources/Appwrite.php (1)
1300-1300: Please confirm the correct field name for function deploymentsI wasn’t able to find any other references to either
deploymentordeploymentIdin the SDK or service definitions—this migration file is the only place that usesdeploymentId. Before merging, double-check the Appwrite Functions API response to ensure it truly returns adeploymentIdkey (and notdeployment). If the API still usesdeployment, update this line back to:$function['deployment'],so the migration won’t break at runtime.
| $teamList = $this->teams->list([Query::limit(1)]); | ||
| $teams = ['total' => $teamList['total'], 'teams' => []]; | ||
| } | ||
| $report[Resource::TYPE_USER] = $this->users->list()['total']; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Critical: Removing pagination for users will fail with large user bases
This change removes pagination when fetching users. If there are thousands or millions of users, this single API call will likely timeout or exceed memory limits.
Consider keeping the pagination logic or at least document the limitations of this approach.
🤖 Prompt for AI Agents
In src/Migration/Sources/Appwrite.php at line 203, the code fetches the total
user count without pagination, which can cause timeouts or memory issues with
large user bases. To fix this, restore the pagination logic to fetch users in
manageable batches or, if only the total count is needed, use an API method that
returns just the user count without retrieving all user data. Alternatively, add
documentation clearly stating the limitations of this approach when handling
large datasets.
| $lastBucket | ||
| ? [Query::cursorAfter($lastBucket)] | ||
| : [Query::limit($pageLimit)] | ||
| : [Query::limit(20)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Hardcoded limit of 20 seems arbitrary
The pagination limit is hardcoded to 20, which is smaller than the Appwrite API default of 25. This will result in more API calls than necessary.
Consider using the API's default limit or making this configurable:
- : [Query::limit(20)]
+ : [Query::limit(25)]Also applies to: 272-272
🤖 Prompt for AI Agents
In src/Migration/Sources/Appwrite.php at lines 252 and 272, the pagination limit
is hardcoded to 20, which is below the Appwrite API default of 25 and causes
unnecessary extra API calls. Update the code to use the API's default limit of
25 instead of 20, or better, make the limit configurable via a variable or
constant so it can be adjusted without code changes.
| foreach ($files as $file) { | ||
| $report['size'] += $this->storage->getFile( | ||
| $bucket['$id'], | ||
| $file['$id'] | ||
| )['sizeOriginal']; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Severe performance issue: Individual API calls for each file
Making a separate API call to get the size of each file is extremely inefficient. For buckets with thousands of files, this will result in thousands of API calls and likely hit rate limits.
This approach will not scale. Consider:
- Using the file size from the list response if available
- Implementing batch operations
- Adding progress tracking and error handling for large datasets
🤖 Prompt for AI Agents
In src/Migration/Sources/Appwrite.php around lines 284 to 289, the current code
makes an individual API call for each file to get its size, causing severe
performance issues and potential rate limiting. To fix this, modify the code to
use the file size information directly from the initial list response if it
includes size data, or implement batch API calls to retrieve sizes for multiple
files at once. Additionally, add progress tracking and error handling to manage
large datasets efficiently and avoid failures during migration.
| /* $tablesResponse = $this->tables->list(...); */ | ||
| $tablesResponse = $this->database->listCollections($databaseId); | ||
| $tables = $tablesResponse['collections']; | ||
|
|
||
| if (Resource::isSupported(Resource::TYPE_TABLE, $resources)) { | ||
| $report[Resource::TYPE_TABLE] += count($tables); | ||
| $report[Resource::TYPE_TABLE] += $tablesResponse['total']; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Performance concern: Removing pagination may cause issues with large datasets
The removal of pagination for table fetching could potentially cause memory issues or API timeouts when dealing with databases that have many tables. The previous implementation used cursor-based pagination which is more scalable.
Without pagination, if a database has hundreds or thousands of tables, this single API call might fail or consume excessive memory.
🤖 Prompt for AI Agents
In src/Migration/Sources/Appwrite/Reader/API.php around lines 64 to 70, the
current code fetches all tables without pagination, which can cause memory or
timeout issues with large datasets. To fix this, reintroduce cursor-based
pagination by repeatedly calling listCollections with a cursor parameter until
all tables are retrieved, accumulating results incrementally instead of fetching
all at once.
| if (Resource::isSupported(Resource::TYPE_COLUMN, $resources)) { | ||
| /* $columnsResponse = $this->tables->listColumns(...); */ | ||
| $columnsResponse = $this->database->listAttributes($databaseId, $tableId); | ||
| $report[Resource::TYPE_COLUMN] += $columnsResponse['total']; | ||
| } | ||
|
|
||
| if (in_array(Resource::TYPE_INDEX, $resources)) { | ||
| /* $indexesResponse = $this->tables->listIndexes(...); */ | ||
| $indexesResponse = $this->database->listIndexes($databaseId, $tableId); | ||
| $report[Resource::TYPE_INDEX] += $indexesResponse['total']; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Performance impact: Multiple API calls instead of using embedded metadata
This change shifts from using embedded table metadata to making separate API calls for each table to get column and index counts. For a database with N tables, this results in 2N additional API calls (one for columns, one for indexes per table).
This could significantly impact performance and API rate limits when dealing with many tables.
Consider implementing a caching mechanism or batch API calls if the embedded metadata approach had accuracy issues that necessitated this change.
🤖 Prompt for AI Agents
In src/Migration/Sources/Appwrite/Reader/API.php around lines 86 to 96, the
current code makes separate API calls for each table to fetch column and index
counts, causing 2N additional calls for N tables and impacting performance. To
fix this, revert to using embedded table metadata for columns and indexes counts
if it was accurate, or implement a caching mechanism or batch API calls to
reduce the number of requests. This will minimize API calls and improve
efficiency when processing many tables.
Reverts #89
Summary by CodeRabbit