-
Notifications
You must be signed in to change notification settings - Fork 84
indexes for duplicate detection #7095
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub. 2 Skipped Deployments
|
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #7095 +/- ##
=======================================
Coverage 87.31% 87.31%
=======================================
Files 532 532
Lines 34969 34970 +1
Branches 4048 4048
=======================================
+ Hits 30532 30533 +1
Misses 3560 3560
Partials 877 877 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Greptile OverviewGreptile SummaryThis PR addresses performance bottlenecks in Data Subject Request (DSR) creation by optimizing duplicate detection queries. The changes add two composite database indexes to improve query performance: The PR also removes a redundant duplicate detection call from Important Files Changed
Confidence score: 4/5
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
6 files reviewed, 2 comments
...ides/api/alembic/migrations/versions/xx_2025_12_09_2055_a7241db3ee6a_add_identity_indexes.py
Outdated
Show resolved
Hide resolved
...ides/api/alembic/migrations/versions/xx_2025_12_09_2055_a7241db3ee6a_add_identity_indexes.py
Show resolved
Hide resolved
…7241db3ee6a_add_identity_indexes.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
|
@greptile please review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
6 files reviewed, no comments
| "require_manual_request_approval", | ||
| "postgres_example_test_dataset_config", | ||
| ) | ||
| @patch( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test was flaking - hoping this solves the 🥐
adamsachs
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good to me! just a nit on a design choice in the migration that i don't consider blocking.
| branch_labels = None | ||
| depends_on = None | ||
|
|
||
| from fides.api.migrations.post_upgrade_index_creation import INDEX_ROW_COUNT_THRESHOLD |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: i'm generally not so much in favor of a migration referencing variables/values defined outside of that migration, since it means the migration is no longer a frozen artifact in time -- i.e. if we were to ever redefine this threshold variable in the other module, then the migration behavior would technically change at that point. generally, i find it's much easier to reason about migrations as 'frozen', and i do think that's a general best practice.
obviously in this case it's hard to imagine this being problematic, in practice - so i don't have any major qualms. just wanted to mention that i'm not sure this is a great pattern. IMO, the DRYness that we're achieving here isn't really worth it - for a migration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like your perspective! Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated :)
Co-authored-by: Jade Wibbels <jade@ethyca.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Ticket ENG-2200
Description Of Changes
There is some slowness on DSR creation - looking into potential bottlenecks. Located some queries related to duplicate detection which are referencing non indexed identity cols. This PR adds indexes to those columns.
Code Changes
Steps to Confirm
Pre-Merge Checklist
CHANGELOG.mdupdatedmaindowngrade()migration is correct and works