@JadeCara
Contributor

Ticket ENG-1404

Description Of Changes

🎯 Fides must give privacy admins a way to quickly identify, and surface in the request manager UI, privacy requests that are likely duplicates submitted by the same user over a period of time.

This PR adds a table for managing duplicate group ids. If an admin updates the grouping config, we don't want duplicates that were previously found to suddenly be marked as not duplicates. We also want to know which duplicates were found under which configuration.

Example:

# Initial Config
DuplicateDetectionSettings(
    enabled=True,
    time_window_days=30,
    match_identity_fields=["email"],
)

If we change the time window or the identity fields to match, each new configuration should produce its own unique group id. Instead of looking up which group a duplicate belongs to, we can also deterministically compute the group id from the rules and the hashed values.
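
A minimal sketch of how a deterministic group id could be derived, not the implementation in this PR. The `DuplicateDetectionSettings` dataclass below is a stand-in with the same fields as the example above; `rule_version`, `group_id`, and `GROUP_NAMESPACE` are hypothetical names, and the choice of SHA-256 plus `uuid5` is an assumption.

```python
import hashlib
import json
import uuid
from dataclasses import dataclass

# Hypothetical fixed namespace so uuid5 yields the same id on every run.
GROUP_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "duplicate-group.example")


@dataclass(frozen=True)
class DuplicateDetectionSettings:
    """Stand-in for the settings object shown above (assumed shape)."""
    enabled: bool = True
    time_window_days: int = 30
    match_identity_fields: tuple = ("email",)


def rule_version(settings: DuplicateDetectionSettings) -> str:
    """Hash only the settings that affect grouping, in a canonical order."""
    canonical = json.dumps(
        {
            "time_window_days": settings.time_window_days,
            "match_identity_fields": sorted(settings.match_identity_fields),
        },
        sort_keys=True,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def group_id(settings: DuplicateDetectionSettings, hashed_identities: dict) -> uuid.UUID:
    """Derive a stable group id from the rule hash and the hashed identity values."""
    values = "|".join(
        f"{name}={hashed_identities[name]}"
        for name in sorted(settings.match_identity_fields)
    )
    return uuid.uuid5(GROUP_NAMESPACE, f"{rule_version(settings)}:{values}")


# The same rules and the same hashed values always give the same id;
# changing the time window gives a different id for the same user.
hashed = {"email": hashlib.sha256(b"user@example.com").hexdigest()}
initial = DuplicateDetectionSettings(time_window_days=30)
changed = DuplicateDetectionSettings(time_window_days=60)
same_rules_match = group_id(initial, hashed) == group_id(initial, hashed)
new_rules_differ = group_id(initial, hashed) != group_id(changed, hashed)
```

Because the id is a pure function of the rules and the hashed values, no lookup is needed to find the group for a request, and groups found under an older config keep their ids.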

Code Changes

  • Added DuplicateGroup and associated helper functions
  • Alembic migrations
  • Updated PrivacyRequest relationships
  • Added tests
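
To illustrate why groups are stored rather than recomputed, here is a rough sketch of a get-or-create helper over a dict-backed stand-in for the table. It does not reflect the actual `DuplicateGroup` model: the row shape, the `rule_version` and `dedup_key` fields, and the `InMemoryDuplicateGroups` class are all hypothetical.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class DuplicateGroupRow:
    """Hypothetical row shape; the real DuplicateGroup columns may differ."""
    id: uuid.UUID
    rule_version: str  # identifies the config the group was found under
    dedup_key: str     # hash of the matched identity values


class InMemoryDuplicateGroups:
    """Dict-backed stand-in for the table, keyed by (rule_version, dedup_key)."""

    def __init__(self) -> None:
        self._rows: Dict[Tuple[str, str], DuplicateGroupRow] = {}

    def get_or_create(self, rule_version: str, dedup_key: str) -> DuplicateGroupRow:
        """Return the existing group for this rule and key, or create one."""
        key = (rule_version, dedup_key)
        if key not in self._rows:
            self._rows[key] = DuplicateGroupRow(uuid.uuid4(), rule_version, dedup_key)
        return self._rows[key]

    def groups_for_rule(self, rule_version: str) -> List[DuplicateGroupRow]:
        """List the groups that were found under one configuration."""
        return [row for row in self._rows.values() if row.rule_version == rule_version]


store = InMemoryDuplicateGroups()
old_group = store.get_or_create("rules-v1", "abc123")
# A config change creates a new group and leaves the old one in place.
new_group = store.get_or_create("rules-v2", "abc123")
```

In this sketch a config change only ever adds rows, so duplicates found under an earlier configuration stay attached to their original group.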

Steps to Confirm

  1. Tests should pass. This is not hooked into anything yet, but it is used in "Eng 1404 be implement ability to sort filter duplicate dsr mark duplicates" (#6851) to assign privacy requests to the correct group id. Those tests should also pass.
  2. This all runs downstream to "[ENG-1404] Duplicate DSR - runner integration" (#6860), which has additional testing steps.

Pre-Merge Checklist

  • Issue requirements met
  • All CI pipelines succeeded
  • CHANGELOG.md updated
    • Add a db-migration label to the entry if your change includes a DB migration
    • Add a high-risk label to the entry if your change includes a high-risk change (i.e. potential for performance impact or unexpected regression) that should be flagged
    • Updates unreleased work already in Changelog, no new entry necessary
  • UX feedback:
    • All UX related changes have been reviewed by a designer
    • No UX review needed
  • Followup issues:
    • Followup issues created
    • No followup issues
  • Database migrations:
    • Ensure that your downrev is up to date with the latest revision on main
    • Ensure that your downgrade() migration is correct and works
      • If a downgrade migration is not possible for this change, please call this out in the PR description!
    • No migrations
  • Documentation:
    • Documentation complete, PR opened in fidesdocs
    • Documentation issue created in fidesdocs
    • If there are any new client scopes created as part of the pull request, remember to update public-facing documentation that references our scope registry
    • No documentation updates required

Jade Wibbels and others added 30 commits October 21, 2025 15:53
…dsr-records-model' into duplicate_group-ENG-1404-be-implement-ability-to-sort-filter-duplicate-dsr-records
@JadeCara JadeCara mentioned this pull request Oct 29, 2025
18 tasks
@codecov

codecov bot commented Oct 29, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87.41%. Comparing base (a2d99e7) to head (1c0fbd7).
⚠️ Report is 5 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6881      +/-   ##
==========================================
+ Coverage   87.40%   87.41%   +0.01%     
==========================================
  Files         521      522       +1     
  Lines       33944    33981      +37     
  Branches     3899     3900       +1     
==========================================
+ Hits        29668    29705      +37     
  Misses       3419     3419              
  Partials      857      857              

Base automatically changed from ENG-1404-be-implement-ability-to-sort-filter-duplicate-dsr-records-model to main October 29, 2025 22:03
@JadeCara JadeCara enabled auto-merge October 30, 2025 21:01
@JadeCara JadeCara added the db-migration label Oct 30, 2025
@JadeCara JadeCara added this pull request to the merge queue Oct 30, 2025
Merged via the queue into main with commit 77b9a64 Oct 30, 2025
68 checks passed
@JadeCara JadeCara deleted the duplicate_group-ENG-1404-be-implement-ability-to-sort-filter-duplicate-dsr-records branch October 30, 2025 22:25
adamsachs pushed a commit that referenced this pull request Nov 3, 2025
Co-authored-by: Jade Wibbels <jade@ethyca.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

Labels

db-migration This indicates that a change includes a database migration

3 participants