Skip to content

[MS-973] Fix sorting logic in MatchResultSet to handle equal confidence scores#1170

Merged
meladRaouf merged 1 commit into
mainfrom
MS-973-Duplicate-Confidence-Scores
May 6, 2025
Merged

[MS-973] Fix sorting logic in MatchResultSet to handle equal confidence scores#1170
meladRaouf merged 1 commit into
mainfrom
MS-973-Duplicate-Confidence-Scores

Conversation

@meladRaouf
Copy link
Copy Markdown
Collaborator

@meladRaouf meladRaouf commented Apr 15, 2025

JIRA ticket
Will be released in: 2025.2.0

Root cause analysis (for bugfixes only)

When two subjects have the same confidence score during matching, only one is returned in the identification response. Please refer to the ticket for more details.

First known affected version: All recent versions

  • Attached two videos one for the problem and the fix

Notable changes

  • Improve sorting logic in MatchResultSet to handle equal confidence scores by subject ID

Testing guidance

  • Use the project cGKXBnPkLGKtK0V2TUxo, which contains many duplicate ROCv3 and simatcher subjects, and verify that subjects with the same confidence score are all included in the identification results.

Additional work checklist

  • Effect on other features and security has been considered
  • Design document marked as "In development" (if applicable)
  • External (Gitbook) and internal (Confluence) Documentation is up to date (or ticket created)
  • Test cases in Testiny are up to date (or ticket created)
  • Other teams notified about the changes (if applicable)
no.duplicates.mp4
allow.duplicates.mp4

@cla-bot cla-bot Bot added the ... label Apr 15, 2025
@meladRaouf meladRaouf marked this pull request as ready for review April 15, 2025 11:59
@meladRaouf meladRaouf requested review from a team, BurningAXE, TristramN, alex-vt, alexandr-simprints, luhmirin-s and ybourgery and removed request for a team April 15, 2025 11:59
@meladRaouf meladRaouf force-pushed the MS-973-Duplicate-Confidence-Scores branch from 5b35568 to 923b091 Compare April 15, 2025 14:03
@sonarqubecloud
Copy link
Copy Markdown

)

fun add(element: T): MatchResultSet<T> {
if (lowestConfidence > element.confidence) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, am I reading this wrong or if we first add a high confidence score (say 99), then we would not add any lower results and return a list of just 1 result? Is this the behaviour that we want?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, lowestConfidence is updated only when the set is full. That's confusing...

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like if confidence and subjectId are now a composite comparison key, so should work as intended.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we are comparing confidence only. The confusion comes from the fact that lowestConfidence is initialized with 0 and updated only when the set is full.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a classic LeetCode problem 🥇. If we only need the top N elements, fully sorting the entire list is unnecessary and can be avoided to enhance performance.

)

fun add(element: T): MatchResultSet<T> {
if (lowestConfidence > element.confidence) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like if confidence and subjectId are now a composite comparison key, so should work as intended.

@meladRaouf meladRaouf merged commit cd8b9ea into main May 6, 2025
12 checks passed
@meladRaouf meladRaouf deleted the MS-973-Duplicate-Confidence-Scores branch May 6, 2025 08:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants