Skip to content

[MS-949] Introduce Concurrency for Face and Fingerprint Matching#1169

Merged
meladRaouf merged 1 commit into
mainfrom
feature/matching-optmization
May 28, 2025
Merged

[MS-949] Introduce Concurrency for Face and Fingerprint Matching#1169
meladRaouf merged 1 commit into
mainfrom
feature/matching-optmization

Conversation

@meladRaouf
Copy link
Copy Markdown
Collaborator

@meladRaouf meladRaouf commented Apr 15, 2025

JIRA ticket
Will be released in: 2025.2.0

Notable changes

  • Implement producer-consumer pattern using coroutines and channels to process candidate batches concurrently.
  • Configure the number of consumer coroutines based on the available processors to maximize efficiency.
  • Refactor FaceMatcherUseCase and FingerprintMatcherUseCase to utilize the new concurrent matching approach.
  • Update tests to accommodate the changes in the matching logic.

Testing guidance

Evaluate verification and identification using face and fingerprint modalities with both small and large candidate sets for matching.

Additional work checklist

  • Effect on other features and security has been considered
  • Design document marked as "In development" (if applicable)
  • External (Gitbook) and internal (Confluence) Documentation is up to date (or ticket created)
  • Test cases in Testiny are up to date (or ticket created)
  • Other teams notified about the changes (if applicable)

@cla-bot cla-bot Bot added the ... label Apr 15, 2025
@meladRaouf meladRaouf marked this pull request as ready for review April 16, 2025 12:55
@meladRaouf meladRaouf requested review from a team, TristramN, alex-vt, alexandr-simprints, luhmirin-s and ybourgery and removed request for a team April 16, 2025 12:55
@BurningAXE
Copy link
Copy Markdown
Contributor

I tested this with CoSync and found that it actually leads to a worse performance since reading in CommCare seems to be CPU-bound. While I'm thinking of ways to handle this, can we hold off merging this PR?

@meladRaouf meladRaouf force-pushed the feature/matching-optmization branch 2 times, most recently from 8c85427 to f61b468 Compare May 8, 2025 10:36
@meladRaouf meladRaouf force-pushed the feature/matching-optmization branch from f61b468 to 1a8f827 Compare May 10, 2025 08:45
@BurningAXE BurningAXE requested a review from Copilot May 19, 2025 15:44
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces concurrent processing for loading and matching face and fingerprint identities using Kotlin coroutines and channels.

  • Adds a generic loadIdentitiesConcurrently helper to spread batch requests across multiple coroutines.
  • Refactors local and CommCare data sources and the repository to return ReceiveChannel<List<T>> instead of lists.
  • Updates matcher use cases and their tests to consume identity batches concurrently.

Reviewed Changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
src/main/java/.../IdentityDataSource.kt Added loadIdentitiesConcurrently for parallel loading
src/main/java/.../local/EnrolmentRecordLocalDataSourceImpl.kt Refactored to use concurrent batch loaders
src/main/java/.../commcare/CommCareIdentityDataSource.kt Wrapped existing loaders in concurrent channels
src/main/java/.../EnrolmentRecordRepositoryImpl.kt Forwarded concurrent loaders through the repository
src/main/java/.../matcher/usecases/FingerprintMatcherUseCase.kt Consumes identity channel and matches in parallel
src/main/java/.../matcher/usecases/FaceMatcherUseCase.kt Consumes identity channel and matches in parallel
Tests (multiple files under src/test/...) Updated to collect from channels and use createTestChannel
Comments suppressed due to low confidence (1)

infra/enrolment-records/repository/src/main/java/com/simprints/infra/enrolment/records/repository/commcare/CommCareIdentityDataSource.kt:260

  • [nitpick] The parallelism field is public by default. Consider marking it private or moving it into a companion object constant to clarify its intended scope.
val parallelism = Runtime.getRuntime().availableProcessors()

Comment on lines +42 to +44
// This var is not thread safe
var loadedCandidates = 0

Copy link

Copilot AI May 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

loadedCandidates is a mutable class-level variable and not thread-safe if invoke is called concurrently. Consider making it a local variable inside the invoke method.

Suggested change
// This var is not thread safe
var loadedCandidates = 0

Copilot uses AI. Check for mistakes.
// However, when using CommCare as data source, loadedCandidates < expectedCandidates
// as it's count function does not take into account filtering criteria
// This var is not thread safe
var loadedCandidates = 0
Copy link

Copilot AI May 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

loadedCandidates is a mutable class-level variable and not thread-safe. Move this counter inside invoke to avoid data races when matching runs in parallel.

Copilot uses AI. Check for mistakes.

/**
* Loads identities concurrently using the provided dispatcher and parallelism level.
*
Copy link

Copilot AI May 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The KDoc for loadIdentitiesConcurrently is incomplete. Expand it to explain parameters, returned channel behavior, error handling, and cancellation semantics.

Suggested change
*
*
* @param ranges A list of integer ranges, where each range represents a subset of data to be loaded.
* Each range will be processed independently and concurrently.
* @param dispatcher The [CoroutineDispatcher] to use for launching coroutines. This determines the thread
* context in which the loading operations will run.
* @param parallelism The maximum number of concurrent loading operations. This controls the level of
* parallelism and limits resource usage.
* @param load A suspending function that takes an [IntRange] and returns a list of loaded items of type [T].
* This function is called for each range in [ranges].
* @return A [ReceiveChannel] that emits lists of loaded items of type [T]. The channel will emit one list
* for each range in [ranges], in the order the ranges are processed. The channel is closed
* automatically when all ranges have been processed.
* @throws Exception If the [load] function throws an exception for any range, the exception is propagated
* and the channel is closed immediately.
* @cancellation If the coroutine scope is cancelled, all ongoing loading operations are cancelled, and
* the channel is closed. Any resources acquired during the operation are released.

Copilot uses AI. Check for mistakes.
@meladRaouf meladRaouf force-pushed the feature/matching-optmization branch from 1a8f827 to f3de032 Compare May 20, 2025 06:49
@meladRaouf meladRaouf force-pushed the feature/matching-optmization branch from f3de032 to ef9eb71 Compare May 20, 2025 07:20
@sonarqubecloud
Copy link
Copy Markdown

@BurningAXE
Copy link
Copy Markdown
Contributor

@meladRaouf As discussed on Slack - I will raise another PR with concurrency improvements based on this one. Therefore I don't have any more remarks here and we can merge it.

Copy link
Copy Markdown
Contributor

@BurningAXE BurningAXE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hold off! Just tested this out and found an issue! The implementation with the Semaphore is great but it has one problem - launches on the main thread :D That's because the passed Scope ultimately comes from the ViewModel's scope.

@BurningAXE
Copy link
Copy Markdown
Contributor

I've resolved the main thread issue in #1194

@meladRaouf meladRaouf merged commit ad31bdc into main May 28, 2025
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants