[MS-1421] Creating OCR framework to work with the OCR readouts with text filter and spatial constraints #1643
…ext filter and spatial constraints
Pull request overview
Introduces an OCR querying framework for the external credential scan flow, enabling extraction of ID-card fields by combining text matching with spatial constraints, and refactors existing Ghana credential selectors to use the new reader/model abstraction.
Changes:
- Added a custom OCR model (`OcrText`/`OcrLine`) and a query DSL (`OcrReader` + `OcrQueryScope`) for text + spatial filtering.
- Refactored Ghana NHIS / Ghana ID credential selection to return the matched OCR line instead of a boolean.
- Updated credential coordinate detection to use the new OCR reader/model pipeline.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 13 comments.
| File | Description |
|---|---|
| feature/external-credential/.../GhanaNhisCardOcrSelectorUseCase.kt | Switches from regex boolean check to selecting an OcrLine via OcrReader. |
| feature/external-credential/.../GhanaIdCardOcrSelectorUseCase.kt | Switches from regex boolean check to selecting an OcrLine via OcrReader. |
| feature/external-credential/.../GetCredentialCoordinatesUseCase.kt | Builds an OcrReader from ML Kit output and returns detected credential bounds from the selected OcrLine. |
| feature/external-credential/.../reader/OcrReader.kt | Adds reader entry point for executing OCR queries. |
| feature/external-credential/.../reader/OcrQuery.kt | Implements the query scope and spatial/text filters used by the OCR reader. |
| feature/external-credential/.../reader/OcrModel.kt | Adds ML Kit–independent OCR domain model used for extraction. |
| feature/external-credential/.../reader/OcrBuilder.kt | Converts ML Kit Text output into the custom OCR model. |
| feature/external-credential/.../model/BoundingBox.kt | Changes toBoundingBox() to accept nullable Rect. |
…his allows to isolate ML kit dependencies in tests
Pull request overview
Copilot reviewed 13 out of 13 changed files in this pull request and generated 11 comments.
Comments suppressed due to low confidence (1)
feature/external-credential/src/main/java/com/simprints/feature/externalcredential/screens/scanocr/usecase/GetCredentialCoordinatesUseCase.kt:46
- This `suspend` use case catches `Exception`, which will also catch and swallow `CancellationException`, preventing cooperative coroutine cancellation. Re-throw `CancellationException` (or catch a narrower exception type) before logging/returning null.
} catch (e: Exception) {
Simber.e("OCR failed for $documentType", e, tag = MULTI_FACTOR_ID)
null
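One way to address this comment is sketched below. It is a minimal illustration, not the code in this PR: `nullOnFailure` and its `onError` parameter are hypothetical names, and the stdlib `kotlin.coroutines.cancellation.CancellationException` is used so the snippet has no external dependencies (on the JVM, the `kotlinx.coroutines` type is an alias of the same class). Because the helper is `inline`, its `block` may contain suspending calls when it is invoked from a `suspend` function.

```kotlin
import kotlin.coroutines.cancellation.CancellationException

/**
 * Runs [block] and returns its result, or null if it fails.
 * Cancellation is re-thrown so cooperative coroutine cancellation keeps working.
 * Hypothetical helper for illustration; not part of the PR.
 */
inline fun <T> nullOnFailure(onError: (Exception) -> Unit, block: () -> T): T? =
    try {
        block()
    } catch (e: CancellationException) {
        // Must come before the generic catch: never swallow cancellation.
        throw e
    } catch (e: Exception) {
        // Any other failure is reported (e.g. logged) and mapped to null.
        onError(e)
        null
    }
```

In the use case, the body of the existing `try` would become the `block`, and the `Simber.e(...)` call would go into `onError`.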
…esolution logic into OcrReader
…g KDoc explaining the nesting limits of the OcrReader queries



JIRA ticket
Will be released in: 2026.2.0
Notable changes
Adds a way of extracting text fields from scanned ID cards by combining text matching with spatial navigation. This is required because the non-credential fields (name, date of birth, expiry date, etc.) cannot be reliably extracted using pattern matching alone. The reader allows extraction rules to be expressed as human-readable queries:
Custom OCR models
ML Kit's `Text` class is converted into a custom `OcrText` model. This keeps all extraction logic free of ML Kit dependencies and makes it possible to unit test.
Column-aware spatial filtering
Since the fields of interest on the documents that we're scanning are located in their respective columns, only `isBelow` and `isAbove` are used. However, there might be situations where two columns are located right next to each other. To enforce horizontal column boundaries, a candidate line is only returned if its left edge falls within the horizontal bounds of the anchor.
If column-aware spatial filtering is not applied, then for a document layout as below
the following query
might equally return `11/11/1999` or `12/12/2028`. We want to avoid such scenarios, so we explicitly check that the value below the field of interest (date of birth) starts within the width of the 'date of birth' label.
The drawing above demonstrates the logic for determining whether a line is within the anchor's column. If the line's top-left `X` coordinate is between the anchor's `X_left` and `X_right`, then the line is considered to be in the same column.
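
The column check described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the PR's implementation: the `BoundingBox` and `OcrLine` shapes are reduced stand-ins, and `isInColumnOf` and `firstBelowInColumn` are hypothetical names. Coordinates follow the usual image convention, with `y` growing downward.

```kotlin
// Simplified stand-ins for the PR's model classes; field names are assumptions.
data class BoundingBox(val left: Int, val top: Int, val right: Int, val bottom: Int)

data class OcrLine(val text: String, val box: BoundingBox)

/** True when the candidate's left edge lies within the anchor's horizontal bounds. */
fun isInColumnOf(anchor: OcrLine, candidate: OcrLine): Boolean =
    candidate.box.left in anchor.box.left..anchor.box.right

/**
 * Returns the nearest line that starts below [anchor] and within its column,
 * or null when no such line exists. Hypothetical helper for illustration.
 */
fun firstBelowInColumn(lines: List<OcrLine>, anchor: OcrLine): OcrLine? =
    lines
        .filter { it.box.top >= anchor.box.bottom } // strictly below the anchor
        .filter { isInColumnOf(anchor, it) }        // same column as the anchor
        .minByOrNull { it.box.top }                 // closest one vertically
```

With two labels side by side, say "Date of birth" spanning x = 10..110 and "Expiry date" spanning x = 150..240, and the values `11/11/1999` (left edge at x = 12) and `12/12/2028` (left edge at x = 152) on the row beneath, anchoring on "Date of birth" yields only `11/11/1999`, because 152 falls outside 10..110.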