Summary
IngestCompleted could theoretically be emitted twice if complete_batch and fail_batch race on the same ingest run. Both call _check_completion() which emits the event if is_complete is true.
Analysis
After the consolidation in f690212, IngestCompleted is emitted from one place (IngestService._check_completion). But _check_completion is called by both complete_batch and fail_batch. If the last successful batch and a failed batch complete simultaneously in different workers, both could see is_complete == True after their respective atomic counter increments.
In practice, PostgreSQL row-level locking on the UPDATE ... SET x = x + 1 ... RETURNING * serializes the increments, so the second reader sees the already-COMPLETED status and check_completion returns False (the transition COMPLETED→COMPLETED is invalid). The race window is extremely small.
Risk
Low. IngestCompleted has no subscribers — it's an audit event. A duplicate is harmless. The row-level locking in PostgreSQL prevents this in practice.
Possible fix
Add an optimistic lock or status guard in _check_completion:
async def _check_completion(self, ingest_run: IngestRun) -> None:
if ingest_run.status != IngestStatus.RUNNING:
return # Already completed by another worker
if not ingest_run.check_completion(now):
return
...
This is already partially in place via check_completion() which calls transition_to(COMPLETED) which validates the transition. But the event emission happens after the save, so two workers could both save successfully before either emits.
A more robust fix: use a SELECT ... FOR UPDATE or conditional UPDATE ... WHERE status = 'running' to make the transition atomic at the SQL level.
Summary
IngestCompleted could theoretically be emitted twice if
complete_batchandfail_batchrace on the same ingest run. Both call_check_completion()which emits the event ifis_completeis true.Analysis
After the consolidation in f690212, IngestCompleted is emitted from one place (
IngestService._check_completion). But_check_completionis called by bothcomplete_batchandfail_batch. If the last successful batch and a failed batch complete simultaneously in different workers, both could seeis_complete == Trueafter their respective atomic counter increments.In practice, PostgreSQL row-level locking on the
UPDATE ... SET x = x + 1 ... RETURNING *serializes the increments, so the second reader sees the already-COMPLETED status andcheck_completionreturns False (the transition COMPLETED→COMPLETED is invalid). The race window is extremely small.Risk
Low. IngestCompleted has no subscribers — it's an audit event. A duplicate is harmless. The row-level locking in PostgreSQL prevents this in practice.
Possible fix
Add an optimistic lock or status guard in
_check_completion:This is already partially in place via
check_completion()which callstransition_to(COMPLETED)which validates the transition. But the event emission happens after the save, so two workers could both save successfully before either emits.A more robust fix: use a
SELECT ... FOR UPDATEor conditionalUPDATE ... WHERE status = 'running'to make the transition atomic at the SQL level.