A .NET 10 worker service that continuously polls a scan event API, queues events through Amazon SQS, and persists parcel summary data to SQL Server.
Built for the Freightways take-home exercise.
```mermaid
flowchart LR
subgraph external[External Systems]
direction TB
API@{ shape: hex, label: "Scan Event API" }
end
subgraph AWS
direction TB
subgraph sqs[SQS]
direction RL
DLQ@{ shape: das, label: "SQS DLQ<br/>(manual redrive)"}
QUEUE@{ shape: das, label: "SQS Main Queue<br/>(visibility timeout: 30s)" }
end
subgraph worker[Worker in ECS/Fargate]
direction LR
POLLER@{ shape: rounded, label: "ApiPollerWorker" }
PROCESSOR@{ shape: rounded, label: "EventProcessorWorker" }
end
subgraph db[SQL Server DB]
direction TB
PARCEL@{ shape: cyl, label: "ParcelSummary" }
STATE@{ shape: cyl, label: "ProcessingState" }
end
end
external ~~~ AWS
sqs & worker ~~~ db
API -->|"GET /v1/scans/scanevents<br/>?FromEventId=&Limit="| POLLER
STATE -->|"Read LastEventId<br/>on startup"| POLLER
POLLER -->|"Update LastEventId<br/>after batch sent"| STATE
POLLER -->|"SendMessageAsync<br/>(batch of ScanEvents)"| QUEUE
QUEUE -->|"ReceiveMessageAsync<br/>(long-poll 20s)"| PROCESSOR
PROCESSOR -->|"DeleteMessageAsync<br/>on success"| QUEUE
PROCESSOR -->|"MERGE upsert<br/>(idempotent)"| PARCEL
QUEUE -.->|"After 3 failures<br/>(maxReceiveCount)"| DLQ
classDef database stroke-width:2px
class PARCEL,STATE database
classDef queue stroke-width:2px
class QUEUE,DLQ queue
classDef api stroke-width:2px
class API api
classDef worker stroke-width:1px
class POLLER,PROCESSOR worker
```
Two decoupled `BackgroundService` workers:

- `ApiPollerWorker` - Reads `LastEventId` from `ProcessingState` on startup, polls the API in batches, sends valid events to SQS, and advances `LastEventId` after each successful batch. Resumes from the last processed event on restart.
- `EventProcessorWorker` - Long-polls SQS, MERGEs each event into `ParcelSummary`, then deletes the message. On failure the message is not deleted - SQS redelivers after the visibility timeout and moves it to the DLQ after 3 attempts.
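The poller's resume-from-cursor loop can be sketched as below. This is an illustrative outline, not the repo's exact code: the `IScanEventApiClient`/`IProcessingStateStore` abstractions, the `ScanEvent` shape, the batch size of 100, and the 5-second back-off are assumptions for the sketch.

```csharp
using Amazon.SQS;
using Amazon.SQS.Model;
using System.Text.Json;

// Hypothetical abstractions; the repo's actual interfaces may differ.
public interface IProcessingStateStore
{
    Task<long> GetLastEventIdAsync(CancellationToken ct);
    Task SaveLastEventIdAsync(long id, CancellationToken ct);
}

public interface IScanEventApiClient
{
    // GET /v1/scans/scanevents?FromEventId=&Limit=
    Task<ScanEventBatch> GetScanEventsAsync(long fromEventId, int limit, CancellationToken ct);
}

public sealed record ScanEventBatch(IReadOnlyList<ScanEvent> ScanEvents);
public sealed record ScanEvent(long EventId, string Type, DateTime CreatedDateTimeUtc);

public sealed class ApiPollerWorker(
    IScanEventApiClient api, IAmazonSQS sqs, IProcessingStateStore state, string queueUrl)
    : BackgroundService
{
    protected override async Task ExecuteAsync(CancellationToken ct)
    {
        long lastEventId = await state.GetLastEventIdAsync(ct); // resume point on restart

        while (!ct.IsCancellationRequested)
        {
            // FromEventId is inclusive: the last processed event is re-fetched and
            // the idempotent MERGE downstream absorbs the duplicate.
            var batch = await api.GetScanEventsAsync(lastEventId, limit: 100, ct);

            if (batch.ScanEvents.Count == 0)
            {
                // Empty ScanEvents array is the end-of-feed signal: back off.
                await Task.Delay(TimeSpan.FromSeconds(5), ct);
                continue;
            }

            foreach (var evt in batch.ScanEvents)
                await sqs.SendMessageAsync(
                    new SendMessageRequest(queueUrl, JsonSerializer.Serialize(evt)), ct);

            // Advance the cursor only after the whole batch has reached SQS.
            lastEventId = batch.ScanEvents[^1].EventId;
            await state.SaveLastEventIdAsync(lastEventId, ct);
        }
    }
}
```

Advancing the cursor only after the send loop completes means a crash mid-batch re-sends some events, which the idempotent MERGE tolerates.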
See docs/local-setup.md for the full setup walkthrough. In brief:

```sh
mise i                 # Install .NET 10, pkl, hk
mise run test          # Run tests with 80% coverage gate
mise run build         # Debug build
mise run publish       # Native AOT binary
docker-compose up -d   # Start SQL Server + LocalStack
dotnet user-secrets set "ScanEventApi:BaseUrl" "https://your-api-host" --project src/ScanEventWorker
dotnet run --project src/ScanEventWorker/ScanEventWorker.csproj
```

- Scan Event API pre-exists and all events are retained indefinitely.
- Events are returned ordered by `EventId` ascending.
- `EventId` is monotonically increasing - querying `FromEventId=X` reliably returns all events with ID ≥ X.
- The API returns an empty `ScanEvents` array when no more events exist (end-of-feed signal) - guarded: `ApiPollerWorker` checks `Count == 0` and backs off.
- Only one worker instance runs at a time - enforced: a startup `Mutex` in `Program.cs` aborts a second instance.
- `RunId` comes from the nested `User.RunId` field in the JSON response - guarded: `null` coalesces to `string.Empty`.
- `StatusCode` may be an empty string - guarded: `null` coalesces to `string.Empty`.
- `PickedUpAtUtc` and `DeliveredAtUtc` are set on the first occurrence of their respective event types and are never overwritten - enforced: `??=` in `ParcelSummary` + `IS NULL` in SQL MERGE.
- Unknown `Type` values are stored as-is in `ParcelSummary` without setting pickup/delivery timestamps - enforced: `switch` default + SQL `CASE`.
- `CreatedDateTimeUtc` is the timestamp recorded as `PickedUpAtUtc`/`DeliveredAtUtc` - the spec sample contains a `CreatedDateTimeUtc` field on each event; no separate pickup/delivery timestamp field is defined.
- First PICKUP/DELIVERY occurrence sets the timestamp permanently - the spec is silent on correction or reversal events - enforced: `??=` in `ParcelSummary` + `IS NULL` guard in SQL MERGE.
- `Device` fields (`DeviceId`, `DeviceType`) and `User.UserId`/`User.CarrierId` from the API response are not persisted - the spec only defines `ParcelSummary` columns; extra fields are silently ignored.
- STATUS events update the most-recent-event columns (`LatestType`, `LatestStatusCode`, etc.) but do not affect `PickedUpAtUtc`/`DeliveredAtUtc`.
- Polling uses `FromEventId=LastEventId` (inclusive) rather than `LastEventId+1` - the last processed event is re-fetched on every cycle to avoid missing events if IDs have gaps; the idempotent MERGE absorbs the duplicate without side effects.
- The event feed starts at `EventId=1` - `ProcessingState` is seeded with `LastEventId=1` on first run.
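The first-occurrence-wins and unknown-type rules above can be sketched in C#. The property and type shapes here are assumptions for illustration, not the repo's exact domain model:

```csharp
public sealed record ScanEvent(
    long EventId, long ParcelId, string Type, string? StatusCode, DateTime CreatedDateTimeUtc);

public sealed class ParcelSummary
{
    public long ParcelId { get; set; }
    public DateTime? PickedUpAtUtc { get; set; }
    public DateTime? DeliveredAtUtc { get; set; }
    public string LatestType { get; set; } = string.Empty;
    public string LatestStatusCode { get; set; } = string.Empty;

    public void Apply(ScanEvent evt)
    {
        // Every event updates the most-recent-event columns...
        LatestType = evt.Type;
        LatestStatusCode = evt.StatusCode ?? string.Empty; // empty StatusCode tolerated

        // ...but the pickup/delivery timestamps are set at most once:
        // `??=` assigns only while the left side is still null, so the first
        // PICKUP/DELIVERY occurrence wins and later occurrences are ignored.
        switch (evt.Type)
        {
            case "PICKUP":
                PickedUpAtUtc ??= evt.CreatedDateTimeUtc;
                break;
            case "DELIVERY":
                DeliveredAtUtc ??= evt.CreatedDateTimeUtc;
                break;
            default:
                break; // unknown types: stored as-is, no timestamp change
        }
    }
}
```

The SQL MERGE mirrors this with an `IS NULL` guard so the rule holds even if a duplicate message is replayed against the database directly.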
- Event retention
  - The worker assumes the API retains all events indefinitely (Assumption 0)
  - If the API enforces a rolling retention window (e.g. 7 days), downtime longer than that window causes permanent data loss with no recovery path
- No API authentication/authorisation
  - `ScanEventApiClient` sends unauthenticated HTTP requests; the spec sample shows no auth header
  - If the production API requires one, it must be added before deployment
- No horizontal scaling up/out
  - `ProcessingState` is a single-row table and the startup `Mutex` enforces one poller instance
  - Scaling the poller requires distributed locking and a different cursor model
- All quiet on the DLQ front
  - Messages that fail 3 times move to the DLQ and die there
  - Without monitoring they accumulate undetected
- Health checks: expose a `/healthz` endpoint[^1] reporting queue depth and DB connectivity
- Metrics: OpenTelemetry counters for events processed/sec, SQS queue depth, DLQ size
- Horizontal scaling: run multiple `EventProcessorWorker` instances; the SQS competing-consumer model handles this without coordination
- DLQ visibility: persist DLQ messages to a `FailedEvents` DB table for operational queries
- Rate limiting: token bucket on the API poller to avoid hammering the upstream service
- Database migrations: replace the `IF NOT EXISTS` initialiser with FluentMigrator for versioned schema changes
- Production database: RDS SQL Server (managed via CDK stack) instead of Docker for managed backups and HA
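As a first step toward the queue-depth metrics and DLQ visibility items above, the AWS SDK can report the approximate DLQ depth directly. A minimal sketch, assuming a queue URL is available (the class and field names are illustrative, not from this repo):

```csharp
using Amazon.SQS;
using Amazon.SQS.Model;

// Hypothetical monitor: polls the DLQ's ApproximateNumberOfMessages attribute
// so failed messages don't accumulate undetected.
public sealed class DlqDepthMonitor(IAmazonSQS sqs, string dlqUrl)
{
    public async Task<int> GetDepthAsync(CancellationToken ct)
    {
        var response = await sqs.GetQueueAttributesAsync(new GetQueueAttributesRequest
        {
            QueueUrl = dlqUrl,
            AttributeNames = ["ApproximateNumberOfMessages"],
        }, ct);

        return response.ApproximateNumberOfMessages;
    }
}
```

The returned depth could feed an OpenTelemetry gauge or a simple alarm threshold; the value is approximate by design, which is fine for alerting.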
- Local Setup
  - Docker
  - Azure SQL Edge database
  - Credentials and run steps
- Infrastructure
  - CDK stack
  - Downstream fan-out architecture
  - Spec constraint: why polling exists, and the event-driven alternative
- Design Rationale
  - Domain model
  - Error handling
  - AOT decisions
[^1]: The `/healthz` endpoint is actually deprecated since Kubernetes v1.16, and the more specific `/livez` and `/readyz` endpoints are preferred.