
feat(screenshot): Vercel preview screenshot → GitHub PR + Linear issue #3

Open
isadeks wants to merge 14 commits into feat/agentcore-oauth-2-0b from feat/screenshot-on-push

Conversation

@isadeks
Owner

@isadeks isadeks commented May 21, 2026

Stacked on feat/agentcore-oauth-2-0b (which is open as upstream PR aws-samples#160). When that lands, this branch will rebase onto main and a fresh PR can target the upstream repo.

Summary

After ABCA opens a PR for a Linear-driven task, Vercel deploys the preview, posts a deployment_status event back to GitHub, and ABCA's webhook receiver:

  1. Captures a full-page screenshot of the preview URL via AgentCore Browser
  2. Uploads the PNG to a private S3 bucket served via CloudFront
  3. Posts a markdown image comment on the open GitHub PR
  4. Looks up the Linear issue (by identifier in the PR title/body) and posts the same screenshot as a Linear comment

Smoke-validated end-to-end on backgroundagent-dev: Linear issue ABCA-70 → vercel-abca-linear PR #2 → screenshot landed on both GitHub PR and Linear issue, ~10s after Vercel reported the deploy.

Architecture

  • Lambda-only. No agent runtime is involved post-PR — the screenshot job is deterministic.
  • AgentCore Browser (AWS-managed aws.browser.v1) driven over CDP via the ws package, using a SigV4-presigned WSS URL. Avoids Playwright bloat in the Lambda bundle.
  • Private S3 + CloudFront with OAC. Account-level S3 Block Public Access blocks plain public-read buckets, so CloudFront fronts the bucket; both GitHub Markdown and Linear's image preview can render the URL anonymously.
  • WAF exemption. /v1/github/webhook is excluded from AWSManagedRulesCommonRuleSet for the same reason /v1/linear/webhook is — webhook payloads with absolute URLs trip GenericRFI_BODY otherwise.
  • Retry on no-PR. Vercel commonly posts deployment_status 5-15s before the agent's gh pr create returns; the processor retries the PR lookup with backoff (0s/5s/10s/20s) to handle the race.

What's in this PR

  • New construct + handlers under cdk/src/{constructs,handlers}/ for the GitHub webhook → screenshot pipeline
  • WAF CRS exemption for the new webhook path
  • Three new stack outputs (GitHubWebhookUrl, GitHubWebhookSecretArn, ScreenshotCloudFrontDomain)
  • bgagent linear list-projects rewritten to use OAuth (was still on the parked PAK secret)
  • docs/guides/VERCEL_SETUP_GUIDE.md — operator-facing setup walkthrough + troubleshooting

Test plan

Followups (tracked, not blocking)

  • Scope IAM down from bedrock-agentcore:* to the specific Browser action once discoverable via CloudTrail
  • bgagent github webhook-info CLI command for setup ergonomics
  • Unit tests for the four new handlers
  • Prefix-routing for the Linear workspace lookup (today scans all active workspaces)
  • Production hardening: Vercel Standard Protection + signed bypass token injection
  • ARCHITECTURE.md / COST_MODEL.md / USER_GUIDE.md / ROADMAP.md updates

isadeks and others added 14 commits May 20, 2026 13:57
… wiring yet)

Lands the runtime pieces of the screenshot-on-preview-deploy feature:

- `ScreenshotBucket` construct (`cdk/src/constructs/screenshot-bucket.ts`):
  public-read on `screenshots/*`, SSE-S3, 30-day TTL. Bucket policy
  scoped to the prefix so anything written outside is invisible.

- GitHub webhook receiver (`cdk/src/handlers/github-webhook.ts`):
  HMAC-verifies `X-Hub-Signature-256`, filters to
  `deployment_status` events with `state=success` and
  `environment=Preview`, dedups on `(repo, deployment_id, status_id)`,
  async-invokes the processor. Topology mirrors `linear-webhook.ts`.

- Webhook processor (`cdk/src/handlers/github-webhook-processor.ts`):
  Looks up the open PR for the deploy SHA via the GitHub Commits API,
  captures a screenshot of `deployment.environment_url` via AgentCore
  Browser, PUTs the PNG to the screenshot bucket, posts a markdown
  embed in a fresh PR comment.

- AgentCore Browser wrapper (`cdk/src/handlers/shared/agentcore-browser.ts`):
  Drives Chrome DevTools Protocol over WebSocket directly, avoiding
  Playwright bloat. SigV4-signs the WSS handshake. Smoke-tested locally
  against example.com and a Vercel demo URL — 6.5s end-to-end, valid PNG.

- GitHub webhook verify helper (`cdk/src/handlers/shared/github-webhook-verify.ts`):
  Mirrors `linear-verify.ts` — secret cache with 5min TTL, transparent
  re-fetch once on signature failure.

Stack wiring (IAM grants, API Gateway route, Lambda construction)
is the next commit.
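
As a minimal sketch of the verification step described above (not the code in `github-webhook-verify.ts`), the function below checks an `X-Hub-Signature-256` header against the raw request body, caches the secret for five minutes, and re-fetches once on a mismatch. The names `createVerifier`, `signatureMatches`, and the `SecretFetcher` callback are hypothetical; in the real handler the fetcher would presumably read from Secrets Manager.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical loader; stands in for a Secrets Manager read.
type SecretFetcher = () => Promise<string>;

const SECRET_TTL_MS = 5 * 60 * 1000; // 5 min TTL, as in the commit message

function signatureMatches(secret: string, rawBody: string, header: string | undefined): boolean {
  if (!header || !header.startsWith("sha256=")) return false;
  const expected = "sha256=" + createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");
  const a = Buffer.from(expected, "utf8");
  const b = Buffer.from(header, "utf8");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return a.length === b.length && timingSafeEqual(a, b);
}

function createVerifier(fetchSecret: SecretFetcher, now: () => number = Date.now) {
  let cached: { value: string; fetchedAt: number } | undefined;

  async function getSecret(forceRefresh: boolean): Promise<string> {
    if (!forceRefresh && cached && now() - cached.fetchedAt < SECRET_TTL_MS) {
      return cached.value;
    }
    cached = { value: await fetchSecret(), fetchedAt: now() };
    return cached.value;
  }

  return async function verify(rawBody: string, header: string | undefined): Promise<boolean> {
    if (signatureMatches(await getSecret(false), rawBody, header)) return true;
    // One forced re-fetch covers a secret that was rotated after it was cached.
    return signatureMatches(await getSecret(true), rawBody, header);
  };
}
```

The signature must be computed over the raw body bytes as received, before any JSON parsing, because re-serialized JSON will not hash to the same value.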

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- New `GitHubScreenshotIntegration` construct (mirrors `LinearIntegration`):
  bundles the screenshot bucket, dedup table, signing-secret placeholder,
  receiver Lambda, processor Lambda, and the API Gateway route. cdk-nag
  suppressions added inline (HMAC auth instead of Cognito; AgentCore
  Browser sessions have no per-resource ARN; Secrets Manager rotation
  is owned by GitHub).

- Wired into `agent.ts` after the LinearIntegration block. Reuses the
  existing `githubTokenSecret` (the processor uses ABCA's main GitHub
  token to look up which PR a deploy SHA belongs to and post the
  screenshot comment — no new credential).

- Three new stack outputs: `GitHubWebhookUrl`, `GitHubWebhookSecretArn`,
  `ScreenshotBucketName`.

- Bumped agent.test.ts table count from 13 to 14 to account for the
  new dedup table.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ot bucket

cdk-nag's S2 fires on any bucket that has `blockPublicPolicy: false`
even when the policy is intentionally permissive. Add the suppression
with the same rationale as S1/S5 — public reads are required by
GitHub Markdown renderers and Linear `imageUploadFromUrl`, and the
read grant is prefix-scoped to `screenshots/*`.

Caught when the first deploy attempt aborted at synth-time on the new
GitHubScreenshotIntegration construct.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The first deploy attempt failed at CFN-execute time on the bucket
policy:

  s3:PutBucketPolicy ... because public policies are prevented by
  the BlockPublicPolicy setting in S3 Block Public Access.

Account-level Block Public Access is on for this AWS account, which
overrides per-bucket BPA settings. Disabling it would change the
security posture of the whole account, so route around the constraint
with the AWS-recommended pattern: private S3 + CloudFront with Origin
Access Control.

Changes:
- `ScreenshotBucket` is now `BLOCK_ALL` BPA, no public bucket policy.
  Adds a `cloudfront.Distribution` whose origin is the bucket via
  `S3BucketOrigin.withOriginAccessControl`. The resulting bucket policy
  grants read access to the CloudFront service principal only, so
  account-level BPA accepts it.
- Processor reads `SCREENSHOT_PUBLIC_HOST` (the CloudFront domain)
  instead of building an S3 URL. PR comments now embed
  `https://<dist>.cloudfront.net/screenshots/...` URLs.
- New stack output `ScreenshotCloudFrontDomain`.
- Bucket-level S2/S5 suppressions removed (no longer applicable —
  bucket is private). Distribution gets CFR1/CFR2/CFR3/CFR4/CFR7
  suppressions with rationales.

Heads up on deploy time: CloudFront distributions take 5-15 min to
provision on first create.
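
A rough sketch of how the processor might build the embedded URL from `SCREENSHOT_PUBLIC_HOST`. The function names, the alt text, and the object-key layout in the usage note are assumptions for illustration and are not taken from the PR.

```typescript
// Builds the anonymous CloudFront URL for an object key in the screenshot bucket.
function screenshotUrl(publicHost: string, objectKey: string): string {
  // Encode each path segment but keep the "/" separators intact.
  const path = objectKey.split("/").map(encodeURIComponent).join("/");
  return `https://${publicHost}/${path}`;
}

// Builds the markdown body for the PR (or Linear) comment.
// The alt text and caption wording are hypothetical.
function screenshotComment(publicHost: string, objectKey: string, previewUrl: string): string {
  const imageUrl = screenshotUrl(publicHost, objectKey);
  return `![Preview screenshot](${imageUrl})\n\nCaptured from ${previewUrl}`;
}
```

For example, `screenshotUrl("d123.cloudfront.net", "screenshots/acme/site/abc123.png")` yields `https://d123.cloudfront.net/screenshots/acme/site/abc123.png`; in the Lambda the host would come from `process.env.SCREENSHOT_PUBLIC_HOST`.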

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The CommonRuleSet was 403'ing GitHub deployment_status webhooks before
the request reached our Lambda — the deployment payload contains
absolute Vercel preview URLs in the body, which trips GenericRFI_BODY.

Mirror the Linear webhook exemption: the GitHub webhook path is
HMAC-verified in the Lambda, parsed as strict JSON, never
interpolated into SQL/HTML, and rate-limited by the priority-3 rule.
CRS still applies to every other route.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…loyment

GitHub's `deployment_status` webhook puts the deployed URL on the
*status* object, not the deployment itself. The deployment object is
immutable per (sha, environment); the status changes through the
deploy lifecycle (`pending` → `success`) and carries the URL only
once the deploy finishes.

Symptom: receiver kept short-circuiting `success` events from Vercel
with `{ok: true, skipped_no_url: true}` because we read the wrong
field. Verified by inspecting the webhook delivery payload via
`gh api .../deliveries/<id> --jq .request.payload.deployment_status` —
URL was there all along.
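
To make the fix concrete, here is a hedged sketch of a receiver-side classifier that reads the URL from `deployment_status.environment_url` rather than from the deployment object. The `classify` function, the `Decision` type, the skip reasons, and the `#`-joined dedup key format are hypothetical; reading `environment` from the status object is also an assumption, since the commit messages do not say which object the filter uses.

```typescript
// Only the payload fields this sketch needs; the real webhook payload is much larger.
interface DeploymentStatusEvent {
  repository?: { full_name?: string };
  deployment?: { id?: number; sha?: string };
  deployment_status?: {
    id?: number;
    state?: string;
    environment?: string;
    environment_url?: string;
  };
}

type Decision =
  | { kind: "skip"; reason: string }
  | { kind: "process"; url: string; sha: string; dedupKey: string };

function classify(eventName: string, payload: DeploymentStatusEvent): Decision {
  if (eventName !== "deployment_status") return { kind: "skip", reason: "wrong_event" };
  const status = payload.deployment_status;
  const deployment = payload.deployment;
  const repo = payload.repository?.full_name;
  if (!status || !deployment || !repo) return { kind: "skip", reason: "malformed" };
  if (status.state !== "success") return { kind: "skip", reason: "not_success" };
  if (status.environment !== "Preview") return { kind: "skip", reason: "not_preview" };
  // The URL lives on the status object, not on the deployment.
  if (!status.environment_url) return { kind: "skip", reason: "no_url" };
  if (deployment.id === undefined || status.id === undefined || !deployment.sha) {
    return { kind: "skip", reason: "malformed" };
  }
  return {
    kind: "process",
    url: status.environment_url,
    sha: deployment.sha,
    // Dedup on (repo, deployment_id, status_id), as the receiver commit describes.
    dedupKey: `${repo}#${deployment.id}#${status.id}`,
  };
}
```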

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…dshake

Node 24's global WebSocket (from undici) does NOT support arbitrary
HTTP headers on the upgrade request — passing them as the second arg
gets silently ignored. AgentCore Browser's WSS handshake requires
SigV4-signed Authorization + X-Amz-* headers, so the connection was
opening but then getting rejected, which surfaced as an empty
`error` event ("AgentCore Browser WebSocket error: ").

Switch to the `ws` package which natively supports `options.headers`.
Also add an `unexpected-response` handler so HTTP-level handshake
failures (403, 400) surface with status codes instead of empty errors.

Smoke verified locally — the ws-based path opens cleanly against
example.com and Vercel preview URLs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lambda runtime returned a 403 on the WSS upgrade despite well-formed
SigV4 headers — `ws` rewrites the Host header during the upgrade
GET, which invalidates the canonical-request signature we computed
against the original Host. This works locally because Node's tooling
on macOS keeps the original Host through the handshake, but the
Lambda runtime's TLS stack normalizes differently.

Switch to query-parameter SigV4 (presigned URL): SignatureV4.presign
returns a wss://...?X-Amz-Algorithm=...&X-Amz-Signature=... URL where
the auth lives in the URL itself, so any Host-header rewriting
downstream doesn't break the signature.

Smoke verified locally — presigned URL connects cleanly to AgentCore
Browser and the screenshot pipeline runs end-to-end (6.3s, valid
PNG, captures example.com correctly).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The minimal IAM I shipped earlier (`StartBrowserSession`,
`StopBrowserSession`, `GetBrowserSession`, `UpdateBrowserStream`)
wasn't enough — the WSS automation-stream connect requires an
additional `ConnectBrowserAutomationStream`-flavored action that
isn't in the public CLI command list. Lambda invocations were
opening sessions cleanly but 403'ing on the WSS upgrade.

Widen to `bedrock-agentcore:*` to unblock the e2e flow. Followup:
scope back down to the specific connect action once it's documented
or surfaced via CloudTrail decoded-message-on-deny.

Smoke verified: PR #1 on isadeks/vercel-abca-linear now receives a
screenshot comment within ~7s of the deployment_status webhook.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extends the screenshot processor to find a Linear issue via the PR's
title/body and post the same image comment there.

Approach (no GSI write-back needed):
- Regex-extract Linear identifier (e.g. `ABCA-42`) from PR title/body.
  These are present whether the agent put them there
  (`task_description` carries the identifier) or Linear's own GitHub
  integration auto-injected the back-reference on PR open.
- Scan `LinearWorkspaceRegistryTable` for `status=active` workspaces.
  Per-workspace, query Linear's `issueVcsBranchSearch` (which accepts
  the human-readable identifier) and accept the first exact-match
  hit.
- Post the markdown image comment via the existing `postIssueComment`
  helper from Phase 2.0b.
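
The extraction step above can be sketched as follows. The exact pattern the processor uses is not shown in the PR, so the regex here (uppercase team key, hyphen, digits) and the function name `extractLinearIdentifiers` are assumptions.

```typescript
// Returns distinct Linear-style identifiers (e.g. "ABCA-42") in order of first appearance,
// scanning the PR title first and then the body.
function extractLinearIdentifiers(title: string, body: string | null): string[] {
  const found: string[] = [];
  for (const text of [title, body ?? ""]) {
    // A fresh regex per text avoids carrying lastIndex state between scans.
    const pattern = /\b[A-Z][A-Z0-9]*-\d+\b/g;
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(text)) !== null) {
      if (found.indexOf(match[0]) === -1) found.push(match[0]);
    }
  }
  return found;
}
```

A pattern this loose also matches strings such as `UTF-8`, which is why the exact-match check against Linear in the lookup step matters: a candidate that does not resolve to a real issue is simply dropped.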

The Linear post is best-effort — if the registry table isn't wired,
the identifier doesn't extract, or the lookup misses, the GitHub PR
comment still lands. New env var `LINEAR_WORKSPACE_REGISTRY_TABLE_NAME`
is optional on the processor; the construct only sets it when the
prop is provided.

CDK: `GitHubScreenshotIntegrationProps` gains an optional
`linearWorkspaceRegistryTable`. When provided, the processor's IAM
grows: ReadData on the registry, GetSecretValue+PutSecretValue on
`bgagent-linear-oauth-*`. `agent.ts` wires
`linearIntegration.workspaceRegistryTable` into the screenshot
construct.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The command still pulled from the parked PAK secret
(`LinearApiTokenSecretArn`), which we removed in Phase 2.0b. Symptom:
`Could not find LinearApiTokenSecretArn in stack outputs.`

Rewrite to scan Secrets Manager for `bgagent-linear-oauth-*` secrets
and query each workspace's projects with its own OAuth token. Supports
`--slug <slug>` to scope to one workspace; without it, queries every
installed workspace and labels each project with its source.

Also: switch to the `Bearer <token>` auth header and the
`teams(first: 1) { nodes { name } }` shape (the old `team` field on
Project no longer exists in Linear's GraphQL).

Adds a `LINEAR_OAUTH_SECRET_PREFIX` const in `linear-oauth.ts` to
keep the secret-name contract in one place.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Vercel posts the success `deployment_status` webhook the moment its
build finishes, which on the Linear-driven path is ~7-15s before the
agent's `gh pr create` returns. The processor's first lookup against
the GitHub commit-pulls API came back empty and we'd silently drop
the screenshot.

Add a retry wrapper with backoff (0s, 5s, 10s, 20s — total max ~35s)
around the PR lookup. The first hit returns immediately, so the
warm-cache happy path is unchanged.
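
A minimal sketch of such a retry wrapper, assuming the lookup resolves to `undefined` when no open PR exists yet. The names `retryUntilFound` and `PR_LOOKUP_DELAYS_MS` are hypothetical, and the injectable `sleep` parameter exists only to keep the sketch testable.

```typescript
// Delay before each attempt; the first attempt runs immediately.
const PR_LOOKUP_DELAYS_MS: readonly number[] = [0, 5000, 10000, 20000];

async function retryUntilFound<T>(
  lookup: () => Promise<T | undefined>,
  delaysMs: readonly number[] = PR_LOOKUP_DELAYS_MS,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T | undefined> {
  for (const delay of delaysMs) {
    if (delay > 0) await sleep(delay);
    const result = await lookup();
    // The first hit returns immediately, so the happy path pays no extra latency.
    if (result !== undefined) return result;
  }
  return undefined; // Caller decides what to do when the PR never shows up.
}
```

With the default schedule the worst case waits 5 + 10 + 20 = 35 seconds in total, so the processor Lambda's timeout has to leave room for that on top of the screenshot capture itself.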

Verified end-to-end on backgroundagent-dev: Linear issue ABCA-70 →
agent → PR #2 in vercel-abca-linear → Vercel preview → screenshot
landed on both the GitHub PR and the Linear issue.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…issue comment spam

Move the trigger-label check ahead of every user-facing comment path in
the Linear webhook processor, and switch the default trigger label from
'bgagent' to 'abca'. An unlabeled issue is now a true no-op: no comment,
no reaction, no createTaskCore, no DDB writes — regardless of whether
the project is onboarded.

Why: workspace webhooks fire workspace-wide. A single un-onboarded team
in the same Linear workspace produced 47 identical "❌ project isn't
onboarded" comments on GRO-783 in 5 minutes because every Issue event
(create/update/label-change) hit the not-onboarded gate before the
label gate. With the gate order flipped, only issues that explicitly
opt in via the trigger label can ever generate user-facing feedback.

Per-project label_filter override is still respected — the project
mapping lookup now happens once, before the label gate, instead of after.

Tests: two new regression tests pin the spam scenario (unlabeled issue
in a non-onboarded project, and unlabeled issue with no projectId) to
zero side effects. Full CDK suite (89 suites / 1572 tests) passes.
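
The gate ordering can be illustrated with a small pure function. This is a sketch under assumed shapes: `IssueEvent`, `ProjectMapping`, the `labelFilter` property name, and the outcome strings are hypothetical, and the real processor does far more than return a string.

```typescript
interface IssueEvent {
  labels: string[];
  projectId?: string;
}

// Hypothetical shape of a per-project mapping row; undefined means "not onboarded".
interface ProjectMapping {
  labelFilter?: string;
}

type Outcome = "noop" | "reply-not-onboarded" | "create-task";

const DEFAULT_TRIGGER_LABEL = "abca";

function decide(issue: IssueEvent, mapping: ProjectMapping | undefined): Outcome {
  // The mapping is looked up once, before the label gate, so a per-project override still applies.
  const trigger = mapping?.labelFilter ?? DEFAULT_TRIGGER_LABEL;
  // Label gate first: an unlabeled issue never produces user-facing feedback.
  if (issue.labels.indexOf(trigger) === -1) return "noop";
  // Only opted-in issues can reach the not-onboarded reply.
  if (!mapping) return "reply-not-onboarded";
  return "create-task";
}
```

Under the previous order, the `!mapping` check ran first, so every event on an unlabeled issue in a non-onboarded project produced a comment.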

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Operator-facing setup walkthrough:
1. Connect Vercel to the GitHub repo
2. Vercel project settings (Git events on, Deployment Protection off
   for the demo, with a "production hardening" caveat for signed bypass)
3. Onboard the repo to ABCA (RepoTable put + bgagent linear onboard-project)
4. Configure the GitHub webhook (URL + secret from stack outputs,
   subscribe to Deployment statuses only)
5. Smoke test (label a Linear issue, watch screenshot land on PR + Linear)

Includes a troubleshooting section indexed by symptom (401/403 from
webhook, no comment lands, Linear post missing, CloudFront 403, Vercel
auth wall) and a forward-looking "production hardening" list for when
the feature graduates from demo.

Wires the new guide into the Starlight sync (docs/scripts/sync-starlight.mjs)
and sidebar (docs/astro.config.mjs).
isadeks pushed a commit that referenced this pull request May 26, 2026
…ty contract

Three remaining substantive review items from PR aws-samples#160:

- Validate all 11 fields in getOauthSecret (review non-blocking aws-samples#7).
  Was checking only access_token / refresh_token / expires_at; missing
  client_id or client_secret only surfaced 24h later when the refresh
  call needed them and found undefined. Extracted the required-field
  list into a const next to the StoredOauthToken interface and check
  the full set at deserialization. Bad secrets fail fast at fetch
  time with a structured log line naming the missing fields.

- CallbackResult discriminated union (review non-blocking aws-samples#6). Was
  `{ sessionId: string|null, code: string|null, state: string|null }`
  which let callers construct unreachable shapes. Split into
  `{ kind: 'agentcore', sessionId } | { kind: 'direct-oauth', code, state }`.
  Updated the resolver site (`oauth-callback-server.ts`), the
  consumer (`bgagent linear setup`), and the test file to use
  exhaustive type-narrowing. The setup wizard now errors clearly if
  it gets the agentcore shape (parked path) instead of silently
  passing nulls down.

- Cross-language schema-parity contract test (review non-blocking #3).
  CLI's StoredLinearOauthToken and Lambda's StoredOauthToken define
  the same JSON-in-Secrets-Manager schema independently; drift
  between the two would be a silent bug (CLI writes one field name,
  Lambda reads another, refresh works, every Lambda invocation logs
  a missing-field error). New test in
  `cdk/test/contracts/stored-oauth-token-parity.test.ts` regex-parses
  both interface definitions out of source and asserts the field set
  is equal. Also asserts the new
  `STORED_OAUTH_TOKEN_REQUIRED_FIELDS` const matches the interface,
  so future field additions can't drift between the validator and
  the type.
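
The `CallbackResult` split described above can be sketched like this. The union members are copied from the commit message; the consumer function `describeCallback` and its error text are hypothetical stand-ins for what `bgagent linear setup` does with the result.

```typescript
type CallbackResult =
  | { kind: "agentcore"; sessionId: string }
  | { kind: "direct-oauth"; code: string; state: string };

// Hypothetical consumer: accepts only the direct-oauth shape and fails loudly otherwise.
function describeCallback(result: CallbackResult): string {
  switch (result.kind) {
    case "agentcore":
      // Parked path: surface a clear error instead of passing nulls downstream.
      throw new Error(`unexpected agentcore callback (session ${result.sessionId})`);
    case "direct-oauth":
      return `code=${result.code} state=${result.state}`;
    default: {
      // Exhaustiveness check: adding a new kind makes this assignment a compile error.
      const unreachable: never = result;
      return unreachable;
    }
  }
}
```

With the old all-nullable object, a value with both `sessionId` and `code` set (or neither) type-checked fine; with the union, each branch sees only the fields that exist for its `kind`.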

CLI tests 286/286 pass. CDK resolver + contract 13/13 pass.
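
A sketch of the fail-fast validation from the first review item. Only five of the eleven field names appear in the commit message, so the list below is deliberately partial; `REQUIRED_FIELDS_SUBSET`, `missingFields`, and `parseStoredToken` are hypothetical names, not the ones in the repository.

```typescript
// Partial list: only the field names the commit message mentions.
const REQUIRED_FIELDS_SUBSET = [
  "access_token",
  "refresh_token",
  "expires_at",
  "client_id",
  "client_secret",
] as const;

type RequiredField = (typeof REQUIRED_FIELDS_SUBSET)[number];

// Returns the required fields that are absent, null, or empty in the secret JSON.
function missingFields(secretJson: string): RequiredField[] {
  const parsed: unknown = JSON.parse(secretJson);
  const record = (typeof parsed === "object" && parsed !== null ? parsed : {}) as Record<string, unknown>;
  return REQUIRED_FIELDS_SUBSET.filter(
    (field) => record[field] === undefined || record[field] === null || record[field] === "",
  );
}

// Throws at fetch time, naming every missing field, instead of failing at the next refresh.
function parseStoredToken(secretJson: string): Record<RequiredField, unknown> {
  const missing = missingFields(secretJson);
  if (missing.length > 0) {
    throw new Error(`OAuth secret is missing required fields: ${missing.join(", ")}`);
  }
  return JSON.parse(secretJson) as Record<RequiredField, unknown>;
}
```

Deriving the type from the same `as const` array that drives the validator is one way to keep the two from drifting, which is the property the parity test in the third item asserts.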
@isadeks isadeks force-pushed the feat/screenshot-on-push branch from 0e8240b to e0b658f Compare May 26, 2026 02:54
isadeks added a commit that referenced this pull request May 26, 2026
… 2.0b) (aws-samples#160)

* feat(linear): resolve API token via AgentCore Identity (Phase 2.0a)

Migrates the agent runtime's Linear personal API token resolution from
AWS Secrets Manager to AWS Bedrock AgentCore Identity. This is the
"validate Identity SDK" step of the v2 plan; Phase 2.0b will swap the
API key for OAuth and converge Linear MCP onto AgentCore Gateway in
one cutover.

Per Alain's guidance: "start by using api key, if it works, switch to
oauth. you will setup an outbound auth for your server using agentcore
identity. that identity can be (AC identity is like a wrapper around
secrets manager) api key or oauth."

Lambdas (orchestrator + processor) intentionally keep using Secrets
Manager via the existing `LinearApiTokenSecret` for now. The Python
`bedrock_agentcore` SDK has no Node.js equivalent — Lambda migration
requires `@aws-sdk/client-bedrock-agentcore` raw API calls and folds
into 2.0b's bigger refactor. End-state of 2.0a: agent reads from
Identity, Lambdas read from Secrets Manager, both pointing at the same
underlying token value (admin populates both).

`agent/src/config.py::resolve_linear_api_token`:

  - Drops boto3 SecretsManager fetch + `LINEAR_API_TOKEN_SECRET_ARN` env.
  - Reads new env `LINEAR_API_KEY_PROVIDER_NAME` (provider name in
    Identity vault).
  - Calls `IdentityClient.get_api_key()` with the workload access token
    auto-injected into `BedrockAgentCoreContext` by AgentCore Runtime
    (verified by reading the SDK's `auth.py` decorator implementation —
    no manual workload-identity mint needed inside the runtime).
  - Caches the resolved token in `LINEAR_API_TOKEN` so downstream
    consumers stay unchanged: `channel_mcp.py`'s `${LINEAR_API_TOKEN}`
    placeholder in `.mcp.json` and `linear_reactions.py`'s GraphQL
    Authorization header.

Preserves PR aws-samples#87's nice-to-have improvements:

  - `ImportError` graceful fallback (now for `bedrock_agentcore` instead
    of `boto3`) — degrade with WARN, don't crash the agent.
  - `AccessDeniedException` and `ResourceNotFoundException` logged at
    ERROR severity (persistent IAM/config bugs that should page).
    Other ClientErrors stay at WARN (transient throttle/network).

`agent/pyproject.toml`: adds `bedrock-agentcore==1.9.1` dep.

`cdk/src/stacks/agent.ts`:

  - On the AgentCore runtime: drops `linearIntegration.apiTokenSecret.
    grantRead(runtime)` and the `LINEAR_API_TOKEN_SECRET_ARN` env-var
    override. Adds `LINEAR_API_KEY_PROVIDER_NAME` env (hardcoded
    `'linear-api-key'` for now; can parametrize later via context if
    multi-environment naming is needed) and IAM permissions for
    `bedrock-agentcore:GetResourceApiKey` and
    `bedrock-agentcore:GetWorkloadAccessToken`.
  - Lambdas (orchestrator + processor) untouched — they still grant on
    the Linear secret and read from Secrets Manager.
  - Resource scope on the new IAM is `*` for now; AgentCore Identity ARN
    format isn't fully standardized in public docs as of 2026-05-15.
    Tighten in 2.0b when OAuth migration documents the canonical
    resource shape.

`docs/guides/LINEAR_SETUP_GUIDE.md`: adds Step 4.5 documenting the
one-time `agentcore add credential --type api-key --name linear-api-key`
admin command users must run alongside the existing `bgagent linear
setup` wizard. Notes that Lambdas keep Secrets Manager temporarily and
2.0b will retire the dual-store setup. Starlight mirror synced.

`agent/tests/test_config.py::TestResolveLinearApiToken` — 10 tests
covering: cached env var fast-path; missing provider name; missing
region; workload token absent (outside runtime); happy path with
env-var side-effect; botocore error swallowed with WARN; SDK returns
None defensively; ImportError fallback; AccessDeniedException → ERROR
severity; ResourceNotFoundException → ERROR severity.

542 agent / 1271 cdk / 196 cli, all green. Lint + typecheck clean.
CDK synth clean.

`bedrock_agentcore` SDK confirmed working in our runtime image (verified
in `node_modules` post-install). The `BedrockAgentCoreContext` workload
token auto-injection is documented behaviour for code running inside
AgentCore Runtime — verified by reading the SDK's `@requires_api_key`
decorator implementation, which uses the same context lookup we use
here.

Stacked on PR aws-samples#87 (`feat/linear-processor-feedback`). Will conflict on
`config.py` and `test_config.py` if aws-samples#87 needs further rework before
merge — happy to rebase.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(linear): use aws CLI for credential provider, not the agentcore command

The setup guide referenced `agentcore add credential` which doesn't actually
work end-to-end:

  - The Python `bedrock-agentcore-starter-toolkit` CLI (`agentcore`) only
    exposes agent-lifecycle commands; there is no `credential-provider`
    subcommand. Confirmed by reading the toolkit's CLI reference and by
    user trying `agentcore configure credential-provider --type api-key
    --name ...` and receiving `No such command 'credential-provider'`.
  - The new npm `@aws/agentcore` CLI does have `agentcore add credential`
    but uses a declarative project model — the credential lands in
    `agentcore.json` + `.env.local`, not the actual AgentCore Identity
    vault, until `agentcore deploy` runs against a project structured for
    that CLI. ABCA isn't structured that way.

Switch the docs to the plain AWS CLI which works directly against the
AgentCore Identity API:

    aws bedrock-agentcore-control create-api-key-credential-provider \
      --name linear-api-key \
      --api-key "<paste lin_api_… token here>" \
      --region us-east-1

Plus the matching `list-api-key-credential-providers` for verification.
Add a "Tooling note" at the bottom of the section explaining why the
plain AWS CLI is the right path here vs. the two `agentcore` CLIs.

Starlight mirror synced.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(linear): pass runtimeUserId so AgentCore injects WorkloadAccessToken

Smoke on backgroundagent-dev caught a real bug in the Phase 2.0a
migration: the agent's `resolve_linear_api_token()` was correctly
calling `IdentityClient.get_api_key()` but failing earlier at
`BedrockAgentCoreContext.get_workload_access_token()` returning None.
The Linear MCP then loaded with an unresolved `${LINEAR_API_TOKEN}`
placeholder and 👀 didn't post.

Root cause (from reading bedrock-agentcore-sdk-python source):

The `WorkloadAccessToken` request header (which the runtime container
reads to populate `BedrockAgentCoreContext`) is only injected by
AgentCore Identity when `InvokeAgentRuntimeCommand` is called with
`runtimeUserId`. Per AWS docs at
https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/runtime-oauth.html:

  "Agent Runtime exchanges this token for a Workload Access Token via
   bedrock-agentcore:GetWorkloadAccessTokenForJWT API and delivers it
   to your agent code via the payload header `WorkloadAccessToken`."

Without `runtimeUserId`, AgentCore never derives a workload token and
the header is absent. `app.py::_build_request_context` reads the
header off the inbound request; the agent sees None.

Fix:

1. Thread `userId` through the `ComputeStrategy.startSession` interface
   (compute-strategy.ts).
2. Pass `task.user_id` (the task's Cognito sub) at the call site in
   orchestrate-task.ts.
3. Set `runtimeUserId: input.userId` on `InvokeAgentRuntimeCommand` in
   agentcore-strategy.ts. Log it alongside session_id for traceability.
4. ECS strategy accepts the new parameter to satisfy the interface;
   doesn't use it (ECS doesn't go through AgentCore Identity).
5. Grant the orchestrator role `bedrock-agentcore:InvokeAgentRuntimeForUser`
   alongside `InvokeAgentRuntime` (task-orchestrator.ts). Without this,
   the new `runtimeUserId` parameter would 403.

Tests updated:
- `agentcore-strategy.test.ts`: pin that `runtimeUserId` flows from
  input into the SDK command; pass `userId: 'cognito-user-1'` in 4 call
  sites.
- `ecs-strategy.test.ts`: pass `userId` (unused by ECS) on 3 call sites.
- `start-session-composition.test.ts`: pass `userId: 'cognito-test'` on
  3 call sites.
- `task-orchestrator.test.ts`: assert the IAM action list includes
  `InvokeAgentRuntimeForUser` (2 assertions).

542 agent / 1273 cdk / 196 cli — all green. Lint clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(linear): two undocumented gotchas to make AgentCore Identity actually work

End-to-end smoke on backgroundagent-dev surfaced two more silent failure
modes after the runtimeUserId fix landed:

`BedrockAgentCoreContext.get_workload_access_token()` returned None inside
the pipeline thread even though the platform delivered the token on the
inbound request. Cause: Python ContextVar storage is per-thread, not
shared across `threading.Thread` boundaries. Our `_run_task_background`
spawns a new thread for the pipeline, so any context-var the SDK's
middleware sets in the request handler thread doesn't reach it.

Compounding factor: the SDK's `_build_request_context` middleware only
runs when using `BedrockAgentCoreApp` from `bedrock_agentcore.runtime`.
Plain FastAPI apps like ours never get that bridge at all.

Fix: read the workload token off the request in `_extract_invocation_params`
(handling both observed header spellings — `WorkloadAccessToken` and
`x-amzn-bedrock-agentcore-runtime-workload-accesstoken`), thread it through
the kwargs of `_run_task_background`, and have the pipeline thread call
`BedrockAgentCoreContext.set_workload_access_token` on entry.

(2) Secrets Manager read access for the runtime role (cdk/src/stacks/agent.ts)

After (1) was applied, `IdentityClient.get_api_key()` actually fired and
got `AccessDeniedException: ... not authorized to perform:
secretsmanager:GetSecretValue`.

Cause: AgentCore Identity stores api-key credentials in Secrets Manager
under reserved prefix `bedrock-agentcore-identity!*` (the actual ARN
shape: `arn:aws:secretsmanager:REGION:ACCOUNT:secret:bedrock-agentcore-
identity!default/apikey/<provider-name>-<hash>`). The `GetResourceApiKey`
control-plane API surfaces the underlying secret to the caller, and AWS
verifies the *caller* role (our runtime role) has `GetSecretValue` on
the actual secret resource — not the SLR.

Fix: grant the runtime role `secretsmanager:GetSecretValue` scoped to
the `bedrock-agentcore-identity!*` prefix in the current
account/region. Tightly scoped to Identity-managed secrets; doesn't
leak read access to other Secrets Manager resources.
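
For reference, the scoped grant could be expressed as a plain IAM policy statement like the one below. This is a sketch, not the CDK code in `agent.ts`: the helper name `identitySecretReadStatement` is hypothetical, and the resource ARN follows the shape quoted in the commit message with a trailing wildcard after the reserved prefix.

```typescript
interface PolicyStatement {
  Effect: "Allow";
  Action: string[];
  Resource: string[];
}

// Builds a statement allowing reads of Identity-managed secrets only,
// scoped to one account and region.
function identitySecretReadStatement(
  region: string,
  accountId: string,
  partition: string = "aws",
): PolicyStatement {
  return {
    Effect: "Allow",
    Action: ["secretsmanager:GetSecretValue"],
    Resource: [
      `arn:${partition}:secretsmanager:${region}:${accountId}:secret:bedrock-agentcore-identity!*`,
    ],
  };
}
```

Calling `identitySecretReadStatement("us-east-1", "123456789012")` produces a statement whose only resource is `arn:aws:secretsmanager:us-east-1:123456789012:secret:bedrock-agentcore-identity!*`.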

- Runtime container reads workload token from request, propagates across
  thread boundary, calls IdentityClient successfully
- 👀 reaction posts at +525ms after task pickup, no warnings
- Linear MCP loads cleanly with the resolved token
- No more `workload access token not in context` WARN
- No more `AccessDeniedException` from `GetResourceApiKey`

Three undocumented requirements total for Phase 2.0a (combining with
the runtimeUserId fix from the prior commit):

  1. Caller (orchestrator) sends `runtimeUserId` and has
     `InvokeAgentRuntimeForUser` IAM
  2. Runtime container bridges the workload-token header into the
     ContextVar, with per-thread propagation if the pipeline runs in a
     spawned thread
  3. Runtime role has `secretsmanager:GetSecretValue` on
     `bedrock-agentcore-identity!*`

All three are silent failures on their own; missing any one returns None
or AccessDenied without obvious "you forgot X" diagnostics. Will file an
upstream issue against `aws/bedrock-agentcore-sdk-python` summarising
all three so others don't burn the same cycles.

Tests: 542 agent / 1273 cdk / 196 cli — all green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(2.0b): foundation — workspace registry, admin invite-user, Linear app template

Wave 1 of Phase 2.0b: prereq pieces for the Linear OAuth migration.

- LinearWorkspaceRegistryTable: maps Linear org-id → AgentCore credential
  provider name, so webhook + orchestrator Lambdas can resolve the
  workspace's OAuth token without knowing about provider naming.
- bgagent admin invite-user: wraps Cognito admin-create-user with the
  right defaults and prints a base64 bundle that --from-bundle decodes
  into ~/.bgagent/config.json. Replaces a four-flag dance with a single
  paste for joining teammates.
- bgagent linear app-template: prints the Linear OAuth app form values
  captured from the 2.0b spike — GitHub username with [bot] suffix and
  Webhooks ON gate the actor=app flow; misleading "Invalid redirect_uri"
  error is the symptom when either is missing.
- USER_GUIDE roles section + joining-an-existing-deployment flow: makes
  the four-role lifecycle explicit (stack admin / workspace admin /
  repo onboarder / teammate) so a teammate landing on the docs has a
  clear non-admin path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(linear): rewrite setup guide for OAuth (2.0b)

Replace the personal-API-key flow with the Linear OAuth `actor=app`
install path verified by the 2.0b spike. Major changes:

- Step 1: AgentCore credential provider via `bgagent linear
  oauth-register-workspace`, capturing the AWS-hosted callback URL
  that Linear will actually see.
- Step 2: Linear OAuth app creation via `bgagent linear app-template`,
  documenting the GitHub-username-with-[bot]-suffix and Webhooks-ON
  gates that produce Linear's misleading "Invalid redirect_uri" error
  when missing.
- Step 4: OAuth dance via the rewritten `bgagent linear setup` —
  ephemeral localhost HTTPS callback; no own ALB/Lambda needed since
  AWS proxies the OAuth flow.
- Step 7: clarify that the PAK-owner auto-link becomes the
  setup-runner auto-link; the manual DDB mapping path stays for now
  until self-service `@bgagent link` ships.
- New "Adding additional Linear workspaces" section for
  multi-workspace deployments.
- New "Migration from 2.0a (PAK) to 2.0b (OAuth)" runbook.
- Troubleshooting expanded to cover the Invalid-redirect_uri and
  401-from-Linear scenarios surfaced in the spike.

Note: the docs reference commands that ship in Wave 2 (aws-samples#63
oauth-register-workspace, aws-samples#65 setup wizard, aws-samples#67 add-workspace). The
2.0b branch is a coherent unit, and aws-samples#62 must land before those flows
are wired so the docs aren't a moving target during implementation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(cli): workload-token retrieval helper for AgentCore Identity (2.0b C1)

The CLI runs OUTSIDE AgentCore Runtime, so the in-container ContextVar
trick from 2.0a does not apply. This module gives every 2.0b OAuth-flow
command a single way to obtain a workload access token:

- getWorkloadAccessToken({region, workloadName, userId}) calls the
  data-plane GetWorkloadAccessTokenForUserId, scoping the resulting
  token to (workload, cognito_sub) so OAuth-token retrieval is per
  platform user.
- decodeCognitoSub(idToken) extracts the sub claim from the cached
  id_token, parsing only — token validation is API Gateway's job.
- DEFAULT_CLI_WORKLOAD_NAME is the deployment-time convention; the
  workload identity itself will be created by a follow-up CDK custom
  resource (aws-samples#61). Stack output 'CliWorkloadIdentityName' wires the
  CLI to whatever the deployed name actually is.
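
For illustration, parse-only extraction of the `sub` claim might look like the sketch below. It is a Python rendering of the idea, not the CLI's TypeScript code, and `decode_cognito_sub` is a hypothetical name. No signature check is performed, matching the note that validation is API Gateway's job.

```python
import base64
import json


def decode_cognito_sub(id_token: str) -> str:
    """Return the `sub` claim from a JWT without verifying its signature."""
    parts = id_token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a three-part JWT")
    payload_b64 = parts[1]
    # base64url payloads omit padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    sub = claims.get("sub")
    if not isinstance(sub, str) or not sub:
        raise ValueError("id_token has no usable sub claim")
    return sub
```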

Two SDK errors get translated into actionable remediation hints:
- ValidationException: WorkloadIdentity is linked to a service —
  documented footgun from the spike, surfaces when the CLI is
  pointed at a runtime workload.
- AccessDeniedException / ResourceNotFoundException — same surface
  treatment, with the bgagent-side checklist embedded in the message.

Adds @aws-sdk/client-bedrock-agentcore + bedrock-agentcore-control as
CLI deps. Pins all CLI AWS SDK clients to 3.1024.0 (matching) to keep
the @smithy/core dependency graph deduplicated; mixed-version pins
caused interface-collision typecheck errors.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(cli): bgagent linear oauth-register-workspace (2.0b B2)

Registers a Linear workspace as an AgentCore OAuth2 credential provider.
The command:

- Validates the workspace slug shape ([a-zA-Z0-9_-]{4,50}) so the
  resulting provider name fits AgentCore's 64-char limit.
- Prompts for clientId + clientSecret (interactive, not echoed).
- Calls CreateOauth2CredentialProvider with credentialProviderVendor=
  'CustomOauth2' and explicit authorizationServerMetadata for Linear's
  fixed endpoints (Linear has no .well-known/openid-configuration, so
  vendor-discovery cannot auto-resolve).
- Prints the AWS-hosted callback URL the operator pastes into Linear's
  app form — the AWS-side proxy that Linear actually redirects to.
- Idempotent: re-running with an existing provider name fetches the
  callbackUrl and reports "already exists — re-using it".
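
The slug check in the first bullet can be sketched as a full-string regex match. The pattern is the one quoted above; the function and constant names are hypothetical.

```python
import re

# Pattern quoted in the commit message: 4 to 50 characters from [a-zA-Z0-9_-].
SLUG_PATTERN = re.compile(r"[a-zA-Z0-9_-]{4,50}")


def is_valid_workspace_slug(slug: str) -> bool:
    """True only when the whole string matches the slug shape."""
    return SLUG_PATTERN.fullmatch(slug) is not None
```

Using `fullmatch` rather than `match` rejects over-length input that merely starts with a valid slug.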

Smoke test against dev account (2026-05-19) revealed AWS surfaces the
duplicate-name case as ValidationException (NOT ConflictException as
CFN/REST conventions would suggest). Detection is by message-substring
match; tests cover both the duplicate path and the "ValidationException
for a non-duplicate reason" path so we don't accidentally swallow input
validation errors.
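
A rough sketch of the two-path classification follows. The marker substring is an assumption made for illustration; the exact message text AWS returns is not recorded in this commit.

```python
# Assumption: the duplicate-name message contains this text. The real
# substring used by the CLI is not stated in the commit message.
DUPLICATE_MARKER = "already exists"


def classify_create_error(code: str, message: str) -> str:
    """Separate duplicate-name errors from other validation failures."""
    if code == "ValidationException":
        if DUPLICATE_MARKER in message.lower():
            return "duplicate"      # safe to fetch and reuse the provider
        return "invalid-input"      # must surface to the operator
    return "other"
```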

AccessDeniedException gets a remediation hint pointing at the
'bedrock-agentcore:CreateOauth2CredentialProvider' permission, since
the most common misconfiguration is running the command as a
Cognito-authenticated CLI user (no permissions) rather than as an
admin/stack-deploy IAM principal.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(2.0b): CLI workload identity + localhost OAuth callback server (A3)

Two pieces that together let the CLI run the OAuth dance without any
externally-facing infrastructure:

CDK side (CliWorkloadIdentity construct, wired into the agent stack):
- Creates a dedicated AgentCore Identity workload identity named
  `bgagent-cli`, distinct from the runtime workload (which is service-
  linked and cannot mint user-scoped tokens).
- Allowlists `https://localhost:8443/oauth/callback` as a permitted
  resourceOauth2ReturnUrl. AgentCore validates browser-redirect URLs
  against this list, so the CLI cannot finish the OAuth dance without it.
- Implementation: AwsCustomResource (no L2/L1 for AgentCore Identity in
  CDK as of May 2026). Idempotent — Create/Update/Delete lifecycle wired
  so re-deploys reconcile the allowlist and stack-deletes don't leak
  workload identities (50/account-region quota).
- Stack outputs `CliWorkloadIdentityName` and `LinearWorkspaceRegistryTableName`
  so the CLI can discover them at runtime.

CLI side (oauth-callback-server module):
- Generates a fresh self-signed cert in /tmp via openssl on each
  invocation; cert is cleaned up when the server shuts down.
- Starts an HTTPS listener on localhost:8443/oauth/callback, captures the
  first request's `session_id` query param, renders a success page,
  shuts down. Uses res.once('finish') to ensure the response body
  flushes before the listener closes — otherwise the browser hangs
  waiting for bytes that never arrive (caught by integration test).
- Translates EADDRINUSE and timeout into actionable CliErrors.

The CLI URL constant and the CDK default allowlist must agree on the
exact URL string — drift would silently break the OAuth dance with
"redirect_uri not allowlisted". A regression-locking test on the URL
constant + matching CDK default flags the issue at unit-test time.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(cli): bgagent linear setup OAuth dance orchestration (2.0b C2/C3)

Replaces the personal-API-key wizard with a 7-step OAuth flow that
authorizes a Linear workspace via AgentCore Identity:

  1. Resolve stack outputs (CliWorkloadIdentityName, registry table,
     user mapping table, webhook secret ARN). Errors loudly if any are
     missing — typically means the stack predates 2.0b.
  2. Read Cognito sub from cached id_token.
  3. Mint workload access token via getWorkloadAccessTokenForUserId.
  4. Initiate OAuth dance: getResourceOauth2Token returns an authorize
     URL + sessionUri. customParameters: {actor: 'app'} propagates so
     Linear surfaces the Agent install variant of the consent screen
     (verified via 2.0b spike).
  5. Start localhost HTTPS callback server, open browser to the auth
     URL, await session_id from the callback.
  6. Poll getResourceOauth2Token (5s/600s) until accessToken arrives;
     translate sessionStatus=FAILED into a Linear-app-config remediation
     hint.
  7. Query Linear viewer + organization with the OAuth token, persist
     the workspace registry row + admin user-mapping row, then prompt
     for the webhook signing secret if not already configured.
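
Step 6 can be sketched as a bounded poll loop. The `accessToken` and `sessionStatus` keys are the ones named above; the function name and the injected `fetch`, `sleep`, and `clock` parameters are hypothetical, added so the loop can be exercised without network calls.

```python
import time
from typing import Callable


def poll_for_access_token(
    fetch: Callable[[], dict],
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Call `fetch` every `interval_s` seconds until a token arrives."""
    deadline = clock() + timeout_s
    while True:
        resp = fetch()
        token = resp.get("accessToken")
        if token:
            return token
        if resp.get("sessionStatus") == "FAILED":
            raise RuntimeError(
                "OAuth session failed; check the Linear app configuration"
            )
        if clock() >= deadline:
            raise TimeoutError("timed out waiting for the OAuth access token")
        sleep(interval_s)
```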

Hard cutover from PAK: the new wizard is OAuth-only — there is no
--use-pak flag. The webhook signing secret prompt remains because
HMAC verification of inbound Linear webhooks is independent of how the
agent calls Linear outbound. Webhook prompt is skipped on subsequent
add-workspace runs by detecting the lin_wh_ prefix on the stored
secret; --rotate-webhook-secret forces a re-prompt.

Splits queryLinearIdentity out so both the legacy PAK auto-link helper
(authorization=`lin_api_…`) and the OAuth path (authorization=
`Bearer <token>`) reuse the same GraphQL query. The PAK helper stays
exported to support the legacy linkage path until LinearApiTokenSecret
is retired in aws-samples#70.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(cdk): use full SDK v3 package name for AgentCore Identity custom resource

CDK's AwsCustomResource auto-derives the SDK package name from `service`
by lowercasing and dropping hyphens — `'BedrockAgentCoreControl'` becomes
`@aws-sdk/client-bedrockagentcorecontrol`, which doesn't exist. The
actual package is `@aws-sdk/client-bedrock-agentcore-control` (hyphens).

Verified by deploy: with the lowercased mapping the Lambda backing the
CR fails with "Package @aws-sdk/client-bedrockagentcorecontrol does not
exist" and the stack rolls back. Switching to the full v3 package name
(supported per the AwsSdkCall.service jsdoc) routes the import correctly.

Verified end-to-end: `bgagent-cli` workload identity created with
`https://localhost:8443/oauth/callback` on the resourceOauth2ReturnUrls
allowlist, stack outputs populated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(2.0b): park diagnostic flags + tokenEndpointAuthMethods + force-reauth

Smoke test against backgroundagent-dev (2026-05-19) hit a service-side
bug in AgentCore Identity: USER_FEDERATION token-exchange against
Linear's /oauth/token never completes. sessionStatus stays IN_PROGRESS
indefinitely, no FAILED transition, no diagnostics on the wire.

Verified via manual curl that Linear's token endpoint works perfectly
with the same clientId/secret/scopes/code/actor=app — bug is on AWS's
side. AgentCore Identity has zero token-injection APIs, so Option 3
(do OAuth ourselves + inject) is architecturally impossible. AWS
support case + PAR-compatibility upstream issue
aws/bedrock-agentcore-sdk-python#111 are the official fix paths.

Parking the wizard work but committing the diagnostic flags we added
during triage so they're available when this is unparked:

- `tokenEndpointAuthMethods: ['client_secret_post']` on the provider
  metadata. Linear expects POST-body credentials; AgentCore defaults to
  Basic. Field name verified against the SDK type (`tokenEndpointAuthMethods`,
  not the `Supported` suffix the boto3 reference suggested).
- `--verbose-poll` flag on `bgagent linear setup` — prints per-poll
  sessionStatus + response keys so the stuck state is visible.
- `--force-reauth` flag — sets `forceAuthentication: true` on
  GetResourceOauth2Token to bypass cached tokens after a Linear-side
  revoke.
- `CompleteResourceTokenAuth` call between callback capture and poll
  loop. Per AWS sample 09-Outbound_Auth_Self_Hosted, this is required
  to bind the captured session to a userId. Confirmed it's NOT what
  unblocks our specific bug, but is correct per spec for any
  USER_FEDERATION flow.

The status of the resume paths is tracked in memory/project_oauth_2_0b.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(2.0b-O2): direct Linear OAuth + per-workspace Secrets Manager

Replaces the parked AgentCore Identity OAuth flow with a CLI-side
direct OAuth dance against Linear's /oauth/token endpoint. The manual
curl smoke test on 2026-05-19 exercised the same flow and returned a valid
access_token in <100ms, so we know Linear's side works. AWS's
USER_FEDERATION wrapper is broken specifically for Linear (or
actor=app); see memory/project_oauth_2_0b.md for the parked-bug
details and resume prompt.

Architecture:
- New module cli/src/linear-oauth.ts owns the OAuth helpers:
  generatePkce (S256), buildAuthorizationUrl (with actor=app),
  exchangeAuthorizationCode, refreshAccessToken,
  StoredLinearOauthToken JSON shape, computeExpiresAt,
  isAccessTokenExpiring (60s threshold), linearOauthSecretName.
  19 hermetic tests (no network).
- Per-workspace Secrets Manager secret bgagent-linear-oauth-<slug>
  holds the token JSON. CLI creates+updates at runtime via upsertOauthSecret
  (CreateSecret + ResourceExistsException → PutSecretValue fallback).
- LinearWorkspaceRegistryTable row gains oauth_secret_arn. Lambdas
  resolve workspace → secret_arn → token JSON, with refresh-if-expiring.
  (Lambda migration is Wave C.)
- bgagent linear setup is rewritten end-to-end:
  prompt-for-credentials → PKCE → open browser → callback captures
  ?code+?state → state verify (CSRF) → exchangeAuthorizationCode →
  query Linear viewer+org → write secret + registry row + user mapping
  → webhook secret prompt (unchanged from prior wizard).
  No AgentCore calls. No polling. No CompleteResourceTokenAuth.
- Localhost callback server now exposes both AgentCore-style
  (session_id) and direct-Linear-style (code+state+error) shapes
  via a CallbackResult with nullable fields. Backward-compat with
  the parked AgentCore path's tests.
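
As an illustration of the S256 PKCE helper listed above, a Python equivalent might look like this. The CLI module itself is TypeScript, and the verifier length chosen here is an assumption.

```python
import base64
import hashlib
import secrets


def generate_pkce() -> tuple[str, str]:
    """Return (code_verifier, code_challenge) using the S256 method."""
    # 64 random bytes encode to an 86-character URL-safe verifier.
    verifier = secrets.token_urlsafe(64)
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    # The challenge is the base64url digest with padding stripped.
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge
```

The verifier stays on the CLI side and is sent only with the token exchange; the challenge goes into the authorization URL.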

Removals:
- cli/src/agentcore-identity.ts + test (workload-token helper)
- cdk/src/constructs/cli-workload-identity.ts + test (workload identity)
- providerNameForWorkspace, buildLinearProviderInput,
  registerLinearWorkspace, initiateOauthDance, completeResourceTokenAuth,
  pollForOauthAccessToken, AgentCore SDK imports — all gone from linear.ts
- bgagent linear oauth-register-workspace command (no AWS-side provider
  to register; folded into setup)
- CliWorkloadIdentityName CfnOutput from agent.ts
- 6 describe blocks of AgentCore-flavored tests in linear.test.ts

Net change: -1100 lines, +700 lines of new direct-OAuth wiring.
286/286 CLI tests pass. 9/9 linear-integration CDK tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(2.0b-O2): space-separated OAuth scopes + --no-actor-app diagnostic

End-to-end smoke test against backgroundagent-dev (2026-05-20):

- The OAuth dance was failing with Linear's "Invalid redirect_uri" error
  even though the redirect_uri was correct. Root cause: scopes were
  comma-separated (`read,write,...`) instead of space-separated. RFC
  6749 §3.3 mandates space; Linear surfaces the violation as the
  misleading "Invalid redirect_uri" error, the same misdirection we
  hit during the 2.0b spike. Fix: `.join(' ')` in buildAuthorizationUrl.
- Adds `--no-actor-app` diagnostic flag on `bgagent linear setup`. Drops
  the `actor=app` query param so a stuck flow can be isolated to
  agent-install vs vanilla-OAuth without changing the Linear app config.
  Off by default; surfaces a warning when invoked.
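
A sketch of the corrected URL construction follows, with scopes joined by a space and `actor=app` optional. The authorize endpoint constant and the parameter names on the Python function are assumptions for illustration; the real helper is `buildAuthorizationUrl` in the TypeScript CLI.

```python
from urllib.parse import urlencode

# Assumption: Linear's authorize endpoint; not quoted in this commit.
LINEAR_AUTHORIZE_URL = "https://linear.app/oauth/authorize"


def build_authorization_url(
    client_id: str,
    redirect_uri: str,
    scopes: list[str],
    state: str,
    code_challenge: str,
    actor_app: bool = True,
) -> str:
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        # RFC 6749 section 3.3: scopes are space-delimited, not comma-delimited.
        "scope": " ".join(scopes),
        "state": state,
        "code_challenge": code_challenge,
        "code_challenge_method": "S256",
    }
    if actor_app:
        params["actor"] = "app"
    # urlencode emits the space separator as "+".
    return f"{LINEAR_AUTHORIZE_URL}?{urlencode(params)}"
```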

After the fix, full smoke test passed:
- Browser opens to Linear consent
- User authorizes, redirects to https://localhost:8443/oauth/callback
- CLI captures code+state, exchanges for access_token + refresh_token
- Token JSON persisted to bgagent-linear-oauth-maguireb in Secrets Manager
- LinearWorkspaceRegistryTable row written with oauth_secret_arn
- LinearUserMappingTable row written for the admin
- Token verified against Linear's GraphQL viewer query (works)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(2.0b-O2): Wave C — migrate Lambdas + agent runtime to per-workspace OAuth

Replaces every consumer of the legacy LinearApiTokenSecret PAK with the
per-workspace Secrets Manager OAuth-token pattern from Waves A/B.
Deploying this commit fully cuts over the integration; the
LinearApiTokenSecret construct is gone.

CDK side:
- New `cdk/src/handlers/shared/linear-oauth-resolver.ts` resolves
  workspace_id → registry row → oauth_secret_arn → token JSON →
  refresh-if-expiring → access_token. In-memory caches (1m TTL) on
  both registry rows and token JSON. Lazy refresh with PutSecretValue
  write-back so concurrent Lambdas see the rotated token. 11 unit tests.
- linear-feedback.ts: postIssueComment / addIssueReaction /
  reportIssueFailure now take a {linearWorkspaceId, registryTableName}
  context instead of an apiTokenSecretArn. Auth header switches from
  bare PAK value to `Bearer ${accessToken}`.
- linear-webhook-processor.ts: env var LINEAR_WORKSPACE_REGISTRY_TABLE_NAME
  replaces LINEAR_API_TOKEN_SECRET_ARN. safeReportIssueFailure threads
  the webhook payload's organizationId through to the resolver. Webhook
  processor now stamps `linear_oauth_secret_arn` + `linear_workspace_slug`
  into channel_metadata at task-creation time so the agent runtime can
  fetch the secret directly without a registry round-trip.
- orchestrate-task.ts: notifyLinearOnConcurrencyCap reads
  LINEAR_WORKSPACE_REGISTRY_TABLE_NAME and the task's
  channel_metadata.linear_workspace_id.
- LinearIntegration construct: drops apiTokenSecret + ApiTokenSecret
  Secrets Manager resource entirely. Webhook processor IAM now grants
  Get+Put on `bgagent-linear-oauth-*` Secrets Manager prefix.
- Agent stack: orchestrator IAM mirrors the new prefix grant.
  Runtime IAM drops AgentCore Identity grants and gains Get+Put on
  `bgagent-linear-oauth-*`. LINEAR_API_KEY_PROVIDER_NAME env var,
  LINEAR_API_TOKEN_SECRET_ARN env var, and LinearApiTokenSecretArn
  CfnOutput all removed.

Agent side (Python):
- config.py::resolve_linear_api_token: rewritten to read the per-task
  channel_metadata.linear_oauth_secret_arn (or LINEAR_OAUTH_SECRET_ARN
  env fallback) via boto3.secretsmanager. Lazy refresh: if expires_at
  is within 60s, POST refresh_token grant to Linear /oauth/token using
  client_id/client_secret co-located in the secret JSON, write the
  rotated token back via put_secret_value, return the new access_token.
- pipeline.py: passes config.channel_metadata into resolve_linear_api_token.
- linear-oauth.ts (CLI): StoredLinearOauthToken schema gains client_id +
  client_secret fields so Lambda + agent refresh can run without
  per-Lambda OAuth env vars. Setup wizard writes them.
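
The 60-second expiry check behind the lazy refresh can be sketched as below. The ISO-8601 format of `expires_at` is an assumption, and `is_expiring` is a hypothetical stand-in for the helper in `config.py`.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

EXPIRY_THRESHOLD = timedelta(seconds=60)


def is_expiring(expires_at: str, now: Optional[datetime] = None) -> bool:
    """True when the token expires within the threshold or cannot be parsed."""
    now = now or datetime.now(timezone.utc)
    try:
        # Assumption: expires_at is stored as an ISO-8601 timestamp.
        expiry = datetime.fromisoformat(expires_at)
    except ValueError:
        # Unparseable expiry is treated as expiring so a refresh is attempted.
        return True
    if expiry.tzinfo is None:
        expiry = expiry.replace(tzinfo=timezone.utc)
    return expiry - now <= EXPIRY_THRESHOLD
```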

Tests pruned of AgentCore Identity mocks; new tests cover the
Secrets-Manager-direct path (CDK 11 + agent 6 new).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(orchestrator): bundle import.meta.url shim for durable-execution SDK

`@aws/durable-execution-sdk-js@1.1.3`'s ESM build calls
`fileURLToPath(import.meta.url)` at module load. esbuild's ESM→CJS
bundling leaves `import.meta.url` undefined, crashing every invocation
with `TypeError: path must be a string`.

Define an identifier substitution + banner that materializes a valid
file:// URL from `__filename` at runtime. Discovered while smoke-testing
Wave C end-to-end on backgroundagent-dev.

Refs: aws/aws-durable-execution-sdk-js#543

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(cli): switch OAuth callback to plain HTTP localhost

Per RFC 8252 §7.3, OAuth providers (including Linear) treat
http://localhost as a special case that doesn't need TLS — the
connection never leaves the host. The previous self-signed-cert HTTPS
approach forced testers through a "connection not private" warning that
scared them off mid-setup.

Drops the openssl shell-out + temp-cert plumbing (~60 lines) along with
the user-facing warning copy in `bgagent linear setup`. Updates the
callback constants to http://localhost:8080/oauth/callback and the test
suite to plain http.GET.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(agent): ruff lint + format for OAuth refresh path

Five lint errors surfaced when CI ran `ruff check --fix` against the
Wave C agent changes:

- F401 unused `timezone` import in `config.py` (replaced with
  `timedelta`, which is what's actually needed)
- RUF034 useless if-else in the `expires_at` ternary — both branches
  returned identical strings before the recompute below; flatten into
  a single straightforward `if expires_in: ... else: ...` block
- E501 three line-length violations in `config.py` and
  `test_config.py` — break the long expressions onto helper-named
  intermediates

Confirmed locally: `ruff check .` clean, `ruff format --check` clean,
`pytest tests/test_config.py` 15/15 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(linear-feedback): rewrite tests against OAuth context signature

Wave C migrated postIssueComment / addIssueReaction / reportIssueFailure
from a (secretArn: string, ...) signature to a (ctx: LinearFeedbackContext,
...) signature, but the test file still passed bare strings — TypeScript
caught it at compile time only when CI ran a full build. Three test
suites failed to compile (the typecheck error blocked the whole suite,
not just this file).

- Mock `resolveLinearOauthToken` (the new resolver) instead of
  `getLinearSecret` (the old PAK fetcher).
- Build a `LinearFeedbackContext` fixture with linearWorkspaceId +
  registryTableName, pass it everywhere SECRET_ARN was used.
- Update the Authorization-header assertion to match the new
  `Bearer <token>` form (PAK was bare-token; OAuth is Bearer-prefixed).

All 41 tests across linear-feedback, linear-webhook-processor, and
orchestrate-task-feedback pass locally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(cdk): align yarn.lock with upstream main + bump table count for OAuth registry

Two CI failures came together because they share a root cause: this
branch's yarn.lock had drifted from upstream main during interim
re-resolves, leaving an inconsistent dep tree that broke ts-jest's
module resolution for @aws-cdk/mixins-preview/aws-bedrockagentcore.
Restoring upstream main's yarn.lock fixes the resolution; the
agent.test.ts table-count assertion then needs to bump from 12 to
13 to account for the LinearWorkspaceRegistryTable added in
Phase 2.0b Wave A4.

Verified locally: agent.test.ts (44/44) and github-tags.test.ts
(5/5) both pass after the changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* style: apply eslint --fix from CI's self-mutation guard

CI runs `mise run build`, which invokes `eslint --fix` and then fails if
the working tree changed (self-mutation guard). Three cosmetic lints
needed applying:

- Import-order: DynamoDBClient and CliError moved earlier in their
  files to satisfy alphabetic-by-package ordering
- formatJson import added in alphabetic position in linear.ts
- Three template literals with no interpolation converted to
  single-quoted strings in oauth-callback-server.ts and linear.ts
  (eslint quotes rule prefers single-quotes when no template
  variables are used)

Pure mechanical fixes; no behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(oauth): refresh-token race recovery + log gaps from review

Addresses the blocker + critical items from PR review:

- Refresh-token race (review blocker). Linear rotates refresh_tokens
  on every use; concurrent Lambdas/agents racing the same secret will
  all read the same expiring token and one's refresh will succeed
  while the others get `invalid_grant`. On `invalid_grant`, re-read
  the secret from Secrets Manager (bypassing cache). If the
  refresh_token has changed, another caller already rotated; use the
  freshly-read token (or retry refresh once if it's also expiring).
  If unchanged, the refresh_token is permanently rejected and the
  workspace needs re-onboarding. Implemented in both the TS resolver
  (linear-oauth-resolver.ts) and Python resolver (config.py).

- Unguarded bedrock_agentcore import in agent/src/server.py
  (review critical). The bare `from bedrock_agentcore.runtime.context
  import BedrockAgentCoreContext` inside `_run_task_background` killed
  the entire pipeline thread with no diagnostic if the SDK was
  missing or its module structure changed. Wrap in
  try/except (ImportError, AttributeError) and log via _warn_cw —
  the Linear token resolver has its own SM fallback, so the agent
  can proceed without the workload-token bridge.

- Cache invalidation on fetch-level refresh failure (review high).
  The TS resolver's `invalidateLinearOauthCache()` only ran in the
  `!resp.ok` branch; if `fetch()` itself threw (timeout, DNS), the
  catch returned null without invalidating, leaving the stale
  expiring token cached for 60s and hammering Linear's token
  endpoint. Move invalidate into the fetch-level catch too.

- Malformed expires_at log (review medium). The Python `_is_expiring`
  caught `ValueError` and silently returned True, masking
  consistently-bad writes. Add a WARN log so operators see the bad
  data instead of just an unexplained refresh on every task.

- Positive-path refresh log (review non-blocking #5). Added
  INFO-level breadcrumb on successful refresh in both resolvers
  so operators diagnosing intermittent 401s have a trace of which
  workspace refreshed and to what expiry.

11/11 existing resolver unit tests still pass; will add tests for
the new race-recovery branch in a followup commit.
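
The race-recovery branch can be sketched with injected callables. The exception classes and parameter names are hypothetical; the decision logic (re-read on `invalid_grant`, compare refresh tokens, retry once) follows the first bullet above.

```python
from typing import Callable


class InvalidGrant(Exception):
    """Hypothetical error for an `invalid_grant` response from Linear."""


class ReauthRequired(Exception):
    """Hypothetical error: the workspace must be re-onboarded."""


def refresh_with_race_recovery(
    cached: dict,
    read_secret: Callable[[], dict],
    refresh: Callable[[dict], dict],
    is_expiring: Callable[[dict], bool],
) -> dict:
    try:
        return refresh(cached)
    except InvalidGrant:
        # Re-read from Secrets Manager, bypassing any in-memory cache.
        fresh = read_secret()
        if fresh["refresh_token"] == cached["refresh_token"]:
            # Nobody rotated the token, so the rejection is permanent.
            raise ReauthRequired("refresh_token rejected; re-onboard the workspace")
        if not is_expiring(fresh):
            return fresh  # another caller already rotated; use its token
        return refresh(fresh)  # retry exactly once with the rotated token
```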

* fix(2.0b-O2): review-2 batch — error specificity, half-creates, runbook

Addresses four PR review items focused on operator UX when things go
sideways:

- isWebhookSecretConfigured (review high). The bare
  `catch { return false }` swallowed AccessDeniedException and
  DecryptionFailureException, making setup re-prompt for a webhook
  secret when the real problem was IAM. Now: only
  ResourceNotFoundException returns false; everything else throws a
  CliError pointing the operator at the IAM gap. Test updated to
  assert both paths.

- admin invite-user half-create (review medium). If
  AdminCreateUser succeeds but AdminSetUserPassword fails (stricter
  password policy than generator, partial IAM grant on the Set verb),
  the user was left in FORCE_CHANGE_PASSWORD with no diagnostic. Wrap
  the second call in try/catch and throw a CliError that names the
  user, explains the broken state, and gives both a delete-user CLI
  and a manual-fix path.

- PAK migration runbook (review non-blocking #1). Expanded the
  "Migration from 2.0a (PAK) to 2.0b (OAuth)" section in
  LINEAR_SETUP_GUIDE.md with: a pre-deploy checklist, what survives
  the migration vs what doesn't, an explicit rollback note (fix
  forward; the original PAK secret is gone with the CFN resource),
  and the per-step difference between 2.0a-with-Identity (skipped) vs
  2.0a-with-PAK (migrate) deploys.

- Vestigial AgentCore Identity dep (review non-blocking #2).
  bedrock-agentcore==1.9.1 is kept in agent/pyproject.toml because
  the workload-token bridge in server.py still calls it (now wrapped
  in try/except per review batch 1). Add an inline comment
  explaining why it's pinned even though Phase 2.0b-O2 reads
  Secrets Manager directly — it's the seam for resuming the
  AgentCore Identity path in 2.0c.
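
The narrowed error handling in `isWebhookSecretConfigured` might look like the following Python sketch. `SecretReadError` and the injected `get_secret_value` callable are hypothetical stand-ins for the SDK call, and combining the check with the `lin_wh_` prefix test from the setup wizard is an assumption.

```python
from typing import Callable


class SecretReadError(Exception):
    """Hypothetical wrapper carrying the AWS error code."""

    def __init__(self, code: str):
        super().__init__(code)
        self.code = code


def is_webhook_secret_configured(get_secret_value: Callable[[], str]) -> bool:
    try:
        value = get_secret_value()
    except SecretReadError as err:
        if err.code == "ResourceNotFoundException":
            return False  # the only error that means "not configured"
        # Anything else (access denied, decryption failure) is an IAM or
        # KMS problem and must not trigger a re-prompt.
        raise RuntimeError(
            f"cannot read webhook secret ({err.code}); check IAM permissions"
        ) from err
    return value.startswith("lin_wh_")
```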

CLI tests: 13/13 pass.

* fix(2.0b-O2): batch 3 — schema validation, CallbackResult union, parity contract

Three remaining substantive review items from PR aws-samples#160:

- Validate all 11 fields in getOauthSecret (review non-blocking #7).
  Was checking only access_token / refresh_token / expires_at; missing
  client_id or client_secret only surfaced 24h later when the refresh
  call needed them and found undefined. Extracted the required-field
  list into a const next to the StoredOauthToken interface and check
  the full set at deserialization. Bad secrets fail fast at fetch
  time with a structured log line naming the missing fields.

- CallbackResult discriminated union (review non-blocking #6). Was
  `{ sessionId: string|null, code: string|null, state: string|null }`
  which let callers construct unreachable shapes. Split into
  `{ kind: 'agentcore', sessionId } | { kind: 'direct-oauth', code, state }`.
  Updated the resolver site (`oauth-callback-server.ts`), the
  consumer (`bgagent linear setup`), and the test file to use
  exhaustive type-narrowing. The setup wizard now errors clearly if
  it gets the agentcore shape (parked path) instead of silently
  passing nulls down.

- Cross-language schema-parity contract test (review non-blocking #3).
  CLI's StoredLinearOauthToken and Lambda's StoredOauthToken define
  the same JSON-in-Secrets-Manager schema independently; drift
  between the two would be a silent bug (CLI writes one field name,
  Lambda reads another, refresh works, every Lambda invocation logs
  a missing-field error). New test in
  `cdk/test/contracts/stored-oauth-token-parity.test.ts` regex-parses
  both interface definitions out of source and asserts the field set
  is equal. Also asserts the new
  `STORED_OAUTH_TOKEN_REQUIRED_FIELDS` const matches the interface,
  so future field additions can't drift between the validator and
  the type.
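
Fail-fast validation at deserialization can be sketched as below. Only five of the 11 required fields are named in this commit, so the tuple is deliberately partial; the function name is hypothetical.

```python
import json

# Partial list: these five fields are named in the commit message.
# The full 11-field set is not reproduced here.
REQUIRED_FIELDS = (
    "access_token",
    "refresh_token",
    "expires_at",
    "client_id",
    "client_secret",
)


def parse_stored_token(secret_string: str) -> dict:
    """Parse the secret JSON and reject it if any required field is absent."""
    try:
        data = json.loads(secret_string)
    except json.JSONDecodeError as err:
        raise ValueError("stored OAuth token is not valid JSON") from err
    if not isinstance(data, dict):
        raise ValueError("stored OAuth token must be a JSON object")
    missing = [name for name in REQUIRED_FIELDS if not data.get(name)]
    if missing:
        raise ValueError(
            f"stored OAuth token is missing fields: {', '.join(missing)}"
        )
    return data
```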

CLI tests 286/286 pass. CDK resolver + contract 13/13 pass.

* style: apply eslint --fix indentation on CallbackResult union

* fix(2.0b-O2): review-3 batch — defensive error handling, security hardening, refresh test coverage

Bugs (B1-B3):
- linear-oauth-resolver: try/catch around ddb.send() in getRegistryRow so
  transient DDB errors fail the resolver cleanly instead of crashing the
  Lambda thread
- orchestrate-task: try/catch around reportIssueFailure in
  notifyLinearOnConcurrencyCap; Linear feedback failures must never block
  the rejection path
- Python _fetch_token: guard json.loads + KeyError so a corrupted SM
  payload returns None (logged ERROR) rather than raising

Tests (T1-T3):
- Python: TestResolveLinearApiTokenRefreshPaths covering happy refresh,
  invalid_grant + concurrent rotation, invalid_grant + no rotation,
  malformed expires_at, network failure during refresh, and corrupted
  secret JSON
- TS: concurrent-refresh recovery via re-read; permanent-rejection on
  same refresh_token; cache invalidation after network failure

Security (S1-S3):
- agent runtime IAM: drop secretsmanager:PutSecretValue on the Linear
  OAuth secret prefix. Untrusted repo code in the agent must not be
  able to overwrite tokens; Lambdas (trusted) handle persistence. The
  refreshed in-memory token still works for the current task; rotated
  refresh_token is lost on agent exit but Linear's grace window
  absorbs the rare race where the agent refreshes strictly before any
  Lambda
- Python _try_refresh_once: narrow except Exception to
  (urllib.error.URLError, OSError) — programmer errors propagate with
  clean stack traces instead of being swallowed
- linear-oauth-resolver: RegistryRowStatus is now a discriminated
  'active' | 'revoked' literal; missing or unknown values fail closed
  to revoked rather than defaulting active
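
The fail-closed status rule in the last bullet reduces to a small function. This is a Python illustration of the idea with a hypothetical name, not the resolver's TypeScript code.

```python
from typing import Literal

RegistryRowStatus = Literal["active", "revoked"]


def parse_registry_status(value: object) -> RegistryRowStatus:
    """Map a raw registry value to a status, failing closed to 'revoked'."""
    # Only the exact literal 'active' enables the workspace; missing,
    # mistyped, or unknown values are treated as revoked.
    return "active" if value == "active" else "revoked"
```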

* fix(2.0b-O2): use email.message.Message for HTTPError hdrs in tests

ty's stricter typeshed flagged dict[Unknown, Unknown] passed where
Message[str, str] is expected. Drop-in replacement that satisfies
both ty and runtime.

* fix(2.0b-O2): update feedback test for B2 — helper now swallows internally

Round-3 review B2 moved the try/catch into notifyLinearOnConcurrencyCap.
The pre-existing test asserted the old contract (rejection propagates,
caller must catch). Flip the contract assertion to reflect the new
behavior: the helper is now best-effort end-to-end and returns undefined
on internal failure.

---------

Co-authored-by: bgagent <bgagent@noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Alain Krok <alkrok@amazon.com>