feat(attachments): add multimodal task attachments support #176

Merged: krokoko merged 22 commits into main from attachments on May 26, 2026

@krokoko krokoko commented May 26, 2026

Fixes #177

Summary

  • Implement end-to-end support for attaching images, text files, and URLs to agent tasks across all channels (CLI, webhook, Slack, Linear)
  • Three upload paths: inline base64 (≤ 500 KB), presigned S3 POST (up to 10 MB), and deferred URL fetch with SSRF protection
  • Security screening pipeline using Bedrock Guardrails with fail-closed semantics, magic bytes validation, dimension checks, and DNS resolution pinning for URL fetches
  • No native image processing dependencies — pure buffer parsing for dimension checks, raw PNG/JPEG bytes passed directly to Bedrock

Supported file types

| Category | MIME types | Extensions |
|---|---|---|
| Images | image/png, image/jpeg | .png, .jpg |
| Text files | text/plain, text/csv, text/markdown, application/json, application/pdf, text/x-log | .txt, .csv, .md, .json, .pdf, .log |

Security measures

  • Magic bytes validation — verifies binary signatures match declared MIME types, prevents polyglot files
  • Image dimension checks — PNG IHDR and JPEG SOF marker parsing (pure buffer, no native deps), rejects images > 8000px per side before Bedrock call
  • Bedrock Guardrail screening — images screened via ApplyGuardrailCommand image content blocks; text files/PDFs screened via text content blocks; prompt attack detection at
    MEDIUM strength
  • SSRF protection for URLs — DNS resolution pinning with TLS SNI (prevents rebinding TOCTOU), private IP blocking (RFC 1918, link-local, loopback, CGN, IPv6 ULA, all-zeros),
    redirect validation, HTTPS-only, 10s timeout
  • S3 versioning — pins object versions at screening time to prevent TOCTOU between screening and agent download
  • SHA-256 integrity verification — agent verifies checksum after download
  • Filename sanitization — rejects path traversal, null bytes, dot-prefix, non-safe characters
  • Size enforcement — 10 attachments/task, 500 KB inline, 3 MB total inline, 10 MB per attachment, 50 MB total per task
  • Fail-closed semantics — screening unavailability rejects the task; attachment errors propagate (never silently dropped)
  • DynamoDB TTL — failed/cancelled tasks set TTL for automatic cleanup per retention policy
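
The magic bytes check listed above can be illustrated with a minimal sketch. The helper name `matchesDeclaredMime` and the table layout are assumptions made for illustration and are not taken from validation.ts. It only compares the leading signature with the declared MIME type, so it shows the idea rather than a complete polyglot defence.

```typescript
// Hypothetical helper; the PR's validation.ts may be structured differently.
type ImageMime = "image/png" | "image/jpeg";

// Leading byte signatures for the two image types the PR admits.
const SIGNATURES: Record<ImageMime, number[]> = {
  "image/png": [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a],
  "image/jpeg": [0xff, 0xd8, 0xff],
};

// Returns true only when the buffer starts with the signature of the declared type.
function matchesDeclaredMime(bytes: Uint8Array, declared: ImageMime): boolean {
  const signature = SIGNATURES[declared];
  if (bytes.length < signature.length) return false;
  return signature.every((expected, i) => bytes[i] === expected);
}
```

A GIF payload declared as image/png fails this check because its first bytes spell "GIF8" rather than the PNG signature.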

Key changes

CDK Infrastructure

  • New AttachmentsBucket construct (versioned S3, 90-day lifecycle, block public access, enforce SSL)
  • New confirm-uploads Lambda (1024 MB, 180s timeout, parallel screening with concurrency 3)
  • New PendingUploadCleanup EventBridge rule (5-min schedule, auto-cancels stale uploads after 30 min with TTL)
  • New POST /v1/tasks/{task_id}/confirm-uploads API endpoint
  • New PENDING_UPLOADS task status with PRE_ACTIVE classification
  • Bedrock Guardrail configured for attachment image/text screening
  • WAF SizeRestrictions_BODY rule excluded for task creation endpoint (base64 payloads exceed 8 KB)

Handlers

  • attachment-screening.ts — image + text screening with retry (3 attempts, exponential backoff), pure buffer dimension parsing (PNG IHDR / JPEG SOF markers), no native dependencies
  • resolve-url-attachments.ts — SSRF-safe fetch with DNS pinning via native https.request + https.Agent({ servername }) for TLS SNI, redirect validation, streaming size
    enforcement, isPrivateIp covering IPv4 and IPv6
  • image-tokens.ts — synchronous Claude vision token estimation using buffer-parsed dimensions (resize-aware, 1568px cap, 28px tile padding)
  • confirm-uploads.ts — presigned upload confirmation with concurrent-call safety (conditional DDB writes), TTL on failure paths, pre-check concurrency with proper error
    discrimination
  • cleanup-pending-uploads.ts — expired task auto-cancel with S3 prefix cleanup, TTL, race condition handling (ConditionalCheckFailedException)
  • Updated create-task-core.ts — inline screening/upload path with try-catch around presigned URL generation
  • Updated orchestrator.ts — URL attachment resolution during hydration
  • Updated validation.ts — full attachment validation (magic bytes, MIME allowlist, filename, size limits); GIF/WebP rejected at admission
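
As a rough illustration of the private IP blocking described for resolve-url-attachments.ts, here is an IPv4-only sketch. The function names and the CIDR table are assumptions, not the PR's isPrivateIp. The table lists the ranges named in the PR description (RFC 1918, loopback, link-local, CGN, all-zeros); IPv6 handling is left out for brevity.

```typescript
// Hypothetical IPv4-only sketch; the real isPrivateIp also covers IPv6.
const BLOCKED_V4_CIDRS: Array<[string, number]> = [
  ["0.0.0.0", 8],      // all-zeros / "this network"
  ["10.0.0.0", 8],     // RFC 1918
  ["172.16.0.0", 12],  // RFC 1918
  ["192.168.0.0", 16], // RFC 1918
  ["127.0.0.0", 8],    // loopback
  ["169.254.0.0", 16], // link-local
  ["100.64.0.0", 10],  // carrier-grade NAT
];

// Parses dotted-quad notation into an unsigned 32-bit value, or null if malformed.
function parseIpv4(addr: string): number | null {
  const octets = addr.split(".");
  if (octets.length !== 4) return null;
  let value = 0;
  for (const part of octets) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const octet = Number(part);
    if (octet > 255) return null;
    value = value * 256 + octet;
  }
  return value;
}

// Plain arithmetic instead of bit shifts avoids signed 32-bit surprises.
function inCidr(ip: number, base: string, prefix: number): boolean {
  const baseValue = parseIpv4(base);
  if (baseValue === null) return false;
  const blockSize = 2 ** (32 - prefix);
  return Math.floor(ip / blockSize) === Math.floor(baseValue / blockSize);
}

// Fails closed: anything that does not parse is treated as blocked.
function isBlockedIpv4(addr: string): boolean {
  const ip = parseIpv4(addr);
  if (ip === null) return true;
  return BLOCKED_V4_CIDRS.some(([base, prefix]) => inCidr(ip, base, prefix));
}
```

Keeping the ranges in a data table means that adding a range later is a one-line change rather than a new branch.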

Agent runtime (Python)

  • models.py — AttachmentConfig and PreparedAttachment Pydantic models (frozen, no extras)
  • pipeline.py — attachment preparation step, multimodal content block injection for images

CLI

  • New --attachment flag (repeatable) on bgagent submit
  • Auto-routes: inline for ≤ 500 KB, presigned upload for larger files, URL for HTTPS links
  • Presigned upload flow with AbortController timeout (120s) and S3 XML error body validation
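
The auto-routing rule can be sketched as a small pure function. The name `chooseUploadPath`, the error messages, and the reading of KB/MB as 1024-based units are assumptions; the limits themselves (500 KB inline, 10 MB per attachment, HTTPS-only URLs) come from the PR description.

```typescript
type UploadPath = "inline" | "presigned" | "url";

// Assumption: the limits are interpreted as binary units.
const INLINE_MAX_BYTES = 500 * 1024;
const ATTACHMENT_MAX_BYTES = 10 * 1024 * 1024;

// Hypothetical routing helper: HTTPS links go to the deferred URL fetch path,
// local files go inline or presigned depending on size.
function chooseUploadPath(source: string, sizeBytes?: number): UploadPath {
  if (/^https:\/\//i.test(source)) return "url";
  if (/^http:\/\//i.test(source)) throw new Error("only HTTPS URLs are accepted");
  if (sizeBytes === undefined) throw new Error("local files need a known size");
  if (sizeBytes > ATTACHMENT_MAX_BYTES) throw new Error("attachment exceeds 10 MB");
  return sizeBytes <= INLINE_MAX_BYTES ? "inline" : "presigned";
}
```

For example, a 200 KB screenshot would be sent inline, while a 2 MB PDF would take the presigned path and therefore go through confirm-uploads.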

Documentation

  • Updated docs/design/ATTACHMENTS.md — comprehensive design doc reflecting simplified pipeline (no sharp, no EXIF stripping, no GIF/WebP conversion)
  • Updated API_CONTRACT.md — supported types (PNG/JPEG only), limits tables, confirm-uploads endpoint
  • Updated SECURITY.md — attachment screening details for submission-time and hydration-time
  • Updated USER_GUIDE.md — attachments section with supported types, limits, CLI usage, security screening explanation
  • Updated ROADMAP.md — marked attachments as implemented
  • Starlight mirrors regenerated

Test plan

  • cdk/test/handlers/shared/attachment-screening.test.ts — 18 tests: PNG/JPEG dimension parsing, magic bytes, GIF/WebP rejection, oversized rejection, Bedrock pass/block, raw
    pass-through
  • cdk/test/handlers/shared/resolve-url-attachments.test.ts — 17 tests: isPrivateIp covering IPv4 (RFC 1918, loopback, CGN, link-local), IPv6 (ULA, link-local, IPv4-mapped),
    and all-zeros
  • cdk/test/handlers/shared/validation.test.ts — updated for GIF/WebP rejection at admission
  • cdk/test/handlers/confirm-uploads.test.ts — presigned upload confirmation, concurrent-call safety, screening failure handling
  • cdk/test/handlers/cleanup-pending-uploads.test.ts — 6 tests: TTL on cancel, race condition, S3 cleanup, partial success, all-error throw
  • agent/tests/test_attachments.py — 10 tests: download/verify, checksum mismatch, size mismatch, read-only perms, VersionId passthrough
  • cdk/test/constructs/task-status.test.ts — PENDING_UPLOADS status transitions
  • Full CDK test suite: 97 suites, 1748 tests passing
  • TypeScript compilation clean (tsc --noEmit)
  • End-to-end: submit task with inline image attachment via CLI
  • End-to-end: submit task with presigned upload (> 500 KB file) via CLI
  • End-to-end: submit task with multiple attachments (inline plus presigned upload of a > 500 KB file) via CLI
  • End-to-end: submit task with URL attachment, verify SSRF protection blocks private IPs
  • Verify PENDING_UPLOADS task auto-cancels after 30 minutes

Design decisions

| Decision | Rationale |
|---|---|
| PNG/JPEG only (no GIF/WebP) | Eliminates sharp/libvips native dependency that caused ARM64 Lambda decode failures and cross-platform build issues. Bedrock accepts PNG/JPEG directly. |
| No EXIF stripping | Images are uploaded by the task submitter for their own agent, so metadata leakage risk is minimal. Eliminates native dep. |
| Pure buffer dimension parsing | PNG IHDR chunk (bytes 16-23) and JPEG SOF markers (0xFFC0/C1/C2) provide dimensions without any native library. Falls back gracefully if parsing fails. |
| Lambda memory reduction | CreateTask: 1024→512 MB, ConfirmUploads: 2048→1024 MB. No sharp cold-start overhead. |
| pdf-parse kept as only bundled dep | Still needed for PDF text extraction before Bedrock text screening. Lightweight, pure JS. |
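
The pure buffer dimension parsing decision can be sketched as follows. The function names are hypothetical and the JPEG walker is simplified: it assumes well-formed segments with no fill bytes before the first SOF marker. Both functions return null when parsing fails, leaving the fallback to the caller.

```typescript
interface Dimensions { width: number; height: number }

// PNG: 8-byte signature, 4-byte chunk length, "IHDR", then width and height
// as big-endian 32-bit integers at bytes 16-19 and 20-23.
function parsePngDimensions(bytes: Uint8Array): Dimensions | null {
  if (bytes.length < 24) return null;
  const isIhdr =
    bytes[12] === 0x49 && bytes[13] === 0x48 && bytes[14] === 0x44 && bytes[15] === 0x52;
  if (!isIhdr) return null;
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  return { width: view.getUint32(16), height: view.getUint32(20) };
}

// JPEG: walk segments after SOI until an SOF0/SOF1/SOF2 marker, whose payload is
// precision (1 byte), height (2 bytes), width (2 bytes), all big-endian.
function parseJpegDimensions(bytes: Uint8Array): Dimensions | null {
  if (bytes.length < 4 || bytes[0] !== 0xff || bytes[1] !== 0xd8) return null;
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  let offset = 2;
  while (offset + 9 <= bytes.length) {
    if (bytes[offset] !== 0xff) return null; // lost sync with the segment structure
    const marker = bytes[offset + 1];
    if (marker === 0xc0 || marker === 0xc1 || marker === 0xc2) {
      return { height: view.getUint16(offset + 5), width: view.getUint16(offset + 7) };
    }
    const segmentLength = view.getUint16(offset + 2); // includes the 2 length bytes
    if (segmentLength < 2) return null;
    offset += 2 + segmentLength;
  }
  return null;
}

// 8000px per side is the limit stated in the PR description.
function exceedsMaxSide(dims: Dimensions, maxSide = 8000): boolean {
  return dims.width > maxSide || dims.height > maxSide;
}
```

Only the first few dozen bytes of a PNG are needed, so the check can run before any Bedrock call without decoding pixel data.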

Area

  • cdk — infrastructure, handlers, constructs
  • agent — Python runtime / Docker image
  • cli — bgagent client
  • docs — guides or design sources (docs/guides/, docs/design/)
  • tooling — root mise.toml, scripts, CI workflows

Tip: AGENTS.md lists where to edit and which tests to extend.

Acknowledgment

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of the project license.

@krokoko krokoko changed the title from "feat(Attachments): add support for attachments" to "feat(attachments): add multimodal task attachments support" May 26, 2026
@krokoko krokoko marked this pull request as ready for review May 26, 2026 06:26
@krokoko krokoko requested a review from a team as a code owner May 26, 2026 06:26

@isadeks isadeks left a comment

Blockers (must fix before merge)

  1. Cleanup-on-failed-conditional-write race —
    cdk/src/handlers/confirm-uploads.ts:1782-1796, 1819-1827, 2122-2131
    failTaskOnScreening swallows ConditionalCheckFailedException and returns void,
    so the caller can't tell whether we transitioned the task. The caller then
    unconditionally calls cleanupAllAttachments, which can delete S3 objects
    belonging to a task that another caller already moved past PENDING_UPLOADS.
    Make failTaskOnScreening return a boolean and skip cleanup when it returns
    false.

  2. Redundant S3 PUT in screenSingleAttachment —
    cdk/src/handlers/confirm-uploads.ts:1906-1911
    EXIF stripping was removed, so screenResult.content === content. The
    unconditional PutObjectCommand rewrites the same bytes to the same key,
    doubles S3 cost per presigned attachment, and creates an extra version that
    confuses cleanup. Either skip the re-upload (reuse the existing
    versionId/checksum) or only PUT when the buffer actually changed.

  3. ListObjectVersions pagination bug —
    cdk/src/handlers/cleanup-pending-uploads.ts:1472
    if (objects.length === 0) break; exits on any empty page, but S3 can return an
    empty page with IsTruncated=true. Use NextKeyMarker/IsTruncated as the loop
    terminator, otherwise stale versions can leak past cleanup.
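
To make blocker 3 concrete, here is a sketch of a listing loop that terminates on IsTruncated rather than on an empty page. The page fetcher is injected as a hypothetical `ListPage` function so the example stays self-contained. The field names mirror the ListObjectVersions response, but the shape is simplified (delete markers are ignored, and Versions is assumed to always be present).

```typescript
interface ObjectVersionRef { Key: string; VersionId: string }

// Simplified view of one ListObjectVersions page.
interface VersionPage {
  Versions: ObjectVersionRef[];
  IsTruncated: boolean;
  NextKeyMarker?: string;
  NextVersionIdMarker?: string;
}

// Hypothetical wrapper around the S3 call; markers are undefined on the first request.
type ListPage = (keyMarker?: string, versionIdMarker?: string) => Promise<VersionPage>;

async function collectAllVersions(listPage: ListPage): Promise<ObjectVersionRef[]> {
  const collected: ObjectVersionRef[] = [];
  let keyMarker: string | undefined;
  let versionIdMarker: string | undefined;
  let truncated = true;
  while (truncated) {
    const page = await listPage(keyMarker, versionIdMarker);
    // An empty page is not a stop signal; only IsTruncated decides.
    collected.push(...page.Versions);
    truncated = page.IsTruncated;
    keyMarker = page.NextKeyMarker;
    versionIdMarker = page.NextVersionIdMarker;
  }
  return collected;
}
```

With this shape, a truncated page that happens to contain zero versions simply advances the markers and the loop continues.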

@krokoko krokoko requested a review from isadeks May 26, 2026 18:25

@isadeks isadeks left a comment

Re-reviewed at a9d4c6f. All 3 blockers from the previous pass are addressed:

  • failTaskOnScreening returns Promise<boolean>; both callers gate
    cleanupAllAttachments on the result. Race test added in
    confirm-uploads.test.ts.
  • ✅ Redundant PutObjectCommand removed from screenSingleAttachment
    passed-path; original versionId/sizeBytes reused.
  • cleanupTaskAttachments now drives the loop on IsTruncated with
    continue on empty pages.
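
The first fix can be sketched with an in-memory stand-in for the conditional DynamoDB write. Everything here is illustrative: the Map-based store, the "FAILED" and "RUNNING" status names, and the synchronous signatures are assumptions (the real failTaskOnScreening returns Promise<boolean>). Only the pattern of returning a boolean and gating cleanup on it reflects the review.

```typescript
// In-memory stand-in for a conditional update: throws an error named like the
// DynamoDB exception when the current status is not the expected one.
function conditionalTransition(
  store: Map<string, string>, taskId: string, expected: string, next: string,
): void {
  if (store.get(taskId) !== expected) {
    const err = new Error("status changed by another caller");
    err.name = "ConditionalCheckFailedException";
    throw err;
  }
  store.set(taskId, next);
}

// Returns true only if this caller performed the transition.
function failTaskOnScreening(store: Map<string, string>, taskId: string): boolean {
  try {
    conditionalTransition(store, taskId, "PENDING_UPLOADS", "FAILED");
    return true;
  } catch (err) {
    if (err instanceof Error && err.name === "ConditionalCheckFailedException") return false;
    throw err; // unrelated failures still propagate
  }
}

// Cleanup runs only when we own the transition, so objects of a task that
// another caller already moved past PENDING_UPLOADS are left alone.
function handleScreeningFailure(
  store: Map<string, string>, taskId: string, cleanup: (id: string) => void,
): void {
  if (failTaskOnScreening(store, taskId)) cleanup(taskId);
}
```

The losing caller in a race observes `false` and skips cleanup, which is the behaviour the added race test is described as covering.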

I also did a deep pass on the areas I hadn't audited the first time around —
CDK wiring, entry-point handlers, agent Python runtime.

CDK infrastructure — clean

  • IAM scoping is tight: create-task gets grantPut+grantDelete only;
    confirm-uploads gets grantReadWrite+grantDelete; cleanup gets
    grantRead+grantDelete; agent runtime gets grantRead only. Bedrock
    guardrail policy is scoped to the guardrail ARN, not wildcard.
  • Bucket: BLOCK_ALL public access, enforceSSL: true, S3-managed
    encryption, versioning on, 90-day current + 7-day noncurrent lifecycle
    attached.
  • WAF SizeRestrictions_BODY exclusion is properly scoped to /v1/tasks
    (STARTS_WITH) and /v1/linear/webhook (EXACT) via scopeDownStatement — not
    stack-wide.
  • EventBridge schedule uses CDK's auto-granted invoke; no over-scoped role.
  • Presigned POST policy enforces content-length-range and fixed
    Content-Type server-side.

Entry-point handlers — clean

All four paths funnel through validateAttachments() and then
createAttachmentRecord(). No direct AttachmentRecord construction bypasses
validation/screening.

  • Linear webhook: extracts markdown image URLs, HTTPS-only, capped at 10,
    routed through the URL-fetch path with full SSRF protection at hydration.
  • Slack: pre-validates size + MIME, uses isSlackFileUrl() SSRF guard
    before sending the bot token, then re-validates server-side.
  • CLI: routes inline (≤ 500 KB) vs presigned vs URL correctly; presigned
    path always goes through confirm-uploads for screening.

Agent Python runtime — one new finding

Partial-download workspace leak — agent/src/attachments.py:70-82,
agent/src/pipeline.py:294-319. If _download_single() raises mid-loop
(e.g., checksum mismatch on attachment #2), already-downloaded files from
earlier iterations are left in /workspace/.attachments/. Not a security
issue (workspace is per-task and disposable), but on retry the agent could see
stale files alongside fresh ones. Wrap the loop in try/finally with
shutil.rmtree(attachments_dir) on failure.

Otherwise clean: multimodal injection sources only from verified
PreparedAttachment objects; AttachmentConfig has extra="forbid" + a
thorough model_validator; chmod 0o444 after write.

Non-blocking follow-ups (file separately if useful)

  • isPrivateIp missing 198.18.0.0/15, RFC 5737 TEST-NETs, IPv4/IPv6 multicast
    (resolve-url-attachments.ts)
  • PDF text byte-cap uses String.slice — can exceed cap with multi-byte UTF-8
    (attachment-screening.ts)
  • Filename regex permits trailing . (validation.ts)
  • MAX_FETCH_SIZE_BYTES enforced twice — outer streaming loop is dead code
    (resolve-url-attachments.ts)
  • agent/src/attachments.py recreates the boto3 client per call — make it
    module-level
  • MAX_TASK_DESCRIPTION_LENGTH 2000→10_000 deserves a one-line rationale
    comment
  • Test gaps: DNS-rebinding-via-redirect (302 public → hostname resolving
    private); multi-attachment partial-failure cleanup
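
The String.slice follow-up is easy to reproduce: slice counts UTF-16 code units, not bytes, so a multi-byte character can push the result over a byte cap. One possible fix, sketched here with a hypothetical `truncateUtf8` helper, is to cut the encoded bytes and back up to a character boundary.

```typescript
// Truncates text so its UTF-8 encoding is at most maxBytes, never splitting a character.
function truncateUtf8(text: string, maxBytes: number): string {
  const encoded = new TextEncoder().encode(text);
  if (encoded.length <= maxBytes) return text;
  let end = maxBytes;
  // If the first dropped byte is a continuation byte (10xxxxxx), the cut falls
  // inside a character, so move back to that character's lead byte.
  while (end > 0 && (encoded[end] & 0xc0) === 0x80) end--;
  return new TextDecoder().decode(encoded.subarray(0, end));
}
```

For the string "h\u00e9llo" (6 bytes in UTF-8), `slice(0, 2)` keeps 3 bytes, whereas `truncateUtf8(text, 2)` keeps only "h".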

Overall

LGTM. The new partial-download leak is should-fix, not blocking. Original
blockers cleared, security-critical paths audited end to end.

@isadeks isadeks enabled auto-merge May 26, 2026 19:05
@isadeks isadeks disabled auto-merge May 26, 2026 19:05
@krokoko krokoko requested a review from isadeks May 26, 2026 19:05

@isadeks isadeks left a comment

LGTM

@krokoko krokoko added this pull request to the merge queue May 26, 2026
Merged via the queue into main with commit 1efe4a1 May 26, 2026
6 checks passed
@krokoko krokoko deleted the attachments branch May 26, 2026 19:15
scottschreckengaust added a commit that referenced this pull request May 26, 2026
ESLint --fix removes no-op suppression comments for rules not enabled
in our config. These were introduced by the attachments PR (#176) merge
and are cleaned up by the new ESLint 10 flat config.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
isadeks pushed a commit to isadeks/sample-autonomous-cloud-coding-agents that referenced this pull request May 27, 2026
…samples#171)

* chore(deps): upgrade eslint 9->10, replace import plugin with import-x

- eslint ^9 -> ^10 (10.4.0)
- @cdklabs/eslint-plugin ^1 -> ^2 (flat config support)
- eslint-plugin-import removed (no ESLint 10 support)
- eslint-plugin-import-x ^4 added (drop-in replacement)
- eslint-import-resolver-typescript removed (import-x has built-in TS resolution)
- Root resolution for eslint-plugin-import/minimatch removed

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(eslint): migrate cdk/ to ESLint 10 flat config

Convert .eslintrc.json to eslint.config.mjs (ESM flat config format).
Replace import/ rule prefix with import-x/. Remove --ext flag and
ESLINT_USE_FLAT_CONFIG env var from the eslint script.

All existing rules preserved with equivalent flat config syntax.
Four @cdklabs rules (no-core-construct, invalid-cfn-imports,
no-literal-partition, no-invalid-path) are temporarily disabled
because they use context.getFilename() which was removed in ESLint 10.
Stale eslint-disable comments for disabled/absent rules removed by
auto-fix.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(eslint): migrate cli/ to ESLint 10 flat config

Convert .eslintrc.json to eslint.config.mjs. Replace import/ rule
prefix with import-x/. Remove --ext flag and ESLINT_USE_FLAT_CONFIG.
All existing rules preserved including no-console override for src/.
Removed stale no-constant-condition disable comments (rule deprecated
in ESLint 10).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(eslint): restore eslint-import-resolver-typescript for import-x

The resolver is required by eslint-plugin-import-x for TypeScript path
resolution. Was incorrectly removed in the package upgrade commit.
Also disable license-header rule for shebang bin files (workaround).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(eslint): remove stale suppressions from merged attachments code

ESLint --fix removes no-op suppression comments for rules not enabled
in our config. These were introduced by the attachments PR (aws-samples#176) merge
and are cleaned up by the new ESLint 10 flat config.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: bgagent <345885+scottschreckengaust@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Alain Krok <alkrok@amazon.com>