Add Filepath agent integration with protocol, runtime, and authority #13

Draft
acoyfellow wants to merge 11 commits into main from claude/gateproof-roadmap-evaluation-KcCv3

@acoyfellow
Owner

Summary

This PR adds comprehensive support for running AI agents in Filepath containers within Gateproof's observe→act→assert framework. It includes the Filepath Agent Protocol (FAP) implementation, a real container runtime, authority/governance validation, and extensive test coverage.

Key Changes

Protocol & Types

  • filepath-protocol.ts: Complete Zod schemas for the NDJSON event stream between Filepath containers and Gateproof
    • Output events: TextEvent, ToolEvent, CommandEvent, CommitEvent, SpawnEvent, WorkersEvent, StatusEvent, HandoffEvent, DoneEvent
    • Input messages: UserMessage, SignalMessage
    • Parsing and serialization functions (parseAgentEvent, serializeInput)
    • Event-to-Log mapping (agentEventToLog, ndjsonLineToLog)
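The parse-then-map flow described above can be sketched roughly as follows. This is a dependency-free illustration, not the PR's code: the PR states that the real schemas are written with Zod, whereas this sketch swaps in hand-written checks. The event fields (`content`, `name`) and the `Log` shape are assumptions made for the example, and only three of the listed event types are modelled.

```typescript
// Hypothetical, simplified event shapes. The PR's real schemas are Zod-based
// and cover more event types (command, commit, spawn, workers, ...).
type AgentEvent =
  | { type: "text"; content: string }
  | { type: "tool"; name: string }
  | { type: "done" };

// Hypothetical log shape, loosely modelled on an observe-layer Log entry.
interface Log {
  stage: string;
  action: string;
  message?: string;
}

// Parse one NDJSON line. Returns null for blank, malformed, or unknown lines
// so that a single bad line cannot abort the whole stream.
function parseAgentEvent(line: string): AgentEvent | null {
  const trimmed = line.trim();
  if (trimmed === "") return null;
  let raw: unknown;
  try {
    raw = JSON.parse(trimmed);
  } catch {
    return null;
  }
  if (typeof raw !== "object" || raw === null) return null;
  const obj = raw as Record<string, unknown>;
  const type = obj["type"];
  const content = obj["content"];
  const name = obj["name"];
  if (type === "text" && typeof content === "string") {
    return { type: "text", content };
  }
  if (type === "tool" && typeof name === "string") {
    return { type: "tool", name };
  }
  if (type === "done") return { type: "done" };
  return null;
}

// Map a validated event to a log entry that assertions can inspect.
function agentEventToLog(event: AgentEvent): Log {
  switch (event.type) {
    case "text":
      return { stage: "agent", action: "text", message: event.content };
    case "tool":
      return { stage: "agent", action: "tool", message: event.name };
    case "done":
      return { stage: "agent", action: "done" };
  }
}

// Convenience wrapper: one NDJSON line in, one log entry (or null) out.
function ndjsonLineToLog(line: string): Log | null {
  const event = parseAgentEvent(line);
  return event === null ? null : agentEventToLog(event);
}
```

Returning `null` instead of throwing is a design choice of this sketch; a stricter implementation might surface malformed lines as their own log entries.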

Runtime & Container Management

  • filepath-runtime.ts: Real Cloudflare Sandbox runtime implementation
    • BroadcastIterable<T>: Multi-reader async iterable solving the dual-reader problem (executor + observe both reading stdout)
    • pipeReadableStreamToLines(): Converts ReadableStream to line-based broadcast
    • CloudflareSandboxRuntime: Spawns agents in Cloudflare Sandbox containers with configurable command and sandbox factory
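To illustrate the dual-reader idea, here is a minimal sketch of a multi-reader async iterable. It is not the PR's `BroadcastIterable` implementation; it assumes a simple design in which every pushed item is kept in an unbounded buffer and each reader walks that buffer with its own cursor. The `push` and `close` method names are assumptions.

```typescript
// Minimal multi-reader async iterable (sketch). Every item is retained in an
// in-memory buffer, so a reader that attaches late still sees the full history.
class BroadcastIterable<T> implements AsyncIterable<T> {
  private readonly buffer: T[] = [];
  private closed = false;
  private waiters: Array<() => void> = [];

  // Append an item and wake every reader that is waiting for new data.
  push(item: T): void {
    if (this.closed) return;
    this.buffer.push(item);
    this.wake();
  }

  // Mark the stream as finished; readers end once they drain the buffer.
  close(): void {
    this.closed = true;
    this.wake();
  }

  private wake(): void {
    const pending = this.waiters;
    this.waiters = [];
    for (const resolve of pending) resolve();
  }

  // Each call creates a reader with an independent cursor.
  async *[Symbol.asyncIterator](): AsyncGenerator<T, void, undefined> {
    let cursor = 0;
    while (true) {
      if (cursor < this.buffer.length) {
        yield this.buffer[cursor] as T;
        cursor += 1;
      } else if (this.closed) {
        return;
      } else {
        // Park until the next push() or close().
        await new Promise<void>((resolve) => this.waiters.push(resolve));
      }
    }
  }
}
```

With this shape, an executor and an observe layer can each run `for await (const line of broadcast)` over the same instance without stealing lines from one another. The trade-off in this sketch is memory: nothing is ever dropped from the buffer.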

Backend Integration

  • filepath-backend.ts: Bridge between Filepath containers and Gateproof's observe layer
    • FilepathContainer interface for container abstraction
    • FilepathRuntime interface for spawn implementations
    • createFilepathBackend(): Creates Backend that reads NDJSON and emits Log entries
    • createFilepathObserveResource(): Creates ObserveResource for gate assertions
    • createMockFilepathContainer(): Test helper for simulating agent behavior

Authority & Governance

  • authority.ts: Validates agent behavior against StoryAuthority policies
    • validateAuthority(): Checks logs against governance rules
    • mergeAuthority(): Combines PRD-level and story-level authority
    • flattenStoryTree(): Flattens hierarchical story structure for runner
    • Enforces: spawn limits, commit restrictions, tool allowlists/denylists, agent/model restrictions
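As a rough illustration of how authority merging and tree flattening could fit together, consider the sketch below. The `Story` and `StoryAuthority` shapes are trimmed-down assumptions (`allowedTools` and `title` in particular are hypothetical field names), and the precedence rule, where story-level fields override PRD-level defaults, is also an assumption rather than something the PR text confirms.

```typescript
// Hypothetical, trimmed-down shapes for illustration only.
interface StoryAuthority {
  canSpawn?: boolean;
  maxChildAgents?: number;
  allowedTools?: string[];
}

interface Story {
  id: string;
  title: string;
  parentId?: string;
  children?: Story[];
  authority?: StoryAuthority;
}

// Combine PRD-level defaults with a story's own policy.
// Assumed precedence: fields set on the story win over PRD defaults.
function mergeAuthority(
  prdDefault: StoryAuthority | undefined,
  story: StoryAuthority | undefined,
): StoryAuthority | undefined {
  if (prdDefault === undefined && story === undefined) return undefined;
  return { ...prdDefault, ...story };
}

// Flatten a story tree depth-first, parent before children. Inline children
// are removed from each node and replaced by a parentId on the child.
function flattenStoryTree(stories: Story[], parentId?: string): Story[] {
  const flat: Story[] = [];
  for (const story of stories) {
    const { children, ...rest } = story;
    const node: Story = parentId === undefined ? rest : { ...rest, parentId };
    flat.push(node);
    if (children !== undefined && children.length > 0) {
      flat.push(...flattenStoryTree(children, story.id));
    }
  }
  return flat;
}
```

A runner could then call `flattenStoryTree(prd.stories)` once and apply `mergeAuthority(prdDefaults, story.authority)` to each flattened entry, where `prd.stories` and `prdDefaults` are placeholder names.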

Act & Assert Extensions

  • act.ts: Added AgentActConfig interface for agent execution configuration
  • action-executors.ts: Added createAgentExecutor() for lifecycle management
  • assert.ts: Added Assert.authority() for governance assertions
  • shorthands.ts: Added agentAct helpers (run(), claudeCode(), codex())

PRD Types

  • prd/types.ts: Extended Story with hierarchy and authority
    • parentId and children[] for tree structure
    • authority?: StoryAuthority for governance policies
    • StoryAuthority interface with agent/model/tool/spawn restrictions

Comprehensive Test Suite

  • filepath-protocol.test.ts (430 tests): Schema validation, NDJSON parsing, event-to-log mapping
  • filepath-runtime.test.ts (549 tests): BroadcastIterable, stream piping, sandbox spawning, dual-reader scenarios
  • filepath-backend.test.ts (187 tests): Mock container, backend mapping, observe resource
  • authority.test.ts (347 tests): Authority validation, violation detection, tree flattening
  • agent-action.test.ts (69 tests): Agent action creation and configuration

Documentation & Examples

  • FILEPATH_INTEGRATION_PROMPT.md: Integration guide with protocol details and implementation notes
  • patterns/agent-first/agent-gate.ts: Full example of observing and asserting on agent behavior
  • patterns/agent-first/hierarchical-prd.ts: Example of hierarchical stories with authority policies

Notable Implementation Details

  • BroadcastIterable

https://claude.ai/code/session_01QQwhpuA7oyfqcWuqiFoB3e

Codebase-grounded analysis of the 5-phase roadmap and inevitability
pitch. Finds Phases 1-2 credible and well-supported by existing
implementation, Phases 3-5 increasingly speculative. Identifies the
immutability paradox, competitive positioning gaps, and over-optimistic
agent capability assumptions as key risks. Recommends aggressive
pursuit with scoping adjustments.

https://claude.ai/code/session_01QQwhpuA7oyfqcWuqiFoB3e
…rchy, and authority

Phase 2 implementation: Gateproof now speaks the Filepath Agent Protocol (FAP).

New capabilities:
- Act.agent() — fifth action primitive. Runs an AI agent in a Filepath
  container with NDJSON event streaming mapped to Gateproof Log entries.
- FilepathBackend + createFilepathObserveResource() — observe layer that
  reads NDJSON stdout from containers and bridges to gate assertions.
- Children/hierarchy on Story — parentId and inline children[] for tree-
  structured PRDs, mirroring Filepath's agent_node model.
- StoryAuthority — governance surface controlling what agents can do:
  allowed tools, models, agents, spawn limits, commit permissions.
- validateAuthority() — enforcement that checks logs against policies.
- flattenStoryTree() — flattens hierarchical stories for the PRD runner.
- Mock container harness — createMockFilepathContainer() for testing
  agent gates without a real Filepath runtime.

70 new tests, all 203 existing tests still pass.

https://claude.ai/code/session_01QQwhpuA7oyfqcWuqiFoB3e
- Assert.authority(policy) — first-class assertion that validates agent
  logs against a StoryAuthority governance policy. Checks tool usage,
  spawn limits, commit permissions, agent/model constraints.
- PRD runner now calls flattenStoryTree() to handle children[] on stories,
  and mergeAuthority() to apply PRD-level defaults to each story.
- Two integration examples in patterns/agent-first/:
  - agent-gate.ts: full observe→act→assert loop with mock container
  - hierarchical-prd.ts: tree-structured stories with authority policies
- Tests for Assert.authority() with Effect.either pattern.

209 tests pass, 0 fail.

https://claude.ai/code/session_01QQwhpuA7oyfqcWuqiFoB3e
Full specification of what a real FilepathRuntime.spawn() must do:
interfaces, NDJSON protocol, stdin/stdout contract, stdout tee problem,
and end-to-end usage example.

https://claude.ai/code/session_01QQwhpuA7oyfqcWuqiFoB3e
…ation

Implements the real FilepathRuntime using Cloudflare Sandbox SDK.
Solves the dual-reader (tee) problem with a BroadcastIterable that
allows both the executor and observe layer to concurrently read
container stdout without interference.

- CloudflareSandboxRuntime: spawns agent processes via @cloudflare/sandbox
- BroadcastIterable: multi-reader AsyncIterable with independent cursors
- pipeReadableStreamToLines: ReadableStream<Uint8Array> → NDJSON line parser
- 19 new tests (broadcast, line parsing, runtime, e2e integration)
- @cloudflare/sandbox added as optional peer dependency
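The line-splitting half of this commit can be sketched as below. This is an illustration under assumptions, not the PR's `pipeReadableStreamToLines`: the hypothetical `chunksToLines` helper takes any `AsyncIterable<Uint8Array>` rather than a `ReadableStream`, and yields lines instead of pushing them into a broadcast. The key point it demonstrates is that a chunk boundary can fall in the middle of a line (or a multi-byte character), so the trailing partial line must be carried over to the next chunk.

```typescript
// Split a stream of byte chunks into complete text lines (sketch).
// `chunksToLines` is a hypothetical name used only for this example.
async function* chunksToLines(
  chunks: AsyncIterable<Uint8Array>,
): AsyncGenerator<string, void, undefined> {
  const decoder = new TextDecoder();
  let pending = "";
  for await (const chunk of chunks) {
    // stream: true keeps partially received multi-byte characters buffered.
    pending += decoder.decode(chunk, { stream: true });
    const parts = pending.split("\n");
    // The last element is an incomplete line (or ""); keep it for later.
    pending = parts.pop() ?? "";
    for (const line of parts) {
      if (line.trim() !== "") yield line;
    }
  }
  // Flush the decoder and emit a final line that lacked a trailing newline.
  pending += decoder.decode();
  if (pending.trim() !== "") yield pending;
}
```

Each yielded line could then be pushed into a broadcast structure so that several consumers read the same NDJSON stream.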

https://claude.ai/code/session_01QQwhpuA7oyfqcWuqiFoB3e

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a46fb249f8

Comment on lines +33 to +34
if (authority.canSpawn === false) {
  const spawnLogs = logs.filter((l) => l.action === "spawn");

P1: Enforce spawn defaults when authority fields are omitted

validateAuthority only blocks spawns when canSpawn is explicitly false, but StoryAuthority documents spawning as disallowed by default; with an empty policy (or one that only sets unrelated fields), spawn logs are treated as compliant. This silently bypasses the intended governance baseline and allows child-agent fanout unless every story redundantly sets canSpawn/maxChildAgents.
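One possible way to address this review comment is a default-deny check, sketched below. It is not the PR's `validateAuthority`: the `checkSpawnPolicy` name, the `Violation` shape, and the choice to treat only an explicit `canSpawn: true` as permission are all assumptions made for illustration.

```typescript
// Hypothetical, minimal shapes for the example.
interface StoryAuthority {
  canSpawn?: boolean;
  maxChildAgents?: number;
}

interface Log {
  action?: string;
}

interface Violation {
  rule: string;
  message: string;
}

// Default-deny spawn check: an omitted canSpawn is treated like false.
function checkSpawnPolicy(authority: StoryAuthority, logs: Log[]): Violation[] {
  const spawnCount = logs.filter((l) => l.action === "spawn").length;
  if (spawnCount === 0) return [];

  // Only an explicit `true` permits spawning (assumed baseline).
  if (authority.canSpawn !== true) {
    return [
      {
        rule: "canSpawn",
        message: `Agent spawned ${spawnCount} child agent(s) but spawning is not permitted`,
      },
    ];
  }

  // When spawning is allowed, still enforce the optional upper bound.
  const max = authority.maxChildAgents;
  if (max !== undefined && spawnCount > max) {
    return [
      {
        rule: "maxChildAgents",
        message: `Agent spawned ${spawnCount} child agent(s), limit is ${max}`,
      },
    ];
  }
  return [];
}
```

The difference from the quoted code is the comparison: `canSpawn !== true` rejects an empty policy, whereas `canSpawn === false` lets it through.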

Comment on lines +199 to +203
Effect.timeout(`${timeoutMs} millis`),
Effect.catchTag("TimeoutException", (e) =>
  Effect.fail(
    new GateError({
      cause: new Error(`Agent "${config.name}" timed out after ${timeoutMs}ms`),

P1: Stop timed-out agents before surfacing timeout errors

When agent execution hits timeoutMs, this branch immediately returns a GateError without calling container.stop(), so the underlying container/process can keep running after the gate has failed. In long-running or expensive sandbox environments this leaks resources and can interfere with subsequent gates that expect a clean runtime.
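The cleanup ordering this comment asks for can be sketched with plain promises. Note the substitution: the PR's executor is written with Effect, while this example deliberately uses `Promise.race` so that it stays dependency-free. The `Stoppable` interface and the `runWithTimeout` helper are hypothetical names.

```typescript
// Hypothetical minimal view of a container: only stop() matters here.
interface Stoppable {
  stop(): Promise<void>;
}

// Await `work`, but give up after `timeoutMs`. On timeout (or any other
// failure) the container is stopped before the error reaches the caller.
async function runWithTimeout<T>(
  container: Stoppable,
  work: Promise<T>,
  timeoutMs: number,
  name: string,
): Promise<T> {
  let cancelTimer = (): void => {};
  const timeout = new Promise<never>((_, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`Agent "${name}" timed out after ${timeoutMs}ms`)),
      timeoutMs,
    );
    cancelTimer = () => clearTimeout(timer);
  });

  try {
    return await Promise.race([work, timeout]);
  } catch (error) {
    // Best-effort stop: a failing stop() must not mask the original error.
    await container.stop().catch(() => undefined);
    throw error;
  } finally {
    // Avoid leaving a live timer behind when the work finishes first.
    cancelTimer();
  }
}
```

In Effect terms, a comparable result might be reached by running the container's stop step inside the timeout handler, or by acquiring the container as a scoped resource; which of those fits the PR's executor is left open here.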

…DME + demo

- README: new "Agent gates" section with Act.agent() + Assert.authority() example
- PatternsSection.svelte: new "Agent Gate" tab alongside existing patterns
- API reference: document Act.agent(), Assert.authority(), CloudflareSandboxRuntime
- Overview: explain agent containers, authority model, broadcast iterable
- New how-to guide: "Run an Agent Gate" with full walkthrough + mock testing
- docs-manifest.ts: register new guide in navigation

https://claude.ai/code/session_01QQwhpuA7oyfqcWuqiFoB3e
- Move all 14 markdown doc files into docs-content.ts as inline strings
- Server loaders now read from docs-content.ts instead of glob-importing .md files
- Fix all "Related" links to use proper markdown links ([Title](/docs/slug))
  instead of raw file paths (docs/foo.md)
- Remove $docs vite alias (no longer needed)
- Simplify markdown.ts link renderer (no more path rewriting)
- Delete demo/docs/ directory entirely

https://claude.ai/code/session_01QQwhpuA7oyfqcWuqiFoB3e
- Hero code example now shows a real Cloudflare gate with browser action
  and positive evidence assertion (not just a health check)
- Add feature cards: agent gates, PRD loops, authority, shorthands
- Add full code example sections for agent gates, PRD loops, shorthands
- Each section links to its corresponding doc page

https://claude.ai/code/session_01QQwhpuA7oyfqcWuqiFoB3e
- Replace `throw new Error` with `Effect.die` + exhaustive check in
  getActionExecutor (action-executors.ts)
- Replace bare `try/catch` in validateUrl with `Effect.try` + `Effect.flatMap`
  (validation.ts)
- Convert `runGateWithErrorHandling` from async try/catch to
  `Effect.tryPromise` + `Effect.catchAll` (utils.ts)
- Remove nested try/catch in BrowserExecutor, move error message to
  `Effect.tryPromise.catch` handler (action-executors.ts)
- Fix fire-and-forget `void Effect.runPromise()` in cli-stream.ts by using
  `Runtime.runPromise(runtime)` for proper error surfacing
- Convert mutable `let fiberRef` to `Ref.unsafeMake` in polling-backend.ts
- Add `Effect.withSpan` tracing to all named effectful functions across
  index.ts, assert.ts, action-executors.ts, analytics.ts, workers-logs.ts,
  cli-stream.ts, http-backend.ts
- Rewrite http-backend.ts to replace `setInterval`/mutable closures with
  Effect-native `Ref`, `Effect.forkDaemon`, `Effect.sleep` loop, and
  `Effect.retry` with `Schedule.exponential`

All 225 tests pass, 0 failures.

https://claude.ai/code/session_01QQwhpuA7oyfqcWuqiFoB3e
@acoyfellow marked this pull request as draft on March 5, 2026 at 17:55