Add Filepath agent integration with protocol, runtime, and authority#13
acoyfellow wants to merge 11 commits into main
Codebase-grounded analysis of the 5-phase roadmap and inevitability pitch. Finds Phases 1-2 credible and well-supported by existing implementation, Phases 3-5 increasingly speculative. Identifies the immutability paradox, competitive positioning gaps, and over-optimistic agent capability assumptions as key risks. Recommends aggressive pursuit with scoping adjustments. https://claude.ai/code/session_01QQwhpuA7oyfqcWuqiFoB3e
This reverts commit 0090fe3.
…rchy, and authority

Phase 2 implementation: Gateproof now speaks the Filepath Agent Protocol (FAP).

New capabilities:
- Act.agent(): fifth action primitive. Runs an AI agent in a Filepath container with NDJSON event streaming mapped to Gateproof Log entries.
- FilepathBackend + createFilepathObserveResource(): observe layer that reads NDJSON stdout from containers and bridges to gate assertions.
- Children/hierarchy on Story: parentId and inline children[] for tree-structured PRDs, mirroring Filepath's agent_node model.
- StoryAuthority: governance surface controlling what agents can do: allowed tools, models, agents, spawn limits, commit permissions.
- validateAuthority(): enforcement that checks logs against policies.
- flattenStoryTree(): flattens hierarchical stories for the PRD runner.
- Mock container harness: createMockFilepathContainer() for testing agent gates without a real Filepath runtime.

70 new tests, all 203 existing tests still pass.

https://claude.ai/code/session_01QQwhpuA7oyfqcWuqiFoB3e
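To make the hierarchy idea in this commit concrete, here is a minimal sketch of flattening a story tree. The `Story` shape below is a trimmed-down assumption (only `parentId` and `children` are named in the commit message; `id` and `title` are illustrative), and the body is not the PR's actual `flattenStoryTree()` implementation.

```typescript
// Hypothetical, trimmed-down Story shape. Only parentId and children[] are
// named in the commit message; the other fields are illustrative.
interface Story {
  id: string;
  title: string;
  parentId?: string;
  children?: Story[];
}

// Depth-first flatten: each story is emitted before its children, and every
// inline child receives a parentId pointing at the story that contained it.
function flattenStoryTree(stories: Story[], parentId?: string): Story[] {
  const flat: Story[] = [];
  for (const story of stories) {
    const { children, ...rest } = story;
    const node: Story = { ...rest };
    const effectiveParent = parentId ?? story.parentId;
    if (effectiveParent !== undefined) node.parentId = effectiveParent;
    flat.push(node);
    if (children && children.length > 0) {
      flat.push(...flattenStoryTree(children, story.id));
    }
  }
  return flat;
}

const flat = flattenStoryTree([
  {
    id: "auth",
    title: "Auth",
    children: [
      { id: "login", title: "Login" },
      { id: "logout", title: "Logout" },
    ],
  },
  { id: "billing", title: "Billing" },
]);

console.log(flat.map((s) => `${s.id}<-${s.parentId ?? "root"}`).join(", "));
```

A flat list with explicit `parentId` links lets a runner iterate stories in order without recursing, while still being able to rebuild the tree when needed.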
- Assert.authority(policy): first-class assertion that validates agent logs against a StoryAuthority governance policy. Checks tool usage, spawn limits, commit permissions, agent/model constraints.
- PRD runner now calls flattenStoryTree() to handle children[] on stories, and mergeAuthority() to apply PRD-level defaults to each story.
- Two integration examples in patterns/agent-first/:
  - agent-gate.ts: full observe→act→assert loop with mock container
  - hierarchical-prd.ts: tree-structured stories with authority policies
- Tests for Assert.authority() with Effect.either pattern.

209 tests pass, 0 fail.

https://claude.ai/code/session_01QQwhpuA7oyfqcWuqiFoB3e
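One plausible reading of "apply PRD-level defaults to each story" is a field-by-field merge in which story-level values win. The sketch below assumes that semantics. Apart from `canSpawn` and `maxChildAgents`, which appear in the review comments on this PR, the field names are hypothetical, and the function body is not taken from the PR.

```typescript
// Hypothetical policy shape; allowedTools and canCommit are assumed names.
interface StoryAuthority {
  allowedTools?: string[];
  canSpawn?: boolean;
  maxChildAgents?: number;
  canCommit?: boolean;
}

// Story-level fields override PRD-level defaults. Fields explicitly set to
// undefined on the story are ignored so they cannot erase a default.
function mergeAuthority(
  prdDefaults: StoryAuthority = {},
  story: StoryAuthority = {},
): StoryAuthority {
  const defined = Object.fromEntries(
    Object.entries(story).filter(([, value]) => value !== undefined),
  );
  return { ...prdDefaults, ...defined };
}

const merged = mergeAuthority(
  { allowedTools: ["read", "grep"], canCommit: false },
  { canCommit: true, canSpawn: undefined },
);

console.log(JSON.stringify(merged));
```

Under this assumption a story only needs to state where it differs from the PRD baseline.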
Full specification of what a real FilepathRuntime.spawn() must do: interfaces, NDJSON protocol, stdin/stdout contract, stdout tee problem, and end-to-end usage example. https://claude.ai/code/session_01QQwhpuA7oyfqcWuqiFoB3e
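As a rough illustration of the stdout side of that contract, the sketch below parses one NDJSON line into a typed event and maps it to a log entry. The function names `parseAgentEvent`, `agentEventToLog`, and `ndjsonLineToLog` come from the PR summary, but the bodies, the event fields, and the `LogEntry` shape are assumptions. The PR itself uses Zod schemas, which are replaced here with hand-written checks to keep the example dependency-free.

```typescript
// Hypothetical subset of the event union; field names are assumptions.
type AgentEvent =
  | { type: "text"; content: string }
  | { type: "tool"; name: string }
  | { type: "done" };

// Hypothetical log shape for illustration only.
interface LogEntry {
  action: string;
  message: string;
}

// Returns null for blank lines, invalid JSON, or unknown/ill-formed events.
function parseAgentEvent(line: string): AgentEvent | null {
  const trimmed = line.trim();
  if (trimmed === "") return null;
  let raw: unknown;
  try {
    raw = JSON.parse(trimmed);
  } catch {
    return null;
  }
  if (typeof raw !== "object" || raw === null) return null;
  const obj = raw as Record<string, unknown>;
  const type = obj["type"];
  const content = obj["content"];
  const name = obj["name"];
  if (type === "text" && typeof content === "string") return { type, content };
  if (type === "tool" && typeof name === "string") return { type, name };
  if (type === "done") return { type };
  return null;
}

function agentEventToLog(event: AgentEvent): LogEntry {
  switch (event.type) {
    case "text":
      return { action: "text", message: event.content };
    case "tool":
      return { action: "tool", message: `tool call: ${event.name}` };
    case "done":
      return { action: "done", message: "agent finished" };
  }
}

function ndjsonLineToLog(line: string): LogEntry | null {
  const event = parseAgentEvent(line);
  return event === null ? null : agentEventToLog(event);
}

const sampleLines = ['{"type":"tool","name":"bash"}', "not json", '{"type":"done"}'];
const sampleLogs = sampleLines
  .map((line) => ndjsonLineToLog(line))
  .filter((log): log is LogEntry => log !== null);

console.log(sampleLogs.map((log) => log.action).join(", "));
```

Dropping unparseable lines instead of throwing is a design choice made for this sketch; a stricter reader could surface them as errors.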
…ation

Implements the real FilepathRuntime using Cloudflare Sandbox SDK. Solves the dual-reader (tee) problem with a BroadcastIterable that allows both the executor and observe layer to concurrently read container stdout without interference.

- CloudflareSandboxRuntime: spawns agent processes via @cloudflare/sandbox
- BroadcastIterable: multi-reader AsyncIterable with independent cursors
- pipeReadableStreamToLines: ReadableStream<Uint8Array> → NDJSON line parser
- 19 new tests (broadcast, line parsing, runtime, e2e integration)
- @cloudflare/sandbox added as optional peer dependency

https://claude.ai/code/session_01QQwhpuA7oyfqcWuqiFoB3e
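The "multi-reader AsyncIterable with independent cursors" idea can be sketched as a shared buffer plus one cursor per reader. This is an illustrative reconstruction, not the PR's `BroadcastIterable`: the `push()` and `close()` methods and the unbounded replay buffer are assumptions.

```typescript
// Sketch of a broadcast iterable: every reader sees every item, in order.
// Assumption: items are kept in an unbounded buffer so late readers can
// replay from the start. A production version may want to trim it.
class BroadcastIterable<T> implements AsyncIterable<T> {
  private readonly buffer: T[] = [];
  private closed = false;
  private waiters: Array<() => void> = [];

  push(item: T): void {
    if (this.closed) return;
    this.buffer.push(item);
    this.wake();
  }

  close(): void {
    this.closed = true;
    this.wake();
  }

  // Resolve every reader currently parked waiting for new data.
  private wake(): void {
    const pending = this.waiters;
    this.waiters = [];
    for (const resolve of pending) resolve();
  }

  // Each call creates an independent cursor into the shared buffer.
  async *[Symbol.asyncIterator](): AsyncGenerator<T, void, undefined> {
    let cursor = 0;
    while (true) {
      while (cursor < this.buffer.length) {
        yield this.buffer[cursor++] as T;
      }
      if (this.closed) return;
      await new Promise<void>((resolve) => this.waiters.push(resolve));
    }
  }
}

async function collect(source: AsyncIterable<string>): Promise<string[]> {
  const out: string[] = [];
  for await (const item of source) out.push(item);
  return out;
}

// Two consumers stand in for the executor and the observe layer.
const stdout = new BroadcastIterable<string>();
const executorView = collect(stdout);
const observeView = collect(stdout);

stdout.push('{"type":"text","content":"hi"}');
stdout.push('{"type":"done"}');
stdout.close();

Promise.all([executorView, observeView]).then(([a, b]) => {
  console.log(a.length, b.length);
});
```

Because each reader advances its own cursor, a slow consumer never steals lines from a fast one, which is the interference the commit message describes.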
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a46fb249f8
    if (authority.canSpawn === false) {
      const spawnLogs = logs.filter((l) => l.action === "spawn");
Enforce spawn defaults when authority fields are omitted
validateAuthority only blocks spawns when canSpawn is explicitly false, but StoryAuthority documents spawning as disallowed by default; with an empty policy (or one that only sets unrelated fields), spawn logs are treated as compliant. This silently bypasses the intended governance baseline and allows child-agent fanout unless every story redundantly sets canSpawn/maxChildAgents.
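One way to address this would be a default-deny check, where spawning is allowed only when `canSpawn` is explicitly `true`. The sketch below is a hedged illustration of that idea, not the fix adopted in the PR. The `Log` shape is reduced to the single `action` field used in the snippet above, and the violation messages are invented for the example.

```typescript
// Minimal assumed shapes; the real types carry more fields.
interface StoryAuthority {
  canSpawn?: boolean;
  maxChildAgents?: number;
}

interface Log {
  action?: string;
}

// Default-deny: an omitted canSpawn is treated the same as canSpawn: false.
function spawnViolations(authority: StoryAuthority, logs: Log[]): string[] {
  const spawnCount = logs.filter((l) => l.action === "spawn").length;
  if (spawnCount === 0) return [];

  if (authority.canSpawn !== true) {
    return [`spawning is not permitted, but ${spawnCount} spawn event(s) were logged`];
  }
  if (authority.maxChildAgents !== undefined && spawnCount > authority.maxChildAgents) {
    return [`${spawnCount} spawn event(s) exceed maxChildAgents=${authority.maxChildAgents}`];
  }
  return [];
}

const spawnLogs: Log[] = [{ action: "spawn" }, { action: "tool" }, { action: "spawn" }];

console.log(spawnViolations({}, spawnLogs));
console.log(spawnViolations({ canSpawn: true, maxChildAgents: 1 }, spawnLogs));
console.log(spawnViolations({ canSpawn: true, maxChildAgents: 2 }, spawnLogs));
```

With this shape, an empty policy rejects spawn logs, so stories do not have to restate the baseline.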
    Effect.timeout(`${timeoutMs} millis`),
    Effect.catchTag("TimeoutException", (e) =>
      Effect.fail(
        new GateError({
          cause: new Error(`Agent "${config.name}" timed out after ${timeoutMs}ms`),
Stop timed-out agents before surfacing timeout errors
When agent execution hits timeoutMs, this branch immediately returns a GateError without calling container.stop(), so the underlying container/process can keep running after the gate has failed. In long-running or expensive sandbox environments this leaks resources and can interfere with subsequent gates that expect a clean runtime.
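The cleanup ordering the review asks for can be sketched independently of Effect. The example below uses plain Promises to stay self-contained, so it is an analogy to the Effect pipeline above and not a patch for it. The `Container` interface with a single `stop()` method and the `runWithTimeout` helper are hypothetical.

```typescript
// Hypothetical minimal container handle; only stop() matters here.
interface Container {
  stop(): Promise<void>;
}

// Races the agent's work against a timer. On any failure, including the
// timeout, the container is stopped before the error reaches the caller.
async function runWithTimeout<T>(
  container: Container,
  work: Promise<T>,
  timeoutMs: number,
  agentName: string,
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(
      () => reject(new Error(`Agent "${agentName}" timed out after ${timeoutMs}ms`)),
      timeoutMs,
    );
  });

  try {
    return await Promise.race([work, timeout]);
  } catch (error) {
    // Best-effort stop: a failing stop() must not mask the original error.
    await container.stop().catch(() => undefined);
    throw error;
  } finally {
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

In an Effect-based executor the same ordering would presumably be expressed by running the stop effect inside the timeout handler or a finalizer; that mapping is left as an assumption here.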
…DME + demo

- README: new "Agent gates" section with Act.agent() + Assert.authority() example
- PatternsSection.svelte: new "Agent Gate" tab alongside existing patterns
- API reference: document Act.agent(), Assert.authority(), CloudflareSandboxRuntime
- Overview: explain agent containers, authority model, broadcast iterable
- New how-to guide: "Run an Agent Gate" with full walkthrough + mock testing
- docs-manifest.ts: register new guide in navigation

https://claude.ai/code/session_01QQwhpuA7oyfqcWuqiFoB3e
- Move all 14 markdown doc files into docs-content.ts as inline strings
- Server loaders now read from docs-content.ts instead of glob-importing .md files
- Fix all "Related" links to use proper markdown links ([Title](/docs/slug)) instead of raw file paths (docs/foo.md)
- Remove $docs vite alias (no longer needed)
- Simplify markdown.ts link renderer (no more path rewriting)
- Delete demo/docs/ directory entirely

https://claude.ai/code/session_01QQwhpuA7oyfqcWuqiFoB3e
- Hero code example now shows a real Cloudflare gate with browser action and positive evidence assertion (not just a health check)
- Add feature cards: agent gates, PRD loops, authority, shorthands
- Add full code example sections for agent gates, PRD loops, shorthands
- Each section links to its corresponding doc page

https://claude.ai/code/session_01QQwhpuA7oyfqcWuqiFoB3e
- Replace `throw new Error` with `Effect.die` + exhaustive check in getActionExecutor (action-executors.ts)
- Replace bare `try/catch` in validateUrl with `Effect.try` + `Effect.flatMap` (validation.ts)
- Convert `runGateWithErrorHandling` from async try/catch to `Effect.tryPromise` + `Effect.catchAll` (utils.ts)
- Remove nested try/catch in BrowserExecutor, move error message to `Effect.tryPromise.catch` handler (action-executors.ts)
- Fix fire-and-forget `void Effect.runPromise()` in cli-stream.ts by using `Runtime.runPromise(runtime)` for proper error surfacing
- Convert mutable `let fiberRef` to `Ref.unsafeMake` in polling-backend.ts
- Add `Effect.withSpan` tracing to all named effectful functions across index.ts, assert.ts, action-executors.ts, analytics.ts, workers-logs.ts, cli-stream.ts, http-backend.ts
- Rewrite http-backend.ts to replace `setInterval`/mutable closures with Effect-native `Ref`, `Effect.forkDaemon`, `Effect.sleep` loop, and `Effect.retry` with `Schedule.exponential`

All 225 tests pass, 0 failures.

https://claude.ai/code/session_01QQwhpuA7oyfqcWuqiFoB3e
Summary
This PR adds comprehensive support for running AI agents in Filepath containers within Gateproof's observe→act→assert framework. It includes the Filepath Agent Protocol (FAP) implementation, a real container runtime, authority/governance validation, and extensive test coverage.
Key Changes
Protocol & Types
- filepath-protocol.ts: Complete Zod schemas for the NDJSON event stream between Filepath containers and Gateproof
  - Events: TextEvent, ToolEvent, CommandEvent, CommitEvent, SpawnEvent, WorkersEvent, StatusEvent, HandoffEvent, DoneEvent
  - Messages: UserMessage, SignalMessage
  - Parsing and serialization (parseAgentEvent, serializeInput)
  - Event-to-log mapping (agentEventToLog, ndjsonLineToLog)
Runtime & Container Management
- filepath-runtime.ts: Real Cloudflare Sandbox runtime implementation
  - BroadcastIterable<T>: Multi-reader async iterable solving the dual-reader problem (executor + observe both reading stdout)
  - pipeReadableStreamToLines(): Converts ReadableStream to line-based broadcast
  - CloudflareSandboxRuntime: Spawns agents in Cloudflare Sandbox containers with configurable command and sandbox factory
Backend Integration
- filepath-backend.ts: Bridge between Filepath containers and Gateproof's observe layer
  - FilepathContainer interface for container abstraction
  - FilepathRuntime interface for spawn implementations
  - createFilepathBackend(): Creates Backend that reads NDJSON and emits Log entries
  - createFilepathObserveResource(): Creates ObserveResource for gate assertions
  - createMockFilepathContainer(): Test helper for simulating agent behavior
Authority & Governance
- authority.ts: Validates agent behavior against StoryAuthority policies
  - validateAuthority(): Checks logs against governance rules
  - mergeAuthority(): Combines PRD-level and story-level authority
  - flattenStoryTree(): Flattens hierarchical story structure for runner
Act & Assert Extensions
- act.ts: Added AgentActConfig interface for agent execution configuration
- action-executors.ts: Added createAgentExecutor() for lifecycle management
- assert.ts: Added Assert.authority() for governance assertions
- shorthands.ts: Added agentAct helpers (run(), claudeCode(), codex())
PRD Types
- prd/types.ts: Extended Story with hierarchy and authority
  - parentId and children[] for tree structure
  - authority?: StoryAuthority for governance policies
  - StoryAuthority interface with agent/model/tool/spawn restrictions
Comprehensive Test Suite
- filepath-protocol.test.ts (430 tests): Schema validation, NDJSON parsing, event-to-log mapping
- filepath-runtime.test.ts (549 tests): BroadcastIterable, stream piping, sandbox spawning, dual-reader scenarios
- filepath-backend.test.ts (187 tests): Mock container, backend mapping, observe resource
- authority.test.ts (347 tests): Authority validation, violation detection, tree flattening
- agent-action.test.ts (69 tests): Agent action creation and configuration
Documentation & Examples
- FILEPATH_INTEGRATION_PROMPT.md: Integration guide with protocol details and implementation notes
- patterns/agent-first/agent-gate.ts: Full example of observing and asserting on agent behavior
- patterns/agent-first/hierarchical-prd.ts: Example of hierarchical stories with authority policies
Notable Implementation Details
https://claude.ai/code/session_01QQwhpuA7oyfqcWuqiFoB3e
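For readers unfamiliar with the stream-piping step mentioned in the summary, the sketch below shows one way to turn a `ReadableStream<Uint8Array>` into complete text lines. It is a simplified stand-in for `pipeReadableStreamToLines()`: the name `readLines` is hypothetical, it yields lines from an async generator instead of feeding a broadcast, and it assumes a runtime with the standard `ReadableStream` and `TextDecoder` globals.

```typescript
// Yields complete, non-empty lines from a byte stream. Chunks may split a
// line (or a multi-byte character) anywhere, so bytes are decoded in
// streaming mode and buffered until a newline arrives.
async function* readLines(
  stream: ReadableStream<Uint8Array>,
): AsyncGenerator<string, void, undefined> {
  const reader = stream.getReader();
  const decoder = new TextDecoder();
  let buffered = "";
  try {
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      buffered += decoder.decode(value, { stream: true });

      let newline = buffered.indexOf("\n");
      while (newline !== -1) {
        const line = buffered.slice(0, newline).replace(/\r$/, "");
        buffered = buffered.slice(newline + 1);
        if (line !== "") yield line;
        newline = buffered.indexOf("\n");
      }
    }
    // Flush any bytes held by the decoder, then emit a trailing line that
    // was not terminated by a newline.
    buffered += decoder.decode();
    if (buffered.trim() !== "") yield buffered;
  } finally {
    reader.releaseLock();
  }
}
```

Each yielded line could then be pushed into a multi-reader structure so that more than one consumer can parse the same NDJSON output.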