Agent-browser provider via CLI + prompts (no MCP shim) #86
Adds agent_provider agent-browser wired through a small Node MCP server that proxies Vercel agent-browser (npx) with HAR start/stop to recording.har. Documents setup for VPS installs, expands dry-run/settings/docs, and lists bundled MCP node_modules in gitignore. Co-authored-by: kalil0321 <kalil0321@users.noreply.github.com>
2 issues found across 19 files
    function runAb(globalHeaded, subcommandArgs) {
      return spawnSync("npx", agentBrowserArgv(globalHeaded, subcommandArgs), {
        encoding: "utf8",
        maxBuffer: 50 * 1024 * 1024,
        env: { ...process.env },
      });
    }
spawnSync with no timeout can hang the MCP server indefinitely
Every tool invocation calls spawnSync without a timeout option. Node.js spawnSync blocks the entire event loop until the subprocess exits. If agent-browser hangs — a page that never finishes loading during open, a wait --text whose text never appears, or a crashed Chromium subprocess — the MCP server process freezes permanently, stalling the entire agent session with no recovery path. Adding timeout to the spawnSync options (e.g. 60_000 ms for most tools, a larger or configurable value for browser_wait_for) ensures the server stays responsive and can return a structured error to the agent instead of deadlocking.
Path: src/reverse_api/agent_browser_mcp/server.mjs
Line: 31-37
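One way to act on this comment is sketched below. It is a minimal illustration, not the PR's code: `spawnOptions`, `summarizeSpawnResult`, and the 60-second default are hypothetical names and values. It relies only on documented `spawnSync` behavior: with a `timeout` set, the child is killed with `killSignal` and the result carries an `error` whose `code` is `ETIMEDOUT`.

```javascript
// Hypothetical default; a tool such as browser_wait_for may need a larger value.
const DEFAULT_TIMEOUT_MS = 60_000;

// Options for spawnSync that bound how long the child may run.
function spawnOptions(timeoutMs = DEFAULT_TIMEOUT_MS) {
  return {
    encoding: "utf8",
    maxBuffer: 50 * 1024 * 1024,
    env: { ...process.env },
    timeout: timeoutMs, // spawnSync kills the child once this elapses
    killSignal: "SIGKILL",
  };
}

// Turn a spawnSync result into a structured outcome the server can return.
function summarizeSpawnResult(result) {
  if (result.error) {
    // code is "ETIMEDOUT" when the timeout fired, "ENOENT" when the binary is missing.
    return { ok: false, code: result.error.code, message: result.error.message };
  }
  if (result.status !== 0) {
    // A child killed by a signal has status null, so report the signal instead.
    const code = result.signal ?? `EXIT_${result.status}`;
    return { ok: false, code, message: result.stderr || result.stdout || "(no output)" };
  }
  return { ok: true, stdout: result.stdout };
}
```

Inside `runAb`, this would presumably be used as `summarizeSpawnResult(spawnSync("npx", argv, spawnOptions()))`, with `spawnSync` imported from `node:child_process`. Keeping the result handling separate from the spawn call makes the timeout path testable without launching a browser.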
    function agentBrowserArgv(globalHeaded, sub) {
      const g = [...(globalHeaded ? ["--headed"] : []), "--json"];
      return ["-y", "agent-browser@latest", ...g, ...sub];
Floating agent-browser@latest tag silently picks up breaking versions
Using agent-browser@latest on every npx invocation means each tool call may resolve to a different CLI version than the one used to start HAR recording. If the package publishes a breaking update mid-session (changed subcommand names, changed exit codes, different JSON schema), every subsequent tool call can fail in opaque ways with no indication that a version jump occurred. Pinning to a specific release (e.g. agent-browser@0.x.y) or to a caret range anchored at a known-good version would keep the session consistent.
Suggested change:
    - return ["-y", "agent-browser@latest", ...g, ...sub];
    + return ["-y", "agent-browser@0", ...g, ...sub]; // TODO: pin to a known-good release
Path: src/reverse_api/agent_browser_mcp/server.mjs
Line: 28
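If a stricter policy than the `agent-browser@0` range in the suggestion is wanted, the argv builder could refuse anything that is not an exact version. The sketch below assumes the package spec is passed in by the caller; `isExactPin`, the extra `packageSpec` parameter, and the version `0.0.0` in the usage note are illustrative placeholders, not a real release or the PR's API.

```javascript
// Accept only exact specs such as "agent-browser@1.2.3".
// Floating tags ("@latest") and ranges ("@0") are rejected.
const EXACT_SPEC = /^agent-browser@\d+\.\d+\.\d+$/;

function isExactPin(spec) {
  return EXACT_SPEC.test(spec);
}

// Same argv shape as the original helper, with the package spec as a parameter.
function agentBrowserArgv(packageSpec, globalHeaded, sub) {
  if (!isExactPin(packageSpec)) {
    throw new Error(`refusing unpinned package spec: ${packageSpec}`);
  }
  const g = [...(globalHeaded ? ["--headed"] : []), "--json"];
  return ["-y", packageSpec, ...g, ...sub];
}
```

For example, `agentBrowserArgv("agent-browser@0.0.0", false, ["network", "har", "start"])` yields `["-y", "agent-browser@0.0.0", "--json", "network", "har", "start"]`, so every tool call in a session resolves the same version.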
    const boot = runAb(headed, ["network", "har", "start"]);
    if (boot.status !== 0) {
      console.error(
        "rae-agent-browser-mcp: warning: network har start failed (continuing)",
        boot.stderr || boot.stdout,
      );
    }
Silent boot warning means the whole session proceeds without HAR capture
When network har start fails at startup the server only prints a warning and continues. Every browse and click then proceeds normally, but browser_close → network har stop will also fail, and recording.har is never written. The agent reaches Phase 4 (reverse engineering) with no HAR to read. Treating the boot failure as a hard error (process.exit(1)) gives users an immediate, actionable signal rather than a confusing failure at session end.
Suggested change:
    - const boot = runAb(headed, ["network", "har", "start"]);
    - if (boot.status !== 0) {
    -   console.error(
    -     "rae-agent-browser-mcp: warning: network har start failed (continuing)",
    -     boot.stderr || boot.stdout,
    -   );
    - }
    + const boot = runAb(headed, ["network", "har", "start"]);
    + if (boot.status !== 0) {
    +   console.error(
    +     "rae-agent-browser-mcp: fatal: network har start failed — is agent-browser installed and Chromium available?\n" +
    +       (boot.stderr || boot.stdout || "(no output)"),
    +   );
    +   process.exit(1);
    + }
Path: src/reverse_api/agent_browser_mcp/server.mjs
Line: 57-63
@cursoragent this is super bad. You overengineered. We can just have an agent-browser mode, and when it's used we manage the CLI download / check if it's already downloaded + whether the user wants to use cloud browsers, and just let the agent know it must use the agent-browser CLI (+ we can install the agent-browser skill and that's it).
- Add agent_browser module: npx package resolution, prefetch check, prompts and Claude allow-list tools (Bash plus file/read tools).
- Claude/OpenCode/Copilot/Cursor flows skip bundled browser MCP for agent-browser; OpenCode chrome-mcp config uses a dedicated branch again.
- Remove agent_browser_bundle and agent_browser_mcp shim; tighten docs, changelog, website, and README.
- Fix TestAgentBrowserPrompt mocks (patch lifetime) and extend config tests.
Co-authored-by: kalil0321 <kalil0321@users.noreply.github.com>
… preflight
- Replace misleading 'browser MCP disabled' / MCP-only docstrings with CLI-accurate messages
- Dry-run and settings copy: node/npx rationale covers MCP + agent-browser CLI
- user_agent_browser/system prompts: host --help probe vs model skills list/get
- CHANGELOG, README, website: consistent download/cache, cloud skills, no RAE browser MCP
- Test: rename and match new RuntimeError for _get_mcp_config guard
Co-authored-by: kalil0321 <kalil0321@users.noreply.github.com>
- ensure_agent_browser_runtime returns AgentBrowserSetup (error + notices)
- Use agent-browser on PATH when available; else npm install -g <pin> with
yellow console notices; fall back to npx -y when npm missing or install fails
- Prompts use {agent_browser_shell} reflecting resolved invocation (shlex-safe)
- Update dry-run checks, README, CHANGELOG, and website agent docs accordingly
Co-authored-by: kalil0321 <kalil0321@users.noreply.github.com>
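The resolution order in this commit message (PATH binary first, then `npm install -g <pin>`, then `npx -y <pin>`) can be sketched as a pure decision function. This is only an illustration in JavaScript: the PR's `ensure_agent_browser_runtime` is Python, and the probe fields, notice wording, and helper names below are assumptions. `shellQuote` approximates what Python's `shlex.quote` does for the `{agent_browser_shell}` placeholder.

```javascript
// Decide how to invoke agent-browser from facts the host has already gathered.
// The probe fields (onPath, npmAvailable, installSucceeded) are hypothetical.
function resolveInvocation({ onPath, npmAvailable, installSucceeded }, pin) {
  const notices = [];
  if (onPath) {
    return { argv: ["agent-browser"], notices };
  }
  if (npmAvailable) {
    notices.push(`agent-browser: not on PATH; tried npm install -g ${pin}`);
    if (installSucceeded) {
      return { argv: ["agent-browser"], notices };
    }
    notices.push("agent-browser: global install failed; falling back to npx");
  } else {
    notices.push("agent-browser: npm not found; falling back to npx");
  }
  return { argv: ["npx", "-y", pin], notices };
}

// Quote one word for a POSIX shell, in the spirit of Python's shlex.quote.
function shellQuote(word) {
  if (/^[A-Za-z0-9_@%+=:,./-]+$/.test(word)) {
    return word;
  }
  return "'" + word.replace(/'/g, "'\\''") + "'";
}

// Render the resolved argv as the string that prompts would embed.
function toShell(argv) {
  return argv.map(shellQuote).join(" ");
}
```

With `pin` set to `agent-browser@0`, a host with neither the binary nor npm resolves to the shell string `npx -y agent-browser@0` plus one notice, while a host with the binary on PATH resolves to `agent-browser` with no notices.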



agent-browser bootstrap (updated)
Reverse API Engineer now prefers an existing `agent-browser` binary on PATH (`--help` validation only).
If it is missing and Node + npm are available, RAE runs `npm install -g <pin>` (`agent_browser_npx_package` / `RAE_AGENT_BROWSER_PACKAGE`), prints `agent-browser:` yellow notices explaining the install/Chromium pairing, verifies `agent-browser --help`, then reuses the shim on later runs.
If global install cannot run (no npm, permissions, timeouts, nonzero exit), it falls back to `npx -y <pin>` only for that resolver path, with explanatory notices appended so operators know ephemeral vs global behavior.
Prompts expose `{agent_browser_shell}` so generated instructions match `agent-browser …` vs `npx -y …`.
Summary by cubic
Adds `agent_provider: "agent-browser"` as a CLI-only provider that prefers a PATH `agent-browser`, auto-installs with `npm install -g <pin>` when missing (with a notice and `--help` validation), and only falls back to `npx -y <pin>` if npm cannot run. No browser MCP shim; prompts, dry-run, and all SDK paths treat it as a shell-driven CLI.
New Features
- `_get_mcp_config` now raises if called. Claude widens `allowed_tools` for Bash-first flows.
- `ensure_agent_browser_runtime()` now prefers a PATH binary, attempts `npm install -g <pin>` (prints a yellow notice), validates `--help`, and falls back to `npx -y <pin>` only on npm failures. Prompts embed the resolved `{agent_browser_shell}` and include `AGENT_BROWSER_SESSION`, `skills get core --full`, `skills list`, and HAR flows.
- `agent-browser:cli` check and clarifies that `node`/`npx` back both MCP stacks and the `agent-browser` bootstrap.
- `agent_browser_npx_package: "agent-browser@0"` and `agent_browser_notes`; `/settings` exposes the provider; JSON output and docs list `"agent-browser"` in `mode`. Tests cover bootstrap, prompts, and MCP omission.
Migration
- `agent-browser` in `/settings` or config.
- `npm i -g agent-browser && agent-browser install` (`--with-deps` on Linux) or rely on `npx -y agent-browser@0 …`; sanity-check with `agent-browser doctor` and `skills list`.
- `agent_browser_npx_package` or `RAE_AGENT_BROWSER_PACKAGE`; add operator notes via `agent_browser_notes` or `RAE_AGENT_BROWSER_NOTES`.
Written for commit 96795f2. Summary will update on new commits.
Greptile Summary
This PR introduces a third `agent_provider` value, `"agent-browser"`, wiring Vercel's Rust-based `agent-browser` CLI into the existing four SDK paths (Claude, OpenCode, Copilot, Cursor) via a bundled stdio MCP server (`server.mjs`). Supporting Python helpers in `agent_browser_bundle.py` handle preflight checks and config generation, while documentation, changelog, and dry-run validation are updated throughout.
- `server.mjs` (new MCP server): All twelve tool handlers call `spawnSync` with no `timeout` option, meaning a hung `agent-browser` subprocess permanently freezes the MCP process; the floating `agent-browser@latest` npx tag can silently adopt a breaking release mid-session; and a failed `network har start` at boot is demoted to a warning, allowing the session to proceed without HAR capture until `browser_close` surfaces the failure.
- `agent_browser_bundle.py`, `auto_engineer.py`, and `cursor_engineer.py` are cleanly written: consistent preflight checks, correct argument plumbing, and no logic errors found.
Confidence Score: 3/5
Merging is safe for documentation and Python integration, but the bundled MCP server can deadlock on any slow or hung agent-browser subprocess, making it unreliable in the VPS/CI headless workloads this provider specifically targets.
The Python plumbing and all four SDK integrations look correct. The bundled server.mjs has a missing timeout on every spawnSync call — any browser command that does not return permanently blocks the Node process and stalls the agent session with no recovery. The floating @latest version tag and the soft boot-failure warning compound the reliability concern for the headless CI use-case this feature is designed for.
src/reverse_api/agent_browser_mcp/server.mjs needs the most attention — specifically the runAb helper and the startup HAR recording path.
Reviews (1): Last reviewed commit: "feat(agent): add agent-browser provider ..."