Agent-browser provider via CLI + prompts (no MCP shim)#86

Open: kalil0321 wants to merge 4 commits into main from cursor/agent-browser-mode-281a

kalil0321 (Owner) commented May 23, 2026

agent-browser bootstrap (updated)

Reverse API Engineer now prefers an existing agent-browser binary on PATH (--help validation only).

If it is missing and Node + npm are available, RAE runs npm install -g <pin> (the pin comes from agent_browser_npx_package / RAE_AGENT_BROWSER_PACKAGE), prints yellow "agent-browser:" notices explaining the install/Chromium pairing, verifies agent-browser --help, then reuses the installed shim on later runs.

If global install cannot run (no npm, permissions, timeouts, nonzero exit), it falls back to npx -y <pin> only for that resolver path, with explanatory notices appended so operators know ephemeral vs global behavior.

Prompts expose {agent_browser_shell} so generated instructions match agent-browser … vs npx -y ….
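
The resolution order described above can be sketched roughly as follows. This is a minimal illustration, not the PR's `ensure_agent_browser_runtime()`: the names `ResolvedCli`, `run_ok`, and `resolve_agent_browser`, the notice wording, and the 120-second timeout are all assumptions, and the sketch checks only for npm rather than Node + npm. The `which` and `run` parameters are injectable so the ordering can be exercised without touching the real system.

```python
from __future__ import annotations

import shutil
import subprocess
from dataclasses import dataclass, field
from typing import Callable, Optional, Sequence

# Hypothetical names throughout; only the order of attempts comes from the PR text.
Runner = Callable[[Sequence[str]], bool]  # returns True when the command exits 0


@dataclass
class ResolvedCli:
    argv: list[str]  # command prefix to embed in prompts
    notices: list[str] = field(default_factory=list)


def run_ok(cmd: Sequence[str], timeout: float = 120.0) -> bool:
    """Run a command; timeouts and OS errors (e.g. missing binary) count as failure."""
    try:
        done = subprocess.run(cmd, capture_output=True, timeout=timeout, check=False)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return done.returncode == 0


def resolve_agent_browser(
    pin: str = "agent-browser@0",
    which: Callable[[str], Optional[str]] = shutil.which,
    run: Runner = run_ok,
) -> ResolvedCli:
    # 1. Prefer a binary already on PATH, validated with --help only.
    if which("agent-browser") and run(["agent-browser", "--help"]):
        return ResolvedCli(["agent-browser"])

    # 2. Otherwise try a global install when npm is available.
    notices: list[str] = []
    if which("npm"):
        notices.append(f"agent-browser: installing {pin} globally via npm")
        if run(["npm", "install", "-g", pin]) and run(["agent-browser", "--help"]):
            return ResolvedCli(["agent-browser"], notices)
        notices.append("agent-browser: global install failed")
    else:
        notices.append("agent-browser: npm not found")

    # 3. Last resort: ephemeral npx invocation for this resolver path only.
    notices.append(f"agent-browser: falling back to npx -y {pin}")
    return ResolvedCli(["npx", "-y", pin], notices)
```

Returning the notices alongside the command prefix keeps printing (and colouring) a caller concern, which is one plausible reading of "returns AgentBrowserSetup (error + notices)" in the later commit message.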


Summary by cubic

Adds agent_provider: "agent-browser" as a CLI-only provider that prefers a PATH agent-browser, auto-installs with npm install -g <pin> when missing (with a notice and --help validation), and only falls back to npx -y <pin> if npm cannot run. No browser MCP shim; prompts, dry-run, and all SDK paths treat it as a shell-driven CLI.

  • New Features

    • CLI-only provider across Claude, OpenCode, Copilot, and Cursor; MCP registration is skipped for this mode, and _get_mcp_config now raises if called. Claude widens allowed_tools for Bash-first flows.
    • Bootstrap: ensure_agent_browser_runtime() now prefers a PATH binary, attempts npm install -g <pin> (prints a yellow notice), validates --help, and falls back to npx -y <pin> only on npm failures. Prompts embed the resolved {agent_browser_shell} and include AGENT_BROWSER_SESSION, skills get core --full, skills list, and HAR flows.
    • Dry-run adds an agent-browser:cli check and clarifies that node/npx back both MCP stacks and the agent-browser bootstrap.
    • Config/UI/docs: defaults add agent_browser_npx_package: "agent-browser@0" and agent_browser_notes; /settings exposes the provider; JSON output and docs list "agent-browser" in mode. Tests cover bootstrap, prompts, and MCP omission.
  • Migration

    • Switch provider to agent-browser in /settings or config.
    • Optional warm-up: npm i -g agent-browser && agent-browser install (--with-deps on Linux) or rely on npx -y agent-browser@0 …; sanity-check with agent-browser doctor and skills list.
    • Pin/override with agent_browser_npx_package or RAE_AGENT_BROWSER_PACKAGE; add operator notes via agent_browser_notes or RAE_AGENT_BROWSER_NOTES.
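
As an illustration of the pin/override step, the package spec could be resolved along these lines. The summary names the config key, the environment variable, and the default, but does not say which wins when both are set; the environment-over-config precedence below is an assumption, and `resolve_pin` is a hypothetical helper name.

```python
import os
from typing import Mapping, Optional

DEFAULT_PIN = "agent-browser@0"  # default listed in the config section of this PR


def resolve_pin(config: Mapping[str, str], env: Optional[Mapping[str, str]] = None) -> str:
    """Pick the npm package spec. Env-over-config precedence is an assumption."""
    env = os.environ if env is None else env
    candidates = (
        env.get("RAE_AGENT_BROWSER_PACKAGE"),
        config.get("agent_browser_npx_package"),
    )
    for candidate in candidates:
        # Ignore unset and whitespace-only values so they fall through to the default.
        if candidate and candidate.strip():
            return candidate.strip()
    return DEFAULT_PIN
```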

Written for commit 96795f2. Summary will update on new commits.

Greptile Summary

This PR introduces a third agent_provider value, "agent-browser", wiring Vercel's Rust-based agent-browser CLI into the existing four SDK paths (Claude, OpenCode, Copilot, Cursor) via a bundled stdio MCP server (server.mjs). Supporting Python helpers in agent_browser_bundle.py handle preflight checks and config generation, while documentation, changelog, and dry-run validation are updated throughout.

  • server.mjs (new MCP server): All twelve tool handlers call spawnSync with no timeout option, meaning a hung agent-browser subprocess permanently freezes the MCP process; the floating agent-browser@latest npx tag can silently adopt a breaking release mid-session; and a failed network har start at boot is demoted to a warning, allowing the session to proceed without HAR capture until browser_close surfaces the failure.
  • Python integration layer: agent_browser_bundle.py, auto_engineer.py, and cursor_engineer.py are cleanly written — consistent preflight checks, correct argument plumbing, and no logic errors found.

Confidence Score: 3/5

Merging is safe for documentation and Python integration, but the bundled MCP server can deadlock on any slow or hung agent-browser subprocess, making it unreliable in the VPS/CI headless workloads this provider specifically targets.

The Python plumbing and all four SDK integrations look correct. The bundled server.mjs has a missing timeout on every spawnSync call — any browser command that does not return permanently blocks the Node process and stalls the agent session with no recovery. The floating @latest version tag and the soft boot-failure warning compound the reliability concern for the headless CI use-case this feature is designed for.

src/reverse_api/agent_browser_mcp/server.mjs needs the most attention — specifically the runAb helper and the startup HAR recording path.

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/reverse_api/agent_browser_mcp/server.mjs | New MCP stdio server wrapping agent-browser CLI. Uses spawnSync for all tool calls with no timeout (hangs indefinitely on stuck subprocesses); floating @latest npx tag can silently introduce breaking changes mid-session; boot HAR-start failure is a warning rather than a hard stop. |
| src/reverse_api/agent_browser_bundle.py | Clean preflight helper; checks for node/npx/server.mjs/npm-marker. cfg[args] or [] guard is always false but harmless. |
| src/reverse_api/auto_engineer.py | agent-browser branch correctly wired into all four engineer classes. Preflight guard called before each path. No new logic issues. |
| src/reverse_api/cursor_engineer.py | Adds agent-browser branch to _cursor_mcp_servers and calls preflight check. Pattern consistent with other providers. |
| src/reverse_api/config.py | Default config updated to document the new agent-browser provider value. No logic changes. |
| tests/test_agent_browser_bundle.py | Three unit tests cover headless/headed arg shapes and command-list consistency. No issues found. |
| src/reverse_api/agent_browser_mcp/package.json | Minimal package with only @modelcontextprotocol/sdk and zod as direct dependencies; node >=18.17 engine constraint. Clean. |

Prompt To Fix All With AI
Fix the following 3 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 3
src/reverse_api/agent_browser_mcp/server.mjs:31-37
**`spawnSync` with no timeout can hang the MCP server indefinitely**

Every tool invocation calls `spawnSync` without a `timeout` option. Node.js `spawnSync` blocks the entire event loop until the subprocess exits. If `agent-browser` hangs — a page that never finishes loading during `open`, a `wait --text` whose text never appears, or a crashed Chromium subprocess — the MCP server process freezes permanently, stalling the entire agent session with no recovery path. Adding `timeout` to the `spawnSync` options (e.g. `60_000` ms for most tools, a larger or configurable value for `browser_wait_for`) ensures the server stays responsive and can return a structured error to the agent instead of deadlocking.

### Issue 2 of 3
src/reverse_api/agent_browser_mcp/server.mjs:28
**Floating `agent-browser@latest` tag silently picks up breaking versions**

Using `agent-browser@latest` on every `npx` invocation means each tool call may resolve to a different CLI version than the one used to start HAR recording. If the package publishes a breaking update mid-session (changed subcommand names, changed exit codes, different JSON schema), every subsequent tool call can fail in opaque ways with no indication that a version jump occurred. Pinning to a specific release (e.g. `agent-browser@0.x.y`) or to a caret range anchored at a known-good version would keep the session consistent.

```suggestion
  return ["-y", "agent-browser@0", ...g, ...sub]; // TODO: pin to a known-good release
```

### Issue 3 of 3
src/reverse_api/agent_browser_mcp/server.mjs:57-63
**Silent boot warning means the whole session proceeds without HAR capture**

When `network har start` fails at startup the server only prints a warning and continues. Every browse and click then proceeds normally, but `browser_close` → `network har stop` will also fail, and `recording.har` is never written. The agent reaches Phase 4 (reverse engineering) with no HAR to read. Treating the boot failure as a hard error (`process.exit(1)`) gives users an immediate, actionable signal rather than a confusing failure at session end.

```suggestion
const boot = runAb(headed, ["network", "har", "start"]);
if (boot.status !== 0) {
  console.error(
    "rae-agent-browser-mcp: fatal: network har start failed — is agent-browser installed and Chromium available?\n" +
    (boot.stderr || boot.stdout || "(no output)"),
  );
  process.exit(1);
}
```

Reviews (1): Last reviewed commit: "feat(agent): add agent-browser provider ..." | Re-trigger Greptile

Greptile also left 3 inline comments on this PR.

Adds agent_provider agent-browser wired through a small Node MCP server that
proxies Vercel agent-browser (npx) with HAR start/stop to recording.har.
Documents setup for VPS installs, expands dry-run/settings/docs, and lists
bundled MCP node_modules in gitignore.

Co-authored-by: kalil0321 <kalil0321@users.noreply.github.com>
kind-agent Bot commented May 23, 2026

⚠️ Error — Not enough testing credits. Upgrade your plan or buy credits to continue running tests.

@cursor cursor Bot mentioned this pull request May 23, 2026

cubic-dev-ai Bot (Contributor) left a comment

2 issues found across 19 files


Comment thread src/reverse_api/agent_browser_mcp/server.mjs Outdated
Comment thread src/reverse_api/agent_browser_mcp/server.mjs Outdated
Comment on lines +31 to +37
function runAb(globalHeaded, subcommandArgs) {
return spawnSync("npx", agentBrowserArgv(globalHeaded, subcommandArgs), {
encoding: "utf8",
maxBuffer: 50 * 1024 * 1024,
env: { ...process.env },
});
}
P1 spawnSync with no timeout can hang the MCP server indefinitely

Every tool invocation calls spawnSync without a timeout option. Node.js spawnSync blocks the entire event loop until the subprocess exits. If agent-browser hangs — a page that never finishes loading during open, a wait --text whose text never appears, or a crashed Chromium subprocess — the MCP server process freezes permanently, stalling the entire agent session with no recovery path. Adding timeout to the spawnSync options (e.g. 60_000 ms for most tools, a larger or configurable value for browser_wait_for) ensures the server stays responsive and can return a structured error to the agent instead of deadlocking.


function agentBrowserArgv(globalHeaded, sub) {
const g = [...(globalHeaded ? ["--headed"] : []), "--json"];
return ["-y", "agent-browser@latest", ...g, ...sub];
P2 Floating agent-browser@latest tag silently picks up breaking versions

Using agent-browser@latest on every npx invocation means each tool call may resolve to a different CLI version than the one used to start HAR recording. If the package publishes a breaking update mid-session (changed subcommand names, changed exit codes, different JSON schema), every subsequent tool call can fail in opaque ways with no indication that a version jump occurred. Pinning to a specific release (e.g. agent-browser@0.x.y) or to a caret range anchored at a known-good version would keep the session consistent.

Suggested change (current code):
return ["-y", "agent-browser@latest", ...g, ...sub];
Proposed replacement:
return ["-y", "agent-browser@0", ...g, ...sub]; // TODO: pin to a known-good release
Comment on lines +57 to +63
const boot = runAb(headed, ["network", "har", "start"]);
if (boot.status !== 0) {
console.error(
"rae-agent-browser-mcp: warning: network har start failed (continuing)",
boot.stderr || boot.stdout,
);
}
P2 Silent boot warning means the whole session proceeds without HAR capture

When network har start fails at startup the server only prints a warning and continues. Every browse and click then proceeds normally, but browser_close → network har stop will also fail, and recording.har is never written. The agent reaches Phase 4 (reverse engineering) with no HAR to read. Treating the boot failure as a hard error (process.exit(1)) gives users an immediate, actionable signal rather than a confusing failure at session end.

Suggested change (current code):
const boot = runAb(headed, ["network", "har", "start"]);
if (boot.status !== 0) {
console.error(
"rae-agent-browser-mcp: warning: network har start failed (continuing)",
boot.stderr || boot.stdout,
);
}
Proposed replacement:
const boot = runAb(headed, ["network", "har", "start"]);
if (boot.status !== 0) {
console.error(
"rae-agent-browser-mcp: fatal: network har start failed — is agent-browser installed and Chromium available?\n" +
(boot.stderr || boot.stdout || "(no output)"),
);
process.exit(1);
}
kalil0321 (Owner, Author) commented

@cursoragent this is super bad. you overengineered. we can just have an agent browser mode and when it's used we manage the cli download / check if already dld + if user wants to use cloud browsers and just let the agent know it must use the agent browser cli ( + we can install the agent-browser skill and that's it)

cursor Bot commented May 23, 2026

Implementing simplification: adding agent_browser.py, rewriting prompts and engineer paths, removing the MCP package, and updating tests and docs.


- Add agent_browser module: npx package resolution, prefetch check, prompts
  and Claude allow-list tools (Bash plus file/read tools).
- Claude/OpenCode/Copilot/Cursor flows skip bundled browser MCP for
  agent-browser; OpenCode chrome-mcp config uses a dedicated branch again.
- Remove agent_browser_bundle and agent_browser_mcp shim; tighten docs,
  changelog, website, and README.
- Fix TestAgentBrowserPrompt mocks (patch lifetime) and extend config tests.

Co-authored-by: kalil0321 <kalil0321@users.noreply.github.com>
cursor Bot changed the title from "Add agent-browser agent provider (Vercel CLI + bundled MCP)" to "Agent-browser provider via CLI + prompts (no MCP shim)" on May 24, 2026
cursoragent and others added 2 commits May 24, 2026 00:26
… preflight

- Replace misleading 'browser MCP disabled' / MCP-only docstrings with CLI-accurate messages
- Dry-run and settings copy: node/npx rationale covers MCP + agent-browser CLI
- user_agent_browser/system prompts: host --help probe vs model skills list/get
- CHANGELOG, README, website: consistent download/cache, cloud skills, no RAE browser MCP
- Test: rename and match new RuntimeError for _get_mcp_config guard

Co-authored-by: kalil0321 <kalil0321@users.noreply.github.com>
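
For readers unfamiliar with the `_get_mcp_config` guard mentioned in this commit, the idea can be sketched as below. The function signature, the `CLI_ONLY_PROVIDERS` set, the error wording, and the placeholder return value are assumptions; the PR page only states that the call raises a `RuntimeError` for the agent-browser provider.

```python
# Hypothetical sketch of a "CLI-only provider" guard; not the project's actual code.
CLI_ONLY_PROVIDERS = {"agent-browser"}


def get_mcp_config(provider: str) -> dict:
    """Return MCP server config, refusing providers that are driven purely via shell."""
    if provider in CLI_ONLY_PROVIDERS:
        # Failing loudly catches call sites that still assume every provider has an MCP server.
        raise RuntimeError(f"{provider} is CLI-only; no browser MCP config exists for it")
    # Placeholder for MCP-backed providers; the real structure is not shown on this page.
    return {"provider": provider}
```

Raising instead of returning an empty config means a stale call site fails in tests rather than silently registering nothing.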
- ensure_agent_browser_runtime returns AgentBrowserSetup (error + notices)
- Use agent-browser on PATH when available; else npm install -g <pin> with
  yellow console notices; fall back to npx -y when npm missing or install fails
- Prompts use {agent_browser_shell} reflecting resolved invocation (shlex-safe)
- Update dry-run checks, README, CHANGELOG, and website agent docs accordingly

Co-authored-by: kalil0321 <kalil0321@users.noreply.github.com>
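
The "shlex-safe" rendering of `{agent_browser_shell}` can be illustrated with a short sketch. Only the placeholder name comes from the PR; the template text and `render_prompt` are hypothetical. `shlex.join` (Python 3.8+) quotes any argument containing shell-unsafe characters, so a binary path with spaces still pastes cleanly into a shell.

```python
import shlex
from typing import Sequence

# Hypothetical prompt fragment; only the {agent_browser_shell} placeholder is from the PR.
PROMPT_TEMPLATE = (
    "Drive the browser with `{agent_browser_shell}`; "
    "start with `{agent_browser_shell} skills list`."
)


def render_prompt(argv: Sequence[str], template: str = PROMPT_TEMPLATE) -> str:
    """Embed the resolved command prefix into a prompt, quoting each argument for the shell."""
    shell = shlex.join(argv)
    return template.format(agent_browser_shell=shell)


if __name__ == "__main__":
    print(render_prompt(["npx", "-y", "agent-browser@0"]))
    print(render_prompt(["/opt/my tools/agent-browser"], "{agent_browser_shell} --help"))
```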
2 participants